ServiceNow: SOAP Journal

Integration using Web Services

Reading Large Tables

There are enhancements in the most recent release of the DataPump to better support loading from large tables, or, more precisely, unfiltered loading from large tables.  It is okay to frequently load data from large tables as long as your filter restricts the number of records.  However, unfiltered loads from large tables should happen rarely.  If you have a well designed load strategy, you should normally load only those records which have changed since your last load.

Nevertheless, there are times when it is necessary to perform large loads.  The most obvious situation is when a new data mart table is being seeded for the first time.  The DataPump philosophy is that it is okay for these loads to run a long time, because they are one-time (or at least rare) events.  You can kick off the load on a weekend when activity is at an ebb, and check back some hours later to see if it has finished.

ServiceNow limits the number of records that can be retrieved in a single Web Service request to 250.  If you need to read more than 250 records, there are two techniques:

  • getKeys – Use the getKeys method to return a complete list of sys_ids.  Then use repeated getRecords calls, each requesting 250 or fewer records, with an __encoded_query of the form sys_idINvalue,value,…
  • windowing – Use the extended query parameters __first_row and __last_row to fetch the records in chunks.
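The two techniques can be sketched as a pair of small helpers. This is not the actual DataPump code, just an illustration of the query shapes involved; the function names are mine:

```python
def encoded_query_chunks(sys_ids, chunk_size=250):
    """getKeys technique: split a list of sys_ids (from a getKeys call)
    into __encoded_query values of the form sys_idINvalue,value,...
    with at most chunk_size keys each, for repeated getRecords calls."""
    for i in range(0, len(sys_ids), chunk_size):
        yield "sys_idIN" + ",".join(sys_ids[i:i + chunk_size])

def window_params(total, window=250):
    """Windowing technique: yield (__first_row, __last_row) pairs that
    cover the table in chunks.  __first_row is a zero-based offset and
    the window returns __last_row - __first_row records."""
    first = 0
    while first < total:
        yield (first, min(first + window, total))
        first += window
```

Each yielded encoded query or window pair corresponds to one getRecords call of at most 250 records.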

The advantage of the getKeys technique is that it is faster.  There is an up-front hit because you require the extra call to retrieve the list of keys.  However, the subsequent getRecords calls are much faster.  The disadvantage is that getKeys may encounter a timeout error if there are too many records.  I am not sure what the practical limit is here, but somewhere north of 100,000 rows, there is a risk that the getKeys call may never return.  So the windowing technique appears to be slower, but safer.

A few weeks ago I realized that getKeys can, itself, be windowed.  As a result, the DataPump now supports a new property: servicenow.getkeys.size.  If the property is 0 or omitted, then no getKeys windowing will be performed (i.e. the application will attempt to get all the keys in a single request).  This is the same as the behavior prior to this release.  However, if the property value is greater than zero the application will make repeated calls to getKeys until all keys have been fetched.
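The windowed-getKeys loop described above amounts to something like the following sketch. The `fetch_keys(first_row, last_row)` callable is a hypothetical stand-in for the actual SOAP getKeys call; the loop logic mirrors the behavior of the servicenow.getkeys.size property as described:

```python
def get_all_keys(fetch_keys, getkeys_size):
    """Fetch every sys_id from a table.

    If getkeys_size is 0, make a single unwindowed getKeys call
    (the pre-release behavior).  Otherwise, call getKeys repeatedly
    in windows of getkeys_size rows until a short batch signals
    that all keys have been fetched."""
    if getkeys_size <= 0:
        return fetch_keys(None, None)  # one call for everything
    keys = []
    first = 0
    while True:
        batch = fetch_keys(first, first + getkeys_size)
        keys.extend(batch)
        if len(batch) < getkeys_size:
            break  # partial window: no more keys remain
        first += getkeys_size
    return keys
```

The full key list can then be cut into getRecords-sized chunks as shown earlier.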

For the purpose of performance testing I chose to load a table with 3.1 million rows.  (The table was the cmdb_ci table.)  The property servicenow.chunksize was set to 100, meaning that records would be fetched 100 at a time.  The property servicenow.getkeys.size was set to 10000.  You can get better performance by setting these properties to higher values, but my objective was to test the relative performance of getKeys versus windowing.  The chunksize can be set as high as 250, and servicenow.getkeys.size should probably be set to 20000 or higher for normal operations.
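For normal operations, then, the settings recommended above would look something like this in the DataPump properties file (the values are the ones suggested in the text; tune them to your instance):

```properties
# Records per getRecords call; 250 is the ServiceNow maximum
servicenow.chunksize=250
# Keys per windowed getKeys call; 0 disables getKeys windowing
servicenow.getkeys.size=20000
```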

I loaded the table twice.  Actually, I started two jobs at the same time, writing to different target tables but reading the same table.  As expected, the windowing job got off to a good start, and immediately began extracting and loading records at a rate of about 750 per minute.  The getKeys job had to make more than 300 Web Services calls to retrieve the 3.1 million keys at 10,000 a pop.  This took 2 hours and 18 minutes, by which time the windowing job had already loaded 90,000 records.  Then the getKeys job took off.  It began extracting and loading records at a rate of about 10,000 per minute.  7 hours and 12 minutes after starting, the getKeys job had loaded all 3.1 million records.  The load rate for the getKeys job remained constant at about 10,000 per minute.  However, the windowing job slowed down as the __first_row parameter grew larger.  By the 7 hour mark its load rate had slowed to 400 rows per minute, and it was less than 8% of the way through the table.  The moral of this story is clear: do not waste your time with windowing.

Post Script: Two weeks later

Since the slow job was running on a non-production instance, I allowed it to continue.  After a week it was 70% complete and the extract rate had dropped to 100 records per minute.  I finally killed the job because I could not bear to watch its tortured progress.  I was amazed that it would continue running for so long without some sort of “expired session” or other HTTP error.  Should I protect the innocent and simply remove the windowing option from the DataPump?  No.  I think not.  That would prevent someone else from reproducing my test results.
