Android: Using DatabaseUtils.InsertHelper for faster insertions into SQLite database

AndroidOS includes the DatabaseUtils.InsertHelper class for speeding up insertions into an SQLite database. However, very little documentation or examples seem to be available to show how to use this class. I hope this post will help to make InsertHelper a little less mysterious than it apparently has been.

It’s often the case that bulk insertions are performed in the onCreate method of an SQLiteOpenHelper, so I’ll use that context for this example. (This has implications on transaction management, which in turn affects performance, as described below.)

Suppose the onCreate method currently looks something like this:

private class DatabaseHelper extends SQLiteOpenHelper {
    @Override
    public void onCreate(SQLiteDatabase db) {
        ContentValues values = new ContentValues();
        while (moreRowsToInsert) {
            // ... create the data for this row (not shown) ...

            // Add the data for each column
            values.put("Greek", greekData);
            values.put("Ionic", ionicData);
            // ...
            values.put("Roman", romanData);

            // Insert the row into the database.
            db.insert("columnTable", null, values);
        }
    }
    //...
}

Using DatabaseUtils.InsertHelper, this would be re-written as:

import android.database.DatabaseUtils.InsertHelper;
//...
private class DatabaseHelper extends SQLiteOpenHelper {
    @Override
    public void onCreate(SQLiteDatabase db) {
        // Create a single InsertHelper to handle this set of insertions.
        InsertHelper ih = new InsertHelper(db, "columnTable");

        // Get the numeric indexes for each of the columns that we're updating
        final int greekColumn = ih.getColumnIndex("Greek");
        final int ionicColumn = ih.getColumnIndex("Ionic");
        //...
        final int romanColumn = ih.getColumnIndex("Roman");

        while (moreRowsToInsert) {
            // ... Create the data for this row (not shown) ...

            // Get the InsertHelper ready to insert a single row
            ih.prepareForInsert();

            // Add the data for each column
            ih.bind(greekColumn, greekData);
            ih.bind(ionicColumn, ionicData);
            //...
            ih.bind(romanColumn, romanData);

            // Insert the row into the database.
            ih.execute();
        }
    }
    //...
}

As this shows, using InsertHelper is barely more complicated than using SQLiteDatabase.insert. The major differences are that you need to call ih.prepareForInsert() before adding (“binding”) the column data; and you need to obtain each column’s numeric index, which we get by calling ih.getColumnIndex() prior to the loop.

After replacing SQLiteDatabase.insert with DatabaseUtils.InsertHelper, the database insertion speed went from the equivalent of about 95 rows per second to about 525 rows per second. (“Equivalent of”, because the app also spends cycles creating the data to insert. Here, the performance timings are measured using constant data, eliminating that overhead.)

InsertHelper isn’t really doing anything magical here. It’s essentially a wrapper around compiled statements, which you can create yourself using SQLiteDatabase.compileStatement. Most people will probably find InsertHelper easier to use, though.

Other ways to speed up insertions

In addition to that gain, two more changes then brought the insertion speed to well over 900 rows per second. Whether these tricks techniques work for you will depend on your application.

Don’t bind empty columns

In my app, the data for at least 50% of the columns is empty. By skipping the call to ih.bind() when the column data is a null or empty string, I saw a roughly 30% performance boost.

Temporarily disable database thread locking

I’m loading the database during the onCreate method of my app’s SQLiteOpenHelper. During this time, it seems safe to assume that only one thread is accessing the database, so I use SQLiteDatabase.setLockingEnabled() to temporarily (let me emphasize that: temporarily) disable thread locks within the database API. This yielded about a 35% performance gain:

    public void onCreate(SQLiteDatabase db) {
    //...
    try {
        // *Temporarily* (have I emphasized that enough?) disable
        // thread locking in the database. Be sure to re-enable locking
        // within a finally block.
        db.setLockingEnabled(false);
        // ... load the database ...
    }
    finally {
        db.setLockingEnabled(true);
    }

Transactions and performance

A number of people have cited performance gains through use of explicit transactions in SQLite. However, SQLiteOpenHelper creates a transaction before invoking its callback methods (onCreate, onUpgrade, and onOpen), so explicit transaction control is unnecessary within those methods. (SQLiteOpenHelper will assume that the transaction was successful unless your method throws an exception.)

You would need to manage your own transactions if your insertion code is running outside of one of SQLiteOpenHelper‘s callback methods. The main APIs for this are SQLiteDatabase.beginTransaction, SQLiteDatabase.setTransactionSuccessful, and SQLiteDatabase.endTransaction.

It is possible to nest transactions, but, not surprisingly, this doesn’t seem to help performance. In fact, I saw a very slight performance degradation when using nested transactions (approximately 1%, probably below the accuracy of the measurements.) I also tried periodically closing the current transaction — the first of these being the transaction that was opened by SQLiteOpenHelper — then opening a new one. This didn’t yield much improvement, if any.

, , , , ,

21 Comments

  • Jonathan Foley says:

    In my hands nested insertions inside a single transaction was hugely faster than your approach. I’m still using InsertHelper, but using .insert() with ContentValues rather than the bind and execute pattern. I’m not sure what you mean by a transaction lock. Isn’t that was startTransaction() is doing explicitly? How else do you obtain a lock

    • Dan Breslau says:

      Hi, Jonathan,

      I confess, I hadn’t tried insert(ContentValues) before I wrote this. I’ve now tried it, and at least in my case I found it to be much slower than using prepareForInsertbindexecute. (Excluding non-database code, it’s roughly twice as slow.) Of course, I’m not going to claim that my results mean that prepareForInsertbindexecute will always be faster.

      I’m curious about why my results are different from yours, though. Would it be possible for you to send me your code, or traceview trace files (ideally for both test cases)?

      For what it’s worth, the table that I’m initializing has six columns, most of which are sparsely populated. The traceview log for the modified version of my program (using ContentValues) indicates that it’s spending most of the additional time iterating through the ContentValues and looking up the column indexes for each of the columns. If you have only two columns (for example), perhaps that overhead is much lower.

      By the way, this most recent test was on the emulator, running Android OS 2.2. What version are you running?

      In referring to a “transaction lock”, all I meant was that SQLiteOpenHelper starts a transaction before calling the onCreate method, and ends the transaction after onCreate returns. I’ve modified the post to try to clarify this.

    • Dan Breslau says:

      Jonathan explained in emails that his performance gain was achieved through creating transactions in his code, not from the use of ContentValues (which is actually somewhat slower than using the prepareForInsertbindexecute sequence.)

      My post hadn’t made it clear that the example code was running inside of onCreate, where a transaction has already been created by the SQLiteOpenHelper. I’ve updated the post to try to clarify this.

  • I’ve been fighting with a bulk insert of at least 2400 records into a db. The insert is taking place outside of the SQLiteOpenHelper so I wrapped the insert with the transact methods and it sped it up for sure. I’ve also moved away from .insert() to creating my own insert sql and escaping the proper values.

    Have you tested/noticed if raw sql is faster than prepared statements? My last round of testing brought the execute time down from 175,000ms using .insert() without the transact methods to 107,000ms using raw escaped inserts and transact methods.

    • Dan Breslau says:

      Hi, Ernest,

      I’d be very surprised if you didn’t gain a further performance boost from using prepared statements in place of raw SQL. As I noted in the article, you’ll probably see about the same effect whether you use InsertHelper or lower-level APIs for prepared statements. (I think using InsertHelper is simpler, though.)

      If you find that using prepared statements doesn’t seem to help performance, and you’re willing to send me your code (before and after the change to use prepared statements), I’d really appreciate it.

      • I’m going to try and used prepared statements again. When I tried before kept getting an error when it tried running the exec a second time.

        Also wish there was a way to subscribe to comments :/

      • UPDATE
        I tested my code on a Nexus S and it runs really fast. Importing 2400+ records takes about ~6sec. The emulator runs really slow. Testing on a G1 to get a feel for a low end device

  • Progress update
    Prepared inserts are so freaking fast! Finally solved all my issues by straightening out my JSON input. Thank you for this article.

    I also gained some more speed by executing this command
    “PRAGMA synchronous=OFF”

    I set it back to on when the bulk insert is done.

  • mbr says:

    Thank you for this post. It is very helpful.

    Out of interest, how are you reading in the data that goes into the bind statements? I have about 1000 rows (taken from a sqlite .dump) to insert into a table when onCreate is called. The two questions are:
    1) How to store the data. In the assets folder as a csv file?
    2) In the onCreate method, would you read one row at a time from the csv file into variables and then insert that row, or would you read all rows into an array and then loop over the array to insert the rows?

    My data is going into one table with 6 columns. The approx. 1000 rows as a csv file is about 0.5MB to give you an idea of the amount of data.

    Thank you.

    • Dan Breslau says:

      1) I think storing the data as a CSV file in the assets folder makes sense.

      2) If the data is going straight into the database, with no significant changes, then it’s probably easier to read and insert one row at a time. Intuitively, that also seems like it should be faster than using an array, which would add about 0.5MB of GC overhead.

      But if performance were critical, I’d want to base my decision on measurements, because intuition could be wrong. Suppose the transaction commit requires writing to three files — the data file, the index, and the log. With disk-based systems, it may well be faster to buffer the data in an array before doing the writes, as that could optimize the disk performance. That issue probably doesn’t arise with flash memory, which most Android environments would be using; but this still illustrates that intuition isn’t necessarily a reliable guide.

  • Francis says:

    Thanks for this.

    I had huge performance improvment (2min to 3 secs) by using:

    try {
        mDb.beginTransaction();
        for (Value value : values) {
            ih.prepareForInsert();
    
            ih.bind(colIdx, value.getSomeValue());
            // ...
            ih.execute();
        }
        mDb.setTransactionSuccessful();
    } finally {
        mDb.endTransaction();
    }
    
    • Sotiris says:

      I verify that Francis’s method works amazingly good.
      Huge performance improvement when inserting multiple entries across many tables.

      Definitely recommended :)

      I still wonder why Google doesn’t have samples of use at their online documentation…

  • Tom says:

    A very interesting blog post here that I stumbled upon when looking to speed up my xml -> database processing. I have a couple of questions though (unable to try anything out myself at the moment as I’m at work). Firstly are the ih.insert(ContentValues)/ih.replace(ContentValues) synonimous with a normal db.insert()/replace() or with avoiding doing multiple ih.bind() calls? Secondly, if using ih.bind() what is the behavour for existing data? I.e. does it do an insert() – exception on duplicate key, or replace? (the function prepareForInsert() would seem to suggest the former, but you never know with these things!)

    • Dan Breslau says:

      Hi, Tom,
      DatabaseUtils.InsertHelper operates on a single row at a time, much as SQLiteDatabase.insert and SQLiteDatabase.replace do. The difference is that under the hood, DatabaseUtils.InsertHelper uses SQLiteDatabase.CompileStatement, which improves performance.

      I have little experience using DatabaseUtils.InsertHelper.replace. However the documentation for it says:

      Performs an insert [sic], adding a new row with the given values. If the table contains conflicting rows, they are deleted and replaced with the new row.

      (Emphasis added)

      This implies that existing values in each row would be lost via the replace method; the only way to preserve them would be to read them from the DB first, and bind them as you would any updated values before executing the statement.

  • Nick says:

    Thank you for the post. This has helped me a lot. My only question is what do you think is the best way to create the data that we are going to insert? I thought about making a 2d array with the values that I want to be inserted into the table, but for larger projects with 100+ entries it seems inefficient.

  • Dan Breslau says:

    Hi, Nick,

    It really depends on the nature of the data and your application. If the data is mainly alphanumeric and relatively “flat”, then a simple CSV-style text file may be good enough. However, the CSV format is inherently limited regarding what kinds of data it can handle. Think carefully about your choice of delimiters, and consider the use of an external library such as opencsv to do the parsing.

    For somewhat more complicated cases, I’ve been tempted to use YAML a number of times. But as I’ve yet to succumb to that temptation, I can’t really say how well it’d work.

    There’s always XML. If you already have reason to be concerned about performance, then XML — especially with a DOM parser — is probably not the best choice. On the other hand, you probably know what they say about premature optimization. If the data seems complex enough to require XML, then give it a try and measure the performance before writing it off.

    (But wait: Isn’t this entire post about a possibly premature optimization? No, not really. If you know you need to do bulk insertions, then InsertHelper can provide a relatively simple way to implement them — one that just happens to be fast.)

  • [...] you might want to check out the DatabaseUtils.InsertHelper class. Out Of What Box? has a blog post, Android: Using DatabaseUtils.InsertHelper for faster insertions into SQLite database, that covers this in more detail. Developmentandroid, contentprovider, java, mobile, sqlite [...]

  • [...] Android: Using DatabaseUtils.InsertHelper for faster insertions into SQLite database [...]

  • [...] Android: Using DatabaseUtils.InsertHelper for faster insertions into SQLite database [...]

  • Ivan says:

    Thanks for this it helped me go from 2000 rows taking 2minutes to about 3 seconds! Big help.

Leave a Reply

XHTML: You can use these tags: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>