Recently, I started work on a small, simple work queue written in C#, called NWorkQueue. Durability/persistence is one of the core requirements for the project. Where better to find durability with one of the most widely distributed embedded databases on the planet, SQLite? It’s rock solid journaling is famous for keeping corrupted database files to a minimum.
While performing some insert tests of 10,000 records, I noticed a HUGE increase in performance when all the inserts were wrapped in a single transaction. Within one transaction it took about 300+ ms. Not earth shattering, but not bad. With each insert in it’s own transaction, performance dropped off a cliff to 63 seconds. Can that be right?!
Sadly, many of my inserts will probably not be grouped together in a single transaction. I was now worried I was giving up performance for durability by using SQLite.
I began a search for SQLite performance tuning and stumbled upon numerous PRAGMA statements for SQLite. I found three that seemed to have an impact. The first two,
PRAGMA TEMP_STORE = MEMORY;
PRAGMA SYNCHRONOUS = NORMAL;
Temp_Store is a no brainer, as there is little disadvantage to moving temp storage into memory. SYNCHRONOUS is a little more tricky. By reducing it from FULL, I dropped the 63 seconds to 53 seconds and it appears I gave up very little in resiliency, although it is slightly less safe. Paraphrasing from the SQLite documentation, if you get a corrupt database file while in FULL mode, your disk is also probably non-operable, with the take-away from that being, that it’s hard to corrupt a SQLite database in FULL mode.
So, that’s a small victory, but a very small one. I’m down to 53 seconds, and that’s still terrible. On to more tuning…
There are some interesting options for the JOURNAL_MODE PRAGMA, but unfortunately they greatly reduce the resiliency, except for one:
PRAGMA JOURNAL_MODE = WAL;
In the documentation for SYNCHRONOUS, it mentions that FULL mode is a common setting when in WAL mode. WAL mode? What’s this new feature?
As of 2010, SQLite added an alternative method for tracking transactions instead of journaling: Write Ahead Logging. It’s turned off by default, since SQLite puts high value on backwards compatibility. Apparently, this is a much more advanced method and works especially well with frequent writes, the same scenario I’m expecting.
So switching on WAL caused my tests to run so fast, I thought they were broken. It dropped the single transaction from 300ms to between 75 and 100ms. What about the 10,000 transactions? It dropped from 53 seconds to 622ms. Wow! That’s an 85x improvement in speed.
If you need not only resiliency, but speed as well, I strongly encourage you to look at using WAL for journaling. There are some caveats and it may not be perfect for every situation, but the docs do a great job of explaining those.
As a bonus, with the introduction of Windows 10, Microsoft started including the SQLite DLL with windows. Keep in mind, if you are using .NET Core, you will need to make changes to accommodate the lack of a SQLite DLL in non-windows systems, but it’s wonderful to see Microsoft embrace such a great product as SQLite.