flux: October 2008

In my last entry I wrote a bit about optimizing my little project. One other significant optimization I found was inode-sorting, from an idea I got from some old postings on the mutt mailing list.

The idea is as follows: some file systems, in particular ext3, support hashed b-trees to speed-up lookups in large directories (paper). That's nice for finding particular files. However, as a side-effect, when you scan full directories (as mu does when indexing), you might get the entries back in a rather chaotic order. If you then try to open the files in that order, you suffer from long seek times, and consequently, bad performance.

The solution is to sort the dir entries by their inode (in ascending order), and then open the corresponding files in that order. This is what mu (mu-index) does by default, starting with version 0.3. You can turn it off with --tune-sort-inodes=0, but there is usually little need for that, as the overhead of sorting is negligible.

So, what difference does it make? Answer: it depends on how the files are laid out; if you already get your files back in their 'natural order', there won't be much difference - this is what happens on my main machine. But, on another (old) machine where the files are not in that order, the improvements are substantial: I found that indexing 1500 message in 25 seconds without inode-sorting, goes down to 15 seconds with inode-sorting; a nice 40% improvement.

Note(1): this works for ext3 directories with dir_index enabled; there's a HOWTO. There are other file systems that have similar features, but I haven't tested those. Note(2): This optimization is not very useful for flash-based file systems, as they don't really care in what order you open files.

As discussed before, I am working on a little hobby project called mu, for indexing/searching e-mail messages in maildirs. As a true hobby project, it's about finding things out. I'll take notes as I go along.

indexing

One important part of indexing and searching is.... indexing. Indexing (in this context) is the operation of recursively going through a maildir, analyzing each message file, and storing the results in a database. In mu's case, there are actually two databases, one SQLite-database and one Xapian-database (a really interesting tool - to be discussed later).

Indexing may take a considerable amount of time; mu version 0.1 took 192 seconds (on average) to index 10000 messages in my testing corpus. And this version did not even support the Xapian database. Indexing involves reading from disk, querying the database to see if the message is already there, and if not, storing the message metadata. Because of this scheme, re-indexing of the same 10000 messages only takes about 5 seconds (with re-indexing, only modified/new messages need to be indexed).

The full indexing operation probably does not happen very often, for most people. Still, I think it's very worthwhile to try and make it faster. Nobody likes to wait for 192 seconds, even once - and during development, I need to do a full index rather often. Another important reason is that optimizing software is simply interesting - which is a main motivator for a hobby project.

So, let's see how we can make this a bit faster; here I'll only discuss some of the database-related optimizations.

transactions

As mentioned, mu stores the indexing data in two databases; one SQLite-database and one Xapian-database. Both of these databases know the concept of a transaction. By default, SQLite puts every query in a separate transaction. This is very safe, but also quite expensive. When indexing messages, there is no risk of data loss, so it's quite reasonable to increase the transaction size. And this makes things a lot faster. Between mu version 0.1 to 0.2, I increased the default from one transaction per message (3 queries) to one transaction per 100 messages. This made indexing more than 2.5 times faster -- see the table below. This improvement is even more impressive when considering that I also added full-text search, indexing message bodies as well (this is what Xapian is for).

For Xapian transactions, the default value I chose is 1000 transactions -- but the performance effects are much smaller. So, my 'optimal' values, are 100 and 1000, respectively. I found that transactions bigger than that don't improve the performance very much, but of course still affect memory usage. You can tune these with --tune-sqlite-transaction-size and --tune-xapian-transaction-size. The defaults should be just fine for the normal desktop use case - still, if you need a less memory-hungry but slower version, that is possible too. See the mu-index(1) man page for details.

pragmatic

Another area for performance are SQLite's PRAGMA-statements. Some useful ones are PRAGMA synchronous= (which you can influence with --tune-synchronous and PRAGMA temp_store=, which you can tune with --tune-temp-store. Again, see the mu-index(1) man page for details.

It turns out that PRAGMA synchronous allows for some improvement. This setting determines whether SQLite does it writes in a synchronous way. It's faster (and slightly less safe, but the notes at the end of this blog entry). From the table below, it seems that PRAGMA temp_store does not make much difference in this case. This PRAGMA determines where we store temporary (non-committed) results. Some testing suggests this is because, when we do not enable synchronous writing (above), even the 'file' temp_store never physically hits the disk, due to caching by the kernel.

results

Having optimization options tunable through command line options is really useful. Software optimization, especially from what your read online, seems to be a field full of myths, outdated 'facts' and placebo-effects. And even if the information is correct, it may not apply to your use case. The only thing you can do is measure it. And with command line-options I can easily do that, as well as see how various combinations of optimizations perform.

Here's a table with the results for indexing 10000 messages with version 0.3. Between all the runs, I used

# sync && echo 3 > /proc/sys/vm/drop_caches

to flush the caches. That's a critical step - the kernel caches a lot of data, which makes subsequent runs much faster if you don't flush the caches. And that is not what I wanted to measure.

msg/sqlite tx	msg/xapian tx	synchronous sqlite	temp store sqlite	time (s)	notes
1	1	full	file	1536
1	1	normal	default	182	similar to defaults for mu 0.1, but faster
100	1000	full	file	73
100	1000	no	file	68
100	1000	no	memory	68	default for mu 0.3
10000	10000	no	memory	67

As an example, the default for mu version 0.3 is equivalent to:

./mu-index --tune-sqlite-transaction-size=100 --tune-xapian-transaction-size=1000  --tune-synchronous=0 --tune-temp-store=2 ~/data/testmaildir

Again, see the mu-index(1) manpage for details.

Note, these optimizations are a good strategy for indexing data, that is, generating data from data that is already safely stored somewhere else. If anything goes wrong, we can always restart the indexing later. However, if your database stores data that cannot easily be retrieved again afterwards (say, that one occurrence of the Higg's Boson in your particle accelerator), you would want to be a bit more careful.

There are some more optimizations possible; some I have even implemented, such as inode-sorting, which is documented in the mu-index(1) man page. To be discussed some other time.

flux

Wednesday 29 October 2008

a kind of magic

Wednesday 22 October 2008

seek & destroy

Saturday 18 October 2008

chasing time

indexing

transactions

pragmatic

results

Labels

Blog Archive

Links