Derby benchmarks

Why and what

In this post you will find the results of my tests on derby, sqlite and mysql. I did this test because I need a database for a user application and the program is going to be written in java. This means it should be a lightweight, no administrationbox with database. I also should be free, also in the meaning of open source. I know two of these that work with java: sqlite and derby. Before I started testing I already had these positive and negative things for each system:

Derby
+ Embedable and so crossplatform
+ Advanced sql functions (SQL-92)
- Uses directory structure
- Not so lightweight, it needs about 2mb of ram, but on modern computers this shouldn’t be a problem.
Sqlite
+ Very lightweight, about 250k ram
+ Only one file to store the data, so it could be exchangeable
- Not crosplatform and the JDBC connector is a bitch to compile
- Limited SQL-92 support, no transactions

The test

I ran the tests on a Fedora Core 2 box and an athlon optimized 2.6.8.1 kernel. I went to runlevel 3 and killed all daemons that weren’t needed for the tests. The machine is Athlon XP 3200+ with 512mb ram.
I only wanted to test the lightweight databases but you have to compare it with something so I used mysql. I used mysql because it was already running on my system. If I ran postgresql it was postgresql. It doesn’t really matter, it’s just an reference.

I wrote a little program that first create’s a new table with a int field and a varchar(50) field. I then created an index on the int field. I created a normal index, not an primary key because I generate a random int < 100000 and a random string of 50 char [a-zA-z 0-9].

A loops with queries are timed. In a loop I insert a random record generated like stated above. Then I do a select on the indexed int field, a select on the varchar field. As last a do an update with a random string on all field where the int field is equal to a random number smaller then 100k.
That's what the java program does. I used for the three databases a jdbc connector. The program accepts the database to test and the number of times everything needs to be done.

I then used a wrapper script that executes the java program for each database with 10, 1.000, 10.000, 50.000, 100.000, 250.000. It stops the java vm with every execution.

Results

Some results? sqlite only got to 50000 queries. I aborted the script when it got to 50.000 selects on the varchar field. It was running for more then 1 hour when I aborted the test. I also aborted the mysql run after one hour. (The run for 50000 selects on non-indexed field, the vm crashed.)
Derby completed everything but with some very strange results. I perform not good on the indexed field but I performs wonderfully on the non indexed field. Where the other databases didn’t even got to 250.000 queries it did the non-indexed select in 20,5 seconds wich is super fast. But it needed 1106 seconds to complete the 250k queries on an indexed field. Which is extremly slow.

In this pdf you’ll find the graphs for:

  • Select on an indexed field
  • Select on an unindex field
  • Inserts
  • Update with the where clause on the indexed field

Pdf with graphs (If somebody knows how the export an graph in openoffice, let me know)

Conclusion

Derby is a very good performing database. It is slower when you deal with small datasets but it scales very well. Sqlite somewhat disapointed me and mysql disapointed me a lot.
When I created the test program it was already clear I was going to use derby because it’s so easy to use it in a Java application. The jdbc driver for sqlite (http://www.ch-werner.de/javasqlite/) was really hard to compile. I finally found the mistake in the configure script. Here is a patch for so it will compile on FC2 (patch). This should be done for each platform so it you will lose the benifits of a cross platform application.
Some things that could influence the test and how the tests could be done better:

  1. Only create that much records and always do the same amount of queries (select, update)
  2. I used sqlite 2.8.14 and mysql 3.23.58
  3. I used a snapshot of the apache incubation project for derby, maybe the IBM version works better
  4. Using a primary key in the table, an incrementing field. Maybe bdery performss better on indexed selects.

About this entry