Monday, July 25, 2011

More On Read Scaling

In my previous blog post on read scaling out to 32 cores, I wrote about a patch I recently committed to improve read scalability in PostgreSQL.  It turns out, however, that there's a downside to that patch, which was uncovered in testing by Stefan Kaltenbrunner.  (Fortunately, I have a fix.)  There's also an opportunity for further performance improvement by applying a similar technique to an additional type of lock.  For full details, read on.

Thursday, July 21, 2011

Read Scaling Out to 32 Cores

With the exception of a week's vacation, my last month has been mostly absorbed by PostgreSQL performance work.  Specifically, I've been looking at the workload generated by "pgbench -S", which essentially fires off lots and lots of SELECT queries that all do primary key lookups against a single table.  Even more specifically, I've been looking at the way this workload performs on systems with many CPU cores where (I found) PostgreSQL was not able to use all of the available CPU time to answer queries. Although the single-core performance was around 4,300 tps, performance with 36 clients (on a 24-core server) was only about 36,000 tps.
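
To put those two numbers side by side, here is a back-of-the-envelope sketch in Python. It assumes, purely for illustration, that perfect linear scaling (single-core throughput times core count) is the ceiling; real hardware never quite reaches that, and the variable names are made up for this example.

```python
# Figures quoted in the post.
single_core_tps = 4_300   # approximate single-core throughput
observed_tps = 36_000     # approximate throughput with 36 clients
cores = 24                # cores on the test server

# Assumption: ideal scaling is perfectly linear in the number of cores.
ideal_tps = single_core_tps * cores
efficiency = observed_tps / ideal_tps

print(f"ideal: {ideal_tps} tps, observed: {observed_tps} tps, "
      f"efficiency: {efficiency:.0%}")
```

Under that naive assumption, the server is delivering roughly a third of what 24 cores could in principle provide.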

Research revealed that performance was being limited mostly by PostgreSQL's lock manager.  Each SELECT query needed to lock the table being queried - and its index - against a concurrent DROP operation.  Since PostgreSQL 8.2, the lock manager has been partitioned: a lock request against a database object will be assigned to one of 16 "partitions", and lock requests against objects that fall into different partitions can proceed in parallel.  Unfortunately, that doesn't help much in this case, because only two objects are being locked: the table, and its index.  Most of the traffic therefore targets just two of the sixteen lock manager partitions.  Furthermore, because the query itself is so trivial, the rate of lock and unlock requests is extremely high - on a more complex query, the bottleneck wouldn't be as severe.
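
The effect can be illustrated with a small Python sketch. This is not PostgreSQL's actual code: the hash function (`zlib.crc32`), the `lock_partition` helper, and the object IDs are all stand-ins chosen for the example. The only detail taken from the description above is that each lockable object maps to one of 16 partitions.

```python
import zlib
from collections import Counter

NUM_LOCK_PARTITIONS = 16  # partition count described in the post


def lock_partition(db_oid, rel_oid):
    """Map a lockable object to a partition.

    Hypothetical stand-in: PostgreSQL hashes its own lock tag structure
    with its own hash function; crc32 is used here only for illustration.
    """
    key = f"{db_oid}/{rel_oid}".encode()
    return zlib.crc32(key) % NUM_LOCK_PARTITIONS


def partition_traffic(locked_objects, num_queries):
    """Count lock requests per partition when every query locks the same objects."""
    traffic = Counter()
    for _ in range(num_queries):
        for db_oid, rel_oid in locked_objects:
            traffic[lock_partition(db_oid, rel_oid)] += 1
    return traffic


# Made-up IDs standing in for the one table and its one index.
table_and_index = [(16384, 16385), (16384, 16390)]
traffic = partition_traffic(table_and_index, num_queries=10_000)

print(f"partitions receiving traffic: {len(traffic)} of {NUM_LOCK_PARTITIONS}")
```

However many queries run, the same two objects always hash to the same partitions, so at most two of the sixteen ever see any traffic and the rest sit idle.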