Oops

I’m coming down to the wire of this database re-architecture task. I’ve been working 60 hours a week for 8 weeks now on this thing, and it’s due this Friday. Unfortunately, I have come to the stunning realization that there is a gaping problem in my design.

The previous design used one Connection per JVM, in auto-commit mode. I kept the one Connection per JVM, but turned auto-commit off. Also, in the old design we would generally make a PreparedStatement, do the query, and then forget about the PreparedStatement. I don’t know whether PostgreSQL and JDBC do any sort of query pre-parsing and optimization with PreparedStatements, but way back when I worked with Oracle we were constantly told that pre-compiling statements was a major win, so this time around I prepare a bunch of PreparedStatements at connection time and just keep re-using them as needed.
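The prepare-once-reuse-everywhere idea boils down to a little cache keyed by SQL string. Here’s a minimal sketch of the pattern; the `StatementCache` name and generic shape are mine, not from any library – in real JDBC the prepare function would be `conn::prepareStatement` and `S` would be `java.sql.PreparedStatement`.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch of "prepare once, reuse forever".
// In real JDBC: new StatementCache<>(conn::prepareStatement),
// with S = java.sql.PreparedStatement.
class StatementCache<S> {
    private final Map<String, S> cache = new HashMap<>();
    private final Function<String, S> prepare;

    StatementCache(Function<String, S> prepare) {
        this.prepare = prepare;
    }

    // Prepare each distinct SQL string at most once, then hand back
    // the same statement object on every later call.
    S get(String sql) {
        return cache.computeIfAbsent(sql, prepare);
    }
}
```

Note the statements live exactly as long as the connection they were prepared on – which matters later when the one-connection-per-JVM assumption goes away.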

So far the transition has been moderately painful – I’ve had to find every place where code that has just done database operations goes on to fire an EventBroker event, call an RMI method, or simply finish its thread, and make sure it does a commit first. And every morning when I come in to work, I find the nightly database vacuum stuck behind some Java task or other that has left its PostgreSQL session in the “idle in transaction” state, and have to figure out what I missed.
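For what it’s worth, the stuck sessions are easy to spot from psql. In recent PostgreSQL versions (9.2 and later) `pg_stat_activity` has a `state` column, so something like the following lists the offenders – in older releases the column names differ and the state was folded into `current_query` instead:

```sql
-- Sessions sitting in an open transaction doing nothing,
-- and how long they've been that way (PostgreSQL 9.2+).
SELECT pid, usename, now() - state_change AS idle_for, query
FROM pg_stat_activity
WHERE state = 'idle in transaction'
ORDER BY idle_for DESC;
```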

But today I realized a flaw in my design: our system has always relied on auto-commit, so there was never any question about transactions, rollbacks, and that sort of thing – we use other methods to make sure things are consistent. But now that we’re not committing after every statement, I have to worry about whether one thread’s commit is interfering with some other thread’s operations: since every thread shares the one Connection, a commit issued by any thread commits whatever half-finished work the other threads have executed on that connection.
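To make the flaw concrete, here’s a toy model – entirely hypothetical, no JDBC involved, all names invented – of one shared connection as a single buffer of uncommitted statements. A JDBC commit applies to the whole connection, not to the thread that issues it, so when “thread B” commits mid-way through “thread A’s” unit of work, it sweeps up A’s half-finished work too:

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of one connection shared by every thread: a single
// buffer of uncommitted statements.
class SharedConnectionDemo {
    private final List<String> pending = new ArrayList<>();

    void execute(String stmt) { pending.add(stmt); }

    // Like JDBC's Connection.commit(): commits EVERYTHING buffered
    // on the connection, regardless of which thread queued it.
    List<String> commit() {
        List<String> committed = new ArrayList<>(pending);
        pending.clear();
        return committed;
    }

    // Simulate the interleaving: thread A is halfway through a
    // two-statement unit of work when thread B commits its own statement.
    public static void main(String[] args) {
        SharedConnectionDemo conn = new SharedConnectionDemo();
        conn.execute("A: INSERT order");           // thread A, statement 1 of 2
        conn.execute("B: UPDATE heartbeat");       // thread B does its thing
        List<String> committedByB = conn.commit(); // thread B commits...
        conn.execute("A: INSERT order_line");      // ...before A's statement 2
        System.out.println(committedByB);
        // B's commit carried A's half-finished work with it:
        // [A: INSERT order, B: UPDATE heartbeat]
    }
}
```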

So ideally what I really want is a separate database connection per thread, and that means some sort of connection pooling mechanism. I’m not sure whether that means losing the whole “persistent PreparedStatement” thing I was hoping would help performance. Well, I don’t think that’s going to happen before Friday.
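Short of a real pool, the usual trick for one-connection-per-thread is a `ThreadLocal`. Here’s a sketch with a stand-in `Conn` class – all names below are mine, and a real initializer would call `DriverManager.getConnection(...)` instead of `new Conn()`:

```java
import java.util.concurrent.atomic.AtomicInteger;

// Sketch of connection-per-thread via ThreadLocal. Conn is a
// stand-in for java.sql.Connection.
class PerThreadConn {
    static final AtomicInteger opened = new AtomicInteger();

    static class Conn {
        final int id = opened.incrementAndGet(); // just to show distinctness
    }

    // Each thread that calls CONN.get() lazily gets its own Conn,
    // and gets the same one back on every later call from that thread.
    static final ThreadLocal<Conn> CONN = ThreadLocal.withInitial(Conn::new);

    public static void main(String[] args) throws InterruptedException {
        Runnable work = () ->
            System.out.println(Thread.currentThread().getName()
                    + " -> conn #" + CONN.get().id);
        Thread a = new Thread(work, "thread-a");
        Thread b = new Thread(work, "thread-b");
        a.start(); b.start();
        a.join(); b.join();
        // Two threads, two distinct connection ids.
    }
}
```

This doesn’t rescue the prepared statements by itself: they’re tied to the connection that prepared them, so with many connections the cache would have to live per connection rather than per JVM.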