2010-01-21

E-Blob and NoSQL options

Morton Tocker just recently wrote the blog article When Should You Store Objects in the Database where he talks about what Josh Berkus calls the 'E-blob antipattern'. There is much disagreement on whether this is a pattern or an antipattern.

This pattern is where you take all the information about something, and serialize it into a string, usually in JSON format, and then insert them into a very simple MySQL InnoDB table that contains only a primary key column and a "data" BLOB column to keep that serialized string in.

I can understand (and even make) the arguments pro and con.

What bugs me about this pattern is that you have basically created a document store or an object store, but with all the complexity and brittleness of running a SQL database that you can't run SQL queries against. The only reason to be using MySQL here is because your organization's IT infrastructure is already knows how to run and maintain MySQL servers.

If you are going to keep your data in semi-regular JSON format, instead consider actually running a real document store instead, such as CouchDB or MongoDB. You can keep the data in the same JSON documents, still do the fast primary key lookups, and you can do more sophisticated queries, plus take advantage of the scaling solutions provided by those systems.

If you dont need the document store features and just want an object store, then use an object store. Get the SQL parsing and serialization overhead out of the way. Consider using Tokyo Tyrant, or Memcached, or Redis, or a web server that handles PUT, or even just a bunch of NFS mounted volumes. If you are running in AWS, keep them in S3.

The libmemcached library has an API for bolting the memcached protocol onto any sort of object store implementation. It's brand new, and pretty buggy, but it is improving quickly, and soon may be THE way to access an object store.

2 comments:

  1. Hi Mark,

    s/Morton Thocker/Morgan Tocker/ ;)

    I was trying to keep the arguments purely around performance and MySQL. I only deviated to talking NOSQL in the comments.

    I wasn't aware of such a libmemcached library. Any links to specifications?

    ReplyDelete
  2. Hi Morton. Sorry about that, fixed.

    It's Brian Aker's libmemcached library.

    ReplyDelete