February 12, 2012

Thoughts on Serialization

The problem domain is short and sweet:
  1. We need to write game data.
  2. We need to read game data.
The only real snag is that, because the server is single threaded, these operations need to block as little as possible -- as in thousandths of a second. I've even toyed with the idea of reading everything at server start and only performing writes during execution. That would cut our blocking problem in half but would remove our ability to modify data externally.

The model in my head calls for very little of the R in RDBMS. Heck, MUDs have managed very well for decades using flat files on systems with less processing power than your dishwasher.

Lastly, given that this is hobby-coding, I have the luxury of asking, 'is the solution fun?' Let's set a couple of goals:
  1. I don't want a crash to lose the overall state of the game -- which precludes saving writes for an extended period, or even worse, until server shutdown.
  2. Simple to install. Freely available software that does not require a DBA to manage.
  3. Simple to operate. No fretting over dirty caches, scheduled housekeeping, etc.
  4. I would prefer back-ups and restores to be file copies. Tar'ing a directory or two is fine.
  5. Some external method to futz with the data while the game is running.
Flat files are certainly doable (and made crazy easy with Python's pickle module). They meet goals #2, #3, and #4, but I cringe at the amount of runtime file I/O needed to cover goal #1, and supporting goal #5 means we have to perform constant reads as well (plus some form of file-locking mechanism). Ick.
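
For a sense of what the flat-file route looks like, here is a minimal sketch using pickle. The function names and the write-to-a-temp-file-then-rename trick are my own assumptions for illustration, not something the game does; the rename is there so a crash mid-write cannot leave a half-written save behind.

```python
import os
import pickle
import tempfile


def save_object(path, obj):
    """Pickle obj to path via a temp file in the same directory."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as handle:
            pickle.dump(obj, handle, protocol=pickle.HIGHEST_PROTOCOL)
        # Swap the finished file into place in one step.
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the temp file if anything went wrong, then re-raise.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


def load_object(path):
    """Read back whatever save_object wrote."""
    with open(path, "rb") as handle:
        return pickle.load(handle)
```

Every call is a blocking disk write, which is exactly the runtime cost that makes me cringe once it has to happen on each state change.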

A lightweight DBMS like SQLite would work except for goal #3. Wrapping Python CRUD in SQL statements is soul-destroyingly tedious. Yeah, I could tap an ORM like SQLAlchemy, but let's look at that new sexy, NoSQL...

When I was a kid, one Christmas I got this electronics kit with 150 projects. It was awesome. You could build things like a crystal radio, lie detector, and light activated room alarm.

Python holds that same appeal for me. It's a big toy. Another one I've found is Redis -- a dead simple NoSQL data store. Its distinguishing feature is that all data is held in memory, which makes it wicked fast.

I'll admit, when I first read that it holds everything in RAM it struck me as rather pointless. I mean, wasn't I already doing that in my program? And who wants to hold everything in memory whether you need it or not?

Having played with it, the utility and genius start to shine through. Redis has a clever system of serializing to a file based on the frequency of updates, and you can copy this file at any moment without fear of corrupting it. The author also provides a really nice CLI tool with history and auto-complete similar to a Linux terminal.

We're pushing goal #2 a bit since it requires installing three packages (Redis, Hiredis, and redis-py), but they all loaded with minimal fuss. Hiredis needed the Python development libs, which you can get on Ubuntu using:

$ sudo apt-get install python-dev

...to be continued


Greg Taylor said...

Serialization/de-serialization are really boring things, so I took the path of least resistance at the time and used CouchDB. I only use it as a data store, and don't bother querying it.

The perk of this arrangement is that serialization is amazingly simple. Here's all that is involved in my save routine:


All in-game objects inherit from BaseObject, which stuffs all kwargs it's instantiated with into its data dict:


Saving then just becomes a matter of saving that data dict. For my purposes, you never directly instantiate or manipulate BaseObject._odata, and instead go through properties and such, since I much prefer to avoid direct tampering (in case I need to change things up).
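
A rough sketch of the pattern described above, not the original code: the `name` property, the `save_to_store` helper, and the plain dict standing in for CouchDB are all hypothetical and only show the general shape.

```python
class BaseObject(object):
    def __init__(self, **kwargs):
        # Everything the object is created with lands in one dict.
        self._odata = kwargs

    # Callers go through properties rather than touching _odata directly.
    @property
    def name(self):
        return self._odata.get("name")

    @name.setter
    def name(self, value):
        self._odata["name"] = value


def save_to_store(store, obj_id, obj):
    # `store` is a plain dict standing in for the real document store.
    # Saving is just a copy of the object's data dict under its id.
    store[obj_id] = dict(obj._odata)
```

Usage would look something like `room = BaseObject(name="Bridge")`, then `room.name = "Main Bridge"`, then `save_to_store(store, "room:1", room)`.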

I'm biased, but I think this is an even simpler approach :)

Jim said...

Thanks for the comments, Greg.

I'm working with Python's property() function to provide a getter/setter for those class attributes that I want persistent.

Probably the only difference in our approach is I'd like to be able to type "> set player:23 credits 100" into the Redis CLI and have a hundred credits immediately appear in player 23's coffers.
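
Here is a minimal sketch of how property() could give that behavior, under some assumptions. `FakeStore`, `persistent`, and `Player` are names made up for illustration. `FakeStore` is an in-memory stand-in that only borrows the `hset`/`hget` method names from redis-py so the example runs without a server; a real client would presumably need `decode_responses=True` so values come back as text.

```python
class FakeStore(object):
    """In-memory stand-in for a hash-capable store (not a real Redis client)."""

    def __init__(self):
        self._hashes = {}

    def hset(self, name, key, value):
        # Values are kept as strings, as a real store would hand them back.
        self._hashes.setdefault(name, {})[key] = str(value)

    def hget(self, name, key):
        return self._hashes.get(name, {}).get(key)


def persistent(field, cast=str):
    """Build a property whose getter and setter go straight to the store."""

    def getter(self):
        raw = self.store.hget(self.key, field)
        return None if raw is None else cast(raw)

    def setter(self, value):
        self.store.hset(self.key, field, value)

    return property(getter, setter)


class Player(object):
    credits = persistent("credits", int)
    name = persistent("name")

    def __init__(self, store, player_id):
        self.store = store
        self.key = "player:%d" % player_id
```

Because every read goes back to the store, a value changed from outside the game shows up on the next attribute access. The price is a store round trip per read, which is the trade-off in question.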

Is that superior to some in-game immortal command to do the same thing?

I'm not fully sure yet.

Greg Taylor said...

I briefly considered allowing modification of game data by directly messing with the documents in CouchDB, but ended up realizing that I didn't want to rely so heavily on my data store after server startup. The potential issues with that by far outweighed the gains. There's always the possibility for data being over-written if you've got the game and external sources with their hands in the data at the same time.

What I'll be doing is adding a really simple handful of JSON calls to the game server process used for interacting with in-game objects. This way, the game remains the authoritative go-to place for current data, and we avoid contention issues. Objects that have been modified are marked as "dirty" in-game, which will cause a Twisted LoopingCall to asynchronously dump that object to CouchDB as it works through the queue of dirty objects.
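
A bare-bones sketch of that dirty-object queue, with assumptions spelled out: `DirtyQueue`, `mark_dirty`, and `flush_one` are hypothetical names, objects are plain dicts, and `store` is a dict standing in for CouchDB. Twisted is left out so the example runs on its own; in the real server something like a LoopingCall would call `flush_one` on a timer.

```python
from collections import OrderedDict


class DirtyQueue(object):
    def __init__(self, store):
        self.store = store
        # object id -> object, kept in the order objects were first marked
        self._dirty = OrderedDict()

    def mark_dirty(self, obj_id, obj):
        # Marking an already-dirty object does not queue it twice.
        self._dirty[obj_id] = obj

    def flush_one(self):
        """Write the oldest dirty object; return its id, or None if clean."""
        if not self._dirty:
            return None
        obj_id, obj = self._dirty.popitem(last=False)
        # Snapshot the object's current state at write time.
        self.store[obj_id] = dict(obj)
        return obj_id
```

Writing one object per tick keeps each pass short, and an object modified several times between ticks is still saved only once, with its latest state.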