Dec 3, 2009

Evaluating Jailer - a database subsetting tool

We needed a small subset of production data out of hundreds of gigabytes, i.e. the records for a test user across all the tables where there is an FK reference. With that small set of data for all tables, we could quickly build a database instance on a developer's machine. Referential integrity was the main concern, and Jailer seemed to handle it well. It's an open source, platform-independent tool.

What is Jailer?

Jailer is a tool for database subsetting and sampling, schema browsing, and rendering. It exports consistent, referentially intact row-sets from relational databases. It removes obsolete data without violating integrity.

But here is the problem I faced: our tables don't share a common column (for example, a user_id or agency_id by which I wanted to pull records from all tables) that is directly referenced everywhere. It is also slow on our schema of some 100 tables, some of which hold a huge amount of data.
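To make the "no common column" issue concrete, here is a minimal Python sketch of the general idea behind subsetting: start from one test user and follow foreign keys from table to table, so that a table without a user_id column is still reached through its parent. This is not Jailer's implementation. The schema (users, orders, order_items), the function names, and the assumption that every table has a single-column key named id that FKs point to are all hypothetical, and SQLite is used only so the example is self-contained. It only walks from parent to child; a real subsetter would also have to pull in the parent rows that the selected rows reference.

```python
import sqlite3
from collections import deque


def child_edges(conn):
    """Map each parent table to the (child_table, fk_column) pairs that reference it."""
    edges = {}
    tables = [r[0] for r in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'")]
    for table in tables:
        # Each row: (id, seq, parent_table, from_column, to_column, ...)
        for fk in conn.execute(f"PRAGMA foreign_key_list({table})"):
            parent, from_col = fk[2], fk[3]
            edges.setdefault(parent, []).append((table, from_col))
    return edges


def subset_ids(conn, root_table, root_id):
    """Collect, per table, the ids of rows reachable from the root row via FKs.

    Assumes (hypothetically) that every FK points at the parent's `id` column.
    """
    edges = child_edges(conn)
    selected = {root_table: {root_id}}
    queue = deque([root_table])
    while queue:
        parent = queue.popleft()
        parent_ids = sorted(selected[parent])
        marks = ",".join("?" * len(parent_ids))
        for child, fk_col in edges.get(parent, []):
            rows = conn.execute(
                f"SELECT id FROM {child} WHERE {fk_col} IN ({marks})",
                parent_ids)
            found = {r[0] for r in rows}
            # Revisit the child only if this pass discovered new rows.
            if not found <= selected.get(child, set()):
                selected.setdefault(child, set()).update(found)
                queue.append(child)
    return selected


def demo_connection():
    """Build a tiny in-memory schema; order_items has no user_id column."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE orders (
            id INTEGER PRIMARY KEY,
            user_id INTEGER REFERENCES users(id));
        CREATE TABLE order_items (
            id INTEGER PRIMARY KEY,
            order_id INTEGER REFERENCES orders(id));
        INSERT INTO users VALUES (1, 'test_user'), (2, 'other_user');
        INSERT INTO orders VALUES (10, 1), (11, 2), (12, 1);
        INSERT INTO order_items VALUES (100, 10), (101, 11), (102, 12);
    """)
    return conn


if __name__ == "__main__":
    print(subset_ids(demo_connection(), "users", 1))
```

In this toy schema, order_items is only reachable through orders, which is the kind of indirect reference that made a single "filter everything by user_id" approach impossible for us. Every hop is another query against the child table, which hints at why the extraction gets slow with about 100 large tables.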

However, I love the tool. It's a good one for database analysis and data sampling on small schemas!

For details -

http://jailer.sourceforge.net/
