I'm looking for recommendations of a good, free tool for generating sample data for the purpose of loading into test databases. By analogy, something that produces "lorem ipsum" text for any RDBMS. Features I'm looking for include:
- Flexibility to generate data for an existing table definition.
- Ability to generate small and large data sets (1 million rows or more).
- Generate in SQL script format (INSERT statements) or else in a flat file format suitable for bulk import (which is usually faster).
- A command-line interface for easy scripting.
- Extensible, open source, written in a dynamic language (these are nice-to-haves, not strong requirements).
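To make the target concrete, here is a minimal sketch of the kind of output I mean, using only the Python standard library. The table `users(id, name, email, signup_date)` and the function names are hypothetical, made up for illustration; a real tool would read the column definitions from the existing schema instead of hard-coding them.

```python
import csv
import random
import string
import sys
from datetime import date, timedelta


def random_row(row_id, rng):
    """Build one row for the hypothetical users(id, name, email, signup_date) table."""
    # Lowercase letters only, so the values never need SQL or CSV escaping.
    name = "".join(rng.choices(string.ascii_lowercase, k=8))
    email = f"{name}@example.com"
    signup = date(2000, 1, 1) + timedelta(days=rng.randrange(3650))
    return [row_id, name, email, signup.isoformat()]


def write_csv(out, n_rows, seed=0):
    """Stream n_rows rows to `out` as CSV, one at a time, so memory use stays flat."""
    rng = random.Random(seed)  # fixed seed makes the data set reproducible
    writer = csv.writer(out)
    for row_id in range(1, n_rows + 1):
        writer.writerow(random_row(row_id, rng))


def insert_statement(row):
    """Render one row as an INSERT statement (safe only because values are plain letters/digits)."""
    row_id, name, email, signup = row
    return (
        "INSERT INTO users (id, name, email, signup_date) "
        f"VALUES ({row_id}, '{name}', '{email}', '{signup}');"
    )


if __name__ == "__main__":
    # Row count comes from the command line, e.g. 1000000; defaults to 5.
    n = int(sys.argv[1]) if len(sys.argv) > 1 else 5
    write_csv(sys.stdout, n)
```

Redirecting the output to a file gives a flat file that a database's bulk loader can import; `insert_statement` shows the slower SQL-script alternative for the same rows.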
PS: I did search for a duplicate question on StackOverflow, but I didn't find one. If there is one, I'll be grateful to get a pointer to it.
Thanks for the great responses, everyone! I should amend my requirements: I use Mac OS X as my primary development environment, not Windows (though I did say a command-line interface is desirable, and that practically rules out Windows). The Windows-specific suggestions will no doubt be useful to other readers of this question, though, so thanks.
Here is my conclusion:
GenerateData
- PHP web app interface, not command line
- limited to generating 200 records (or pay $20 for a license to generate 5,000 records)
RedGate SQL Data Generator
- not free, price $295
- requires Windows, .NET, SQL Server
Visual Studio 2008 Database Edition
- requires Windows
- requires costly MSDN or ISV subscription
Banner Datatect
- not free, price $595
- requires Windows (?)
- no support for MySQL (?)
- GUI, not command line or scriptable
Ruby Faker gem
- using ActiveRecord for bulk data loading is way too slow
Super Smack
- chiefly a load-testing tool, with a random data generator built in
- pretty simple to use nevertheless
- overall a good runner-up tool
Databene Benerator
- best solution for my needs
- XML scripts, compatible with DbUnit
- open source (GPL) Java code
- command-line usage
- access many databases directly via JDBC