For generating sample data, I use simple Python applications.
Considerations:
1. Simple to modify and configure.
2. A repeatable set of data, so you can rerun performance tests and get consistent results.
3. Follows all of the DB referential integrity rules and constraints.
4. Realistic data.
The first two indicate that you want to produce script files that will load your data. The third is tougher. There are ways to discover the database metadata and constraints. Looking at 3 and 4 together, you don't want simple reverse engineering -- you want something you can control to produce realistic values.
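If you do want to reverse-engineer the constraints rather than hand-build them, the standard information_schema views expose them. A minimal sketch, assuming a PostgreSQL database and the psycopg2 driver (the connection string, driver, and database are my assumptions, not part of the approach above):

import psycopg2

# Hypothetical connection string; substitute your own database.
conn = psycopg2.connect('dbname=mydb')
cur = conn.cursor()
# List every foreign-key column and the table it references.
cur.execute("""
    SELECT kcu.table_name, kcu.column_name, ccu.table_name AS referenced_table
    FROM information_schema.table_constraints tc
    JOIN information_schema.key_column_usage kcu
      ON tc.constraint_name = kcu.constraint_name
    JOIN information_schema.constraint_column_usage ccu
      ON tc.constraint_name = ccu.constraint_name
    WHERE tc.constraint_type = 'FOREIGN KEY'
""")
for table, column, referenced in cur.fetchall():
    print(table, column, '->', referenced)
conn.close()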
Generally, you want to build an entity model of your own so that you can be sure you have ranges and key relationships correct.
You can do this in three ways.
1. Generate CSV files of data which you can load manually. Nice repeatable test data; the worked example below takes this route.
2. Generate SQL scripts which you can run. Nice repeatable data, also; see the first sketch after this list.
3. Use an ODBC connection to generate data directly into the database. I actually don't like this as much, but you might; see the second sketch after this list.
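For option 2, the same row-generation logic can write INSERT statements instead of CSV rows. A minimal sketch; the table and column names (some_table, attr1, attr2) are placeholders:

import random

# Write one INSERT statement per generated row.
with open('load.sql', 'w') as script:
    for _ in range(10000):
        attr1 = random.randrange(1, 10)
        attr2 = random.randrange(100, 1000)
        script.write(
            "INSERT INTO some_table (attr1, attr2) "
            "VALUES ({}, {});\n".format(attr1, attr2)
        )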
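For option 3, any DB-API driver will do. A minimal sketch using the stdlib sqlite3 module as a stand-in for a real ODBC connection (with pyodbc the pattern is the same); table and column names are again placeholders:

import random
import sqlite3

conn = sqlite3.connect('sample.db')
conn.execute('CREATE TABLE IF NOT EXISTS some_table (attr1 INTEGER, attr2 INTEGER)')
# Batch the rows through executemany rather than one INSERT at a time.
rows = [(random.randrange(1, 10), random.randrange(100, 1000))
        for _ in range(10000)]
conn.executemany('INSERT INTO some_table (attr1, attr2) VALUES (?, ?)', rows)
conn.commit()
conn.close()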
Here's a stripped-down one-table-only version of a data generator that writes a CSV file.
import csv
import random

class SomeEntity(list):
    # Each instance is one row of sample data.
    titles = ('attr1', 'attr2')  # ... one title for each column

    def __init__(self):
        super().__init__()
        self.append(random.randrange(1, 10))      # attr1
        self.append(random.randrange(100, 1000))  # attr2
        # ... one append for each column

myData = [SomeEntity() for i in range(10000)]
# In Python 3, the csv module wants a text-mode file opened with newline=''.
with open('tmp.csv', 'w', newline='') as aFile:
    dest = csv.writer(aFile)
    dest.writerow(SomeEntity.titles)
    dest.writerows(myData)
For multiple entities, you have to work out the cardinality. Instead of generating random keys, make a random selection from the already-generated entities. So you might have ChildEntity pick a random element of ParentEntity, which guarantees the FK-PK relationship is correct. Use random.choice(someList) to pick a parent, and random.shuffle(someList) when each parent should be used exactly once, to assure referential integrity. A sketch follows.
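Here is a minimal sketch of the parent/child idea, extending the SomeEntity pattern above; the entity and column names are placeholders, and the parent's primary key is assumed to be its first column:

import random

class ParentEntity(list):
    titles = ('parent_id', 'name')
    def __init__(self, pk):
        super().__init__()
        self.append(pk)                    # primary key, first column
        self.append('name_{}'.format(pk))

class ChildEntity(list):
    titles = ('child_id', 'parent_id', 'value')
    def __init__(self, pk, parents):
        super().__init__()
        self.append(pk)
        parent = random.choice(parents)    # pick an existing parent row
        self.append(parent[0])             # so the FK always matches a real PK
        self.append(random.randrange(100, 1000))

parents = [ParentEntity(i) for i in range(100)]
children = [ChildEntity(i, parents) for i in range(10000)]

Both lists can then be written out with the same csv.writer calls as the single-table example.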