I'm not sure what the appropriate terminology is, but in trying to run simulations, I always find it tricky to create good fake data.

I don't have a particular application for this, but let's say I want to play around with some silly stock market predicting algorithm. If I were to just use a standard random number generator to get my test data, it would all hover around .5, even over short intervals, and this wouldn't really produce the kind of data that the stock market usually produces during the day (comparing it to stock charts). Even if the market closes with no gains or losses, you might still find volatility in the middle; simple random walks don't create those same effects.

I guess you could stack RNGs on top of one another: a larger magnitude for a full-day value, a smaller magnitude per hour, and a smaller magnitude still per second, summing them all together to get a more step-like pattern. But that's really too predictable: you know as a developer where those steps will be, or are likely to be if you randomize the durations.
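For illustration, the stacking idea above might be sketched roughly like this in Python. The function name, the parameter names, the scales, and the uniform distribution are all assumptions made for the example, and "ticks" stand in for seconds to keep the output small.

```python
import random

def stacked_noise_series(days=2, hours_per_day=7, ticks_per_hour=60,
                         day_scale=1.0, hour_scale=0.3, tick_scale=0.05,
                         start=100.0, seed=None):
    """Sum three layers of uniform noise: per day, per hour, per tick.

    Hypothetical helper: the scales and the uniform distribution are
    arbitrary choices for this sketch, not taken from any real market model.
    """
    rng = random.Random(seed)
    series = []
    for _ in range(days):
        # One large offset that holds for the whole day.
        day_offset = rng.uniform(-day_scale, day_scale)
        for _ in range(hours_per_day):
            # A smaller offset that holds for one hour.
            hour_offset = rng.uniform(-hour_scale, hour_scale)
            for _ in range(ticks_per_hour):
                # The smallest offset, redrawn on every tick.
                tick_offset = rng.uniform(-tick_scale, tick_scale)
                series.append(start + day_offset + hour_offset + tick_offset)
    return series
```

Plotting the result shows the problem described above: the level jumps exactly at the hour and day boundaries, so the steps sit where the developer put them.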

You could literally simulate individual buyer and seller personalities, I guess, but that's a lot of work and computation. (As far as I know, real stock market data is not freely available in raw form.)

So, where might we go to find free, easily accessible, quick-flowing, "interesting" data?

+3  A: 

Why use fake data? Why not gather up some random stock data from a few years ago and use that to test your algorithm?

Yuliy
I agree - if you have access to real data, especially if it's modeling a real world scenario, you want to use the best data you can. Random numbers can be great for certain testing, but depending on the generator you choose it can bias the results in a way that would never happen with "real" data.
Tai Squared
+6  A: 

Most modeling of stock prices is done using upward-biased geometric Brownian motion. Take a decent RNG and solve:

http://www.sitmo.com/eq/76

The explicit solution of this SDE is here:

http://www.sitmo.com/eq/166
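As a rough sketch, and assuming the linked equations are the usual geometric Brownian motion form (dS = mu*S*dt + sigma*S*dW, with solution S(t) = S(0)*exp((mu - sigma^2/2)*t + sigma*W(t))), a simulated price path could be generated like this. The function name and the default drift, volatility, and step size are illustrative assumptions, not values from the answer.

```python
import math
import random

def simulate_gbm(s0=100.0, mu=0.05, sigma=0.2, dt=1 / 252, steps=252, seed=None):
    """Simulate one geometric Brownian motion path using the exact solution.

    Hypothetical helper. mu is the annual drift, sigma the annual volatility,
    and dt the step length in years (1/252 is roughly one trading day).
    """
    rng = random.Random(seed)
    prices = [s0]
    # Deterministic part of the log-return over one step.
    drift = (mu - 0.5 * sigma ** 2) * dt
    # Standard deviation of the random part over one step.
    vol = sigma * math.sqrt(dt)
    for _ in range(steps):
        z = rng.gauss(0.0, 1.0)  # standard normal increment
        prices.append(prices[-1] * math.exp(drift + vol * z))
    return prices
```

A positive mu gives the upward bias mentioned above, and because each step multiplies the previous price by an exponential, the path can never go negative.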

David Crawshaw
This is also a good response, I voted it up, but I can't mark two as answers. Thanks
uosɐſ
A: 

You could use the Google Finance API to get real stock data for a random symbol from a set of 100 symbols for a random day in the last year. That should provide real data that's hopefully randomized enough for your purposes.

Franci Penov
+3  A: 
cletus