views:

188

answers:

1

I have a non-computer related data logger, that collects data from the field. This data is stored as text files, and I manually lump the files together and organize them. The current format is through a csv file per year per logger. Each file is around 4,000,000 lines x 7 loggers x 5 years = a lot of data. some of the data is organized as bins item_type, item_class, item_dimension_class, and other data is more unique, such as item_weight, item_color, date_collected, and so on ...

Currently, I do statistical analysis on the data using a python/numpy/matplotlib program I wrote. It works fine, but the problem is, I'm the only one who can use it, since it and the data live on my computer.

I'd like to publish the data on the web using a postgres db; however, I need to find or implement a statistical tool that'll take a large postgres table, and return statistical results within an adequate time frame. I'm not familiar with python for the web; however, I'm proficient with PHP on the web side, and python on the offline side.

users should be allowed to create their own histograms, data analysis. For example, a user can search for all items that are blue shipped between week x and week y, while another user can search for sort the weight distribution of all items by hour for all year long.

I was thinking of creating and indexing my own statistical tools, or automate the process somehow to emulate most queries. This seemed inefficient.

I'm looking forward to hearing your ideas

Thanks

+1  A: 

I think you can utilize your current combination(python/numpy/matplotlib) fully if the number of users are not too big. I do some similar works, and my data size a little more than 10g. Data are stored in a few sqlite files, and i use numpy to analyze data, PIL/matplotlib to generate chart files(png, gif), cherrypy as a webserver, mako as a template language.

If you need more server/client database, then you can migrate to postgresql, but you can still fully use your current programs if you go with a python web framework, like cherrypy.

tk
Well thanks for the suggestion .. crossing fingers
dassouki