views:

92

answers:

5

My biological research involves the measurement of a cellular structure as it changes length throughout the course of observation (capturing images every minute for several hours). As my data sets have become larger I am trying to store them in an Access database, from which I would like to perform various queries about their changes in size.

I know that the SELECT statement can incorporate some mathematical permutations, but I have been unable to incorporate many of my necessary calculations (probably due to my lack of knowledge). For example, one calculation involves determining the rate of change during specifically defined periods of growth. This calculation is entirely dependent on the raw data saved in the table, therefore I didn't this it would be appropriate to just calculate it in excel prior to entry into the field.

So my question is, what would be the most appropriate method of performing this calculation. Should I attempt to string together a huge SELECT calculation in my QUERY, or is there a way to use another language (I know perl?) which can be called to populate the new query field?

I'm not looking for someone to write the code, just where is it appropriate to incorporate each step. Also, I am currently using Office Access but would be interested in any mySQL answers as I may be moving to this platform at a later date. Thanks all!

+2  A: 

You could encapsulate your logic and maths into a custom function in VBA and then call that in your select statement. This methodology would also work with other database engines but the exact wording might be slightly different

Kevin Ross
this is sort of what i was thinking, however I think I might take the other advice i've gotten and move to SQL server or mySQL
ciclistadan
A: 

If you need to populate a field with the result of a complex calculation, it will be easier to do the calculation in a full-fledged programming language, such as Perl, instead of trying to do it in SQL.

Perl has a very good database API, "DBI", with drivers for just about every database engine known to man. Here's a good short article on DBI:

http://www.perl.com/pub/a/1999/10/DBI.html

Brock
and then you could forget about the power of set based solutions provided by the SQL and create a slow loop to process each row (but in a language of your choice).
KM
a spur of my main question would be, how improper is it to populate a table field which is entirely based on other fields already found in the database. the most commonly used calculation i have is the slope (rate) with which i perform a large number of comparisons between different sets of averages. would it be more correct to use SELECT to create a QUERY TABLE containing this information OR to simply calculate this information on UPDATE and include it as a unique field on the main table?
ciclistadan
If the calculation is expensive in terms of performance, you may choose to store it. Databases with triggers are very helpful for this, since they can update the data when the source fields it's calculated on change. Note that Access 2010 adds a new form of embedded macro that is stored at the table level and can function like a trigger, called data macros.
David-W-Fenton
A: 

You would be at the mercy of the SQL implementation for calculation options, precision, etc. Better to use a separate language where you have control, extensibility and flexibility to store the results you want and need.

Alex Reynolds
I'd be surprised if the database could not do the math functions necessary. The OP has not detailed what the calculations are, but I'd guess that the tough part of this is grouping and combining rows and not the availability of math functions.
KM
I'd be even more surprised if databases implemented mathematical operations and precision in a consistent and reliable way. What if the OP switches databases? That could be a lot of SQL to debug, or difficult-to-find errors could easily creep in from buggy or incomplete implementations.
Alex Reynolds
What makes you think the MS Access or MySql will not have the necessary mathematical operations or that they will not be reliable? I doubt that MS Access or MySql will have buggy math operations. However, I think it is pointless to speculate on this since the OP has provided zero details.
KM
sorry for the delayed response, I do not doubt that the individual mathematical functions are lacking, just the ease of linking together. I'll add the example above in a sec, but in excel I currently use the slope and pearson's correlation coefficient (r^2) function on a range of measurements that is defined in another field. i.e. find the slope of the 2nd through 24th measurements.
ciclistadan
the hard part will be grouping, getting the `2nd through 24th measurements` and that is a SQL job not for a separate language to loop over one row at a time
KM
As a counterexample, R provides a programming environment that does not need to loop over one row at a time.
Alex Reynolds
+1  A: 

Doing it in SQL will be a lot faster, however much harder to debug (I'm guessing that you're looking at things like ANOVA, t-tests, chi^2 etc).

Having said that, you may want to to store and calculate interim values like the delay since the previous measurement, and the change in measurement.

OTOH, the metrics you describe are very simple to do in SQL:

one calculation involves determining the rate of change during specifically defined periods of growth

C.

symcbean
+1, I'd dump MS Access an go for the free version of SQL Server 2008: microsoft.com/express/database which has more functionality than MS Access (like ROW_NUMBER(), which you'll need) but can still tie in to excel easily enough.
KM
I haven't really even done more in access than import some tables, if you think SQL Server 2008 would work better I'd take the hint. how do you suggest that I could 'tie in' with excel
ciclistadan
If you're familiar with MySQL, I wouldn't muck about with SQL Server. Also, keep in mind that Access is a database application development tool that ships with a default database engine, Jet/ACE. You can continue to use Access as front end to write queries, edit data and print reports, while using linked tables that represent data stored in any of the myriad data formats that Access can interface with. This includes SQL Server and MySQL, among many others.
David-W-Fenton
In general, despite making a living as a MSAccess programmer for many years, I would advise people to avoid MSAccess - however for a single user application managing medium amounts of data its fine (for multi-user / large datasets think again). But VBA is a pig.
symcbean
How, exactly is VBA a "pig"?
David-W-Fenton
A: 

I'd propose that you use Access as a front-end tool for entering, editing and printing your data. You can store the data in any back-end database engine (MySQL, SQL Server, etc.), though Jet/ACE (the default Access database engine) is likely to be completely adequate unless your data set gets very, very large (it's limited to 2GBs but you don't really want to continue using it if your data grows to much over 1GB during regular usage).

For complex statistical analysis, though, I'd recommend considering exporting the data and using a proper statistics package for doing the analysis. This means your reporting might all be done from there.

In that situation, you could leverage Access's capabilities in allowing you to create an interface for selecting the datasets you wanted to export for analysis. The last time I did this for a client, they were using SPSS for the data analysis and I built them a very flexible export interface (they could choose any variables they liked for analysis).

Whether this is a helpful alternative depends on the extent and type of the analysis you're going to do. If you're using a lot of functions that Access VBA lacks and have to borrow them from Excel or write replacements for them, then you might be better off doing all of that in some other program.

Also, it may be that some or many or all of your calculations belong in the presentation layer and not in SQL. Access reports have a lot of capabilities here, and if you're summarizing data, it may be best done at that level, rather than in the SQL recordsources underlying your reports.

David-W-Fenton