ansaurus

Question

Answer 1

+3 A:

You'll probably want to start out with a dogs table containing all the flat (non array) data for each dog, things which each dog has one of, like a name, a sex, and an age:

CREATE TABLE `dogs` (
  `id` INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
  `name` VARCHAR(64),
  `age` INT UNSIGNED,
  `sex` ENUM('Male','Female')
);

From there, each dog "has many" measurements, so you need a dog_mesaurements table to store the 24 measurements:

CREATE TABLE `dog_measurements` (
  `id` INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
  `dog_id` INT UNSIGNED NOT NULL,
  `paw` ENUM ('Front Left','Front Right','Rear Left','Rear Right'),
  `taken_at` DATETIME NOT NULL
);

Then whenever you take a measurement, you INSERT INTO dog_measurements (dog_id,taken_at) VALUES (*?*, NOW()); where * ? * is the dog's ID from the dogs table.

You'll then want tables to store the actual frames for each measurement, something like:

CREATE TABLE `dog_measurement_data` (
  `id` INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
  `dog_measurement_id` INT UNSIGNED NOT NULL,
  `frame` INT UNSIGNED,
  `sensor_row` INT UNSIGNED,
  `sensor_col` INT UNSIGNED,
  `value` NUMBER
);

That way, for each of the 250 frames, you loop through each of the 63 sensors, and store the value for that sensor with the frame number into the database:

INSERT INTO `dog_measurement_data` (`dog_measurement_id`,`frame`,`sensor_row`,`sensor_col`,`value`) VALUES
(*measurement_id?*, *frame_number?*, *sensor_row?*, *sensor_col?*, *value?*)

Obviously replace *measurement_id?*, *frame_number?*, *sensor_number?*, value? with real values :-)

So basically, each dog_measurement_data is a single sensor value for a given frame. That way, to get all the sensor values for all a given frame, you would:

SELECT `sensor_row`,sensor_col`,`value` FROM `dog_measurement_data`
WHERE `dog_measurement_id`=*some measurement id* AND `frame`=*some frame number*
ORDER BY `sensor_row`,`sensor_col`

And this will give you all the rows and cols for that frame.

Josh 2010-09-17 19:33:09

Answer 2

+2 A:

I think you are not able to figure out how to put 2D data in database.

If you think of relation between 2 columns, you can think of it as 2D data with 1st column as X axis data and 2nd column as Y axis data. Similarly for 3D data.

Finally your db should look like this:

Table: Dogs
    Columns: DogId, DogName -- contains data for each dog

Table: Measurements
    Columns: DogId, MeasurementId, 3D_DataId, 2D_DataId -- contains measurements of each dog

Table: 3D_data
    Columns: 3D_DataId, 3D_X, 3D_Y, 3D_Z -- contains all 3D data of a measurement

Table: 2D_data
    Columns: 2D_DataId, 2D_X, 2D_Y -- contains all 2D data of a measurement

Also you may want to store your 3D data and 2D data in an order. In that case, you will have to add a column to store that order in table of 3D data and 2D data

Niraj Nawanit 2010-09-17 19:44:10

Answer 3

A:

The only thing I would add to Josh's answer is that if you don't need to query individual frames or sensors, just store the arrays as BLOBs in the dog_measurement_data table. I have done this before with large binary set of sensor data and it worked out well. You basically query the 2d and 3d arrays with each measurement and manipulate them in code instead of the database.

James 2010-09-17 19:47:03

This should have been posted as a comment to my answer, but, I guess you can't post comments yet :-) In response, I'd say that the downside to this is, you need a format you can reliably store/retrieve from BLOB, like JSON or some other serialized python value. This might not be as easy as it sounds!

Josh 2010-09-17 19:52:12

Answer 4

+2 A:

Django has a library for encapsulating all the database work into Python classes, so you don't have to mess with raw SQL until you have to do something really clever. Even though Django is a framework for web applications, you can use the database ORM by itself.

Josh's models would look like this in Python using Django:

from django.db import models

class Dog(models.Model):
    # Might want to look at storing birthday instead of age.
    # If you track age, you probably need another field telling
    # you when in the year age goes up by 1... and at that point,
    # you're really storing a birthday.
    name = models.CharField(max_length=64)
    age = models.IntegerField()
    genders = [
        ('M', 'Male'),
        ('F', 'Female'),
    ]
    gender = models.CharField(max_length=1, choices=genders)

class Measurement(models.Model):
    dog = models.ForeignKey(Dog, related_name="measurements")
    paws = [
        ('FL', 'Front Left'),
        ('FR', 'Front Right'),
        ('RL', 'Rear Left'),
        ('RR', 'Rear Right'),
    ]
    paw = models.CharField(max_length=2, choices=paws)
    taken_at = models.DateTimeField(default=date, auto_now_add=True)

class Measurement_Point(models.Model):
    measurement = models.ForeignKey(Measurement, related_name="data_points")
    frame = models.IntegerField()
    sensor_row = models.PositiveIntegerField()
    sensor_col = models.PositiveIntegerField()
    value = models.FloatField()

    class Meta:
        ordering = ['frame', 'sensor_row', 'sensor_col']

The id fields are created automatically.

Then you can do things like:

dog = Dog()
dog.name = "Pochi"
dog.age = 3
dog.gender = 'M'
# dog.gender will return 'M', and dog.get_gender_display() will return 'Male'
dog.save()

# Or, written another way:
dog = Dog.objects.create(name="Fido", age=3, sex='M')

To take a measurement:

measurement = dog.measurements.create(paw='FL')
for frame in range(248):
    for row in range(255):
        for col in range(63):
            measurement.data_points.create(frame=frame, sensor_row=row, 
                sensor_col=col, value=myData[frame][row][col])

Finally, to get a frame:

# For the sake of argument, assuming the dogs have unique names.
# If not, you'll need some more fields in the Dog model to disambiguate.
dog = Dog.objects.get(name="Pochi", sex='M')
# For example, grab the latest measurement...
measurement = dog.measurements.all().order_by('-taken_at')[0]
# `theFrameNumber` has to be set somewhere...
theFrame = measurement.filter(frame=theFrameNumber).values_list('value')

Note: this will return a list of tuples (e.g. [(1.5,), (1.8,), ... ]), since values_list() can retrieve multiple fields at once. I'm not familiar with NumPy, but I'd imagine it's got a function similar to Matlab's reshape function for remapping vectors to matrices.

Mike DeSimone 2010-09-17 21:20:38

+1 for using an excellent model! Kidding... this is a great answer and will probably be very helpful to Ivo!

Josh 2010-09-17 21:52:15

Answer 5

A:

I've benefited a lot from the sqlalchemy package; it is an Object Relational Mapper. What this means is that you can create a very clear and distinct separation between your objects and your data:

SQL databases behave less like object collections the more size and performance start to matter; object collections behave less like tables and rows the more abstraction starts to matter. SQLAlchemy aims to accommodate both of these principles.

You can create a objects representing your different nouns (Dog, Measurement, Plate, etc). Then you create a table via sqlalchemy constructs which will contain all the data that you want to associate with, say, a Dog object. Finally you create a mapper between the Dog object and the dog_table.

This is difficult to understand without an example, and I will not reproduce one here. Instead, please start by reading this case study and then study this tutorial.

Once you can think of your Dogs and Measurements as you do in the real world (that is, the objects themselves) you can start factoring out the data that makes them up.

Finally, try not to marry your data with an specific format (as you do currently by using numpy arrays). Instead, you can think of the simple numbers and then transform them on demand to the specific format your current application demands (along the lines of a Model-View-Controller paradigm).

Good luck!

Arrieta 2010-09-17 21:30:01

Answer 6

A:

From your description, I would highly recommend looking into PyTables. It's not a relational database in the traditional sense, it has most of the features you're likely to be using (e.g. querying) while allowing easy storage of large, multidimensional datasets and their attributes. As an added bonus, it's tightly integrated with numpy.

Joe Kington 2010-09-17 21:33:03

ansaurus

tags:

views:

answers:

How to insert arrays into a database?

related questions