Hi,

I'm working on an application that analyzes music similarity. To do that, I process audio data and store the results in txt files. For each audio file I create 2 files: one containing 16 values (each value looks like 2.7000023942731723), and the other containing 16 rows of 16 such values each.

I'd like to store the contents of these 2 files in a table of my MySQL database.

My table looks like:

Name varchar(100)
Author varchar (100)

In order to add the content of those 2 files, I think I need to use the BLOB data type:

file1 blob
file2 blob

My question is: how should I store this info in the database? I'm working in Java, where I have a double array containing the 16 values (for file1) and a matrix containing the file2 info. Should I process the values as strings and add them to the columns in my database?

Thanks

A: 

Do you need to query the data (say for all the values that are bigger than 2.7) or just store it (you always load the whole file from the database)?

Given the information in the comment, I would save the files in a BLOB or TEXT column, as suggested in the other answers. You don't even need a line delimiter, since integer division and a modulus operation on each value's position in the list give you its row and column in the matrix.
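
A minimal sketch of that idea, assuming the 16x16 matrix from the question is stored row by row as one flat list of 256 values. The class and method names (FlatMatrix, flatten, unflatten) are made up for illustration:

```java
class FlatMatrix {
    static final int SIZE = 16; // the matrix in the question is 16x16

    // Copy the matrix row by row into one array of SIZE * SIZE values.
    static double[] flatten(double[][] matrix) {
        double[] flat = new double[SIZE * SIZE];
        for (int row = 0; row < SIZE; row++) {
            System.arraycopy(matrix[row], 0, flat, row * SIZE, SIZE);
        }
        return flat;
    }

    // Rebuild the matrix: position i belongs to row i / SIZE, column i % SIZE.
    static double[][] unflatten(double[] flat) {
        double[][] matrix = new double[SIZE][SIZE];
        for (int i = 0; i < flat.length; i++) {
            matrix[i / SIZE][i % SIZE] = flat[i];
        }
        return matrix;
    }

    public static void main(String[] args) {
        double[][] matrix = new double[SIZE][SIZE];
        matrix[3][5] = 2.7000023942731723;
        double[][] restored = unflatten(flatten(matrix));
        System.out.println(restored[3][5] == matrix[3][5]);
    }
}
```

Because the row length is fixed at 16, no delimiter between rows has to be stored.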

Timo
I just need to store it. Right now this information is stored in 2 .txt files. When I need to read these files I always need to read all the values. I just want to know how to do that using MySQL instead of text files. Thanks for your post.
dedalo
+1  A: 

I think you need to normalize a schema like this if you intend to keep it in a relational database.

Sounds like you have a matrix table that has a one-to-many relationship with its files.

If you insist on one denormalized table, one way to do it would be to store the name of the file, its author, the name of its matrix, and its row and column position in the named matrix that owns it.

Please clarify one thing: Is this a matrix in the linear algebra sense? A mathematical entity?

If yes, and you only use the matrix in its entirety, then maybe you can store it in a single column as a blob. That still forces you to serialize and deserialize to a string or blob every time it goes into and comes out of the database.
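
To give a sense of what that serialize/deserialize step might look like in Java, here is a rough sketch that packs the doubles into a byte array with java.nio.ByteBuffer and unpacks them again. The class and method names (MatrixBytes, toBytes, fromBytes) are hypothetical, and the fixed 8-bytes-per-value layout is an assumption, not something the question specifies:

```java
import java.nio.ByteBuffer;

class MatrixBytes {
    // Pack each value into 8 bytes (big-endian, which is ByteBuffer's default order).
    static byte[] toBytes(double[] values) {
        ByteBuffer buffer = ByteBuffer.allocate(values.length * Double.BYTES);
        for (double v : values) {
            buffer.putDouble(v);
        }
        return buffer.array();
    }

    // Read the values back in the same order they were written.
    static double[] fromBytes(byte[] bytes) {
        ByteBuffer buffer = ByteBuffer.wrap(bytes);
        double[] values = new double[bytes.length / Double.BYTES];
        for (int i = 0; i < values.length; i++) {
            values[i] = buffer.getDouble();
        }
        return values;
    }

    public static void main(String[] args) {
        double[] file1 = new double[16];
        file1[0] = 2.7000023942731723;
        byte[] blob = toBytes(file1);
        System.out.println(blob.length + " bytes, first value intact: "
                + (fromBytes(blob)[0] == file1[0]));
    }
}
```

With JDBC, a byte array like this can be bound to a BLOB column with PreparedStatement.setBytes and read back with ResultSet.getBytes. The 16 values take 128 bytes this way, and a flattened 16x16 matrix takes 2048 bytes.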

duffymo
I don't fully understand you. Each row in the table contains the name of a file and its author, and I want to add the extracted information that is currently stored in .txt files.
dedalo
OK, imagine I just want to store a 16x16 matrix in a column of a table. How could I do it? What data type should I use when I create this column in MySQL?
dedalo
+1 I was going to suggest a blog also. That just seems like a lot of processing for something very simple. It might just be easier to convert the matrix to a csv string and store that.
Jim Schubert
@Jim Schubert, I would discourage using a blob if there is any use case that requires examining its contents. My first thought when I read "matrix" and "blob" was that it was a bad idea, and I discouraged it. Feel free to remove your up vote if I've earned it on false pretenses.
duffymo
@duffymo: blog => blob. sorry about that. I think your answer is pretty accurate.
Jim Schubert
+1  A: 

Hope I don't get negative repped into oblivion with this crazy answer, but I am trying to think outside the box. My first question is, how are you processing this data after a potential query? If I were doing something similar, I would likely use something like MATLAB or Octave, which have a specific notation for representing matrices. It is basically a bunch of comma- and semicolon-delimited text with square brackets at the right spots. I would store just a string that my mathematics software or module can parse natively. After all, it doesn't sound like you want to do some kind of query based on a data point.

Dave
I'm working with Java. This data is used to perform mathematical operations. I was wondering if there is any way to store a matrix (16x16) in a column of the table.
dedalo
I would still store it in an easily-parsed format, as a string. Maybe use spaces to separate the columns, and semicolons to separate the rows. This way, you don't have number issues with dot, comma, and engineering notation. And it's a simple matter of using split to get the rows, and then split on those rows to get the cells.
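
A minimal sketch of that format in Java, with spaces between columns and semicolons between rows. The class and method names (MatrixText, format, parse) are made up for illustration:

```java
class MatrixText {
    // Build "a b c;d e f": spaces separate columns, semicolons separate rows.
    static String format(double[][] matrix) {
        StringBuilder sb = new StringBuilder();
        for (int r = 0; r < matrix.length; r++) {
            if (r > 0) sb.append(';');
            for (int c = 0; c < matrix[r].length; c++) {
                if (c > 0) sb.append(' ');
                // append(double) uses Double.toString, which always writes a dot
                sb.append(matrix[r][c]);
            }
        }
        return sb.toString();
    }

    // Split on semicolons to get the rows, then on spaces to get the cells.
    static double[][] parse(String text) {
        String[] rows = text.split(";");
        double[][] matrix = new double[rows.length][];
        for (int r = 0; r < rows.length; r++) {
            String[] cells = rows[r].split(" ");
            matrix[r] = new double[cells.length];
            for (int c = 0; c < cells.length; c++) {
                matrix[r][c] = Double.parseDouble(cells[c]);
            }
        }
        return matrix;
    }

    public static void main(String[] args) {
        double[][] matrix = {{1.5, 2.0}, {3.25, -4.0}};
        System.out.println(format(matrix)); // prints "1.5 2.0;3.25 -4.0"
    }
}
```

The resulting string would fit in a TEXT column and stays readable when you look at the row in the database.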
Dave
Anything can be parsed, but if you're doing a lot of operations with the matrix, why would you want to spend CPU cycles building strings on the way in and parsing on the way out? The alternative means JOINs, but at least the data is still in its native type. What if you wanted an operation with a particular row or column? A JOIN makes that easy; a BLOB makes it a pain.
duffymo
Unfortunately, I can't comment on the cost of doing joins to operate on the native data vs. doing the string to double conversions. I'm all ears! For me, the simplicity of understanding the data by just looking at a field far outweighs the benefits of selecting a bunch of SQL queries just to avoid the conversions. If you want a particular row, then you write a function to give you the desired row. Not elegant, but it's understandable to pretty much anyone that can write simple code.
Dave
A: 

I think the problem that dedalo is facing is that he's working with arrays (I assume one is jagged, one is multi-dimensional) and he wants to serialize these to blob.

But, arrays aren't directly serializable so he's asking how to go about doing this.

The simplest way to go about it would be to loop through the array and build a string as Dave suggested and store the string. This would allow you to view the contents from the value in the database instead of deserializing the data whenever you need to inspect it, as duffymo points out.

If you'd like to know how to serialize the array into BLOB...(this just seems like overkill)

You are able to serialize one-dimensional arrays and jagged arrays (in Java, a double[][] is an array of arrays, so it counts as jagged), e.g.:

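import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class Test {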
    public static void main(String[] args) throws Exception {

        // Serialize an int[]
        ObjectOutputStream out = new ObjectOutputStream(new FileOutputStream("test.ser"));
        out.writeObject(new int[] {0, 1, 2, 3, 4, 5, 6, 7, 8, 9});
        out.flush();
        out.close();

        // Deserialize the int[]
        ObjectInputStream in = new ObjectInputStream(new FileInputStream("test.ser"));
        int[] array = (int[]) in.readObject();
        in.close();

        // Print out contents of deserialized int[]
        System.out.println("It is " + (array instanceof Serializable) + " that int[] implements Serializable");
        System.out.print("Deserialized array: " + array[0]);
        for (int i=1; i<array.length; i++) {
            System.out.print(", " + array[i]);
        }
        System.out.println();
    }
}

As for what data type to store it as in MySQL, there are only four BLOB types to choose from: TINYBLOB, BLOB, MEDIUMBLOB, and LONGBLOB.

Choosing the best one depends on the size of the serialized object. I'd imagine BLOB would be good enough.
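
One way to check the size is to serialize the matrix to memory instead of a file and look at the length of the resulting byte array. This is only a sketch. The class and method names (SerializedSize, serialize, deserialize) are made up, and the limits in the comments are the documented MySQL maximums of 255 bytes for TINYBLOB and 65,535 bytes for BLOB:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;

class SerializedSize {
    // Serialize the matrix into an in-memory byte array.
    static byte[] serialize(double[][] matrix) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(matrix);
        }
        return bytes.toByteArray();
    }

    // Rebuild the matrix from the serialized bytes.
    static double[][] deserialize(byte[] data) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return (double[][]) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        byte[] data = serialize(new double[16][16]);
        System.out.println("Serialized size: " + data.length + " bytes");
        System.out.println("Fits in TINYBLOB (max 255 bytes): " + (data.length <= 255));
        System.out.println("Fits in BLOB (max 65,535 bytes): " + (data.length <= 65_535));
    }
}
```

The 256 doubles alone take 2048 bytes, so TINYBLOB is too small, while BLOB leaves plenty of room for the serialization overhead.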

Jim Schubert
Hi, at the moment this information is stored in txt files. This works just fine. However, my manager wants me to try storing it in a MySQL table. At first I thought about doing it as Dave suggested, but I'd like to know if it is possible to add the file to the database or if there is any other way of doing this. Thanks
dedalo
This code shows you how to serialize the array. You should be able to write the Stream object to the table as you would with any other SQL, you just have to specify the data type as BLOB. If it's a matter of how to get a byte array into the database, check out http://www.java2s.com/Code/Java/Database-SQL-JDBC/InsertpicturetoMySQL.htm I guess your question was a little hard to understand.
Jim Schubert