ansaurus

Question

Storing matrices in a relational database

Answer 1

+2 A:

Is your matrix dense or sparse? If it's sparse, it may be better for each entry to just store a list of hits, rather than have a full 2D table that is mostly 0s.
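As a rough sketch of the "list of hits" idea, the snippet below keeps only the non-zero cells of a small matrix. The names `to_hits` and `cell` and the sample data are hypothetical, chosen only for illustration.

```python
# Hypothetical sparse representation: only non-zero cells ("hits") are kept.
def to_hits(dense):
    """Return {(row, col): value} for every non-zero cell of a 2D list."""
    return {
        (r, c): value
        for r, row in enumerate(dense)
        for c, value in enumerate(row)
        if value != 0
    }


def cell(hits, r, c, default=0):
    """Look up one cell; anything not stored is treated as the default."""
    return hits.get((r, c), default)


dense = [
    [0, 0, 3],
    [0, 0, 0],
    [7, 0, 0],
]
hits = to_hits(dense)  # two entries instead of nine
```

Each stored hit maps naturally onto one database row, so a mostly empty matrix costs only as many rows as it has non-zero cells.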

anon 2010-01-26 20:57:42

The matrices will be dynamic and editable, so flexibility is a must.

Jimmy 2010-01-26 21:07:39

Answer 2

+2 A:

Rather than two tables, I would just use one table: (x, y, outcome). Beyond that it's hard to give any more advice with the limited information given.
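A minimal sketch of the single-table layout, using Python's built-in sqlite3 as a stand-in for whatever database is actually in use. The table name `outcomes`, the text-valued keys, and the helper functions are assumptions made for the example, not something stated in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE outcomes (
        x       TEXT NOT NULL,
        y       TEXT NOT NULL,
        outcome TEXT,
        PRIMARY KEY (x, y)
    )
    """
)


def set_outcome(x, y, outcome):
    """Insert a cell, or overwrite it if (x, y) already exists."""
    conn.execute(
        "INSERT OR REPLACE INTO outcomes (x, y, outcome) VALUES (?, ?, ?)",
        (x, y, outcome),
    )


def get_outcome(x, y):
    """Return the stored outcome, or None if the cell was never set."""
    row = conn.execute(
        "SELECT outcome FROM outcomes WHERE x = ? AND y = ?", (x, y)
    ).fetchone()
    return row[0] if row else None


set_outcome("rock", "paper", "lose")
set_outcome("rock", "paper", "loss")  # editing a cell is just another write
```

Note that `INSERT OR REPLACE` is SQLite syntax; other databases spell the same upsert differently.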

Tom H. 2010-01-26 20:58:47

This could be possible, but I feel it would be easier to code with multiple tables representing one thing, and to be able to easily tie each element to a human-readable representation.

Jimmy 2010-01-26 21:07:15

I'm not sure how having a random ID number makes things more readable

Tom H. 2010-01-27 03:58:39

Answer 3

A:

In video memory, a very simple 2D matrix such as

ABCD
EFGH
IJKL

is stored sequentially in RAM, like an array, as

A,B,C,D,E,F,G,H,I,J,K,L

element x,y can be found at array offset

[y*width+x]

for instance, x=2,y=2 (zero-based) refers to element K.

[y*width+x]=[2*4+2]=10. array element 10 (again zero-based) = K, so you're good.
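The offset arithmetic above can be sketched in a few lines. The function names `offset` and `coords` are made up for this example; the data is the same 4-wide matrix.

```python
WIDTH = 4
flat = list("ABCDEFGHIJKL")  # the 3x4 matrix above, stored row by row


def offset(x, y, width=WIDTH):
    """Map zero-based (x, y) to an index in the flat array."""
    return y * width + x


def coords(index, width=WIDTH):
    """Inverse mapping: flat index back to zero-based (x, y)."""
    y, x = divmod(index, width)
    return x, y


k = flat[offset(2, 2)]  # element at x=2, y=2
```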

Storing in a comma-delimited list will let you put a matrix of any size in an nvarchar field. This assumes that you don't need to query individual cells in SQL, but just grab the matrix as a whole and process it client-side.

Your table may look like this:

tbl_matrices
----
id
user_id
matrix nvarchar(max)

Also, this works very well if your matrices are dense; with sparse matrices you'll wind up with a lot of empty ,,,,, elements. You can work around that as well, though.
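Here is a minimal sketch of packing a matrix into the comma-delimited string and unpacking it client-side. It assumes the width is kept somewhere alongside the string (the table above has no such column, so that part is an assumption), and that cell values never contain commas. `pack` and `unpack` are hypothetical names.

```python
def pack(matrix):
    """Flatten a rectangular 2D list into (width, 'a,b,c,...')."""
    width = len(matrix[0])
    if any(len(row) != width for row in matrix):
        raise ValueError("all rows must have the same width")
    text = ",".join(str(value) for row in matrix for value in row)
    return width, text


def unpack(width, text):
    """Rebuild the 2D list; every cell comes back as a string."""
    flat = text.split(",")
    return [flat[i:i + width] for i in range(0, len(flat), width)]


width, text = pack([["A", "B", "C", "D"],
                    ["E", "F", "G", "H"],
                    ["I", "J", "K", "L"]])
matrix = unpack(width, text)
```

Empty cells round-trip as empty strings, which is where the runs of commas come from.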

David Lively 2010-01-26 21:06:20

But only do this if you are SURE that you will not need to query individual cells.

HLGEM 2010-01-26 21:09:40

This is a very un-normalized technique. It appears very flexible at first, but in fact, virtually anything that you might want to do with it in SQL down the road becomes more and more difficult to implement.

RBarryYoung 2010-01-26 21:44:24

... thus the phrase, "assumes you don't need to query individual cells in SQL." If the matrices in question need to be manipulated mathematically, etc. by a client, and don't need to be queryable in the strictest sense on the SQL side, this approach works well. If the need should arise to normalize each matrix, this can be accomplished very easily with a throw-away client app. Also, a CLR UDF or sproc can be used to separate this data and perform whatever is necessary.

David Lively 2010-01-26 21:46:34

Answer 4

+2 A:

There are lots of ways to do this, and we would need a lot more information to be more specific about what would be best for you. However, here are the two SOP ways:

Either a separate table for each matrix:

CREATE TABLE YourMatrixName(
    RowNo smallint NOT NULL,
    ColNo smallint NOT NULL,
    CellValue varchar(50) NULL,
 CONSTRAINT [PK_YourMatrixName] PRIMARY KEY CLUSTERED 
    ([RowNo] ASC, [ColNo] ASC)
) ON [PRIMARY];
GO

CREATE UNIQUE NONCLUSTERED INDEX IX_YourMatrixName ON dbo.YourMatrixName
    (ColNo, RowNo); 
GO

Or, all of the matrices in one table:

CREATE TABLE Matrices(
    MatrixName varchar(24) NOT NULL,
    RowNo smallint NOT NULL,
    ColNo smallint NOT NULL,
    CellValue varchar(50) NULL,
 CONSTRAINT [PK_Matrices] PRIMARY KEY CLUSTERED 
    ([MatrixName] ASC, [RowNo] ASC, [ColNo] ASC)
) ON [PRIMARY];
GO

CREATE UNIQUE NONCLUSTERED INDEX IX_Matrices ON dbo.Matrices
    (MatrixName, ColNo, RowNo); 
GO

These are in standard normal form; virtually all other ways of doing it are not well normalized. Some advantages of these approaches:

You do not have to fill in every cell, only the ones you are using. Alternatively, pick a default value (0 or "") and skip the cells that hold it.
This is easily the most flexible approach. Even in the "all in one" model, there is no need to restrict the matrices to the same size in any way, and it is very easy to resize them.
You can easily query the contents of the matrix, something that is increasingly difficult in more compact storage methods.
"Hit"s or any other aspect of the matrix cells are easy to implement as additional fields in the rows. Make them Null-able if you're worried about the additional space, and index them if you want to query/report on these attributes separately. It's also just as easy to retrofit features like this onto this model later.

The primary disadvantage is that there is typically a high space-to-data overhead. Many assume that there is also a high overhead to insert or retrieve matrices, but in fact there are several documented techniques that can make it quite fast.
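As a rough illustration of the advantages listed above, here is the "all in one" table exercised through Python's built-in sqlite3 instead of SQL Server, so the column types differ from the DDL above. The `load` helper, the sample data, and the choice to infer a matrix's size from its largest stored index are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE Matrices (
        MatrixName TEXT    NOT NULL,
        RowNo      INTEGER NOT NULL,
        ColNo      INTEGER NOT NULL,
        CellValue  TEXT,
        PRIMARY KEY (MatrixName, RowNo, ColNo)
    )
    """
)

# Store only the cells in use; matrices "A" and "B" need not share a size.
conn.executemany(
    "INSERT INTO Matrices VALUES (?, ?, ?, ?)",
    [("A", 0, 0, "x"), ("A", 2, 3, "y"), ("B", 0, 0, "x")],
)


def load(name, default=""):
    """Rebuild one matrix client-side, filling unstored cells with default."""
    rows = conn.execute(
        "SELECT RowNo, ColNo, CellValue FROM Matrices WHERE MatrixName = ?",
        (name,),
    ).fetchall()
    if not rows:
        return []
    height = max(r for r, _, _ in rows) + 1
    width = max(c for _, c, _ in rows) + 1
    grid = [[default] * width for _ in range(height)]
    for r, c, value in rows:
        grid[r][c] = value
    return grid


# Querying by cell content works across every stored matrix.
matches = conn.execute(
    "SELECT MatrixName, RowNo, ColNo FROM Matrices "
    "WHERE CellValue = ? ORDER BY MatrixName",
    ("x",),
).fetchall()

# Retrofitting a per-cell attribute is a single nullable column.
conn.execute("ALTER TABLE Matrices ADD COLUMN Hits INTEGER")
```

Inferring dimensions from the stored cells loses trailing empty rows and columns, so a real schema would probably record each matrix's size separately.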

RBarryYoung 2010-01-26 21:18:46
