Hey guys,

I am working on a project for a client and going through the initial database design. The project will be a simple web app for tracking processes and their outcomes within a matrix diagram, and I am looking for a good way to store these in relational tables.

Right now I am thinking of a general table for Routines that the x and y coords will map to, and maybe, off of that, a lookup table containing the IDs of the coordinates at which a "hit" is recorded. Does anyone have a better way of doing this?

Thanks!

EDIT:

This is just the beginning of the project, so I have limited detail as of yet, but my main reasoning behind multiple tables is that the matrices will be completely dynamic in size and generic, so each one may be different, and they will be tied to a user.

I also forgot to mention that the order of the x/y values is important, which further supports my reasoning for having separate tables for x, y, and the values. From this I strongly assume that knowing each individual cell is important.

EXAMPLE:

The basic (albeit abstract) example is the process of visiting a restaurant. The actions are things like sit down, order food, look over menu, order drinks, eat, pay, etc.; the outcomes are order taken, drinks delivered, food delivered, change given. While seemingly simple, it becomes complex once you consider that things happen differently on each occasion, and differently again for take-out or buffets. The order of the actions and outcomes becomes integral to seeing the differences between the situations.

+2  A: 

Is your matrix dense or sparse? If it's sparse, it may be better for each entry to just store a list of hits, rather than have a full 2D table that is mostly 0s.
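To make the sparse idea concrete, here is a minimal Python sketch. It assumes a hit is fully described by its (x, y) position; the names `hits`, `is_hit` and `to_dense` and the sample coordinates are made up for illustration.

```python
# Hypothetical sparse representation: keep only the coordinates with a hit.
hits = {(0, 2), (1, 0), (3, 3)}  # (x, y) pairs; illustrative data only


def is_hit(x, y):
    """Return True if cell (x, y) has a recorded hit."""
    return (x, y) in hits


def to_dense(width, height):
    """Expand the sparse set into a full grid of 0/1, indexed as grid[y][x]."""
    return [[1 if (x, y) in hits else 0 for x in range(width)]
            for y in range(height)]


grid = to_dense(4, 4)
```

Storage grows with the number of hits rather than with width times height, and the full grid is only built when something actually needs to display it.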

anon
The matrices will be dynamic and editable, so flexibility is a must.
Jimmy
+2  A: 

Rather than two tables, I would just use one table: (x, y, outcome). Beyond that it's hard to give any more advice with the limited information given.
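A rough sketch of that single table, using Python's built-in sqlite3 module as a stand-in for whatever database the project ends up on. The table name `matrix_cells`, the column types and the sample rows are assumptions for illustration only.

```python
import sqlite3

# In-memory SQLite database, used here only to make the sketch runnable.
conn = sqlite3.connect(":memory:")

# One row per cell; (x, y) identifies the cell, so it serves as the key.
conn.execute("""
    CREATE TABLE matrix_cells (
        x       INTEGER NOT NULL,
        y       INTEGER NOT NULL,
        outcome TEXT,
        PRIMARY KEY (x, y)
    )
""")

# Illustrative data only.
conn.executemany(
    "INSERT INTO matrix_cells (x, y, outcome) VALUES (?, ?, ?)",
    [(0, 0, "order taken"), (1, 2, "drinks delivered")],
)

# Look up a single cell by its coordinates.
row = conn.execute(
    "SELECT outcome FROM matrix_cells WHERE x = ? AND y = ?", (1, 2)
).fetchone()
```

If there will be more than one matrix, a matrix identifier would have to be added to the key as well.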

Tom H.
This could be possible, but I feel it would be easier to code with multiple tables each representing one thing, and to be able to easily tie each element to a human-readable representation.
Jimmy
I'm not sure how having a random ID number makes things more readable
Tom H.
A: 

In video memory, a very simple 2D matrix such as

ABCD
EFGH
IJKL

is stored in RAM sequentially, like an array:

A,B,C,D,E,F,G,H,I,J,K,L

Element (x, y) can be found at array offset

[y*width+x]

For instance, x=2, y=2 (zero-based) refers to element K:

[y*width+x]=[2*4+2]=10. Array element 10 (again zero-based) is K, so you're good.
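The same arithmetic as a small Python sketch. The helper names `offset` and `coords` are made up here; `coords` is simply the inverse mapping, which the text above does not spell out.

```python
def offset(x, y, width):
    """Row-major index of cell (x, y) in a flat array; all zero-based."""
    return y * width + x


def coords(index, width):
    """Inverse of offset(): recover (x, y) from a flat index."""
    return index % width, index // width


# The 4-wide, 3-high example matrix stored as one flat list.
flat = list("ABCDEFGHIJKL")
width = 4

element = flat[offset(2, 2, width)]  # index 10, which holds "K"
```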

Storing in a comma-delimited list will let you put a matrix of any size in an nvarchar field. This assumes that you don't need to query individual cells in SQL, but just grab the matrix as a whole and process it client-side.

Your table may look like this:

tbl_matrices
----
id
user_id
matrix nvarchar(max)

Also, this works very well if your matrices are dense; with sparse matrices you'll wind up with a lot of empty ,,,,, elements. You can work around that as well, though.
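A minimal client-side sketch of packing and unpacking such a string, in Python. One assumption that is not part of the table above: the width is written as the first field so the rows can be rebuilt, since a flat list alone does not say where each row ends. The function names are made up, and cell values must not contain commas.

```python
def pack(rows):
    """Flatten a list of equal-length rows into 'width,cell,cell,...'."""
    width = len(rows[0]) if rows else 0
    cells = [str(value) for row in rows for value in row]
    return ",".join([str(width)] + cells)


def unpack(text):
    """Rebuild the rows from a string produced by pack()."""
    fields = text.split(",")
    width = int(fields[0])
    cells = fields[1:]
    if width == 0:
        return []
    # Slice the flat cell list into consecutive rows of `width` cells.
    return [cells[i:i + width] for i in range(0, len(cells), width)]


packed = pack([["A", "B"], ["C", ""]])  # "2,A,B,C,"
restored = unpack(packed)
```

Everything comes back as a string, so numeric cells would need converting after `unpack`.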

David Lively
But only do this if you are SURE that you will not need to query individual cells.
HLGEM
This is a very un-normalized technique. It appears very flexible at first, but in fact, virtually anything that you might want to do with it in SQL down the road becomes more and more difficult to implement.
RBarryYoung
... thus the phrase, "assumes you don't need to query individual cells in SQL." If the matrices in question need to be manipulated mathematically, etc., by a client, and don't need to be queryable in the strictest sense on the SQL side, this approach works well. If the need should arise to normalize each matrix, this can be accomplished very easily with a throw-away client app. Also, a CLR UDF or sproc can be used to separate this data and perform whatever is necessary.
David Lively
+2  A: 

There are lots of ways to do this, and we would need a lot more information to be more specific about what would be best for you. However, here are the two SOP ways:

Either a separate table for each matrix:

CREATE TABLE YourMatrixName(
    RowNo smallint NOT NULL,
    ColNo smallint NOT NULL,
    CellValue varchar(50) NULL,
 -- One row per cell; constraint names must be unique, so name it after the table
 CONSTRAINT [PK_YourMatrixName] PRIMARY KEY CLUSTERED 
    ([RowNo] ASC, [ColNo] ASC)
) ON [PRIMARY];
GO

CREATE UNIQUE NONCLUSTERED INDEX IX_YourMatrixName ON dbo.YourMatrixName
    (ColNo, RowNo); 
GO

Or, all of the matrices in one table:

CREATE TABLE Matrices(
    MatrixName varchar(24) NOT NULL,
    RowNo smallint NOT NULL,
    ColNo smallint NOT NULL,
    CellValue varchar(50) NULL,
 -- MatrixName leads the key so each matrix has its own set of cells
 CONSTRAINT [PK_Matrices] PRIMARY KEY CLUSTERED 
    ([MatrixName] ASC, [RowNo] ASC, [ColNo] ASC)
) ON [PRIMARY];
GO

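-- MatrixName must be part of the unique index; otherwise two matrices
-- could not both use the same (ColNo, RowNo) cell.
CREATE UNIQUE NONCLUSTERED INDEX IX_Matrices ON dbo.Matrices
    (MatrixName, ColNo, RowNo);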
GO

These are in standard normal form; virtually all other ways of doing it are not well normalized. Some advantages of these approaches:

  1. You do not have to fill in every cell, only the ones you are using. Or pick a default value (0 or "") and skip the cells that hold it.
  2. This is easily the most flexible approach. Even in the "all in one" model, there is no need to restrict the matrices to the same size in any way, and it is very easy to resize them.
  3. You can easily query the contents of the matrix, something that gets increasingly difficult with more compact storage methods.
  4. "Hit"s or any other aspect of the matrix cells are easy to implement as additional fields in the rows. Make them nullable if you're worried about the additional space, and index them if you want to query/report on these attributes separately. It's also just as easy to retrofit features like this onto this model later.
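To illustrate points 1, 3 and 4, here is a rough Python sketch against the "all in one" layout. It uses the built-in sqlite3 module as a stand-in for SQL Server, so the column types are simplified; the `read_matrix` helper, the `Hits` column and the sample rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stand-in, not SQL Server
conn.execute("""
    CREATE TABLE Matrices (
        MatrixName TEXT    NOT NULL,
        RowNo      INTEGER NOT NULL,
        ColNo      INTEGER NOT NULL,
        CellValue  TEXT,
        PRIMARY KEY (MatrixName, RowNo, ColNo)
    )
""")

# Point 1: only the cells in use are stored (illustrative data).
conn.executemany("INSERT INTO Matrices VALUES (?, ?, ?, ?)", [
    ("dine_in", 0, 0, "order taken"),
    ("dine_in", 2, 1, "food delivered"),
    ("take_out", 0, 0, "order taken"),
])


def read_matrix(name, default=""):
    """Rebuild a full grid, filling unstored cells with a default value.

    The size is inferred from the stored cells, since this sketch does
    not store the matrix dimensions anywhere.
    """
    cells = conn.execute(
        "SELECT RowNo, ColNo, CellValue FROM Matrices WHERE MatrixName = ?",
        (name,),
    ).fetchall()
    if not cells:
        return []
    height = max(r for r, _, _ in cells) + 1
    width = max(c for _, c, _ in cells) + 1
    grid = [[default] * width for _ in range(height)]
    for r, c, value in cells:
        grid[r][c] = value
    return grid


# Point 3: query cell contents directly, across all matrices.
names = [n for (n,) in conn.execute(
    "SELECT DISTINCT MatrixName FROM Matrices "
    "WHERE CellValue = ? ORDER BY MatrixName",
    ("order taken",),
)]

# Point 4: retrofit an extra per-cell attribute later on.
conn.execute("ALTER TABLE Matrices ADD COLUMN Hits INTEGER")
dine_in = read_matrix("dine_in")
```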

The primary disadvantage is that there is typically a high space-to-data overhead. Many assume that there is also a high overhead to insert or retrieve whole matrices, but in fact there are several documented techniques that can make it quite fast.
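As one generic example of keeping inserts cheap, the sketch below writes a whole matrix as a single batch inside one transaction. This is a common general approach and not necessarily one of the techniques alluded to above. It uses Python's sqlite3 as a stand-in for SQL Server, and `save_matrix` is a made-up helper, so measure on your own database before relying on it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stand-in, not SQL Server
conn.execute("""
    CREATE TABLE Matrices (
        MatrixName TEXT    NOT NULL,
        RowNo      INTEGER NOT NULL,
        ColNo      INTEGER NOT NULL,
        CellValue  TEXT,
        PRIMARY KEY (MatrixName, RowNo, ColNo)
    )
""")


def save_matrix(name, grid, empty=""):
    """Replace one matrix in a single transaction, skipping empty cells."""
    rows = [
        (name, r, c, value)
        for r, line in enumerate(grid)
        for c, value in enumerate(line)
        if value != empty
    ]
    with conn:  # commits on success, rolls back if an error is raised
        conn.execute("DELETE FROM Matrices WHERE MatrixName = ?", (name,))
        conn.executemany("INSERT INTO Matrices VALUES (?, ?, ?, ?)", rows)
    return len(rows)


stored = save_matrix("buffet", [["pay", ""], ["", "eat"]])
```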

RBarryYoung