The basic table schema looks something like this (I'm using MySQL BTW):
integer unsigned vector-id
integer unsigned fk-attribute-id
float attribute-value
primary key (vector-id,fk-attribute-id)
Each vector is represented as multiple records in the table that share the same vector-id.
I need to build a separate table with the dot product (and also the Euclidean distance) of every pair of vectors that exist in this table. So, I need a result table that looks like this:
integer unsigned fk-vector-id-a
integer unsigned fk-vector-id-b
float dot-product
...and one like this...
integer unsigned fk-vector-id-a
integer unsigned fk-vector-id-b
float euclidean-distance
What is the best query structure to produce my result?
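To make the question concrete, here is a minimal sketch of one possible query shape, not a definitive answer. It makes several assumptions: SQLite (through Python's sqlite3 module) stands in for MySQL so the example is self-contained, the table name vector_attribute is hypothetical, the hyphens in the column names are replaced by underscores, and an attribute missing from a vector is treated as 0. The dot product is a self-join on the attribute id; the squared distance is derived from it with the identity |a - b|^2 = |a|^2 + |b|^2 - 2(a . b), so pairs with no attribute in common are still covered.

```python
import math
import sqlite3

# Assumption: SQLite stands in for MySQL; table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE vector_attribute (
        vector_id       INTEGER NOT NULL,
        fk_attribute_id INTEGER NOT NULL,
        attribute_value REAL    NOT NULL,
        PRIMARY KEY (vector_id, fk_attribute_id)
    )
""")
conn.executemany(
    "INSERT INTO vector_attribute VALUES (?, ?, ?)",
    [(1, 1, 1.0), (1, 2, 2.0), (2, 1, 3.0), (2, 3, 4.0), (3, 4, 5.0)],
)

# Squared norm of every vector.
NORM_SQL = """
    SELECT vector_id, SUM(attribute_value * attribute_value) AS sq
    FROM vector_attribute
    GROUP BY vector_id
"""

# Dot product of every pair that shares at least one attribute (a < b only).
DOT_SQL = """
    SELECT a.vector_id AS id_a, b.vector_id AS id_b,
           SUM(a.attribute_value * b.attribute_value) AS dot
    FROM vector_attribute AS a
    JOIN vector_attribute AS b
      ON a.fk_attribute_id = b.fk_attribute_id
     AND a.vector_id < b.vector_id
    GROUP BY a.vector_id, b.vector_id
"""

# All pairs; the LEFT JOIN keeps pairs without a shared attribute (dot = 0).
PAIR_SQL = f"""
    SELECT na.vector_id, nb.vector_id,
           COALESCE(d.dot, 0.0),
           na.sq + nb.sq - 2 * COALESCE(d.dot, 0.0)
    FROM ({NORM_SQL}) AS na
    JOIN ({NORM_SQL}) AS nb ON na.vector_id < nb.vector_id
    LEFT JOIN ({DOT_SQL}) AS d
      ON d.id_a = na.vector_id AND d.id_b = nb.vector_id
    ORDER BY na.vector_id, nb.vector_id
"""

# The square root is taken in Python because SQRT is not guaranteed in SQLite;
# max() guards against tiny negative values caused by floating-point rounding.
pair_results = {
    (id_a, id_b): (dot, math.sqrt(max(dist_sq, 0.0)))
    for id_a, id_b, dot, dist_sq in conn.execute(PAIR_SQL)
}
```

In MySQL the same SELECT could presumably feed an INSERT ... SELECT into the two result tables, with SQRT applied in the query itself. Each unordered pair is stored once (a < b), which halves the result size.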
With very large vectors, is a relational database the best approach to this problem, or should I load the vectors into an application and do the calculation there?
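For comparison, the in-application alternative could look roughly like the sketch below. It assumes the rows have already been fetched as (vector-id, fk-attribute-id, attribute-value) tuples, that a missing attribute counts as 0, and that all vectors fit in memory; the sample rows and the function names are hypothetical.

```python
import math
from collections import defaultdict
from itertools import combinations

# Hypothetical rows, as they might come back from a SELECT on the source table.
fetched_rows = [(1, 1, 1.0), (1, 2, 2.0), (2, 1, 3.0), (2, 3, 4.0)]

# Group the rows into one sparse vector (attribute id -> value) per vector id.
sparse_vectors = defaultdict(dict)
for vector_id, attribute_id, value in fetched_rows:
    sparse_vectors[vector_id][attribute_id] = value


def dot_product(u, v):
    """Sum of products over the attributes both vectors have."""
    if len(u) > len(v):
        u, v = v, u  # iterate over the shorter vector
    return sum(value * v[attr] for attr, value in u.items() if attr in v)


def euclidean_distance(u, v):
    """Distance over the union of attributes; missing values count as 0."""
    attrs = u.keys() | v.keys()
    return math.sqrt(sum((u.get(k, 0.0) - v.get(k, 0.0)) ** 2 for k in attrs))


# One entry per unordered pair (a < b): (dot product, Euclidean distance).
app_results = {
    (a, b): (
        dot_product(sparse_vectors[a], sparse_vectors[b]),
        euclidean_distance(sparse_vectors[a], sparse_vectors[b]),
    )
    for a, b in combinations(sorted(sparse_vectors), 2)
}
```

Either way, n vectors produce n(n-1)/2 pairs, so the size of the result table grows quadratically regardless of where the arithmetic is done.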