This is not a question about using another tool. It is not a question about using a different data structure. It is a question about WHY I see what I see -- please read to the end before answering. Thank you.
THE STORY
I have a single table with one constraint: records are never deleted. Instead, a record is marked as not active (there is a field for that), and in that case all fields (except the identifiers and the isActive field) are considered irrelevant.
More on identifiers -- there are two fields:
- id -- int, primary key, clustered
- name -- unique, varchar, external index
How an update is done, for example (I use C#/Linq/MSSQL2005): I fetch the record based on name, then change the required fields and commit the changes, so the update is executed (UPDATE uses id, not name).
However, there is a problem with storage. So why not break this table into a dual structure -- a "header" table (id, name, isActive) and a data table (id, rest of the fields)? In case of a problem with storage we can delete records from the data table for real (those with isActive=false).
edit (by Shimmy): header+data are not retrieved by LINQ with join. Data records are loaded on demand (and this always occurs because of the code).
comment (by poster): AFAIR there is no join, so this is irrelevant. Data for headers were loaded manually. See below.
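The two layouts described above can be sketched as follows. This is only an illustration under assumptions: SQLite (via Python's sqlite3) stands in for MSSQL 2005, and the `payload` column is a hypothetical placeholder for "the rest of the fields". It shows the mono table, the header/data split, and the real delete of data rows for inactive records.

```python
import sqlite3

# Assumption: SQLite stands in for MSSQL 2005; "payload" is a hypothetical
# placeholder for all fields other than id, name and isActive.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Mono layout: everything in one table.
    CREATE TABLE mono (
        id       INTEGER PRIMARY KEY,
        name     TEXT NOT NULL UNIQUE,
        isActive INTEGER NOT NULL,
        payload  TEXT
    );
    -- Dual layout: identifiers and the flag in the header, the rest in data.
    CREATE TABLE header (
        id       INTEGER PRIMARY KEY,
        name     TEXT NOT NULL UNIQUE,
        isActive INTEGER NOT NULL
    );
    CREATE TABLE data (
        id      INTEGER PRIMARY KEY REFERENCES header(id),
        payload TEXT
    );
""")

# Six sample records; odd ids are active, even ids are inactive.
rows = [(i, f"rec{i}", i % 2) for i in range(1, 7)]
conn.executemany("INSERT INTO header VALUES (?, ?, ?)", rows)
conn.executemany("INSERT INTO data VALUES (?, ?)",
                 [(i, "x" * 100) for i, _, _ in rows])

# Reclaim storage: really delete the data rows of inactive records.
# The header rows stay, so id and name remain reserved.
conn.execute(
    "DELETE FROM data WHERE id IN (SELECT id FROM header WHERE isActive = 0)"
)

headers_left = conn.execute("SELECT COUNT(*) FROM header").fetchone()[0]
data_left = conn.execute("SELECT COUNT(*) FROM data").fetchone()[0]
print(headers_left, data_left)
```

All six header rows survive the purge, while only the data rows of the three active records remain.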
Performance -- (my) Theory
Now, what about performance? Which one will be faster? Let's say you have 10000 records in each table (single, header, data) and you update them one by one (all 3 tables) -- the isActive field and some field from the "data" fields.
My calculation was/is:
mono table -- search using external index, then jumping into the structure, fetching all the data, update using primary key.
dual tables -- search using external index, jumping into the header table, fetching all the data, search using primary key on data table (no jumping here, it is clustered index), fetching all the data, update both tables using primary keys.
So, for me the mono structure should be faster, because in the dual case I have the same operations plus some extras.
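To make the reasoning above concrete, here is a rough sketch of the per-record steps for each layout. The assumptions are the same as before: SQLite replaces MSSQL 2005, `payload` is a hypothetical data field, and the functions `update_mono` and `update_dual` are made up for illustration. It only counts statements issued, which is the basis of my calculation; it says nothing about what each statement actually costs in pages read or written.

```python
import sqlite3

# Assumption: SQLite stands in for MSSQL 2005; "payload" is hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE mono   (id INTEGER PRIMARY KEY, name TEXT UNIQUE,
                         isActive INTEGER, payload TEXT);
    CREATE TABLE header (id INTEGER PRIMARY KEY, name TEXT UNIQUE,
                         isActive INTEGER);
    CREATE TABLE data   (id INTEGER PRIMARY KEY, payload TEXT);
    INSERT INTO mono   VALUES (1, 'rec1', 1, 'old');
    INSERT INTO header VALUES (1, 'rec1', 1);
    INSERT INTO data   VALUES (1, 'old');
""")

def update_mono(conn, name, payload):
    """Fetch by name, then update by primary key. Returns statements issued."""
    rec_id, _, _ = conn.execute(
        "SELECT id, isActive, payload FROM mono WHERE name = ?", (name,)
    ).fetchone()
    conn.execute(
        "UPDATE mono SET isActive = 0, payload = ? WHERE id = ?",
        (payload, rec_id),
    )
    return 2

def update_dual(conn, name, payload):
    """Fetch header by name, fetch data by id, update both by primary key."""
    rec_id, _ = conn.execute(
        "SELECT id, isActive FROM header WHERE name = ?", (name,)
    ).fetchone()
    conn.execute("SELECT payload FROM data WHERE id = ?", (rec_id,)).fetchone()
    conn.execute("UPDATE header SET isActive = 0 WHERE id = ?", (rec_id,))
    conn.execute("UPDATE data SET payload = ? WHERE id = ?", (payload, rec_id))
    return 4

mono_statements = update_mono(conn, "rec1", "new")
dual_statements = update_dual(conn, "rec1", "new")
print(mono_statements, dual_statements)
```

Counted this way, the dual layout issues twice as many statements per record, which is why I expected it to be slower.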
The results
No matter what I do (update, select, insert), the dual structure is anywhere from slightly faster to up to 30% faster. And now I am all puzzled -- I would understand it if I were inserting/updating/selecting only header records, but in every case data records are used as well.
The question -- why/how can the dual structure be faster?