I've nearly killed myself, multiple times, trying to find the perfect, flexible schema for storing many different types of objects with a wide variety of links between them in a relational database. But models such as EAV and things like polymorphic relationships just aren't true to relational databases.
After learning about NoSQL graph databases, I realized I had been trying to fit a square peg in a round hole. (I look forward to trying one out sometime but can't right now.) Looking for a fresh approach, I found myself asking a new question: Why am I not using my code to dynamically create this flexibility?
If you have n different entities in n different tables, why not let your code generate n(n+1)/2 "link" tables and the queries between them? Would this not result in a true graph in a normalized schema?
I'm sure this has been done before, but I've never seen it. What am I missing?
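To make the idea concrete, here is a minimal sketch of generating the link tables in code. It assumes SQLite, three hypothetical entity tables (`book`, `person`, `publisher`), and a made-up `link_<a>_<b>` naming convention with a `kind` column for the relationship type; none of these come from a real project. The count n(n+1)/2 is the number of unordered pairs of tables, including each table paired with itself.

```python
import sqlite3
from itertools import combinations_with_replacement

# Hypothetical entity tables; the names are illustrative only.
ENTITIES = ["book", "person", "publisher"]


def link_table_name(a, b):
    """Canonical name for the link table between two entity tables."""
    first, second = sorted((a, b))
    return f"link_{first}_{second}"


def create_schema(conn, entities):
    """Create one table per entity and one link table per unordered pair.

    Table names are interpolated into SQL, so they must come from trusted
    code, never from user input.
    """
    for entity in entities:
        conn.execute(
            f"CREATE TABLE {entity} (id INTEGER PRIMARY KEY, name TEXT)"
        )
    created = []
    # Unordered pairs, including self-pairs such as (book, book).
    for a, b in combinations_with_replacement(sorted(entities), 2):
        table = link_table_name(a, b)
        conn.execute(
            f"CREATE TABLE {table} ("
            f" src_id INTEGER NOT NULL REFERENCES {a}(id),"
            f" dst_id INTEGER NOT NULL REFERENCES {b}(id),"
            f" kind TEXT NOT NULL,"
            f" PRIMARY KEY (src_id, dst_id, kind))"
        )
        created.append(table)
    return created


conn = sqlite3.connect(":memory:")
link_tables = create_schema(conn, ENTITIES)
n = len(ENTITIES)
print(len(link_tables) == n * (n + 1) // 2)
```

Each link table is an ordinary many-to-many join table, so the schema stays normalized; the only unusual part is that the tables are produced by a loop instead of by hand.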
Edit:
Examples of application:
- Storing printed works and their bibliographic data (there could be many fields which might link not only to strings but to whole objects). In the library world, no simple relational schema can store this data "losslessly"; doing so takes an extremely complex, hand-built schema.
- Keeping track of physical objects, including owner, maintainer, customer, etc. as well as all relevant properties for each specific type of object (which can vary wildly). These objects have many relationships, of different types, amongst themselves.
Edit 2:
Piet asks, "What problem are you trying to solve?"
Andrew says, "I would hate to be the guy who has to understand or maintain it after it's gone into production."
Perhaps I should clarify:
The problem is that maintaining a schema which constantly grows (or simply needs to be large to fully describe your data) is nearly impossible. Why are we doing this manual and predictable task of creating tables for links (or "edges", in the graph context)?
Does the fact that we traditionally do this by hand limit the complexity and power of our schemas?
In a highly interlinked database, the number of possible edges grows quadratically with the number of vertices, so there will always be far more edges than vertices. Why not focus on creating proper, normalized vertices (tables) and let our code maintain the edges?
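As a rough illustration of what "letting the code maintain the edges" might look like, the sketch below creates link tables lazily and generates the queries across them. It assumes SQLite and hypothetical `book` and `person` tables; the helper names (`ensure_link_table`, `add_edge`, `neighbors`) and the `link_<a>_<b>` naming scheme are invented for this example.

```python
import sqlite3


def ensure_link_table(conn, a, b):
    """Create the link table for the pair (a, b) if it does not exist yet.

    Table names are interpolated into SQL and must be trusted identifiers.
    """
    first, second = sorted((a, b))
    name = f"link_{first}_{second}"
    conn.execute(
        f"CREATE TABLE IF NOT EXISTS {name} ("
        f" src_id INTEGER NOT NULL REFERENCES {first}(id),"
        f" dst_id INTEGER NOT NULL REFERENCES {second}(id),"
        f" kind TEXT NOT NULL,"
        f" PRIMARY KEY (src_id, dst_id, kind))"
    )
    return name


def add_edge(conn, a, a_id, b, b_id, kind):
    """Link row a_id of table a to row b_id of table b."""
    name = ensure_link_table(conn, a, b)
    # The link table stores ids in sorted table-name order.
    if (a, b) != tuple(sorted((a, b))):
        a_id, b_id = b_id, a_id
    conn.execute(f"INSERT INTO {name} VALUES (?, ?, ?)", (a_id, b_id, kind))


def neighbors(conn, a, a_id, b):
    """Return the ids of rows in table b linked to row a_id of table a."""
    name = ensure_link_table(conn, a, b)
    if a == b:
        # Self-links: follow the edge in both directions.
        sql = (f"SELECT dst_id FROM {name} WHERE src_id = ? "
               f"UNION SELECT src_id FROM {name} WHERE dst_id = ?")
        params = (a_id, a_id)
    elif a < b:
        sql, params = f"SELECT dst_id FROM {name} WHERE src_id = ?", (a_id,)
    else:
        sql, params = f"SELECT src_id FROM {name} WHERE dst_id = ?", (a_id,)
    return sorted(row[0] for row in conn.execute(sql, params))


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE book (id INTEGER PRIMARY KEY, title TEXT)")
conn.execute("CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO book VALUES (1, 'Example Book')")
conn.execute("INSERT INTO person VALUES (10, 'Example Author')")
add_edge(conn, "person", 10, "book", 1, "author")
print(neighbors(conn, "book", 1, "person"))
```

Creating link tables on first use means only the pairs that actually occur get a table, which keeps the schema smaller than the full n(n+1)/2 when most entity types never link to each other.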