views:

269

answers:

3

I have a lot of model classes with ralations between them with a CRUD interface to edit. The problem is that some objects can't be deleted since there are other objects refering to them. Sometimes I can setup ON DELETE rule to handle this case, but in most cases I don't want automatic deletion of related objects till they are unbound manually. Anyway, I'd like to present editor a list of objects refering to currently viewed one and highlight those that prevent its deletion due to FOREIGN KEY constraint. Is there a ready solution to automatically discover referers?

Update

The task seems to be quite common (e.g. django ORM shows all dependencies), so I wonder that there is no solution to it yet.

There are two directions suggested:

  1. Enumerate all relations of current object and go through their backref. But there is no guarantee that all relations have backref defined. Moreover, there are some cases when backref is meaningless. Although I can define it everywhere I don't like doing this way and it's not reliable.
  2. (Suggested by van and stephan) Check all tables of MetaData object and collect dependencies from their foreign_keys property (the code of sqlalchemy_schemadisplay can be used as example, thanks to stephan's comments). This will allow to catch all dependencies between tables, but what I need is dependencies between model classes. Some foreign keys are defined in intermediate tables and have no models corresponding to them (used as secondary in relations). Sure, I can go farther and find related model (have to find a way to do it yet), but it looks too complicated.

Solution

Below is a method of base model class (designed for declarative extention) that I use as solution. It is not perfect and doesn't meet all my requirements, but it works for current state of my project. The result is collected as dictionary of dictionaries, so I can show them groupped by objects and their properties. I havn't decided yet whether it's good idea, since the list of referers sometimes is huge and I'm forced to limit it to some reasonable number.

def _get_referers(self):
    db = object_session(self)
    cls, ident = identity_key(instance=self)
    medatada = cls.__table__.metadata
    result = {}
    # _mapped_models is my extension. It is collected by metaclass, so I didn't
    # look for other ways to find all model classes.
    for other_class in medatada._mapped_models:
        queries = {}
        for prop in class_mapper(other_class).iterate_properties:
            if not (isinstance(prop, PropertyLoader) and \
                    issubclass(cls, prop.mapper.class_)):
                continue
            query = db.query(prop.parent)
            comp = prop.comparator
            if prop.uselist:
                query = query.filter(comp.contains(self))
            else:
                query = query.filter(comp==self)
            count = query.count()
            if count:
                queries[prop] = (count, query)
        if queries:
            result[other_class] = queries
    return result

Thanks to all who helped me, especially stephan and van.

+1  A: 

In general, there's no way to "discover" all of the references in a relational database.

In some databases, they may use declarative referential integrity in the form of explicit Foreign Key or Check constraints.

But there's no requirement to do this. It can be incomplete or inconsistent.

Any query can include a FK relationship that is not declared. Without the universe of all queries, you can't know the relationships which are used but not declared.

To find "referers" in general, you must actually know the database design and have all queries.

S.Lott
information_schema does provide the information needed to discover the references. and Denis is only interested in those that actually prevent the row from being deleted, which means it is an actual FOREIGN KEY constraint.
van
The question was about SQLAlchemy, which already has schema definition as some structure in memory (defined in code or automatically loaded from database).
Denis Otkidach
@van: The issue is that there are still informal rules. If a query expects an undeclared FK relationship, then an allowed delete will still break the application. Hence the "in general" part of the answer. Declared relationships aren't the whole store in general. The question doesn't reflect this, which will -- in general -- lead to problems. The specific schema may be really, really complete and may have all FK's declared. But, in general, there's no way to know unless you reverse engineer every query.
S.Lott
@S.Lott: sure, I have seen databases without a single FK constraint as well (although there were references, which you call "undeclared"). Being a very good point on its own, it is not what the question asks
van
@van: Correct. It's not what the question literally asks. The question is poor in that respect. One can hope to find all kinds of juicy information in the schema. But the only the entire universe of queries will reveal the actual relationships actually used in the actual application.
S.Lott
A: 

For each model class, you can easily see if all its one-to-many relations are empty simply by asking for the list in each case and seeing how many entries it contains. (There is probably a more efficient way implemented in terms of COUNT, too.) If there are any foreign keys relating to the object, and you have your object relations set up correctly, then at least one of these lists will be non-zero in length.

Kylotan
This is not always true. The current object can have no relations, while there is an object that refer to it via foreign key and does have relation without backref. Also note that checking all objects in all object relations is not an option, since it might be huge collection (lazy='dynamic').
Denis Otkidach
Huge or not, without reading at least 1 row, you can't know if there are any references. As for lack of backrefs, that was what I meant by "set up correctly" though I acknowledge that it isn't incorrect to lack the backref, just inconvenient.
Kylotan
+4  A: 

SQL: I have to absolutely disagree with S.Lott' answer. I am not aware of out-of-the-box solution, but it is definitely possible to discover all the tables that have ForeignKey constraints to a given table. One needs to use properly the INFORMATION_SCHEMA views such as REFERENTIAL_CONSTRAINTS, KEY_COLUMN_USAGE, TABLE_CONSTRAINTS, etc. See SQL Server example. With some limitations and extensions, most versions of new relational databases support INFORMATION_SCHEMA standard. When you have all the FK information and the object (row) in the table, it is a matter of running few SELECT statements to get all other rows in other tables that refer to given row and prevent it from being deleted.

SqlAlchemy: As noted by stephan in his comment, if you use orm with backref for relations, then it should be quite easy for you to get the list of parent objects that keep reference to the object you are trying to delete, because those objects are basically mapped properties of your object (child1.Parent).

If you work with Table objects of sql alchemy (or not always use backref for relations), then you would have to get values of foreign_keys for all the tables, and then for all those ForeignKeys call references(...) method, providing your table as a parameter. In this way you will find all the FKs (and tables) that have reference to the table your object maps to. Then you can query all the objects that keep reference to your object by constructing the query for each of those FKs.

van
Thanks (+1) for the idea to check `foreign_keys` for all tables, so far it seems to be the most promissing/reliable way. I almost never use `backref` since it defines relation outside the model definition: you can't see all available properties from class definition, the code is not self-documented.
Denis Otkidach
You can discover all the FK constraints. I never said you couldn't. We agree on finding FK constraints. However, you can't discover all the queries that will break because they depend on an *undeclared* FK relationship.
S.Lott