Reference can return a cached invalidated object if the local key is the primary key

Bug #435962 reported by Dan Halbert
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
Storm
Fix Released
Medium
Thomas Herve

Bug Description

I do a remove() through a Reference. After doing the remove, the deleted row reappears in the Reference, even when I refetch the object containing the Reference. A test program is below. I would expect it to print None twice, showing that both References are empty.

I tried various workaround for this, including calling .autoreload() and .invalidate() on both the object holding the Reference and the Reference itself. (Workarounds are inserted at "#***" in the test program.) The problem appears to be with store._alive. The only workarounds that work are:

    store._alive.clear()
and
    store.reset()

which are fairly draconian.

I've duplicated this on 0.14 and 0.15, with both sqlite and postgresql.

#-----------------------------------------------------------------
from storm.locals import *

class T1(Storm):
   __storm_table__ = 't1'
   id = Int(primary=True)
   t2 = Reference(id, 'T2.t1_id')

   def __init__(self, id):
       self.id = id

class T2(Storm):
   __storm_table__ = 't2'
   t1_id = Int(primary=True) # foreign key referencing T1

   def __init__(self, t1_id):
       self.t1_id = t1_id

db = create_database('sqlite:')
store = Store(db)

store.execute('create temporary table t1 (id INTEGER PRIMARY KEY)')
store.execute('create temporary table t2 (t1_id INTEGER PRIMARY KEY)')

# Store two rows in T1 and two corresponding rows in T2
t1a = T1(33); store.add(t1a)
t1b = T1(66); store.add(t1b)
t2a = T2(33); store.add(t2a)
t2b = T2(66); store.add(t2b)
store.commit()

# Now remove both T2 rows, but in two different ways.
store.remove(t2a) # removing t2a directly
store.remove(t1b.t2) # remove t2b indirectly, via the Reference
store.commit()

#*** Tried various workarounds at this point (see above).

# Confirm there are no T2 rows now.
print "# of T2 rows:", store.find(T2).count()

# Get the T1 objects again.
t1a_new = store.find(T1, T1.id == 33).one()
t1b_new = store.find(T1, T1.id == 66).one()

# Shouldn't these both print None, because both T2 rows have been deleted?
print "t1a_new.t2:", t1a_new.t2
print "t1b_new.t2:", t1b_new.t2

Related branches

Revision history for this message
James Henstridge (jamesh) wrote :

First of all, it isn't a bug that the object remains in _alive: Storm is supposed to revalidate such an object before returning it after a transaction boundary. As long as an object remains alive, it is valid for it to remain in store._alive.

Next, you'll find that t1a is the same object as t1a_new, and the same for t1b and t1b_new: the only thing the store.find() call is doing is making sure that the corresponding DB rows still exist, which is not relevant to this test case.

With that out of the way, the only difference between t2a and t2b is that t2b has been accessed through the T1.t2 reference. The Reference class caches the resulting object, but invalidates that cache when the local key changes.

Normally this would happen on the transaction boundary when the local key is invalidated (i.e. at the transaction boundary), but in this case the local key is the object's primary key. Since the primary key is left as is, the reference's cache never gets cleared, and the now-dead object is returned.

So the correct fix would be for Reference() to validate that the remote object still exists (probably via store._validate_alive) before returning the cached object.

summary: - remove through a Reference leaves obsolete objects in _alive
+ Reference can return a cached invalidated object if the local key is the
+ primary key
Revision history for this message
James Henstridge (jamesh) wrote :

I thought a bit more about this problem, and now think the bug resides in the example code rather than Storm itself.

If we were using a database with proper foreign key constraints, to delete the T2 records, the FK would need to go from T2.t1_id to T1.id. Therefore, T1.t2 is a back reference.

If you modify the reference definition to the following, the problem should go away:

    t2 = Reference(id, 'T2.t1_id', on_remote=True)

That seemed to fix the problem in the simplified test case I was working on.

Changed in storm:
status: New → Incomplete
Revision history for this message
James Henstridge (jamesh) wrote :

Actually, it looks like this can be triggered even if the reference is corrected to use on_remote=True if we delete the row behind Storm's back (something that can easily happen between transactions).

Attached is a simplified test case that demonstrates the bug.

Changed in storm:
status: Incomplete → Confirmed
Revision history for this message
Thomas Herve (therve) wrote :

This is a fix with the suggestion you gave, which seems to fix the bug.

Thomas Herve (therve)
Changed in storm:
importance: Undecided → Medium
assignee: nobody → Thomas Herve (therve)
Revision history for this message
Thomas Herve (therve) wrote :

Merge in storm trunk in r343.

Changed in storm:
status: Confirmed → Fix Committed
Changed in storm:
milestone: none → 0.16
Jamu Kakar (jkakar)
Changed in storm:
status: Fix Committed → Fix Released
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.