Atomicity of visible state change in complex action sequences in DataObjects.Net

This article describes an important feature of the event notification system in DataObjects.Net that is used during synchronization of paired (inverse) associations and during entity removals.

Content:

The Problem

1. Synchronization of paired (inverse) associations

2. Entity removal and destruction of references pointing to it

The Solution

1. Recursion-based algorithm

2. Queue-based algorithm

Epilogue

The Problem

Virtually any action on entities in DataObjects.Net generates a set of events. For example, a simple "set a persistent property" action (entity.Property = value) leads to the following events (the order may not be exact; it is written from memory):

Prologue (what happens before any actual changes happen)

The action itself:

Epilogue (what happens after actual state change):

I.e. there are lots of events. But the most important point here is that this whole complicated sequence of events can be represented as:

For simplicity, let’s assume that every action with prologue and epilogue is performed using the following code:

try {
  Prologue();
  Action();
  Epilogue(null);
}
catch (Exception e) {
  try {
    Epilogue(e);
  }
  catch {} // Swallowed so that it does not mask the original exception
  throw;
}

The code seems nearly perfect. But some actions are actually composed of a set of simpler ones, and these simpler actions have a prologue and an epilogue as well.

Let’s take a look at two such scenarios.

1. Synchronization of paired (inverse) associations

Imagine that we have the following model:

[HierarchyRoot]
public sealed class Author : Entity
{
  [Field, Association(PairTo = "Author")]
  public EntitySet<Book> Books { get; private set; }
}

[HierarchyRoot]
public sealed class Book : Entity
{
  [Field]
  public Author Author { get; set; }
}

And the following code:

var book = new Book();

var author1 = new Author();

book.Author = author1;

var author2 = new Author();

book.Author = author2; // Let’s see what happens when this line is executed

To synchronize the other end of the paired association set by the last line, we need to execute the following low-level actions (i.e. actions that don’t trigger synchronization themselves):

author1.Books.Remove(book);

author2.Books.Add(book);

book.Author = author2;

Let's see what this sequence looks like if the concept of an action with prologue and epilogue described above (remember the try-catch code) is applied to it:

  1. Prologue (1): book.Author = author2;
  2. Action (1):
     1. Prologue (2): author1.Books.Remove(book);
     2. Action (2): author1.Books.Remove(book);
     3. Epilogue (2): author1.Books.Remove(book);
     4. Prologue (3): author2.Books.Add(book);
     5. Action (3): author2.Books.Add(book);
     6. Epilogue (3): author2.Books.Add(book);
     7. book.Author = author2;
  3. Epilogue (1): book.Author = author2;

Or on the diagram:

As you can see, partially applied changes are visible in one prologue and two epilogues: Epilogue (2), Prologue (3) and Epilogue (3) (marked yellow on the diagram).
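To make the partially visible state concrete, here is a minimal in-memory sketch. It does not use DataObjects.Net: the Author and Book classes, the Trace helper and the Log list are hypothetical stand-ins, and SetAuthor naively wraps every low-level action in its own prologue and epilogue, as in the sequence above.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical plain classes; these are not DataObjects.Net entities.
public sealed class Author
{
    public readonly List<Book> Books = new List<Book>();
}

public sealed class Book
{
    public Author Author;
}

public static class NaiveSync
{
    public static readonly List<string> Log = new List<string>();

    // Records what an event handler would observe at this point:
    // either the complete old state or the complete new state is "consistent".
    static void Trace(string step, Book book, Author oldAuthor, Author newAuthor)
    {
        bool oldState = book.Author == oldAuthor
            && oldAuthor.Books.Contains(book) && !newAuthor.Books.Contains(book);
        bool newState = book.Author == newAuthor
            && newAuthor.Books.Contains(book) && !oldAuthor.Books.Contains(book);
        Log.Add(step + ": " + (oldState || newState ? "consistent" : "PARTIAL"));
    }

    // Naive synchronization: each low-level action raises its events immediately.
    // Assumes book.Author is not null, to keep the sketch short.
    public static void SetAuthor(Book book, Author newAuthor)
    {
        Author oldAuthor = book.Author;
        Trace("Prologue (1)", book, oldAuthor, newAuthor);

        Trace("Prologue (2)", book, oldAuthor, newAuthor);
        oldAuthor.Books.Remove(book);                       // Action (2)
        Trace("Epilogue (2)", book, oldAuthor, newAuthor);

        Trace("Prologue (3)", book, oldAuthor, newAuthor);
        newAuthor.Books.Add(book);                          // Action (3)
        Trace("Epilogue (3)", book, oldAuthor, newAuthor);

        book.Author = newAuthor;                            // Action (1) itself
        Trace("Epilogue (1)", book, oldAuthor, newAuthor);
    }

    public static List<string> Run()
    {
        Log.Clear();
        var author1 = new Author();
        var author2 = new Author();
        var book = new Book { Author = author1 };
        author1.Books.Add(book);
        SetAuthor(book, author2);
        return Log;
    }
}
```

NaiveSync.Run() reports Epilogue (2), Prologue (3) and Epilogue (3) as PARTIAL; the remaining three points see either the complete old state or the complete new state.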

Let’s try to figure out which unexpected effects may be caused by this.

Warning: the scenarios described below work exactly as you expect in DataObjects.Net. The examples below apply to any ORM / BLL framework that uses a similar event notification system but doesn’t provide atomicity of visible state change in such scenarios (e.g. by using one of the approaches described below).

Imagine that Epilogue (3) invokes some code that performs the following LINQ query: author2.Books.Count(). We expect to get 1 here, because the book has just been added to author2.Books, but this won’t happen. Instead, we’ll get 0, despite the fact that the ORM flushes all accumulated changes before running this query. Why?

Since the Author.Books collection is paired to the Book.Author property, physically its elements are records in the Book type table with the given value of the Author field. But since the book entity has not been changed yet (book.Author still refers to author1), there is nothing to flush to the database (at least, to the Book type table), and thus the above LINQ query will return 0.

So the following assertion should fail if it is executed inside Epilogue (3):

Assert.AreEqual(author2.Books.Count(), author2.Books.Count);
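The disagreement between a query and the in-memory collection state can be reproduced with a minimal sketch. Everything in it is a hypothetical stand-in rather than DataObjects.Net code: BookTable plays the role of the Book type table, LoadedBooks plays the role of the already loaded Author.Books state, and entities are represented by plain integer ids.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class QueryMismatch
{
    // Hypothetical stand-in for the Book table: book id -> author id.
    static readonly Dictionary<int, int> BookTable = new Dictionary<int, int>();

    // Hypothetical stand-in for loaded Author.Books state: author id -> book ids.
    static readonly Dictionary<int, HashSet<int>> LoadedBooks =
        new Dictionary<int, HashSet<int>>();

    // Plays the role of the LINQ query author.Books.Count(): counts table rows.
    public static int QueryCount(int authorId)
    {
        return BookTable.Values.Count(id => id == authorId);
    }

    // Plays the role of the property author.Books.Count: reads in-memory state.
    public static int LoadedCount(int authorId)
    {
        return LoadedBooks[authorId].Count;
    }

    // Replays the naive synchronization of book.Author = author2
    // and stops at the point where Epilogue (3) would run.
    public static void RunUntilEpilogue3(int author1, int author2, int book)
    {
        BookTable.Clear();
        LoadedBooks.Clear();
        BookTable[book] = author1;                        // flushed earlier: book.Author = author1
        LoadedBooks[author1] = new HashSet<int> { book };
        LoadedBooks[author2] = new HashSet<int>();

        LoadedBooks[author1].Remove(book);                // Action (2)
        LoadedBooks[author2].Add(book);                   // Action (3)
        // book.Author is still author1 here, so there is nothing new to flush to BookTable.
    }
}
```

After QueryMismatch.RunUntilEpilogue3(1, 2, 10), QueryCount(2) is 0 while LoadedCount(2) is 1, and for author 1 the two values are swapped, so comparing the query result with the in-memory count fails for both authors.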

Funny, but actually everything can be even more confusing: many ORM tools implement lazy and partial loading of collection state.

The following section is specific to DataObjects.Net - it explains how lazy and partial collection state loading works in this framework. If the concept is generally known to you, you can skip this part.


E.g. in DataObjects.Net, the state of a collection can be:

Also, changes made to the collection don’t always lead to loading of its content:

Other ORM tools may implement lazy / partial loading of collection state differently, but in general, the problem will still be there.


Taking this into account, let’s look again at the case we studied:

So in this case the assertion won’t fail, although the actual result can be wrong. On the other hand, the same assertion will fail if we ensure that the collection is fully loaded before the whole action is executed (enumerating the collection elements is enough to achieve this).

To summarize, the following assertions may fail when executed inside Epilogue (3) (e.g. in an INotifyCollectionChanged.CollectionChanged event handler):

Assert.AreEqual(1, author2.Books.Count());

Assert.AreEqual(1, author2.Books.Count);

Assert.IsTrue(author2.Books.Contains(book));

Thus, the visibility of partially modified state may lead to really complex and unexpected consequences in some cases. As shown above, it may affect not only the results of queries, but also read-only operations on collections, if lazy loading of their state is implemented (which is almost always the case).

Of course, ORM developers may say, “you should simply take this into account - it’s your own problem”. But:

  1. As you can see, the issue is quite hard to debug. Frankly speaking, it took a few days to fully analyze and eliminate one of the bugs related to this in DataObjects.Net, even though we already had an implementation of the algorithms ensuring atomicity of visible state change (they’re described further) and were aware of the possible issues.
  2. If life cycle events are provided, you must assume they will be used to build quite interconnected models. The typical case is a WPF MVVM application: view model objects there subscribe to various BLL events to listen for state changes and update the view model accordingly.
  3. Normally, the code in MVVM event handlers knows nothing about the particular BLL scenario in which the event is raised. It is quite generic: e.g. it just knows it should fully refresh the collection in the view model. Obviously, such code may fail in this case, and debugging will be really hard.

So IMO, it’s much better when such issues are resolved by the ORM.

Let’s turn to the second example now.

2. Entity removal and destruction of references pointing to it

Suppose we execute the following code (the model is the same):

var book1 = new Book();

var book2 = new Book();

var author = new Author();

book1.Author = author;

book2.Author = author;

author.Remove(); // Let’s see what happens when this line is executed

The sequence of events:

  1. Prologue (1): author.Remove();
  2. Action (1):
     1. Prologue (2): author.Books.Remove(book1);
     2. Action (2): author.Books.Remove(book1);
     3. Epilogue (2): author.Books.Remove(book1);
     4. Prologue (3): book1.Author = null;
     5. Action (3): book1.Author = null;
     6. Epilogue (3): book1.Author = null;
     7. Prologue (4): author.Books.Remove(book2);
     8. Action (4): author.Books.Remove(book2);
     9. Epilogue (4): author.Books.Remove(book2);
     10. Prologue (5): book2.Author = null;
     11. Action (5): book2.Author = null;
     12. Epilogue (5): book2.Author = null;
     13. author.Remove();
  3. Epilogue (1): author.Remove();

It is evident that the situation is almost completely identical in this case - the only difference is that the chain of actions we have here is longer. Earlier its length was fixed, but here it depends on the number of items in the collection.
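The dependence of the chain length on the collection size can be illustrated with a small sketch. It only records the order of prologues and epilogues for a naive removal; the NaiveRemoval class and the integer book ids are hypothetical, and nothing here calls DataObjects.Net.

```csharp
using System;
using System.Collections.Generic;

public static class NaiveRemoval
{
    // Returns the prologue / epilogue order of a naive author.Remove()
    // for an author that has the given number of books.
    public static List<string> Run(int bookCount)
    {
        var log = new List<string>();
        var authorBooks = new List<int>();                 // stands for author.Books
        var bookHasAuthor = new Dictionary<int, bool>();   // stands for book.Author != null
        for (int id = 1; id <= bookCount; id++) {
            authorBooks.Add(id);
            bookHasAuthor[id] = true;
        }

        int step = 1;
        log.Add("Prologue (1): author.Remove()");
        foreach (int id in authorBooks.ToArray()) {
            step++;
            log.Add($"Prologue ({step}): author.Books.Remove(book{id})");
            authorBooks.Remove(id);                        // partial change becomes visible
            log.Add($"Epilogue ({step}): author.Books.Remove(book{id})");

            step++;
            log.Add($"Prologue ({step}): book{id}.Author = null");
            bookHasAuthor[id] = false;
            log.Add($"Epilogue ({step}): book{id}.Author = null");
        }
        log.Add("Epilogue (1): author.Remove()");
        return log;
    }
}
```

For n books the log contains 2 + 4n entries, i.e. 2n nested actions, each with its own prologue and epilogue that may observe a partially removed author.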

So the issues we may face here are the same:

The exact set of anomalies depends on the particular ORM.

The Solution

Actually, it is obvious: we must ensure that in the scenarios we described:

I can see two ways of implementing this:

1. Recursion-based algorithm

Code for each action in this scenario should look like:

public void ActionX()

{

  ActionX(new RecursionBasedAtomicContext());

}

internal void ActionX(RecursionBasedAtomicContext context)

{

  Exception error = null;

  Prologue();

  try {

    context.EnqueueSideEffect(() => {

      // Important: check if side effect is still applicable

      // Side effect

    });

    context.EnqueueAction(() => {

      ActionY(context); // Internal method is called, context is passed to it

    });

    context.EnqueueAction(() => {

      ActionZ(context); // Internal method is called, context is passed to it

    });

    context.Recurse();

  }

  catch(Exception e) {

    error = e;

    throw;

  }

  finally {

    Epilogue(error);

  }

}

RecursionBasedAtomicContext methods do the following:

The diagram shows how this approach works:

So this approach leads to the required sequence of prologue, action (side effect) and epilogue invocations.
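One way such a context could be written is sketched below. This is an assumption made for illustration, not the DataObjects.Net implementation (which uses SyncContext): only the method names EnqueueSideEffect, EnqueueAction and Recurse come from the code above, while their bodies, the RecursionDemo class and its Step helper are hypothetical.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch: collects side effects and runs nested actions one stack frame deeper.
public sealed class RecursionBasedAtomicContext
{
    private readonly Queue<Action> actions = new Queue<Action>();
    private readonly List<Action> sideEffects = new List<Action>();

    public void EnqueueSideEffect(Action sideEffect) { sideEffects.Add(sideEffect); }
    public void EnqueueAction(Action action) { actions.Enqueue(action); }

    public void Recurse()
    {
        if (actions.Count > 0) {
            // The nested action runs its prologue and calls Recurse() again,
            // so all prologues run before any side effect.
            actions.Dequeue().Invoke();
            return;
        }
        // Innermost call: apply all collected side effects at once.
        foreach (var sideEffect in sideEffects)
            sideEffect();
        sideEffects.Clear();
    }
}

public static class RecursionDemo
{
    public static readonly List<string> Log = new List<string>();

    // Follows the shape of ActionX above; "nested" lists the names of nested actions.
    static void Step(string name, RecursionBasedAtomicContext context, params string[] nested)
    {
        Log.Add("Prologue " + name);
        try {
            context.EnqueueSideEffect(() => Log.Add("Side effect " + name));
            foreach (var child in nested) {
                var childName = child;
                context.EnqueueAction(() => Step(childName, context));
            }
            context.Recurse();
        }
        finally {
            Log.Add("Epilogue " + name);   // runs while the stack unwinds
        }
    }

    public static List<string> Run()
    {
        Log.Clear();
        Step("X", new RecursionBasedAtomicContext(), "Y", "Z");
        return Log;
    }
}
```

RecursionDemo.Run() logs the prologues of X, Y and Z, then the three side effects, then the epilogues of Z, Y and X. The stack depth grows by a couple of frames per nested action, which is why this variant only suits short chains.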

One more important aspect is the atomicity of side effects. In DataObjects.Net this is pretty easy: we just mark the method invoking all side effects (Recurse), as well as all Action* methods, as transactional, so any error in their implementation leads to rollback (and locking) of the outermost or nested transaction, which automatically cancels all side effects.

There is almost no benefit from passing the exception to Epilogue in this case: if it was thrown, we can’t work with the database anyway (since the transaction is locked), but normally this isn’t necessary, because the part of the state related to the database has already been rolled back. On the other hand, we can compensate any side effects that aren’t related to the database (e.g. we can fix something in our own caches; all DataObjects.Net state caches are aware of rollbacks as well).

Important: before applying any side effect, we should first check whether it can still be performed. The code invoked inside one of the prologues could have made this impossible. For example, an object whose properties we plan to change can be removed by the code from one of the prologues.

The solution with recursion is applicable when we’re sure the number of steps in the call chain is limited. In DataObjects.Net this approach is used to sync paired associations (you can see this by studying the call stack of such a sync in the debugger; we use SyncContext instead of RecursionBasedAtomicContext, and do everything a bit more efficiently, taking into account the peculiarities of this particular problem).

2. Queue-based algorithm

Code for each action in this case should look like:

public void ActionX()

{

  var context = new QueueBasedAtomicContext();

  try {

    ActionX(context);

    context.Process();

  }

  catch (Exception e) {

    context.Process(e);

  }

}

internal void ActionX(QueueBasedAtomicContext context)

{

  context.EnqueuePrologue(() => {

    // Important: the code here can be executed directly as well;

    // EnqueuePrologue must be used only to get rid of possible deep recursion

    Prologue();

    ActionY(context); // Internal method is called, context is passed to it

    ActionZ(context); // Internal method is called, context is passed to it

  });

  context.EnqueueSideEffect(() => {

    // Important: check if side effect is still applicable

    // Side effect

  });

  context.EnqueueEpilogue(error => Epilogue(error));

}

QueueBasedAtomicContext methods do the following:

The diagram explaining this:

As you see, this approach leads to the required sequence of prologue, action (side effect) and epilogue invocations as well.

Important: exceptions in this case must be processed in the same fashion as they would be in the solution with recursion. The first exception (if any) is passed to the first epilogue; if that epilogue throws another one, the new exception is passed further (to the next epilogue). If the result (return value or thrown exception) of an epilogue is the same instance of the error (this is just one of many possible ways of doing this), it should be passed to the next epilogue (this corresponds to throw); otherwise, null should be passed (this corresponds to exception suppression, i.e. catch without throw).
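A possible shape of the queue-based context is sketched below, purely as an illustration and not as the DataObjects.Net implementation. Only the method names come from the code above; the bodies, QueueDemo and Step are hypothetical. The sketch also simplifies error handling: an epilogue that returns normally leaves the current error in place (as a finally block would), a throwing epilogue replaces it, and suppression is not modeled.

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.ExceptionServices;

// Hypothetical sketch: three containers instead of nested stack frames.
public sealed class QueueBasedAtomicContext
{
    private readonly Queue<Action> prologues = new Queue<Action>();
    private readonly List<Action> sideEffects = new List<Action>();
    private readonly Stack<Action<Exception>> epilogues = new Stack<Action<Exception>>();

    public void EnqueuePrologue(Action prologue) { prologues.Enqueue(prologue); }
    public void EnqueueSideEffect(Action sideEffect) { sideEffects.Add(sideEffect); }
    public void EnqueueEpilogue(Action<Exception> epilogue) { epilogues.Push(epilogue); }

    public void Process() { Process(null); }

    public void Process(Exception error)
    {
        if (error == null) {
            try {
                // Prologues may enqueue further prologues, side effects and epilogues.
                while (prologues.Count > 0)
                    prologues.Dequeue().Invoke();
                foreach (var sideEffect in sideEffects)
                    sideEffect();
            }
            catch (Exception e) {
                error = e;
            }
        }
        prologues.Clear();
        sideEffects.Clear();
        // Epilogues run in reverse registration order, as stack unwinding would run them.
        while (epilogues.Count > 0) {
            try {
                epilogues.Pop().Invoke(error);
            }
            catch (Exception e) {
                error = e;   // a failing epilogue replaces the error passed further
            }
        }
        if (error != null)
            ExceptionDispatchInfo.Capture(error).Throw();
    }
}

public static class QueueDemo
{
    public static readonly List<string> Log = new List<string>();

    // Follows the shape of the internal ActionX above; "nested" lists nested action names.
    static void Step(string name, QueueBasedAtomicContext context, params string[] nested)
    {
        context.EnqueuePrologue(() => {
            Log.Add("Prologue " + name);
            foreach (var child in nested)
                Step(child, context);
        });
        context.EnqueueSideEffect(() => Log.Add("Side effect " + name));
        context.EnqueueEpilogue(error => Log.Add("Epilogue " + name));
    }

    public static List<string> Run()
    {
        Log.Clear();
        var context = new QueueBasedAtomicContext();
        Step("X", context, "Y", "Z");
        context.Process();
        return Log;
    }
}
```

QueueDemo.Run() produces the same order as the recursive variant (prologues X, Y, Z, then the side effects, then epilogues Z, Y, X), but the stack depth stays constant no matter how many actions are enqueued.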

A separate topic is returning the result in such a call chain. I haven’t touched on it, but generally this is pretty simple: for example, you can pass instances of ActionXResult ... ActionZResult (objects describing the result of a particular action) through the entire chain, transferring the data from the callee’s result to the caller’s result in Action* methods using closures.

The queue-based solution is applicable in almost any case; its disadvantages in comparison with the first approach are:

DataObjects.Net uses it for entity removals: pseudo-recursive destruction of references to the instance being removed and cascade removal are handled this way.

Epilogue

Currently DataObjects.Net uses algorithms described above in two scenarios:

  1. Synchronization of paired (inverse) associations of persistent entities. Here we use the recursion-based algorithm, since the maximum length of the action chain in this case is 4 ("destroy the old link on the left side", "destroy the old link on the right side", "establish a new link on the left side", "establish a new link on the right side").
  2. Removal of a persistent entity. The queue-based algorithm fits ideally here, since the maximum length of the action chain is unlimited in this case.

We haven’t identified any other scenarios in DataObjects.Net where similar behavior (atomicity of visible state change) would be helpful. If you find any, we are always happy to consider your suggestions.

The algorithms described in this article should be useful in other similar cases, since the problem seems pretty common, but we haven’t tried to find other applications for them.