ADR-0005: DAG Change Manager for Provider Populate Topology¶
Status¶
Accepted
Context¶
ADR-0004 addressed the diamond problem in provider topology by validating
require() against already-created state. That fix prevents conflicts
after an aggregate has been created, but does not control the order in
which populate() is called across the provider network.
Current approach: recursive DFS¶
Currently populate() works via recursive depth-first traversal. Each
provider calls populate() on its dependencies from within its own
populate():
    BookProvider.populate()
    +-- author_id (ReferenceProvider) --> AuthorProvider.populate()
    |   +-- tenant_id (ReferenceProvider) --> TenantProvider.populate()
    +-- publisher_id (ReferenceProvider) --> PublisherProvider.populate()
        +-- tenant_id (ReferenceProvider) --> TenantProvider.populate()  <-- diamond
Protection against double invocation relies on:
- is_complete() guard at the top of each populate() – if the provider is already populated, the repeated call is a no-op.
- DiamondUpdateConflict exception in require() when new criteria conflict with an already-established state.
- Repository lookup in AggregateProvider.populate() – if the ID is already known, the aggregate is loaded from the repository instead of being created again.
Problems with the recursive approach¶
- Populate order depends on declaration order. The iteration order of self.providers (a dict built from class annotations) determines which path reaches a shared dependency first. The winning path populates the provider; subsequent paths see is_complete() == True and skip. If the paths carry different criteria, the second path may either silently accept a foreign result or raise DiamondUpdateConflict.
- Criteria may arrive after populate. In a diamond, require() from path 2 may arrive after populate() from path 1 has already created the aggregate. ADR-0004’s validation-instead-of-reset fix handles this, but the root cause – uncontrolled traversal order – remains.
- No single point of control. Each provider independently decides when to call populate() on its dependencies. There is no centralized view of the dependency graph, making it difficult to reason about ordering guarantees or to detect cycles.
GoF DAGChangeManager¶
The Gang of Four book describes a DAGChangeManager (Mediator + Observer)
for propagating changes through a directed acyclic graph of dependencies. The
key algorithms are:
- collectAffected(subject, visited) – DFS traversal that collects all reachable observers into a set (deduplication via the visited map). Each observer is collected exactly once regardless of how many paths lead to it.
- topoSort(affected) – Kahn’s algorithm. Computes the in-degree of each affected node, starts from nodes with zero in-degree, and processes nodes in topological order. Guarantees: (a) each node is processed exactly once, (b) a node is processed only after all its dependencies.
The combination solves the diamond problem structurally: in a diamond
A -> B -> D and A -> C -> D, node D appears in the affected set
once and is processed after both B and C.
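The diamond case can be traced in a few lines. This is a minimal sketch, not the reference implementation: the observers dict and the snake_case function names are assumptions chosen for the example, and edges point from a subject to its observers.

```python
from collections import deque

# Edges point from a subject to its observers: A -> B -> D and A -> C -> D.
observers = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": []}


def collect_affected(subject, visited):
    """DFS that records every node reachable from subject exactly once."""
    for obs in observers[subject]:
        if obs not in visited:
            visited.add(obs)
            collect_affected(obs, visited)
    return visited


def topo_sort(affected):
    """Kahn's algorithm restricted to the affected set."""
    in_degree = {n: 0 for n in affected}
    for n in affected:
        for obs in observers[n]:
            if obs in affected:
                in_degree[obs] += 1
    # sorted() only makes the order of equally-ready nodes deterministic
    queue = deque(sorted(n for n, d in in_degree.items() if d == 0))
    order = []
    while queue:
        node = queue.popleft()
        order.append(node)
        for obs in observers[node]:
            if obs in affected:
                in_degree[obs] -= 1
                if in_degree[obs] == 0:
                    queue.append(obs)
    return order


affected = collect_affected("A", set())
order = topo_sort(affected)
print(order)  # ['B', 'C', 'D']
```

D is reached through both B and C, yet it is collected once and comes out of the sort only after both of them.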
Reference implementation (Go):
dckms-private/private/it/ddd/grade/ascetic-ddd/observer/dag_change_manager-2.go
Decision¶
Adopt the DAGChangeManager pattern for the faker provider populate topology, implemented as a separate ProviderChangeManager (Mediator) – Variant A.
Variant A: Separate Mediator (accepted)¶
A dedicated ProviderChangeManager owns the dependency graph and controls the
populate() invocation order. Providers delegate population to the manager
rather than recursively calling populate() on their dependencies.
Conceptual mapping:
| DAGChangeManager concept | Faker provider analog |
|---|---|
| Subject | |
| Observer | |
| Subject + Observer (dual role) | |
| | |
| collectAffected() | Collect all providers reachable from the root |
| topoSort() | Determine populate() order |
| Single notification per observer | Single populate() call per provider |
Conceptual API:
    from collections import deque


    class ProviderChangeManager:
        """Mediator that owns the provider dependency graph
        and controls populate() invocation order."""

        # provider -> providers it depends on
        deps: dict[AggregateProvider, list[ReferenceProvider]]
        # provider -> providers that depend on it
        reverse_deps: dict[ReferenceProvider, list[AggregateProvider]]

        def register(self, agg_provider, ref_provider):
            """Called during provider network construction."""
            ...

        async def populate(self, root_provider, session):
            """Populate the entire reachable subgraph in topological order."""
            affected = set()
            self._collect_affected(root_provider, affected)
            for provider in self._topo_sort(affected):
                await provider.populate(session)  # exactly once, dependencies first

        def _collect_affected(self, provider, visited):
            """DFS: collect the provider and everything reachable from it."""
            if provider in visited:
                return
            visited.add(provider)  # the root itself is part of the affected set
            # Providers without registered dependencies are leaves: get() yields [].
            for dep in self.deps.get(provider, []):
                self._collect_affected(dep, visited)

        def _topo_sort(self, affected):
            """Kahn's algorithm: dependencies come before their dependents."""
            # in-degree = number of dependencies that are still unprocessed
            in_degree = {
                p: sum(1 for dep in self.deps.get(p, []) if dep in affected)
                for p in affected
            }
            queue = deque(p for p, deg in in_degree.items() if deg == 0)
            sorted_ = []
            while queue:
                node = queue.popleft()
                sorted_.append(node)
                for dependent in self.reverse_deps.get(node, []):
                    if dependent in affected:
                        in_degree[dependent] -= 1
                        if in_degree[dependent] == 0:
                            queue.append(dependent)
            if len(sorted_) != len(affected):
                raise ValueError("provider dependency graph contains a cycle")
            return sorted_
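To make the intended call order concrete, here is a self-contained sketch that runs a simplified manager over the Book/Author/Publisher/Tenant diamond. StubProvider and SketchChangeManager are hypothetical names introduced for this example. The register(provider, dependency) shape, with edges going directly from a provider to the provider it depends on, is an assumption that simplifies the register(agg_provider, ref_provider) signature of the conceptual API.

```python
import asyncio
from collections import defaultdict, deque


class StubProvider:
    """Hypothetical stand-in for a real provider; records its populate() call."""

    def __init__(self, name, calls):
        self.name = name
        self.calls = calls

    async def populate(self, session):
        self.calls.append(self.name)


class SketchChangeManager:
    """Simplified manager: edges go from a provider to a provider it depends on."""

    def __init__(self):
        self.deps = defaultdict(list)
        self.reverse_deps = defaultdict(list)

    def register(self, provider, dependency):
        self.deps[provider].append(dependency)
        self.reverse_deps[dependency].append(provider)

    async def populate(self, root, session):
        affected = set()
        stack = [root]
        while stack:  # iterative DFS, root included
            node = stack.pop()
            if node not in affected:
                affected.add(node)
                stack.extend(self.deps.get(node, []))
        # Kahn's algorithm: start from providers with no pending dependencies.
        pending = {p: len(self.deps.get(p, [])) for p in affected}
        queue = deque(p for p, n in pending.items() if n == 0)
        while queue:
            node = queue.popleft()
            await node.populate(session)
            for dependent in self.reverse_deps.get(node, []):
                if dependent in affected:
                    pending[dependent] -= 1
                    if pending[dependent] == 0:
                        queue.append(dependent)


calls = []
book, author, publisher, tenant = (
    StubProvider(n, calls) for n in ("Book", "Author", "Publisher", "Tenant")
)
manager = SketchChangeManager()
manager.register(book, author)
manager.register(book, publisher)
manager.register(author, tenant)
manager.register(publisher, tenant)

asyncio.run(manager.populate(book, session=None))
print(calls)  # ['Tenant', 'Author', 'Publisher', 'Book']
```

Tenant is populated once and before both of its dependents, and Book, the root, is populated last.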
Variant B: Inline topological sort (rejected)¶
The topological sort logic would live inside the root AggregateProvider,
which would introspect self.providers and
ReferenceProvider.aggregate_provider to build the DAG on the fly.
Rejected because:
- Violates Single Responsibility: AggregateProvider would own both domain logic and graph traversal.
- No centralized graph: each populate() call would rebuild the DAG via introspection.
- Harder to extend for dynamic provider registration.
- Less aligned with GoF Mediator pattern, which explicitly separates the coordination concern into a dedicated object.
Key guarantees¶
- Each provider’s populate() is called exactly once – deduplication via the visited set in collectAffected().
- Topological order – a provider is populated only after all its dependencies are populated. In the diamond Book -> Author -> Tenant and Book -> Publisher -> Tenant, TenantProvider.populate() runs before both AuthorProvider.populate() and PublisherProvider.populate().
- All criteria arrive before populate – because the topological order processes dependencies first, all require() calls from all paths reach a shared provider before its populate() runs. Conflicts are detected before aggregate creation, not post-factum.
- Single point of control – the Mediator owns the graph and the traversal logic, making it easier to debug, trace, and extend (e.g., cycle detection, visualization, parallel population of independent branches).
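The extensions named in the last guarantee, cycle detection and parallel population of independent branches, can be prototyped without hand-written traversal code. The sketch below swaps in the Python standard library's graphlib.TopologicalSorter in place of the manager's own Kahn implementation. The deps mapping of provider names and the populate_batches() helper are assumptions made for this example.

```python
from graphlib import CycleError, TopologicalSorter

# provider name -> names of the providers it depends on (illustrative data)
deps = {
    "Book": {"Author", "Publisher"},
    "Author": {"Tenant"},
    "Publisher": {"Tenant"},
    "Tenant": set(),
}


def populate_batches(graph):
    """Group providers into batches whose members do not depend on each other,
    so each batch could be populated concurrently."""
    sorter = TopologicalSorter(graph)
    sorter.prepare()  # raises CycleError if the graph contains a cycle
    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sorted only for stable output
        batches.append(ready)
        sorter.done(*ready)
    return batches


batches = populate_batches(deps)
print(batches)  # [['Tenant'], ['Author', 'Publisher'], ['Book']]

try:
    populate_batches({"Author": {"Tenant"}, "Tenant": {"Author"}})
    cycle_detected = False
except CycleError:
    cycle_detected = True
print(cycle_detected)  # True
```

Author and Publisher land in the same batch because neither depends on the other, which is the case where parallel population would apply.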
Consequences¶
- The ProviderChangeManager becomes the single entry point for population: client code calls await manager.populate(root_provider, session) instead of await root_provider.populate(session) directly.
- Provider populate() methods no longer call populate() on their dependencies – the manager does this externally in the correct order.
- ADR-0004’s require() validation remains as a safety net for edge cases, but the topological ordering makes DiamondUpdateConflict structurally impossible for well-formed DAGs.
- Registration of providers in the manager can be done either explicitly (manager.register(agg, ref)) or via introspection of provider annotations at setup time.
- The pattern aligns with GoF Mediator + Observer, making the design portable to Go (see ADR-0003) and recognizable to developers familiar with the original patterns.
Related¶
- ADR-0004: Diamond Problem in Provider Topology – predecessor ADR that introduced DiamondUpdateConflict validation
- GoF Design Patterns, p. 299-303 – Mediator + Observer, ChangeManager, DAGChangeManager
- Reference implementation: dckms-private/private/it/ddd/grade/ascetic-ddd/observer/dag_change_manager-2.go