Date: Wed, 8 Nov 2000 16:55:31 +0000
To: Perforce Defect Tracking Integration Project staff
From: Richard Brooksby
Subject: Replicator architecture design discussion, 2000-11-01/02

Gareth and I have been discussing the overall design of the replicator in response to the problems we've seen during alpha tests. I believe that better abstraction will reduce the number of internal dependencies and make the replicator more reliable in the face of unexpected configurations. Many of our problems have stemmed from assumptions about the kinds of configurations that people use.

The main idea I've had is to design an abstract class of defect trackers, of which both Perforce and the DT are subclasses. This means that the replicator will be symmetrical: it will just take two defect trackers and replicate between them. This will force the replicator to abstract away from the details of both Perforce and the defect tracker.

We also discussed scalability. We've had several scalability issues:

1. When the replicator starts it tries to replicate all existing changelists ("//info.ravenbrook.com/project/p4dti/version/0.3/code/replicator/run_teamtrack.py#1" line 7) [GDR 2000-10-26, item 4]. There are often many tens of thousands. The replicator automatically replicates changelists involved in a change to an issue, so there's no need to do this except to provide the DT with a complete set. This probably isn't needed, because no one has expressed any interest in editing fixes by description from the DT.

2. When the replicator starts for the first time it selects all entries in the TeamTrack CHANGES table since the beginning of time. At Quokka Sports we saw over 130,000 such entries, and each has to come through a networked MS SQL Server interface, through the TeamShare API, and then through the Python interface. We didn't wait for it to finish. This is likely to be a problem with other defect trackers too. We think most organizations will be happy to replicate only the issues that change after they start replication for the first time, plus any others migrated by an external script.

3. We were concerned about batching operations in the replicator to reduce the number of individual queries. This means passing around lists of things to do rather than individual items in most cases, and shouldn't be too much of a challenge.

4. The current implementation only allows one P4S per replicator. The abstract design proposed above doesn't help matters. We need to decide whether we want to implement multiple P4Ss per replicator. The main reason to do so is to reduce the number of polling queries on the DT, so we need to estimate the cost of those queries. At the moment, I'm assuming one P4S and one DTS per replicator.

With these points in mind, here's a sketch of an improved replicator architecture.

The replicator object is initialized with two defect tracker objects and a configuration object. The configuration object contains the translations between the defect trackers.
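To make the symmetry concrete, here's a rough Python sketch of the kind of interface I have in mind. None of the names (defect_tracker, changed_issues, update_issue, and so on) are settled; this is just to show the shape of the thing:

    class defect_tracker:
        # Abstract interface. Both the Perforce jobs system and the
        # defect tracker proper (TeamTrack in the first instance) would
        # be subclasses of this.
        def changed_issues(self):
            # Return a mapping from issue id to issue for everything
            # that has changed since the last poll.
            raise NotImplementedError
        def update_issue(self, issue):
            # Apply a translated issue. May refuse (raise) if the
            # update violates one of the tracker's rules.
            raise NotImplementedError

    class perforce_tracker(defect_tracker):
        pass        # wraps a P4S

    class teamtrack_tracker(defect_tracker):
        pass        # wraps a DTS

    class replicator:
        def __init__(self, left, right, config):
            # left and right are defect_tracker instances; config holds
            # the translations between them.
            self.left = left
            self.right = right
            self.config = config

The point is that the replicator itself never names Perforce or TeamTrack: everything it needs comes through the abstract interface, so supporting another defect tracker should just mean writing another subclass.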
The "replicator.poll" method asks each defect tracker what's changed. It pairs up the changed issues to make a list of things to do. For each changed issue in left defect tracker which hasn't changed in the right defect tracker, the replicator translates it for the right defect tracker and asks the right defect tracker to update it. The right defect tracker can refuse because this violates a rule, in which case the replicator informs the person who attempted the change and undoes it in the left defect tracker. And then vice versa, right to left. Each defect tracker compares the issue it's given with the existing issue to see if anything needs to be done. A separate polling object is initialized with a list of replicators and polls them round-robin, so that a number of DTS/P4S pairs can be serviced with the same replicator process. A. REFERENCES [GDR 2000-10-26] "Alpha test report: Perforce, 2000-10-26" (e-mail message); Gareth Rees; Ravenbrook Limited; 2000-10-26.