Date: Wed, 8 Nov 2000 16:55:31 +0000
To: Perforce Defect Tracking Integration Project staff
From: Richard Brooksby
Subject: Replicator architecture design discussion, 2000-11-01/02

Gareth and I have been discussing the overall design of the replicator in response to the problems we've seen during alpha tests. I believe that better abstraction will reduce the number of internal dependencies and make the replicator more reliable in the face of unexpected configurations. Many of our problems have stemmed from assumptions about the kinds of configurations that people use.

The main idea I've had is to design an abstract class of defect trackers, of which both Perforce and the DT are subclasses. This means that the replicator will be symmetrical: it will just take two defect trackers and replicate between them. This will force the replicator to abstract away from the details of both Perforce and the defect tracker.

We also discussed scalability. We've had several scalability issues:

1. When the replicator starts it tries to replicate all existing changelists ("//info.ravenbrook.com/project/p4dti/version/0.3/code/replicator/run_teamtrack.py#1" line 7) [GDR 2000-10-26, item 4]. There are often many tens of thousands. The replicator automatically replicates changelists involved in a change to an issue, so there's no need to do this except to provide the DT with a complete set. This probably isn't needed, because no one has expressed any interest in editing fixes by description from the DT.

2. When the replicator starts for the first time it selects all entries in the TeamTrack CHANGES table since the beginning of time. At Quokka Sports we saw over 130,000 such entries, and each has to come through a networked MS SQL Server interface, through the TeamShare API, and then through the Python interface. We didn't wait for it to finish. This is likely to be a problem with other defect trackers too. We think most organizations will be happy to replicate only the issues that change after they start replication for the first time, plus any others migrated by an external script.

3. We were concerned about batching operations in the replicator to reduce the number of individual queries. This means passing around lists of things to do rather than individual items in most cases, and shouldn't be too much of a challenge.

4. The current implementation only allows one P4S per replicator. The abstract design proposed above doesn't help matters. We need to decide whether we want to implement multiple P4Ss per replicator. The main reason to do so is to reduce the number of polling queries on the DT, so we need to estimate the cost of those queries. At the moment, I'm assuming one P4S and one DTS per replicator.

With these points in mind, here's a sketch of an improved replicator architecture.

The replicator object is initialized with two defect tracker objects and a configuration object. The configuration object contains the translations between the defect trackers.
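To make the symmetry concrete, here's a rough Python sketch of the kind of interface I have in mind. None of the names (defect_tracker, changed_issues, update_issue, and so on) are settled; this is just to show the shape of the thing:

    class defect_tracker:
        # Abstract interface. Both the Perforce jobs system and the
        # defect tracker proper (TeamTrack in the first instance) would
        # be subclasses of this.
        def changed_issues(self):
            # Return a mapping from issue id to issue for everything
            # that has changed since the last poll.
            raise NotImplementedError
        def update_issue(self, issue):
            # Apply a translated issue. May refuse (raise) if the
            # update violates one of the tracker's rules.
            raise NotImplementedError

    class perforce_tracker(defect_tracker):
        pass        # wraps a P4S

    class teamtrack_tracker(defect_tracker):
        pass        # wraps a DTS

    class replicator:
        def __init__(self, left, right, config):
            # left and right are defect_tracker instances; config holds
            # the translations between them.
            self.left = left
            self.right = right
            self.config = config

The point is that the replicator itself never names Perforce or TeamTrack: everything it needs comes through the abstract interface, so supporting another defect tracker should just mean writing another subclass.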
The "replicator.poll" method asks each defect tracker what's changed. It pairs up the changed issues to make a list of things to do. For each changed issue in left defect tracker which hasn't changed in the right defect tracker, the replicator translates it for the right defect tracker and asks the right defect tracker to update it. The right defect tracker can refuse because this violates a rule, in which case the replicator informs the person who attempted the change and undoes it in the left defect tracker. And then vice versa, right to left. Each defect tracker compares the issue it's given with the existing issue to see if anything needs to be done. A separate polling object is initialized with a list of replicators and polls them round-robin, so that a number of DTS/P4S pairs can be serviced with the same replicator process. A. REFERENCES [GDR 2000-10-26] "Alpha test report: Perforce, 2000-10-26" (e-mail message); Gareth Rees; Ravenbrook Limited; 2000-10-26.