Received: from martin.ravenbrook.com (martin.ravenbrook.com [193.112.141.241])
	by raven.ravenbrook.com (8.9.3/8.9.3) with ESMTP id MAA21607
	for ; Thu, 10 Aug 2000 12:25:45 +0100 (BST)
Received: from [193.112.141.252] (skylark.ravenbrook.com [193.112.141.252])
	by martin.ravenbrook.com (8.8.8/8.8.7) with ESMTP id MAA07731
	for ; Thu, 10 Aug 2000 12:24:02 +0100 (BST)
	(envelope-from rb@ravenbrook.com)
Mime-Version: 1.0
X-Sender: rb@pop3.ravenbrook.com
Message-Id:
Date: Thu, 10 Aug 2000 12:27:03 +0100
To: Perforce Defect Tracking Integration Project staff
From: Richard Brooksby
Subject: Replication mapping design notes
Content-Type: text/plain; charset="us-ascii" ; format="flowed"

To manage replication there has to be a 1:1 mapping between tTrack cases (or Bugzilla bugs) and Perforce jobs.

We won't use the Perforce job name to represent the mapping, because we need to support migration from just using Perforce jobs without renaming the existing jobs. It's probably not even a good idea to create jobs with special names (just to "gensym" them) because it will _look_ like we're using the name, and we're not. We don't want to confuse users or administrators or developers who won't read this paragraph.

Instead, we'll store the Perforce job name corresponding to an issue in the issue record. This means extending the record in the DTDB. We'll also store the issue ID in the Perforce job. This means extending the Perforce job spec.

Each replication daemon will have an organization-unique ID (the "replicator ID"), which is a short identifier-like string (e.g. "bork") assigned by the administrator at installation time. This will be used as a prefix to disambiguate field and column names where necessary.

All the fields with which we extend the job spec will include the replicator ID, to support possible extension to mapping each job to an issue in more than one DTDB. For example, we might add a field called "P4DTI-BORK-DTII", standing for Perforce Defect Tracking Integration, replicator ID "bork", defect tracking issue identifier. Other fields might be the last replicated time, etc.

There needs to be a corresponding thing in the DTDB. Again, the replicator ID should be used in the extra columns so that, possibly, the system could be extended to map each issue to more than one Perforce job via separate replicators.

One can identify which jobs are replicated in which DTDBs by asking Perforce using "p4 jobs" (there's a sketch of this below). There'll be obvious SQL queries to do the corresponding operation in the DTDB.

We could simplify this by assuming that each job is replicated once into a single defect tracker via a single replicator and have a field like "P4DTI-RID = bork". Better not. Someone's bound to want to extend it.

We could have a single field in the job which holds structured data with all the state, e.g. "P4DTI-STATE = { iid = 4, ...}", but this prevents use of "p4 jobs" to query the Perforce job "table", so we won't do that.

In general, we want to make the design extensible where this doesn't cost very much. That's always a good principle, and essential to evolutionary delivery ("Open ended design" -- Gilb).

We need to think about installation. For 0.1 it's OK to have a manual procedure for extending the job spec and the DTDB with the necessary fields, but we'll have to make it easier than that.
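As a sketch of the query mentioned above (the helper name is made up, and the exact jobview syntax needs checking), the replicator could ask Perforce which jobs it has mapped to issues by looking for jobs whose issue-identifier field is set, using the marshalled ("-G") output format:

    import marshal, os

    def replicated_jobs():
        # List the jobs which replicator "bork" has mapped to defect
        # tracking issues, i.e. jobs whose P4DTI-BORK-DTII field is set.
        # "p4 -G" emits one marshalled Python dictionary per job.
        stream = os.popen('p4 -G jobs -e "P4DTI-BORK-DTII=*"', 'r')
        jobs = []
        while 1:
            try:
                jobs.append(marshal.load(stream))
            except EOFError:
                break
        stream.close()
        return jobs

The corresponding operation in the DTDB would be a query along the lines of "SELECT * FROM cases WHERE p4dti_bork_jobName IS NOT NULL" (table and column names illustrative).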
At installation you need to (inter alia):

- assign an RID to the replicator
- extend the Perforce job spec with the fields the replicator needs
- extend the DTDB issue table with the fields the replicator needs
- tell the replicator how to contact the Perforce server
- tell the replicator how to contact the DT server and what kind it is (e.g. "tTrack")
- tell the replicator who to contact when there's a problem (and how?)
- tell the replicator which fields will be replicated (and possibly their types)

Perforce field types are "word", "line", "text", "select" (validated as one of a list of keywords), and "date" (time). Chris floated the idea of adding "number" when we met in 2000-07.

For 0.1 we'll do the extension by hand and have the replicator's configuration in variables at the beginning of the Python script. We'll hack tTrack's DB with MS Access to add the extra column and connect the cases with Perforce jobs. Then all we have to do is detect changes to the cases and do "p4 job" for each changed case, squirting in fields from the case.

There are some "required" fields which we must supply; see the Perforce System Administrator's Guide. Perforce jobs need certain fields filled in to begin with: for example, the status, the user, and possibly other site-specific fields. (I added "History" as a required field to the jobs at perforce.ravenbrook.com, for example.) So the administrator will need to give the replicator defaults for these fields, for when the data is absent or not wanted from the DT. (In most cases the "status" will be replicated and transformed, but maybe not.)

There may be some transformation to do on the data from the DTDB to Perforce (and vice versa). In fact, there might be an arbitrary mapping from one side to the other, for example, combining information from several DT fields into one p4 field. To be most general, we need a function from the issue data to the p4 job data. We can provide various functions: a simple copy driven by a simple table (which I suggest for 0.1), or a function which takes a table and a function for each field.

Here's a sketch for 0.1:

    def simple_transform_dt_to_p4(case):
        data = default_job_data.copy()
        data['Job'] = case['p4dti_' + rid + '_jobName']
        for field in replicated_fields:
            data[field.p4_name] = case[field.dt_name]
        return data

    transform_dt_to_p4 = simple_transform_dt_to_p4

    while 1:
        # sleep for a bit, or wait for a trigger of some sort
        for case in dt.changes():
            data = transform_dt_to_p4(case)
            p4_job_input(data)   # i.e. "p4 -G job -i < data"; sketched below

For 0.2 we can add transforming functions of various sorts.
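Before getting to 0.2, a couple of supporting pieces the 0.1 sketch above takes for granted (nothing here is decided; the names and values are only examples). First, the administrator-supplied defaults for the required job fields, used when the data is absent or not wanted from the DT:

    # Defaults for the "required" job fields; the keys must match the
    # site's job spec field names.
    default_job_data = {
        'Status': 'open',
        'User': 'p4dti-replicator',   # a Perforce user for the replicator
        'Description': 'Replicated from the defect tracker.\n',
        # plus any site-specific required fields, e.g. 'History'
    }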
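Second, writing the job back. The sketch hand-waves with "p4 -G job -i < data"; in the Python script that might look something like this (the helper name is made up):

    import marshal, os

    def p4_job_input(data):
        # Pipe a marshalled dictionary of job fields into "p4 -G job -i",
        # which creates or updates the job named by data['Job'].
        stream = os.popen('p4 -G job -i', 'w')
        marshal.dump(data, stream)
        stream.close()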
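And here's a taste of what a 0.2 transforming function might look like: instead of a plain copy, the table of replicated fields could carry a per-field translation function (a sketch only; the attribute names are made up):

    # Each replicated field gets an optional translation function which
    # maps a defect tracker value to a Perforce value.
    def transform_dt_to_p4_with_translators(case):
        data = default_job_data.copy()
        data['Job'] = case['p4dti_' + rid + '_jobName']
        for field in replicated_fields:
            value = case[field.dt_name]
            if field.translate_dt_to_p4:
                value = field.translate_dt_to_p4(value)
            data[field.p4_name] = value
        return data

For example, a date field might need translating from the DT's date format to the format Perforce expects in a "date" field.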