P4DTI issue job000443

Title: Replicator needlessly re-does work
Status: suspended
Priority: essential
Assigned user: Nick Barnes
Organization: Ravenbrook
Description: The replicator interprets many errors when replicating to Perforce as general server errors. This causes it to abandon the current poll and start over next time. However, if the error isn't a general server error then this starting over is needlessly wasteful, and can cause unnecessary conflicts.
This came up again in job00485.
Analysis: See the customer report [1] for an example of an error when replicating to Perforce and a description of how bad the problem can be when it fails. See also my analysis [2]. It's hard to find a good solution to this. Here are some possibilities, and what's wrong with them:
1. Continue when you get an error in replicating to Perforce. (Goes very wrong if the error is a real server error that has been misinterpreted. Also, it leaves the issue in an inconsistent state, violating requirement 1.)
2. Remember locally to the replicator that the issue needs to be replicated again. Don't forget to notice and take appropriate action when it also appears in the course of normal replication! (If the problem doesn't go away, the administrator gets an e-mail every poll. Also, when the replicator stops, we forget that the issue needs to be replicated and leave it inconsistent.)
3. Remember in the defect tracker database that the issue needs to be replicated again. (If the problem doesn't go away, the administrator gets an e-mail every poll.)
3b. [NB 2002-03-27] Flag problem jobs and issues. In every poll, try to replicate all the problem jobs and issues. Send e-mail if you succeed, but not if you fail. Don't fail the poll for a failed replication of a problem job or issue.
4. As 2 or 3, but keep a separate list of issues with problems replicating them to Perforce, and apply exponential back-off to these issues only. (Complex. E-mail volume might still not be acceptable.)
5. This wouldn't be so bad if the replicator didn't generate so much e-mail when a lot of jobs conflict. See job000444.
See Michael Biczynski's analysis [3].
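
Possibilities 3b and 4 can be combined, and a minimal Python sketch may make the bookkeeping clearer. This is not P4DTI code: the names ProblemIssues, ReplicationError, replicate and notify are all hypothetical, delays are counted in polls rather than seconds, and the list is held in memory, so it has the weakness noted under possibility 2 unless it is stored in the defect tracker as in possibility 3.

```python
class ReplicationError(Exception):
    """Hypothetical: an error specific to one issue, not a general server error."""


class ProblemIssues:
    """Hypothetical list of issues that failed to replicate to Perforce.

    Each flagged issue is retried with exponential back-off, counted in polls.
    """

    def __init__(self, base_delay=1, max_delay=64):
        self.base_delay = base_delay  # polls to wait after the first failure
        self.max_delay = max_delay    # upper bound on the wait between retries
        self.entries = {}             # issue id -> (next poll to retry, current delay)

    def flag(self, issue_id, poll):
        """Record a failure at this poll, doubling the delay on repeat failures."""
        if issue_id in self.entries:
            _, delay = self.entries[issue_id]
            delay = min(delay * 2, self.max_delay)
        else:
            delay = self.base_delay
        self.entries[issue_id] = (poll + delay, delay)

    def due(self, poll):
        """Return the flagged issues whose retry is due at this poll."""
        return sorted(i for i, (next_poll, _) in self.entries.items()
                      if next_poll <= poll)

    def retry(self, poll, replicate, notify):
        """Retry the due issues and return those that recovered.

        E-mail (notify) is sent on success only. A ReplicationError re-flags
        the issue without failing the poll; any other exception propagates,
        so a genuine server error still abandons the poll.
        """
        recovered = []
        for issue_id in self.due(poll):
            try:
                replicate(issue_id)
            except ReplicationError:
                self.flag(issue_id, poll)
            else:
                del self.entries[issue_id]
                notify(issue_id)
                recovered.append(issue_id)
        return recovered
```

In such a scheme the replicator would call flag() where it currently abandons the poll for an issue-specific error, and call retry() once at the start of each poll. Back-off limits how often a persistent failure is retried, and because failures are silent the administrator is not e-mailed every poll.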
How found: customer
Evidence: [1] <http://info.ravenbrook.com/mail/2001/12/19/02-54-20/0.txt>
          [2] <http://info.ravenbrook.com/mail/2001/12/19/16-26-34/0.txt>
          [3] <http://info.ravenbrook.com/mail/2001/12/19/19-05-25/0.txt>
Observed in: 1.2.1
Created by: Gareth Rees
Created on: 2001-12-19 16:51:38
Last modified by: Nick Barnes
Last modified on: 2018-07-05 17:27:53
History: 2001-12-19 GDR Created.
         2018-07-05 NB Suspended because the P4DTI is obsolete.