P4DTI issue job000045

Title	The replicator needs restarting too often
Status	closed
Priority	essential
Assigned user	Gareth Rees
Organization	Ravenbrook
Description	The replicator stops when there are certain errors, and the administrator has to figure this out and restart it.
Analysis	The replicator should probably never stop unless there's an assertion failure, and should have recovery behaviour for all run-time error conditions. We need to go through and find all the error conditions and add recovery behaviour.
How found	manual_test
Evidence	<http://www.ravenbrook.com/project/p4dti/doc/2000-11-01/quokka-alpha-test/> item 6
Observed in	0.3.2
Created by	Richard Brooksby
Created on	2000-11-21 16:04:37
Last modified by	Gareth Rees
Last modified on	2001-12-10 19:01:09
History	2000-11-21 RB Created from sources (see evidence).

Fixes

Change	Effect	Date	User	Description
5388	closed	2000-12-04 18:41:38	Gareth Rees	Removing resolver role from the integration: Tidied up the replication and mailing methods to make this part of the replicator clearer. Changed replicate_issue_to_job() to replicate_issue_dt_to_p4() and replicate_job_to_issue() to replicate_issue_p4_to_dt() for consistency with other method names. Added new methods overwrite_issue_dt_to_p4() and overwrite_issue_p4_to_dt(), which are wrappers around the above methods that also e-mail the affected parties. In order to be able to e-mail the owner of a job, added the replicator method user_email_address for getting a Perforce user's e-mail address and the replicator configuration parameter job-owner-field that names the owner field in a job (the automatic configuration generator specifies 'Owner' for this). Removed the mail_administrator() method (use mail() instead), and improved the mail() method. Improved many log entries by using issue.readable_name instead of issue.id. Re-organized and documented the replicate() function. It's no less complicated than it was before, but I think it's easier to understand. Functions like replicate() and replicate_issue_p4_to_dt() no longer return a meaningful error code. They either return or throw an exception. Changed conflict_policy so that it always returns 'dt'. This is the key change that removes the resolver's role. Changed the handling of errors to support the change in the conflict policy. There's now no need for conflict_error, so this is gone. The reverting of jobs to the corresponding issue is handled by the revert_issue_dt_to_p4() method: this ensures that it gets called in the correct place only (that is, when normal replication from Perforce to the defect tracker has failed). The replicate_many() method no longer does any handling of errors. All other errors are caught by the run() method so that the replicator can keep on going, except AssertionError and KeyboardInterrupt, both of which stop the replicator.
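
The error-handling policy described in change 5388 (run() catches every error so the replicator keeps going, except AssertionError and KeyboardInterrupt, which stop it) might look roughly like the sketch below. This is not the P4DTI source: it is written for present-day Python, and the function signature, the poll callable, and the max_polls and sleep parameters are assumptions introduced only to make the example self-contained.

```python
import logging
import time

log = logging.getLogger("replicator_sketch")  # hypothetical logger name


def run(poll, poll_period=0.0, max_polls=None, sleep=time.sleep):
    """Call poll() repeatedly, surviving ordinary run-time errors.

    poll      -- hypothetical callable that does one round of replication
    max_polls -- hypothetical limit so the sketch can terminate; None = forever
    Returns the number of polls that failed with a recoverable error.
    """
    failures = 0
    polls = 0
    while max_polls is None or polls < max_polls:
        polls += 1
        try:
            poll()
        except (AssertionError, KeyboardInterrupt):
            # An assertion failure means a bug; an interrupt means the
            # administrator wants to stop. Both end the loop.
            raise
        except Exception:
            # Any other error is logged and the loop carries on.
            failures += 1
            log.exception("poll %d failed; continuing", polls)
        sleep(poll_period)
    return failures
```

For example, a poll function that raises RuntimeError on its second call and succeeds otherwise would make run(poll, max_polls=3) complete all three polls and return 1, whereas a poll function that trips an assert would propagate AssertionError out of run() on the first call.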