Restore from Checkpoint

Questions and Answers : Wish list : Restore from Checkpoint

old_user2500

Joined: 28 Aug 04
Posts: 65
Credit: 9,636,280
RAC: 0
Message 3049 - Posted: 5 Sep 2004, 0:24:03 UTC

I have lost a number of workunits due to system crashes and lockups. In each case CPDN downloads a completely new workunit and I have to restart from zero.

It would be nice to have the option to checkpoint the CPDN system state from time to time and then restore and resume processing from a saved state if need be.
Keck_Komputers

Joined: 5 Aug 04
Posts: 426
Credit: 2,426,069
RAC: 0
Message 3105 - Posted: 5 Sep 2004, 22:04:06 UTC

CPDN checkpoints every 144 timesteps. However, any time there is a crash there is a chance that the checkpoint file will be corrupted, causing the model to either restart or be abandoned.
John Keck -- BOINCing since 2002/12/08
old_user2500

Joined: 28 Aug 04
Posts: 65
Credit: 9,636,280
RAC: 0
Message 3106 - Posted: 5 Sep 2004, 22:14:18 UTC

I probably phrased the question poorly. What I should have been asking for is a "restore point": a point from which the entire workunit can be restarted at my option.

I see a suggestion in another thread that the workunit can be backed up manually and then copied back if needed. I will give that a try.

Thanks for your response, John. It's an interesting project; it is just going to take a little time to work through the inevitable gotchas.
old_user2500

Joined: 28 Aug 04
Posts: 65
Credit: 9,636,280
RAC: 0
Message 3202 - Posted: 7 Sep 2004, 4:54:45 UTC - in response to Message 3106.  

> I see a suggestion in another thread that the workunit can be backed up
> manually and then copied back if needed. I will give that a try.

Sigh .. nice idea but no cigar. BOINC seems to keep track of which workunits it is working on. A restore has no effect if a new model has already been downloaded.

My workunit appears to have been toasted when I killed another user's logon session under Win XP Home. BOINC in my session appeared to be frozen; I suspect that it was running in the other user's session. After I killed that session and restarted BOINC, the CPDN app reported a corrupt or missing file and downloaded a new model.


©2024 climateprediction.net