climateprediction.net home page
Lost Work Unit!

Message boards : Number crunching : Lost Work Unit!
M0CZY
Joined: 17 Jul 08
Posts: 12
Credit: 1,431,573
RAC: 781
Message 34455 - Posted: 1 Aug 2008, 9:38:39 UTC

I am sad to report that my work unit has gone.
There was nothing wrong with it, and it had reached
44% in 210 hrs crunching.
While I was transferring the BOINC folder from one
computer to another, my USB flash drive died, and all
the data was lost, and no, I didn't have a backup of
it!
Let that be a lesson to me.
mo.v
Volunteer moderator
Joined: 29 Sep 04
Posts: 2363
Credit: 14,611,758
RAC: 0
Message 34457 - Posted: 1 Aug 2008, 12:00:57 UTC
Last modified: 1 Aug 2008, 12:01:22 UTC

That's really bad luck. For a newish member you were doing well to attempt to move the BOINC folder contents to another computer. Fortunately it was one of the relatively 'short' models. You'd have been even madder if you'd reached 95% of a 160-year model, though most of its data would have been usable for the researchers anyway from the decadal zipped uploads.

If there's anything we can learn from your experience, it's this:

* Try never to delete the contents of a BOINC folder until you have two backups, preferably in different places and better still on different drives.

* Deleted BOINC folder contents can be successfully restored from the Recycle Bin/Trash, so don't empty that until the entire transfer process is completed with BOINC + tasks up and running everywhere you want them.

* If you're in a mess and have the contents of two BOINC folders in Trash with all the files & folders mixed up, you can usually still recognise what belongs to which package by time of deletion and can selectively restore one package file by file.

When hardware fails, some of what I've said will just be pie in the sky.

It's a good idea for anyone who can afford it to invest in an external hard drive and keep it physically disconnected when not in use. The price of these has come down and the same external drive can store backups from more than one computer.
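
For anyone who would rather script the "two backups before you delete anything" rule, here is a minimal sketch in Python. It is only an illustration, not an official CPDN or BOINC tool: the function names and the example paths are made up, it assumes Python 3.8 or later, and it assumes BOINC has been shut down first so that no files change while they are being copied.

```python
import filecmp
import shutil
from pathlib import Path


def trees_match(src: Path, dst: Path) -> bool:
    """Return True if every file under src exists under dst with identical bytes.

    Empty directories are not checked; only files are compared.
    """
    for path in src.rglob("*"):
        if path.is_file():
            twin = dst / path.relative_to(src)
            # shallow=False forces a byte-by-byte comparison of the contents.
            if not twin.is_file() or not filecmp.cmp(path, twin, shallow=False):
                return False
    return True


def back_up_twice(src: Path, first: Path, second: Path) -> bool:
    """Copy src to two destinations and report whether both copies verified.

    The source folder is never deleted here; that step is left to the user.
    """
    ok = True
    for dest in (first, second):
        # dirs_exist_ok lets the copy reuse a destination folder that already exists.
        shutil.copytree(src, dest, dirs_exist_ok=True)
        ok = trees_match(src, dest) and ok
    return ok
```

A call might look like `back_up_twice(Path("C:/BOINC_data"), Path("E:/boinc_backup"), Path("F:/boinc_backup"))`, where all three paths are hypothetical and the two destinations sit on different drives. Only delete or overwrite the original folder if the call returns `True`.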

Better luck with your next flash drive!
M0CZY
Joined: 17 Jul 08
Posts: 12
Credit: 1,431,573
RAC: 781
Message 43585 - Posted: 21 Dec 2011, 11:03:08 UTC

Can an administrator with access to the project server manually abort work units 7648624 and 7719794 for me, as all my data has been lost again?
There is no reason for these work units to wait until their deadline before they get resent.
Unfortunately, I wasn't able to back up my data, as I had only just started one of the work units.
Iain Inglis
Volunteer moderator

Joined: 16 Jan 10
Posts: 1061
Credit: 6,463,915
RAC: 0
Message 43588 - Posted: 21 Dec 2011, 22:23:46 UTC

The administration of work units is automated. The project doesn't expect all models within a work unit to complete nor does it even rely on any model within a work unit completing - the analysis is statistical. So there's no manual intervention except where a complete batch is bad for some reason and the whole batch needs to be cancelled.

There have been completions on that machine. What do you think the problem is for these two models?
M0CZY
Joined: 17 Jul 08
Posts: 12
Credit: 1,431,573
RAC: 781
Message 43596 - Posted: 22 Dec 2011, 15:51:49 UTC - in response to Message 43588.  

There have been completions on that machine. What do you think the problem is for these two models?


There were no problems with the work units. The data loss was caused by my USB drive becoming corrupted, and all my BOINC data for that machine was lost.
For that particular machine I have no choice but to run from a USB flash drive.
Les Bayliss
Volunteer moderator

Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 43598 - Posted: 22 Dec 2011, 18:16:13 UTC - in response to Message 43596.  

It's well known that flash drives have a limited number of write cycles. They were intended to be a replacement for floppy disks, not hard drives.




©2022 climateprediction.net