"No space left on device" error

old_user452941
Joined: 22 May 07
Posts: 35
Credit: 1,065,741
RAC: 0
Message 35438 - Posted: 6 Nov 2008, 2:19:25 UTC
Last modified: 6 Nov 2008, 2:54:03 UTC

One of my HadSM3 6.07 slab models here is trying to upload its final .zip file (the model run is 100% complete) but BM 6.2.19 is showing these messages:

11/5/2008 6:06:47 PM|climateprediction.net|Started upload of hadsm3fub_k2xu_005968597_1_3.zip
11/5/2008 6:07:19 PM|climateprediction.net|[error] Error on file upload: can't open file /home/cpdn/boinc/hadsm3fub_k2xu_005968597_1_3.zip: No space left on device
11/5/2008 6:07:19 PM|climateprediction.net|Temporarily failed upload of hadsm3fub_k2xu_005968597_1_3.zip: transient upload error
11/5/2008 6:07:19 PM|climateprediction.net|Backing off 3 hr 1 min 43 sec on upload of hadsm3fub_k2xu_005968597_1_3.zip

My disk is NOT out of space - there are many GBs left. On the BM Transfers tab, it uploads the whole 1.27 MB file and the progress on this tab shows 100%, but it keeps retrying again and again with the above errors.

I have bounced BM and rebooted but I\'m still getting the error. Any ideas what is happening?
ID: 35438 · Report as offensive     Reply Quote
Profile mo.v
Volunteer moderator
Avatar

Send message
Joined: 29 Sep 04
Posts: 2363
Credit: 14,611,758
RAC: 0
Message 35439 - Posted: 6 Nov 2008, 2:56:23 UTC
Last modified: 6 Nov 2008, 3:02:05 UTC

The device in question won't be your own disk. It means the CPDN server disk. I'll report this to Milo. I thought the new server had sorted the disk space problem for a while. Apparently the recent CPDN database backup wasn't entirely successful and the job's going to have to be repeated next week. I don't know whether this has caused the disk to fill up.
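
One way to see that the message is about the server is to pull the file path out of the error line. The sketch below is only an illustration: it assumes the `timestamp|project|message` layout and the "can't open file ...: reason" wording seen in the log lines quoted in this thread, and the function name `parse_upload_error` is made up for the example. Other BOINC versions may word their messages differently.

```python
import re

# Assumed wording, taken from the error line quoted in this thread:
#   "... can't open file <path>: <reason>"
ERROR_RE = re.compile(r"can't open file (?P<path>\S+): (?P<reason>.+)$")


def parse_upload_error(line):
    """Return (project, path, reason) for an upload error line, else None."""
    # Assumed layout: timestamp|project|message
    parts = line.split("|", 2)
    if len(parts) != 3:
        return None
    _timestamp, project, message = parts
    match = ERROR_RE.search(message)
    if match is None:
        return None
    return project, match.group("path"), match.group("reason")


line = (
    "11/5/2008 6:07:19 PM|climateprediction.net|[error] Error on file upload: "
    "can't open file /home/cpdn/boinc/hadsm3fub_k2xu_005968597_1_3.zip: "
    "No space left on device"
)
print(parse_upload_error(line))
```

The path that comes back sits under /home/cpdn/boinc, which looks like a directory on the upload server rather than anything in a volunteer's own BOINC folder.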

Just suspend your BOINC's network activity for the time being. It's best to avoid repeated failed upload attempts.

(I'm sure I asked in a BOINC Trac ticket ages ago for this 'No space left on device' BOINC message to be reworded so crunchers don't get the impression that their own disk has filled up and possibly panic. I think I suggested 'No space left on server disk' instead. I'll check my previous tickets to either request or re-request this rewording. As it stands this is a dreadful message. Experienced crunchers don't all understand it, so heaven help the newbies. Could you please tell me what BOINC version that computer has?)
Cpdn news
ID: 35439
old_user452941
Joined: 22 May 07
Posts: 35
Credit: 1,065,741
RAC: 0
Message 35440 - Posted: 6 Nov 2008, 3:04:24 UTC - in response to Message 35439.  
Last modified: 6 Nov 2008, 3:12:41 UTC

Thanks, Mo.

What happens to the trickles from my other four CPDN models currently running when I suspend network activity? Are they queued and appear on the Transfers tab or are they just repeated in full when the network activity is unsuspended?
ID: 35440
astroWX
Volunteer moderator
Joined: 5 Aug 04
Posts: 1496
Credit: 95,522,203
RAC: 0
Message 35442 - Posted: 6 Nov 2008, 3:44:47 UTC

Given that the UK time is in the wee hours and Mo might not be awake for hours, I'll try to substitute, temporarily.

- Trickles do not appear in the Transfers Tab, only 'uploads', which vary per the Model (Ten Model years for some, Model-thirds for some, -quarters for others.) You can see stored Trickles if you dig into the Model's/Models' files.

- Yes, Trickles and uploads will be stored in your boinc folder until the next time boinc gets an Internet fix. No problem -- unless the machine is starved for hard-drive space.
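
If you want to check whether a machine is short of space before leaving network activity suspended for a while, something like the following rough sketch would do. It uses Python's standard `shutil.disk_usage`; the helper name `free_gib` and the `"."` path are placeholders, so point it at wherever your own BOINC folder lives.

```python
import shutil


def free_gib(path="."):
    """Free space, in GiB, on the file system that holds `path`."""
    usage = shutil.disk_usage(path)
    return usage.free / 2**30


# "." is only a placeholder; substitute your own BOINC data folder.
print(f"{free_gib('.'):.1f} GiB free")
```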
"We have met the enemy and he is us." -- Pogo
Greetings from coastal Washington state, the scenic US Pacific Northwest.
ID: 35442
old_user452941
Joined: 22 May 07
Posts: 35
Credit: 1,065,741
RAC: 0
Message 35444 - Posted: 6 Nov 2008, 4:57:56 UTC - in response to Message 35442.  

Thanks, Astro.
ID: 35444
mo.v
Volunteer moderator
Joined: 29 Sep 04
Posts: 2363
Credit: 14,611,758
RAC: 0
Message 35447 - Posted: 6 Nov 2008, 18:59:31 UTC

Ed, I reported the problem to Milo and had a private message from him hours ago while I was at work. He says (and I hope he doesn't mind me quoting him):

Mo,
Do you have any more information about this, e.g. any messages from the client suggesting what host it's trying to connect to? I've not received any warnings from our scripts and the obvious culprit as suggested by the URL seems fine.
Thanks,
Milo.


Could you enable network activity again to see whether your zip file now uploads?

I'll give Milo a link to this thread (should have done so earlier).
Cpdn news
ID: 35447
Les Bayliss
Volunteer moderator
Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 35449 - Posted: 6 Nov 2008, 20:47:12 UTC

Different error message, but possibly the same problem:


7/11/2008 7:26:58 AM||Resuming network activity
7/11/2008 7:26:58 AM|climateprediction.net|Sending scheduler request: To send trickle-up message. Requesting 0 seconds of work, reporting 0 completed tasks
7/11/2008 7:26:59 AM|climateprediction.net|Started upload of hadsm3fub_k2sk_005968407_3_3.zip
7/11/2008 7:27:03 AM|climateprediction.net|Scheduler request succeeded: got 0 new tasks
7/11/2008 7:27:04 AM||Project communication failed: attempting access to reference site
7/11/2008 7:27:04 AM|climateprediction.net|Temporarily failed upload of hadsm3fub_k2sk_005968407_3_3.zip: connect() failed
7/11/2008 7:27:04 AM|climateprediction.net|Backing off 1 min 0 sec on upload of hadsm3fub_k2sk_005968407_3_3.zip
7/11/2008 7:27:06 AM||Internet access OK - project servers may be temporarily down.
7/11/2008 7:27:19 AM||Suspending network activity - user request


This is for the overnight trickles and final zip file of a slab.
According to client_state.xml, the zip is going to oerc.

According to the server status page, uploadatm is not running.

ID: 35449
old_user452941
Joined: 22 May 07
Posts: 35
Credit: 1,065,741
RAC: 0
Message 35450 - Posted: 7 Nov 2008, 3:52:09 UTC

Mo and Milo,

My CPDN uploads are still not working - I've got three queued on the Transfers tab, all for models that have finished.

I'm now getting the same message that Les is getting:

11/6/2008 7:39:27 PM|climateprediction.net|Started upload of hadsm3fub_k2xu_005968597_1_3.zip
11/6/2008 7:39:35 PM||Project communication failed: attempting access to reference site
11/6/2008 7:39:35 PM|climateprediction.net|Temporarily failed upload of hadsm3fub_k2xu_005968597_1_3.zip: connect() failed
11/6/2008 7:39:35 PM|climateprediction.net|Backing off 3 hr 1 min 15 sec on upload of hadsm3fub_k2xu_005968597_1_3.zip
11/6/2008 7:39:36 PM||Internet access OK - project servers may be temporarily down.

For whatever it's worth, my other project, Superlink at Technion, is communicating just fine.

All the uploads are apparently trying to go to http://uploader.oerc.ox.ac.uk/cpdn_cgi/file_upload_handler
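
For anyone who wants to work out when the client will next retry by itself, the back-off in those log lines can be turned into a time. This is just a sketch: it assumes the "Backing off N hr N min N sec" wording seen in the messages quoted in this thread, and the name `parse_backoff` is made up for the example.

```python
import re
from datetime import datetime, timedelta

# Assumed wording, as in the log lines quoted in this thread:
#   "Backing off 3 hr 1 min 15 sec on upload of ..."
#   "Backing off 1 min 0 sec on upload of ..."
BACKOFF_RE = re.compile(
    r"Backing off (?:(?P<hr>\d+) hr )?(?:(?P<min>\d+) min )?(?P<sec>\d+) sec"
)


def parse_backoff(message):
    """Return the back-off as a timedelta, or None if the line has none."""
    match = BACKOFF_RE.search(message)
    if match is None:
        return None
    return timedelta(
        hours=int(match.group("hr") or 0),
        minutes=int(match.group("min") or 0),
        seconds=int(match.group("sec")),
    )


# Timestamp and message taken from the log lines above.
logged_at = datetime(2008, 11, 6, 19, 39, 35)
delay = parse_backoff(
    "Backing off 3 hr 1 min 15 sec on upload of hadsm3fub_k2xu_005968597_1_3.zip"
)
print("next automatic retry at about", logged_at + delay)
```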
ID: 35450
Milo Thurston
Volunteer moderator
Volunteer developer
Joined: 2 Mar 06
Posts: 253
Credit: 363,646
RAC: 0
Message 35453 - Posted: 7 Nov 2008, 9:09:21 UTC - in response to Message 35450.  


All the uploads are apparently trying to go to http://uploader.oerc.ox.ac.uk/cpdn_cgi/file_upload_handler


That's strange, as that particular machine has about 5.5TB free on it.
Anyway, I think that I have tracked down the culprit: a machine in atmospheric physics with a broken sendmail configuration that wasn't warning us when it was full. I've cleared this out and you should now start seeing some results.
ID: 35453
old_user452941
Joined: 22 May 07
Posts: 35
Credit: 1,065,741
RAC: 0
Message 35454 - Posted: 7 Nov 2008, 12:52:43 UTC

Everything is working OK now. Thank you.
ID: 35454
mo.v
Volunteer moderator
Joined: 29 Sep 04
Posts: 2363
Credit: 14,611,758
RAC: 0
Message 35455 - Posted: 7 Nov 2008, 13:14:30 UTC

Good, so I can now say that in the News threads. I didn't want to announce Monday's planned outage before this problem was fixed.
Cpdn news
ID: 35455

©2024 climateprediction.net