climateprediction.net home page
MORE DOWNLOAD ERRORS

JIM
Joined: 31 Dec 07
Posts: 1088
Credit: 19,563,235
RAC: 320
Message 47653 - Posted: 26 Nov 2013, 0:22:14 UTC
Last modified: 26 Nov 2013, 0:24:49 UTC


ID: 47653
MikeMarsUK
Volunteer moderator
Joined: 13 Jan 06
Posts: 1498
Credit: 15,613,038
RAC: 1
Message 47655 - Posted: 26 Nov 2013, 0:37:00 UTC


It's a brand-new batch, and from what people are reporting it doesn't look like it is a good batch. Andy will need to take a look tomorrow.

I'm a volunteer and my views are my own.
News and Announcements and FAQ
ID: 47655
MartinNZ
Joined: 22 Mar 06
Posts: 144
Credit: 24,695,428
RAC: 0
Message 47657 - Posted: 26 Nov 2013, 9:22:06 UTC - in response to Message 47655.  

Just to let you know all my PNW downloads have failed - all 11 of them, including the 3 that are currently shown as In Progress. Comp ID 1290283.
ID: 47657
MikeMarsUK
Volunteer moderator
Joined: 13 Jan 06
Posts: 1498
Credit: 15,613,038
RAC: 1
Message 47658 - Posted: 26 Nov 2013, 10:06:01 UTC - in response to Message 47657.  

Just to let you know all my PNW downloads have failed - all 11 of them, including the 3 that are currently shown as In Progress. Comp ID 1290283.


Yeah, it's a bad batch.

ID: 47658
Byron Leigh Hatch @ team Carl Sagan
Joined: 17 Aug 04
Posts: 289
Credit: 43,673,626
RAC: 4,436
Message 47663 - Posted: 26 Nov 2013, 15:59:13 UTC

Also reporting that all my PNW downloads have failed.
ID: 47663
Les Bayliss
Volunteer moderator
Joined: 5 Sep 04
Posts: 7007
Credit: 20,926,388
RAC: 5,087
Message 47664 - Posted: 26 Nov 2013, 19:03:15 UTC

The recent batch was a mass re-issue from June 2012 by the BOINC software, so Abort anything that hasn't failed by itself.

ID: 47664
Bonsai911
Joined: 9 Sep 04
Posts: 210
Credit: 28,317,278
RAC: 210
Message 47669 - Posted: 26 Nov 2013, 22:52:19 UTC

????

Should we also abort the hadcm3n tasks?
ID: 47669
Les Bayliss
Volunteer moderator
Joined: 5 Sep 04
Posts: 7007
Credit: 20,926,388
RAC: 5,087
Message 47670 - Posted: 27 Nov 2013, 0:32:37 UTC - in response to Message 47669.  

Ah. No information available about them. I have one, and I'm letting it run.

ID: 47670
MartinNZ
Joined: 22 Mar 06
Posts: 144
Credit: 24,695,428
RAC: 0
Message 47671 - Posted: 27 Nov 2013, 0:44:54 UTC - in response to Message 47670.  
Last modified: 27 Nov 2013, 0:46:53 UTC

I thought the recent hadcm3n models (I have 7) were just normal downloads, but since Bonsai911's post, I do notice that some of them have exactly the same sent times as the PNW models. However, looking at the recently sent hadcm3n work unit IDs, all of the tasks had 100% failures in the past, so they are possibly part of a normal reissue. Like Les, I'm letting them run.

Perhaps a wrong flag got set somewhere that resulted in the PNW models being released as well as reissued hadcm3n models?
ID: 47671
Jean-David Beyer
Joined: 5 Aug 04
Posts: 320
Credit: 3,106,205
RAC: 1,936
Message 47672 - Posted: 27 Nov 2013, 2:32:06 UTC - in response to Message 47653.  

Me too. (I am not sure whether the following two items are the same work unit. I think they are, but if not, they behave similarly.)

8170815

26-Nov-2013 19:00:03 [climateprediction.net] Requesting new tasks
26-Nov-2013 19:00:08 [climateprediction.net] Scheduler request completed: got 1 new tasks
26-Nov-2013 19:00:10 [climateprediction.net] Started download of hadam3p_pnw_c6xt_1969_1_008024812.zip
26-Nov-2013 19:00:10 [climateprediction.net] Started download of atmos_c6xt_1969_1_008024812_0.gz
26-Nov-2013 19:00:12 [climateprediction.net] Giving up on download of hadam3p_pnw_c6xt_1969_1_008024812.zip: file not found
26-Nov-2013 19:00:12 [climateprediction.net] Giving up on download of atmos_c6xt_1969_1_008024812_0.gz: file not found
26-Nov-2013 19:00:12 [climateprediction.net] Started download of pnw_c6xt_1969_1_008024812_0.gz
26-Nov-2013 19:00:12 [climateprediction.net] Started download of HadISST_SST_N96_1968_12_1971_01f.gz
26-Nov-2013 19:00:13 [climateprediction.net] Giving up on download of pnw_c6xt_1969_1_008024812_0.gz: file not found
26-Nov-2013 19:00:13 [climateprediction.net] Started download of HadISST_SI_N96_1968_12_1971_01f.gz
26-Nov-2013 19:00:15 [climateprediction.net] Finished download of HadISST_SI_N96_1968_12_1971_01f.gz
26-Nov-2013 19:00:15 [climateprediction.net] Started download of so2dms_N96_1968_12_1971_02.gz
26-Nov-2013 19:00:18 [climateprediction.net] Finished download of so2dms_N96_1968_12_1971_02.gz
26-Nov-2013 19:00:20 [climateprediction.net] Finished download of HadISST_SST_N96_1968_12_1971_01f.gz
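
For anyone wanting to pull the failed files out of a longer event log, here is a minimal Python sketch. It assumes only the "Giving up on download of <file>: <reason>" wording seen in the lines above and ignores whatever timestamp and project prefix comes first. The name `failed_downloads` is made up for illustration and is not part of BOINC.

```python
import re

# Matches the "Giving up on download of <file>: <reason>" wording seen in
# the log above. The timestamp/project prefix is ignored.
GIVE_UP = re.compile(r"Giving up on download of (\S+): (.+)$")


def failed_downloads(log_text):
    """Return a list of (filename, reason) pairs for abandoned downloads."""
    failures = []
    for line in log_text.splitlines():
        match = GIVE_UP.search(line.strip())
        if match:
            failures.append((match.group(1), match.group(2)))
    return failures


if __name__ == "__main__":
    sample = """\
26-Nov-2013 19:00:12 [climateprediction.net] Giving up on download of atmos_c6xt_1969_1_008024812_0.gz: file not found
26-Nov-2013 19:00:15 [climateprediction.net] Finished download of HadISST_SI_N96_1968_12_1971_01f.gz
"""
    for name, reason in failed_downloads(sample):
        print(name, "->", reason)
```

Because the pattern keys on the message text rather than the prefix, it should also cope with other event log layouts, as long as the message wording is the same.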

ID: 47672
Eirik Redd
Joined: 31 Aug 04
Posts: 364
Credit: 114,143,086
RAC: 993
Message 47673 - Posted: 27 Nov 2013, 2:54:43 UTC

Looking at the logs here last 2 days -- seems like a batch of hadcm3n/RAPID/Rapit hit the server and got taken up by crunchers, no problems at all. Strange but good to be running a range of models some starting in 1880, and some in 2060.

The regional batch - yeah, problems.

The crew will likely figure out and fix the regional model problem soonish.

Meanwhile - keep on crunching.

ID: 47673
Thyme Lawn
Volunteer moderator
Joined: 5 Aug 04
Posts: 1276
Credit: 15,546,782
RAC: 0
Message 47677 - Posted: 27 Nov 2013, 9:43:30 UTC

As far as I can see all of the download failures are for reissued tasks from workunits created on 23rd July 2012. Every one I've looked at has one task which appears to be unsent (status unknown), for example WU 8179645.

I suspect it's another instance of BOINC automatically generating new tasks 12500 hours after creation of a workunit which is still in the database and hasn't been completed or reached its error limit. This can only happen on projects like CPDN which don't remove results and workunits from the database. The download files were probably deleted from the server when that batch of work was "completed" (or weren't transferred over when the server was rebuilt).
"The ultimate test of a moral society is the kind of world that it leaves to its children." - Dietrich Bonhoeffer
ID: 47677
JIM
Joined: 31 Dec 07
Posts: 1088
Credit: 19,563,235
RAC: 320
Message 47819 - Posted: 19 Dec 2013, 21:50:01 UTC
Last modified: 19 Dec 2013, 21:55:38 UTC


ID: 47819
Richard Haselgrove
Joined: 1 Jan 07
Posts: 377
Credit: 7,127,507
RAC: 0
Message 47820 - Posted: 19 Dec 2013, 22:29:51 UTC - in response to Message 47819.  

ID: 47820
Alan K
Joined: 22 Feb 06
Posts: 295
Credit: 14,768,121
RAC: 313
Message 47823 - Posted: 20 Dec 2013, 10:29:18 UTC

Looks like I have a similar problem:-

20/12/2013 06:15:36 | climateprediction.net | Scheduler request completed: got 1 new tasks
20/12/2013 06:15:38 | climateprediction.net | Started download of hadam3p_eu_a166_1984_1_008055647.zip
20/12/2013 06:15:38 | climateprediction.net | Started download of atmos_a166_1984_1_008055647_0.gz
20/12/2013 06:15:40 | climateprediction.net | Giving up on download of hadam3p_eu_a166_1984_1_008055647.zip: permanent HTTP error
20/12/2013 06:15:40 | climateprediction.net | Giving up on download of atmos_a166_1984_1_008055647_0.gz: permanent HTTP error
20/12/2013 06:15:40 | climateprediction.net | Started download of eu_a166_1984_1_008055647_0.gz
20/12/2013 06:15:40 | climateprediction.net | Started download of HadISST_SST_N96_1983_12_1986_01f.gz
20/12/2013 06:15:42 | climateprediction.net | Giving up on download of eu_a166_1984_1_008055647_0.gz: permanent HTTP error
20/12/2013 06:15:42 | climateprediction.net | Started download of HadISST_SI_N96_1983_12_1986_01f.gz
20/12/2013 06:15:43 | climateprediction.net | Finished download of HadISST_SST_N96_1983_12_1986_01f.gz
20/12/2013 06:15:43 | climateprediction.net | Finished download of HadISST_SI_N96_1983_12_1986_01f.gz
20/12/2013 06:15:43 | climateprediction.net | Started download of so2dms_N96_1983_12_1986_02.gz
20/12/2013 06:15:45 | climateprediction.net | Finished download of so2dms_N96_1983_12_1986_02.gz
ID: 47823
Iain Inglis
Volunteer moderator
Joined: 16 Jan 10
Posts: 982
Credit: 3,152,324
RAC: 2,152
Message 47825 - Posted: 20 Dec 2013, 15:11:13 UTC - in response to Message 47823.  

[chavk wrote:]
Looks like I have a similar problem:-

20/12/2013 06:15:36 | climateprediction.net | Scheduler request completed: got 1 new tasks
20/12/2013 06:15:38 | climateprediction.net | Started download of hadam3p_eu_a166_1984_1_008055647.zip ...

That hadam3p_eu_a166_1984_1_008055647 work unit was generated on 17 July 2012. At some point when the bulk of the models in that batch had completed, the project removed the download files, not realising that the work units would reappear later. A better policy might have been to remove only those work units that had completed and leave us to slowly mop up the rest - or, indeed, to stop the server reviving these phantoms.

In any event, you needn't be concerned that there is anything you can do to stop these errors. It's a project error.
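
As a rough check on the timing, the 12500-hour re-issue interval suggested earlier in this thread can be added to the creation date above. This is only a sketch: the interval is the figure quoted in the thread rather than something confirmed from BOINC's source, the midnight creation time is an assumption, and `reissue_time` is a made-up helper name.

```python
from datetime import datetime, timedelta

# Interval quoted earlier in the thread; treated here as an assumption.
REISSUE_HOURS = 12500


def reissue_time(created, hours=REISSUE_HOURS):
    """Return the moment `hours` after a workunit's creation time."""
    return created + timedelta(hours=hours)


if __name__ == "__main__":
    # Creation date from the post above; the 00:00 time of day is assumed.
    created = datetime(2012, 7, 17)
    print(reissue_time(created))  # 2013-12-19 20:00:00
```

Under those assumptions the result falls on 19 December 2013, which is at least in line with the download attempts on 20 December quoted above.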
ID: 47825
Alan K
Joined: 22 Feb 06
Posts: 295
Credit: 14,768,121
RAC: 313
Message 47826 - Posted: 20 Dec 2013, 15:13:22 UTC
Last modified: 20 Dec 2013, 15:15:52 UTC

Have got this error on 5 downloads

Task ID    Work unit ID
16151138   8211626
16150408   8210761
16149592   8209709
16149590   8209707
16148812   8206403

all HADAM3P_eu

Thanks Iain, I'll ignore them.
ID: 47826
Richard Haselgrove
Joined: 1 Jan 07
Posts: 377
Credit: 7,127,507
RAC: 0
Message 47827 - Posted: 20 Dec 2013, 16:04:43 UTC - in response to Message 47825.  

[chavk wrote:]
Looks like I have a similar problem:-

20/12/2013 06:15:36 | climateprediction.net | Scheduler request completed: got 1 new tasks
20/12/2013 06:15:38 | climateprediction.net | Started download of hadam3p_eu_a166_1984_1_008055647.zip ...

That hadam3p_eu_a166_1984_1_008055647 work unit was generated on 17 July 2012. At some point when the bulk of the models in that batch had completed, the project removed the download files, not realising that the work units would reappear later. A better policy might have been to remove only those work units that had completed and leave us to slowly mop up the rest - or, indeed, to stop the server reviving these phantoms.

In any event, you needn't be concerned that there is anything you can do to stop these errors. It's a project error.

It's curious that, for both that a166, and the 9ckn I found for Jim, it looks as if replication_0 was never issued in the first place.
ID: 47827
Bonsai911
Joined: 9 Sep 04
Posts: 210
Credit: 28,317,278
RAC: 210
Message 47835 - Posted: 22 Dec 2013, 16:03:15 UTC

First time appearance:


22.12.2013 16:53:48 | climateprediction.net | Scheduler request failed: HTTP gateway timeout

ID: 47835
mo.v
Volunteer moderator
Joined: 29 Sep 04
Posts: 2363
Credit: 13,363,707
RAC: 2
Message 47836 - Posted: 22 Dec 2013, 17:03:08 UTC

The real reason for the download failure was:

WU download error: couldn't get input files

and the explanation for this is in Iain Inglis's post four above this. The task must have tried and tried to get all its files before the download timed out.

The numbering of tasks in this batch of workunits appears chaotic, or perhaps it's the order in which they are sent out that's chaotic. In several workunits that I've looked at, the _2 task was sent out first. They are all part of the defective batch created on 16 and 17 July 2012.
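
To compare which task of a workunit went out first, it can help to split a task name into its workunit name and trailing index. A minimal sketch, assuming the usual BOINC pattern of workunit name plus `_N`; `split_task_name` is a hypothetical helper, and the example task name is built from a workunit named earlier in this thread.

```python
def split_task_name(task_name):
    """Split 'workunit_N' into (workunit_name, N) on the last underscore."""
    workunit, sep, index = task_name.rpartition("_")
    if not sep or not index.isdigit():
        raise ValueError(f"no replication index in {task_name!r}")
    return workunit, int(index)


if __name__ == "__main__":
    print(split_task_name("hadam3p_eu_a166_1984_1_008055647_2"))
```

Note that the helper expects a full task name. Given a bare workunit name it would treat the final numeric field as the index, so the caller has to know which kind of name it is passing in.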




Cpdn news
ID: 47836

©2019 climateprediction.net