Download errors on UK Met Office HadAM4 at N216 resolution v8.52 tasks

Richard Haselgrove
Joined: 1 Jan 07
Posts: 384
Credit: 8,864,416
RAC: 6,872
Message 62346 - Posted: 28 Apr 2020, 15:51:56 UTC
Last modified: 28 Apr 2020, 16:02:52 UTC

Two successive tasks have failed to download cleanly. Task 21931151 reports:

<message>
WU download error: couldn't get input files:
<file_xfer_error>
  <file_name>a10l_867_atmos.gz</file_name>
  <error_code>-119 (md5 checksum failed for file)</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>ic_N216_2002_12_000004.nc.gz</file_name>
  <error_code>-119 (md5 checksum failed for file)</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>HAPPI_1.5K_sst_N216_2095-10-01_2096-04-30.gz</file_name>
  <error_code>-119 (md5 checksum failed for file)</error_code>
</file_xfer_error>
</message>
but the local client reports:

Tue 28 Apr 2020 11:34:04 BST | climateprediction.net | [unparsed_xml] SCHEDULER_REPLY::parse(): unrecognized ?xml
Tue 28 Apr 2020 11:34:04 BST | climateprediction.net | [unparsed_xml] SCHEDULER_REPLY::parse(): unrecognized upload_template
Tue 28 Apr 2020 11:34:06 BST | climateprediction.net | Started download of a10l_867_atmos.gz
Tue 28 Apr 2020 11:34:06 BST | climateprediction.net | Started download of ic_N216_2002_12_000004.nc.gz
Tue 28 Apr 2020 11:34:06 BST | climateprediction.net | Started download of HAPPI_1.5K_sst_N216_2095-10-01_2096-04-30.gz
Tue 28 Apr 2020 11:34:08 BST | climateprediction.net | Temporarily failed download of a10l_867_atmos.gz: connect() failed
Tue 28 Apr 2020 11:34:08 BST | climateprediction.net | Temporarily failed download of ic_N216_2002_12_000004.nc.gz: connect() failed
Tue 28 Apr 2020 11:34:08 BST | climateprediction.net | Temporarily failed download of HAPPI_1.5K_sst_N216_2095-10-01_2096-04-30.gz: connect() failed
These two sets of messages don't seem to tie up.
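
One way to compare the two reports side by side is to extract the file names and failure reasons from each. The following is a minimal Python sketch, not anything BOINC itself provides: the function names `server_errors` and `client_errors` are made up for illustration, and the inputs are simply the server report and the client log lines quoted above. It only shows that both reports list the same three files with different stated reasons; it does not explain why they differ.

```python
import re
import xml.etree.ElementTree as ET

# Server-side task report, copied from the post above.
SERVER_REPORT = """<message>
WU download error: couldn't get input files:
<file_xfer_error>
  <file_name>a10l_867_atmos.gz</file_name>
  <error_code>-119 (md5 checksum failed for file)</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>ic_N216_2002_12_000004.nc.gz</file_name>
  <error_code>-119 (md5 checksum failed for file)</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>HAPPI_1.5K_sst_N216_2095-10-01_2096-04-30.gz</file_name>
  <error_code>-119 (md5 checksum failed for file)</error_code>
</file_xfer_error>
</message>"""

# Local client event log lines, copied from the post above.
CLIENT_LOG = """\
Tue 28 Apr 2020 11:34:06 BST | climateprediction.net | Started download of a10l_867_atmos.gz
Tue 28 Apr 2020 11:34:08 BST | climateprediction.net | Temporarily failed download of a10l_867_atmos.gz: connect() failed
Tue 28 Apr 2020 11:34:08 BST | climateprediction.net | Temporarily failed download of ic_N216_2002_12_000004.nc.gz: connect() failed
Tue 28 Apr 2020 11:34:08 BST | climateprediction.net | Temporarily failed download of HAPPI_1.5K_sst_N216_2095-10-01_2096-04-30.gz: connect() failed
"""

# Matches only the "Temporarily failed download" lines: file name, then reason.
FAILED_RE = re.compile(r"Temporarily failed download of (\S+): (.+)$")


def server_errors(report):
    """Map file name -> error text from the <file_xfer_error> entries."""
    root = ET.fromstring(report)
    return {
        err.findtext("file_name").strip(): err.findtext("error_code").strip()
        for err in root.iter("file_xfer_error")
    }


def client_errors(log):
    """Map file name -> failure reason from the client log lines."""
    errors = {}
    for line in log.splitlines():
        match = FAILED_RE.search(line)
        if match:
            errors[match.group(1)] = match.group(2).strip()
    return errors


server = server_errors(SERVER_REPORT)
client = client_errors(CLIENT_LOG)
for name in sorted(server.keys() | client.keys()):
    print(name, "| server:", server.get(name), "| client:", client.get(name))
```

Running it prints one line per file, with the server's reason and the client's reason next to each other, which makes the mismatch easy to quote when reporting the problem.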

This machine is just completing task 21922323, which downloaded successfully on 22 April. Does anyone know of any changes since then?

Edit - second failure is task 21922854 - still in reporting delay, but similar symptoms locally.
ID: 62346
Jim1348
Joined: 15 Jan 06
Posts: 511
Credit: 22,395,784
RAC: 6,466
Message 62347 - Posted: 28 Apr 2020, 17:51:22 UTC - in response to Message 62346.  

I currently have one that has been stuck in download for two hours.
It has failed on three other machines.
I don't know if there is any connection or not.
https://www.cpdn.org/workunit.php?wuid=12016400
ID: 62347
Dave Jackson
Volunteer moderator
Joined: 15 May 09
Posts: 2792
Credit: 3,659,580
RAC: 11,331
Message 62348 - Posted: 28 Apr 2020, 18:29:00 UTC - in response to Message 62347.  

It has failed on three other machines.
I don't know if there is any connection or not.


The three previous errors were caused by missing 32-bit libraries.

I will contact the project, but might not manage it until tomorrow morning, so someone else might beat me to it.
ID: 62348
Richard Haselgrove
Joined: 1 Jan 07
Posts: 384
Credit: 8,864,416
RAC: 6,872
Message 62349 - Posted: 28 Apr 2020, 19:03:31 UTC - in response to Message 62348.  

Thanks Dave. Other tasks are running to completion, so it's not the libs here. A third has just failed - I'd better set NNT overnight.
ID: 62349
Les Bayliss
Volunteer moderator
Joined: 5 Sep 04
Posts: 7262
Credit: 23,235,750
RAC: 5,211
Message 62350 - Posted: 28 Apr 2020, 21:01:37 UTC

Problem reported.
ID: 62350
Les Bayliss
Volunteer moderator
Joined: 5 Sep 04
Posts: 7262
Credit: 23,235,750
RAC: 5,211
Message 62353 - Posted: 29 Apr 2020, 0:11:44 UTC

Had a reply to say that it should be fixed now.
ID: 62353
Jim1348
Joined: 15 Jan 06
Posts: 511
Credit: 22,395,784
RAC: 6,466
Message 62354 - Posted: 29 Apr 2020, 0:54:34 UTC - in response to Message 62353.  

Yes, the download finished and all is OK.
ID: 62354
Les Bayliss
Volunteer moderator
Joined: 5 Sep 04
Posts: 7262
Credit: 23,235,750
RAC: 5,211
Message 62355 - Posted: 29 Apr 2020, 2:43:05 UTC - in response to Message 62354.  

That's good.
I'm half way through, so a few more days yet.
ID: 62355
Richard Haselgrove
Joined: 1 Jan 07
Posts: 384
Credit: 8,864,416
RAC: 6,872
Message 62357 - Posted: 29 Apr 2020, 7:44:45 UTC
Last modified: 29 Apr 2020, 8:18:29 UTC

A second (near identical) machine has downloaded a new task which is currently ready to run. I'll re-enable the machine with yesterday's problem.

Has anyone heard what the problem was? I couldn't make any sense of the mixed messages.

Edit - problem machine has downloaded new work and is running again.
ID: 62357
Dave Jackson
Volunteer moderator
Joined: 15 May 09
Posts: 2792
Credit: 3,659,580
RAC: 11,331
Message 62360 - Posted: 29 Apr 2020, 8:43:37 UTC - in response to Message 62349.  

Thanks Dave. Other tasks are running to completion, so it's not the libs here. A third has just failed - I'd better set NNT overnight.


I would have been very surprised if you were missing the libs Richard!
ID: 62360
bernard_ivo
Joined: 18 Jul 13
Posts: 388
Credit: 14,366,763
RAC: 9,023
Message 62428 - Posted: 15 May 2020, 9:46:37 UTC

I have a stuck download for a batch 867 WU:

Fri 15 May 2020 12:38:32 PM EEST | climateprediction.net | [http] HTTP_OP::init_get(): http://download.cpdn.org/download//batch_867/workunits/hadam4h_a17c_209511_4_867_012013459.zip
Fri 15 May 2020 12:38:32 PM EEST | climateprediction.net | Started download of hadam4h_a17c_209511_4_867_012013459.zip
Fri 15 May 2020 12:38:32 PM EEST | climateprediction.net | [http] HTTP_OP::init_get(): http://download.cpdn.org/download//batch_867/ancils/a17c_867_atmos.gz
Fri 15 May 2020 12:38:32 PM EEST | climateprediction.net | Started download of a17c_867_atmos.gz
Fri 15 May 2020 12:38:32 PM EEST | climateprediction.net | [http] [ID#71828] Info: Connection 3859 seems to be dead!
Fri 15 May 2020 12:38:32 PM EEST | climateprediction.net | [http] [ID#71828] Info: Closing connection 3859
Fri 15 May 2020 12:38:32 PM EEST | climateprediction.net | [http] [ID#71828] Info: TLSv1.2 (OUT), TLS alert, Client hello (1):
Fri 15 May 2020 12:38:32 PM EEST | climateprediction.net | [http] [ID#71829] Info: Found bundle for host download.cpdn.org: 0x559f10a67390 [serially]
Fri 15 May 2020 12:38:33 PM EEST | climateprediction.net | [http] [ID#71828] Info: Trying 129.67.193.131...
..........................
Fri 15 May 2020 12:42:34 PM EEST | climateprediction.net | Temporarily failed download of ic_N216_2002_12_000004.nc.gz: transient HTTP error
Fri 15 May 2020 12:42:34 PM EEST | climateprediction.net | Backing off 00:04:42 on download of ic_N216_2002_12_000004.nc.gz
Fri 15 May 2020 12:42:34 PM EEST | climateprediction.net | Temporarily failed download of HAPPI_1.5K_sst_N216_2095-10-01_2096-04-30.gz: transient HTTP error
Fri 15 May 2020 12:42:34 PM EEST | climateprediction.net | Backing off 00:04:19 on download of HAPPI_1.5K_sst_N216_2095-10-01_2096-04-30.gz
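
To see at a glance how long the client intends to wait before retrying each file, the "Backing off" lines can be converted to seconds. This is a rough Python sketch under the assumption that the backoff is always logged as HH:MM:SS, as in the two lines above; the function name `backoffs` is hypothetical, and the sample input is copied from the log in this post.

```python
import re
from datetime import timedelta

# Sample lines copied from the client log above.
LOG = """\
Fri 15 May 2020 12:42:34 PM EEST | climateprediction.net | Temporarily failed download of ic_N216_2002_12_000004.nc.gz: transient HTTP error
Fri 15 May 2020 12:42:34 PM EEST | climateprediction.net | Backing off 00:04:42 on download of ic_N216_2002_12_000004.nc.gz
Fri 15 May 2020 12:42:34 PM EEST | climateprediction.net | Temporarily failed download of HAPPI_1.5K_sst_N216_2095-10-01_2096-04-30.gz: transient HTTP error
Fri 15 May 2020 12:42:34 PM EEST | climateprediction.net | Backing off 00:04:19 on download of HAPPI_1.5K_sst_N216_2095-10-01_2096-04-30.gz
"""

# Assumed format: "Backing off HH:MM:SS on download of <file name>".
BACKOFF_RE = re.compile(r"Backing off (\d+):(\d{2}):(\d{2}) on download of (\S+)")


def backoffs(log):
    """Map file name -> retry delay for every 'Backing off' line in the log."""
    delays = {}
    for match in BACKOFF_RE.finditer(log):
        hours, minutes, seconds = (int(part) for part in match.group(1, 2, 3))
        delays[match.group(4)] = timedelta(
            hours=hours, minutes=minutes, seconds=seconds
        )
    return delays


for name, delay in backoffs(LOG).items():
    print(f"{name}: retry in {int(delay.total_seconds())} s")
```

If the same file appears more than once in a longer log, the sketch keeps only the most recent backoff for it.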
ID: 62428