Good news for Mac users. HadAM3P Latest News???

old_user294426

Joined: 20 Feb 06
Posts: 158
Credit: 1,251,176
RAC: 0
Message 37625 - Posted: 3 Aug 2009, 21:11:38 UTC
Last modified: 3 Aug 2009, 21:14:11 UTC

Two more tasks (using v 6.08 for Mac) completed just before 8pm UTC today, 3rd August.
Both registered 72,000 in Task Details page and in Trickles Info page.
It is strange that both apparently completed Post Processing successfully and in both tasks, the 3 zip files were uploaded, but both tasks sat in the Tasks page flagged as Ready to Report 100% (and the next 2 tasks were already running as would be expected). Normally I would have expected these completed tasks to have been reported, but they had to be pushed by an update from the Project page.

I would not have thought that the delay in credits on trickles would have delayed the final display of 72,096 Time Steps, nor should the graphics display still be missing from the Task Details pages of both tasks, 9268429 and 9268416.

Maybe it will update tomorrow, but I cannot see why anything other than the credits granted should be delayed.

Keith
Les Bayliss
Volunteer moderator

Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 37626 - Posted: 3 Aug 2009, 21:22:43 UTC
Last modified: 3 Aug 2009, 21:24:04 UTC

The post here by Ageless lists the conditions that will trigger a "Report".
In the case of this project, the most common trigger is the next trickle_up, which will be from the next model started.

It's now suspected that there may be a clash between Post Processing, and the generation of the last trickle. This will need to be investigated.

In the meantime, I'd suggest running other model types. The slab models are fairly short.
old_user294426

Joined: 20 Feb 06
Posts: 158
Credit: 1,251,176
RAC: 0
Message 37627 - Posted: 4 Aug 2009, 2:58:57 UTC - in response to Message 37626.  
Last modified: 4 Aug 2009, 3:00:07 UTC

The post here by Ageless lists the conditions that will trigger a "Report".
In the case of this project, the most common trigger is the next trickle_up, which will be from the next model started.

It's now suspected that there may be a clash between Post Processing, and the generation of the last trickle. This will need to be investigated.

In the meantime, I'd suggest running other model types. The slab models are fairly short.


Les

Both tasks now show an error as follows:-

<core_client_version>6.6.36</core_client_version>
<![CDATA[
<stderr_txt>
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
20:09:46 (12434): called boinc_finish

</stderr_txt>
]]>

Two trickle-up messages were sent an hour ago from the next 2 tasks but, no doubt, too late for the 72,096 to be amended on the relevant pages? This seems to require an amendment to the script, even if I was "wrong" to use the Update command for the project?

The next 2 tasks will be left without using Update, to await the next trickle which, as you suggest, may report the tasks and produce the elusive 72,096 report.

But maybe all will work normally when the trickle credits are also working correctly.
In another week I will have details from next 2 tasks to report on this thread again!!!

Keith
old_user294426

Joined: 20 Feb 06
Posts: 158
Credit: 1,251,176
RAC: 0
Message 37629 - Posted: 4 Aug 2009, 9:41:46 UTC
Last modified: 4 Aug 2009, 9:49:39 UTC

As far as I could see all the Trickles were "triggered" as would be expected.
Only the reporting of the 2 tasks was missed although it did appear to be effected after the Update.
Here is the message page for the period of the last trickles and uploads, followed by the Update and also trickles of the following two tasks.
Maybe it will be of some help, I hope:-

Mon 3 Aug 19:59:34 2009 climateprediction.net Sending scheduler request: To send trickle-up message.
Mon 3 Aug 19:59:34 2009 climateprediction.net Not reporting or requesting tasks
Mon 3 Aug 19:59:48 2009 climateprediction.net Computation for task hadam3p_nh59_1961_2_006240311_0 finished
Mon 3 Aug 19:59:49 2009 climateprediction.net Starting hadam3p_naj9_1990_2_006231743_1
Mon 3 Aug 19:59:49 2009 climateprediction.net Starting task hadam3p_naj9_1990_2_006231743_1 using hadam3p version 608
Mon 3 Aug 19:59:51 2009 climateprediction.net Started upload of hadam3p_nh59_1961_2_006240311_0_1.zip
Mon 3 Aug 19:59:51 2009 climateprediction.net Started upload of hadam3p_nh59_1961_2_006240311_0_2.zip
Mon 3 Aug 19:59:55 2009 climateprediction.net Scheduler request completed
Mon 3 Aug 20:09:31 2009 climateprediction.net Sending scheduler request: To send trickle-up message.
Mon 3 Aug 20:09:31 2009 climateprediction.net Not reporting or requesting tasks
Mon 3 Aug 20:09:46 2009 climateprediction.net Scheduler request completed
Mon 3 Aug 20:09:48 2009 climateprediction.net Computation for task hadam3p_nh5f_1997_2_006240317_1 finished
Mon 3 Aug 20:09:49 2009 climateprediction.net Starting hadam3p_naxg_1966_2_006232254_2
Mon 3 Aug 20:09:49 2009 climateprediction.net Starting task hadam3p_naxg_1966_2_006232254_2 using hadam3p version 608
Mon 3 Aug 20:14:11 2009 climateprediction.net Finished upload of hadam3p_nh59_1961_2_006240311_0_2.zip
Mon 3 Aug 20:14:11 2009 climateprediction.net Started upload of hadam3p_nh59_1961_2_006240311_0_3.zip
Mon 3 Aug 20:14:50 2009 climateprediction.net Finished upload of hadam3p_nh59_1961_2_006240311_0_3.zip
Mon 3 Aug 20:14:50 2009 climateprediction.net Started upload of hadam3p_nh5f_1997_2_006240317_1_1.zip
Mon 3 Aug 20:17:42 2009 climateprediction.net Finished upload of hadam3p_nh59_1961_2_006240311_0_1.zip
Mon 3 Aug 20:17:42 2009 climateprediction.net Started upload of hadam3p_nh5f_1997_2_006240317_1_2.zip
Mon 3 Aug 20:32:17 2009 climateprediction.net Finished upload of hadam3p_nh5f_1997_2_006240317_1_1.zip
Mon 3 Aug 20:32:17 2009 climateprediction.net Started upload of hadam3p_nh5f_1997_2_006240317_1_3.zip
Mon 3 Aug 20:32:43 2009 climateprediction.net Finished upload of hadam3p_nh5f_1997_2_006240317_1_3.zip
Mon 3 Aug 20:33:06 2009 climateprediction.net Finished upload of hadam3p_nh5f_1997_2_006240317_1_2.zip
Mon 3 Aug 20:57:52 2009 climateprediction.net update requested by user
Mon 3 Aug 20:57:54 2009 climateprediction.net Sending scheduler request: Requested by user.
Mon 3 Aug 20:57:54 2009 climateprediction.net Reporting 2 completed tasks, not requesting new tasks
Mon 3 Aug 20:57:59 2009 climateprediction.net Scheduler request completed
Tue 4 Aug 01:47:36 2009 climateprediction.net Sending scheduler request: To send trickle-up message.
Tue 4 Aug 01:47:36 2009 climateprediction.net Not reporting or requesting tasks
Tue 4 Aug 01:47:41 2009 climateprediction.net Scheduler request completed
Tue 4 Aug 01:48:32 2009 climateprediction.net Sending scheduler request: To send trickle-up message.
Tue 4 Aug 01:48:32 2009 climateprediction.net Not reporting or requesting tasks
Tue 4 Aug 01:48:37 2009 climateprediction.net Scheduler request completed
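
The ordering of events in a log like this can be checked mechanically. The following is a minimal sketch in Python, assuming the line layout shown above (five date fields, the project name, then the message); `parse_line` is a hypothetical helper written for this illustration and is not part of BOINC. It uses four lines copied from the log for the second task.

```python
from datetime import datetime

# Four lines copied verbatim from the message log above.
LOG = """\
Mon 3 Aug 20:09:48 2009 climateprediction.net Computation for task hadam3p_nh5f_1997_2_006240317_1 finished
Mon 3 Aug 20:14:50 2009 climateprediction.net Started upload of hadam3p_nh5f_1997_2_006240317_1_1.zip
Mon 3 Aug 20:33:06 2009 climateprediction.net Finished upload of hadam3p_nh5f_1997_2_006240317_1_2.zip
Mon 3 Aug 20:57:54 2009 climateprediction.net Reporting 2 completed tasks, not requesting new tasks
"""


def parse_line(line):
    """Split one log line into (timestamp, project, message)."""
    # Assumed layout: weekday, day, month, time, year, project, message.
    parts = line.split(None, 6)
    stamp = datetime.strptime(" ".join(parts[:5]), "%a %d %b %H:%M:%S %Y")
    return stamp, parts[5], parts[6]


events = [parse_line(line) for line in LOG.splitlines() if line.strip()]

finished = next(t for t, _, msg in events if msg.startswith("Computation for task"))
last_upload = max(t for t, _, msg in events if msg.startswith("Finished upload"))
reported = next(t for t, _, msg in events if msg.startswith("Reporting"))

print("last upload finished", last_upload - finished, "after computation ended")
print("task reported", reported - last_upload, "after the last upload")
```

On these lines the computation ends first, the uploads finish afterwards, and the report only happens at the user-requested update.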
old_user294426

Joined: 20 Feb 06
Posts: 158
Credit: 1,251,176
RAC: 0
Message 37630 - Posted: 4 Aug 2009, 19:52:54 UTC

REFERRING TO THE PREVIOUS 2 MESSAGES I HAVE SENT, YOU WILL NOTICE THAT THE ERROR MESSAGE SHOWS THAT AT 20:09:46 THE BOINC QUIT REQUEST WAS SENT AND ACTIONED.

THAT WAS BEFORE THE UPLOADS OF THE ZIP FILES, WHICH SURELY MUST BE WRONG.

KEITH
Iain Inglis

Joined: 9 Jan 07
Posts: 467
Credit: 14,549,176
RAC: 317
Message 37635 - Posted: 5 Aug 2009, 9:43:55 UTC - in response to Message 37630.  

REFERRING TO THE PREVIOUS 2 MESSAGES I HAVE SENT, YOU WILL NOTICE THAT THE ERROR MESSAGE SHOWS THAT AT 20:09:46 THE BOINC QUIT REQUEST WAS SENT AND ACTIONED.

THAT WAS BEFORE THE UPLOADS OF THE ZIP FILES, WHICH SURELY MUST BE WRONG.

KEITH

The uploading of the Zip files is a separate process to model computation. If network activity is off, for example, the Zip files will be kept until communication is possible again - that's a BOINC design feature. So it doesn't matter if uploading takes place after the "Computation for task X finished" message.
old_user294426

Joined: 20 Feb 06
Posts: 158
Credit: 1,251,176
RAC: 0
Message 37636 - Posted: 5 Aug 2009, 13:16:37 UTC

I note that Mac 823790 is still getting tasks completed at 72,000 instead of 72,096.
Using version 6.08.

Keith
Richard Haselgrove

Joined: 1 Jan 07
Posts: 925
Credit: 34,100,818
RAC: 11,270
Message 37638 - Posted: 5 Aug 2009, 15:58:07 UTC - in response to Message 37636.  

I note that Mac 823790 is still getting tasks completed at 72,000 instead of 72,096.
Using version 6.08.

Keith

But no longer displaying

Unable to load library hadam3p_se_6.07_i686-apple-darwin.dylib

in the stderr_out.

So it seems to be suffering the new version of the problem (cross-platform), not the old version (Mac and Linux only).
old_user294426

Joined: 20 Feb 06
Posts: 158
Credit: 1,251,176
RAC: 0
Message 37639 - Posted: 5 Aug 2009, 17:12:59 UTC
Last modified: 5 Aug 2009, 17:14:52 UTC

That last error message was from a task running v6.07, which is the problem version.

It is the most recent task that was run by v 6.08 that still has the problem of finishing at 72,000, which is the very thing it was written to cure!!!

It shows error message:-

<core_client_version>5.10.32</core_client_version>
<![CDATA[
<stderr_txt>
19:49:20 (28555): called boinc_finish

</stderr_txt>
]]>

With no reference to a missing library.

Keith
Richard Haselgrove

Joined: 1 Jan 07
Posts: 925
Credit: 34,100,818
RAC: 11,270
Message 37641 - Posted: 5 Aug 2009, 17:49:22 UTC - in response to Message 37639.  

Please read my post more carefully, and see my PM reply to your PM.
old_user294426

Joined: 20 Feb 06
Posts: 158
Credit: 1,251,176
RAC: 0
Message 37642 - Posted: 5 Aug 2009, 19:03:46 UTC - in response to Message 37641.  

Please read my post more carefully, and see my PM reply to your PM.



Yes. OK Richard.
My apologies.

Keith
old_user294426

Joined: 20 Feb 06
Posts: 158
Credit: 1,251,176
RAC: 0
Message 37806 - Posted: 17 Aug 2009, 17:28:10 UTC

Now that the 5% credit inflation seems to have been sorted out, there remains the problem of the 72,096 time step not being completed, along with the post processing report and graphs not being included on the result.

I have 2 HADAM3P tasks just completed.

One is hadam3p_n0b3_1971_2_006218489_0
It has been uploaded
and its _2.zip file has been transferred.
Its stderr out message shows as:--

=============
<core_client_version>6.6.36</core_client_version>
<![CDATA[
<stderr_txt>
CPDN Monitor - Quit request from BOINC...
Suspended CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
22:44:23 (68933): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
03:36:56 (270): called boinc_finish

</stderr_txt>
]]>
==================

The other task is hadam3p_n9f0_1992_2_006230294_3
It is awaiting uploading at 100% progress
Its _2.zip file is awaiting transfer
and, of course, there's no stderr out message yet.

Both tasks register 72,000 as the last trickle & with 2079 credits (with added 5%).
And the final trickle of 72,096 is still missing.

It has been seen in "Show Graphics" on BOINC Manager that post processing does appear to start after the 72,096 time step seems to complete, but without that step registering in the task details.

It might be noted that prior to the recent stoppages, the 5% credit increases, etc., the stderr out messages showed as:--

================
<core_client_version>6.6.36</core_client_version>
<![CDATA[
<stderr_txt>
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
15:36:53 (90410): called boinc_finish

</stderr_txt>
]]>
=======================

Hope this may help to diagnose the missing 72,096 problem on the Mac OSX (and others?)

Keith
transient

Joined: 3 Oct 06
Posts: 43
Credit: 8,017,057
RAC: 0
Message 37808 - Posted: 18 Aug 2009, 5:01:17 UTC

Yes, and others. Since the end of July I haven't had this final step go through.
old_user294426

Joined: 20 Feb 06
Posts: 158
Credit: 1,251,176
RAC: 0
Message 37811 - Posted: 18 Aug 2009, 7:26:33 UTC - in response to Message 37806.  

Now have had the message on stderr for task hadam3p_n9f0_1992_2_006230294_3
It is : --
==============
<core_client_version>6.6.36</core_client_version>
<![CDATA[
<stderr_txt>
CPDN Monitor - Quit request from BOINC...
Suspended CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
22:44:23 (68934): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
04:14:47 (271): called boinc_finish

</stderr_txt>
]]>
==================

Hopefully this 72,096 missing problem will soon be history.

Keith
old_user294426

Joined: 20 Feb 06
Posts: 158
Credit: 1,251,176
RAC: 0
Message 37812 - Posted: 18 Aug 2009, 8:34:42 UTC

Jim
I see that both recent tasks finished at the same time without the final 72,096 trickle shown.
stderr out message shows there as : --
========================
<core_client_version>6.6.36</core_client_version>
<![CDATA[
<stderr_txt>
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
called boinc_finish

</stderr_txt>
]]>
=====================

Previously when a completed 72,096 got the full credit, the stderr out message was : --

================
<core_client_version>6.6.36</core_client_version>
<![CDATA[
<stderr_txt>
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
called boinc_finish

</stderr_txt>
]]>
=======================

How relevant the difference is, I do not know (with 21 quit requests compared to only 10).
Although I did see one other success at 18 quit requests!!!!
I have been running HADAM3P on my Mac for some time and am trying to see a pattern in why the 72,096 trickle is always missing with the graph detail.

Next time I shall try running only one HADAM3P task and see if that finishes properly.

Keith
mo.v
Volunteer moderator

Joined: 29 Sep 04
Posts: 2363
Credit: 14,611,758
RAC: 0
Message 37813 - Posted: 18 Aug 2009, 9:54:50 UTC
Last modified: 18 Aug 2009, 9:55:13 UTC

I don't think that running just one HadAM3P at a time will make any difference to the final timestep.

The v.6.08 HadAM3P that Tolu ran on the Beta project produced its final timestep, and the current version for Windows and Linux used to almost always produce it, so I don't understand why none of the current models seem to be producing it on any of the 3 platforms. I'm going to bring this to Tolu's attention again.

It's as if the generation of this final ts depends on the batch of models, not the model version.

If we don't get the final ts we don't get the graph either.
old_user1

Joined: 5 Aug 04
Posts: 907
Credit: 299,864
RAC: 0
Message 37817 - Posted: 18 Aug 2009, 11:23:48 UTC - in response to Message 37813.  

I talked to Tolu - 72,000 timesteps is the end of the run, the model just has an odd feature that it likes to run one more day afterwards.
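
The two step counts quoted in this thread imply the model's timestep length. A small worked calculation in Python, assuming the extra "one more day" is exactly the 96 steps between 72,000 and 72,096 (that assumption is an inference from the numbers in the thread, not something stated by the project):

```python
TOTAL_STEPS = 72_000                   # end of the run, per the post above
EXTRA_STEPS = 72_096 - TOTAL_STEPS     # the additional model day

# Assumption: the extra segment is exactly one model day.
steps_per_day = EXTRA_STEPS
seconds_per_step = 24 * 60 * 60 // steps_per_day
model_days = TOTAL_STEPS // steps_per_day

print(steps_per_day, seconds_per_step / 60, model_days)  # prints: 96 15.0 750
```

Under that assumption each timestep covers 15 model minutes and the 72,000-step run covers 750 model days.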
Iain Inglis

Send message
Joined: 9 Jan 07
Posts: 467
Credit: 14,549,176
RAC: 317
Message 37819 - Posted: 18 Aug 2009, 11:35:33 UTC

Keith,

The "CPDN Monitor - Quit request from BOINC..." isn't significant - it just records when BOINC stops, including when the user requests an exit. So, both the logs you report are effectively the same and are both "clean".

The "stderr out" facility has its uses but also has some deficiencies, which start with the name:

1. "stderr out" is an appalling bit of computer jargon to which no normal human being should be exposed. Its principal defect is that it suggests that the associated log contains errors, when it may not.

2. The log has no dates or times, so people usually think that all these messages were created at the end of the run, which isn't the case - the log is for the whole run.

3. No distinction is made between errors, warnings and plain old recording of activity, so it's only by experience or knowledge (from where?) that the log becomes meaningful.

4. The error numbers and text come from various systems (BOINC, the science application, the operating system) in various formats and with no explanations.

5. The text randomly vanishes, so no systematic analysis of the logs is possible.

It's really a facility intended for the project (but not, I suspect, used very often) and provided as a courtesy to the users. Treat with caution.
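
As an illustration of point 3, here is a minimal Python sketch that sorts the lines of one stderr log from this thread into rough categories. The labels and the substring patterns in `PATTERNS` are hypothetical choices made for this example, based only on the message texts quoted in the thread; they are not an official BOINC or CPDN classification.

```python
from collections import Counter

# stderr_txt body copied from a task reported earlier in this thread.
STDERR = """\
CPDN Monitor - Quit request from BOINC...
Suspended CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
22:44:23 (68933): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
03:36:56 (270): called boinc_finish
"""

# Hypothetical labels keyed by substring; the first match wins.
PATTERNS = [
    ("Quit request from BOINC", "quit_request"),
    ("No heartbeat from core client", "heartbeat_lost"),
    ("No 'heartbeat' from BOINC", "heartbeat_lost_monitor"),
    ("called boinc_finish", "finished"),
]


def classify(line):
    """Return the label of the first pattern found in the line."""
    for needle, label in PATTERNS:
        if needle in line:
            return label
    return "other"


summary = Counter(classify(line) for line in STDERR.splitlines() if line.strip())
print(dict(summary))
```

A summary like this only counts routine activity; as noted above, the number of quit requests just reflects how often BOINC was stopped during the run.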

Iain
old_user294426

Joined: 20 Feb 06
Posts: 158
Credit: 1,251,176
RAC: 0
Message 37822 - Posted: 18 Aug 2009, 13:06:59 UTC - in response to Message 37819.  

Keith,

The "CPDN Monitor - Quit request from BOINC..." isn't significant - it just records when BOINC stops, including when the user requests an exit. So, both the logs you report are effectively the same and are both "clean"................

It's really a facility intended for the project (but not, I suspect, used very often) and provided as a courtesy to the users. Treat with caution.

Iain


Thanks for explanation, Iain.
But something is consistently different between the 2 sets of tasks, so I thought there might be an indication of what could be causing the result to be curtailed at the 72,000 time step (even though, according to the graphics display for the task, it was apparently followed by the post processing each time).
But, alas, it seems not, you reckon.

The very reason I started this thread was to query the loss of the 72,096 trickle and also other info, which somewhat dampened the pleasure of getting a faster version 6.07, with which to crunch.

The bug that was found and corrected, did not however cure the 72,096 problem when version 6.08 was released.

This is the history of improvement for my Mac.
Two tasks were completed each time:-

Version 6.06 successfully to 72,096
Apr 17 Av 8.82 sec/ts
Apr 23 Av 9.05 sec/ts

Version 6.07 failing to get last trickle and graph
July 9/10 Av 7.22 sec/ts
July 20 Av 6.87 sec/ts
July 26/27 Av 6.82 sec/ts

Version 6.08 also failing to get last trickle and graph
Aug 3 Av 6.80 sec/ts
Aug 11 Av 6.74 sec/ts
Aug 17 Av 6.67 sec/ts
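
The figures above can be turned into a per-version comparison. This is a rough Python sketch using only the sec/ts values listed in this post; the days-per-task figure assumes uninterrupted crunching at that average rate for 72,000 timesteps, which is an assumption made for illustration.

```python
from statistics import mean

# Average seconds per timestep, as listed above, grouped by version.
SEC_PER_TS = {
    "6.06": [8.82, 9.05],
    "6.07": [7.22, 6.87, 6.82],
    "6.08": [6.80, 6.74, 6.67],
}
STEPS = 72_000  # timesteps per task

baseline = mean(SEC_PER_TS["6.06"])
for version, samples in SEC_PER_TS.items():
    avg = mean(samples)
    days = avg * STEPS / 86_400  # wall-clock days per task at this rate
    print(f"v{version}: {avg:.2f} s/ts, {days:.1f} days/task, "
          f"{baseline / avg:.2f}x vs 6.06")
```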

Keith
old_user3

Joined: 5 Aug 04
Posts: 173
Credit: 1,843,046
RAC: 0
Message 37823 - Posted: 18 Aug 2009, 17:02:11 UTC
Last modified: 18 Aug 2009, 17:02:32 UTC

This is a common issue on all platforms. It must have occurred during the migration. I'll fix this asap.

©2024 climateprediction.net