Good news for Mac users. HadAM3P Latest News???

Thyme Lawn
Volunteer moderator

Joined: 5 Aug 04
Posts: 1283
Credit: 15,824,334
RAC: 0
Message 37508 - Posted: 19 Jul 2009, 20:31:38 UTC - in response to Message 37501.  

The problem library "se" file seems to be present as is shown by the following list of files:-

Which would seem to point the finger towards the BOINC API leaving the program's working directory as slots/<n>. Do you have any tools on the Mac which can analyse file accesses by process name or id (I'm thinking of something like Sysinternals Process Monitor which I use on Windows)?
"The ultimate test of a moral society is the kind of world that it leaves to its children." - Dietrich Bonhoeffer
ID: 37508
Richard Haselgrove

Joined: 1 Jan 07
Posts: 943
Credit: 34,284,966
RAC: 11,093
Message 37509 - Posted: 19 Jul 2009, 20:33:38 UTC - in response to Message 37507.  

As I said in my last post, a quick glance at the top computers will show you that the majority of them are having HADAM3P tasks stopping short of 72,096 (at 72,000) excluding the last time step, which includes the vital information. This is not confined to version 6.07, nor is it confined to Mac OSX.

I've looked at the first 40 results (two pages) of the top 18 non-Mac hosts - so 720 results in all.

I found just two tasks showing 1980.00 credits:
Task 9120558
Task 9118282

But both of them show the final timestep 72,096, and both were reported in the last 24 hours: I expect both will be fully credited by tomorrow.

Keith, could you repeat your "quick glance" and point us to some of those "majority" that you're seeing? I can't make it happen here.
ID: 37509
Les Bayliss
Volunteer moderator

Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 37510 - Posted: 19 Jul 2009, 21:09:44 UTC

I check mine as they finish, and I don't recall any having problems with the last trickle.
Rechecked several just now, and no problems.

This is the top end of the list of models for one of my quads. Skip the first, which is a hadam3h; the next 7 are hadam3ps, and there are 3 more at the bottom of the page, with more on the next page.
All are version 6.06.

ID: 37510
old_user294426

Joined: 20 Feb 06
Posts: 158
Credit: 1,251,176
RAC: 0
Message 37511 - Posted: 20 Jul 2009, 3:04:55 UTC - in response to Message 37509.  
Last modified: 20 Jul 2009, 3:22:14 UTC

Richard

Keith, could you repeat your "quick glance" and point us to some of those "majority" that you're seeing? I can't make it happen here.


Yes. I promise you I was getting the "success" with 72,000 on many types of PC before, but only Mac Darwin 9.7.0 are now showing those symptoms. Possibly the trickles update was behind schedule at that time? But, if so, why were they all marked as "success"? At least it is not just my Mac that has the problem as all other Darwin 9.7.0 versions are the same as mine.

Thyme

Which would seem to point the finger towards the BOINC API leaving the program's working directory as slots/<n>. Do you have any tools on the Mac which can analyse file accesses by process name or id (I'm thinking of something like Sysinternals Process Monitor which I use on Windows)?


The Mac uses Activity Monitor to do the type of function you mention, I believe.
There is a function called "Sample" which does, I think, what you mention, on any selected process.
I will look further, or maybe another Mac user might be able to help more than I.
Will send you a private message with the result of the "Inspect" report from a HADAM3P process.

Keith
ID: 37511
old_user294426

Joined: 20 Feb 06
Posts: 158
Credit: 1,251,176
RAC: 0
Message 37512 - Posted: 20 Jul 2009, 3:32:37 UTC
Last modified: 20 Jul 2009, 3:36:10 UTC

Both my HADAM3P tasks are up to 90% and due to be completing in 13 hours time (16:30 UTC 20th July).
Does anyone want me to do anything specific before I allow them to complete processing?
I will suspend both before 72,000 time step completes if no suggestions have been made by then, so that they may be continued later.

Keith
ID: 37512
Les Bayliss
Volunteer moderator

Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 37514 - Posted: 20 Jul 2009, 8:42:40 UTC

I think that the only thing that you can do is to make a backup before the finish. That way, it will be possible to check which files are/were still there at that time, if it DOES fail to finish properly.
However, as Tolu has said that he's found the problem, fixed it, tested it, and will move the new app version to this site soon, it's all a bit academic.
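
For anyone who does want to try the backup check, here is a minimal Python sketch of comparing a backup copy with the live directory. The function names and the file names in the demonstration are hypothetical, and nothing here is part of BOINC or the CPDN application; it only lists which files exist in one directory tree but not the other.

```python
import os
import tempfile


def list_files(root):
    """Return the set of file paths under root, relative to root."""
    found = set()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            found.add(os.path.relpath(full, root))
    return found


def missing_files(backup_root, live_root):
    """Files present in the backup copy but absent from the live directory."""
    return sorted(list_files(backup_root) - list_files(live_root))


# Demonstration with temporary directories standing in for a backup copy
# and the live project directory (both hypothetical).
with tempfile.TemporaryDirectory() as backup, tempfile.TemporaryDirectory() as live:
    for name in ("model.out", "restart.dat"):  # hypothetical file names
        open(os.path.join(backup, name), "w").close()
    open(os.path.join(live, "model.out"), "w").close()
    demo_missing = missing_files(backup, live)
    print(demo_missing)
```

In real use the two arguments would be the backup folder and the BOINC data directory, whose locations depend on the installation.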

ID: 37514
old_user294426

Joined: 20 Feb 06
Posts: 158
Credit: 1,251,176
RAC: 0
Message 37515 - Posted: 20 Jul 2009, 9:17:43 UTC - in response to Message 37514.  

I think that the only thing that you can do is to make a backup before the finish. That way, it will be possible to check which files are/were still there at that time, if it DOES fail to finish properly.
However, as Tolu has said that he's found the problem, fixed it, tested it, and will move the new app version to this site soon, it's all a bit academic.



OK. Les

No point in messing around. I shall leave them to run with 72,000 time step "success".
When will version 6.08 be operative? (No point in starting another task under 6.07.)

I suppose I will just have to try starting a task and abort whenever 6.07 is loaded.
Hopefully 6.08 will be running by the time my version 6.07 tasks complete this evening.

Keith
ID: 37515
Richard Haselgrove

Joined: 1 Jan 07
Posts: 943
Credit: 34,284,966
RAC: 11,093
Message 37516 - Posted: 20 Jul 2009, 9:35:42 UTC - in response to Message 37515.  

I suppose I will just have to try starting a task and abort whenever 6.07 is loaded.

It would be better not to waste all that bandwidth (HadAM3P downloads a lot of data).

You can check when the new application has been installed on the Applications page.

Still v6.07 as I type.
ID: 37516
Richard Haselgrove

Joined: 1 Jan 07
Posts: 943
Credit: 34,284,966
RAC: 11,093
Message 37519 - Posted: 20 Jul 2009, 13:23:19 UTC

v6.08 for Mac OS X Intel was installed about 80 minutes ago. Who's going to be the first brave soul to try it out?
ID: 37519
old_user294426

Joined: 20 Feb 06
Posts: 158
Credit: 1,251,176
RAC: 0
Message 37522 - Posted: 20 Jul 2009, 17:12:12 UTC - in response to Message 37519.  
Last modified: 20 Jul 2009, 17:20:05 UTC

v6.08 for Mac OS X Intel was installed about 80 minutes ago. Who's going to be the first brave soul to try it out?


Thanks Richard
I will be picking up 2 tasks at 8pm BST !!!
That has been just timed right. [In fact, just downloading now. Switched network on.]

NOW HERE IS SOMETHING INTERESTING ON THESE 2 WORK UNITS.
EVERY SINGLE TASK ON THE WORK UNITS OF MY 2 TASKS NOW COMPLETING HAVE CRASHED FOR THE OTHER COMPUTERS.
MINE ARE THE ONLY 2 COMPLETING (PRESUMABLY TO 72,000 INSTEAD OF 72,096) AT 8PM!!!
They are in work units 6356998 and 6356996

How do you explain that?????

Keith
ID: 37522
Iain Inglis

Joined: 9 Jan 07
Posts: 467
Credit: 14,549,176
RAC: 317
Message 37524 - Posted: 20 Jul 2009, 17:26:35 UTC - in response to Message 37522.  
Last modified: 20 Jul 2009, 17:31:46 UTC

How do you explain that?????

The casualty rate for CPDN work units is very high. People naturally expect that the models will just download and run without intervention, but experience shows that it's a bit more difficult than that.

There's a variety of errors in those two work units: when those people pitch up here, they'll be pointed at the 'read me' posts - and if that doesn't explain it then another bug hunt will start ...

PS Do report back on what happens to your new models!
ID: 37524
Iain Inglis

Joined: 9 Jan 07
Posts: 467
Credit: 14,549,176
RAC: 317
Message 37526 - Posted: 20 Jul 2009, 18:06:25 UTC

An example of a Linux computer repeatedly finishing at 1980.0 credits is here - with library problems.

If there are more like this, then there may need to be something done on that platform as well.
ID: 37526
old_user294426

Joined: 20 Feb 06
Posts: 158
Credit: 1,251,176
RAC: 0
Message 37528 - Posted: 20 Jul 2009, 20:23:07 UTC
Last modified: 20 Jul 2009, 20:24:50 UTC

Just after 8pm UTC

Both v6.07 tasks completed.
Watched graphics through to T/s 72,006
Then the post-processing T/s 96 did not count down but showed 100%, and the new 6.08 version started (yippee).
Needed update before the 100% task would upload.
Trickles not yet updated, but expected to be 72,000 with 1980.00 credit.

Now awaiting the v6.08 tasks in 6 days' time!!!!

Keith
ID: 37528
Roger Bates

Joined: 20 Jul 09
Posts: 3
Credit: 966,892
RAC: 0
Message 37549 - Posted: 26 Jul 2009, 6:23:04 UTC
Last modified: 26 Jul 2009, 6:25:41 UTC

I have just completed my first two HADAM3P v6.0.8 simulations.

One seemed to complete properly. Result ID 8784020 Workunit ID 6351126 Host ID 991942
The other is stuck on "Ready to Report" Result ID 8784038 Workunit ID 6351130 Host ID 991942
This is after about 8 hours wait.

Is this normal? If not, is there something I can do to get the process to report properly?

P.S. iMac (dual Intel) running Tiger 10.4.11

Roger Bates
ID: 37549
Les Bayliss
Volunteer moderator

Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 37550 - Posted: 26 Jul 2009, 8:47:00 UTC

There are several triggers for "Reporting":
1) The next time a trickle is uploaded
2) 24 hours after the final zip(s) are uploaded
3) By clicking the Update button. WARNING: do this too soon, and there's a risk that the Report will get there before the zips are transferred from the upload server to the storage server. This is caused by congested servers, and results in a message something like: Files rejected; model already completed. And you won't get the Over Success Done messages against that model.

There are other triggers, but I forget what they are.
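
The three triggers listed above can be restated as a small Python sketch. This is only a paraphrase of the list, not the BOINC client's actual logic; the function name and its parameters are made up for illustration, and the other triggers are not modelled.

```python
def report_due(trickle_uploaded, hours_since_final_upload, update_clicked):
    """Return the first matching trigger from the list above, or None.

    hours_since_final_upload is None if the final zip(s) have not
    finished uploading yet.
    """
    if trickle_uploaded:
        return "next trickle upload"
    if hours_since_final_upload is not None and hours_since_final_upload >= 24:
        return "24 hours after final zip upload"
    if update_clicked:
        return "manual Update"
    return None


# A finished task, 8 hours after upload, no new trickle, Update not clicked:
print(report_due(False, 8, False))
```

Under these assumptions, a task that has waited about 8 hours with nothing else happening matches none of the triggers, so it would still sit at "Ready to Report".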

Basically, Relax, Don't Worry.
BOINC will take care of things in its own good time.

ID: 37550
Iain Inglis

Joined: 9 Jan 07
Posts: 467
Credit: 14,549,176
RAC: 317
Message 37551 - Posted: 26 Jul 2009, 9:17:37 UTC - in response to Message 37549.  

[Roger Bates wrote:]I have just completed my first two HADAM3P v6.0.8 simulations.

... and, for the purposes of this thread, both models have the final mini-trickle at 72,096. The credit process hasn't run yet, but when it does these models will show the full 1,982.64 credits as expected.

So, the Mac library fix appears to have worked.
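
As a rough check on the two credit figures quoted in this thread, the sketch below assumes credit is simply proportional to the number of timesteps. That proportionality is an inference from the numbers 1980.00 (at 72,000) and 1,982.64 (at 72,096), not a documented CPDN formula.

```python
from decimal import Decimal

# Figures quoted in this thread.
CREDIT_AT_72000 = Decimal("1980.00")
SHORT_RUN_TIMESTEPS = Decimal(72000)
FULL_RUN_TIMESTEPS = Decimal(72096)

# Assumption: credit scales linearly with timesteps.
credit_per_timestep = CREDIT_AT_72000 / SHORT_RUN_TIMESTEPS
full_credit = credit_per_timestep * FULL_RUN_TIMESTEPS

print(credit_per_timestep, full_credit)
```

On that assumption the rate is 0.0275 credits per timestep and the full run comes to 1,982.64, so the missing 96 timesteps account for the 2.64-credit difference.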
ID: 37551
old_user294426

Joined: 20 Feb 06
Posts: 158
Credit: 1,251,176
RAC: 0
Message 37553 - Posted: 27 Jul 2009, 1:46:40 UTC
Last modified: 27 Jul 2009, 1:52:09 UTC

Roger

Well done. This is the REAL GOOD NEWS now upgraded to 6.08 from 6.07.
The report being made in full (72,096 TS and 1,982.64 credits now showing on your account).
And the graphs show on the Task ID details.

You have just beaten me "at the post" with 5.2 TS/sec.
I am getting just over 7 TS/sec under 6.08 (and 6.07), and was just over 9 TS/sec under 6.06.
My 2 tasks are at 92.5% and should complete at about 15:00 UTC today.
I already have my next 2 tasks downloaded and "Ready to Start".

Keith
ID: 37553
old_user294426

Joined: 20 Feb 06
Posts: 158
Credit: 1,251,176
RAC: 0
Message 37556 - Posted: 28 Jul 2009, 7:18:00 UTC

I much regret that both tasks with v 6.08 completed at 72,000 TS with 1,742.40 credit on 27th July.
I waited a full day and updated to make sure that the trickles had not been delayed.

They are
http://climateapps2.oucs.ox.ac.uk/cpdnboinc/result.php?resultid=9124942
and
http://climateapps2.oucs.ox.ac.uk/cpdnboinc/result.php?resultid=8784464

I have now detached and reset CPDN, loaded 2 more tasks, hoping that they will be successful this time.
Further news in 7 days time.

Keith
ID: 37556
Les Bayliss
Volunteer moderator

Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 37570 - Posted: 29 Jul 2009, 14:36:29 UTC

Roger

The question of when "Reporting" occurs has now been answered in full here
ID: 37570
old_user294426

Joined: 20 Feb 06
Posts: 158
Credit: 1,251,176
RAC: 0
Message 37586 - Posted: 30 Jul 2009, 4:47:52 UTC - in response to Message 37556.  

I much regret that both tasks with v 6.08 completed at 72,000 TS with 1,742.40 credit on 27th July.
I waited a full day and updated to make sure that the trickles had not been delayed.

They are
http://climateapps2.oucs.ox.ac.uk/cpdnboinc/result.php?resultid=9124942
and
http://climateapps2.oucs.ox.ac.uk/cpdnboinc/result.php?resultid=8784464

I have now detached and reset CPDN, loaded 2 more tasks, hoping that they will be successful this time.
Further news in 7 days time.

Keith


It seems that there are problems with trickle reporting (mentioned on other threads). Could this be why the last 2 tasks did not complete to 72,096?
The present tasks are up to 29% but no trickles have been recorded on the statistics page although they are shown on the Task ID details for my computer.
As soon as the trickles are rectified, hopefully both the current 2 tasks and the previous 2 tasks will be credited and the last trickle of 72,096 will appear?

Keith
ID: 37586