climateprediction.net home page
New work

Les Bayliss
Volunteer moderator
Joined: 5 Sep 04
Posts: 7144
Credit: 22,254,291
RAC: 18,428
Message 59497 - Posted: 23 Jan 2019, 8:40:50 UTC - in response to Message 59496.  

So now is when you want to do all the OS patches that require reboots and change the fan that rumbles a little now and then?

CPDN has just taken away the excuse... but I have that long running WU that I don't want to disturb!

Bill F



If you're talking about this task, wah2_safr50_n1x3_199312_14_781_011717119_2, it's been killed by the project.
ID: 59497
Jim1348
Joined: 15 Jan 06
Posts: 460
Credit: 20,131,831
RAC: 51,808
Message 59506 - Posted: 25 Jan 2019, 12:12:32 UTC

A new batch (784) of hadcm3s just came out, and I got four of them. The first one has been running for 45 minutes with no problems, so they could work.
ID: 59506
Dave Jackson
Volunteer moderator
Joined: 15 May 09
Posts: 2628
Credit: 3,143,917
RAC: 335
Message 59507 - Posted: 25 Jan 2019, 13:01:09 UTC - in response to Message 59506.  
Last modified: 25 Jan 2019, 13:02:33 UTC

A new batch (784) of hadcm3s just came out, and I got four of them. The first one has been running for 45 minutes with no problems, so they could work.


I just tried to get some but got the "database down" message, so I have to wait an hour before having another go. I know Andy knows about it, but I'm not sure why the machine should be so busy that it is causing problems at the moment.

Edit: Took 8 attempts to post the above. Now to see how long the edit takes....
ID: 59507
Iain Inglis
Volunteer moderator
Joined: 16 Jan 10
Posts: 1006
Credit: 4,416,747
RAC: 15,472
Message 59508 - Posted: 25 Jan 2019, 13:48:21 UTC

There are 3061 of them, but I can't get any for my Mac - which can only run HADCM3S - because of the database being "down" (batch list).
ID: 59508
Dave Jackson
Volunteer moderator
Joined: 15 May 09
Posts: 2628
Credit: 3,143,917
RAC: 335
Message 59509 - Posted: 25 Jan 2019, 14:18:25 UTC - in response to Message 59508.  

Have snagged one on my desktop machine now.
ID: 59509
Dave Jackson
Volunteer moderator
Joined: 15 May 09
Posts: 2628
Credit: 3,143,917
RAC: 335
Message 59544 - Posted: 6 Feb 2019, 16:46:10 UTC
Last modified: 6 Feb 2019, 16:58:08 UTC

New model type for batch 785: HadAM4. I don't know what is different about this model type, but between two machines I have three of them running under Linux at the moment. If I understand what I have read correctly, it is a relatively small batch of 500 tasks, so they won't last long, especially if, as I suspect, they run on all three platforms.

Seen some on Windows of various types. Not found any on Mac yet, but that means nothing as I only looked at about ten or twelve tasks.
ID: 59544
geophi
Volunteer moderator
Joined: 7 Aug 04
Posts: 1896
Credit: 39,900,770
RAC: 47,353
Message 59545 - Posted: 6 Feb 2019, 18:45:51 UTC - in response to Message 59544.  

All 100 of the running tasks I looked at were on the Linux app. I had tried Windows first when I saw there were new tasks, but the Windows clients wouldn't pick any up.
ID: 59545
Les Bayliss
Volunteer moderator
Joined: 5 Sep 04
Posts: 7144
Credit: 22,254,291
RAC: 18,428
Message 59547 - Posted: 6 Feb 2019, 20:41:24 UTC

Yes, the new app is Linux only. (Yea!)
And I've got some running on one computer, which now has both Linux/Wine/Windows, and Linux only. (Yea!)
I just need to remember which icon starts which version.

And I've come across the first mass killer, who is now running this batch. :(

Batch 785 is a small spinup batch.
ID: 59547
Les Bayliss
Volunteer moderator
Joined: 5 Sep 04
Posts: 7144
Credit: 22,254,291
RAC: 18,428
Message 59549 - Posted: 7 Feb 2019, 4:00:42 UTC

Now a bit over 3% at a bit over 6 hours on my 3.50 GHz Haswell computer, so about 8.5 days total.
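
The estimate above is a straight-line extrapolation from progress so far. A minimal sketch of that arithmetic, assuming the model advances at a constant rate; the function name and the rounded inputs (3% and 6 hours) are illustrative and not taken from BOINC:

```python
def estimate_total_days(elapsed_hours: float, fraction_done: float) -> float:
    """Linearly extrapolate total runtime in days from progress so far."""
    if not 0 < fraction_done <= 1:
        raise ValueError("fraction_done must be in (0, 1]")
    total_hours = elapsed_hours / fraction_done
    return total_hours / 24


if __name__ == "__main__":
    # Rounded figures from the post: about 3% done after about 6 hours.
    print(f"{estimate_total_days(6.0, 0.03):.1f} days")
```

With exactly 3% at 6 hours this comes to a little over 8 days, which is in line with the roughly 8.5 days quoted given that both inputs are rounded.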
ID: 59549
Dave Jackson
Volunteer moderator
Joined: 15 May 09
Posts: 2628
Credit: 3,143,917
RAC: 335
Message 59551 - Posted: 7 Feb 2019, 5:48:12 UTC
Last modified: 7 Feb 2019, 6:38:31 UTC

A lot seem to be crashing with:
Model crashed:
READDUMP: BAD BUFFIN OF DATA
This happened to four on testing when someone put a digger through a mains cable near where I live. It is happening anywhere from shortly after the model starts to several hours in. I suspect they don't like being interrupted.

This is one example
https://www.cpdn.org/cpdnboinc/result.php?resultid=21488816

It looks like about one in six of those that don't fail due to missing libraries are failing with the above error. A few other errors were also spotted: a few with insufficient stack memory available, and one where the user had restricted memory usage to half a gig. Interestingly, that one had completed some hadcm3s tasks.
ID: 59551
Dave Jackson
Volunteer moderator
Joined: 15 May 09
Posts: 2628
Credit: 3,143,917
RAC: 335
Message 59553 - Posted: 7 Feb 2019, 12:11:42 UTC

And the new task type is now on the project Status page.
ID: 59553
geophi
Volunteer moderator
Joined: 7 Aug 04
Posts: 1896
Credit: 39,900,770
RAC: 47,353
Message 59555 - Posted: 7 Feb 2019, 15:47:14 UTC

I had one of those failures with the BAD BUFFIN OF DATA error when I stopped BOINC several timesteps after a checkpoint. There should have been nothing wrong with doing that. It was the only task running on that PC at the time. Not good.

These things might be the biggest memory hogs in terms of active memory that we've had. Each model task takes about 650 MB of RAM, so for a fully loaded i7 with 8 tasks, about 5.5 GB of RAM used. I would imagine given cache and memory contention, in that circumstance, it would REALLY slow model progress relative to some of our other model types/regions. My reasonably quick PCs running only 1 model each are averaging 7.5 to 9.5 sec/TS.
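
As a rough check on those memory figures, here is a minimal sketch that assumes the 650 MB per task quoted above and decimal units (1 GB = 1000 MB). The helper names are hypothetical and nothing here queries BOINC:

```python
MB_PER_TASK = 650  # approximate per-task RAM reported in the post


def total_ram_gb(tasks: int, mb_per_task: float = MB_PER_TASK) -> float:
    """RAM needed by the model tasks alone, in GB (1 GB = 1000 MB)."""
    return tasks * mb_per_task / 1000


def max_tasks(available_mb: float, mb_per_task: float = MB_PER_TASK) -> int:
    """How many tasks fit in the given amount of RAM."""
    return int(available_mb // mb_per_task)


if __name__ == "__main__":
    print(total_ram_gb(8))  # model tasks on a fully loaded 8-thread i7
    print(max_tasks(512))   # a host with memory use limited to half a gig
```

Eight tasks come to about 5.2 GB for the models alone, slightly under the roughly 5.5 GB quoted, which may include other overhead. A host limited to 512 MB cannot fit even one task at this size.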
ID: 59555
Dave Jackson
Volunteer moderator
Joined: 15 May 09
Posts: 2628
Credit: 3,143,917
RAC: 335
Message 59557 - Posted: 7 Feb 2019, 19:43:29 UTC - in response to Message 59555.  

Thanks George,

Each model task takes about 650 MB of RAM, so for a fully loaded i7 with 8 tasks, about 5.5 GB of RAM used. I would imagine given cache and memory contention, in that circumstance....


I hadn't looked at how much memory was being used, but what you say makes sense. There are a few machines out there that have crashed tasks due to running out of memory. One of them was, admittedly, a 4-core i7 with only 1 GB of RAM!
ID: 59557
JIM
Joined: 31 Dec 07
Posts: 1099
Credit: 19,939,739
RAC: 4,331
Message 59558 - Posted: 7 Feb 2019, 20:17:11 UTC

Any sign of new work for Windows? I will have 4 empty cores by tomorrow.
ID: 59558
Les Bayliss
Volunteer moderator
Joined: 5 Sep 04
Posts: 7144
Credit: 22,254,291
RAC: 18,428
Message 59559 - Posted: 7 Feb 2019, 20:44:07 UTC

Finally got to the zips.
About 7.3M on average, so much smaller than what we've had for a while.

I can confirm what George said about the Virtual memory size. Definitely not for bare bones machines.

14.0 sec/TS for the Haswell, and about 14.4 sec/TS for the Ivy Bridge.
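
To put the two sec/TS figures side by side, a small sketch that converts seconds per timestep into timesteps per hour and a relative difference. The function names are illustrative, and only the 14.0 and 14.4 values come from the post:

```python
def timesteps_per_hour(sec_per_ts: float) -> float:
    """Convert a sec/TS figure into model timesteps completed per hour."""
    if sec_per_ts <= 0:
        raise ValueError("sec_per_ts must be positive")
    return 3600 / sec_per_ts


def percent_slower(sec_per_ts: float, baseline_sec_per_ts: float) -> float:
    """How much longer each timestep takes compared with a baseline host."""
    return (sec_per_ts / baseline_sec_per_ts - 1) * 100


if __name__ == "__main__":
    haswell, ivy_bridge = 14.0, 14.4  # sec/TS figures from the post
    print(f"Haswell:    {timesteps_per_hour(haswell):.0f} TS/hour")
    print(f"Ivy Bridge: {timesteps_per_hour(ivy_bridge):.0f} TS/hour")
    print(f"Ivy Bridge is {percent_slower(ivy_bridge, haswell):.1f}% slower per timestep")
```

That works out to roughly 257 versus 250 timesteps per hour, a difference of about 3%.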
ID: 59559
Dave Jackson
Volunteer moderator
Joined: 15 May 09
Posts: 2628
Credit: 3,143,917
RAC: 335
Message 59564 - Posted: 8 Feb 2019, 11:07:13 UTC
Last modified: 8 Feb 2019, 11:15:24 UTC

Any sign of new work for Windows? I will have 4 empty cores by tomorrow.

I think there is a batch in the pipeline. New files were uploaded to a potential cam25 batch at about 01:00 UTC, so someone was working late if based in Oxford!
ID: 59564
JIM
Joined: 31 Dec 07
Posts: 1099
Credit: 19,939,739
RAC: 4,331
Message 59565 - Posted: 8 Feb 2019, 15:31:02 UTC - in response to Message 59564.  

Hopefully, a large batch.
ID: 59565
Albert H.
Joined: 18 Feb 06
Posts: 58
Credit: 22,951,088
RAC: 11,944
Message 59674 - Posted: 25 Feb 2019, 19:20:46 UTC

I have 20 cores idle. How long do we have to wait for new Windows work for CPDN?
Just a question: is this the end? Or is there some hope for better days?

Thanks.
ID: 59674
Les Bayliss
Volunteer moderator
Joined: 5 Sep 04
Posts: 7144
Credit: 22,254,291
RAC: 18,428
Message 59675 - Posted: 25 Feb 2019, 19:47:26 UTC

It's not the end, just normal.
There were several large batches released late last year. Now those researchers are waiting for the data to be returned so that they can study the results.

And at long last work is in hand on new Linux models, for those of us who haven't had any work for a long time.

And lots more people are joining the project every day, so there's no shortage of computers waiting.
ID: 59675
Albert H.
Joined: 18 Feb 06
Posts: 58
Credit: 22,951,088
RAC: 11,944
Message 59693 - Posted: 28 Feb 2019, 9:25:04 UTC

Thanks, now there is new work.
ID: 59693

©2020 climateprediction.net