climateprediction.net home page
FAMOUS SUCCESS/FAILURE RATIO

[B^S] mavau

Joined: 30 Aug 04
Posts: 142
Credit: 9,936,132
RAC: 0
Message 41067 - Posted: 17 Nov 2010, 19:54:12 UTC

Here's one model I'm curious about (getting very cold).
Let's see how it develops:
http://climateapps2.oucs.ox.ac.uk/cpdnboinc/result.php?resultid=12000683
Another note: this w series seems to run much slower (0.55 vs. 0.48 on my machine).


Les Bayliss
Volunteer moderator
Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 41069 - Posted: 17 Nov 2010, 20:48:01 UTC - in response to Message 41067.  

From a post by Hiro on the beta site:
On the main site, we have just started the famous_w series of experiments using the same version of Famous.

The initial workunits are spin up runs with a wider range of parameters, including a new parameter for the number of dynamic sweeps. Actually, we perturbed the sweep parameter before, but only for a few work units.


and later:

To add a bit of background, we started using 2-sweep dynamics to stabilize the model. This effectively cuts the time step of the atmospheric _dynamics_ in half. However, the run time hardly increases, because the atmospheric dynamics (excluding what we call "physics" and radiative transfer) is a very small part of the total CPU time.

According to my 5 or so cluster runs for the millennium and some results from the Bristol group, this eliminates most of the cold crashes (still not perfect, though).
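
As a rough back-of-the-envelope sketch of the point above: if the dynamics takes a fraction f of the CPU time and only that part is repeated per sweep, a 2-sweep run costs about 1 + f times a 1-sweep run. The function names and the 5% figure are illustrative assumptions, not project numbers.

```python
def relative_runtime(dynamics_fraction: float, sweeps: int = 2) -> float:
    """Runtime relative to a 1-sweep run, assuming only the
    dynamics cost scales with the number of sweeps (an assumption)."""
    if not 0.0 <= dynamics_fraction <= 1.0:
        raise ValueError("dynamics_fraction must be between 0 and 1")
    return (1.0 - dynamics_fraction) + dynamics_fraction * sweeps


def implied_dynamics_fraction(old_time: float, new_time: float, sweeps: int = 2) -> float:
    """Invert relative_runtime: the dynamics fraction that would explain
    an observed slowdown, if the extra sweeps were the only cause."""
    return (new_time / old_time - 1.0) / (sweeps - 1)


# Hypothetical case: dynamics is 5% of CPU time.
print(f"{relative_runtime(0.05):.2f}x")

# Timings reported earlier in this thread (0.48 before, 0.55 for the w series).
print(f"{implied_dynamics_fraction(0.48, 0.55):.1%}")
```

The second figure is only an upper-bound style estimate: the w series also perturbs other parameters, so the whole slowdown may not come from the extra sweep.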


I think that the 2nd post also refers to the "w" series models.


[B^S] mavau
Joined: 30 Aug 04
Posts: 142
Credit: 9,936,132
RAC: 0
Message 41113 - Posted: 20 Nov 2010, 19:31:17 UTC

[B^S] mavau
Joined: 30 Aug 04
Posts: 142
Credit: 9,936,132
RAC: 0
Message 41154 - Posted: 24 Nov 2010, 15:01:55 UTC

That second one took some time dying. Very cold.

3rkko
Joined: 12 Feb 08
Posts: 66
Credit: 4,877,652
RAC: 0
Message 41155 - Posted: 24 Nov 2010, 22:33:13 UTC

JIM
Joined: 31 Dec 07
Posts: 1152
Credit: 22,053,321
RAC: 4,417
Message 41176 - Posted: 28 Nov 2010, 14:18:11 UTC

Famous_ kHz_1999_200_006712331_3 completed successfully. OS is Windows 7 64 bit SP1 RC running on a Core 2 Duo 2.2 GHz processor with 4 GB of RAM.

A very warm one. Average temp. rising steadily throughout the 21st and 22nd centuries from 17.2 to 22.9 degrees C. The rise is greatest in the Northern Hemisphere, where it tops out at 24.4 C. Solar constant is at default.




Greg van Paassen
Joined: 17 Nov 07
Posts: 142
Credit: 4,271,370
RAC: 0
Message 41180 - Posted: 28 Nov 2010, 20:05:09 UTC
Last modified: 28 Nov 2010, 20:07:31 UTC

In the famous_w0xx_599 series, I've had two fail and one succeed, so far.

One of the failures was a runaway, reaching 38.5 Celsius before crashing. The other was a cold world, crashing at 8.7 C.

The one that succeeded had quite extreme-looking values for ice fall speed, entrainment coefficient, and temp range of ice albedo variation. You just can't tell.

Back on "v series" famouses now - the luck of the draw.

geophi
Volunteer moderator
Joined: 7 Aug 04
Posts: 2167
Credit: 64,403,322
RAC: 5,085
Message 41319 - Posted: 18 Dec 2010, 15:21:14 UTC
Last modified: 18 Dec 2010, 15:28:25 UTC

Overall:  130 successes, 73 failures while computing (64% success ratio)
w-series: 2 successes, 12 failures while computing (14% success ratio)

Machine                     Success (all/w-series)   Computing failure (all/w-series)
Core i7 920, Linux          52/0                     30/2
Phenom II X4 940, Linux     50/0                     27/6
Phenom II X6 1090T, Linux   12/2                     7/1
Phenom II X2 B93, Windows   10/0                     5/1
Core2 E8600, Windows        6/0                      4/2
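
For anyone who wants to keep a similar tally, here is a minimal sketch that recomputes the totals and ratios from the per-machine counts above. The dictionary layout and function name are my own choices for illustration; the numbers are copied from this post.

```python
# machine -> ((successes_all, successes_w), (failures_all, failures_w))
results = {
    "Core i7 920 Linux": ((52, 0), (30, 2)),
    "Phenom II X4 940 Linux": ((50, 0), (27, 6)),
    "Phenom II X6 1090T Linux": ((12, 2), (7, 1)),
    "Phenom II X2 B93 Windows": ((10, 0), (5, 1)),
    "Core2 E8600 Windows": ((6, 0), (4, 2)),
}


def success_ratio(successes: int, failures: int) -> float:
    """Fraction of finished tasks that succeeded (NaN if none finished)."""
    total = successes + failures
    return successes / total if total else float("nan")


# Index 0 is the all-models count, index 1 the w-series count.
all_ok = sum(ok[0] for ok, _ in results.values())
all_fail = sum(fail[0] for _, fail in results.values())
w_ok = sum(ok[1] for ok, _ in results.values())
w_fail = sum(fail[1] for _, fail in results.values())

print(f"Overall:  {all_ok} ok, {all_fail} failed, {success_ratio(all_ok, all_fail):.0%}")
print(f"w-series: {w_ok} ok, {w_fail} failed, {success_ratio(w_ok, w_fail):.0%}")
```

With only 14 w-series results the ratio is very noisy, so it is best read as a rough indication rather than a firm failure rate.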

[B^S] mavau
Joined: 30 Aug 04
Posts: 142
Credit: 9,936,132
RAC: 0
Message 41494 - Posted: 17 Jan 2011, 19:47:27 UTC

Had a big crash (6 models) 12 days ago, due to disk issues (bad sectors).
Early symptom: the McAfee check took ages to complete.
I eventually noticed all the disk error messages in Event Viewer.
This fix should work for a while:
have chkdsk scan for bad sectors every so often (the second checkbox).

The new batch has been successful, except for:
http://climateapps2.oucs.ox.ac.uk/cpdnboinc/result.php?resultid=12400747

I hadn't met this error before.
Some links below for everybody's information:
http://ncas-cms.nerc.ac.uk/trac/UMHelpdesk/ticket/399
http://cpdnbeta.oerc.ox.ac.uk/forum_thread.php?id=229

Happy crunching for 2011.

Les Bayliss
Volunteer moderator
Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 41496 - Posted: 17 Jan 2011, 20:17:37 UTC - in response to Message 41494.  

I've passed this on to the project person for FAMOUS.


Les Bayliss
Volunteer moderator
Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 41503 - Posted: 18 Jan 2011, 19:52:09 UTC - in response to Message 41496.  

Mavau, and others who get a "Model crashed: REPLANCA: PP HEADERS ON ANCILLARY FILE DO NOT MATCH" error message:

This is caused by one of the auxiliary files not having sufficient data to cover the full modelling period.
Those data sets still in the queue that were affected have now been removed.
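
To illustrate the kind of mismatch described, here is a minimal sketch of a coverage check. It is purely hypothetical: the function name, the year-based arguments and the example years are assumptions for illustration, and it is not how the model itself validates ancillary files.

```python
def covers_run(ancil_first_year: int, ancil_last_year: int,
               run_start_year: int, run_length_years: int) -> bool:
    """True if the ancillary data spans the whole modelling period.

    All arguments are hypothetical calendar years / durations.
    """
    run_end_year = run_start_year + run_length_years
    return ancil_first_year <= run_start_year and run_end_year <= ancil_last_year


# Illustrative numbers only: data to 2100 cannot cover a 200-year run from 1999.
print(covers_run(1990, 2100, 1999, 200))
print(covers_run(1990, 2200, 1999, 200))
```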

Apologies for the mix-up.

Also, Kraken has filled up yet again, and data needs to be moved off to storage.
Milo is not available to do this, so we wait in hope. :)


dajashby
Joined: 1 Sep 04
Posts: 55
Credit: 17,223,688
RAC: 967
Message 41505 - Posted: 19 Jan 2011, 0:07:35 UTC

I have a MacBook Air (1114220) that's been crunching since the middle of December. It's received nothing but FAMOUS models, and has completed 6 successfully out of 48 downloaded. I haven't checked them all, but the ones I have looked at state "INVALID THETA DETECTED". My impression from a search of the boards is that Famous models are reasonably prone to failure, but the failure percentage on this machine seems way too high.

My question: is there a convenient way to exclude this machine from receiving Famous models, and if there is, should I do so?

Of the 6 PCs I have on the project, this one is far and away the most "productive" when measured by credits received: over the last 5 days it's averaged 2,308.42 credits (1,154.21 per CPU core). My two quad-core Windows machines averaged 550 and 599 per core over the same period. The other two Core2 Duo machines (both Windows boxes) managed slightly less.
Derrick Ashby
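
For reference, the figures in this post work out as follows. This is only a sketch of the arithmetic; the variable names are mine, and it treats all 48 downloaded models as finished, which may overstate the failure rate if some are still running.

```python
def per_core(credit: float, cores: int) -> float:
    """Average credit per CPU core."""
    return credit / cores


completed, downloaded = 6, 48
success_rate = completed / downloaded

air_per_core = per_core(2308.42, 2)   # MacBook Air, 2 cores
quad_per_core = [550.0, 599.0]        # the two quad-core Windows machines

print(f"Success rate: {success_rate:.1%}")
print(f"Air credit per core: {air_per_core:.2f}")
for q in quad_per_core:
    print(f"Air vs. quad-core at {q:.0f}/core: {air_per_core / q:.2f}x")
```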

Greg van Paassen
Joined: 17 Nov 07
Posts: 142
Credit: 4,271,370
RAC: 0
Message 41506 - Posted: 19 Jan 2011, 1:03:14 UTC - in response to Message 41505.  

Yes, you can exclude the Air from getting Famouses.

To do so:

(1) Go into "Your account" - see the blue menu on the left.

(2) Scroll down to "computers", go into this, and then into "Details" for the Air. Set the Air to be in a different 'Location' from your other computers -- say, School.

(3) Back on the "Your account" page, go into "climateprediction.net preferences". Find the link for "Add preferences for School", and in there, select the applications that you want to allow, and de-select "accept work from other applications?"

------------

The reason for high daily credit and famouses failing so frequently on Macs is that the CPDN programmers could not get the Famous application to compile without extra optimizations. The result is that Famouses run very fast but also crash more often on Macs than on other platforms.

HTH

3rkko
Joined: 12 Feb 08
Posts: 66
Credit: 4,877,652
RAC: 0
Message 41509 - Posted: 19 Jan 2011, 16:48:50 UTC - in response to Message 41506.  

Unfortunately Famous is currently the only model type available for Mac and Linux, so if you exclude Famous you will get no work at all.

Dave Jackson
Volunteer moderator
Joined: 15 May 09
Posts: 4314
Credit: 16,376,846
RAC: 3,590
Message 41510 - Posted: 19 Jan 2011, 18:57:00 UTC

64-bit Linux on dual-core Intel.
10 errored out altogether, 2 probably due to reboot issues. 2 are 599 models, which are known to be more prone to crashing. The failed models:
u4pe1999 74gg999 ugyf1799 v3pc1899 vhcx1199 vizg1599 vizh1799 w56v599 w8y4599 w158599
Some failed with invalid theta, the rest with negative pressure values.
Completed:
v3cz1799 v1b01799 va9x1799 ubdw1999 uh8d1799
which makes 5, or 1/3, completed. On my partner's box (WinXP, AMD), vnt18199, the only Famous unit started, completed.

©2024 climateprediction.net