climateprediction.net home page
Iceworld Appeal

Iain Inglis
Joined: 9 Jan 07
Posts: 467
Credit: 14,549,176
RAC: 317
Message 38211 - Posted: 28 Oct 2009, 17:56:38 UTC
Last modified: 28 Oct 2009, 17:58:01 UTC

I could buy another screen and put the four AMD models on it so when I am working I would see a bluey when it happens. Belt and Braces, I know.
The graphics slow down the processing a lot (~50%) whereas, oddly, the recording does not. So any method that avoids actually displaying the graphics will keep your processing rate up. Once started, the recording continues until stopped, whether the graphics are showing or not.

Great, so if they overwrite I don't need to delete after each phase, as I should not exceed 8 * ~30GB (240GB) of data. Is that a correct understanding?
Yes, that's how it works.

This also brings up another question. Do the .tmp files stay after the model ends?
BOINC seems to tidy up when the model finishes normally or is aborted. (I don't know what happens after a random crash, since I don't have them - some model types certainly used to leave debris when they crashed, but the up-to-date versions may be tidier.)

A set of Web pages has been set up to track your AMD models (the Intel models will slow down so much you're bound to notice them). It's here - there are 'previous' and 'next' links at the bottom of the page. The pages are on a scheduled task list to be updated at 18:15 UTC each day. On an AMD, you're looking for a dip in the relative seconds/timestep as the model speeds up. If any iceworlds appear we can sort out communication by private message on this board.
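
For anyone who would rather script the check than eyeball the graphs, here is a minimal Python sketch of the idea: compare each seconds/timestep reading with the median of the few readings before it and flag anything that moves by more than a chosen fraction. The function name, the window of 5 readings and the 25% tolerance are all assumptions made for illustration, and the sample series is invented (only the ~0.96 starting level resembles the figures quoted in this thread).

```python
from statistics import median


def flag_rate_change(sec_per_ts, window=5, tolerance=0.25):
    """Return (index, ratio) pairs where seconds/timestep departs from the
    median of the preceding `window` readings by more than `tolerance`.

    A ratio well below 1 is a dip (model speeding up, as described for AMD);
    a ratio well above 1 is a slowdown (as described for Intel).
    """
    flagged = []
    for i in range(window, len(sec_per_ts)):
        baseline = median(sec_per_ts[i - window:i])
        ratio = sec_per_ts[i] / baseline
        if abs(ratio - 1.0) > tolerance:
            flagged.append((i, round(ratio, 2)))
    return flagged


# Hypothetical readings: steady around 0.96-0.98 s/TS, then a sudden dip.
readings = [0.96, 0.97, 0.96, 0.98, 0.97, 0.96, 0.60, 0.58]
for index, ratio in flag_rate_change(readings):
    print(f"reading {index}: {ratio}x the recent median")
```

Using the median rather than the mean keeps a single odd reading (a reboot, say) from dragging the baseline around too much.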
ID: 38211
old_user582229

Joined: 12 Aug 09
Posts: 20
Credit: 3,063,648
RAC: 0
Message 38212 - Posted: 28 Oct 2009, 18:29:34 UTC
Last modified: 28 Oct 2009, 18:31:52 UTC

The four AMD models are as follows:

hadsm3dhet2_ul4a_006479812_6 .96 s/TS @ 65.16% complete.
hadsm3fub_kh70_006479462_0 .99 s/TS @ 59.37% complete.
hadsm3fub_kgzi_006479192_2 .96 s/TS @ 55.53% complete.
hadsm3fub_kgxv_006479133_9 .97 s/TS @ 52.66% complete.

I also have a 4850 x 2 using .05% CPU / core to crunch GPU Milkyway WUs.
I have changed my setting to 1% of GPU for graphics, so it should not slow things too much. Any idea how much the dip would be?
ID: 38212
Iain Inglis
Joined: 9 Jan 07
Posts: 467
Credit: 14,549,176
RAC: 317
Message 38213 - Posted: 28 Oct 2009, 18:47:49 UTC - in response to Message 38212.  

The four AMD models are as follows:

hadsm3dhet2_ul4a_006479812_6 .96 s/TS @ 65.16% complete.
hadsm3fub_kh70_006479462_0 .99 s/TS @ 59.37% complete.
hadsm3fub_kgzi_006479192_2 .96 s/TS @ 55.53% complete.
hadsm3fub_kgxv_006479133_9 .97 s/TS @ 52.66% complete.
OK, they're the ones now at http://www.bridge-9.org.uk/temp/dg/6697305.html etc. The list will be updated as models finish. I'm rather busy for a couple of weeks, but eventually the script will be changed to create the pages from a host id, then no human intervention will be required. :-)

Any idea how much the dip would be?
There's an AMD example way back in this thread, here - second graph.
ID: 38213
old_user582229

Joined: 12 Aug 09
Posts: 20
Credit: 3,063,648
RAC: 0
Message 38214 - Posted: 28 Oct 2009, 19:24:41 UTC - in response to Message 38213.  

OK, they're the ones now at http://www.bridge-9.org.uk/temp/dg/6697305.html etc. The list will be updated as models finish. I'm rather busy for a couple of weeks, but eventually the script will be changed to create the pages from a host id, then no human intervention will be required. :-)

Any idea how much the dip would be?
There's an AMD example way back in this thread, here - second graph.


OK then. That's easy, I'll just check the WUs once a day and look for dips.
Looking at your graph, the bluey seems to affect only two trickles, so if I find any, I will copy the relevant TSs to an Iceworld directory, post here, and await further instructions.

PS: I could not resist checking the 'Disk Tab' bug with this version of BOINC and things kept working fine (12.66GB on that i7 so far). I also checked a .tmp directory after a phase change and that model is working fine as well.
ID: 38214
Iain Inglis
Joined: 9 Jan 07
Posts: 467
Credit: 14,549,176
RAC: 317
Message 38220 - Posted: 29 Oct 2009, 14:17:39 UTC

David,

The Intel models have been added to the set of graphs now. Forewarned is normally forearmed - however, with two i7 machines the chance of anyone being ahead of you in those work units is slim. Still, you'll know to check the machine if the seconds/timestep heads skywards.

Iain
ID: 38220
old_user582229

Joined: 12 Aug 09
Posts: 20
Credit: 3,063,648
RAC: 0
Message 38224 - Posted: 30 Oct 2009, 0:30:19 UTC - in response to Message 38220.  

David,

The Intel models have been added to the set of graphs now. Forewarned is normally forearmed - however, with two i7 machines the chance of anyone being ahead of you in those work units is slim. Still, you'll know to check the machine if the seconds/timestep heads skywards.

Iain


Thanks for that, Iain. I was considering asking for this, but decided you would think I was being lazy. hadsm3fub_kcb5_006473131 shows a spike, but that was due to a reset to the beginning of phase two after a power cut. My bad, too many computers off one surge protector. lol.

We had another power cut this morning and another model, hadsm3fub_kgvz_006479065_9, reset to start, but that spike has not shown up yet. I also noticed all the recordings stopped and had to be restarted for all WUs.

I am, actually, following for three of the units.

David
ID: 38224
old_user353238

Joined: 15 Mar 06
Posts: 41
Credit: 3,581,078
RAC: 0
Message 38228 - Posted: 31 Oct 2009, 9:08:49 UTC

After failing to deliver one (2 weeks ago) which carried on to completion after restore, now have another iceworld.
This one is the genuine article (I think!) at http://climateapps2.oucs.ox.ac.uk/cpdnboinc/result.php?resultid=10317957

The .CPDN file is on its way. Note that the file size dropped from 96k (usual range 95 - 100k) to one at 69k then
immediately 7k for a number, 3k for another series, then 2k before I killed it. Is that dwindling file size normal in the aftermath of an iceball?

Enjoy. ;)
ID: 38228
old_user582229

Joined: 12 Aug 09
Posts: 20
Credit: 3,063,648
RAC: 0
Message 38229 - Posted: 31 Oct 2009, 12:05:08 UTC

Slab: hadsm3fub_kf5n_006476821 has developed into an Iceworld at 86.170%.
WU 6694993
Recording is on.

CPDN files show
4:11PM 116KB
4:11PM 85KB
4:12PM 10KB
4:13PM 10KB
4:14PM 11KB
4:15PM 10KB
4:17PM 4KB

Iceworld detector shows: Phase 3, Step 118,822, Trickle 59, First TS 1.31, Last TS 1.30, Ratio 1.0.

Model is still running, please advise.
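
The size drop in a listing like this can be spotted mechanically. A minimal Python sketch, assuming the first file in the list is a normal-sized one and that "less than half the first file" is a sensible threshold (both are assumptions for illustration, not anything the project specifies); the function name is made up:

```python
def first_collapse(sizes_kb, fraction=0.5):
    """Return the index of the first file smaller than `fraction` of the
    first file's size, or None if no file shrinks that much."""
    baseline = sizes_kb[0]
    for i, size in enumerate(sizes_kb[1:], start=1):
        if size < baseline * fraction:
            return i
    return None


# Sizes in KB, in the order listed above (4:11PM through 4:17PM).
sizes = [116, 85, 10, 10, 11, 10, 4]
print(first_collapse(sizes))
```

With the sizes above, the 85KB file is still above half of 116KB, so the first file flagged is the 10KB one written at 4:12PM.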
ID: 38229
geophi
Volunteer moderator
Joined: 7 Aug 04
Posts: 2167
Credit: 64,487,919
RAC: 4,541
Message 38230 - Posted: 31 Oct 2009, 14:49:58 UTC - in response to Message 38228.  

iansm wrote:
The .CPDN file is on its way. Note that the file size dropped from 96k (usual range 95 - 100k) to one at 69k then
immediately 7k for a number, 3k for another series, then 2k before I killed it. Is that dwindling file size normal in the aftermath of an iceball?

Yes. The information in the file relates to colors. It goes from a lot of colors (variation of temperatures) for a normal model, to one color for an iceworld.
ID: 38230
Iain Inglis
Joined: 9 Jan 07
Posts: 467
Credit: 14,549,176
RAC: 317
Message 38231 - Posted: 31 Oct 2009, 17:05:08 UTC - in response to Message 38230.  

iansm wrote:
The .CPDN file is on its way. Note that the file size dropped from 96k (usual range 95 - 100k) to one at 69k then
immediately 7k for a number, 3k for another series, then 2k before I killed it. Is that dwindling file size normal in the aftermath of an iceball?

Yes. The information in the file relates to colors. It goes from a lot of colors (variation of temperatures) for a normal model, to one color for an iceworld.

... and the in-built file compression exploits the redundant repetition in areas of the same value (temperature, pressure, precipitation and cloud cover) - so the files get smaller.

The reason for the progressive reduction in file size is that the model initially fails at a single grid point and that failure spreads to the whole grid in two timesteps (in your case 0:95-100k, 1:69k, 2:7k). The drop to the final value, quite a number of steps later, results from sea ice becoming uniform over the whole ocean: more repetition, more redundancy, more compression. After that, nothing happens.
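
The effect is easy to reproduce with any general-purpose compressor. The sketch below is only an illustration: the actual compression used inside the .cpdn files is not stated in this thread, so zlib stands in for it, and the grid size, the fill value and the random "varied" field are all invented.

```python
import random
import zlib


def compressed_size(values):
    """Size in bytes of a list of 0-255 values after zlib compression."""
    return len(zlib.compress(bytes(values)))


rng = random.Random(0)   # fixed seed so the run is repeatable
CELLS = 96 * 73          # arbitrary grid size chosen for the example

varied = [rng.randrange(256) for _ in range(CELLS)]   # a "normal" field
uniform = [7] * CELLS                                 # one value everywhere
half_frozen = uniform[:CELLS // 2] + varied[CELLS // 2:]

for label, field in [("varied", varied),
                     ("half frozen", half_frozen),
                     ("uniform", uniform)]:
    print(f"{label:12s} {compressed_size(field)} bytes")
```

The more of the grid that holds a single repeated value, the smaller the compressed output, which is the same pattern as the shrinking file sizes described above.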
ID: 38231
Iain Inglis
Joined: 9 Jan 07
Posts: 467
Credit: 14,549,176
RAC: 317
Message 38232 - Posted: 31 Oct 2009, 17:58:02 UTC - in response to Message 38228.  

After failing to deliver one (2 weeks ago) which carried on to completion after restore, now have another iceworld.
This one is the genuine article (I think!) at http://climateapps2.oucs.ox.ac.uk/cpdnboinc/result.php?resultid=10317957

The .CPDN file is on its way. Note that the file size dropped from 96k (usual range 95 - 100k) to one at 69k then
immediately 7k for a number, 3k for another series, then 2k before I killed it. Is that dwindling file size normal in the aftermath of an iceball?

Enjoy. ;)

Thanks for that, Ian. Once the e-mail had been coaxed past various spam filters, it was then processed into point #19 on the West coast iceworld collection - it seems a popular spot.

The model froze at 184,334 in the third phase, which follows the pattern of all other crashes, whatever phase, whatever platform - i.e. the freeze occurs in the second timestep of a block of six. The significance of that? I haven't a clue.
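
As a quick check of that arithmetic, a minimal Python sketch, assuming the blocks of six are counted from timestep 1 (so timesteps 1-6 form the first block); the function name is made up for the example:

```python
def position_in_block(timestep, block=6):
    """1-based position of a timestep within consecutive blocks of `block`
    steps, assuming the first block starts at timestep 1."""
    return (timestep - 1) % block + 1


print(position_in_block(184334))
```

Under that assumption, 184,334 leaves a remainder of 2 when divided by 6, which puts it at the second timestep of its block.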
ID: 38232
old_user596405

Joined: 4 Oct 09
Posts: 73
Credit: 7,242,427
RAC: 0
Message 38233 - Posted: 31 Oct 2009, 18:44:04 UTC - in response to Message 38231.  


Yes. The information in the file relates to colors. It goes from a lot of colors (variation of temperatures) for a normal model, to one color for an iceworld.
... and the in-built file compression exploits the redundant repetition in areas of the same value (temperature, pressure, precipitation and cloud cover) - so the files get smaller.

The reason for the progressive reduction in file size is that the model initially fails at a single grid point and that failure spreads to the whole grid in two timesteps (in your case 0:95-100k, 1:69k, 2:7k). The drop to the final value, quite a number of steps later, results from sea ice becoming uniform over the whole ocean: more repetition, more redundancy, more compression. After that, nothing happens.


Thanks for the explanation.


The model froze at 184,334 in the third phase, which follows the pattern of all other crashes, whatever phase, whatever platform - i.e. the freeze occurs in the second timestep of a block of six. The significance of that? I haven't a clue.


Interesting. But you'll crack it sometime!

ID: 38233
Iain Inglis
Joined: 9 Jan 07
Posts: 467
Credit: 14,549,176
RAC: 317
Message 38235 - Posted: 1 Nov 2009, 0:34:59 UTC

David Glogau's model has come across nicely and establishes a new freeze point, north-east of the Canary Islands.

This shows that there is still considerable value in submitting Windows/Intel iceworlds, even though most of them do seem to pile up in the same place. (A Mac or Linux/AMD iceworld would nonetheless be of great interest because it would be the first to be looked at in this way, and would show whether fast-processing anomalies on those platforms have the same cause as iceworlds on Windows/Intel/AMD.)
ID: 38235
Iain Inglis
Joined: 9 Jan 07
Posts: 467
Credit: 14,549,176
RAC: 317
Message 38253 - Posted: 4 Nov 2009, 9:44:05 UTC - in response to Message 38208.  
Last modified: 4 Nov 2009, 9:55:20 UTC

PS I don't know whether the new batch of slabs turn into iceworlds. I guess we'll find out.

That question is now answered by two adjacent iceworlds from Les Bayliss, u71b and u71c, both from the new batch.

Both west coast crashes (points #20, #21).
ID: 38253
Matthias Lehmkuhl

Joined: 24 Sep 05
Posts: 7
Credit: 2,789,178
RAC: 2,197
Message 38259 - Posted: 5 Nov 2009, 8:51:46 UTC

Looks like I also got one.
hadsm3fub_keom_006431824_1 using hadsm3 version 607
On the temperature graphic it shows a blue world, and it looks like it needs much more than the 1.8 seconds for one timestep.
Temperature is -36 or -42
Precip is 0
Pressure is 950

resultid=9940773

By now I'm at Timestep 102366 of 259248 - Phase 3 of 3
Matthias
ID: 38259
Iain Inglis
Joined: 9 Jan 07
Posts: 467
Credit: 14,549,176
RAC: 317
Message 38260 - Posted: 5 Nov 2009, 9:39:37 UTC - in response to Message 38259.  

By now I'm at Timestep 102366 of 259248 - Phase 3 of 3

Matthias,

Welcome to the CPDN message board.

From what you say, it does seem to be an iceworld. The rate of progress has slowed dramatically and, since the model is in the final phase, it will not recover.

If this is your first iceworld and you have a backup, then you could restore the backup to see whether the model freezes again at the same place: they usually do. Otherwise, my advice is to abort the model and download another that will then progress at normal speed.

Iain
ID: 38260
Matthias Lehmkuhl

Joined: 24 Sep 05
Posts: 7
Credit: 2,789,178
RAC: 2,197
Message 38261 - Posted: 5 Nov 2009, 10:18:18 UTC - in response to Message 38260.  

By now I'm at Timestep 102366 of 259248 - Phase 3 of 3

Matthias,

Welcome to the CPDN message board.

From what you say, it does seem to be an iceworld. The rate of progress has slowed dramatically and, since the model is in the final phase, it will not recover.

If this is your first iceworld and you have a backup, then you could restore the backup to see whether the model freezes again at the same place: they usually do. Otherwise, my advice is to abort the model and download another that will then progress at normal speed.

Iain

Iain,
There is no backup of an older state, so I'll abort this one and try a new one.
Thanks for the fast answer and the welcome.
I made a copy of the model files, so if you need some feel free to contact me.


Matthias
ID: 38261
Don Nicholson

Joined: 31 Aug 04
Posts: 18
Credit: 13,882,347
RAC: 0
Message 38320 - Posted: 17 Nov 2009, 9:14:58 UTC

My graphics are showing totally blue on Hadsm3mh-kw93 006490731-3
I'm using an Intel Q6600.
Timestep is 254245 of 259248 on Phase 1
s/TS of 2.41
ID: 38320
Dave Peachey

Joined: 5 Aug 04
Posts: 11
Credit: 2,356,953
RAC: 0
Message 38335 - Posted: 20 Nov 2009, 22:21:56 UTC
Last modified: 20 Nov 2009, 22:42:13 UTC

Iain,

I've likely got another iceworld for you - hadsm3fub_kbz7_006472701 went blue somewhen before 35.5% complete so I've wound it back a ways (currently at just beyond 34%) and I'm re-running with recording switched on.

It will probably be a day or so until it hits the blue wall again (I didn't catch the exact point first time around) but a note of the email address to which to send the '.cpdn' file would facilitate a speedy upload of the appropriate file.

Cheers
Dave
ID: 38335
Dave Peachey

Joined: 5 Aug 04
Posts: 11
Credit: 2,356,953
RAC: 0
Message 38338 - Posted: 21 Nov 2009, 1:24:17 UTC

Iain,

Update:

Hah, caught it at timestep 11577 - I even had the graphics turned on just at the point it tripped over so was able to watch it go completely blue over a couple of timesteps.

Just to confirm it, I re-ran the last few timesteps (I was able to switch off the model before it did a checkpoint) and it froze at the same t/s three times straight. It seemed to spread from the US west coast as per others mentioned above.

Sorry, this is all a bit sad but it's the first time in years I've caught one blue-handed (so to speak)!

Ready if/when you are
Dave
ID: 38338