climateprediction.net home page
Posts by Aurum

1) Message boards : Number crunching : Upload server is out of disk space (Message 70030)
Posted 4 Nov 2023 by Aurum
Post:
Following a meeting yesterday, I've been asked by the CPDN Project Director to pass on a message.
If anyone is having problems with 'stuck uploads', then Abort the task. This batch is of questionable scientific quality because of the very high number of failures.
As this batch is now closed, no resends for any Aborted tasks will be sent out to others.
Problems on the server have been investigated. One of their disks filled completely resulting in a move of data, which may be causing the problem. The server will be looked at again before any more batches go out (hopefully in the next couple of weeks when folk return from holiday).
What batch?
Should I abort the 16 Linux HadSM4 WUs I received a week ago? The trickles seem to go through but the ULs do not.

Edit: I found my answer, "The following hadsm4 batches for the DOCILE project have now been closed: 937, 938, 939, 940, 941. These batches were issued in Nov/22."
2) Message boards : Number crunching : Upload server is out of disk space (Message 70029)
Posted 4 Nov 2023 by Aurum
Post:
Please DON'T do that. Instead, cancel the TRANSFER only - in the transfers tab - and the task should become 'ready to report'. Update the project as normal, and you'll get a lot of disk space freed up, without a blot on your account record.
Oops, clicked wrong Quote button. No delete post button.
3) Message boards : Number crunching : New work discussion - 2 (Message 69629)
Posted 17 Sep 2023 by Aurum
Post:
Dynex is a scam.
Certainly enough red flags that I wouldn't touch it with a barge pole. Not proven to be a scam but no way am I going near it.
I've run it for thousands of compute hours with no problem. Can't imagine what you imagine may happen to your barge pole?
I found many many sites saying Dynex is a scam. And none saying it was of any use.

I did try to install it, but when it kept on refusing to work, and AVG kept saying no, I sent it to virustotal, where almost every virus checker didn't like it. I'm not going to run a virus on my computer to please that 14 million dollar fraudster - https://www.reuters.com/article/us-sec-jumio-mattes-idUSKCN1RE260 . Other cryptocurrency doesn't flag up like that.

And your first sentence to me was "by definition all cryptocurrency is a scam", now you say this one isn't? Maybe you're Daniel?

I've only found two coins which aren't scams, Gridcoin and Curecoin. They give you coins for work you're already doing in Boinc and Folding@Home, not ask you to run mindless calculations for other people.
I found it was absolutely trivial to set up and run DNX. The early miners had bugs.
GRC and CURE are scams too. All supply and no demand. Nobody actually needs cryptocurrency. Someone should figure out a way to pay workers using real money so they cover their electric bills. Somebody will but crypto is not the answer.
4) Message boards : Number crunching : New work discussion - 2 (Message 69628)
Posted 17 Sep 2023 by Aurum
Post:
Do you run them on virtualbox or natively on linux?
I only run native Linux WUs and never use virtualbox.
5) Message boards : Number crunching : New work discussion - 2 (Message 69627)
Posted 17 Sep 2023 by Aurum
Post:
LHC's ATLAS tasks at 10GB are the biggest I know of. But that's 8 threads, so you don't get people trying to run huge numbers of them. Are yours going to be single threads?
If memory serves LHC ATLAS is not a legitimate multithreaded project, they just package multiple tasks in one WU. You can watch them end one-by-one until there's a single long runner with 7 idle CPUs. Milkyway has real multithreaded WUs but hardly uses any RAM. Problem with mt is they overload the L3 cache and slow down dramatically. MW works best at 3 CPUs.
6) Message boards : Number crunching : New work discussion - 2 (Message 69625)
Posted 17 Sep 2023 by Aurum
Post:
No other projects I know of run tasks with memory requirements this high, so it's not obvious how they will be received. Let's walk first before we run with this.
The 2 highest I'm running are LHC ATLAS and the einstein_O3MD1 Multi-Directional Gravitational Wave search, both using 2 GB RAM. Problem is they start off committing much less and it slowly grows to 2 GB. They'll let you run as many as you want. If you start too many you can freeze your computer, so take care to limit the number in your app_config.
That said I too have a number of computers that could give your big boys a field test.
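On the app_config limit mentioned above, here is a rough sketch. BOINC reads an app_config.xml file from the project's folder, and a max_concurrent element inside an app element caps how many tasks of that application run at once. The Python below only builds the text of such a file. The app name "ATLAS" and the limit of 2 are placeholders I picked for illustration, so check the real app name in client_state.xml before relying on it.

```python
import xml.etree.ElementTree as ET


def build_app_config(app_name: str, max_concurrent: int) -> str:
    """Return app_config.xml text that caps concurrent tasks for one app."""
    if max_concurrent < 1:
        raise ValueError("max_concurrent must be at least 1")
    root = ET.Element("app_config")
    app = ET.SubElement(root, "app")
    ET.SubElement(app, "name").text = app_name
    ET.SubElement(app, "max_concurrent").text = str(max_concurrent)
    return ET.tostring(root, encoding="unicode")


# "ATLAS" and 2 are illustrative placeholders, not values from this thread.
config_text = build_app_config("ATLAS", 2)
print(config_text)
```

Save the output as app_config.xml in that project's folder under the BOINC data directory, then tell the client to re-read its config files or restart it.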
7) Message boards : Number crunching : New work discussion - 2 (Message 69623)
Posted 17 Sep 2023 by Aurum
Post:
Glenn, I wonder if running climate simulations on the DynexSolve distributed neuromorphic computing network might be faster and/or more accurate. It's still a work in progress and they might welcome the extra load. Their devs are active on their Discord channel. https://dynexcoin.org/about
Not something I have heard of - I have no idea what a neuromorphic network is; I presume it's just a fancy name for a neural network implemented on GPUs. Can't say it's of interest.

There is no existing code for GPUs in any of the models that CPDN use. The forecasting centres are however working on including GPUs in the model codes. Try searching for 'atmospheric ocean models using GPU' and it will give hits.
"Thirty times faster" caught my eye from this search; hopefully CPDN will speed up.
Looking forward to getting some Linux WUs of any kind.
8) Message boards : Number crunching : New work discussion - 2 (Message 69622)
Posted 17 Sep 2023 by Aurum
Post:
Dynex is a scam.
Certainly enough red flags that I wouldn't touch it with a barge pole. Not proven to be a scam but no way am I going near it.
I've run it for thousands of compute hours with no problem. Can't imagine what you imagine may happen to your barge pole?
9) Message boards : Number crunching : New work discussion - 2 (Message 69621)
Posted 17 Sep 2023 by Aurum
Post:
Glenn, I wonder if running climate simulations on the DynexSolve distributed neuromorphic computing network might be faster and/or more accurate. It's still a work in progress and they might welcome the extra load. Their devs are active on their Discord channel. https://dynexcoin.org/about
https://www.virustotal.com/gui/file/ace9dd93beae65218cfc7abdbaf6d22e58e0075ed9b1cbf3fd76c153cd1c0eeb

https://medium.com/@ares_61826/why-we-believe-the-dynex-cryptocurrency-is-a-scam-from-sec-sanctioned-daniel-mattes-561bbabbd89a

Dynex is a scam.
By definition all cryptocurrency is a scam. The fatal flaw of Dynex is they created a new CaCaCoin making it subject to difficulty changes and miner profitability. Miners are fickle and the project will die when they can't cover their electric bills.
I read 4 of the philippics written by the anonymous conspiracy axe-grinder you linked to. Things I learned: the Dynex SAT solver works but he knows one that runs faster but doesn't say if that's in distributed computing mode, Dynex spent money to get where they are, there is no rugpull token, no premine and hence no scam. The anonymous conspiracy theorist merely seems to have a grudge against the person he thinks is behind a different mask than his.
I wonder if Mr Hucker thinks Robert F Kennedy, Jr is a polymath?
Somebody is going to make this concept work. A version was even the subject of a distant BOINC project.
10) Message boards : Number crunching : New work discussion - 2 (Message 69525)
Posted 24 Aug 2023 by Aurum
Post:
Weather & climate models are very non-linear. Small differences in numerical calculations can quickly cause big differences in runs of identical code on different hardware (many, many, years ago this was a topic of my PhD). There are places in the code where just a single bit difference is enough. For example, for a cloud to form the air must be saturated, so the code computes the saturation at each grid-point and compares it to the value needed for a cloud to form. A single bit difference in that comparison is all you need to have, or, not have a cloud form. A cloud makes significant changes to its local environment.

Differences in the numerics can come from different rounding in the processor, differences in numerical libraries the code might be linked to. Code errors that might be reading random memory locations can also cause small differences (maybe not enough to crash the model).
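To make the rounding point concrete, here is a toy sketch in Python. It is not CPDN or model code: the function cloud_forms and the 0.6 threshold are made up for illustration. It only shows that summing the same three numbers in a different order changes the last bit of the result, and that this is enough to flip a threshold comparison.

```python
def cloud_forms(relative_saturation: float, threshold: float = 0.6) -> bool:
    """Toy stand-in for a saturation test (hypothetical, not CPDN code)."""
    return relative_saturation > threshold


# The same three terms, summed in two different orders.
left_to_right = (0.1 + 0.2) + 0.3
right_to_left = 0.1 + (0.2 + 0.3)

print(left_to_right, cloud_forms(left_to_right))  # 0.6000000000000001 True
print(right_to_left, cloud_forms(right_to_left))  # 0.6 False
```

A compiler, math library, or processor that orders or rounds operations differently can produce this kind of last-bit difference from identical source code.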

There have been studies to look at this in the very early days of CPDN on the long-running climate models.
Glenn, I wonder if running climate simulations on the DynexSolve distributed neuromorphic computing network might be faster and/or more accurate. It's still a work in progress and they might welcome the extra load. Their devs are active on their Discord channel. https://dynexcoin.org/about
11) Message boards : Cafe CPDN : World Community Grid mostly down for 2 months while transitioning (Message 69524)
Posted 24 Aug 2023 by Aurum
Post:
IBM had plenty of problems with their grid and needed a team constantly putting on band-aids. TN-Grid, Denis, etc seem to run fine with just one person at the helm.
12) Message boards : Cafe CPDN : World Community Grid mostly down for 2 months while transitioning (Message 69522)
Posted 24 Aug 2023 by Aurum
Post:
I'm not convinced WCG can be blamed for this outage (or other hardware-related problems), and I sometimes wonder what sort of [software] bag of nails IBM handed over :-)
Cheers - Al.

I think the root cause of WCG's problems is the grid configuration. It serves no real benefit. Put MCM, SCC, OPN, etc on their own BOINC servers and most issues will vanish.

If your GPUs need something to do you might consider mining Dynex (DNX). It's a distributed neuromorphic computing project. It's still a work in progress. https://dynexcoin.org/
13) Message boards : Number crunching : New work discussion - 2 (Message 69111)
Posted 4 Jul 2023 by Aurum
Post:
These 3 WUs started at the same time and they're all at 36-38% progress. They're all running on the same Win7 i5-4690K quadcore CPU with nothing else running. I've restarted BOINC twice and the first was to upgrade to 7.22.2. Set to only allow a single file transfer.

wah2_eas25_a23h_200211_25_994_012218155_0
https://www.cpdn.org/result.php?resultid=22321423
wah2_eas25_a23h_200211_25_994_012218155_0 now has _12.zip and _restart.zip hung.
Looks like all 3 of my WUs will be failures.
14) Message boards : Number crunching : New work discussion - 2 (Message 69110)
Posted 4 Jul 2023 by Aurum
Post:
These 3 WUs started at the same time and they're all at 36-38% progress. They're all running on the same Win7 i5-4690K quadcore CPU with nothing else running. I've restarted BOINC twice and the first was to upgrade to 7.22.2. Set to only allow a single file transfer. Might there be a clue in one working properly and two not, or is it just random?

wah2_eas25_a21o_200211_25_994_012218090_0
https://www.cpdn.org/result.php?resultid=22321358
wah2_eas25_a21o_200211_25_994_012218090_0_r98403190_1.zip 1.136 121210.47 K 00:18:38 - 197:37:54 85.08 KBps Uploading
wah2_eas25_a21o_200211_25_994_012218090_0_r98403190_5.zip 90.622 121400.95 K 00:31:46 - 105:43:03 0.00 KBps Upload pending (Retry in: 03:05:03), retried: 62
slot 4: Transferred _9.zip today with _1.zip & _5.zip still hung after a BOINC restart.

wah2_eas25_a23h_200211_25_994_012218155_0
https://www.cpdn.org/result.php?resultid=22321423
slot 6: This WU has transferred 9 zips as of this morning with none hanging.

wah2_eas25_a342_201111_25_994_012219472_0
https://www.cpdn.org/result.php?resultid=22322764
wah2_eas25_a342_201111_25_994_012219472_0_r523039799_4.zip 0.000 120899.03 K 00:24:22 - 152:14:17 0.00 KBps Upload pending (Retry in: 02:55:04), retried: 55
slot 5: Transferred _9.zip yesterday with _4.zip still hung after a BOINC restart.


wah2_eas25_a21o_200211_25_994_012218090_0 failed with an error while computing.
wah2_eas25_a23h_200211_25_994_012218155_0 is still running nicely with no hung ULs.
wah2_eas25_a342_201111_25_994_012219472_0 ULed _12.zip and _restart.zip today but _4.zip still refuses to UL.
The lack of feedback suggests they'll just run to failure with no fix forthcoming.
15) Message boards : Number crunching : New work discussion - 2 (Message 69081)
Posted 2 Jul 2023 by Aurum
Post:
These 3 WUs started at the same time and they're all at 36-38% progress. They're all running on the same Win7 i5-4690K quadcore CPU with nothing else running. I've restarted BOINC twice and the first was to upgrade to 7.22.2. Set to only allow a single file transfer. Might there be a clue in one working properly and two not, or is it just random?

wah2_eas25_a21o_200211_25_994_012218090_0
https://www.cpdn.org/result.php?resultid=22321358
wah2_eas25_a21o_200211_25_994_012218090_0_r98403190_1.zip 1.136 121210.47 K 00:18:38 - 197:37:54 85.08 KBps Uploading
wah2_eas25_a21o_200211_25_994_012218090_0_r98403190_5.zip 90.622 121400.95 K 00:31:46 - 105:43:03 0.00 KBps Upload pending (Retry in: 03:05:03), retried: 62
slot 4: Transferred _9.zip today with _1.zip & _5.zip still hung after a BOINC restart.

wah2_eas25_a23h_200211_25_994_012218155_0
https://www.cpdn.org/result.php?resultid=22321423
slot 6: This WU has transferred 9 zips as of this morning with none hanging.

wah2_eas25_a342_201111_25_994_012219472_0
https://www.cpdn.org/result.php?resultid=22322764
wah2_eas25_a342_201111_25_994_012219472_0_r523039799_4.zip 0.000 120899.03 K 00:24:22 - 152:14:17 0.00 KBps Upload pending (Retry in: 02:55:04), retried: 55
slot 5: Transferred _9.zip yesterday with _4.zip still hung after a BOINC restart.

02-Jul-2023 08:54:14 [climateprediction.net] Started upload of wah2_eas25_a342_201111_25_994_012219472_0_r523039799_4.zip
02-Jul-2023 08:54:15 [climateprediction.net] [http] [ID#122] Info: Trying 141.223.16.156:80...
02-Jul-2023 08:54:15 [climateprediction.net] [http] [ID#122] Info: Connected to upload7.cpdn.org (141.223.16.156) port 80 (#31)
02-Jul-2023 08:54:15 [climateprediction.net] [http] [ID#122] Sent header to server: POST /cgi-bin/file_upload_handler HTTP/1.1
02-Jul-2023 08:54:15 [climateprediction.net] [http] [ID#122] Sent header to server: Host: upload7.cpdn.org
02-Jul-2023 08:54:15 [climateprediction.net] [http] [ID#122] Sent header to server: User-Agent: BOINC client (windows_x86_64 7.22.2)
02-Jul-2023 08:54:15 [climateprediction.net] [http] [ID#122] Sent header to server: Accept: */*
02-Jul-2023 08:54:15 [climateprediction.net] [http] [ID#122] Sent header to server: Accept-Encoding: deflate, gzip
02-Jul-2023 08:54:15 [climateprediction.net] [http] [ID#122] Sent header to server: Accept-Language: en_US
02-Jul-2023 08:54:15 [climateprediction.net] [http] [ID#122] Sent header to server: Content-Length: 311
02-Jul-2023 08:54:15 [climateprediction.net] [http] [ID#122] Sent header to server: Content-Type: application/x-www-form-urlencoded
02-Jul-2023 08:54:15 [climateprediction.net] [http] [ID#122] Sent header to server:
02-Jul-2023 08:54:15 [climateprediction.net] [http] [ID#122] Info: We are completely uploaded and fine
02-Jul-2023 08:54:15 [climateprediction.net] [http] [ID#122] Received header from server: HTTP/1.1 200 OK
02-Jul-2023 08:54:15 [climateprediction.net] [http] [ID#122] Received header from server: Date: Sun, 02 Jul 2023 16:04:16 GMT
02-Jul-2023 08:54:15 [climateprediction.net] [http] [ID#122] Received header from server: Server: Apache/2.2.3 (CentOS)
02-Jul-2023 08:54:15 [climateprediction.net] [http] [ID#122] Received header from server: Transfer-Encoding: chunked
02-Jul-2023 08:54:15 [climateprediction.net] [http] [ID#122] Received header from server: Content-Type: text/plain; charset=UTF-8
02-Jul-2023 08:54:15 [climateprediction.net] [http] [ID#122] Received header from server:
02-Jul-2023 08:54:15 [climateprediction.net] [http] [ID#122] Received header from server: 64
02-Jul-2023 08:54:15 [climateprediction.net] [http] [ID#122] Received header from server: <data_server_reply>
02-Jul-2023 08:54:15 [climateprediction.net] [http] [ID#122] Received header from server: <status>0</status>
02-Jul-2023 08:54:15 [climateprediction.net] [http] [ID#122] Received header from server: <file_size>87031808</file_size>
02-Jul-2023 08:54:15 [climateprediction.net] [http] [ID#122] Received header from server: </data_server_reply>
02-Jul-2023 08:54:15 [climateprediction.net] [http] [ID#122] Received header from server:
02-Jul-2023 08:54:15 [climateprediction.net] [http] [ID#122] Info: Connection #31 to host upload7.cpdn.org left intact
02-Jul-2023 08:54:15 [climateprediction.net] [http] [ID#122] Info: Found bundle for host: 0x32d7a70 [serially]
02-Jul-2023 08:54:15 [climateprediction.net] [http] [ID#122] Info: Re-using existing connection #31 with host upload7.cpdn.org
02-Jul-2023 08:54:15 [climateprediction.net] [http] [ID#122] Sent header to server: POST /cgi-bin/file_upload_handler HTTP/1.1
02-Jul-2023 08:54:15 [climateprediction.net] [http] [ID#122] Sent header to server: Host: upload7.cpdn.org
02-Jul-2023 08:54:15 [climateprediction.net] [http] [ID#122] Sent header to server: User-Agent: BOINC client (windows_x86_64 7.22.2)
02-Jul-2023 08:54:15 [climateprediction.net] [http] [ID#122] Sent header to server: Accept: */*
02-Jul-2023 08:54:15 [climateprediction.net] [http] [ID#122] Sent header to server: Accept-Encoding: deflate, gzip
02-Jul-2023 08:54:15 [climateprediction.net] [http] [ID#122] Sent header to server: Accept-Language: en_US
02-Jul-2023 08:54:15 [climateprediction.net] [http] [ID#122] Sent header to server: Content-Length: 36769291
02-Jul-2023 08:54:15 [climateprediction.net] [http] [ID#122] Sent header to server: Content-Type: application/x-www-form-urlencoded
02-Jul-2023 08:54:15 [climateprediction.net] [http] [ID#122] Sent header to server: Expect: 100-continue
02-Jul-2023 08:54:15 [climateprediction.net] [http] [ID#122] Sent header to server:
02-Jul-2023 08:54:16 [climateprediction.net] [http] [ID#122] Received header from server: HTTP/1.1 100 Continue
02-Jul-2023 08:54:36 [climateprediction.net] [http] [ID#122] Info: Recv failure: Connection was reset
02-Jul-2023 08:54:36 [climateprediction.net] [http] [ID#122] Info: Closing connection 31
02-Jul-2023 08:54:36 [climateprediction.net] [http] HTTP error: Failure when receiving data from the peer
02-Jul-2023 08:54:36 [climateprediction.net] Temporarily failed upload of wah2_eas25_a342_201111_25_994_012219472_0_r523039799_4.zip: transient HTTP error
02-Jul-2023 08:54:36 [climateprediction.net] Backing off 05:49:49 on upload of wah2_eas25_a342_201111_25_994_012219472_0_r523039799_4.zip
02-Jul-2023 08:54:36 [climateprediction.net] Started upload of wah2_eas25_a21o_200211_25_994_012218090_0_r98403190_5.zip
02-Jul-2023 08:54:37 [climateprediction.net] [http] [ID#124] Info: Hostname upload7.cpdn.org was found in DNS cache
02-Jul-2023 08:54:37 [climateprediction.net] [http] [ID#124] Info: Trying 141.223.16.156:80...
02-Jul-2023 08:54:58 [climateprediction.net] [http] [ID#124] Info: connect to 141.223.16.156 port 80 failed: Timed out
02-Jul-2023 08:54:58 [climateprediction.net] [http] [ID#124] Info: Failed to connect to upload7.cpdn.org port 80 after 21303 ms: Couldn't connect to server
02-Jul-2023 08:54:58 [climateprediction.net] [http] [ID#124] Info: Closing connection 32
02-Jul-2023 08:54:58 [climateprediction.net] [http] HTTP error: Timeout was reached
02-Jul-2023 08:54:58 [climateprediction.net] Temporarily failed upload of wah2_eas25_a21o_200211_25_994_012218090_0_r98403190_5.zip: transient HTTP error
02-Jul-2023 08:54:58 [climateprediction.net] Backing off 05:18:52 on upload of wah2_eas25_a21o_200211_25_994_012218090_0_r98403190_5.zip
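For anyone wanting to keep track of which uploads are stuck, here is a small sketch that pulls the "Backing off" lines out of an event log like the one above. It assumes the log lines keep exactly the wording shown in this post; the function name parse_backoffs is my own, not anything from BOINC.

```python
import re
from datetime import timedelta

# Matches e.g. "Backing off 05:49:49 on upload of some_file.zip"
BACKOFF_RE = re.compile(r"Backing off (\d+):(\d{2}):(\d{2}) on upload of (\S+)")


def parse_backoffs(log_text: str) -> dict:
    """Map each file name to the most recent back-off interval in the log."""
    backoffs = {}
    for line in log_text.splitlines():
        match = BACKOFF_RE.search(line)
        if match:
            hours, minutes, seconds, filename = match.groups()
            backoffs[filename] = timedelta(
                hours=int(hours), minutes=int(minutes), seconds=int(seconds)
            )
    return backoffs


# Two lines copied from the log in this post.
sample_log = """\
02-Jul-2023 08:54:36 [climateprediction.net] Backing off 05:49:49 on upload of wah2_eas25_a342_201111_25_994_012219472_0_r523039799_4.zip
02-Jul-2023 08:54:58 [climateprediction.net] Backing off 05:18:52 on upload of wah2_eas25_a21o_200211_25_994_012218090_0_r98403190_5.zip
"""

result = parse_backoffs(sample_log)
for name, delay in result.items():
    print(f"{name}: retry in {delay}")
```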
16) Message boards : Number crunching : New work discussion - 2 (Message 69019)
Posted 27 Jun 2023 by Aurum
Post:
Looking forward to it. Which app would that be?
Weather at Home Windows tasks.


About how much RAM per task will be required?

My Win7 is running three 8.24 wah2 using 423 MB each.
17) Message boards : Number crunching : New work discussion - 2 (Message 68867)
Posted 7 Jun 2023 by Aurum
Post:
The researcher for the EAS tasks has discussed the results with her professor and there were some concerns about the spin up results but they have decided everything is within range and the mainsite tasks will be released, "very soon."

Looking forward to it. Which app would that be?
18) Message boards : Number crunching : Big credit jump! (Message 68824)
Posted 28 May 2023 by Aurum
Post:
The HADAM4 task I am running has just sent its first data and trickle uploads. I see that the new credit script has instantly granted the credit and it is commensurate with the amount of credit granted for these tasks in the past, i.e. 1/5 of the total credit for a 5 month N216 resolution task of that type. It is good to see the credit script works.

Me too. I've got 3 HADAM4 WUs running and one sequestered on a retired computer that I'll either finish or abort.
19) Message boards : Number crunching : Server Status page questions (Message 68606)
Posted 19 Mar 2023 by Aurum
Post:
What is happening with these work units?
In the case of the region independent tasks, I doubt anything is happening. The research that used these is long finished. However I do see the very occasional user returning one on the server status page. CPDN has in the past granted credit for work done after the deadline. At a time when tasks even on a reasonably fast machine of the day could take over six months I don't think this was unreasonable. I hope this is not happening on more recent tasks but I have no idea whether it is or not.

Which applications exactly are the "region independent tasks?"
I keep getting hadam4 WUs and I sure do NOT want to waste my electric bill on useless garbage.
If there are obsolete WUs circulating then the project should issue "server aborts" for all of them and clear the decks of the flotsam and jetsam.
20) Message boards : Number crunching : New work discussion - 2 (Message 66069)
Posted 7 Sep 2022 by Aurum
Post:
Will the new work have user-friendly checkpointing?
I sure would love to run climate & weather models. I searched for "checkpoint" and found nothing about it.
As I recall checkpoints were 4 hours or so apart. That makes it very difficult to deal with heatwaves and TOU metering.
Also looks like CPDN wants to use every CPU thread on your computer. Hopefully they'll fix that bug too.

Edit: Found some info but not sure how old it is: https://www.climateprediction.net/getting-started/support/technical-faq/#no_tasks_available
How long does a Timestep take in real time?
"A Timestep represents a 1/2 hour of model time (not realtime)."
"Climateprediction.net checkpoints every 144 Timesteps..."

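Working through the two FAQ figures quoted above: 144 timesteps of half an hour each is 72 hours, or 3 days, of model time between checkpoints. The real-time gap depends on how long your machine takes per timestep, which the FAQ quote does not give. The 100 seconds per timestep below is purely an assumption, chosen because it matches my recollection of checkpoints being about 4 hours apart.

```python
TIMESTEP_MODEL_HOURS = 0.5      # from the FAQ quote: one timestep = 1/2 hour of model time
TIMESTEPS_PER_CHECKPOINT = 144  # from the FAQ quote

model_hours_per_checkpoint = TIMESTEPS_PER_CHECKPOINT * TIMESTEP_MODEL_HOURS
model_days_per_checkpoint = model_hours_per_checkpoint / 24


def wall_clock_hours_per_checkpoint(seconds_per_timestep: float) -> float:
    """Real hours between checkpoints for a given (host-dependent) timestep cost."""
    return TIMESTEPS_PER_CHECKPOINT * seconds_per_timestep / 3600


print(model_days_per_checkpoint)               # 3.0 model days
print(wall_clock_hours_per_checkpoint(100.0))  # 4.0 hours, assuming 100 s per timestep
```
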
How do we make backups of a WU in-progress?
"More worrying is that a computation error loses more work. What is the appropriate reaction to this? Complaining is unlikely to be useful as trying to make the Work Unit smaller has been considered and rejected as not practical. A better reaction would be to decide to make a backup from time to time so if you do suffer an error, you can recover without losing too much work."

