Posts by Bastian Baum

1) Message boards : Number crunching : Completing a WU? Impossible. What am i doing wrong? (Message 70011)
Posted 29 Oct 2023 by Bastian Baum
Post:
But if it is a bug in the code, there is something that puzzles me.

Recently I got a task from Batch 994 on one of my old systems (Intel Core2 Duo CPU T5900 from 2008, 32-bit architecture + Windows 10 64-bit + BOINC 7.22.2). Of course it is a slow system, but I finished the task successfully in 40 days and 15 hours, with 34+ Windows restarts!
I monitor some apps with this system during the day and usually shut it down every evening.
You can count the "Quit request from BOINC" entries in the stderr.txt file here: https://www.cpdn.org/result.php?resultid=22326043
Now I have another task, from Batch 996, on the same system, and so far I have shut down and restarted the system 8 or 9 times.
This task seems to be stable, too. Look here: https://www.cpdn.org/result.php?resultid=22347602
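
For anyone who wants to count the restarts in a saved copy of such a stderr.txt instead of counting by eye, here is a minimal Python sketch. It assumes only that each shutdown leaves one line containing the text "Quit request from BOINC"; the function name and the sample text are made up for illustration and are not real task output.

```python
# Minimal sketch: count BOINC quit requests in the text of a stderr.txt.
# Assumption: every shutdown writes one line containing the marker below.
MARKER = "Quit request from BOINC"


def count_quit_requests(stderr_text: str) -> int:
    """Return the number of lines that contain the quit-request marker."""
    return sum(1 for line in stderr_text.splitlines() if MARKER in line)


# Hypothetical sample text, only to show the call; not copied from a real task.
sample = (
    "some model output\n"
    "Quit request from BOINC\n"
    "more model output\n"
    "Quit request from BOINC\n"
)
print(count_quit_requests(sample))
```

To use it on a real result, paste the stderr text from the result page into a local file and pass its contents to `count_quit_requests`.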

This is just a subjective impression, but the newer my systems are, the higher the rate of tasks that crash on restart.
I have a small server with an Intel Xeon Scalable CPU from 2019 with some VMs on it, and it looks as if I will finish only 3 to 6 of the roughly 40 tasks from Batch 996 that I received (a loss rate of 85 to 92.5 % over two Windows Update restarts).

My newest system, with an AMD Ryzen 5 5625U CPU from 2023 (Windows 11 + BOINC 7.24.1), got 14 tasks, and I lost all of them during the first restart (loss rate: 100 %).
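
The loss rates quoted above are simply the share of received tasks that did not finish. As a rough check, a small Python sketch (the function name `loss_rate` is just an illustrative choice; the task counts are the ones from this post):

```python
# Minimal sketch: loss rate = share of received tasks that did not finish.
def loss_rate(received: int, finished: int) -> float:
    """Return the percentage of received tasks that were lost."""
    if received <= 0:
        raise ValueError("received must be positive")
    return 100.0 * (received - finished) / received


# Xeon server: about 40 tasks received, 3 to 6 expected to finish.
print(loss_rate(40, 6))   # best case
print(loss_rate(40, 3))   # worst case
# Ryzen laptop: 14 tasks received, none survived the first restart.
print(loss_rate(14, 0))
```

With only about 40 and 14 tasks per machine, these percentages are rough indications rather than reliable failure rates.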

Of course I know that a proper analysis is more complex than just looking at loss rates and CPUs; I ignored the RAM, for example. Maybe the very small sample of about 60 tasks is misleading me, and I was simply lucky with the tasks on my older machine. But I am wondering why tasks can survive 34 or more restarts from checkpoints on a slow machine, yet crash on newer, faster machines.
If there is a structural bug in the code, shouldn't it affect all systems at a nearly equal rate?
Or do the restart crashes depend on how fast the files are loaded into RAM, or is "old" app code not compatible with "new" system architectures (32-bit vs. 64-bit, for example)?
Would it be a temporary workaround for users to run the tasks on older systems, to avoid too many crashes of the recent unstable tasks?




©2024 climateprediction.net