climateprediction.net home page
New work discussion - 2

Dave Jackson
Volunteer moderator

Joined: 15 May 09
Posts: 4346
Credit: 16,541,921
RAC: 6,087
Message 66921 - Posted: 15 Dec 2022, 16:23:28 UTC - in response to Message 66919.  

I can upgrade one computer to boinc 7.20.5 using this ppa: https://launchpad.net/~costamagnagianfranco/+archive/ubuntu/boinc
I think that version is a development release. I guess it could cause other issues.
Gianfranco's versions of BOINC rarely cause problems. I have been running 7.21.0 compiled from source from GitHub and have yet to have problems with it. Every few weeks I do it afresh from the nightly build. (Occasionally I have had problems getting it to compile but that is another matter!)
ID: 66921
Richard Haselgrove

Joined: 1 Jan 07
Posts: 943
Credit: 34,305,695
RAC: 11,295
Message 66922 - Posted: 15 Dec 2022, 16:41:22 UTC - in response to Message 66919.  

I can upgrade one computer to boinc 7.20.5 using this ppa: https://launchpad.net/~costamagnagianfranco/+archive/ubuntu/boinc
I think that version is a development release. I guess it could cause other issues.
I'm running the same v7.20.5, from the same source. This is a very minor change - some security fixes for the latest Apple Mac OS, and a small bugfix for Linux and Windows for an error introduced during that Mac change.

Otherwise, it's exactly the same as the full release v7.20.2.
ID: 66922
Glenn Carver

Joined: 29 Oct 17
Posts: 807
Credit: 13,593,584
RAC: 7,495
Message 66923 - Posted: 15 Dec 2022, 16:48:43 UTC - in response to Message 66920.  

biodoc & cetus: could you please look in your /var/log/messages file around the time the task crashed? I am interested to see whether anything there reports that a process was killed due to a memory issue. There may not be anything, but some systems do log process kills caused by out-of-memory conditions, for example.

Thanks.

I also had a task that failed with that error, however the model did not finish:

https://www.cpdn.org/result.php?resultid=22250347

Exit status 9 (0x00000009) Unknown error code
...
12:03:35 STEP 973 H= 243:15 +CPU= 10.376
12:03:45 STEP 974 H= 243:30 +CPU= 10.186
12:03:56 STEP 975 H= 243:45 +CPU= 10.185
double free or corruption (out)
12:04:06 STEP 976 H= 244:00 +CPU= 10.574

</stderr_txt>

The same computer successfully completed 11 other jobs from the latest oifs batch. They were being run 6 at a time, with around 20 GB of free RAM available.
I'm certainly fine with test jobs being sent to it, if you want to.
ID: 66923
biodoc

Joined: 2 Oct 19
Posts: 21
Credit: 46,360,943
RAC: 13,614
Message 66924 - Posted: 15 Dec 2022, 18:07:22 UTC - in response to Message 66923.  

syslog of https://www.cpdn.org/result.php?resultid=22250486. This one crashed in the middle of a run. No useful information.
Dec 14 19:48:21 x32-linux3 boinc[1692]: 14-Dec-2022 19:48:21 [climateprediction.net] Started upload of oifs_43r3_bl_a054_2016092300_15_949_12166578_0_r1730349614_14.zip
Dec 14 19:48:34 x32-linux3 boinc[1692]: 14-Dec-2022 19:48:34 [climateprediction.net] Finished upload of oifs_43r3_bl_a054_2016092300_15_949_12166578_0_r1730349614_14.zip
Dec 14 19:48:37 x32-linux3 boinc[1692]: 14-Dec-2022 19:48:37 [climateprediction.net] Computation for task oifs_43r3_bl_a054_2016092300_15_949_12166578_0 finished

syslog of https://www.cpdn.org/result.php?resultid=22250622. Looks like most of the output files were missing.
Dec 15 07:38:08 x32-linux3 boinc[1692]: 15-Dec-2022 07:38:08 [climateprediction.net] Started upload of oifs_43r3_ps_1325_2021050100_123_946_12164414_2_r1007344185_42.zip
Dec 15 07:38:16 x32-linux3 boinc[1692]: 15-Dec-2022 07:38:16 [climateprediction.net] Finished upload of oifs_43r3_ps_1325_2021050100_123_946_12164414_2_r1007344185_42.zip
Dec 15 07:38:18 x32-linux3 boinc[1692]: 15-Dec-2022 07:38:18 [climateprediction.net] Computation for task oifs_43r3_ps_1325_2021050100_123_946_12164414_2 finished
Dec 15 07:38:18 x32-linux3 boinc[1692]: 15-Dec-2022 07:38:18 [climateprediction.net] Output file oifs_43r3_ps_1325_2021050100_123_946_12164414_2_r1007344185_43.zip for task oifs_43r3_ps
_1325_2021050100_123_946_12164414_2 absent
Dec 15 07:38:18 x32-linux3 boinc[1692]: 15-Dec-2022 07:38:18 [climateprediction.net] Output file oifs_43r3_ps_1325_2021050100_123_946_12164414_2_r1007344185_44.zip for task oifs_43r3_ps
_1325_2021050100_123_946_12164414_2 absent
Dec 15 07:38:18 x32-linux3 boinc[1692]: 15-Dec-2022 07:38:18 [climateprediction.net] Output file oifs_43r3_ps_1325_2021050100_123_946_12164414_2_r1007344185_45.zip for task oifs_43r3_ps
_1325_2021050100_123_946_12164414_2 absent
Dec 15 07:38:18 x32-linux3 boinc[1692]: 15-Dec-2022 07:38:18 [climateprediction.net] Output file oifs_43r3_ps_1325_2021050100_123_946_12164414_2_r1007344185_46.zip for task oifs_43r3_ps
_1325_2021050100_123_946_12164414_2 absent
Dec 15 07:38:18 x32-linux3 boinc[1692]: 15-Dec-2022 07:38:18 [climateprediction.net] Output file oifs_43r3_ps_1325_2021050100_123_946_12164414_2_r1007344185_47.zip for task oifs_43r3_ps
_1325_2021050100_123_946_12164414_2 absent
Dec 15 07:38:18 x32-linux3 boinc[1692]: 15-Dec-2022 07:38:18 [climateprediction.net] Output file oifs_43r3_ps_1325_2021050100_123_946_12164414_2_r1007344185_48.zip for task oifs_43r3_ps
_1325_2021050100_123_946_12164414_2 absent
Dec 15 07:38:18 x32-linux3 boinc[1692]: 15-Dec-2022 07:38:18 [climateprediction.net] Output file oifs_43r3_ps_1325_2021050100_123_946_12164414_2_r1007344185_49.zip for task oifs_43r3_ps
_1325_2021050100_123_946_12164414_2 absent
Dec 15 07:38:18 x32-linux3 boinc[1692]: 15-Dec-2022 07:38:18 [climateprediction.net] Output file oifs_43r3_ps_1325_2021050100_123_946_12164414_2_r1007344185_50.zip for task oifs_43r3_ps
_1325_2021050100_123_946_12164414_2 absent
Dec 15 07:38:18 x32-linux3 boinc[1692]: 15-Dec-2022 07:38:18 [climateprediction.net] Output file oifs_43r3_ps_1325_2021050100_123_946_12164414_2_r1007344185_51.zip for task oifs_43r3_ps
_1325_2021050100_123_946_12164414_2 absent
Dec 15 07:38:18 x32-linux3 boinc[1692]: 15-Dec-2022 07:38:18 [climateprediction.net] Output file oifs_43r3_ps_1325_2021050100_123_946_12164414_2_r1007344185_52.zip for task oifs_43r3_ps
_1325_2021050100_123_946_12164414_2 absent
Dec 15 07:38:18 x32-linux3 boinc[1692]: 15-Dec-2022 07:38:18 [climateprediction.net] Output file oifs_43r3_ps_1325_2021050100_123_946_12164414_2_r1007344185_53.zip for task oifs_43r3_ps
_1325_2021050100_123_946_12164414_2 absent
Dec 15 07:38:18 x32-linux3 boinc[1692]: 15-Dec-2022 07:38:18 [climateprediction.net] Output file oifs_43r3_ps_1325_2021050100_123_946_12164414_2_r1007344185_54.zip for task oifs_43r3_ps
_1325_2021050100_123_946_12164414_2 absent
Dec 15 07:38:18 x32-linux3 boinc[1692]: 15-Dec-2022 07:38:18 [climateprediction.net] Output file oifs_43r3_ps_1325_2021050100_123_946_12164414_2_r1007344185_55.zip for task oifs_43r3_ps
_1325_2021050100_123_946_12164414_2 absent
Dec 15 07:38:18 x32-linux3 boinc[1692]: 15-Dec-2022 07:38:18 [climateprediction.net] Output file oifs_43r3_ps_1325_2021050100_123_946_12164414_2_r1007344185_56.zip for task oifs_43r3_ps
_1325_2021050100_123_946_12164414_2 absent
Dec 15 07:38:18 x32-linux3 boinc[1692]: 15-Dec-2022 07:38:18 [climateprediction.net] Output file oifs_43r3_ps_1325_2021050100_123_946_12164414_2_r1007344185_57.zip for task oifs_43r3_ps
_1325_2021050100_123_946_12164414_2 absent
Dec 15 07:38:18 x32-linux3 boinc[1692]: 15-Dec-2022 07:38:18 [climateprediction.net] Output file oifs_43r3_ps_1325_2021050100_123_946_12164414_2_r1007344185_58.zip for task oifs_43r3_ps
_1325_2021050100_123_946_12164414_2 absent
Dec 15 07:38:18 x32-linux3 boinc[1692]: 15-Dec-2022 07:38:18 [climateprediction.net] Output file oifs_43r3_ps_1325_2021050100_123_946_12164414_2_r1007344185_59.zip for task oifs_43r3_ps
_1325_2021050100_123_946_12164414_2 absent
Dec 15 07:38:18 x32-linux3 boinc[1692]: 15-Dec-2022 07:38:18 [climateprediction.net] Output file oifs_43r3_ps_1325_2021050100_123_946_12164414_2_r1007344185_60.zip for task oifs_43r3_ps
_1325_2021050100_123_946_12164414_2 absent
Dec 15 07:38:18 x32-linux3 boinc[1692]: 15-Dec-2022 07:38:18 [climateprediction.net] Output file oifs_43r3_ps_1325_2021050100_123_946_12164414_2_r1007344185_61.zip for task oifs_43r3_ps
_1325_2021050100_123_946_12164414_2 absent
Dec 15 07:38:18 x32-linux3 boinc[1692]: 15-Dec-2022 07:38:18 [climateprediction.net] Output file oifs_43r3_ps_1325_2021050100_123_946_12164414_2_r1007344185_62.zip for task oifs_43r3_ps
_1325_2021050100_123_946_12164414_2 absent
Dec 15 07:38:18 x32-linux3 boinc[1692]: 15-Dec-2022 07:38:18 [climateprediction.net] Output file oifs_43r3_ps_1325_2021050100_123_946_12164414_2_r1007344185_63.zip for task oifs_43r3_ps
_1325_2021050100_123_946_12164414_2 absent
Dec 15 07:38:18 x32-linux3 boinc[1692]: 15-Dec-2022 07:38:18 [climateprediction.net] Output file oifs_43r3_ps_1325_2021050100_123_946_12164414_2_r1007344185_64.zip for task oifs_43r3_ps
_1325_2021050100_123_946_12164414_2 absent
Dec 15 07:38:18 x32-linux3 boinc[1692]: 15-Dec-2022 07:38:18 [climateprediction.net] Output file oifs_43r3_ps_1325_2021050100_123_946_12164414_2_r1007344185_65.zip for task oifs_43r3_ps
_1325_2021050100_123_946_12164414_2 absent
Dec 15 07:38:18 x32-linux3 boinc[1692]: 15-Dec-2022 07:38:18 [climateprediction.net] Output file oifs_43r3_ps_1325_2021050100_123_946_12164414_2_r1007344185_66.zip for task oifs_43r3_ps
_1325_2021050100_123_946_12164414_2 absent
Dec 15 07:38:18 x32-linux3 boinc[1692]: 15-Dec-2022 07:38:18 [climateprediction.net] Output file oifs_43r3_ps_1325_2021050100_123_946_12164414_2_r1007344185_67.zip for task oifs_43r3_ps
_1325_2021050100_123_946_12164414_2 absent
Dec 15 07:38:18 x32-linux3 boinc[1692]: 15-Dec-2022 07:38:18 [climateprediction.net] Output file oifs_43r3_ps_1325_2021050100_123_946_12164414_2_r1007344185_68.zip for task oifs_43r3_ps
_1325_2021050100_123_946_12164414_2 absent
Dec 15 07:38:18 x32-linux3 boinc[1692]: 15-Dec-2022 07:38:18 [climateprediction.net] Output file oifs_43r3_ps_1325_2021050100_123_946_12164414_2_r1007344185_69.zip for task oifs_43r3_ps
_1325_2021050100_123_946_12164414_2 absent
Dec 15 07:38:18 x32-linux3 boinc[1692]: 15-Dec-2022 07:38:18 [climateprediction.net] Output file oifs_43r3_ps_1325_2021050100_123_946_12164414_2_r1007344185_70.zip for task oifs_43r3_ps
_1325_2021050100_123_946_12164414_2 absent
Dec 15 07:38:18 x32-linux3 boinc[1692]: 15-Dec-2022 07:38:18 [climateprediction.net] Output file oifs_43r3_ps_1325_2021050100_123_946_12164414_2_r1007344185_71.zip for task oifs_43r3_ps
_1325_2021050100_123_946_12164414_2 absent
Dec 15 07:38:18 x32-linux3 boinc[1692]: 15-Dec-2022 07:38:18 [climateprediction.net] Output file oifs_43r3_ps_1325_2021050100_123_946_12164414_2_r1007344185_72.zip for task oifs_43r3_ps
_1325_2021050100_123_946_12164414_2 absent
Dec 15 07:38:18 x32-linux3 boinc[1692]: 15-Dec-2022 07:38:18 [climateprediction.net] Output file oifs_43r3_ps_1325_2021050100_123_946_12164414_2_r1007344185_73.zip for task oifs_43r3_ps
_1325_2021050100_123_946_12164414_2 absent
Dec 15 07:38:18 x32-linux3 boinc[1692]: 15-Dec-2022 07:38:18 [climateprediction.net] Output file oifs_43r3_ps_1325_2021050100_123_946_12164414_2_r1007344185_74.zip for task oifs_43r3_ps
_1325_2021050100_123_946_12164414_2 absent
Dec 15 07:38:18 x32-linux3 boinc[1692]: 15-Dec-2022 07:38:18 [climateprediction.net] Output file oifs_43r3_ps_1325_2021050100_123_946_12164414_2_r1007344185_75.zip for task oifs_43r3_ps
_1325_2021050100_123_946_12164414_2 absent
Dec 15 07:38:18 x32-linux3 boinc[1692]: 15-Dec-2022 07:38:18 [climateprediction.net] Output file oifs_43r3_ps_1325_2021050100_123_946_12164414_2_r1007344185_76.zip for task oifs_43r3_ps
_1325_2021050100_123_946_12164414_2 absent
Dec 15 07:38:18 x32-linux3 boinc[1692]: 15-Dec-2022 07:38:18 [climateprediction.net] Output file oifs_43r3_ps_1325_2021050100_123_946_12164414_2_r1007344185_77.zip for task oifs_43r3_ps
_1325_2021050100_123_946_12164414_2 absent
Dec 15 07:38:18 x32-linux3 boinc[1692]: 15-Dec-2022 07:38:18 [climateprediction.net] Output file oifs_43r3_ps_1325_2021050100_123_946_12164414_2_r1007344185_78.zip for task oifs_43r3_ps
_1325_2021050100_123_946_12164414_2 absent
Dec 15 07:38:18 x32-linux3 boinc[1692]: 15-Dec-2022 07:38:18 [climateprediction.net] Output file oifs_43r3_ps_1325_2021050100_123_946_12164414_2_r1007344185_79.zip for task oifs_43r3_ps
_1325_2021050100_123_946_12164414_2 absent
Dec 15 07:38:18 x32-linux3 boinc[1692]: 15-Dec-2022 07:38:18 [climateprediction.net] Output file oifs_43r3_ps_1325_2021050100_123_946_12164414_2_r1007344185_80.zip for task oifs_43r3_ps
_1325_2021050100_123_946_12164414_2 absent
Dec 15 07:38:18 x32-linux3 boinc[1692]: 15-Dec-2022 07:38:18 [climateprediction.net] Output file oifs_43r3_ps_1325_2021050100_123_946_12164414_2_r1007344185_81.zip for task oifs_43r3_ps
_1325_2021050100_123_946_12164414_2 absent
Dec 15 07:38:18 x32-linux3 boinc[1692]: 15-Dec-2022 07:38:18 [climateprediction.net] Output file oifs_43r3_ps_1325_2021050100_123_946_12164414_2_r1007344185_82.zip for task oifs_43r3_ps
_1325_2021050100_123_946_12164414_2 absent
Dec 15 07:38:18 x32-linux3 boinc[1692]: 15-Dec-2022 07:38:18 [climateprediction.net] Output file oifs_43r3_ps_1325_2021050100_123_946_12164414_2_r1007344185_83.zip for task oifs_43r3_ps
_1325_2021050100_123_946_12164414_2 absent
Dec 15 07:38:18 x32-linux3 boinc[1692]: 15-Dec-2022 07:38:18 [climateprediction.net] Output file oifs_43r3_ps_1325_2021050100_123_946_12164414_2_r1007344185_84.zip for task oifs_43r3_ps
_1325_2021050100_123_946_12164414_2 absent
Dec 15 07:38:18 x32-linux3 boinc[1692]: 15-Dec-2022 07:38:18 [climateprediction.net] Output file oifs_43r3_ps_1325_2021050100_123_946_12164414_2_r1007344185_85.zip for task oifs_43r3_ps
_1325_2021050100_123_946_12164414_2 absent
Dec 15 07:38:18 x32-linux3 boinc[1692]: 15-Dec-2022 07:38:18 [climateprediction.net] Output file oifs_43r3_ps_1325_2021050100_123_946_12164414_2_r1007344185_86.zip for task oifs_43r3_ps
_1325_2021050100_123_946_12164414_2 absent
Dec 15 07:38:18 x32-linux3 boinc[1692]: 15-Dec-2022 07:38:18 [climateprediction.net] Output file oifs_43r3_ps_1325_2021050100_123_946_12164414_2_r1007344185_87.zip for task oifs_43r3_ps
_1325_2021050100_123_946_12164414_2 absent
Dec 15 07:38:18 x32-linux3 boinc[1692]: 15-Dec-2022 07:38:18 [climateprediction.net] Output file oifs_43r3_ps_1325_2021050100_123_946_12164414_2_r1007344185_88.zip for task oifs_43r3_ps
_1325_2021050100_123_946_12164414_2 absent
Dec 15 07:38:18 x32-linux3 boinc[1692]: 15-Dec-2022 07:38:18 [climateprediction.net] Output file oifs_43r3_ps_1325_2021050100_123_946_12164414_2_r1007344185_89.zip for task oifs_43r3_ps
_1325_2021050100_123_946_12164414_2 absent
Dec 15 07:38:18 x32-linux3 boinc[1692]: 15-Dec-2022 07:38:18 [climateprediction.net] Output file oifs_43r3_ps_1325_2021050100_123_946_12164414_2_r1007344185_90.zip for task oifs_43r3_ps
_1325_2021050100_123_946_12164414_2 absent
Dec 15 07:38:18 x32-linux3 boinc[1692]: 15-Dec-2022 07:38:18 [climateprediction.net] Output file oifs_43r3_ps_1325_2021050100_123_946_12164414_2_r1007344185_91.zip for task oifs_43r3_ps
_1325_2021050100_123_946_12164414_2 absent
Dec 15 07:38:18 x32-linux3 boinc[1692]: 15-Dec-2022 07:38:18 [climateprediction.net] Output file oifs_43r3_ps_1325_2021050100_123_946_12164414_2_r1007344185_92.zip for task oifs_43r3_ps
_1325_2021050100_123_946_12164414_2 absent
Dec 15 07:38:18 x32-linux3 boinc[1692]: 15-Dec-2022 07:38:18 [climateprediction.net] Output file oifs_43r3_ps_1325_2021050100_123_946_12164414_2_r1007344185_93.zip for task oifs_43r3_ps
_1325_2021050100_123_946_12164414_2 absent
Dec 15 07:38:18 x32-linux3 boinc[1692]: 15-Dec-2022 07:38:18 [climateprediction.net] Output file oifs_43r3_ps_1325_2021050100_123_946_12164414_2_r1007344185_94.zip for task oifs_43r3_ps
_1325_2021050100_123_946_12164414_2 absent
Dec 15 07:38:18 x32-linux3 boinc[1692]: 15-Dec-2022 07:38:18 [climateprediction.net] Output file oifs_43r3_ps_1325_2021050100_123_946_12164414_2_r1007344185_95.zip for task oifs_43r3_ps
_1325_2021050100_123_946_12164414_2 absent
Dec 15 07:38:18 x32-linux3 boinc[1692]: 15-Dec-2022 07:38:18 [climateprediction.net] Output file oifs_43r3_ps_1325_2021050100_123_946_12164414_2_r1007344185_96.zip for task oifs_43r3_ps
_1325_2021050100_123_946_12164414_2 absent
Dec 15 07:38:18 x32-linux3 boinc[1692]: 15-Dec-2022 07:38:18 [climateprediction.net] Output file oifs_43r3_ps_1325_2021050100_123_946_12164414_2_r1007344185_97.zip for task oifs_43r3_ps
_1325_2021050100_123_946_12164414_2 absent
Dec 15 07:38:18 x32-linux3 boinc[1692]: 15-Dec-2022 07:38:18 [climateprediction.net] Output file oifs_43r3_ps_1325_2021050100_123_946_12164414_2_r1007344185_98.zip for task oifs_43r3_ps
_1325_2021050100_123_946_12164414_2 absent
Dec 15 07:38:18 x32-linux3 boinc[1692]: 15-Dec-2022 07:38:18 [climateprediction.net] Output file oifs_43r3_ps_1325_2021050100_123_946_12164414_2_r1007344185_99.zip for task oifs_43r3_ps
_1325_2021050100_123_946_12164414_2 absent
Dec 15 07:38:18 x32-linux3 boinc[1692]: 15-Dec-2022 07:38:18 [climateprediction.net] Output file oifs_43r3_ps_1325_2021050100_123_946_12164414_2_r1007344185_100.zip for task oifs_43r3_p
s_1325_2021050100_123_946_12164414_2 absent
Dec 15 07:38:18 x32-linux3 boinc[1692]: 15-Dec-2022 07:38:18 [climateprediction.net] Output file oifs_43r3_ps_1325_2021050100_123_946_12164414_2_r1007344185_101.zip for task oifs_43r3_p
s_1325_2021050100_123_946_12164414_2 absent
Dec 15 07:38:18 x32-linux3 boinc[1692]: 15-Dec-2022 07:38:18 [climateprediction.net] Output file oifs_43r3_ps_1325_2021050100_123_946_12164414_2_r1007344185_102.zip for task oifs_43r3_p
s_1325_2021050100_123_946_12164414_2 absent
Dec 15 07:38:18 x32-linux3 boinc[1692]: 15-Dec-2022 07:38:18 [climateprediction.net] Output file oifs_43r3_ps_1325_2021050100_123_946_12164414_2_r1007344185_103.zip for task oifs_43r3_p
s_1325_2021050100_123_946_12164414_2 absent
Dec 15 07:38:18 x32-linux3 boinc[1692]: 15-Dec-2022 07:38:18 [climateprediction.net] Output file oifs_43r3_ps_1325_2021050100_123_946_12164414_2_r1007344185_104.zip for task oifs_43r3_p
s_1325_2021050100_123_946_12164414_2 absent
Dec 15 07:38:18 x32-linux3 boinc[1692]: 15-Dec-2022 07:38:18 [climateprediction.net] Output file oifs_43r3_ps_1325_2021050100_123_946_12164414_2_r1007344185_105.zip for task oifs_43r3_p
s_1325_2021050100_123_946_12164414_2 absent
Dec 15 07:38:18 x32-linux3 boinc[1692]: 15-Dec-2022 07:38:18 [climateprediction.net] Output file oifs_43r3_ps_1325_2021050100_123_946_12164414_2_r1007344185_106.zip for task oifs_43r3_p
s_1325_2021050100_123_946_12164414_2 absent
Dec 15 07:38:18 x32-linux3 boinc[1692]: 15-Dec-2022 07:38:18 [climateprediction.net] Output file oifs_43r3_ps_1325_2021050100_123_946_12164414_2_r1007344185_107.zip for task oifs_43r3_p
s_1325_2021050100_123_946_12164414_2 absent
Dec 15 07:38:18 x32-linux3 boinc[1692]: 15-Dec-2022 07:38:18 [climateprediction.net] Output file oifs_43r3_ps_1325_2021050100_123_946_12164414_2_r1007344185_108.zip for task oifs_43r3_p
s_1325_2021050100_123_946_12164414_2 absent
Dec 15 07:38:18 x32-linux3 boinc[1692]: 15-Dec-2022 07:38:18 [climateprediction.net] Output file oifs_43r3_ps_1325_2021050100_123_946_12164414_2_r1007344185_109.zip for task oifs_43r3_p
s_1325_2021050100_123_946_12164414_2 absent
Dec 15 07:38:18 x32-linux3 boinc[1692]: 15-Dec-2022 07:38:18 [climateprediction.net] Output file oifs_43r3_ps_1325_2021050100_123_946_12164414_2_r1007344185_110.zip for task oifs_43r3_p
s_1325_2021050100_123_946_12164414_2 absent
Dec 15 07:38:18 x32-linux3 boinc[1692]: 15-Dec-2022 07:38:18 [climateprediction.net] Output file oifs_43r3_ps_1325_2021050100_123_946_12164414_2_r1007344185_111.zip for task oifs_43r3_p
s_1325_2021050100_123_946_12164414_2 absent
Dec 15 07:38:18 x32-linux3 boinc[1692]: 15-Dec-2022 07:38:18 [climateprediction.net] Output file oifs_43r3_ps_1325_2021050100_123_946_12164414_2_r1007344185_112.zip for task oifs_43r3_p
s_1325_2021050100_123_946_12164414_2 absent
Dec 15 07:38:18 x32-linux3 boinc[1692]: 15-Dec-2022 07:38:18 [climateprediction.net] Output file oifs_43r3_ps_1325_2021050100_123_946_12164414_2_r1007344185_113.zip for task oifs_43r3_p
s_1325_2021050100_123_946_12164414_2 absent
Dec 15 07:38:18 x32-linux3 boinc[1692]: 15-Dec-2022 07:38:18 [climateprediction.net] Output file oifs_43r3_ps_1325_2021050100_123_946_12164414_2_r1007344185_114.zip for task oifs_43r3_p
s_1325_2021050100_123_946_12164414_2 absent
Dec 15 07:38:18 x32-linux3 boinc[1692]: 15-Dec-2022 07:38:18 [climateprediction.net] Output file oifs_43r3_ps_1325_2021050100_123_946_12164414_2_r1007344185_115.zip for task oifs_43r3_p
s_1325_2021050100_123_946_12164414_2 absent
Dec 15 07:38:18 x32-linux3 boinc[1692]: 15-Dec-2022 07:38:18 [climateprediction.net] Output file oifs_43r3_ps_1325_2021050100_123_946_12164414_2_r1007344185_116.zip for task oifs_43r3_p
s_1325_2021050100_123_946_12164414_2 absent
Dec 15 07:38:18 x32-linux3 boinc[1692]: 15-Dec-2022 07:38:18 [climateprediction.net] Output file oifs_43r3_ps_1325_2021050100_123_946_12164414_2_r1007344185_117.zip for task oifs_43r3_p
s_1325_2021050100_123_946_12164414_2 absent
Dec 15 07:38:18 x32-linux3 boinc[1692]: 15-Dec-2022 07:38:18 [climateprediction.net] Output file oifs_43r3_ps_1325_2021050100_123_946_12164414_2_r1007344185_118.zip for task oifs_43r3_p
s_1325_2021050100_123_946_12164414_2 absent
Dec 15 07:38:18 x32-linux3 boinc[1692]: 15-Dec-2022 07:38:18 [climateprediction.net] Output file oifs_43r3_ps_1325_2021050100_123_946_12164414_2_r1007344185_119.zip for task oifs_43r3_p
s_1325_2021050100_123_946_12164414_2 absent
Dec 15 07:38:18 x32-linux3 boinc[1692]: 15-Dec-2022 07:38:18 [climateprediction.net] Output file oifs_43r3_ps_1325_2021050100_123_946_12164414_2_r1007344185_120.zip for task oifs_43r3_p
s_1325_2021050100_123_946_12164414_2 absent
Dec 15 07:38:18 x32-linux3 boinc[1692]: 15-Dec-2022 07:38:18 [climateprediction.net] Output file oifs_43r3_ps_1325_2021050100_123_946_12164414_2_r1007344185_121.zip for task oifs_43r3_p
s_1325_2021050100_123_946_12164414_2 absent
Dec 15 07:38:18 x32-linux3 boinc[1692]: 15-Dec-2022 07:38:18 [climateprediction.net] Output file oifs_43r3_ps_1325_2021050100_123_946_12164414_2_r1007344185_122.zip for task oifs_43r3_p
s_1325_2021050100_123_946_12164414_2 absent
Dec 15 07:40:11 x32-linux3 boinc[1692]: 15-Dec-2022 07:40:11 [climateprediction.net] Started upload of oifs_43r3_bl_a004_2016092300_15_949_12166398_1_r1266389906_8.zip
Dec 15 07:40:24 x32-linux3 boinc[1692]: 15-Dec-2022 07:40:24 [climateprediction.net] Finished upload of oifs_43r3_bl_a004_2016092300_15_949_12166398_1_r1266389906_8.zip
Dec 15 07:42:08 x32-linux3 systemd[1]: Created slice system-systemd\x2dcoredump.slice.
Dec 15 07:42:08 x32-linux3 systemd[1]: Started Process Core Dump (PID 24512/UID 0).
Dec 15 07:42:10 x32-linux3 systemd-coredump[24513]: Core file was truncated to 2147483648 bytes.
Dec 15 07:42:11 x32-linux3 systemd-coredump[24513]: Process 23225 (oifs_43r3_model) of user 129 dumped core.#012#012Stack trace of thread 23225:#012#0  0x0000000001dc903b n/a (/var/lib/boinc-client/slots/0/oifs_43r3_model.exe (deleted) + 0x19c903b)
Dec 15 07:42:11 x32-linux3 systemd[1]: systemd-coredump@0-24512-0.service: Succeeded.
ID: 66924
cetus

Joined: 7 Aug 04
Posts: 9
Credit: 139,753,972
RAC: 19,927
Message 66925 - Posted: 15 Dec 2022, 18:19:24 UTC - in response to Message 66923.  

Glenn,
I looked in syslog, kern.log and the systemd journal, but did not see anything unusual while the job was running or when it ended.

The boinc log messages for when the job failed were:
Dec 14 12:04:05 hal boinc[2320]: 14-Dec-2022 12:04:05 [climateprediction.net] Finished upload of oifs_43r3_bl_a019_2016092300_15_949_12166439_0_r1529103669_9.zip
Dec 14 12:04:07 hal boinc[2320]: 14-Dec-2022 12:04:07 [climateprediction.net] Computation for task oifs_43r3_bl_a019_2016092300_15_949_12166439_0 finished
Dec 14 12:04:07 hal boinc[2320]: 14-Dec-2022 12:04:07 [climateprediction.net] Output file oifs_43r3_bl_a019_2016092300_15_949_12166439_0_r1529103669_10.zip for task oifs_43r3_bl_a019_2016092300_15_949_12166439_0 absent
Dec 14 12:04:07 hal boinc[2320]: 14-Dec-2022 12:04:07 [climateprediction.net] Output file oifs_43r3_bl_a019_2016092300_15_949_12166439_0_r1529103669_11.zip for task oifs_43r3_bl_a019_2016092300_15_949_12166439_0 absent
Dec 14 12:04:07 hal boinc[2320]: 14-Dec-2022 12:04:07 [climateprediction.net] Output file oifs_43r3_bl_a019_2016092300_15_949_12166439_0_r1529103669_12.zip for task oifs_43r3_bl_a019_2016092300_15_949_12166439_0 absent
Dec 14 12:04:07 hal boinc[2320]: 14-Dec-2022 12:04:07 [climateprediction.net] Output file oifs_43r3_bl_a019_2016092300_15_949_12166439_0_r1529103669_13.zip for task oifs_43r3_bl_a019_2016092300_15_949_12166439_0 absent
Dec 14 12:04:07 hal boinc[2320]: 14-Dec-2022 12:04:07 [climateprediction.net] Output file oifs_43r3_bl_a019_2016092300_15_949_12166439_0_r1529103669_14.zip for task oifs_43r3_bl_a019_2016092300_15_949_12166439_0 absent

ID: 66925
Richard Haselgrove

Joined: 1 Jan 07
Posts: 943
Credit: 34,305,695
RAC: 11,295
Message 66926 - Posted: 15 Dec 2022, 18:24:49 UTC - in response to Message 66917.  

Thanks Richard. That gives some reassurance about mixing versions.
To follow up on that:

At the moment, one of my machines is running one of your IFS_ps tasks built with API version 7.20.1, alongside a HadSM4 N144 built with API 7.9.0. They're getting along just fine.

It's a brutally simple but effective system. The compiler puts the text string API_VERSION_whatever into the compiled library, and the deployment script searches for that text in the finished executable, and copies it to the XML control file. That way, the BOINC client knows how to talk to the app via the correct library calls.
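
As a rough illustration of the "search for the text in the finished executable" step, here is a minimal Python sketch. It is not the actual BOINC deployment script: the function names are made up for this example, and the assumption that the marker looks like API_VERSION_ followed by a dotted version number is mine, inferred only from the description above.

```python
import re
from typing import Optional

# Assumed marker format (not confirmed against BOINC sources):
# the literal text "API_VERSION_" followed by a dotted version number.
API_VERSION_PATTERN = re.compile(rb"API_VERSION_(\d+(?:\.\d+)*)")


def find_api_version(binary: bytes) -> Optional[str]:
    """Return the first embedded API version found in raw bytes, or None."""
    match = API_VERSION_PATTERN.search(binary)
    return match.group(1).decode("ascii") if match else None


def find_api_version_in_file(path: str) -> Optional[str]:
    """Read an executable from disk and look for the embedded marker."""
    with open(path, "rb") as handle:
        return find_api_version(handle.read())


if __name__ == "__main__":
    # Stand-in for a compiled executable: arbitrary bytes around the marker.
    fake_executable = b"\x7fELF\x00junk\x00API_VERSION_7.20.1\x00more junk"
    print(find_api_version(fake_executable))
```

The value found this way is what would then be copied into the XML control file; how that copy is done is not shown here.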
ID: 66926
biodoc

Joined: 2 Oct 19
Posts: 21
Credit: 46,360,943
RAC: 13,614
Message 66927 - Posted: 15 Dec 2022, 18:26:46 UTC

I upgraded the BOINC client to 7.20.5 on the computer with the 2 errors to see if it's more reliable.
ID: 66927
Glenn Carver

Joined: 29 Oct 17
Posts: 807
Credit: 13,593,584
RAC: 7,495
Message 66930 - Posted: 16 Dec 2022, 9:53:41 UTC - in response to Message 66926.  

Thanks Richard. That gives some reassurance about mixing versions.
To follow up on that: It's a brutally simple but effective system. The compiler puts the text string API_VERSION_whatever into the compiled library, and the deployment script searches for that text in the finished executable, and copies it to the XML control file. That way, the BOINC client knows how to talk to the app via the correct library calls.
Ok. If you do hear any word on the grapevine that 7.20 is, shall we say, less trustworthy than earlier versions, let me know ;)
ID: 66930
Glenn Carver

Joined: 29 Oct 17
Posts: 807
Credit: 13,593,584
RAC: 7,495
Message 66932 - Posted: 16 Dec 2022, 10:18:22 UTC - in response to Message 66924.  

syslog of https://www.cpdn.org/result.php?resultid=22250622. Looks like most of the output files were missing.... (snipped)
biodoc & cetus, thanks for looking. This is exactly what I was hoping for. In biodoc's syslog (at the bottom), we have:
Dec 15 07:42:11 x32-linux3 systemd-coredump[24513]: Process 23225 (oifs_43r3_model) of user 129 dumped core.#012#012Stack trace of thread 23225:#012#0  0x0000000001dc903b n/a (/var/lib/boinc-client/slots/0/oifs_43r3_model.exe (deleted) + 0x19c903b)
This tells me it's the model process that's failing, and not the controlling wrapper code, which is very useful because up to now we've been assuming it was the controlling wrapper.
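
For anyone who wants to pull the same detail out of their own syslog, a minimal Python sketch follows. The regular expression is based only on the single systemd-coredump line quoted above, so treat the pattern and the function name as assumptions; other systemd versions or distributions may format the message differently.

```python
import re
from typing import Dict, Optional

# Pattern modelled on the one systemd-coredump line quoted in this thread.
COREDUMP_PATTERN = re.compile(
    r"systemd-coredump\[\d+\]: Process (?P<pid>\d+) \((?P<name>[^)]+)\) "
    r"of user (?P<uid>\d+) dumped core\."
)


def parse_coredump_line(line: str) -> Optional[Dict[str, str]]:
    """Return pid, process name and uid from a core dump log line, or None."""
    match = COREDUMP_PATTERN.search(line)
    return match.groupdict() if match else None


# Start of the log line from biodoc's syslog, as quoted above.
sample = (
    "Dec 15 07:42:11 x32-linux3 systemd-coredump[24513]: Process 23225 "
    "(oifs_43r3_model) of user 129 dumped core.#012#012Stack trace of thread 23225:"
)

if __name__ == "__main__":
    info = parse_coredump_line(sample)
    # The process name tells us which program crashed (model vs. wrapper).
    print(info["name"] if info else "no core dump message found")
```

The process name in brackets is the useful part: here it is the model executable rather than the wrapper.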

I've never seen the model fail like this on my machines, nor on the machines attached to CPDN's development test site. I wonder if it's hardware related, as this failed on biodoc's 5950X. I only have a small AMD box to test on, and I develop on Intel.

The missing files were caused by a bug that was corrected before the latest batch of the oifs_43r3_bl app went out.

Thanks again.
ID: 66932
biodoc

Joined: 2 Oct 19
Posts: 21
Credit: 46,360,943
RAC: 13,614
Message 66933 - Posted: 16 Dec 2022, 10:48:30 UTC - in response to Message 66932.  


I've never seen the model fail like this on my machines, nor on the machines attached to CPDN's development test site. I wonder if it's hardware related, as this failed on biodoc's 5950X. I only have a small AMD box to test on, and I develop on Intel.

Thanks again.


Actually that computer is a 3950X which is also AMD so your point is taken.
ID: 66933
Glenn Carver

Joined: 29 Oct 17
Posts: 807
Credit: 13,593,584
RAC: 7,495
Message 66934 - Posted: 16 Dec 2022, 11:28:57 UTC - in response to Message 66933.  
Last modified: 16 Dec 2022, 11:47:56 UTC

I've never seen the model fail like this on my machines, nor on the machines attached to CPDN's development test site. I wonder if it's hardware related, as this failed on biodoc's 5950X. I only have a small AMD box to test on, and I develop on Intel.
Actually that computer is a 3950X which is also AMD so your point is taken.
I stand corrected. What does puzzle me is why there was no stack trace in the log returned with the task. Something else to look into.

There may also be two errors here, because one of your logs showed the model finishing normally and only at the very end did we see the double free corruption. If you do see any more 'double free' errors, please do check in the syslog for any core dump messages; that would be very helpful.

Overall this batch of 250 tasks ran considerably better than previous batches, with only an 8% error rate. I believe there are about 6500 tasks of the oifs_43r3_bl app & 39000 tasks of the oifs_43r3_ps app ready to go once CPDN are happy with this test batch. Hopefully soon.
ID: 66934
Jean-David Beyer

Joined: 5 Aug 04
Posts: 1061
Credit: 16,545,793
RAC: 2,287
Message 66935 - Posted: 16 Dec 2022, 12:33:59 UTC - in response to Message 66934.  
Last modified: 16 Dec 2022, 12:37:35 UTC

Overall this batch of 250 tasks ran considerably better than previous batches, with only an 8% error rate. I believe there are about 6500 tasks of the oifs_43r3_bl app & 39000 tasks of the oifs_43r3_ps app ready to go once CPDN are happy with this test batch. Hopefully soon.


My machine is ready to go, running 3 of each all at the same time, once I get them. So far, all the tasks I have run have completed successfully. This may be too many to run at once from a performance standpoint, but we will see.

Good time to send them out, too, since Rosetta's download server has been down for days and I ran out of work from them two days ago.
ID: 66935
Mr. P Hucker

Joined: 9 Oct 20
Posts: 690
Credit: 4,391,754
RAC: 6,918
Message 66936 - Posted: 16 Dec 2022, 15:13:23 UTC - in response to Message 66935.  
Last modified: 16 Dec 2022, 15:14:15 UTC

Good time to send them out, too, since Rosetta's download server has been down for days and I ran out of work from them two days ago.
They have 6.5 million queued on the server, and I got some 1, 2, 3, and 8 hours ago, plus the odd one every day for the last week (though they did not have much to send then).
ID: 66936
xii5ku

Joined: 27 Mar 21
Posts: 79
Credit: 78,302,757
RAC: 1,077
Message 66939 - Posted: 17 Dec 2022, 16:32:57 UTC
Last modified: 17 Dec 2022, 16:38:19 UTC

Uploads are not working very well for me currently (and weren't last week either):

I am currently running three "OpenIFS 43r3 Perturbed Surface v1.05" tasks in parallel. (Those are replica tasks from workunits with earlier error results, a.k.a. resends.) From the result file output of these few tasks alone, I am seeing very frequent "transient HTTP error" events. Upload server is upload11.cpdn.org.

Secondary problem: When the client retries failed uploads, it receives "Error reported by file upload server: [file name] locked by file_upload_handler PID=12345" on the first several retries.
ID: 66939 · Report as offensive
Profile Dave Jackson
Volunteer moderator

Send message
Joined: 15 May 09
Posts: 4346
Credit: 16,541,921
RAC: 6,087
Message 66940 - Posted: 17 Dec 2022, 16:36:28 UTC - in response to Message 66939.  

Secondary problem: When the client retries failed uploads, it receives "Error reported by file upload server: [file name] locked by file_upload_handler PID=12345" on the first several retries.


The "locked by file_upload_handler" message means, I think, that the process managing the initial try hasn't let go of the file yet. Are the files getting through eventually? If the problem persists we can chase it on Monday, when I am guessing Andy will be in.
ID: 66940 · Report as offensive
xii5ku

Send message
Joined: 27 Mar 21
Posts: 79
Credit: 78,302,757
RAC: 1,077
Message 66941 - Posted: 17 Dec 2022, 16:39:53 UTC - in response to Message 66940.  
Last modified: 17 Dec 2022, 16:43:10 UTC

Dave Jackson wrote:
The "locked by file_upload_handler" message means, I think, that the process managing the initial try hasn't let go of the file yet. Are the files getting through eventually? If the problem persists we can chase it on Monday, when I am guessing Andy will be in.
Yes, files get through eventually. It takes them a good while, but the overall backlog is not increasing in the long run.

(Edit: some files upload without errors on the first try, but quite a few don't.)
ID: 66941 · Report as offensive
Profile Dave Jackson
Volunteer moderator

Send message
Joined: 15 May 09
Posts: 4346
Credit: 16,541,921
RAC: 6,087
Message 66942 - Posted: 17 Dec 2022, 17:37:23 UTC - in response to Message 66941.  

Yes, files get through eventually. It takes them a good while, but the overall backlog is not increasing in the long run.


I saw that when uploads started to work again and then later all seemed OK. Some other users saw no problems. I had put it down to the servers getting hammered when Andy sorted things out but what you report makes me not so sure. I guess we will just need to keep an eye on it.
ID: 66942 · Report as offensive
xii5ku

Send message
Joined: 27 Mar 21
Posts: 79
Credit: 78,302,757
RAC: 1,077
Message 66964 - Posted: 19 Dec 2022, 5:54:52 UTC - in response to Message 66942.  

FWIW, the rest of the few replica tasks in my work buffers completed and uploaded throughout yesterday; all results are marked valid. The uploads were dragged out, though, by the temporary transfer failures described above, which recurred yesterday as well.
ID: 66964 · Report as offensive
AndreyOR

Send message
Joined: 12 Apr 21
Posts: 247
Credit: 12,035,877
RAC: 23,095
Message 66973 - Posted: 19 Dec 2022, 21:19:33 UTC

Any indication whether the Mac models have been fixed and are going to be re-released soon? Just wondering, as re-configuring the PC between WSL2 and VBox Mac is not that simple, so I would prefer to run one for a while before switching to the other.
ID: 66973 · Report as offensive
Profile Dave Jackson
Volunteer moderator

Send message
Joined: 15 May 09
Posts: 4346
Credit: 16,541,921
RAC: 6,087
Message 66974 - Posted: 20 Dec 2022, 8:33:34 UTC - in response to Message 66973.  
Last modified: 20 Dec 2022, 22:38:42 UTC

Any indication whether the Mac models have been fixed and are going to be re-released soon? Just wondering, as re-configuring the PC between WSL2 and VBox Mac is not that simple, so I would prefer to run one for a while before switching to the other.

"Soon if all goes well" is what I have seen, but by whose definition of soon? It is likely to be at least another day before the next OIFS tasks.

Edit: If I were to bet on it I would say the OIFS will make it first, but I really am guessing.

Edit2: 1,000 OIFS perturbed surface tasks, which have almost gone. #950
ID: 66974 · Report as offensive

©2024 climateprediction.net