1)
Questions and Answers :
Unix/Linux :
Computational Error exit status 193 (0xc1) various Linux computers
(Message 54474)
Posted 11 Jul 2016 by Jonathan Brier Post:
"Worth running memtest for several hours to exclude a problem with one of your memory modules."
Both computers passed memtest many times over. One is a new workstation that was stress tested for hours and only has issues on climateprediction.net, which I'm chalking up to the app's handling and robustness.
"The problem with this, is that the researchers are no longer at Oxford. They're climate physicists from all over the planet."
The physical location of researchers should not be an issue, as the Internet allows distributed work on software. In the interest of getting the science done efficiently and using volunteer resources ethically, someone should be monitoring and tracking the errors so that needed fixes get implemented. Knowing which errors are occurring at what rate helps direct the time investment for the largest return in additional computing. You don't know their impact without measuring them. Additionally, people should care because these are volunteer resources, and software is not a static thing, especially when dealing with heterogeneous environments such as those BOINC was designed for. Disregard for volunteers' electricity and the efficient use of their hardware will breed ill will toward the project no matter the science.
Neither group should have to worry about results failing. That is on those running the project, who must make sure their software operates correctly and is robust enough to use the donated resources efficiently with the least amount of waste; otherwise the researchers should not be recruiting the general public. BOINC projects are designed to be installed and left alone. Anything less and the project is not mature enough for public participation.
There should be zero expectation that participants install extra libraries. Instead, the project should detect when libraries are missing and either not send work units to those computers or provide the necessary libraries locally. There should not be a constant stream of errors and wasted volunteer resources because the project sends work units to computers that do not have a sufficient environment. If providing the libraries locally is not possible, then the project should run in a virtual machine to have full control of the environment. Regarding the 32-bit libraries needed on a 64-bit Linux machine, the documentation is insufficient and needs to be moved somewhere more visible than the sticky in the Linux section of the forums; the join instructions would be one additional place to note this for reference. BOINC notifications would be an appropriate starting point for notifying computers that are missing the libraries, to help bring them into compliance until an automated means can be implemented. Participants shouldn't have to sift through an entire thread to find libraries that may need to be installed.
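As a rough illustration of the detection idea above, a check for 32-bit support could run before work is accepted. This is only a minimal sketch in Python, not anything climateprediction.net or BOINC is known to ship. The names `has_any` and `missing_requirements` are made up for the example, and the paths are just common locations of the 32-bit dynamic loader on 64-bit Linux distributions; the real list of required libraries would have to come from the project.

```python
import os

# Assumption: common locations of the 32-bit dynamic loader on 64-bit
# Linux distributions. The libraries the project actually needs are not
# listed here and would have to be supplied by the project itself.
CANDIDATE_32BIT_LOADERS = [
    "/lib/ld-linux.so.2",
    "/lib32/ld-linux.so.2",
    "/usr/lib32/ld-linux.so.2",
]


def has_any(paths, exists=os.path.exists):
    """Return True if at least one of the given paths exists."""
    return any(exists(p) for p in paths)


def missing_requirements(requirements, exists=os.path.exists):
    """Return the names of requirements for which no candidate path exists.

    `requirements` maps a human-readable name to a list of candidate paths.
    """
    return [name for name, paths in requirements.items()
            if not has_any(paths, exists)]


if __name__ == "__main__":
    missing = missing_requirements({"32-bit loader": CANDIDATE_32BIT_LOADERS})
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("All checked requirements found.")
```

A result like this could feed a notification to the participant, or be reported to the scheduler so that unsuitable work units are not sent in the first place.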
The whole point of checkpointing is to be able to resume after an unexpected interruption. The app should recognise an invalid exit and attempt to resume from the last checkpoint. I'm well aware of the issues that could arise, given that I've watched BOINC evolve from day one. I expect climateprediction.net to have a more robust approach to project maintenance given how long they have been using BOINC, not just a pretty website and a highly valuable scientific project that is perfectly suited to engaging the public.
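To make the checkpoint-and-resume behaviour described above concrete, here is a minimal sketch in Python. It is a toy, not the climateprediction.net app or the BOINC API: the state layout, the function names, and the `fail_at` parameter (used only to simulate a crash) are all assumptions made for the example.

```python
import json
import os
import tempfile


def load_checkpoint(path):
    """Return the saved state, or a fresh state if no usable checkpoint exists."""
    try:
        with open(path) as f:
            return json.load(f)
    except (FileNotFoundError, json.JSONDecodeError):
        return {"step": 0, "total": 0}


def save_checkpoint(path, state):
    """Write the state atomically so a crash cannot leave a half-written file."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as f:
        json.dump(state, f)
    os.replace(tmp, path)


def run(path, n_steps, checkpoint_every=10, fail_at=None):
    """Run a toy computation (summing step numbers), resuming from `path`."""
    state = load_checkpoint(path)
    while state["step"] < n_steps:
        if fail_at is not None and state["step"] == fail_at:
            raise RuntimeError("simulated crash")
        state["total"] += state["step"]
        state["step"] += 1
        if state["step"] % checkpoint_every == 0:
            save_checkpoint(path, state)
    save_checkpoint(path, state)
    return state["total"]
```

If a run is interrupted, calling `run` again with the same path picks up from the last saved step rather than starting over, so only the work since the last checkpoint is lost.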
2)
Questions and Answers :
Unix/Linux :
Computational Error exit status 193 (0xc1) various Linux computers
(Message 54467)
Posted 9 Jul 2016 by Jonathan Brier Post: Looking over my past work units, I'm seeing that a majority of my devices are exiting with a computation error with exit status 193 (0xc1). It appears to be memory related: http://boincfaq.mundayweb.com/index.php?view=238
http://climateapps2.oerc.ox.ac.uk/cpdnboinc/results.php?hostid=1256213
http://climateapps2.oerc.ox.ac.uk/cpdnboinc/results.php?hostid=1401337
The virtualLHC project recently published their breakdown of computational errors at http://lhcathome2.cern.ch/vLHCathome/forum_thread.php?id=1846 which was quite informative about what problems they were encountering and tackling, and somewhat explained what they were. Could we get such a breakdown for climateprediction.net to see how pervasive the various computational errors are across tasks or computer types?
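For what it's worth, the kind of breakdown requested above is simple to produce once the exit statuses are available. The sketch below is a minimal Python example under stated assumptions: the function name `error_breakdown` is hypothetical and the sample data is made up for illustration, not taken from real project results. It also shows that exit status 193 and 0xc1 are the same number in decimal and hexadecimal.

```python
from collections import Counter


def error_breakdown(exit_statuses):
    """Return (status, hex, count, share) rows for non-zero exit statuses,
    most common first. `share` is the fraction of all failed tasks."""
    errors = [s for s in exit_statuses if s != 0]
    counts = Counter(errors)
    total = len(errors)
    return [(status, hex(status), n, n / total)
            for status, n in counts.most_common()]


if __name__ == "__main__":
    # Made-up sample data for illustration only; not real project results.
    sample = [0, 193, 193, 0, 22, 193, 0, 22]
    for status, hexcode, n, share in error_breakdown(sample):
        print(f"exit status {status} ({hexcode}): {n} tasks, {share:.0%} of errors")
```

Grouping the same counts by operating system or application version would show whether a given error is concentrated on particular computer types.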
3)
Message boards :
Number crunching :
Any warnings about upgrade from BOINC 6.10.56 to 6.12.34 ?
(Message 43146)
Posted 6 Oct 2011 by Jonathan Brier Post: I have not experienced any issues when upgrading. Did you shut down the old BOINC instance before installing the new version? What type of system is it: PC, Mac, etc.?
©2024 climateprediction.net