1)
Message boards :
Number crunching :
Error code 22 / Missing Data In Ocean UV Field
(Message 31690)
Posted 13 Dec 2007 by Andy Lee Robinson Post: I think anyone taking on these tasks should be committed to seeing them through, and dedicate whole machines to them. 1000 day limit? I hate arbitrary limits because they will always bite someone somewhere! M$ is famous for that crime! Yes, I don't anticipate that having a 'wrong' deadline will cause any problems. The machine is dedicated and CPDN runs completely unobtrusively, so I can largely forget it's even there. Hopefully this thread will help others with similar questions. Cheers, Andy.
2)
Message boards :
Number crunching :
Error code 22 / Missing Data In Ocean UV Field
(Message 31684)
Posted 13 Dec 2007 by Andy Lee Robinson Post: Andy, the deadline for your new model says 2012 on your computer's results page: Ah, thanks for the pointer - yes, looks fine there... Here is my CPDN task list as shown in BoincView through an ssh tunnel to the webserver running Crunch3r's client 5.5. The task is highlighted in red because it thinks it has expired, and note also the erroneous 1.18 s/TS - the machine is a 4200+, not an overclocked Core2! These errors could arise from the workunit, the client or BoincView, and are therefore probably not so easy to identify. Having said that, the other task looks to be reporting OK even though it is a particularly demanding one. Cheers, Andy.
3)
Message boards :
Number crunching :
Error code 22 / Missing Data In Ocean UV Field
(Message 31646)
Posted 10 Dec 2007 by Andy Lee Robinson Post: http://climateapps2.oucs.ox.ac.uk/cpdnboinc/result.php?resultid=6910804 I just had the same problem on a WU that managed to get from 2000 to 2065. No problems with the machine - it is a production webserver! I take daily backups using rsync, but I don't think it is worth restoring because it'll probably just do the same thing again, and the machine has also just downloaded a new WU. Better to make sure that all the models have all the data they will need during their lifespans! Curiously, the deadlines for that one and the new job are backdated to 2002, surely a mistake? I think the validator ignores missed deadlines because the data is too valuable to just throw away, but backdating WUs in this way pushes BOINC into EDF (earliest deadline first) mode. It isn't a real problem, but perhaps it should be looked at to avoid more chatter! Cheers, Andy.
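To illustrate why a backdated deadline matters, here is a minimal sketch of earliest-deadline-first ordering. It is not BOINC's actual scheduler; the task names, the dates and the helper functions `edf_order` and `overdue` are all hypothetical and chosen only to mirror the situation described in the post.

```python
from datetime import date

# Hypothetical task list: names and deadlines are illustrative only.
tasks = [
    {"name": "short_task", "deadline": date(2008, 1, 15)},
    {"name": "cpdn_backdated", "deadline": date(2002, 1, 1)},
    {"name": "cpdn_ok", "deadline": date(2012, 6, 1)},
]


def edf_order(task_list):
    """Return the tasks sorted so the earliest deadline comes first."""
    return sorted(task_list, key=lambda t: t["deadline"])


def overdue(task_list, today):
    """Return the names of tasks whose deadline is already in the past."""
    return [t["name"] for t in task_list if t["deadline"] < today]


today = date(2007, 12, 10)  # date of the post
first = edf_order(tasks)[0]["name"]
late = overdue(tasks, today)
```

Under these assumptions the backdated task is both first in the queue and already overdue, which is the kind of condition that would make a deadline-driven scheduler treat it as urgent.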
4)
Questions and Answers :
Unix/Linux :
Fatal error in last minute of WU, but still reports success. Admin, please examine!
(Message 29125)
Posted 3 Jun 2007 by Andy Lee Robinson Post: Nice idea but it can't be done. CPDN awards credit as the Run progresses. It is intended that all boinc Projects award the same amount of credit for equal amounts of work. So... Well, in principle yes, but in practice I suspect a little more generosity wouldn't go amiss as these WUs are about 1000x longer than any others and require a lot of patience, commitment and stamina to see through!
Yes, but this is also a negative thing which doesn't give so much incentive to take care of the task!
Well, perhaps avoiding churn and keeping the existing participants interested may be more significant than bringing in new ones that then just drop out after a while!
Thanks - I have a nice warm fuzzy feeling now at actually having got all the way through two of these monsters... :-) The sulphur runs last year were much shorter, but I still had difficulty keeping a machine stable enough to run continuously while trying to survive occasional power outages, developing applications which could do all sorts of unpredictable things, and rendering animations etc. I think a greater degree of granularity would help overall, say distributing 10 year pieces - you can combine them as they come in, though there isn't the same magnitude of satisfaction on completion! ;-) Also, optimisation for the significant numbers of SSEn+ enabled processors (of course without losing sight of accuracy) and maybe even a PS3 version, which I think would be a major feat! Conceivably they could do a WU in about a week, if single precision could be fudged to produce acceptable results, though it would still be useful in double precision mode. I guess the next version of the Cell will do DP just as quickly as current SP anyway, so worth a thought! I'm very concerned about climate change, and look forward to learning about your developments and of any improvements in model capability and code optimisation.
5)
Questions and Answers :
Unix/Linux :
Fatal error in last minute of WU, but still reports success. Admin, please examine!
(Message 29097)
Posted 1 Jun 2007 by Andy Lee Robinson Post: Yes Andy, it did finish, well done - result and graph here. Version 5.15 of the climate software shocks everyone by reporting every single error message since the beginning of the model, when the model completes! Looks as if you restored it from a backup at some point? (If so, well done for that too!) Thanks very much for your reassurance - I have another one on the other core to finish in 5 hours' time, so looking forward to that too! I'm surprised that it didn't seem to upload everything on completion. Yes, it is quite an achievement to actually complete a WU. I tried a few times on my overclocked Core2, but after a few weeks a crash would happen, something would get corrupted and the WU would abort :-( This time I ran it on my linux production webserver, which is quite lightly loaded and stable (as it has to be!), and the WUs survived. I tried to just leave it alone as much as possible, and not even sneeze in the general vicinity! It might be a good idea to award a substantial credit prize on successful completion. I hadn't restored a backup on the machine, but I upgraded the kernel a few times, which required a reboot each time. Once the last 5.15 WU has completed, should I detach and reattach to clean out the folder and prepare for the new app? Cheers, Andy.
6)
Questions and Answers :
Unix/Linux :
Fatal error in last minute of WU, but still reports success. Admin, please examine!
(Message 29079)
Posted 31 May 2007 by Andy Lee Robinson Post: I've just finished one which reported success but the details in the results file suggest otherwise, and I didn't see anything uploaded. 3 months processing and all this in the last minute... I'd like to know if it really is OK, and if the files can be salvaged and uploaded somehow.
http://climateapps2.oucs.ox.ac.uk/cpdnboinc/result.php?resultid=6312426
<core_client_version>5.5.0</core_client_version>
<stderr_txt>
(null): cannot open input file dataout/atmos_restart.day
(null): cannot open input file dataout/ocean_restart.day
... [deleted] ...
pp2netcdf crashed: Error in getting file type
Error in converting file dataout/b6hcfo.pjk6c10 to netcdf format.
pp2netcdf crashed: Error in getting file type
Error in converting file dataout/b6hcfo.pik6c10 to netcdf format.
pp2netcdf crashed: Error in getting file type
Error in converting file dataout/b6hcfo.pfk6c10 to netcdf format.
pp2netcdf crashed: Error in getting file type
Error in converting file dataout/b6hcfa.phk6c10 to netcdf format.
pp2netcdf crashed: Error in getting file type
Error in converting file dataout/b6hcfa.pgk6c10 to netcdf format.
pp2netcdf crashed: Error in getting file type
Error in converting file dataout/b6hcfa.pek6c10 to netcdf format.
pp2netcdf crashed: Error in getting file type
Error in converting file dataout/b6hcfa.pdk6c10 to netcdf format.
(null): cannot open input file dataout/ocean_restart.day
Model crashed: umshell1.f: READ_FLH: I/O error
(null): cannot open input file dataout/ocean_restart.day
Model crashed: umshell1.f: READ_FLH: I/O error
(null): cannot open input file dataout/ocean_restart.day
Model crashed: umshell1.f: READ_FLH: I/O error
(null): cannot open input file dataout/ocean_restart.day
Model crashed: umshell1.f: READ_FLH: I/O error
Fatal crash! :-(
</stderr_txt>
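For anyone wanting to see at a glance what kinds of errors a stderr dump like the one in this post contains, a rough sketch follows. It assumes the text has been copied out of the result page by hand; the category names and the `summarise` helper are hypothetical, and the sample is only a short excerpt of the messages quoted in the post.

```python
import re

# Short excerpt of the messages quoted in the post (not the full log).
STDERR_SAMPLE = """\
(null): cannot open input file dataout/atmos_restart.day
(null): cannot open input file dataout/ocean_restart.day
pp2netcdf crashed: Error in getting file type
Error in converting file dataout/b6hcfo.pjk6c10 to netcdf format.
Model crashed: umshell1.f: READ_FLH: I/O error
"""

# Hypothetical category names mapped to message patterns seen in the log.
PATTERNS = {
    "missing_input": re.compile(r"cannot open input file \S+"),
    "netcdf_conversion": re.compile(r"Error in converting file \S+ to netcdf format"),
    "model_crash": re.compile(r"Model crashed:"),
}


def summarise(stderr_text):
    """Count how many times each message category occurs in the text."""
    return {name: len(p.findall(stderr_text)) for name, p in PATTERNS.items()}


summary = summarise(STDERR_SAMPLE)
```

The patterns are matched against the whole text rather than line by line, so the count also works when the log has been pasted as a single run-on line.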
©2024 climateprediction.net