Posts by jrapdx

21) Message boards : Number crunching : Error while computing (Message 53507)
Posted 23 Feb 2016 by jrapdx
Post:
Terribly sorry about the multiple posts. The browser kept timing out and I didn't know the (partial) messages were sent. Maybe a moderator could delete all but the last one, I'd appreciate it.
22) Message boards : Number crunching : Error while computing (Message 53506)
Posted 23 Feb 2016 by jrapdx
Post:
"Suspend work if CPU usage is above" is probably set at the default of 25%, which isn't a good idea with climate models.
It was set to 25%. On my other computer, a setting of 60% seems OK, and I reset this one to 60% too. However the computer wasn't being used for much except Wine/BOINC, without which CPU usage was well below 25%. I doubt the occasional OS activity exceeded 25%, but no harm using the higher setting.

Yesterday one task completed (yay!), but subsequent downloads errored out with message "couldn't start app: CreateProcess() failed - Internal error.(0x54f)". Around that time I had trouble getting Wine to run (after a system reboot), which probably accounts for these errors.

Wine does seem unreliable on my system, perhaps configuration issues but haven't found anything notable. I've considered deleting and reinstalling Wine (and BOINC), but hesitate re: losing the CPDN work underway. Maybe there's a way to save and resume it, but haven't dug into the question yet.

Could be coincidental but all the failures under Wine have been with wah2 tasks. However on the positive side two wah2 are still running and with any luck will successfully complete.
23) Message boards : Number crunching : Error while computing (Message 53497)
Posted 22 Feb 2016 by jrapdx
Post:
You inspired me to do the same, and I've been running BOINC/CPDN tasks for a few weeks now. One set of tasks (wah2) finished, and I have another progressing with the shorter tasks nearing completion, demonstrating that it can work well.

However I think it's worth pointing out that BOINC/CPDN under Wine is not all a bed of roses. I've experienced numerous "error while computing" task failures, some of which are likely attributable to Wine-related interruptions. Wine itself can be tricky to set up; I am still working on getting boincmgr.exe to start correctly when the computer unexpectedly reboots (as we are subject to random power failures here).

I found it was necessary to use the most recent development Wine versions; the earlier releases didn't work on my system. (Using Ubuntu 15.04/15.10 on stock hardware.) Wine 1.9.4 was just announced, while the latest in the Ubuntu PPA is 1.9.3.
When I nail down the magic recipe for keeping all the plates spinning, I'll post the information.

24) Questions and Answers : Unix/Linux : CPDN under Wine: not getting new tasks (Message 53477)
Posted 19 Feb 2016 by jrapdx
Post:
For 2 days everything looked good, running 4 tasks without problems, until an hour ago, when there was the dreaded "Error while computing...". (Task 19292736) The error was "Signal 11 received..." following a huge number of suspend requests. The segmentation violation was previously happening just after downloading, not sure what accounts for the early or late errors.

The remaining 2 wah2 tasks are still running, I guess we'll see how far they get.
25) Questions and Answers : Unix/Linux : CPDN under Wine: not getting new tasks (Message 53459)
Posted 17 Feb 2016 by jrapdx
Post:
...it's NOT an error; it's a BOINC information message...
Indeed a closer look shows a dozen tasks exited on SIGSEGV. It's a familiar result of mistakes I've made, like dereferencing a NULL pointer, calling free() on an invalid pointer, or some similar bug. I understand what you mean re: "file not found" messages. Taking too quick a glance at stderr, my attention was drawn to the prominent and repeated "file xfer error" vs. "Signal 11" that's kind of buried in the stuff at the top.

More relevantly, a few hours ago BOINC downloaded 4 new tasks, unfortunately one promptly crashed per above error. After rebooting, turns out it requires just the right magical incantation to get Wine and BOINC going correctly. Once that got sorted out, 3 wah2 tasks seem to be running fine.

Now awaiting a fourth task to land, however when requesting a project update, I get messages like "... Not sending work - last request too recent: 3174 sec". I take it an interval needs to expire, that 3174 sec is too soon. Not clear how long a delay is enough or how/where the interval is set. Also, does the "timer" reset with each attempt to update the project?

Eventually I'll get these things figured out.
26) Message boards : Number crunching : Error while computing (Message 53451)
Posted 16 Feb 2016 by jrapdx
Post:
Sounds similar to problems I was having. See thread in Unix/Linux section where the problem was discussed, apparently a "bad batch" of tasks is implicated. We're told the server has been in maintenance mode to allow technicians to weed out the error-producing tasks. I'm not sure if that work has been completed.

In my case tasks were crashing immediately after download. I've temporarily set "no new tasks" to prevent this thrashing. When the problem is resolved I'll reset it.
27) Questions and Answers : Unix/Linux : CPDN under Wine: not getting new tasks (Message 53449)
Posted 16 Feb 2016 by jrapdx
Post:
Just checked the Server Status page, still gives wah2 "ready to send" number as 15,274, same as before. Not clear if the work of removing the error-prone WUs is completed and OK to resume downloading tasks.

As it happens I need to do some updating on my computer, so I'll take advantage of this hiatus to do that and check back later. I'm thinking the number of available tasks will be <15274 after sorting out good vs. bad batches.
28) Questions and Answers : Unix/Linux : CPDN under Wine: not getting new tasks (Message 53443)
Posted 16 Feb 2016 by jrapdx
Post:
Thanks for your reply. In the morning (local time) the servers should be cleaned up and back on line, so I can give it another try...
29) Questions and Answers : Unix/Linux : CPDN under Wine: not getting new tasks (Message 53441)
Posted 16 Feb 2016 by jrapdx
Post:
According to the "Server Status" page, there are 15704 wah2 tasks ready to send. Are all of these afflicted with "batch 341 setup"? If not all, what proportion?

As it is, I saw no point continuing to download only to have tasks immediately error out. Of course I'd like to have my computer get back to work, but hard to know when doing that will be "safe".

Seems like a problem that could cause a lot of consternation for participants. I'm guessing how quickly it's resolved could depend on the size of the bad batch. In my imagination, if not too large, could be easier to remove error-causing WUs from the ready-to-send list and task startup would go back to normal that much sooner.

Resolving the problem, however it's done, will be a good thing.
30) Questions and Answers : Unix/Linux : CPDN under Wine: not getting new tasks (Message 53439)
Posted 16 Feb 2016 by jrapdx
Post:
Wow. Do something else for a couple of hours, come back and everything's changed!

In the interim, 6 WU were downloaded and errored out. The error was "file not found", e.g., wah2_sas50_fe5a_201412_13_341_010314204_2_[1..14].zip (for WU 10314204). Seems like we've seen this before.
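For anyone wanting to check which of those files actually exist locally, the shorthand above can be expanded into individual names. This is only a sketch: it assumes "[1..14]" means the inclusive range 1 to 14, and the helper name `expand_range` is made up for illustration.

```python
def expand_range(template, first, last):
    """Expand a numbered file name template over an inclusive range."""
    return [template.format(n=n) for n in range(first, last + 1)]

# File name stem copied from the error report for WU 10314204.
# Reading "[1..14]" as 1 through 14 inclusive is an assumption.
files = expand_range("wah2_sas50_fe5a_201412_13_341_010314204_2_{n}.zip", 1, 14)

for name in files:
    print(name)
```

The resulting list could then be compared against the contents of the project directory to see which downloads are missing.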

However, three fresh tasks were sent which are running now. Not sure what accounts for the difference, they're all wah2 tasks. Maybe it would be useful for someone more familiar with the programs to take a look at computer 139186 (Win10) to assess the errors.

The problem I reported earlier is still a mystery to me. According to BOINC, 10GB is total disk allocation, of which 3.23GB is in use, leaving 6.77GB available. Obviously there was and is ample disk space. There are some crashed tasks now, but there weren't any earlier, so probably could rule that out as a cause of the problem.

I do appreciate the responses to my query! The new developments above raise a bunch of interesting questions about what's going on, and what to do to help things run more smoothly.

Edit: I spoke too soon. On closer inspection it appears the tasks that seemed to be running actually terminated due to errors and were replaced by tasks which also had errors, etc. I may have to stop accepting downloads for a while, until this situation is clarified.
31) Questions and Answers : Unix/Linux : CPDN under Wine: not getting new tasks (Message 53436)
Posted 15 Feb 2016 by jrapdx
Post:
BOINC is showing notices saying the new tasks need more disk space than it thinks is available, precisely:
UK Met Office HadAM3P-HadRM3P Australia New Zealand needs 840.86MB more disk space. You currently have 1066.49 MB available and it needs 1907.35 MB.
That doesn't make sense to me re: OS reports there are 672GB available on the drive "c:" as Wine knows it. Must be something in the Wine config, but not clear what the problem is.

Anyway if anyone has a clue I'd appreciate the info.
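As a sanity check on the notice itself, the three figures are at least consistent with each other. A minimal sketch of the arithmetic follows; the function name `disk_shortfall` is hypothetical, and the "available" figure is whatever BOINC reports for itself, which need not match the free space the OS shows on the drive.

```python
# Figures copied from the BOINC notice quoted above (all in MB).
available_mb = 1066.49
needed_mb = 1907.35

def disk_shortfall(needed, available):
    """Return how much more space is required (0.0 if there is enough)."""
    return max(0.0, round(needed - available, 2))

shortfall_mb = disk_shortfall(needed_mb, available_mb)
print(f"shortfall: {shortfall_mb:.2f} MB")  # prints "shortfall: 840.86 MB"
```

So the notice is internally consistent; the open question is why BOINC's own figure for available space is so much smaller than what the drive has free.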
32) Questions and Answers : Unix/Linux : CPDN under Wine: not getting new tasks (Message 53433)
Posted 15 Feb 2016 by jrapdx
Post:
BOINC was working under Wine (Ubuntu 15.04), and 4 CPDN tasks were running OK. All 4 wah2 WUs are nearly complete (>95%).

Unfortunately, something happened earlier today when I wasn't around, causing an apparent reboot. After logging in and getting Windows BOINC restarted, trickles (for all 4 tasks) were immediately uploaded.

However the trickles haven't shown up yet in the task details pages, which before were usually listed a couple of hours after upload. Also boincmgr didn't request new tasks and none received. In the past as tasks approached completion, new ones were lined up to start running but that's not happened even though available.

I'm wondering if the reboot (whatever the reason) messed things up. Indeed running tasks under Wine feels riskier vs. a "real" Windows platform. On the good side, none of the tasks failed and everything looks on track for them to finish satisfactorily. It would be nice to get new work to do rather than have the machine sit idle.

Maybe the issues will be self-correcting. Requested a project update but that didn't do a whole lot. Anything else worth doing?
33) Questions and Answers : Unix/Linux : Running BOINC under WINE (Message 53405)
Posted 4 Feb 2016 by jrapdx
Post:
How about that: this afternoon 4 tasks were downloaded and are now running under Wine! The only adjustment I made was reducing CPU utilization slightly to keep CPU temp a bit lower (around 60), similar to configuration for Linux tasks. Don't think it slows task progress excessively.

So far, so good...
34) Questions and Answers : Unix/Linux : Running BOINC under WINE (Message 53400)
Posted 4 Feb 2016 by jrapdx
Post:
Not entirely sure, but I think Wine gained Windows 10 support in Sept or Oct 2015. Since I didn't install Wine until last month, Win10 was available right away. Windows version probably doesn't matter a whole bunch, I notice quite a few computers on CPDN lists continue to run Win7 without problems re: CPDN tasks.

Looking forward to release of the new batch of tasks!
35) Questions and Answers : Unix/Linux : Running BOINC under WINE (Message 53396)
Posted 4 Feb 2016 by jrapdx
Post:
Just a few minutes ago the BOINC instance running under Wine was able to contact the servers and I could finally add the CPDN project! It shows up on my list of computers (ID 1389186) as a Win10 device.

BTW BOINC wasn't working correctly on my Linux box until Wine was upgraded to the latest devel version (1.9.2), but it's possible some other change made the difference. (Previously the problem was BOINC couldn't make the necessary server connections so nothing could happen.)

Now the only thing missing is some tasks to run. It will be very interesting to see how it goes when they arrive. Unless of course the next batch is for Linux machines, but I'm gambling that won't be the case.
36) Message boards : Number crunching : New WU coming? (Message 53182)
Posted 27 Dec 2015 by jrapdx
Post:
As soon as I can offload some obligations I'm working on, I'm going to have to try it. Sounds indeed like WINE has aged nicely, more stable than it used to be. FWIW all the Linux tasks either finished or crashed in the last couple of weeks, so ATM none are running. Since it doesn't appear too likely new Linux work units are on the horizon, starting up WINE a few weeks from now shouldn't be a problem.
37) Message boards : Number crunching : New WU coming? (Message 53180)
Posted 26 Dec 2015 by jrapdx
Post:
I have now got BOINC installed under wine on my machine ...

WINE is an option I was thinking about trying on one of my Linux boxes. From past experience running Windows binaries was rather a crap shoot, some did work but more didn't. So I think it's great getting Windows BOINC installed, but whether the CPDN tasks will run could be a different matter. I'll be very curious how it goes for you.
38) Questions and Answers : Unix/Linux : *** Running 32bit CPDN from 64bit Linux - Discussion *** (Message 53064)
Posted 9 Dec 2015 by jrapdx
Post:
Thanks for the information, interesting, I hadn't considered permissions, then again I'd think "read" access would be available for a file in the right directory.

More relevant is the history of the two "fail to download" tasks. In both instances the tasks had been previously crunched until "error while computing" (exit code 193), followed by download errors on being sent to two computers on Dec 5 2015, my computer being the last. (Re: workunits 9887820 and 10008664.)

It wasn't clear when the problem was fixed, maybe after Dec 5. In that case it looks like I was "unlucky" enough to get the bum tasks. At least it seems my systems were not the source of the issue, well, this time anyway.

Thanks,
JRA
39) Questions and Answers : Unix/Linux : *** Running 32bit CPDN from 64bit Linux - Discussion *** (Message 53060)
Posted 9 Dec 2015 by jrapdx
Post:
I had two tasks fail with the same download error that Wes reported. If I understand what's going on, the server wasn't able to find >=1 file the client requested, though not necessarily the same files on repeated tries. I'm guessing this error is affecting more than a few clients.

The fact that for the same work unit successive clients receive download errors for different sets of files is a curiosity, though the important thing appears to be that quite a few files intended to be downloaded aren't in the proper location. Is it fair to call this a problem of server misconfiguration?
40) Message boards : Number crunching : HadCM3n release (Message 53046)
Posted 7 Dec 2015 by jrapdx
Post:
Thanks for the info about "theta" errors, lots to learn about these subjects.

As far as projected duration goes, my Linux system has 3 tasks running with a range of ~637 to 641 hours to go (and about 46 already elapsed). That seems pretty realistic but further adjustments may still happen.

