Posts by Joe's Climate

1) Questions and Answers : Unix/Linux : Starting Boinc (Message 47176)
Posted 26 Sep 2013 by Joe's Climate
Post:
I went through this problem many years ago. After reviewing several websites on how to do this, I ended up figuring it out and writing a how-to for the distro I was running back then (Mandriva) - see here: http://www.joescat.com/boinc/

The steps described there should still work now, with maybe some minor modifications to suit your particular flavour of linux. Hopefully it also helps you understand how linux multitasks different tasks (and users too) and does interesting things like 'nice' priorities, which I find much better integrated for running BOINC than the Windows screensaver-mode version I tried back then.
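As a rough illustration of the 'nice' idea mentioned above, here is a minimal Python sketch (Unix only). It is not taken from the how-to; the helper name run_niced and the increment values are made up for the example. It starts a child process at a lower priority and has the child report its own niceness.

```python
import os
import subprocess
import sys


def run_niced(command, increment=19):
    """Run `command` with its niceness raised by `increment` (Unix only).

    `run_niced` is a hypothetical helper for this example, not part of BOINC.
    """
    # preexec_fn runs in the child just before exec, so only the child
    # is deprioritised; the parent keeps its own priority.
    return subprocess.run(
        command,
        preexec_fn=lambda: os.nice(increment),
        capture_output=True,
        text=True,
        check=True,
    )


if __name__ == "__main__":
    # Ask a child Python process to report its own niceness.
    child = [sys.executable, "-c", "import os; print(os.nice(0))"]
    result = run_niced(child, increment=10)
    print("parent niceness:", os.nice(0))
    print("child niceness: ", result.stdout.strip())
```

A higher niceness only means the process gives way when something else wants the CPU; on an otherwise idle machine it still gets nearly all of the CPU time.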

...along with the info above, if you run into problems, you may also want to read the spyhill howto (links included) in parallel to compare notes, since what applies in one may not apply in the other (and vice versa, in case you happen to be running a different distro).

Hopefully, the info above helps you get going in the right direction.

Cheers,
Joe
2) Message boards : Number crunching : New Tasks not being snapped up (Message 47175)
Posted 26 Sep 2013 by Joe's Climate
Post:
I'm probably not the only one who has thought of this but just wondered if CPDN couldn't email all active accounts with a gentle reminder that lots of tasks are awaiting their attention.


For users new to BOINC, this sounds like a good idea, but for users of BOINC who have used it a long time, this would get old quickly - so I could see the possibility of some people complaining of spam.

BOINC was designed to be a background type of program making use of spare CPU cycles. Ideally, BOINC should go out and scout for more work if it's available instead of the user going out and finding work to run.

You may want to run the BOINC manager on your desktop for a little while to see what's happening. If you leave your copy of BOINC set to check for tasks, it should find something if there are WUs to do, and if there aren't, it will try again a little later.

In summary, just leave it turned on to look for more WUs, and BOINC should take care of itself.
3) Message boards : Number crunching : failed upload: can't resolve hostname (Message 47174)
Posted 26 Sep 2013 by Joe's Climate
Post:
if you're not running an ancient BOINC-client like v6.2.xx or something even older.


You mean ancient, as in something like 6.12.x ?
;-P

(...yes, I'll upgrade sometime...soon hopefully)
4) Questions and Answers : Unix/Linux : Raspberry Pi? (Message 45423)
Posted 9 Jan 2013 by Joe's Climate
Post:
If you are going to try the raspberry, choose one that has hardware floating point, and choose an OS that enables hardware floating point since not all OSes have it enabled.

When running BOINC, you'll have to use ver 6.x at the moment, unless there have been significant updates to ver 7.x recently.

I know it was already tried on the OLPC a few years ago, and there was also some mention of the raspberry in boinc_opt a couple of months ago, using seti and boinc 6.x at that time.

I agree with Les that Climate Prediction is a bit too big for a raspberry, but some other smaller projects are within its reach if you are still looking to climb that mountain. ;-)
5) Message boards : Number crunching : Download Failed (Message 45420)
Posted 8 Jan 2013 by Joe's Climate
Post:
Check that you have ample hard drive space. Some work units are very large.
If you have boinc running on vfat, you may want to consider running it on ntfs, since vfat limits the size of a single file (2GB on FAT16, 4GB on FAT32).
In boinc, there is an option to test/verify the integrity of your file downloads since some ISPs may alter your downloads - maybe you'll want to turn that on.
Make sure your computer is up to date with all patches and upgrades.
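The free-space check in the list above can be sketched in a few lines of Python. This is only an illustration: the helper name has_room, the "." path (standing in for the BOINC data directory) and the 2 GiB figure are assumptions for the example, not values from BOINC.

```python
import shutil


def has_room(path, needed_bytes):
    """Return True if the file system holding `path` has at least `needed_bytes` free.

    `has_room` is a hypothetical helper for this example.
    """
    free = shutil.disk_usage(path).free
    return free >= needed_bytes


if __name__ == "__main__":
    # "." stands in for the BOINC data directory; 2 GiB is an arbitrary example size.
    two_gib = 2 * 1024**3
    print("room for a 2 GiB file:", has_room(".", two_gib))
```

Note that this only checks free space; it says nothing about the per-file size limit of the file system itself.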


If the WU fails to download, let boinc kill the download and clean up after itself.
The reason for mentioning this is that some projects tended to have a lot of failed downloads, and users got in the habit of killing them by hand - which sort of messed things up on the project's side. Some projects actually penalized your computer for that sort of thing, making it harder to download a replacement WU.

I don't know if any of the above will help, but hopefully it might.
6) Message boards : Number crunching : hadcm3n affecting other projects, computer crash if running a long time (Message 45419)
Posted 8 Jan 2013 by Joe's Climate
Post:
Hi Les,
Thanks for the suggestion about using Suspend before shutting down, but that's a bit of extra work, which realistically shouldn't be needed if we are supposed to run BOINC as run-n-forget. For now, I'll just leave the computer running 24/7 while this WU is running, but my preference has always been to shut off the computer when I'm done.
I also took a look at your setup, and both your machines are XP, which are pretty much single-user computers. To me that most likely means you've got the boinc manager handy on your desktop so that you can suspend right away before you shut down. With linux, it's not that much more difficult to create a second or Nth user and let boinc run isolated in that other user account, so even if it messed up that account, it wouldn't affect your own stuff (it's just a different way of thinking about security - and if you want to read up, I've got it more or less set up like this: http://www.joescat.com/boinc/ ).
The other difference between XP and linux is that in XP, it seems safer to actually suspend boinc when you become busy with something (move your mouse). With linux, you use the "nice" command to give other programs priority, so climate prediction may slow down in favour of other tasks, but it never stops. If you looked at where the CPU is allotting time, you'd see it running climate prediction at almost 100%, even between mouse movements and keystrokes. It's just a little bit of a different concept.
I'm guessing the best way to describe the problem would be running boinc at 100% with no suspend while also running some graphical first-person-shooter-type game on an older version of directX. If I recall right, the older versions of directX had problems sharing the math coprocessor with other programs, so you would have conflicts between boinc and directX (I think you can find some old bugs in the boinc buglist related to directX coexistence). I'm guessing this may be a little similar here, without really going into the code itself to see where the problem lies.


Hi Belfry,
Good point to mention cpu overheating. It is a tower/desktop machine with fairly good ventilation - maybe it could be a recent possibility.
I normally have 2 projects running at a time, but these climate projects (or the recent project that bubbled to the additional odd hours) were causing all secondary projects to fail computing within 1 minute of starting, so I'm a little more inclined to think that the climate calculations aren't sharing the math registers nicely (similar to the boinc and directX coexistence issue mentioned above). I do recall looking at milkyway a while ago and seeing some interesting code just to deal with math going through the math coprocessor, so if climate prediction is doing something that assumes it has 100% of the coprocessor's attention, then we're going to have issues as tasks flip back-n-forth between time slices. The reason for mentioning milkyway is that I think it had some coexistence issues in some of its earliest versions...if not milkyway, it might have been cosmology or einstein - but I don't recall 100% now.
If it is Fortran code, I can understand the problems you mention of getting similar fixes inserted.

Ohh well, let's see where this goes for now...just running 1 project instead of 2. Thanks for all your suggestions.
Joe
7) Message boards : Number crunching : MAKING BACKUPS??? (Message 45414)
Posted 6 Jan 2013 by Joe's Climate
Post:
Reading through this thread, I think there would likely be some problems trying to back up BOINC while it's running, similar to the issues with trying to back up an actively running database. Has anyone tried stopping boinc (at the point where you are going to back up the boinc directories), backing up, then restarting boinc (once the directories are backed up)?
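The stop, back up, restart order asked about above could be sketched like this in Python. It is only an outline under assumptions: backup_while_stopped is a made-up helper, and the stop/start callables are placeholders for however you actually stop and start the client on your system. It also does nothing about the multiple-ID issue discussed in the thread.

```python
import shutil
import tempfile
from pathlib import Path


def backup_while_stopped(data_dir, backup_dir, stop, start):
    """Stop the client, copy data_dir to backup_dir, then restart the client.

    `stop` and `start` are caller-supplied callables (placeholders here).
    """
    stop()
    try:
        # copytree refuses to overwrite an existing backup_dir by default,
        # so an older backup is never silently clobbered.
        shutil.copytree(data_dir, backup_dir)
    finally:
        # Restart even if the copy fails, so crunching is not left halted.
        start()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # A throwaway directory stands in for the real BOINC data directory.
        data = Path(tmp) / "boinc_data"
        data.mkdir()
        (data / "example_state.xml").write_text("<state/>")
        backup_while_stopped(
            data,
            Path(tmp) / "backup",
            stop=lambda: print("client stopped (placeholder)"),
            start=lambda: print("client restarted (placeholder)"),
        )
        print(sorted(p.name for p in (Path(tmp) / "backup").iterdir()))
```

The try/finally is the main design point: the client gets restarted whether or not the copy succeeds.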

...reading through the rest of the thread, I guess it would still have the problem with multiple IDs appearing if you're connected to the internet, so you still have that problem there.

Like JIM indicates, it's a bit of a shame to have 400hrs of time go to waste, but then again, I lean more towards treating it as wasted and gone, versus making myself extra work to back up and restore, then redo a WU, just to get back that wasted 400hrs.

Thanks for the warnings about HADCM3N around the decade mark.
Joe
8) Message boards : Number crunching : Energy efficiency and low task numbers (Message 45413)
Posted 6 Jan 2013 by Joe's Climate
Post:
In a reply to somebody else's mention of the lack of WUs, it was mentioned that the work isn't a steady stream but comes more in batches, so it is more than likely that you'll see a whole bunch, then nothing, see-sawing along.

Different projects seem to use resources quite differently, and I notice that climate prediction seems to need a long time and a lot of memory to be able to process accurately. I'm less likely to want to shut off my computer if I realize I'm processing a climate prediction type of WU.

Other projects seem to work fine with smaller time slices, so you are much more okay shutting off a computer and resuming later, which really was the main idea behind BOINC in the first place ... basically, using spare CPU cycles that would otherwise be thrown away as wasted heat spinning a CPU doing nothing.

Currently, there are other pure-science projects available that might interest you if you like to keep your computers chugging away 24/7, such as Einstein, MilkyWay, Cosmos. Some projects seem to have a constant stream of work available, while others hand it out in batches too.

Another thing about BOINC is that you have the choice of setting your computing allotment to different ratios, so that if more Climateprediction comes along, it can set the other work aside... maybe something like setting climateprediction to 96%, Einstein to 1%, Milkyway to 1%, Cosmos to 1%, Seti to 1%. ...Well, maybe not so extreme, but I think you know what I mean here.
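To put numbers on the ratios above, here is a small Python sketch that turns relative shares into fractions of the total. It assumes shares act as relative weights among the projects that currently have work; the function name share_fractions is made up for the example, and the figures are just the ones from this post.

```python
def share_fractions(shares):
    """Convert relative resource shares into fractions of the total.

    `share_fractions` is a hypothetical helper for this example.
    """
    total = sum(shares.values())
    if total <= 0:
        raise ValueError("at least one project needs a positive share")
    return {name: value / total for name, value in shares.items()}


if __name__ == "__main__":
    # Illustrative numbers taken from the post.
    shares = {"climateprediction": 96, "Einstein": 1, "Milkyway": 1, "Cosmos": 1, "Seti": 1}
    for name, fraction in share_fractions(shares).items():
        print(f"{name}: {fraction:.0%}")
```

Only the ratio between the numbers matters here, so shares of 960/10/10/10/10 would give the same split.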

Cheers,
Joe
9) Message boards : Number crunching : hadcm3n affecting other projects, computer crash if running a long time (Message 45411)
Posted 6 Jan 2013 by Joe's Climate
Post:
Hi Belfry,
I believe you are right about running two projects at once; that could be a cause of this problem.
I have had 2 different projects running at once in the past (concerning climate change), but this particular run seems to be throwing errors at the other projects. Maybe it happened in the past too, but I'm just paying more attention now.

I recall that in the past, some projects that were just starting out appeared to have difficulty sharing resources, but they did figure out a way around that, so I do think that there are solutions available. This is also going to become more and more of a problem as more computers become multi-CPU, so programs do need to be more aware of sharing resources.

I have some interest in programming, but I know I wouldn't have time to help here right now even if I wanted to. Based on past knowledge and experience, though, I think it may be worth looking at the main math routines used by other projects, as this might give some suggestions and ideas on getting some co-operation happening. I'd suggest looking at the key math routines used in einstein, milkyway and seti; they seem to have things figured out around these key areas so that things don't clash so much.

Only noticed your reply now - I'll have to check to see if there is a tick-box or something to notify me of reply messages. Thanks for the prompt detailed reply.
Joe
10) Message boards : Number crunching : hadcm3n affecting other projects, computer crash if running a long time (Message 45407)
Posted 1 Jan 2013 by Joe's Climate
Post:
I definitely notice that the current hadcm3n appears to make other projects return computation errors. I'm getting computation errors for milkyway, cosmology, einstein and seti.

I have not quite worked out whether this is due to hadcm3n, but if I leave the computer running for more than 3 or so days, I eventually get a computer lockup.

This resulted in one climate prediction hadcm3n failing, while the second one changed from 400hr complete, 300 to go, to now being at 512hr done, 1354hr to go.

I'll probably try a boinc shutdown and restart again to see if it is actually boinc/climate prediction causing the issue.
11) Message boards : Number crunching : DEADLINES AND HIGH PRIORITY MODE (Message 43570)
Posted 18 Dec 2011 by Joe's Climate
Post:
Thanks for the info.
I started my first WU a week ago and it went into high priority when it started. The initial time estimate was 611 hours. It went down as low as 580 hours, then I had a second WU appear, and ended up running 2 WUs in high priority.
The first WU has climbed up and passed 590 hours, while the second WU started with 611 and now suggests 635 hours.

Reading through other threads where people mention problems with seg faults and error 193, I thought this was going to be another case of wasted efforts (been-there, done-that, with early versions of another BOINC project before they got cleaned-up a fair bit (leaving name of project out, since it's not important and the problem is fixed)).

I will take your advice and stop the second climate prediction WU for now, and let the 2nd CPU run something other than climate prediction until the 1st WU completes.

Out of curiosity, has anyone run multiple climate prediction models at the same time? Do they behave okay? (...just asking since I've run into some projects which used "static" variables, which may seem like a good idea at first, but aren't a good idea if you run more than one instance at a time).



