climateprediction.net home page
Posts by old_user212146

1) Questions and Answers : Windows : Comments for 'Generic solutions to models' sticky (Message 25513)
Posted 8 Dec 2006 by old_user212146
Post:
Sorry about this - my final post of the night I promise :-)

Does the checkpoint handle completely unexpected failures, i.e. if I suddenly had a power cut, would it recover from the last checkpoint (my PC is a similar spec to yours, so the checkpoint would be from roughly the previous 30 minutes of processing)? This is as opposed to an out-of-bounds value turning up in the data.

If it does this, why would it ever restart back at 1920? If it restarts because it has reached a data dead end (i.e. the model has realised it is spiralling unrealistically out of control and decides to abort), is the data produced up to that point useful?

If the problems are being caused by PC hardware failure, you would expect the checkpoint system to be able to rewind to the most recent checkpoint position and just carry on regardless - hardware problems are extremely unlikely to occur at the same point in the processing, so you are always going to steadily increase your progress (albeit with the odd slight rewind).
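To make the "steady progress with the odd slight rewind" argument concrete, here is a minimal sketch in Python. It is only a toy simulation under stated assumptions, not how the climateprediction.net model or BOINC actually works: the function name, the step counts and the idea of a fault that occurs once at a given step are all hypothetical.

```python
def run_with_checkpoints(total_steps, checkpoint_every, failures):
    """Simulate a run that rewinds to the last checkpoint on a transient fault.

    failures is a list of step numbers (hypothetical); each fault is hit once,
    the first time that step is reached, and does not recur on the retry.
    Returns (steps_executed, rewinds).
    """
    pending = set(failures)
    step = 0              # current position in the run
    last_checkpoint = 0   # most recent saved position
    executed = 0          # total work done, including repeated work
    rewinds = 0
    while step < total_steps:
        if step in pending:
            pending.discard(step)    # transient fault: gone on the next attempt
            step = last_checkpoint   # rewind instead of starting over
            rewinds += 1
            continue
        step += 1
        executed += 1
        if step % checkpoint_every == 0:
            last_checkpoint = step
    return executed, rewinds


# Checkpoint every 10 steps: two faults cost only the work since the last save.
print(run_with_checkpoints(100, 10, [25, 57]))    # (112, 2)
# Checkpoint interval longer than the run: every fault restarts from step 0.
print(run_with_checkpoints(100, 1000, [25, 57]))  # (182, 2)
```

With working checkpoints the two faults cost 12 extra steps; without them the same faults cost 82, which is the "back to 1920" behaviour in miniature.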

Thanks for your patience in answering my questions - I am running the Prime95 Torture Test at the moment to rule out hardware issues as the explanation for my resets and, as I say, I will be attempting to do backups at regular intervals.
It is way past my bedtime (1:45am) and I need sleep now.
2) Questions and Answers : Windows : Comments for 'Generic solutions to models' sticky (Message 25511)
Posted 7 Dec 2006 by old_user212146
Post:
"smaller chunks of work" will hopefully become possible sometime next year, once some code is written and tested. This is what the "restart dumps" every 40 years are for.
But currently, the core team have higher priority matters to deal with.
One of which is to solve the problem of the 'climateapps2' server falling over because of the huge work load.
Then there's the programs that will allow the global researchers to access the data for their own analysis without causing problems for the storage servers.
These people are some of those 'paying the piper', so they get a bit of a say in the priorities.

of course - this is understood - business as usual tends to take precedence over this sort of thing.


As for blaming people's computers, if you read back through the posts of the last couple of years, and the replies to them, you'll see that a lot of people have been trying to run the program on seriously under-powered machines, which they think will work 'because SETI works on them'.
Such as only having 128 Megs of ram, a cpu speed of 98 MHz, an OS of 98ME, etc.
And you'll also find that a lot of people with "faulty computers" have been helped to get the program going with just a bit of advice.

agreed - but not everybody - and as was posted here - many people who have had problems have simply not bothered to post. I myself have had crashes, as I say, and my PC is pretty stable: 3.4 GHz dual CPU with 1 GB RAM + 1 terabyte of disk storage. It may not have been your intention, but the responses came across as "well it can't be us it must be your fault", which is going to get people's backs up a bit.


One such from way back was a person who 'laid down the law' about how it was the program, and not his computer. It turned out that he'd had a psu failure, opened the case to replace it, then hadn't replaced the cover. After some advice from 'yours truly', as they say, when he cleared away a few things so that he could see into the case, he found that the cpu heatsink was covered with a thick layer of dust, from a 'year of neglect'. Or so.
Last I heard, his computer was crunching away successfully, and he was a 'happy little vegemite'. (Local saying.)


the joy of users


So, how may we help you?
(Starting with some advice that you can obtain a 'better' name for yourself by changing it in the preferences on your Account page.)

I will get around to this at some point soon


PS
"last known good state" IS written "at regular intervals".
But making sure that the tyres on your car have a good tread, and are inflated to the correct pressure, is no use if you get hit by a loaded out of control truck.
And similar things can happen to the model/program which makes the checkpoint useless.

mmmm - well - my definition of regular would be at least once per day. If the last known checkpoint can get destroyed then it is not an effective checkpoint, and how it is done needs to be rethought. I work in the telecoms industry now, where we process realtime telephone data (tens of millions of items of data every day) - every time a process crashes it HAS to know how to recover, otherwise people lose revenue (lots of revenue), so I know these procedures are difficult, but not impossible, to accomplish.
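For what it's worth, one common way to stop a crash from destroying the last good checkpoint is to write the new one to a temporary file, swap it into place only once it is safely on disk, and keep the previous copy as a fallback. The sketch below shows that general technique in Python. It is an illustration only, not the mechanism climateprediction.net or BOINC uses: the function names, the `.bak` suffix and the JSON-plus-checksum layout are all assumptions made for the example.

```python
import hashlib
import json
import os
import tempfile


def save_checkpoint(path, state):
    """Write state so that a crash mid-write cannot destroy the previous copy."""
    payload = json.dumps(state, sort_keys=True)
    record = {"sha256": hashlib.sha256(payload.encode()).hexdigest(),
              "payload": payload}
    directory = os.path.dirname(os.path.abspath(path))
    # Write to a temporary file in the same directory, then force it to disk.
    fd, tmp = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as f:
        json.dump(record, f)
        f.flush()
        os.fsync(f.fileno())
    # Keep the previous checkpoint as a fallback (hypothetical ".bak" name).
    if os.path.exists(path):
        os.replace(path, path + ".bak")
    # Swap the finished file into place; the old content is never overwritten.
    os.replace(tmp, path)


def load_checkpoint(path):
    """Return the newest checkpoint that passes its checksum, else None."""
    for candidate in (path, path + ".bak"):
        try:
            with open(candidate) as f:
                record = json.load(f)
            payload = record["payload"]
            if hashlib.sha256(payload.encode()).hexdigest() == record["sha256"]:
                return json.loads(payload)
        except (OSError, ValueError, KeyError, TypeError, AttributeError):
            continue  # missing or damaged file: try the older copy
    return None
```

A usage example: after `save_checkpoint("model.ckpt", {"year": 1950})` and then `save_checkpoint("model.ckpt", {"year": 1960})`, a damaged `model.ckpt` makes `load_checkpoint("model.ckpt")` fall back to the 1950 state rather than to nothing. It does not help if the model itself writes bad values into a checkpoint that is otherwise intact, which is the "out of control truck" case.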


Which is where backups come into the picture. As an IT person, you'll know about the value of having everything safely backed up every night.

I understand this - but you have to allow for human nature - probably 95% of the people who attach to this project did so thinking - "ooohhh, that's a pretty screensaver and I'm doing a bit for the environment as well". It is also a screensaver, which by definition is hands off: you are simply not there when it is running. People are just not going to remember to do backups. By insisting on this policy (and with the model being as unstable as it is, backups do become mandatory) you are restricting yourself to the IT literate amongst us.

I am not having a go at the project - I appreciate what it is trying to achieve, but you have to accept that most people are just running this for fun with the hope that it helps understanding of climate change. If it becomes "difficult" they are just gonna quit. This is climateprediction.net's loss, as one more person detaches from the project to run one that is more stable.

I will try to stick with the project and to remember to do regular backups, but it is the most frustrating screensaver I have ever used :-)
3) Questions and Answers : Windows : Comments for 'Generic solutions to models' sticky (Message 25506)
Posted 7 Dec 2006 by old_user212146
Post:
I'm afraid that I have to agree with the above post. Having been a software engineer for 20 years (5 of them at the Met Office in Bracknell), I know that you cannot write code perfectly. If you are talking about 1 million lines of Fortran code (god, I hated using Fortran when I worked there) then there are bound to be errors in it - it is simply infeasible that the code is 100% perfect, so blaming people's PCs for the problem is a little harsh. It is more likely that a certain combination of data is forcing the code down an unexpected path and causing a crash. The work packages are enormous - I used to run the project but gave up after having it restart itself back at 1920 three times.

I appreciate that it may be difficult to do, but reorganising the code to allow for smaller chunks of work - or at least to fix a "last known good state" at regular intervals - would greatly increase the number of results you are receiving.



