climateprediction.net home page
Posts by Steve Bergman

1) Message boards : Number crunching : How badly did I get screwed? (Message 51729)
Posted 31 Mar 2015 by Steve Bergman
Post:
Thanks! I'll have a look.
2) Message boards : Number crunching : How badly did I get screwed? (Message 51719)
Posted 28 Mar 2015 by Steve Bergman
Post:
Thanks. That does give me some idea. The A10-7850K is AMD's top of the line of a *very* interesting new architecture. It's not even called a CPU anymore. It's an APU. It has 4 CPU cores and 8 Radeon GPU cores sharing up to 32GB of DDR3-2400 main memory. It employs technologies like HSA and hUMA (which have been used by smartphones for years). These put the CPUs and GPUs on an even footing, with a shared virtual and physical address space, so no memory copy is necessary to move data back and forth between CPU and GPU. And from the programmer's standpoint, the GPUs are just another type of processor available, with their own characteristics, like being *really fast* for vector processing. Basically OpenCL done right and on steroids. But the programmer needs to write for it. LibreOffice Calc does, and the results are pretty spectacular.

A silver lining to all this is that it prompted me to research, and I found PlayOnLinux, which is pretty spectacular. I installed it, installed Windows Steam in it, and I'm now happily playing one of my old favorites, Doom 3, at 1920x1080 with 8x antialiasing and 512MB "Ultra" texture quality at great frame rates. Not bad for on-CPU... err... on-APU graphics. (I could never have done that with my $500 Nvidia GeForce 3 9600GT back when the game came out.)

And of course, that means that I should do well on OpenCL apps. The problem is that while I don't care about credits and competition on BOINC, I do care deeply about being excited about the science I'm helping. And my favorite projects, the Einstein@Home gravitational-wave search and CPDN, are CPU only. (CPDN for excellent reasons, I should add.) So if I want to leverage the GPU, I'm pretty much limited to signal processing like looking for pulsars, which is a worthy pursuit, but doesn't really grab me all that much.

I still use Debian 7 for my customers' local servers and Centos/Scientific Linux for the ones in other cities. But I just recently moved my own desktop from Debian 7 to Mint 17.1 Cinnamon and am loving it!
3) Message boards : Number crunching : How badly did I get screwed? (Message 51717)
Posted 28 Mar 2015 by Steve Bergman
Post:
Hi Guys,

Due to my being out of the loop regarding hardware for a while, some bad advice, and the draconian return policies of a local computer store which I'll never be doing business with again, I got stuck with an AMD A10-7850K "12 core" processor, which is actually a 4 core where AMD calls the on-board GPU another 8 cores. (I don't game.) And the 4 CPU cores have a reputation for being... disappointing, with each pair of CPU cores sharing a single floating point unit. But I'd kind of like to know how bad it really is. I'm running 4 simultaneous HadSM3P with Moses II models. They've only been running for about 18 hours and are only at 0.5%, but the estimated total elapsed time is holding steady at 350 hours per model. So how good/bad/OK is finishing 4 of those in 350 hours? I've been kinda out of the loop regarding CPDN models, too.

Thanks so much for any opinions,

Sincerely,
Steve Bergman
4) Message boards : Number crunching : Trickles and Credits (Message 34762)
Posted 26 Aug 2008 by Steve Bergman
Post:
In any event, perhaps some information update should be posted suggesting that the admin folks are alerted to the problem?

Maybe. But perhaps it is also a good time to reflect upon what we are doing here in the first place: devoting our processor cycles and extra watts to this worthy project. It's so very easy to get distracted by credits. But they are, at best, only a rough estimate of what we may be contributing to the data pool.

I quite understand the concern you are voicing. (Actually, I should say that I *think* I understand. Saying "I understand" can come off as a bit condescending.) I even agree. Expressing concern that the SETI guys should be more communicative and sensitive to credit reporting issues, during a time of crisis, in order not to piss off the more "credit-sensitive" users, more or less got me ousted from the SETI message board. (Ouch!)

But all's well that ends well. :-)

Take Care,
Steve
5) Message boards : Number crunching : Trickles and Credits (Message 34756)
Posted 26 Aug 2008 by Steve Bergman
Post:
...but in the meantime, please remain seated, and have your seat belts fastened.

*click*
6) Message boards : Number crunching : Trickles and Credits (Message 34752)
Posted 26 Aug 2008 by Steve Bergman
Post:
Today (Monday) is a holiday in the UK, so it is unlikely to be exported until Tuesday...

While I tend to view the whole "credits" thing as a necessary evil, though an important incentive to participate in BOINC projects... I'm curious why credit exports don't happen on holidays? Is it a manual operation? I would have expected all that to be automated. Or perhaps the server that usually does it flew to Bermuda to catch some rays over the 3 day weekend? :-0
7) Questions and Answers : Unix/Linux : Thought on improving model robustness (Message 34716)
Posted 22 Aug 2008 by Steve Bergman
Post:
CPDN used to hammer the HDD. Carl put a lot of effort into reducing HDD I/O and got rid of ~91%, at a small cost in RAM requirement. I don't think run stability suffered. It's a matter of trade-offs, eh?


I don't really see a substantial trade-off in this case. Calling fsync might increase write overhead by, say, 10%. That's off the top of my head and subject to debate, but I think it's a reasonable ballpark figure. It all goes through the OS's page cache and the OS's (probably elevator) I/O scheduling algorithms.

On the other hand, it is not a panacea. Basically, what fsync gives the app is the ability to know what has been written to disk and what may or may not have been. It is still up to the application to "do the right thing". If fsync returns success, the app knows the data is on disk; if it returns an error, the app knows it must act to prevent the possibility of a crashed model. It gives an absolute guarantee to the app that certain things have, effectively, happened. Without that, the app has to assume that the write actually happened, and it might not have happened at all.
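A minimal sketch of the write-then-fsync pattern being described, in Python for brevity (the model itself is Fortran/C; the function and file names here are my own invention):

```python
import os

def durable_write(path, data):
    """Write data and confirm it reached the disk, or report failure."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        os.write(fd, data)
        # fsync blocks until the kernel reports the blocks are on stable
        # storage. If it raises, the checkpoint may be incomplete, and the
        # app knows to fall back to its previous good checkpoint.
        os.fsync(fd)
        return True
    except OSError:
        return False
    finally:
        os.close(fd)
```

The key point is the return path: on an fsync failure the safe response is to keep trusting the previous checkpoint rather than the new, possibly partial one.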

It is not exactly an esoteric or performance-killing facility. Databases like PostgreSQL and MySQL use it to help ensure data integrity. Windows, no doubt, has a counterpart system call.

It is a necessary, but not sufficient, condition to ensure data integrity.

I am not intimately familiar with the I/O performance issues that CPDN may have had with certain models. But I sincerely do not see exercising a bit of control over the timing of physical writes, in this context, to be any sort of return to those difficulties about which I have heard.

It sounds like disk writes are, perhaps, already being collected up into logically related bundles, for performance reasons. That would be a perfect fit for fsync. Because the real danger a programmer faces is when only a *partial* write of a logically related bundle of data occurs. One wants to know if it happened, or if it might not have. Or, most importantly, if only *part* of it might have happened.

I'm not an expert in this area. (And it has been 26 years since I last wrote Fortran!) But I do know my way around a bit. I personally think there is some promise here for improving the success rate of these very long-running models.
8) Questions and Answers : Unix/Linux : Thought on improving model robustness (Message 34711)
Posted 20 Aug 2008 by Steve Bergman
Post:
After thinking about this some more, an obvious question occurred to me. Does the model call fsync to force writing the file's updated blocks to disk after writes? This would accomplish the same thing programmatically, reducing the possibility of corruption after an unplanned shutdown.

Edit:

To answer my own question, I monitored my CM3 model with 'strace', which shows me all of the system calls that the model makes to the operating system kernel. It appears that it does *not* use fsync after writes. This would, I believe, be a way to improve the success rate of future models.
9) Questions and Answers : Unix/Linux : stuck in quota limit - cannot get data (Message 34709)
Posted 20 Aug 2008 by Steve Bergman
Post:
Hi,

Yes, I suspect that installing the 32 bit libs will do the trick next time the app runs. There are no true 64 bit CPDN models at this time. The "64 bit" models are 32 bit, but due to the way BOINC works, a model has to be tagged for the Linux x86_64 arch before the scheduler will send your x86_64 box anything at all.

10) Questions and Answers : Getting started : Beta 2? How? (Message 34683)
Posted 19 Aug 2008 by Steve Bergman
Post:
I have an 8 core Xeon available to me on which I was thinking about allocating a core or two to the Beta2 project I've been hearing about. But I cannot seem to find the site. Could someone please point me to it? Thanks.

-Steve
11) Questions and Answers : Unix/Linux : stuck in quota limit - cannot get data (Message 34661)
Posted 16 Aug 2008 by Steve Bergman
Post:
I had a similar issue when I was getting started and moving things around. I think you should be good to go tomorrow. In the meantime, you might want to fold a few proteins or something.

*Do make sure* that you have the ia32-libs package installed. CPDN models are 32 bit only and Ubuntu 64 bit does not ship with the 32 bit libs.
12) Message boards : Cafe CPDN : Join Team SETI.USA (Message 34650)
Posted 16 Aug 2008 by Steve Bergman
Post:
Hmmm. Maybe CPDN should create its own team and spam other projects' forums with recruitment ads? Or maybe not. ;-)

SAH, in my opinion, already commands a far greater share of BOINC resources than its premise justifies. I left it and came here (and to WCG) to try to help with some *real* science that actually has a reasonable chance of doing the Earth some good. I must respectfully decline the invitation.

-Steve Bergman
13) Questions and Answers : Unix/Linux : Thought on improving model robustness (Message 34646)
Posted 15 Aug 2008 by Steve Bergman
Post:
I'm fairly new to CPDN, so keep that in mind. But one thing that I have done to try to ensure robustness is to use the extended 'j' attribute, supported by the ext3 filesystem, on my BOINC subtree.

For anyone who is not aware, ext3 has 3 levels of journaling, which provide different levels of data-integrity guarantees.

data=writeback:

This is what most journaling filesystems implement, and is the fastest mode. It journals the metadata and obviates the need for an fsck after an unplanned shutdown. The filesystem structure is guaranteed to be consistent. But some data blocks could contain garbage.


data=ordered:

This is the ext3 default. It entails some performance penalty relative to "writeback". Like "writeback", it only journals the metadata, but it orders writes carefully in order to guarantee that no data blocks can contain garbage. Files may not have the very *latest* data, but they will contain what the file looked like some seconds before the crash. It is possible for two related files to be inconsistent with each other if one has the latest data and the other does not.


data=journal:

This is the most robust mode. It incurs the greatest performance penalty. It journals both the metadata and the data. It does not tell the requesting program that the data has been written until it has actually been written to the journal. This mode guarantees that if the calling program was told that the data was written to disk, it absolutely will be there after a crash.

Since data journaling does incur a substantial performance penalty, most people do not find it worthwhile for a whole filesystem. However, it is possible to tell ext3 to journal the data for individual files. This is done with the 'j' attribute.

For example, if one has a file "important.db", one can do:

chattr +j important.db

and from then on, that file will have its data journaled, regardless of what mode the filesystem is mounted in. The -R option to chattr makes it recursive:

cd ~steve
chattr -R +j BOINC

will ensure that everything related to our (very long running) BOINC processes has full data integrity guarantees. (At least at the filesystem level.)

And, of course, you can view the attributes with 'lsattr', which works very much like ls.

I just wanted to mention this and get any feedback anyone might care to give.
14) Message boards : Number crunching : Models and Unix/Linux shutdown (Message 34645)
Posted 15 Aug 2008 by Steve Bergman
Post:
I was reading a thread where a user mentioned that they had BOINC installed from their OS repository, so it had start and stop scripts which got run during system boot up and shutdown. This got me wondering exactly what the shutdown script did. On my Ubuntu box, it looks like it treats it like any other service. It sends a SIGTERM (signal 15) to the process. The process is expected to catch this signal, perform any cleanup that it needs to do, and exit.

At the end of the system shutdown procedure, all remaining processes are sent a SIGTERM, and the system waits 5 seconds to give them a chance to clean up. It then sends a SIGKILL (signal 9), which is the IPC equivalent of a beheading, and halts the machine.
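For illustration, the graceful-shutdown pattern being asked about looks roughly like this (a minimal Python sketch; a real model would write its checkpoint files inside the handler):

```python
import os
import signal

terminated = False

def on_sigterm(signum, frame):
    """Catch SIGTERM and note that shutdown was requested."""
    global terminated
    # A real model would flush and fsync its checkpoint here, so that
    # state written up to this point survives the shutdown cleanly.
    terminated = True

# Register the handler, replacing the default action (immediate exit).
signal.signal(signal.SIGTERM, on_sigterm)

# Simulate the init system sending signal 15 to this process.
os.kill(os.getpid(), signal.SIGTERM)
```

A process with no handler registered is simply terminated by signal 15, which is exactly the case where a half-written checkpoint can be left behind.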

How do the climate models respond to signal 15? Do they catch it and clean up before terminating?
15) Questions and Answers : Unix/Linux : What is the definition of "idle" for BOINC/Linux? (Message 34639)
Posted 15 Aug 2008 by Steve Bergman
Post:
Thank you. The machine where I need to use "only run when idle" is remote and currently has both local and remote X logins, plus my SSH session. It seems to be sensitive to activity on my SSH session. It's kind of hard to tell. I should probably go to the BOINC forum for an answer.
16) Questions and Answers : Unix/Linux : What is the definition of "idle" for BOINC/Linux? (Message 34634)
Posted 15 Aug 2008 by Steve Bergman
Post:
If I start boinc with:

./boinc --daemon

what is the definition of idle? Is it "recent mouse activity" or "someone is logged in at the console" or something else?

Thanks,
Steve Bergman

P.S. Got my quad core in today, and started a hadam3 in addition to my ongoing CM3. Yea! :-)
17) Message boards : Number crunching : Yay! Workunit finally finished after 1 year! (Message 34611)
Posted 12 Aug 2008 by Steve Bergman
Post:
I have one of these that did the same to me last week, going into this half-speed mode.

I'm not sure how it works under Windows, but under Linux/Gnome the Gnome power manager defaults to not counting "niced" (lower priority) processes while governing the CPU speed. I.e., you can have a "niced" process spinning at 100% and it will stay at a low clock speed. I tell mine to consider niced processes so that it stays at full speed when BOINC is running.
18) Message boards : Number crunching : Effect of L2 cache size and FSB on models? (Message 34587)
Posted 10 Aug 2008 by Steve Bergman
Post:
Can your current mobo support 1066 FSB?


Hi Neil,

Yes, it can handle 1066 but not 1333. That's why I was targeting the E7200, and then the Q6600. (The Q6700 is also 1066 but is only a little faster for significantly more money.) The machine in question was an ASUS bare-bones which I originally spec'd low because it was just acting as an X thin client into one of my Red Hat servers. The E2140 was way overkill. Then the on-board NIC went out on my Athlon64 4000+ desktop and I decided it was time to just make the move to a nice Core 2, and this box was conveniently available.

I'm hoping my Q6600 shows up tomorrow (and the new memory; I spec'd that low, too), as I am anxious to see how it does. Looks like you are getting about 1.7 s/ts for CM3s on yours, which is comparable to what my single core 4000+ has done in some informal spot testing. So the overall package should have about 4x the crunching power of what I had before. (Truly, multi-core is the industry's response to the fact that they can't make individual cores faster like they used to... so they dumped the burden on the software guys.)

I'm just getting back into BOINC, after a long hiatus. With 4 cores, I've selected CPDN, SETI, FightAIDS@Home and ConquerCancer@Home as satisfying projects in which to participate. And all for just (an expected) 125W for the full system (monitor off), which appeals to the green in me. :-)
19) Message boards : Number crunching : Effect of L2 cache size and FSB on models? (Message 34582)
Posted 10 Aug 2008 by Steve Bergman
Post:
OK. That certainly explains the search mystery.

I expected that L2 cache size might be more likely to affect model performance. And yet, in the other thread, someone reports that doubling the L2 cache from 512k to 1MB made a scant 5% difference in their testing.

This has reminded me of this excellent Ars article from 2002 which explains processor caching and what kinds of things affect cache efficacy:

http://arstechnica.com/articles/paedia/cpu/caching.ars/1

If the model is cycling through a large amount of data which does not get reused for a long time (i.e., it demonstrates poor temporal locality), then shuffling the bits through the cache hierarchy is ineffective, and a bit of a waste. Then again, the results of one time step are referenced in the next. So it may be that if the L2 were large enough, a substantial speed-up might be observed once it reached some critical size and the data had not yet been evicted from the cache by the time it is needed again. I suspect that the model cycles through quite *a lot* of data, though, before going back and referencing the results of the previous time step. More than 8MB, I'd guess.

Or, maybe it is something as simple as the time required by the processor to perform the floating point calculations dominating over the time required to retrieve the data to be processed, even though most of it has to come all the way from main memory. These models are not exactly the typical server bean counting app. ;-)
20) Message boards : Number crunching : Effect of L2 cache size and FSB on models? (Message 34580)
Posted 9 Aug 2008 by Steve Bergman
Post:
Steve,

It might be worthwhile letting the model run for a few trickles (5-10) before doing the upgrade.


Thanks for the tip. Actually, since I posted my previous message, I noticed a boxed Q6600 quad core for only $74 more ($194 for boxed, $184 for OEM), so I ordered a boxed unit and will be sending back the E7200 when it arrives. (I can't wait!)

The Q6600 has 2x4MB cache, a 1066MHz FSB and a 2.4GHz internal clock. Even though it uses a 65nm process, as opposed to the E7200's 45nm, it is remarkably efficient. TDP for the 4 cores is only 95W, vs the E7200's 65 watts. My power supply and the rest of my machine are high efficiency, so I will be able to crunch 4 BOINC projects on only 125W of total system power, or about 31W per core vs the 7200's 44W per core. Each core of the quad is clocked about 5% slower, but it has 33% more L2 cache per core, so each core should be comparable to one of the 7200's cores.

-Steve


©2024 climateprediction.net