Visual Fortran Run-Time Error

Les Bayliss
Volunteer moderator
Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 51823 - Posted: 13 Apr 2015, 0:27:44 UTC - in response to Message 51822.  

The error there, is
REPLANCA :I/O ERROR
which is a data mismatch between files.

So, in that particular case, yes it's a problem with the model.

ID: 51823

JIM
Joined: 31 Dec 07
Posts: 1152
Credit: 22,060,840
RAC: 733
Message 51824 - Posted: 13 Apr 2015, 4:23:33 UTC - in response to Message 51822.  

[quote]Probably because no other project uses programs that are close to a million lines of source code, or are so complex in what they do.
Add to this the auxiliary files, such as the new, more detailed analysis of the latest version of MOSES + Triffid, and you have a super computer program that doesn't tolerate desktop/laptop computers that aren't "just so".[/quote]

The programmers will have to be careful not to make the programs so finicky that they cannot be run successfully on an average home computer, or they will no longer be suitable as a BOINC project. Then they will be back to trying to raise the money to rent supercomputer time.

ID: 51824

marmot
Joined: 12 May 05
Posts: 34
Credit: 1,357,324
RAC: 875
Message 51835 - Posted: 14 Apr 2015, 16:52:50 UTC - in response to Message 51823.  
Last modified: 14 Apr 2015, 16:55:59 UTC

@Les Bayliss
[quote]The error there, is
REPLANCA :I/O ERROR
which is a data mismatch between files.

So, in that particular case, yes it's a problem with the model.[/quote]



I'm curious, how did you find the specific error code? I'm not finding specifics on the Workunit 9760129 page.

-----

@Jim

[quote]The programmers will have to be careful not to make the programs so finicky that they cannot be run successfully on an average home computer, or they will no longer be suitable as a BOINC project. Then they will be back to trying to raise the money to rent supercomputer time.[/quote]


Agreed.

The BOINC network is a globally distributed, heterogeneous supercomputer that currently has only, like, 0.0015% of the available computing power tapped by BOINC clients.
With fault-tolerant coding, in smaller chunks, smartphone computing power might be enough to meet all of ClimatePrediction.net's computing needs, with computing power to spare.

BOINC market penetration on desktops, tablets, laptops and smartphones needs to increase. A marketing campaign is needed to make BOINC cool and one of the top downloaded apps.

I'd guess there's enough computing power out there that clients should be competing for WUs, with many just sitting idle, because the servers for all BOINC projects can't get work out fast enough.
ID: 51835

Les Bayliss
Volunteer moderator
Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 51836 - Posted: 14 Apr 2015, 20:02:40 UTC - in response to Message 51835.  

Forget the Work unit ID. Look in the first column, the Task ID.
This is where all the important information about each model is stored.
Go down to Stderr, and click on the + symbol to expand the list.
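Once the Stderr text is expanded, it can also be copied out and searched offline. What follows is only a minimal sketch, assuming the text has been pasted into a Python string. The function name `find_error_lines` is hypothetical, the first pattern is the message quoted in this thread, and the second is an assumed prefix for Fortran run-time messages that may not match every model's output.

```python
import re

# Patterns to search for. The REPLANCA message is the one quoted in this
# thread; the "forrtl" prefix is an assumption about how the Fortran
# run-time library labels its messages.
ERROR_PATTERNS = [
    re.compile(r"REPLANCA\s*:\s*I/O ERROR"),
    re.compile(r"forrtl: (severe|error)", re.IGNORECASE),
]


def find_error_lines(stderr_text):
    """Return (line_number, line) pairs for lines matching any pattern."""
    hits = []
    for number, line in enumerate(stderr_text.splitlines(), start=1):
        if any(pattern.search(line) for pattern in ERROR_PATTERNS):
            hits.append((number, line.strip()))
    return hits


if __name__ == "__main__":
    sample = "model started\nREPLANCA :I/O ERROR\nmodel stopped\n"
    for number, line in find_error_lines(sample):
        print(f"line {number}: {line}")
```

The pattern list is the part to adapt: add whatever strings show up in the Stderr of the failed tasks on your own machine.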

Smart phones aren't powerful enough to run these models, and the UK Met Office doesn't have programs to run on them. Or on GPUs.

ID: 51836

marmot
Joined: 12 May 05
Posts: 34
Credit: 1,357,324
RAC: 875
Message 51841 - Posted: 16 Apr 2015, 13:50:13 UTC - in response to Message 51836.  
Last modified: 16 Apr 2015, 13:53:12 UTC

[quote]Forget the Work unit ID. Look in the first column, the Task ID.
This is where all the important information about each model is stored.
Go down to Stderr, and click on the + symbol to expand the list.[/quote]


Thanks, found it.



[quote]Smart phones aren't powerful enough to run these models, and the UK Met Office doesn't have programs to run on them. Or on GPUs.[/quote]


I wanted to test that assertion, so I went to the database of CPUs to look up the GFLOPS of my single-core, circa-2005 Intel(R) Celeron(R) M processor 900MHz {Family 6 Model 13 Stepping 8}, which completed Task 17549228 in 710 hours.
Its rating is 0.54 GFLOPS on its single core.

I looked up the cheap 2014 Lumia 625 phone, based on the Snapdragon 400 (8926), and found it rated at 0.09 GFLOPS per core and 0.26 on 4 cores.
That's not enough performance to get a Climate WU done within 1500 hours.

The 2014 iPhone 6 is a different story.
It has the A8 dual-core CPU with 0.77 GFLOPS per core, which is similar performance to an Intel(R) Pentium(R) 4 CPU 1.60GHz.
Scaling by GFLOPS, that gives a return time of about 500 hours on a WU similar to the one the Celeron M 900MHz completed.

The other popular CPUs in higher-end smartphones of 2014 are
the Tegra K1 at 0.67 GFLOPS per core (LINPACK seems to recognize only 2 cores on the multi-thread bench),
the Snapdragon 805 at 0.32 GFLOPS per core and
the Exynos 5420 octa-core CPU at 0.39 GFLOPS per core (again, the LINPACK benchmark seems to run on only 2 cores, not on at least the 4 A15 cores of the A15/A7 big.LITTLE architecture).
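The run-time estimates here come from simple proportional scaling, which can be sketched as below. This is a rough model only: it assumes run time scales inversely with per-core LINPACK GFLOPS and that a task uses a single core, and it ignores memory, storage and thermal throttling. The function name `estimated_hours` is made up for illustration. The baseline (0.54 GFLOPS, 710 hours) and the per-core figures are the ones quoted in this post.

```python
# Baseline from this post: Celeron M 900MHz, 0.54 GFLOPS, 710 hours.
BASELINE_GFLOPS = 0.54
BASELINE_HOURS = 710.0


def estimated_hours(gflops_per_core):
    """Scale the baseline run time inversely with per-core GFLOPS."""
    if gflops_per_core <= 0:
        raise ValueError("gflops_per_core must be positive")
    return BASELINE_HOURS * BASELINE_GFLOPS / gflops_per_core


# Per-core LINPACK figures quoted in this post.
PER_CORE_GFLOPS = {
    "Snapdragon 400 (Lumia 625)": 0.09,
    "Apple A8 (iPhone 6)": 0.77,
    "Tegra K1": 0.67,
    "Snapdragon 805": 0.32,
    "Exynos 5420": 0.39,
}

if __name__ == "__main__":
    for name, gflops in PER_CORE_GFLOPS.items():
        print(f"{name}: about {estimated_hours(gflops):.0f} hours")
```

The per-core figure is used rather than the multi-core total because of the single-core assumption above. If a model could use several cores, the multi-core figure would be the one to plug in.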

An iOS BOINC client on the iPhone 6 and later models would be capable of handling ClimatePrediction.net WUs in about 500 hours, if owners were willing to run them.
I've been running Asteroids and SETI on a Zeepad and its turnaround is much worse than that, yet these devices are becoming so widespread that their computing power can't be ignored.
Also, the pad market is growing enormously; it is based on the highest-performing RISC-based CPUs and runs predominantly Android and iOS.
Something I didn't look at, but which should be significant, is the amount of energy per GFLOP required on these devices compared to desktops and laptops. Completing a WU for a much lower energy cost would ease the burden on people donating processing time to the projects.

I'll back off my contention that 90% penetration of the high-end smartphone market of 2013 onwards could handle the BOINC projects' needs, as those phones are about equivalent to 2004-06 x86/x64 GFLOPS performance.
ID: 51841

Les Bayliss
Volunteer moderator
Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 51843 - Posted: 16 Apr 2015, 14:24:14 UTC - in response to Message 51841.  

All that is a bit irrelevant, as the Met Office only has apps for desktops/laptops using the x86 instruction set.
There may never be any ARM/RISC version, as professionals want the results of their daily work fast, not in a few weeks/months, as provided by a lot of BOINC users.

ID: 51843

marmot
Joined: 12 May 05
Posts: 34
Credit: 1,357,324
RAC: 875
Message 51857 - Posted: 20 Apr 2015, 4:33:12 UTC - in response to Message 51843.  
Last modified: 20 Apr 2015, 4:48:05 UTC

[quote]All that is a bit irrelevant, as the Met Office only has apps for desktops/laptops using the x86 instruction set.
There may never be any ARM/RISC version, as professionals want the results of their daily work fast, not in a few weeks/months, as provided by a lot of BOINC users.[/quote]



Your comment makes little sense, as the deadline for WUs on ClimatePrediction is 1 YEAR, which is the longest deadline of any project I've ever seen. If work is required from the BOINC network more quickly, then smaller slices of work need to be put out and the deadline severely decreased.

If ClimatePrediction wants to ignore the quickly growing ARM market then they are making a huge mistake, as there will be a growing number of people going without desktops or laptops and using only ARM-based phones and pads in the next decade.

It's already happening among the college-and-under crowd. Who needs a laptop when you have a Samsung Note with a writing stylus, which a student can get at a discount? If you want people to run BOINC WUs for you for years to come, then catch them young and get them involved.

Politics and name recognition are also considerations as climate modeling is crucial to the future of our species.
ID: 51857

Les Bayliss
Volunteer moderator
Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 51858 - Posted: 20 Apr 2015, 5:42:24 UTC - in response to Message 51857.  

The so-called "deadlines", as used in this project, are just an arbitrary number put into the appropriate box in the BOINC code. It's made long because so many multi-project people complained when it was shorter.
But this doesn't mean that the researchers don't care how long it takes. A fast, single-project computer can complete the models in anything from less than a day to about 14-15 days for the very long models, or 3 weeks on slower computers.
But if you want this "deadline" decreased, then I'm all for it. And have been for a long while. Perhaps 3-4 times the time taken by my Haswell.

And climateprediction.net does NOT write the code.
It all comes from the UK Met Office, where it normally runs on their super-computers, for daily weather modelling to long term climate modelling.
All of which has been posted about many times over the years.

As for making the "slices" shorter, they're already as short as they can be without compromising accuracy.

ID: 51858

JIM
Joined: 31 Dec 07
Posts: 1152
Credit: 22,060,840
RAC: 733
Message 51859 - Posted: 20 Apr 2015, 14:21:56 UTC - in response to Message 51858.  
Last modified: 20 Apr 2015, 14:22:55 UTC

The long deadlines are also a holdover from the days (about 10 years ago) of 160-year models run on single-core 1.2 GHz processors. These took 7 or 8 months to complete, running just about 24/7.
ID: 51859

Deidelit
Joined: 6 Mar 06
Posts: 1
Credit: 2,097,174
RAC: 0
Message 52090 - Posted: 24 Jun 2015, 21:32:28 UTC

Do I understand correctly from browsing this thread for a while that there is no real solution to the Fortran errors?

I've been ignoring the error for a while, but now I'm getting my first failed packages. So this is due to restarts after crashes, which occur because some programs running at the same time as BOINC will crash the computer, and there is no predicting which ones?

ID: 52090

ryan
Joined: 17 Aug 13
Posts: 2
Credit: 8,456,886
RAC: 0
Message 52093 - Posted: 25 Jun 2015, 6:40:37 UTC

Just started getting these on my machine, having never seen them before. I have some exclusive programs defined and always get the errors after I shut down one of those programs/games, so maybe the way that BOINC is automatically suspending the models is not correct?

I don't see this behavior when I suspend computation manually. My RAM usage is quite high with 11 tasks + 1 or 2 GPU tasks depending on the active project. I wonder if the models get swapped out during games and that causes the crash. I didn't expect 16GB to be a limitation quite so quickly.
ID: 52093

Les Bayliss
Volunteer moderator
Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 52094 - Posted: 25 Jun 2015, 7:01:16 UTC - in response to Message 52093.  

Climate models don't like being interrupted.
Some model types are more prone to various failures than others.

ID: 52094

JIM
Joined: 31 Dec 07
Posts: 1152
Credit: 22,060,840
RAC: 733
Message 52099 - Posted: 25 Jun 2015, 15:37:35 UTC - in response to Message 51858.  

[quote]The so-called "deadlines", as used in this project, are just an arbitrary number put into the appropriate box in the BOINC code. It's made long because so many multi-project people complained when it was shorter.
But this doesn't mean that the researchers don't care how long it takes. And a fast, single-project computer can complete the models in from less than a day, to about 14-15 days for the very long models. 3 weeks on slower computers.
But if you want this "deadline" decreased, then I'm all for it. And have been for a long while. Perhaps 3-4 times the time taken by my Haswell.[/quote]


One problem with short deadlines arises when someone is running multiple projects: while CPDN ignores the deadlines, BOINC Manager takes them very seriously. If it thinks that the user is going to miss a CPDN deadline, it will suspend all other projects, go into "high priority" mode and not let anything else run, then later not let CPDN run until it has paid back the time it "borrowed" from the other projects. There is no way to turn this off except to manually suspend CPDN.
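The kind of check that can tip a client into that mode looks roughly like the toy model below. It is a sketch of the idea only, not BOINC's actual scheduler, which takes many more factors into account. The function name and its parameters are hypothetical.

```python
def would_miss_deadline(remaining_cpu_hours, hours_until_deadline, share_of_cpu_time):
    """Toy check: can the task finish in time at its normal share of CPU time?

    share_of_cpu_time is the fraction of wall-clock time the task actually
    gets to run (for example 0.25 if four projects share one core equally).
    """
    if not 0 < share_of_cpu_time <= 1:
        raise ValueError("share_of_cpu_time must be in (0, 1]")
    wall_clock_hours_needed = remaining_cpu_hours / share_of_cpu_time
    return wall_clock_hours_needed > hours_until_deadline


if __name__ == "__main__":
    # 300 CPU hours left at a quarter share needs 1200 wall-clock hours.
    print(would_miss_deadline(300, 30 * 24, 0.25))   # 30-day deadline
    print(would_miss_deadline(300, 365 * 24, 0.25))  # 1-year deadline
```

With a one-year deadline the check almost never trips for a model of this size, which fits with the long deadline keeping multi-project users out of this situation.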

ID: 52099

ryan
Joined: 17 Aug 13
Posts: 2
Credit: 8,456,886
RAC: 0
Message 52109 - Posted: 26 Jun 2015, 3:03:16 UTC

The issue is further compounded because the processes are not properly cleaned up. They stick around taking up memory until the user ends them manually, logs out, or reboots the machine.

I can consistently repeat this problem by suspending tasks then taking up a bunch of extra memory (browsers, office programs, etc) then closing them and resuming the tasks. I get a slew of fortran errors but the tasks stay in Windows process viewer.

Even if models do not like being interrupted, I don't think it should be too hard to take the few extra milliseconds or seconds to reach a safe stopping point.
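By a safe stopping point I mean something like the generic pattern below: check for a stop request only at a timestep boundary, and write the checkpoint to a temporary file before swapping it into place. This is a sketch under my own assumptions, not how the Met Office model or the BOINC wrapper actually works. The names `save_checkpoint`, `run_model` and `stop_requested` are made up, and the "model" is a placeholder counter.

```python
import json
import os
import tempfile


def save_checkpoint(path, state):
    """Write state to a temporary file, then swap it into place."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w") as handle:
        json.dump(state, handle)
        handle.flush()
        os.fsync(handle.fileno())
    # A reader sees either the old checkpoint or the new one, never half of one.
    os.replace(tmp_path, path)


def run_model(total_steps, checkpoint_path, stop_requested):
    """Advance a stand-in model, stopping only at a timestep boundary."""
    state = {"step": 0, "value": 0.0}
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path) as handle:
            state = json.load(handle)
    while state["step"] < total_steps:
        state["value"] += 1.0  # placeholder for one model timestep
        state["step"] += 1
        if stop_requested():
            save_checkpoint(checkpoint_path, state)
            return state
    save_checkpoint(checkpoint_path, state)
    return state
```

Calling `run_model` again with the same checkpoint path picks up from the saved step, so an interruption costs at most the timestep that was in progress.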
ID: 52109

Les Bayliss
Volunteer moderator
Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 52112 - Posted: 26 Jun 2015, 6:05:53 UTC - in response to Message 52109.  

This only happens to a small number of computers, and not all of the time.
It's something to do with the hardware/software on the computer, how it's being used, and what else is running at the time.

ID: 52112

marmot
Joined: 12 May 05
Posts: 34
Credit: 1,357,324
RAC: 875
Message 52684 - Posted: 6 Oct 2015, 5:41:09 UTC - in response to Message 51858.  

@Les Bayliss:

[quote]And climateprediction.net does NOT write the code.
It all comes from the UK Met Office, where it normally runs on their super-computers, for daily weather modelling to long term climate modelling.
All of which has been posted about many times over the years.[/quote]



I'm not sure why you took this tack. You can see that I have 11 posts on these forums and obviously am not deeply involved with these projects, so going after my ignorance of years' worth of posts was unusual.

@Les Bayliss:
[quote]Climate models don't like being interrupted.
Some model types are more prone to various failures than others.[/quote]



This kind of fault intolerance after 10 years of climateprediction.net running on BOINC shows some failure in the project. Probably from lack of funding leading to programmers not being able to spend appropriate amounts of time hardening their code for the BOINC environment across a heterogeneous selection of user machines. I have trouble believing that FORTRAN itself hasn't been hardened to run in a multi-core modern OS.

@ryan:
[quote]The issue is further compounded because the processes are not properly cleaned up. They stick around taking up memory until the user ends them manually, logs out, or reboots the machine.

I can consistently repeat this problem by suspending tasks then taking up a bunch of extra memory (browsers, office programs, etc) then closing them and resuming the tasks. I get a slew of fortran errors but the tasks stay in Windows process viewer.

Even if models do not like being interrupted, I don't think it should be too hard to take the few extra milliseconds or seconds to reach a safe stopping point.[/quote]


Some younger coders need to take some time looking over the apps being sent out to BOINC machines and improve the fault tolerance of the code.
Maybe some student loan forgiveness could be offered.

Maybe these comments need to be taken to a UK Met Office forum or representative since they write the code, might never read any of these forums, and ClimatePrediction.net has no power to make any changes to correct these errors.


ID: 52684

©2024 climateprediction.net