Good news for Mac users. HadAM3P Latest News???
Author | Message |
---|---|
Joined: 20 Feb 06 Posts: 158 Credit: 1,251,176 RAC: 0 |
Reported in Latest News and Announcements by MO.V: "There's good news for Mac users. HadAM3P ran much more slowly on Mac than on Windows and Linux. Tolu has released a new 6.07 version of this model for Mac. It uses a different compiler and should run faster than version 6.06. It will still be helpful to CPDN if members can complete their current HadAM3P models before downloading new ones as they all produce good data for the researchers." Does this mean Macs can now run these tasks without them bombing out, as they had done previously because of an incompatible Fortran compiler? Please clarify this for me and I will change my preferences to accept HadAM3P in future. I look forward to a reply confirming that, as, I am sure, do many others. Keith Scott |
Joined: 29 Sep 04 Posts: 2363 Credit: 14,611,758 RAC: 0 |
When Tolu released v.6.07 on the CPDN Beta project for testing the CPDN Coordinator Hiro Yamazaki told me 'Tolu has just released a new HadAM3P for mac on the beta site to address the performance issue. I believe he has used gfortran instead of intel compiler.' (I don't think Hiro will mind me quoting that.) One of Keith's crashed HadAM3Ps is here. Perhaps someone more knowledgeable than me could look at the error messages to see whether the problem is just the compiler for this model type. Cpdn news |
Joined: 20 Feb 06 Posts: 158 Credit: 1,251,176 RAC: 0 |
The problem previously was that one Fortran compiler had been used for some types of task and a different one for other types. It turned out that once the first task had been run, the program would only accept tasks of the same type (unless BOINC was reset and started with a task of the other type, after which only that type would be accepted). I hope you understand that, and that I am relating the story correctly as I understand it. You will notice that earlier on every one of my tasks was bombing out until I was advised to be selective in my preferences to avoid the types that caused problems. I am hoping that I can now add HadAM3P back in my Preferences. Keith |
Joined: 29 Sep 04 Posts: 2363 Credit: 14,611,758 RAC: 0 |
OK, now I see what you mean! Sorry! Could you select HadAM3P only again, see what happens when you run one and please let us know. Cpdn news |
Joined: 20 Feb 06 Posts: 158 Credit: 1,251,176 RAC: 0 |
"OK, now I see what you mean! Sorry!" OK. I have done that by aborting one task which had been recently downloaded but not yet started. After loading and eventually crunching, it has happily run for over 15 minutes. It never managed as much as 5 minutes before, so I assume all is OK. (We shall see.) As you will notice, I was successfully running these early in April until I did the recommended reset. After that they dropped out almost immediately on starting crunching. I will let you know if any further problems arise after resetting preferences to accept HADSM3 and HADSM Mid-Holocene again together with HADAM3P (all now running on differently numbered app versions: 6.09, 6.04, 6.07). Keith |
Joined: 20 Feb 06 Posts: 158 Credit: 1,251,176 RAC: 0 |
I shall also reinstate HADCM3, which I also used to run successfully until the tasks started to drop out (due, I had assumed, to the Fortran problem). I was disappointed when I could no longer run those long tasks and will be glad if they are now available to me again. I will report any problems that may arise. Keith |
Joined: 29 Sep 04 Posts: 2363 Credit: 14,611,758 RAC: 0 |
Thank you, Keith. We will be keeping an eye on what you report. Cpdn news |
Joined: 20 Feb 06 Posts: 158 Credit: 1,251,176 RAC: 0 |
"Thank you, Keith. We will be keeping an eye on what you report." MOV, HadAM3P has been running for 8 hrs 30 mins and seems OK, except that "Show Graphics" is completely unresponsive in both Simple View and Advanced View. Is that normal with the new app using NVIDIA GPU, as mine uses Radeon X1600? EDIT: HAVE JUST RESTARTED AFTER JAVA UPDATE AND NOW "SHOW GRAPHICS" WORKS NORMALLY. Keith |
Joined: 20 Feb 06 Posts: 158 Credit: 1,251,176 RAC: 0 |
MOV, I have had the first trickle through. Fantastic. As promised, a 50%+ improvement. The s/TS has dropped from 9.1691 on my last successful task to 5.9795 this time, so it should finish in 4 days instead of 6. We shall see. But the main thing is that it has kept going without crashing this time. Keith |
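For anyone wanting to check the arithmetic behind the "50%+" figure, here is a small sketch using only the two s/TS values and the 6-day estimate quoted in the post. The function names are made up for illustration, and the projection assumes run time scales linearly with seconds per timestep, which may not hold exactly.

```python
def speedup(old_s_per_ts, new_s_per_ts):
    """Ratio of old to new seconds-per-timestep (greater than 1 means faster)."""
    return old_s_per_ts / new_s_per_ts


def projected_days(old_days, old_s_per_ts, new_s_per_ts):
    """Scale an old run time by the change in seconds-per-timestep."""
    return old_days * new_s_per_ts / old_s_per_ts


OLD_S_PER_TS = 9.1691  # last successful task, from the post
NEW_S_PER_TS = 5.9795  # first trickle on app version 6.07, from the post

ratio = speedup(OLD_S_PER_TS, NEW_S_PER_TS)
print(f"throughput gain: {(ratio - 1) * 100:.1f}%")
print(f"run time saved:  {(1 - 1 / ratio) * 100:.1f}%")
print(f"projected run:   {projected_days(6, OLD_S_PER_TS, NEW_S_PER_TS):.2f} days")
```

The two percentages differ because one measures timesteps per second (up by roughly half) and the other measures wall-clock time (down by roughly a third); both are consistent with a 6-day task finishing in a little under 4 days.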
Joined: 20 Feb 06 Posts: 158 Credit: 1,251,176 RAC: 0 |
MOV, I have an HADCM3 istd task scheduled to crunch in 3 days' time. I was looking forward to that being a success after so many previous failures. It had always ended, after several repeats of the same details, with the following report:

.................
cpdnmonitor: cannot open input file /Library/Application Support/BOINC Data/projects/climateprediction.net/hadcm3istd_crc6_1920_160_06019562/jobs/ocean_dump.start
cpdnmonitor: cannot open input file /Library/Application Support/BOINC Data/projects/climateprediction.net/hadcm3istd_crc6_1920_160_06019562/jobs/specsw
cpdnmonitor: cannot open input file /Library/Application Support/BOINC Data/projects/climateprediction.net/hadcm3istd_crc6_1920_160_06019562/jobs/speclw
Insufficient Memory/Stack Space Available!
called boinc_finish
CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=73137, iMonCtr=1
Model crash detected, will try to restart...
cpdnmonitor: cannot open input file /Library/Application Support/BOINC Data/projects/climateprediction.net/hadcm3istd_crc6_1920_160_06019562/jobs/yafbg.ihist
cpdnmonitor: cannot open input file /Library/Application Support/BOINC Data/projects/climateprediction.net/hadcm3istd_crc6_1920_160_06019562/jobs/yafbg.namelists
cpdnmonitor: cannot open input file /Library/Application Support/BOINC Data/projects/climateprediction.net/hadcm3istd_crc6_1920_160_06019562/dataout/atmos_restart.day
cpdnmonitor: cannot open input file /Library/Application Support/BOINC Data/projects/climateprediction.net/hadcm3istd_crc6_1920_160_06019562/dataout/ocean_restart.day
cpdnmonitor: cannot open input file /Library/Application Support/BOINC Data/projects/climateprediction.net/hadcm3istd_crc6_1920_160_06019562/jobs/atmos_dump.start
cpdnmonitor: cannot open input file /Library/Application Support/BOINC Data/projects/climateprediction.net/hadcm3istd_crc6_1920_160_06019562/jobs/ocean_dump.start
cpdnmonitor: cannot open input file /Library/Application Support/BOINC Data/projects/climateprediction.net/hadcm3istd_crc6_1920_160_06019562/jobs/specsw
cpdnmonitor: cannot open input file /Library/Application Support/BOINC Data/projects/climateprediction.net/hadcm3istd_crc6_1920_160_06019562/jobs/speclw
Insufficient Memory/Stack Space Available!
called boinc_finish
CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=73137, iMonCtr=1
Model crash detected, will try to restart...
Sorry, too many model crashes! :-(
called boinc_finish
</stderr_txt>
]]>
Validate state Invalid
Claimed credit 0
Granted credit 0
application version 6.04

However, it is still running app version 6.04, so I suspect I will still have the same problem. If you have any comments about this before crunching starts, they would be appreciated. To keep you in the picture as to progress: successful crunching (70% so far) of the HADAM3P task. I shall eventually try each of the four task types available for my Mac, and shall be so glad if I am then able to crunch all 4 on my MacBook Pro Intel. Keith |
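A stderr dump like the one above is easier to read once the repeats are tallied. The following is only a rough sketch: `summarize_stderr` is a hypothetical helper, not part of BOINC or the CPDN apps, and it assumes the dump has one message per line. The sample lines are copied from the report above.

```python
from collections import Counter


def summarize_stderr(text):
    """Tally the recurring failure markers in a CPDN stderr dump."""
    missing = Counter()   # unopened input files, keyed by file name
    memory_errors = 0     # "Insufficient Memory/Stack Space" occurrences
    restarts = 0          # restart attempts after a detected crash
    gave_up = False       # True once the monitor stops retrying
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("cpdnmonitor: cannot open input file"):
            # Keep only the last path component, e.g. "specsw".
            missing[line.rsplit("/", 1)[-1]] += 1
        if "Insufficient Memory/Stack Space Available!" in line:
            memory_errors += 1
        if "Model crash detected" in line:
            restarts += 1
        if "too many model crashes" in line:
            gave_up = True
    return {
        "missing": dict(missing),
        "memory_errors": memory_errors,
        "restarts": restarts,
        "gave_up": gave_up,
    }


SAMPLE = """\
cpdnmonitor: cannot open input file /Library/Application Support/BOINC Data/projects/climateprediction.net/hadcm3istd_crc6_1920_160_06019562/jobs/specsw
cpdnmonitor: cannot open input file /Library/Application Support/BOINC Data/projects/climateprediction.net/hadcm3istd_crc6_1920_160_06019562/jobs/speclw
Insufficient Memory/Stack Space Available!
called boinc_finish
Model crash detected, will try to restart...
Sorry, too many model crashes! :-(
"""

print(summarize_stderr(SAMPLE))
```

The point of the tally is that the memory message recurs on every attempt, which suggests the restarts are hitting the same limit each time rather than failing at random.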
Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
Only the HadAM3P models have been 'upgraded'. The HADCM3 models are the same as before. Backups: Here |
Joined: 20 Feb 06 Posts: 158 Credit: 1,251,176 RAC: 0 |
MOV, as promised, I report that the HADAM3Pmnjs task has completed, but surprisingly has not been granted full credit (maybe just delayed?). The HADCM3istd task shows no improvement (as feared). It crashed immediately after starting, and the messages are shown below:

Fri 26 Jun 04:49:14 2009 climateprediction.net task hadsm3fub_kh1t_005986884_5 suspended by user
Fri 26 Jun 04:49:14 2009 climateprediction.net Starting hadcm3istd_cskk_1920_160_06021160_5
Fri 26 Jun 04:49:15 2009 climateprediction.net Starting task hadcm3istd_cskk_1920_160_06021160_5 using hadcm3i version 604
Fri 26 Jun 04:49:18 2009 climateprediction.net task hadsm3fub_kh1t_005986884_5 resumed by user
Fri 26 Jun 04:49:53 2009 climateprediction.net Computation for task hadcm3istd_cskk_1920_160_06021160_5 finished
Fri 26 Jun 04:49:53 2009 climateprediction.net Output file hadcm3istd_cskk_1920_160_06021160_5_1.zip for task hadcm3istd_cskk_1920_160_06021160_5 absent
Fri 26 Jun 04:49:53 2009 climateprediction.net Output file hadcm3istd_cskk_1920_160_06021160_5_2.zip for task hadcm3istd_cskk_1920_160_06021160_5 absent
Fri 26 Jun 04:49:53 2009 climateprediction.net Output file hadcm3istd_cskk_1920_160_06021160_5_3.zip for task hadcm3istd_cskk_1920_160_06021160_5 absent
Fri 26 Jun 04:49:53 2009 climateprediction.net Output file hadcm3istd_cskk_1920_160_06021160_5_4.zip for task hadcm3istd_cskk_1920_160_06021160_5 absent
Fri 26 Jun 04:49:53 2009 climateprediction.net Output file hadcm3istd_cskk_1920_160_06021160_5_5.zip for task hadcm3istd_cskk_1920_160_06021160_5 absent
Fri 26 Jun 04:49:53 2009 climateprediction.net Output file hadcm3istd_cskk_1920_160_06021160_5_6.zip for task hadcm3istd_cskk_1920_160_06021160_5 absent
Fri 26 Jun 04:49:53 2009 climateprediction.net Output file hadcm3istd_cskk_1920_160_06021160_5_7.zip for task hadcm3istd_cskk_1920_160_06021160_5 absent
Fri 26 Jun 04:49:53 2009 climateprediction.net Output file hadcm3istd_cskk_1920_160_06021160_5_8.zip for task hadcm3istd_cskk_1920_160_06021160_5 absent
Fri 26 Jun 04:49:53 2009 climateprediction.net Output file hadcm3istd_cskk_1920_160_06021160_5_9.zip for task hadcm3istd_cskk_1920_160_06021160_5 absent
Fri 26 Jun 04:49:53 2009 climateprediction.net Output file hadcm3istd_cskk_1920_160_06021160_5_10.zip for task hadcm3istd_cskk_1920_160_06021160_5 absent
Fri 26 Jun 04:49:53 2009 climateprediction.net Output file hadcm3istd_cskk_1920_160_06021160_5_11.zip for task hadcm3istd_cskk_1920_160_06021160_5 absent
Fri 26 Jun 04:49:53 2009 climateprediction.net Output file hadcm3istd_cskk_1920_160_06021160_5_12.zip for task hadcm3istd_cskk_1920_160_06021160_5 absent
Fri 26 Jun 04:49:53 2009 climateprediction.net Output file hadcm3istd_cskk_1920_160_06021160_5_13.zip for task hadcm3istd_cskk_1920_160_06021160_5 absent
Fri 26 Jun 04:49:53 2009 climateprediction.net Output file hadcm3istd_cskk_1920_160_06021160_5_14.zip for task hadcm3istd_cskk_1920_160_06021160_5 absent
Fri 26 Jun 04:49:53 2009 climateprediction.net Output file hadcm3istd_cskk_1920_160_06021160_5_15.zip for task hadcm3istd_cskk_1920_160_06021160_5 absent
Fri 26 Jun 04:49:53 2009 climateprediction.net Output file hadcm3istd_cskk_1920_160_06021160_5_16.zip for task hadcm3istd_cskk_1920_160_06021160_5 absent
Fri 26 Jun 04:49:53 2009 climateprediction.net Resuming task hadsm3fub_kh1t_005986884_5 using hadsm3 version 609

As you see below, I had two successes previously:

7262911 6140270 15 Feb 2008 1:15:59 UTC 23 May 2008 20:21:25 UTC Over Success Done 7,455,395.67 49,766.40 49,766.40
7219285 6135719 31 Jan 2008 1:12:49 UTC 31 Jan 2008 1:14:11 UTC Over Client error Compute error 0.00 0.00 0.00
7218792 6135669 30 Jan 2008 22:48:36 UTC 31 Jan 2008 1:12:27 UTC Over Client error Compute error 0.00 0.00 0.00
7199412 6133557 31 Jan 2008 1:14:11 UTC 7 May 2008 7:48:28 UTC Over Success Done 6,617,097.77 49,766.40 49,766.40

Keith |
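How quickly the task died can be read straight off the timestamps in the messages above. This is a minimal sketch, assuming the timestamp layout shown in the post (weekday, day, month, time, year) and an English locale; `parse_stamp`, `task_runtime_seconds` and `count_absent_outputs` are illustrative names, and the sample holds only a few of the lines quoted above.

```python
from datetime import datetime


def parse_stamp(line):
    """Read the leading 'Fri 26 Jun 04:49:15 2009' timestamp of a message."""
    stamp = " ".join(line.split()[:5])
    return datetime.strptime(stamp, "%a %d %b %H:%M:%S %Y")


def task_runtime_seconds(lines, task):
    """Seconds between 'Starting task' and 'Computation ... finished'."""
    started = finished = None
    for line in lines:
        if f"Starting task {task} " in line:
            started = parse_stamp(line)
        elif f"Computation for task {task} finished" in line:
            finished = parse_stamp(line)
    if started is None or finished is None:
        return None  # the log does not cover both events
    return (finished - started).total_seconds()


def count_absent_outputs(lines, task):
    """Number of 'Output file ... absent' messages for the task."""
    return sum(
        1 for line in lines
        if "Output file" in line and line.rstrip().endswith(f"for task {task} absent")
    )


TASK = "hadcm3istd_cskk_1920_160_06021160_5"
LOG = [
    f"Fri 26 Jun 04:49:15 2009 climateprediction.net Starting task {TASK} using hadcm3i version 604",
    f"Fri 26 Jun 04:49:53 2009 climateprediction.net Computation for task {TASK} finished",
    f"Fri 26 Jun 04:49:53 2009 climateprediction.net Output file {TASK}_1.zip for task {TASK} absent",
    f"Fri 26 Jun 04:49:53 2009 climateprediction.net Output file {TASK}_2.zip for task {TASK} absent",
]

print(task_runtime_seconds(LOG, TASK), "seconds before the task finished")
print(count_absent_outputs(LOG, TASK), "output files reported absent in this sample")
```

On the timestamps quoted in the post, the task lasted well under a minute, which fits the description of a crash immediately after starting rather than a failure at a later checkpoint.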
Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
The credits program is still having problems because of lingering climateapps2 problems. *************** From the stderr out list for _kh1t_: Insufficient Memory/Stack Space Available! Backups: Here |
Joined: 29 Sep 04 Posts: 2363 Credit: 14,611,758 RAC: 0 |
The problem will be the insufficient shared memory. Macs seem to have plenty of it for single-core machines but not for multi-cores. I can't understand why they don't configure this more sensibly at the manufacturing stage because it's been a known problem for quite a long time. If you have two models running simultaneously and both reach a checkpoint at the same time I'd guess that this causes at least one model to crash. A tried and tested solution is explained here by Eric Myers. Cpdn news |
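The linked fix is not reproduced in this thread, so the following is only a guess at what to inspect first. It assumes the limits in question are the System V shared memory settings that macOS exposes through `sysctl` under `kern.sysv` (`shmmax`, `shmmin`, `shmmni`, `shmseg`, `shmall`). The sample values are illustrative and not taken from Keith's machine, and the 4096-byte page size is likewise an assumption.

```python
import subprocess
import sys


def parse_sysctl(text):
    """Extract kern.sysv.shm* settings from 'sysctl kern.sysv' output."""
    limits = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        key = key.strip()
        if sep and key.startswith("kern.sysv.shm"):
            limits[key.rsplit(".", 1)[-1]] = int(value.strip())
    return limits


def total_shared_bytes(limits, page_size=4096):
    """shmall is counted in pages; convert it to bytes (page size assumed)."""
    return limits["shmall"] * page_size


def read_limits():
    """Query the live values; only meaningful on a Mac."""
    result = subprocess.run(
        ["sysctl", "kern.sysv"], capture_output=True, text=True, check=True
    )
    return parse_sysctl(result.stdout)


# Illustrative output only, not measured on any machine in this thread.
SAMPLE = """\
kern.sysv.shmmax: 4194304
kern.sysv.shmmin: 1
kern.sysv.shmmni: 32
kern.sysv.shmseg: 8
kern.sysv.shmall: 1024
"""

limits = read_limits() if sys.platform == "darwin" else parse_sysctl(SAMPLE)
print(limits)
print("system-wide shared memory:", total_shared_bytes(limits), "bytes")
```

If the system-wide total is no larger than the per-segment maximum, two models asking for shared memory at once would be competing for what amounts to a single segment, which would match the multi-core symptom described above. Whether that is what happens here would need checking against the actual fix.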
Joined: 23 Jan 07 Posts: 26 Credit: 852,233 RAC: 0 |
Macs seem to be getting a 40% decrease in processing time on 3P tasks. Great. They are also getting a 20% decrease in credits awarded. Not so great. Cheers. |
Joined: 29 Sep 04 Posts: 2363 Credit: 14,611,758 RAC: 0 |
Hi Billy. A completed HadAM3P should earn 1982 credits whichever platform it's crunched on. Sometimes I think the credits for the last few days of the model don't get included (I don't know why) and then the model gets 1980. Zombie67 tested the new Mac version on CPDN Beta and got 1980 credits. Are members getting fewer than this for completed HadAM3Ps on the main project? Cpdn news |
Joined: 1 Jan 07 Posts: 1014 Credit: 35,791,220 RAC: 25,255 |
"Are members getting fewer than this for completed HadAM3Ps on the main project?" Temporarily, yes, but only for a couple of models completed and reported today. For the time being, they're showing the (lower) benchmark * time BOINC claim, and the amount awarded for whatever trickle they'd reached the last time the credit process was run. I think we lost a run when climateapps2 was down on Wednesday. All tasks reported long enough ago to have had a complete credit run since reporting are showing the expected 1,982.64 for both claim and grant; the most recent is from 22 June. |
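To see why a faster app shows a lower interim claim while the final grant is unchanged, here is a rough sketch. It assumes the interim claim is simply proportional to CPU time on a given host (the "benchmark * time" mentioned above) and that the host's benchmark itself has not changed. The 1,000-credit starting claim is a made-up round number; the 40% figure and the 6-day and 4-day run times come from earlier posts in this thread.

```python
FIXED_GRANT = 1982.64  # credits for a completed HadAM3P, from the post above


def benchmark_claim(old_claim, time_reduction):
    """Scale a benchmark * time claim for a shorter run on the same host."""
    return old_claim * (1.0 - time_reduction)


def credits_per_day(grant, days):
    """Average credit earned per day of crunching for a fixed grant."""
    return grant / days


# A hypothetical 1000-credit claim shrinks in step with a 40% shorter run.
print(f"interim claim: {benchmark_claim(1000.0, 0.40):.0f} (was 1000)")

# The fixed grant, by contrast, is earned in less time than before.
print(f"credit per day at 6 days: {credits_per_day(FIXED_GRANT, 6):.2f}")
print(f"credit per day at 4 days: {credits_per_day(FIXED_GRANT, 4):.2f}")
```

So a lower figure on a freshly reported task does not by itself mean Mac users are being paid less; once the credit run catches up, the same fixed grant is spread over fewer days.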
Joined: 29 Sep 04 Posts: 2363 Credit: 14,611,758 RAC: 0 |
Richard, what you say must apply to all OSs. We will need to wait a day or two and check again. Cpdn news |
Joined: 1 Jan 07 Posts: 1014 Credit: 35,791,220 RAC: 25,255 |
"Richard, what you say must apply to all OSs. We will need to wait a day or two and check again." Here's a suitable mine-canary: host 788878 |
Joined: 20 Feb 06 Posts: 158 Credit: 1,251,176 RAC: 0 |
"The problem will be the insufficient shared memory. Macs seem to have plenty of it for single-core machines but not for multi-cores. I can't understand why they don't configure this more sensibly at the manufacturing stage because it's been a known problem for quite a long time. If you have two models running simultaneously and both reach a checkpoint at the same time I'd guess that this causes at least one model to crash." MOV, it is not a question of reaching a checkpoint. In spite of having successfully completed twice, as I have indicated, I have never been able even to start crunching since then. No crunching, no checkpoint: the task just crashes immediately when its turn comes to start crunching. I had thought it was related to the old problem of the apps being compiled with 2 different versions of Fortran, but this would not now appear to be the case. I will see if I can follow Eric Myers's solution. I will make sure that my next task to be downloaded is expedited as soon as I have made the necessary adjustments. Keith |
©2024 climateprediction.net