Posts by old_user909


1) Message boards : Number crunching : Anyone else experience decrease in team credits? (Message 14136)
Posted 5 Jul 2005 by old_user909
Post:
Wonder if this is related? I just noticed that the expavg_time field (time stamp of the last time expavg_credit was recalculated) hasn't been touched for any users for over 2 weeks. My website looks for this field and if it is over 2 weeks ago, it calls the user "inactive". CPDN currently has 0 "active" users.
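
For readers following along, the "inactive" rule described above can be sketched roughly as follows. This is only an illustration, assuming expavg_time is a Unix timestamp in seconds (as it appears in the host XML); the function name and the exact cutoff handling are hypothetical, not the stats site's actual code.

```python
import time

# Two weeks expressed in seconds (the cutoff mentioned in the post).
TWO_WEEKS_SECONDS = 14 * 24 * 60 * 60


def is_active(expavg_time, now=None):
    """Return True if expavg_time (Unix seconds) falls within the last two weeks.

    `now` can be passed explicitly to make the check reproducible;
    otherwise the current time is used.
    """
    if now is None:
        now = time.time()
    return (now - expavg_time) <= TWO_WEEKS_SECONDS
```

If the project stops updating expavg_time for everyone, a check like this will report zero active users even though people are still crunching, which matches the symptom described.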

On a separate issue, I noticed that CPDN evidently lost over 400 teams sometime in the last couple of days... This seems a bit unusual to me. Did you all clear out inactive teams or something? There used to be 2,108 - now there are 1,669. Big losses in team/user numbers cause my stats site to mark the update as invalid and not use it. I will fix this but losing that many teams at once seemed suspicious to me...
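
A sanity check of the kind described (rejecting an update when too many teams or users vanish at once) might look something like the sketch below. The 10% threshold and the function name are assumptions made for illustration; the post does not say what limit the stats site actually uses.

```python
def update_looks_valid(previous_count, new_count, max_loss_fraction=0.10):
    """Return False if the record count dropped by more than max_loss_fraction.

    max_loss_fraction is a hypothetical threshold, not taken from the post.
    Growth, or a small loss, is accepted.
    """
    if previous_count <= 0:
        # Nothing to compare against, so accept the update.
        return True
    loss = previous_count - new_count
    return loss <= previous_count * max_loss_fraction
```

With the numbers from the post, going from 2,108 teams to 1,669 is a loss of 439 teams (about 21%), so a check with this assumed threshold would reject the update.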
2) Message boards : Number crunching : XML stats problems (Message 13967)
Posted 28 Jun 2005 by old_user909
Post:
Well all squabbling aside, it seems the XML has been fixed. I have seen a couple updates today and the hosts file appears to be complete and well-formed. Yay! Thanks to the CP.net team.
3) Message boards : Number crunching : XML stats problems (Message 13922)
Posted 27 Jun 2005 by old_user909
Post:
> Why don't you dig into the stats code yourself and provide a fix to the
> developers? :-)

Uh... I have. There are several patches in db_dump that were written by me or by Dr. Anderson after I found a problem, researched it as far as I could without access to the database and brought it to his attention. As I stated above, I believe the current host problem has actually been fixed in the latest version.

> Someone that bumps a post as they don't feel they're getting the attention
> they want does remind me of a kid in the back seat asking "are we there yet?"
> on a too frequent basis! :-(

Don't think a once per week post counts as being QUITE that annoying...

> Let them fix the basic issues, as I feel they will, and then worry about your
> xml stuff..

It looked like some of the basic issues were being fixed when I posted that bump. Looks like things may not have worked out after all...
4) Message boards : Number crunching : XML stats problems (Message 13841)
Posted 25 Jun 2005 by old_user909
Post:
Bump. The XML is still messed up! I see other problems are being solved... Can someone PLEASE look at the XML? Right now there have been no updates for over 5 days.
5) Message boards : Number crunching : XML stats problems (Message 13636)
Posted 21 Jun 2005 by old_user909
Post:
This thread is not about the no credit problem. The problem with the XML files is completely separate and was happening long before the credits stopped flowing. There are a couple of other threads about the no credit problem. Let's try not to confuse the admins TOO much and keep separate problems in separate threads :)
6) Message boards : Number crunching : XML stats problems (Message 13626)
Posted 20 Jun 2005 by old_user909
Post:
Well now all the XML files are empty or completely malformed... I hope this means someone is working on it? :) I hear tolu is supposedly back as of today but probably swamped with email and backlogged problems :/
7) Message boards : Number crunching : XML stats problems (Message 13356)
Posted 12 Jun 2005 by old_user909
Post:
*cough* anyone still there or is this project running itself while everyone is on vacation? The problem still remains...
8) Message boards : Number crunching : XML stats problems (Message 13265)
Posted 9 Jun 2005 by old_user909
Post:
Wohoo! Did tolu come back or did someone else finally find the right button to push? :) There is going to be a big jump in my history graphs in about 7 hours!

Now for the bad news:
The hosts.xml file is even more damaged now :/

It *LOOKS* like the same problem seti@home experienced just a little while ago. It was discussed on the boinc_stats mailing list. Read here if you wish:
http://www.ssl.berkeley.edu/pipermail/boinc_stats/2005-May/000095.html

The problem in their case was that a user was deleted from the user table. This should <b>*NEVER*</b> <b><i>*EVER*</i></b> be done! Doing so leaves orphaned records in other tables, which can cause problems like this and are very hard to track down. In seti's case the user was deleted and then db_dump went to look up the owner of a host to determine whether to put the &lt;userid&gt; tag in (if the user has his hosts hidden, it is not put in). This lookup failed and caused the host record to be terminated prematurely with no closing &lt;/host&gt; tag, which will also completely hose most XML parsers. Here is an example from the CPDN XML file:

&lt;host&gt;
&lt;id&gt;79&lt;/id&gt;
&lt;host&gt;
&lt;id&gt;83&lt;/id&gt;
&lt;userid&gt;57&lt;/userid&gt;
&lt;total_credit&gt;24385.515355&lt;/total_credit&gt;
...

The record for host #79 is obviously truncated. There are a total of 34 hosts which have this problem. Here is a list and the command I used to get it from the XML (email might be better for this but since I'm not getting any response from anyone on the team...):

grep -B 1 "&lt;host&gt;" host.xml | grep -v \\-\\- | grep -v "&lt;/host&gt;" | grep -v "&lt;host&gt;"

&lt;id&gt;79&lt;/id&gt;
&lt;id&gt;294&lt;/id&gt;
&lt;id&gt;549&lt;/id&gt;
&lt;id&gt;692&lt;/id&gt;
&lt;id&gt;2174&lt;/id&gt;
&lt;id&gt;2184&lt;/id&gt;
&lt;id&gt;3089&lt;/id&gt;
&lt;id&gt;3706&lt;/id&gt;
&lt;id&gt;4569&lt;/id&gt;
&lt;id&gt;4684&lt;/id&gt;
&lt;id&gt;6248&lt;/id&gt;
&lt;id&gt;6881&lt;/id&gt;
&lt;id&gt;7273&lt;/id&gt;
&lt;id&gt;7406&lt;/id&gt;
&lt;id&gt;7773&lt;/id&gt;
&lt;id&gt;8425&lt;/id&gt;
&lt;id&gt;8495&lt;/id&gt;
&lt;id&gt;8926&lt;/id&gt;
&lt;id&gt;9398&lt;/id&gt;
&lt;id&gt;10088&lt;/id&gt;
&lt;id&gt;10623&lt;/id&gt;
&lt;id&gt;11839&lt;/id&gt;
&lt;id&gt;13726&lt;/id&gt;
&lt;id&gt;14043&lt;/id&gt;
&lt;id&gt;15016&lt;/id&gt;
&lt;id&gt;18664&lt;/id&gt;
&lt;id&gt;19690&lt;/id&gt;
&lt;id&gt;20010&lt;/id&gt;
&lt;id&gt;20423&lt;/id&gt;
&lt;id&gt;20659&lt;/id&gt;
&lt;id&gt;21136&lt;/id&gt;
&lt;id&gt;21367&lt;/id&gt;
&lt;id&gt;21416&lt;/id&gt;
&lt;id&gt;21459&lt;/id&gt;
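
The same check can be sketched in Python for anyone who prefers that to the grep pipeline. This is a rough illustration, assuming the dump puts each tag on its own line as in the excerpt above; the function name is made up for the example.

```python
import re


def find_truncated_hosts(lines):
    """Return the ids of <host> records that never got a closing </host>.

    A record counts as truncated when a new <host> starts (or the input
    ends) while the previous one is still open.
    """
    truncated = []
    open_host = False
    current_id = None
    for raw in lines:
        line = raw.strip()
        if line == "<host>":
            if open_host:
                # Previous record was never closed.
                truncated.append(current_id)
            open_host = True
            current_id = None
        elif line == "</host>":
            open_host = False
        elif open_host and current_id is None:
            match = re.fullmatch(r"<id>(\d+)</id>", line)
            if match:
                current_id = int(match.group(1))
    if open_host:
        # The final record in the input was left open.
        truncated.append(current_id)
    return truncated
```

It could be run as `find_truncated_hosts(open("host.xml"))`, which reads the file line by line rather than loading it all into memory.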

As to how to fix this... (*IF* this really is the problem - someone with access to the DB will have to verify) Deleting the offending hosts is probably not the way to go as that may cause other problems. Dr. Anderson did change db_dump.C to fix this problem but I'm not sure if it is in the public branch yet so you may have to get it from the development branch. You will want version 1.83 checked in on 5/12/05. You might also want to check with Dr. Anderson to double check...

As I said, this is my theory since it looks the same as what happened to seti but I'm only on the outside looking in...

And remember <b>NEVER</b> delete users! :)
9) Message boards : Number crunching : XML stats problems (Message 13243)
Posted 8 Jun 2005 by old_user909
Post:
> Of course, it would be
> easier if I knew what the bad character looked like ...

That is the problem... the character in question doesn't "look" like anything and can't be typed on a keyboard. It was a reserved control character. The byte had a hex value of 02 in the XML which is a "start of text" character. I'm not sure if you can even query mysql on non-printable characters... It would have to give you a way to enter the ASCII/hex codes for the characters. Might be possible but I have never done it.
10) Message boards : Number crunching : XML stats problems (Message 13213)
Posted 7 Jun 2005 by old_user909
Post:
> We are here but we are incredibly busy. Actually we = I while Tolu is away
> :)
> I've updated the mysql db as Toby suggested. I'm a little worried about how
> this occured and that there may be more unprintable characters in the db.
> Unfortunately I haven't figured out how to query for unprintables yet :)

If you restart db_dump and give us some new XML I'll be able to tell you in about 2 minutes if there are any more bad characters :) db_dump was modified back in February to automatically filter out bad characters in most text fields such as user/team name as well as some host fields. I guess not all of the host fields were included - I might have to take that up on the stats mailing list. Of course this only fixes it on the XML side of things and, as you said, random strange characters showing up in the DB are a bit worrisome. However, if it IS just this one host, it could be that something went wrong on the host and the p_vendor field got some bad data in it from the client and the server just accepted it without question. As far as the database is concerned, there is nothing wrong with the character. It is just when it gets dumped into XML that it becomes a problem since XML has a specific character set that is allowed.
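
To illustrate what "a specific character set that is allowed" means in practice, here is a minimal sketch of a scan based on the character ranges permitted by XML 1.0. The function names are hypothetical, and this is not how db_dump does its filtering; it only shows the idea.

```python
def is_valid_xml_char(ch):
    """True if ch is permitted by the XML 1.0 character range rules."""
    cp = ord(ch)
    return (
        cp in (0x9, 0xA, 0xD)
        or 0x20 <= cp <= 0xD7FF
        or 0xE000 <= cp <= 0xFFFD
        or 0x10000 <= cp <= 0x10FFFF
    )


def find_bad_chars(text):
    """Yield (line_number, column, code_point) for every disallowed character.

    Splits on "\n" only, because str.splitlines() would treat some control
    characters as line breaks and hide them from the scan.
    """
    for line_no, line in enumerate(text.split("\n"), start=1):
        for col, ch in enumerate(line, start=1):
            if not is_valid_xml_char(ch):
                yield line_no, col, ord(ch)


def strip_bad_chars(text):
    """Return text with every disallowed character removed."""
    return "".join(ch for ch in text if is_valid_xml_char(ch))
```

Run over a fresh dump, a scan like this would point straight at any remaining host records with control characters in them.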
11) Message boards : Number crunching : XML stats problems (Message 13127)
Posted 5 Jun 2005 by old_user909
Post:
*sigh - still nothing. I have submitted a report via their web form as well. Still no response from that. Anyone alive over there? The solution is really simple... one line of SQL typed into a mysql console:

update host set p_vendor = 'Intel' where id = 82994;

And then I guess someone needs to figure out how to restart db_dump. Anyone happen to have the CPDN root password lying around? :)
12) Message boards : Number crunching : XML stats problems (Message 13074)
Posted 2 Jun 2005 by old_user909
Post:
Yes, it seems that the "fluke" I saw in the host XML has persisted. The problem is an invalid character in the host XML file. I believe one host has some illegal data in its p_vendor field. See this copy/paste from the XML:

&lt;host&gt;
&lt;id&gt;82994&lt;/id&gt;
&lt;userid&gt;19151&lt;/userid&gt;
&lt;total_credit&gt;567.105002&lt;/total_credit&gt;
&lt;expavg_credit&gt;0.058292&lt;/expavg_credit&gt;
&lt;expavg_time&gt;1112096428.038390&lt;/expavg_time&gt;
&lt;p_vendor&gt;^B&lt;/p_vendor&gt;
&lt;p_model&gt;Pentium&lt;/p_model&gt;
...

Notice the ^B inside the p_vendor tags. This is actually an undisplayable control character (the hex value of the byte is 02) and it causes any standards-compliant XML parser to stop parsing and throw an error. I emailed tolu about it last week. On Tuesday db_dump was upgraded to the latest version; however, this still didn't fix the problem. Now it would seem that db_dump is turned off completely. I'm pretty sure all that is needed is a one-line SQL statement to change the p_vendor field of host number 82994 to something reasonable like 'Intel' (since it is a Pentium...). I replied to tolu on Tuesday night but haven't heard back. Someone care to poke them with a pointy stick? :)
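
The parser behaviour described here is easy to reproduce. The sketch below uses Python's standard xml.etree.ElementTree as a stand-in for "any standards-compliant parser"; the two sample records are trimmed-down versions of the excerpt above, and the helper name is made up for the example.

```python
import xml.etree.ElementTree as ET


def parses_cleanly(xml_text):
    """Return True if xml_text is well-formed, False if the parser rejects it."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# Same record twice: once with a sane vendor string, once with the 0x02 byte.
good = "<host><id>82994</id><p_vendor>Intel</p_vendor></host>"
bad = "<host><id>82994</id><p_vendor>\x02</p_vendor></host>"

print(parses_cleanly(good))
print(parses_cleanly(bad))
```

Because the parser gives up at the first illegal character, one bad field in one host record is enough to make the whole file unusable.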
13) Message boards : Number crunching : XML stats problems (Message 12649)
Posted 18 May 2005 by old_user909
Post:
XML appears to be back but there is some oddity in the hosts file that was causing my parser to crash. I thought we had fixed the XML escaping problems a few months ago but there seems to be a new one. Unless this is just a fluke with all the problems...

As for the other projects... seti apparently filled up the drive that was holding the XML (they keep a daily copy for archival purposes). This caused all kinds of strange behavior that produced extremely bad files which caused some stats sites to nuke their seti tables. Mine just saw problems and aborted the update, leaving old (but accurate) data in place. LHC is working but we ran them out of work units so very few new credits are being handed out right now. I just noticed a few minutes ago that einstein is also overdue for an XML update. No "problems" with it... just nothing new published in 38 hours. So that leaves predictor and pirates that are still up and running with XML as they should be. As far as I can tell there isn't any BOINC-wide problem but 3/6 active projects just happened to develop problems at the same time. Seems a little fishy but I see no link between them right now... maybe I need new glasses :)
14) Message boards : Number crunching : XML stats problems (Message 12412)
Posted 8 May 2005 by old_user909
Post:
Not sure if anyone is aware of this but the XML stats for CPDN seem to be broken. Several of the files have nothing in them. My XML loader hasn't been able to update in 36 hours now.

FYI :)
15) Message boards : Number crunching : Why can't my pc crunch cpdn wus? (Message 7232)
Posted 10 Jan 2005 by old_user909
Post:
Then the question is what is overheating? It isn't necessarily your CPU. It could be one of the chipset chips on the motherboard or your RAM. I was having similar problems to yours. I could run seti@home and LHC all I wanted with no ill effects but as soon as I started CPDN I had to severely underclock the system to keep it stable. In my case it seems to have turned out to be a bad stick of RAM. Both memtest86 and prime95 gave me clean bills of health but one day the stick of RAM finally gave up the ghost and refused to work at all. Since taking it out, the system has become more stable and a few other small quirks have gone away. Now if I can just get newegg to take back the RAM...
16) Message boards : Number crunching : large work unit? (Message 6253)
Posted 20 Nov 2004 by old_user909
Post:
Might be helpful if you didn't hide your computers as we could look at the numbers ourselves... However your processing times don't surprise me in the least. Hyperthreading does NOT give you the processing power of 2 CPUs. Depending on what type of project you are running, it will yield different results. Projects that hit the main system RAM a lot will take more of a performance hit on HT processors than projects that can fit a good chunk of their data in the L2 cache. Since CPDN takes 50 MB of RAM, I would suspect that it would fall under the former category. Furthermore even dual CPU setups don't give you 2x the performance for many of the same reasons. The memory bus often turns into a bottleneck.

If you were running them 1 by 1, 4 work units would take you 2,400 hours. Completing all 4 of them in 1,600 on a dual HT machine seems very reasonable to me.
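
The arithmetic behind that comparison, spelled out as a small sketch. The 600 hours per work unit is simply the 2,400 hours divided by 4 from the figures above; the function name is made up for the example.

```python
def throughput_speedup(units, hours_per_unit, wall_hours):
    """Speedup of running `units` work units concurrently versus one at a time."""
    serial_hours = units * hours_per_unit
    return serial_hours / wall_hours


# 4 work units at 600 hours each, finished together in 1,600 hours.
speedup = throughput_speedup(units=4, hours_per_unit=600, wall_hours=1600)
print(speedup)  # 1.5
```

So four logical processors deliver about 1.5 times the throughput of running the models one by one, well short of 4 times, which is consistent with hyperthreading and a shared memory bus limiting the gain.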
17) Message boards : Number crunching : outta here (Message 6244)
Posted 19 Nov 2004 by old_user909
Post:
Yeah... the CPDN client definitely does something that most other DC projects don't and that my CPU doesn't like at all. My system withstood over 10 hours of prime95's torture test with no errors but locked up hard within a couple minutes of starting CPDN. I was able to finish my last model by underclocking my system to 200 MHz below stock speed. I think I might try a model on my other good box and see how it turns out there. Anyone have any clue as to what CPDN is doing that might cause this? Maybe if you take out the function call to crash_cpu()? :)
18) Message boards : Number crunching : Stats site moving & note to cross project teams (Message 6041)
Posted 11 Nov 2004 by old_user909
Post:
My stats site is finally moving off of my residential cable connection! No more outages when Cox goes down (happens regularly). I have also just put some user history graphs online. The new URL is <a href="http://stats.kwsn.net/">http://stats.kwsn.net/</a>. Be sure to check out the new features link. My home server (at <a href="http://macg.no-ip.info:5520/boinc/">http://macg.no-ip.info:5520/boinc/</a>) will remain online but it will turn into a development server so the stats may not be accurate or up to date. Enjoy!

Also a note to cross-project teams: If you want your team stats to be combined, your team name must be <b>EXACTLY</b> the same on all projects! This goes for all the stats sites last time I checked. There is no CPID for teams as there is for users so the only thing we have to go on is a string comparison on the team name. Not to pick on anyone in particular... but Ars Technica should make note of this :)
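
To show why the names have to match exactly, here is a minimal sketch of combining team credit across projects using nothing but the team name as the key. The data layout, team names, and function name are invented for illustration; real stats sites may store things differently.

```python
from collections import defaultdict


def combine_team_credit(project_teams):
    """Sum credit per team across projects, keyed on the exact team name.

    project_teams maps a project name to a list of (team_name, credit) pairs.
    """
    totals = defaultdict(float)
    for teams in project_teams.values():
        for name, credit in teams:
            # Plain string equality: "Team A" and "team a" are different keys.
            totals[name] += credit
    return dict(totals)


# Hypothetical example data.
example = {
    "cpdn": [("Team A", 100.0)],
    "seti": [("Team A", 200.0)],
    "lhc": [("team a", 50.0)],
}
combined = combine_team_credit(example)
```

In this example the first two entries merge into one total while the differently capitalised one ends up as a separate team, which is exactly the situation the note warns about.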
<br>
----------------------------
A member of <a href="team_display.php?teamid=45">The Knights Who Say Ni!</a>
And one more link:
<a href="http://stats.kwsn.net/">My BOINC stats site</a>
19) Message boards : Number crunching : Stats a bit haywire after model upload... (Message 5767)
Posted 29 Oct 2004 by old_user909
Post:
Hmm. I didn't think about your tinkering possibly causing this... I'm certain that the model that got uploaded by my team mate was downloaded before your fix so that could still explain it. Don't know about others. The ones linked to in this thread are a little of both. Guess you have more data than us so we leave it in your capable hands :)
<br>
----------------------------
A member of <a href="team_display.php?teamid=45">The Knights Who Say Ni!</a>
Yet another stats page: <a href="http://boinc-kwsn.no-ip.info">http://boinc-kwsn.no-ip.info</a>
20) Message boards : Number crunching : Stats a bit haywire after model upload... (Message 5664)
Posted 26 Oct 2004 by old_user909
Post:
yeah... something still isn't right IMHO. Look at <a href="http://macg.no-ip.info:5520/boinc/team_hist_graphs.php?proj=cpdn&amp;teamid=45&amp;pmode=r">this graph</a> on my stats site. You can clearly see several big jumps when people return work units. If the credit that is granted along the way doesn't match up to the final claimed credit, I could see some adjustment going on when you finish but this seems a bit extreme.
<br>
----------------------------
A member of <a href="team_display.php?teamid=45">The Knights Who Say Ni!</a>
Yet another stats page: <a href="http://boinc-kwsn.no-ip.info">http://boinc-kwsn.no-ip.info</a>



©2024 climateprediction.net