Benchmarking with Mpts thread
Stephen Brooks
2004-09-17 01:21:58
Results so far, in a table:

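| User | Processor | MHz | kpts/s | pts/MClock = (kpts/s)/GHz | Note |
|---|---|---|---|---|---|
| Stephen Brooks | P-II | 400 | 16.75 | 41.9 | |
| [OCAU] badger | K6 | 400 | 16.6 | 41.5 | |
| [DPC]TeamNWW - Huub | P4 | 2400 | 156 | 65.0 | |
| [DPC]TeamNWW - Huub | P4 | 1800 | 66.42 | 36.9 | (stressed) |
| [DPC]TeamNWW - Huub | P-III | 866 | 41.86 | 48.3 | |
| Xanathorn | Athlon | 1000 | 80 | 80.0 | |
| MaFi | Athlon | 2000 | 165 | 82.5 | |
| odessit | 2xAthlon | 3600 | 240 | 66.7 | |
| John Kitchen | 2xAthlon | 3470 | 227 | 65.4 | |
| Mike Malis | Athlon | 1670 | 110.09 | 65.9 | |
| K`Tetch | P-II | 333 | 14 | 42.0 | |
| K`Tetch | 2xP-III | 1100 | 32 | 29.1 | (stressed) |
| [OCAU] badger | Athlon | 2237 | 259 | 115.8 | |
| excaliber[Free-DC.org] | Athlon | 1670 | 181.47 | 108.7 | |
| Matai | P4 | 2300 | 130.09 | 56.6 | |
| spldart | ? | ? | 404 | | |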


Note that for dual machines I've added up the clockrates of the two processors.  It seems like the couple of Athlons that went over 100 pts/MC were either overclocked or were some later revision of the Athlon chip, perhaps with a higher FSB.

The current conversion rate that is being used to calculate the estimated-total-GHz figures on the graphs is 95 pts/MC, corresponding to a fairly good Athlon.
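
As a rough illustration of the two figures above (a sketch only, not code from the Muon1 client; the function names are made up for this example): pts/MClock is the kpts/s figure divided by the summed clock in GHz, and an estimated-GHz figure is a throughput divided by the 95 pts/MC conversion rate.

```python
# Illustrative helpers; the names are hypothetical and not part of Muon1.

def pts_per_mclock(kpts_per_s, mhz_total):
    """Particle-timesteps per million clock cycles = (kpts/s) / GHz."""
    return kpts_per_s / (mhz_total / 1000.0)

def estimated_ghz(kpts_per_s, conversion=95.0):
    """Express a throughput as equivalent GHz at the given pts/MC rate."""
    return kpts_per_s / conversion

print(round(pts_per_mclock(16.75, 400), 1))  # Stephen Brooks' P-II row
print(round(pts_per_mclock(259, 2237), 1))   # [OCAU] badger's Athlon row
print(round(estimated_ghz(404), 2))          # spldart's 404 kpts/s in "fairly good Athlon" GHz
```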
spldart
2004-09-17 06:21:18
quote:
Originally posted by Stephen Brooks:
spldart, can you tell me what processor you were using to get the above 404 kpts/s? 


Two AMD Barton core Mobile processors modded for smp operation.  They are clocking 16x144 for 2304mhz apiece. 
I believe there is one more thing I can try for a tad more kpts.
Stephen Brooks
2004-09-17 07:33:56
Added a couple more columns, as I think the chip core and the FSB might have a big effect on performance too.  Can anyone help fill in the blanks?




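| User | Processor | Core | FSB (base) | MHz (summed over cores) | kpts/s | pts/MClock = (kpts/s)/GHz | Note |
|---|---|---|---|---|---|---|---|
| Stephen Brooks | P-II | Deschutes | 100 | 400 | 16.75 | 41.9 | |
| [OCAU] badger | K6 | | | 400 | 16.6 | 41.5 | |
| [DPC]TeamNWW - Huub | P4 | | | 2400 | 156 | 65.0 | |
| [DPC]TeamNWW - Huub | P4 | | | 1800 | 66.42 | 36.9 | (stressed) |
| [DPC]TeamNWW - Huub | P-III | | | 866 | 41.86 | 48.3 | |
| Xanathorn | Athlon | | | 1000 | 80 | 80.0 | |
| MaFi | Athlon | XP? | 133 | 2000 | 165 | 82.5 | |
| odessit | 2xAthlon | | | 3600 | 240 | 66.7 | |
| John Kitchen | 2xAthlon | MP? | | 3470 | 227 | 65.4 | |
| Mike Malis | Athlon | | | 1670 | 110.09 | 65.9 | |
| K`Tetch | P-II | | | 333 | 14 | 42.0 | |
| K`Tetch | 2xP-III | | | 1100 | 32 | 29.1 | (stressed) |
| [OCAU] badger | Athlon | Barton | 179 | 2237 | 259 | 115.8 | |
| excaliber[Free-DC.org] | Athlon | XP? | | 1670 | 181.47 | 108.7 | |
| Matai | P4 | | | 2300 | 130.09 | 56.6 | |
| spldart | 2xAthlon | Barton | 144 | 4608 | 404 | 87.7 | |
| [TA]z | P4 | Prescott | 200 | 3400 | 273.38 | 80.4 | |
| [TA]z | P4 | Northwood | 200 | 3000 | 276.13 | 92.0 | |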
[TA]z
2004-09-17 07:34:22
minor edit:

P4 550 - 3.4GHz, 1MB L2 Cache
Core: Prescott
FSB: 800MHz / 200base
273.38 kpts/s
Stephen Brooks
2004-09-17 07:37:14
I'm guessing that's a FSB200 (iFSB800) Northwood "D".

Thanks.  Well it looks from that as if the Prescott core isn't all that bad at Muon1 after all.  The FSB probably helps a lot, too.

That's a point: the new P4s include hyperthreading, so are you running Muon1 on 2 threads?  That gave me about a 20% performance boost on my 2xXeon system.
[TA]z
2004-09-17 08:47:41
Though it does seem Northwood outperforms Prescott by a reasonable amount.  Here is another benchmark:

P4, Northwood, 200, 3000, 276.13, 92.0
Stephen Brooks
2004-09-17 09:00:37
Hmm so it does... I thought initially that your 3400 system might have been the max-clocking Northwood, which from what it looks like now, might be the best Intel desktop processor for Muon1 for some time.  Scaling linearly, accounting for the Prescott nerfing factor of 92.0/80.4 = 1.144, their new chips would have to reach >3890MHz before surpassing that, i.e. the "580" 4GHz chip, which is about 6 months away.
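
The break-even estimate above can be checked with a few lines. This is only a sketch under the post's own assumptions: throughput scales linearly with clock, and 3400 MHz is taken as the top Northwood clock. The variable names are illustrative.

```python
# pts/MClock figures from [TA]z's two benchmarks.
northwood_pts_mc = 92.0    # 3000 MHz Northwood
prescott_pts_mc = 80.4     # 3400 MHz Prescott
top_northwood_mhz = 3400   # assumed fastest Northwood, as in the post

# How much more clock a Prescott needs to match a Northwood.
factor = northwood_pts_mc / prescott_pts_mc
break_even_mhz = top_northwood_mhz * factor

print(f"factor = {factor:.3f}")
print(f"Prescott break-even clock = {break_even_mhz:.1f} MHz")
```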
[TA]z
2004-09-17 09:09:47
I had to throw that 3400 together in a hurry recently to relieve another system.  Had I paid a little more attention, I would have spent the extra $30 for the Northwood... *sigh*
kitsura
2004-09-17 13:47:18
AMDs would theoretically perform better than Intel CPUs for DPAD because they have higher IPC.  But yes, I guess bus speed also plays a very important role.  Is no one running an A64, FX or Opteron to benchmark?
spldart
2004-09-17 16:54:49
Just got some promising results :D
Hope to have something to post tomorrow.
spldart
2004-09-18 05:48:02
Uptime (secs),Mpts in file,Estimate kpts/sec
1461,3849177.1,0.00
2665,3849641.8,386.06
3568,3850085.1,431.05
4771,3850525.9,407.47
5674,3850965.4,424.48
6878,3851452.6,420.09
7781,3851857.4,424.14
8984,3852320.9,417.88
10188,3852758.4,410.37
11091,3853194.8,417.22
11994,3853590.0,418.98
13197,3854019.9,412.64
14100,3854475.5,419.21
15304,3854910.9,414.21
16207,3855316.2,416.34
17410,3855757.2,412.57
18313,3856223.1,418.11
19517,3856678.2,415.44
20420,3857065.5,416.09
21624,3857526.4,414.10
21924,3857672.6,415.16
22827,3858108.4,418.01
24031,3858530.7,414.43
25235,3858995.6,413.00
26138,3859460.6,416.73
27040,3859772.2,414.21
27642,3860049.3,415.27
28846,3860512.9,413.94
29749,3860957.9,416.46
30953,3861418.6,415.09


Wooot!
Same CPU clock.  I turned off some of the excessive background apps I have loaded on the machine and made a tweak to the chipset.
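
The "Estimate kpts/sec" column in the log above looks like a running average measured from the first row: Mpts gained, divided by seconds elapsed, times 1000. That is an inference from the numbers rather than something stated in the thread. A minimal sketch that recomputes it for the first few rows (the function name is made up here); the results agree with the logged values to within about 0.1 kpts/s, presumably because the logged Mpts are rounded.

```python
import csv
import io

# First few rows of the log above.
LOG = """Uptime (secs),Mpts in file,Estimate kpts/sec
1461,3849177.1,0.00
2665,3849641.8,386.06
3568,3850085.1,431.05
4771,3850525.9,407.47
"""

def cumulative_kpts(log_text):
    """Recompute kpts/s for each row relative to the first row (assumed method)."""
    rows = [r for r in csv.reader(io.StringIO(log_text)) if r][1:]  # skip header
    t0, m0 = float(rows[0][0]), float(rows[0][1])
    return [(float(m) - m0) * 1000.0 / (float(t) - t0) for t, m, _ in rows[1:]]

for logged, mine in zip([386.06, 431.05, 407.47], cumulative_kpts(LOG)):
    print(f"logged {logged:.2f}  recomputed {mine:.2f}")
```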
[OCAU] badger
2004-09-18 21:44:36
I should note that many of the benchies at the beginning of the thread would have been done with v4.3x which is much faster than v4.4x.

see thread here
spldart
2004-09-19 06:42:53
I didn't know that :/
I'm still running the 4.4 I was a month ago.  I'm a n00b :(
Ben Whitten
2004-09-25 11:34:21
OK, I feel like a fool: I looked at the previous posts and saw that I needed auto-downloading off.  I've restarted and will get some new results soonish, lol.  Thank you anyhow, bbl.
K`Tetch
2004-09-29 11:16:19
Well, it might be an idea to include the OS.

I switched my dual P3-550 from 2K to XP, and what a difference.  Instead of 32 kpts/s it's around 61.

Mainly that's due to a significant reduction in 'kernel time'.
I'm wondering if that's what the (stressed) note you put on it means, Stephen.
Stephen Brooks
2004-09-30 01:34:48
I put it there because you said you were doing CAD and various things during that benchmark.
Deleenheir
2004-10-11 02:01:39
These are my results... The lower results are from the moments I was using the computer, but it looks like when the comp is idle I get a good 200 kpts/s.

Uptime (secs),Mpts in file,Estimate kpts/sec
94180,1585504.4,0.00
96282,1585959.6,216.58
98984,1586453.5,197.56
101086,1586898.9,201.93
101686,1587034.1,203.79
103788,1587495.3,207.21
104089,1587533.6,204.80
106491,1588028.6,205.05
108592,1588477.3,206.28
110994,1588923.5,203.34
113096,1589369.7,204.34
113397,1589436.0,204.60
115799,1589927.4,204.59
118201,1590420.7,204.67
120603,1590912.0,204.66
122705,1591358.7,205.24
125107,1591801.7,203.62
Uptime (secs),Mpts in file,Estimate kpts/sec
Uptime (secs),Mpts in file,Estimate kpts/sec
131320,1592714.7,0.00
133722,1593202.5,203.07
136124,1593698.1,204.70
138526,1594199.7,206.07
141530,1594652.6,189.81
144533,1595101.0,180.60
Uptime (secs),Mpts in file,Estimate kpts/sec
2404,1597757.4,0.00
4806,1598204.1,185.96
6308,1598550.0,203.05
8410,1598988.3,204.97
10812,1599435.2,199.57
12914,1599910.9,204.92
15316,1600407.3,205.24
17718,1600903.1,205.42
20120,1601350.9,202.85
22822,1601848.3,200.36
25224,1602343.8,200.98
27326,1602791.2,201.98
29428,1603199.1,201.37
31530,1603648.2,202.26
33632,1604099.6,203.10
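
Because the logged estimate appears to be a running average, short busy periods are smoothed out. One way to make them visible is to compute the rate between consecutive samples instead. The sketch below does that for the second run in the log above; the function name is illustrative and the approach is a suggestion, not something the benchmark utility is known to do.

```python
# (uptime in seconds, Mpts in file) for the second run in the log above.
samples = [
    (131320, 1592714.7),
    (133722, 1593202.5),
    (136124, 1593698.1),
    (138526, 1594199.7),
    (141530, 1594652.6),
    (144533, 1595101.0),
]

def interval_kpts(points):
    """kpts/s between each pair of consecutive samples."""
    return [
        (m1 - m0) * 1000.0 / (t1 - t0)
        for (t0, m0), (t1, m1) in zip(points, points[1:])
    ]

for rate in interval_kpts(samples):
    print(f"{rate:.1f} kpts/s")
```

The last two intervals come out well below the earlier ones, which is consistent with the machine being in use at that time.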
Deleenheir
2004-10-11 02:04:05
Oh yeah, I'm running it on an Athlon XP 2000+ @ 1922 MHz (167*11.5)
[DPC] LittleB
2004-10-14 11:21:13
Hi,

AMD Athlon 64 3000+ @ 2000 MHz, 200 MHz front-side bus (normal clock speed) and 1 GB memory

Results:
33654,682736.7,248.08
35455,683187.9,249.33
37256,683638.9,249.71

For your list.
UnderTitan
2004-12-25 02:02:31
Is anyone out there using a Sempron, or does anyone have DPAD benchmark info for a Sempron processor?  The reason I ask is that I'm thinking about buying a couple of cheapo 1Us for the project, and can get a decent deal.  Take a look here if you're curious: http://cheap1u.com/store/product_info.php?products_id=28
Stephen Brooks
2004-12-26 07:49:41
If I'm right in thinking that Semprons are just the normal Athlon XP and K8 cores with less cache and 64-bit disabled, then provided Muon1 isn't near some sort of cache usage boundary at 256/512KB (which it probably isn't), they should run nearly as well as the standard AMD chips, which run it very well.  Perhaps you'll get some O/C out of those too.
UnderTitan
2004-12-26 12:49:41
Thanks for responding Stephen.

I don't know anything about how programs use the cache on a processor.  These computers are going to be solely for crunching muons, so I'd expect that the cache would matter less, since there won't be 50 processes asking for things to be done, know what I mean?

The L1 cache is 128 KB and the L2 cache is 256 KB, if I remember correctly.

Do you know how much Muon1 relies on the cache?  Anyone?
[SG]Santas little helper
2005-01-04 21:24:36
P4 CPU: 2.67 GHz @ 2.7 GHz, FSB 533 MHz, RAM: 512 MB @ 333 MHz,
Client: 4.41f (ChicaneLinacB90)
Uptime (secs),Mpts in file,Estimate kpts/sec
18893,1193758.3,0.00
22200,1194304.2,165.08
25507,1194799.7,157.46
28814,1195350.3,160.48
32120,1195895.2,161.55
35728,1196449.1,159.84
38734,1196923.2,159.52
42041,1197475.6,160.59
45348,1197971.1,159.25
48354,1198490.1,160.61
48654,1198514.5,159.81
Jumping[Romulus2]
2005-01-28 02:05:38
I did some benchmarking on my new Athlon 64 3500+ (Winchester core), no overclock.

These are the results:

uptime+0 Mpts+0.0 No estimate so far
uptime+1805 Mpts+439.1 Estimate 243.22 kpts/sec
uptime+3911 Mpts+962.7 Estimate 246.12 kpts/sec
uptime+6017 Mpts+1486.1 Estimate 246.95 kpts/sec
uptime+6920 Mpts+1714.8 Estimate 247.78 kpts/sec
uptime+9328 Mpts+2237.5 Estimate 239.86 kpts/sec
uptime+11133 Mpts+2715.3 Estimate 243.87 kpts/sec
uptime+13541 Mpts+3239.7 Estimate 239.24 kpts/sec
uptime+15647 Mpts+3766.0 Estimate 240.67 kpts/sec
uptime+17754 Mpts+4246.4 Estimate 239.18 kpts/sec
uptime+19559 Mpts+4727.2 Estimate 241.68 kpts/sec
uptime+21364 Mpts+5215.2 Estimate 244.10 kpts/sec
uptime+23170 Mpts+5648.4 Estimate 243.78 kpts/sec
uptime+25276 Mpts+6162.6 Estimate 243.81 kpts/sec
uptime+27081 Mpts+6606.3 Estimate 243.94 kpts/sec
uptime+28886 Mpts+7128.9 Estimate 246.79 kpts/sec
uptime+30391 Mpts+7508.4 Estimate 247.06 kpts/sec
uptime+32497 Mpts+8029.5 Estimate 247.08 kpts/sec
uptime+34603 Mpts+8552.3 Estimate 247.15 kpts/sec
uptime+36108 Mpts+8965.3 Estimate 248.29 kpts/sec
uptime+38214 Mpts+9488.9 Estimate 248.31 kpts/sec
uptime+38515 Mpts+9513.5 Estimate 247.01 kpts/sec
uptime+40621 Mpts+10036.0 Estimate 247.06 kpts/sec
uptime+42125 Mpts+10480.2 Estimate 248.78 kpts/sec

This machine is used for other things like games, movie playback etc.

One thing I have noticed is that some days it's not that much faster at crunching than my old XP 2600+, but on others it's a bit quicker.

Not sure if it would be possible to either compile it with AMD64 optimizations or make a 64-bit version.

I have got the Windows 64-bit RC1, and it could be interesting to see what the client could do if it was 64-bit as well.

Regards
Jumping
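
The "uptime+... Mpts+..." lines in the post above use a different layout from the CSV-style logs earlier in the thread. A small sketch for pulling the numbers out with a regular expression and rechecking the estimate as Mpts × 1000 / uptime; the pattern and function name are assumptions based only on the lines shown, not a documented format.

```python
import re

# Pattern inferred from the lines in the post; not an official format spec.
LINE_RE = re.compile(r"uptime\+(\d+) Mpts\+([\d.]+) Estimate ([\d.]+) kpts/sec")

def parse_line(line):
    """Return (seconds, mpts, logged_estimate, recomputed_estimate) or None."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None  # e.g. the initial "No estimate so far" line
    secs = int(m.group(1))
    mpts = float(m.group(2))
    logged = float(m.group(3))
    return secs, mpts, logged, mpts * 1000.0 / secs

print(parse_line("uptime+0 Mpts+0.0 No estimate so far"))
print(parse_line("uptime+1805 Mpts+439.1 Estimate 243.22 kpts/sec"))
print(parse_line("uptime+42125 Mpts+10480.2 Estimate 248.78 kpts/sec"))
```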
Thor
2005-02-07 06:49:04
I have some results for an XP 2400+ running W2K and the latest client.

Clock speed is 2000 MHz, bus speed 133 MHz
512MB RAM (PC2700)

Uptime (secs),Mpts in file,Estimate kpts/sec
576,2396097.3,0.00
2678,2396562.5,221.28
4780,2397019.1,219.24
7183,2397483.0,209.73
9285,2397937.2,211.25
11688,2398490.2,215.34
11988,2398501.9,210.70

Probably not enough to be included, but a good indicator...

Btw, has anybody tried how DPAD reacts to different RAM timings?

Greets Thor
Stephen Brooks
2005-02-08 02:05:42
My guess would be that lower latency would have a noticeable effect, since it seems to like low latency/high FSB setups (it probably goes to main RAM a lot).
HaloJones
2005-02-15 23:51:25
uptime+1504 Mpts+396.1 Estimate 263.34 kpts/sec
uptime+3609 Mpts+943.2 Estimate 261.28 kpts/sec
uptime+5715 Mpts+1495.8 Estimate 261.70 kpts/sec
uptime+7520 Mpts+1970.1 Estimate 261.96 kpts/sec
uptime+9025 Mpts+2338.8 Estimate 259.14 kpts/sec
uptime+10529 Mpts+2801.1 Estimate 266.03 kpts/sec
uptime+12334 Mpts+3261.6 Estimate 264.43 kpts/sec
uptime+14139 Mpts+3724.7 Estimate 263.43 kpts/sec
uptime+16245 Mpts+4276.0 Estimate 263.22 kpts/sec
uptime+18050 Mpts+4737.5 Estimate 262.46 kpts/sec
uptime+20155 Mpts+5289.9 Estimate 262.45 kpts/sec
uptime+21960 Mpts+5750.7 Estimate 261.86 kpts/sec
uptime+23765 Mpts+6215.1 Estimate 261.51 kpts/sec
uptime+25871 Mpts+6773.5 Estimate 261.81 kpts/sec

Athlon XP1700+ (T-Bred-B) at 2400MHz, 12x200
[TA]z
2005-08-05 20:09:54
A short test on a 3800+ X2 (Manchester) @ 2.0 GHz is pushing 455+ kpts/sec.  Mem timings are loose at 3-3-3-8 right now, let's see where Muon1 will go with tighter timings ;)
HaloJones
2005-08-06 03:07:07
quote:
Originally posted by [TA]z:
A short test on a 3800+ X2 (Manchester) @ 2.0 GHz is pushing 455+ kpts/sec.  Mem timings are loose at 3-3-3-8 right now, let's see where Muon1 will go with tighter timings ;)


Is it showing as using both cores?
Stephen Brooks
2005-08-08 02:06:11
Just looking at the previous posts we had
AthlonXP 2GHz ~ 210 kpts/s
Athlon64 (Winchester) 2.2GHz ~ 245 kpts/s
Athlon64 (3000+) @ 2GHz ~ 245 kpts/s again (dunno why - maybe used less/more RAM?)
and...
"Two AMD Barton core Mobile processors modded for smp operation.  16x144 for 2304mhz" ~ 415 kpts/s

So that 3800+ is doing well for its speed.  If we had _perfect_ scaling by doubling the best 2GHz A64 result we'd get 2*245 = 490, so 455 is not far off.
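
The scaling comparison above works out as follows. This is just the post's arithmetic written out, with illustrative variable names.

```python
single_core_kpts = 245.0   # best 2 GHz Athlon64 result quoted above
cores = 2
measured_kpts = 455.0      # [TA]z's 3800+ X2 at 2.0 GHz

ideal_kpts = cores * single_core_kpts
efficiency = measured_kpts / ideal_kpts

print(f"ideal = {ideal_kpts:.0f} kpts/s, efficiency = {efficiency:.1%}")
```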

The summary so far:



























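| User | Processor | Core | FSB (base) | MHz (summed over cores) | kpts/s | pts/MClock = (kpts/s)/GHz | Note |
|---|---|---|---|---|---|---|---|
| Stephen Brooks | P-II | Deschutes | 100 | 400 | 16.75 | 41.9 | |
| [OCAU] badger | K6 | | | 400 | 16.6 | 41.5 | |
| [DPC]TeamNWW - Huub | P4 | | | 2400 | 156 | 65.0 | |
| [DPC]TeamNWW - Huub | P4 | | | 1800 | 66.42 | 36.9 | (stressed) |
| [DPC]TeamNWW - Huub | P-III | | | 866 | 41.86 | 48.3 | |
| Xanathorn | Athlon | | | 1000 | 80 | 80.0 | |
| MaFi | Athlon | XP? | 133 | 2000 | 165 | 82.5 | |
| odessit | 2xAthlon | | | 3600 | 240 | 66.7 | |
| John Kitchen | 2xAthlon | MP? | | 3470 | 227 | 65.4 | |
| Mike Malis | Athlon | | | 1670 | 110.09 | 65.9 | |
| K`Tetch | P-II | | | 333 | 14 | 42.0 | |
| K`Tetch | 2xP-III | | | 1100 | 32 | 29.1 | (stressed) |
| [OCAU] badger | Athlon | Barton | 179 | 2237 | 259 | 115.8 | |
| excaliber[Free-DC.org] | Athlon | XP? | | 1670 | 181.47 | 108.7 | |
| Matai | P4 | | | 2300 | 130.09 | 56.6 | |
| spldart | 2xAthlon | Barton | 144 | 4608 | 415 | 90.1 | |
| [TA]z | P4 | Prescott | 200 | 3400 | 273.38 | 80.4 | |
| [TA]z | P4 | Northwood | 200 | 3000 | 276.13 | 92.0 | |
| Deleenheir | Athlon | XP 2000+ | 167 | 1922 | 203 | 105.6 | |
| [DPC] LittleB | Athlon64 | 3000+ | 200 | 2000 | 249 | 124.5 | |
| [SG]Santa's little helper | P4 | 2.67GHz | 133 | 2700 | 160 | 59.3 | |
| Jumping[Romulus2] | Athlon64 | Winchester | 200 | 2200 | 247 | 112.3 | |
| Thor | Athlon | XP 2400+ | 133 | 2000 | 215 | 107.5 | |
| HaloJones | Athlon | Thoroughbred B (XP 1700+) | 200 | 2400 | 262 | 109.2 | |
| [TA]z | Athlon64x2 | Manchester | 200 | 4000 | 455 | 113.8 | (3-3-3-8) |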
Stephen Brooks
2005-08-08 02:50:42
Hmm dunno why that's showing with a big gap above it.

Edit: see previous page!

Edit: actually, see following graph.



I should probably explain that all the entries from "[TA]?" were imported from another benchmarking thread we had on the AnandTech forums a while ago.
Stephen Brooks
2005-08-09 05:46:35
The stats server makes a rubbish Muon1 machine, it only gets 240kpts/s even though it's dual Xeon 2.8GHz, because of the samplefile script etc. that are always running.  I don't think I'll add it to the chart.
K`Tetch
2005-08-10 07:31:41
Well, a non-stressed benchmark for this client (4.42b) comes out at around 65kpts/s for the dual p3 machine.

It fluctuates a lot, obviously, depending on whether I have torrents going, or there's a netsplit on EFnet.

(you can usually find me in #distributed on EFnet)
Stephen Brooks
2005-08-10 07:40:16
I might do a benchmark on my A64 3200+ at home too.

Incidentally, there's now a paragraph on the FAQ page on how to use the benchmark thingy.
Stephen Brooks
2005-08-15 04:10:05
Benchmark now done, but numbers still at home... will edit this when back there.

OK, just got the numbers: a stable 243kpts/s for my A64 3200+ Winchester (9*FSB235 = 2115MHz).

[edit] And how does a netsplit on EFnet make Muon1 go slower?!
K`Tetch
2005-08-15 08:15:09
quote:
Originally posted by Stephen Brooks:
[edit] And how does a netsplit on EFnet make Muon1 go slower?!


Oh, that's easy.  I've got 2 clients on there, one seeing some 2000 clients, and the other only in one 600-client channel.  When a split happens, on average half will leave.  The client will only handle about 20 quit messages a second, and it uses up a lot of CPU time writing some 1200 of them, of the format

[03:34.16] *** Bucci^_^ (~soij@ZYYKMMDCCXXXVII.dsl.saunalahti.fi) Quit (hub.se irc.du.se)
[03:34.16] *** crazyeyes (crazyeyes@81-232-166-42-no30.tbcn.telia.com) Quit (hub.se irc.du.se)
Stephen Brooks
2005-08-15 09:06:45
Pretty slow for just printing text though?
[TA]z
2005-08-15 12:57:12
Well, I'm now trying Windows Pro x64 on the same X2 system @ 2-3-2-6 timings, and the kpts/sec has dropped by ~15 (I'm going to blame x64 Edition for this one).
HaloJones
2005-08-15 14:00:52
Whenever I try the benchmark utility it eats 100% of the cpu and Muon can't do anything!
Stephen Brooks
2005-08-16 01:35:58
Either that's a bug/infinite loop, or you just have a very large results.dat and have to let the benchmarker read all the way through it.  It may be advisable to try it with an install of Muon1 with a smaller results.dat.
[OCAU] badger
2005-08-16 18:35:11
I just ran muon bench again on my P4 3GHz hyperthreaded box.

I run it with 2 instances and one thread per instance, as previous checking showed this gave about a 10% improvement over running one instance with 2 threads (I haven't checked this on later versions though).

Currently getting 131 kpts/s for each instance, or 262 kpts/s for the whole machine.


For the sake of scientific accuracy, I should emphasize that different versions of Muon1 produce Mpts at different speeds.  This means the above graphs may be misleading due to the different versions used.  Notably, the change from 4.34 to 4.41 reduced output by about 30% (on my Barton anyway).
Unfortunately I haven't kept my records of this box's output using previous versions (currently using 4.42b).  I'll check the 2 instances thing too.
[TA]z
2005-08-17 09:07:33
True, the benches changed significantly after Muon1 underwent the 4.4x revisions.  It may be of some use if we could convince Stephen to compile a slightly modified version of Muon1 for benchmark purposes only.  Something simple like forcing a given set of simulations, hopefully yielding a *more exact* benchmark in less time than it takes to get a decent number now ;)
UnderTitan
2005-08-17 09:47:36
That's a great idea.  How about it Stephen?  Would it be too much of a hassle to create a standalone program to benchmark the muon crunching ability?  Maybe something similar to SuperPi?  It could probably just run from the command line....

Your thoughts Stephen?
Stephen Brooks
2005-08-17 10:58:52
I think this would be best implemented by a -bench commandline switch that activates a built-in simulation.  I'll add it to the to-do list.

That would become a partially "synthetic" benchmark though, because it won't reflect exactly what the real project is running.
Stephen Brooks
2005-08-17 12:48:04
Here is another graph excluding the "old" AnandTech results and any of ours that were "stressed".
[OCAU] badger
2005-08-17 17:27:01
I ran my P4 last night with one instance and 2 threads, and got 267 kpts/sec, so it seems it is no longer more efficient to run 2 instances.
This is interesting, as with 2 instances the CPU is at 100% constantly, while one instance with 2 threads sits at 97-98% yet produces about 2% more results.

I'll rebench my other machines and post the results here.
K`Tetch
2005-08-18 06:33:25
quote:
Originally posted by Stephen Brooks:
That would become a partially "synthetic" benchmark though, because it won't reflect exactly what the real project is running. 


All benchmarks are inherently 'unreal' in that sense, but as long as they're similarly unreal it'll be fine.  Lots of short results, a few long results, same Mpts, different times, different clients, different speeds.
UnderTitan
2005-08-18 17:00:24
If we're trying to compare systems, the benchmark should naturally run the same exact calculation each time.  The results would be valid, and I don't think it would necessarily be synthetic, given that we're gauging the processor's ability to produce a measured result in a given time.  If there is a known outcome, we will get useful results by knowing the time it takes to complete the task, like in SuperPi.  You select the number of digits to compute and sit back until it finishes.

Anyone have an opinion on that?  My vote is for the same calculation to be run each time, and to go for a timed result as a benchmark.

Could even get fancy and calculate the "particle timesteps per million CPU cycles" as described in Stephen's first posting of this thread, in addition to the time taken to complete.

I would love the benchmark to work like described here.  Any input?
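
To make the proposal above concrete, here is a toy sketch of a fixed-workload, timed benchmark. It is not Muon1's simulation and not the -bench switch discussed above: the workload is an invented stand-in (particles under a linear restoring force), all names are hypothetical, and `clock_ghz` has to be supplied by whoever runs it. Numbers from Python would not be comparable with the kpts/s figures in this thread; the point is only the shape of the idea: same calculation every run, a checksum for the known outcome, and time, kpts/s and pts/MClock as outputs.

```python
import time

def fixed_workload(n_particles=1000, n_steps=100, dt=0.01):
    """Deterministic toy simulation; returns (particle_timesteps, checksum)."""
    xs = [i / n_particles for i in range(n_particles)]
    vs = [0.0] * n_particles
    for _ in range(n_steps):
        for i in range(n_particles):
            vs[i] -= xs[i] * dt   # restoring force pulls towards x = 0
            xs[i] += vs[i] * dt
    return n_particles * n_steps, sum(xs)

def run_benchmark(clock_ghz):
    """Time the fixed workload and derive throughput figures."""
    start = time.perf_counter()
    steps, checksum = fixed_workload()
    elapsed = time.perf_counter() - start
    kpts_per_s = steps / elapsed / 1000.0
    return {
        "seconds": elapsed,
        "kpts_per_s": kpts_per_s,
        "pts_per_mclock": kpts_per_s / clock_ghz,
        "checksum": checksum,  # identical on every run if the work is the same
    }

result = run_benchmark(clock_ghz=2.0)  # 2.0 GHz is an example value
print(result)
```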
[TA]z
2005-08-18 18:12:01
My same 3800+ X2 @ 2.6 GHz (216.7 FSB) is now running 585 kpts/sec.
[OCAU] badger
2005-08-18 19:39:36
My Barton 2500+ (now running at 11.5 x 186) produced 237 kpts/sec last night.

I think this method of benchmarking is perfectly valid if sufficient simulations are produced.  I did an experiment earlier to see if running Muon1 with or without a top 100 results file made any difference; there was less than 0.5% difference.