
CPU Optimizations


85 replies to this topic

#76 soniqstylz

soniqstylz

    Neowin Trophy Slore

  • Joined: 30-September 06
  • Location: In your panty drawer

Posted 27 April 2013 - 21:11

Pi is 1.34~


Or 3.14~


#77 YouWhat

YouWhat

    Neowinian

  • Joined: 13-March 03
  • Location: UK
  • OS: Windows 7
  • Phone: iPhone 4s IOS 6.1.2

Posted 27 April 2013 - 21:55

From a few different systems I've got access to...

real 0m30.470s
user 0m30.406s
sys 0m0.024s

Intel® Xeon® CPU X3360 @ 2.83GHz (4 Cores)
8GB Memory


real 0m22.688s
user 0m22.677s
sys 0m0.000s

Intel® Core™ i5-2320 CPU @ 3.00GHz (4 Cores)
16GB Memory


real 1m19.754s
user 1m19.653s
sys 0m0.004s

Intel® Atom™ CPU D2550 @ 1.86GHz (2 Cores)
cache size : 512 KB
4GB Memory



#78 Mindovermaster

Mindovermaster

    Neowinian Senior

  • Tech Issues Solved: 10
  • Joined: 25-January 07
  • Location: /USA/Wisconsin/
  • OS: Mint Debian LMDE
  • Phone: HTC ONE V

Posted 27 April 2013 - 21:56

That was old, mate.

#79 segfault

segfault

    Neowinian

  • Joined: 16-March 03
  • Location: Chile

Posted 28 April 2013 - 05:02

What's the difference between Funtoo and Gentoo?

Quote from funtoo.org
Funtoo Linux features native UTF-8 support enabled by default, a git-based, distributed Portage Tree and funtoo overlay, an enhanced Portage with more compact mini-manifest tree, automated imports of new Gentoo changes every 12 hours, GPT/GUID boot support and streamlined boot configuration, enhanced network configuration, up-to-date stable and current Funtoo stages, all built using Funtoo's Metro build tool. We also offer Ubuntu Server, Debian, RHEL and Fedora-based kernels.

IOW: an optimized "from scratch" Gentoo with a git-based Portage tree; current is based on ~x86 and ~amd64, things like that.

#80 Chris000001

Chris000001

    Neowinian

  • Joined: 25-June 05

Posted 28 April 2013 - 06:46

Linux Mint 14.1 live usb key
Intel® Core™ i7-3770K CPU @ 3.50GHz, 4800 MHz

real 0m13.728s
user 0m13.713s
sys 0m0.000s

#81 MrA

MrA

    b47d2b5288e3c77

  • Joined: 09-November 03
  • Location: Oz.

Posted 28 April 2013 - 07:18

Hi All,

I have been playing around with some Compiler Optimizations in the Linux Kernel. I want to know what your results are if you run this in a terminal.

What's your hardware and what's your kernel version?

time echo "scale=5000; a(1)*4" | bc -l

Care to elaborate on why? You're basically benchmarking bc and how good your compiler is. The kernel (or OS in general) is going to have little to no effect on the result.
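
For context, `a(1)` in `bc -l` is the arctangent of 1, so the command computes 4·arctan(1) = π to 5000 decimal places, which is plain user-space arithmetic. A rough way to check this against timings posted in the thread is to look at how small the `sys` share of CPU time is. The sketch below is only illustrative: `to_seconds` and `sys_share` are hypothetical helper names, not part of `bc` or `time`, and a small `sys` share does not rule out indirect kernel effects such as scheduling.

```sh
# Convert a `time`-style value such as "1m19.754s" into seconds ("79.754").
# (Hypothetical helper, written for this example only.)
to_seconds() {
    echo "$1" | awk -F'm' '{ sub(/s$/, "", $2); printf "%.3f\n", $1 * 60 + $2 }'
}

# Print the percentage of CPU time that was spent in the kernel (sys).
# Usage: sys_share USER_TIME SYS_TIME
sys_share() {
    user=$(to_seconds "$1")
    sys=$(to_seconds "$2")
    awk -v u="$user" -v s="$sys" 'BEGIN {
        total = u + s
        if (total == 0) { print "n/a"; exit }
        printf "%.2f%%\n", 100 * s / total
    }'
}
```

With the Xeon X3360 figures posted earlier (user 0m30.406s, sys 0m0.024s), `sys_share 0m30.406s 0m0.024s` prints `0.08%`.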


Hardware

  • Sun UltraSparc T1 (Niagara) CPU clocked at 1.0 GHz
  • 8 GB of RAM

The T1 is a very interesting CPU architecture (terrible single threaded performance, god awful floating point, but great integer throughput). How'd you get your hands on one?

#82 n_K

n_K

    Neowinian Senior

  • Tech Issues Solved: 3
  • Joined: 19-March 06
  • Location: here.
  • OS: FreeDOS
  • Phone: Nokia 3315

Posted 28 April 2013 - 10:10

Has anyone got an IBM POWER5/6/7 server they could try this on?

#83 OP +ChuckFinley

ChuckFinley

    member_id=28229

  • Joined: 14-May 03

Posted 28 April 2013 - 13:57

Care to elaborate on why? You're basically benchmarking bc and how good your compiler is. The kernel (or OS in general) is going to have little to no effect on the result.



The T1 is a very interesting CPU architecture (terrible single threaded performance, god awful floating point, but great integer throughput). How'd you get your hands on one?


I disagree. I have already shown that compiling the latest 3.9 kernels with CPU-specific optimizations yields better results than the 3.8 kernels. Also, in the early part of this thread the two different kernels were producing DIFFERENT number streams.

#84 Lant

Lant

    Neowinian Senior

  • Joined: 13-April 06

Posted 28 April 2013 - 14:44

I disagree. I have already shown that compiling the latest 3.9 kernels with CPU-specific optimizations yields better results than the 3.8 kernels. Also, in the early part of this thread the two different kernels were producing DIFFERENT number streams.


The problem with this conclusion is that your results come from many different systems each with a different configuration. They may have different kernels, but different hardware or execution environment may be the cause of the difference.
To actually compare these results you would need to run your tests on the same hardware with the same configuration (OS, connected peripherals, etc...) and only vary the kernel version.
Run repeats to get more reliable timings, with both cold and warm starts, and only then could you start to draw any conclusions. At the moment all you can really say is that computers owned by different people produce different results when calculating Pi and take different lengths of time to do so.
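
As a minimal sketch of the "run repeats" idea, the function below runs one command several times on the same machine and prints the running kernel version plus one wall-clock timing per run. `bench_repeat` is a hypothetical name, and the sub-second timing assumes GNU `date` (the `%N` format), which is typical on Linux but not guaranteed elsewhere. The first run is the closest thing to a cold start here; later runs are warm.

```sh
# Run a command several times and print one wall-clock timing per run.
# Usage: bench_repeat RUNS COMMAND [ARGS...]
# Assumes GNU date, which supports %N (nanoseconds).
bench_repeat() {
    runs=$1
    shift
    echo "kernel: $(uname -r)"
    i=1
    while [ "$i" -le "$runs" ]; do
        start=$(date +%s.%N)
        "$@" >/dev/null 2>&1          # discard the command's own output
        end=$(date +%s.%N)
        awk -v n="$i" -v s="$start" -v e="$end" \
            'BEGIN { printf "run %d: %.3f s\n", n, e - s }'
        i=$((i + 1))
    done
}

# Example (not run here because it takes a while):
# bench_repeat 5 sh -c 'echo "scale=5000; a(1)*4" | bc -l'
```

To compare two kernels, the same call would be repeated after rebooting the same machine into the other kernel, with nothing else changed.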

Also, what do you want to measure? If you are just measuring the time a single process takes to produce a result, then your results could be skewed. For example, you may miss an overall decrease in system performance if, say, the different compile optimisations increase the code size and make the task scheduling code too large to fit in the cache. By running a single high-load application you may not hit this issue often enough to observe performance degradation. But in, say, a web server, multiple high-load processes may end up performing worse due to this "optimisation". So you will need to run different tests to see the impact of any optimisation.
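
One simple way to go beyond a single process is to start several copies of the same command at once and time the whole batch. This is only a sketch under assumptions: `bench_parallel` is a hypothetical name, the timing again assumes GNU `date` with `%N`, and identical CPU-bound copies are a crude stand-in for a real mixed workload such as a web server.

```sh
# Start JOBS copies of a command at once and print the total wall-clock time.
# Usage: bench_parallel JOBS COMMAND [ARGS...]
# Assumes GNU date, which supports %N (nanoseconds).
bench_parallel() {
    jobs_wanted=$1
    shift
    start=$(date +%s.%N)
    i=0
    while [ "$i" -lt "$jobs_wanted" ]; do
        "$@" >/dev/null 2>&1 &        # run each copy in the background
        i=$((i + 1))
    done
    wait                              # block until every copy has finished
    end=$(date +%s.%N)
    awk -v j="$jobs_wanted" -v s="$start" -v e="$end" \
        'BEGIN { printf "%d jobs: %.3f s\n", j, e - s }'
}

# Example (not run here because it takes a while):
# bench_parallel 4 sh -c 'echo "scale=5000; a(1)*4" | bc -l'
```

Comparing the batch time for 1, 2, 4, ... jobs on each kernel gives a rough picture of behaviour under load rather than for one isolated process.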

Of course this all depends on the hardware you are running on too, so you will need to run a benchmark on the different hardware with the same configurations you tested the first bit of hardware on.

In short, benchmarking performance is difficult. But, specifying what you mean by performance (latency, average completion time, total completion time, lateness, maximum time taken) will definitely help with benchmarking.
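
To make "average completion time" and "maximum time taken" concrete, here is a small sketch that reads one timing in seconds per line on standard input and prints the count, minimum, mean and maximum. `summarize` is a hypothetical helper name, and which of these figures matters depends on the definition of performance you pick.

```sh
# Read one timing (in seconds) per line and print count, min, mean and max.
summarize() {
    awk 'NR == 1 { min = $1 + 0; max = $1 + 0 }
         {
             v = $1 + 0
             sum += v
             if (v < min) min = v
             if (v > max) max = v
         }
         END {
             if (NR == 0) { print "no data"; exit }
             printf "n=%d min=%.3f mean=%.3f max=%.3f\n", NR, min, sum / NR, max
         }'
}
```

For example, `printf '13.728\n17.286\n' | summarize` prints `n=2 min=13.728 mean=15.507 max=17.286`.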

Various tests used to benchmark the kernel:
http://kernel-perf.s...about_tests.php
http://lbs.sourceforge.net/
https://wiki.archlin...hp/Benchmarking

#85 jjkusaf

jjkusaf

    Deadhead

  • Tech Issues Solved: 1
  • Joined: 19-January 03
  • Location: Prattville, Al
  • OS: Win 7 Pro x64

Posted 28 April 2013 - 15:26

Linux Mint 14.1 live usb key
Intel® Core™ i7-3770K CPU @ 3.50GHz, 4800 MHz

real 0m13.728s
user 0m13.713s
sys 0m0.000s


Same CPU (though running at 3465.49 MHz) and Linux... and running in VirtualBox

real 0m17.286s
user 0m17.125s
sys 0m0.140s
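
To put a number on the gap between the two results, the sketch below converts `time`-style values to seconds and prints the relative slowdown. `to_seconds` and `slowdown` are hypothetical helper names. Note that the two timings come from different machines and clock speeds, so the percentage is not a clean measure of VirtualBox overhead on its own.

```sh
# Convert a `time`-style value such as "0m13.728s" into seconds ("13.728").
to_seconds() {
    echo "$1" | awk -F'm' '{ sub(/s$/, "", $2); printf "%.3f\n", $1 * 60 + $2 }'
}

# Print how much slower (in percent) a run was than a baseline run.
# Usage: slowdown BASELINE_TIME OTHER_TIME
slowdown() {
    base=$(to_seconds "$1")
    other=$(to_seconds "$2")
    awk -v b="$base" -v o="$other" 'BEGIN {
        if (b == 0) { print "n/a"; exit }
        printf "%.1f%%\n", 100 * (o - b) / b
    }'
}
```

With the `real` times above, `slowdown 0m13.728s 0m17.286s` prints `25.9%`.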

#86 +Karl L.

Karl L.

    xorangekiller

  • Tech Issues Solved: 15
  • Joined: 24-January 09
  • Location: Virginia, USA
  • OS: Debian Testing

Posted 28 April 2013 - 17:38

The T1 is a very interesting CPU architecture (terrible single threaded performance, god awful floating point, but great integer throughput). How'd you get your hands on one?


You can buy Sun Fire T2000s on eBay for $400-800. While their single-threaded performance is truly awful (and apparently floating point as well), they run heavily multithreaded applications very well. I have been impressed with the Java and Qemu performance in particular. I can build i386, AMD64, and ARM packages faster in qemubuilder on that machine than natively on their respective architectures. (That is comparing the 1.0 GHz T1 to a 2.4 GHz Core 2 Quad for i386 and AMD64 builds. The ARM machine is slow enough that it's not really a fair comparison.)