Faster Old Computer Perplexes


Hello,

I'm looking at some old game code that does two or three graphic blits, controls an animation with keypresses, and has many if-then statements. The project works fine on my equipment: Win 10, Win 7, and Win Vista. However, I am surprised to notice that the oldest computer (Win Vista) is much faster than the laptop and the Windows 7 computer at displaying the animation.

Here is some data on two of the systems:

Acer laptop

Operating system: Windows 10 Home

Processor: Intel Core i7-6500U CPU @ 2.50 GHz (2.60 GHz)

RAM: 8 GB

System: 64-bit OS, x64-based processor

Dell desktop

Operating system: Windows Vista Home Basic SP2

Processor: Intel Celeron 450 CPU @ 2.20 GHz (2.19 GHz)

RAM: 2.00 GB

System: 64-bit OS

What I'm wondering is why in the world is the Dell (much older) outperforming the other two computers? Is there something I can do to make them all as quick as the Dell is?

Thank you,

Josheir

Is the laptop on battery or plugged in?

Also, things like memory speed greatly affect blitting performance. The specs you listed don't tell us anything about that.

Niko Suni

Also, what GPUs do the systems have? Graphical performance is often tied to that.

Finally, how do you measure performance? Maximum fps becomes increasingly irrelevant after you exceed the monitor refresh rate. Have you run any benchmark suites on the machines?
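For example, one simple way to get an actual number is to time a fixed block of frames around the render loop and average it. This is only a minimal sketch using std::chrono; RenderFrame() below is a placeholder for whatever your real blit-and-present call is:

```cpp
#include <chrono>
#include <cstdio>
#include <thread>

// Placeholder for the real blit + present call -- swap in the game's own code.
static void RenderFrame()
{
    std::this_thread::sleep_for(std::chrono::milliseconds(5));
}

int main()
{
    using Clock = std::chrono::steady_clock;

    const int kFrames = 500;              // sample size
    const auto start = Clock::now();

    for (int i = 0; i < kFrames; ++i)
        RenderFrame();

    const double seconds =
        std::chrono::duration<double>(Clock::now() - start).count();

    std::printf("avg frame time: %.3f ms (%.1f fps)\n",
                1000.0 * seconds / kFrames, kFrames / seconds);
    return 0;
}
```

Average frame time over a few hundred frames is a far more useful number than a peak fps reading.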

Niko Suni

I think the key is to discover your bottlenecks.

Are you bottlenecked by a single processing thread? If you're running on a single thread, both CPUs have a similar clock speed, and any cores beyond the first won't help. If you are CPU limited, this could be a big part of it.

What else is going on in other processes? That older system doesn't have hyperthreading (it is a single-core processor), so if your newer system has more background stuff running that uses all the cores, hyperthreading may actually be hurting your single-threaded program. Again, if you are CPU limited this can be a factor. A single core also means no cache-coherency checks between processors, which could account for a tiny speed difference.

What about bus traffic on the motherboard? The older single-core machine is less likely to saturate the hardware bus if that is your bottleneck, and will similarly have less memory contention if those are your bottlenecks.

It probably isn't an issue here, but much older games and software sometimes relied on things that are hard for newer machines to handle. For example, if the game were written for DX5-era hardware and one system actually had a DX5-era card (it doesn't in this case), the older architecture might perform better even though newer cards handle newer workloads faster. This was also true when older graphics cards were focused on tiles and sprites but newer cards were focused on textures and meshes. Today's cards handle point clouds amazingly well, but have to work hard to emulate old APIs. If you are using extensive 2D APIs, then perhaps the newer cards simply don't have a good hardware-accelerated way to implement those operations.

It could be memory speed where the program has tons of cache misses, and both have similar speeds for accessing main memory.

It could be disk speed where the program is spending most of its time waiting for disk reads, and both have similar speed hard drives.

It could be something else entirely; those are just guesses without seeing the systems in motion or seeing any measurements.

Figure out what the bottlenecks are. You'll likely find they're the same or similar bottleneck on both systems.
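A minimal way to start is to time each stage of the frame separately and see where the milliseconds actually go. This is only a sketch; DoBlits() and PresentFrame() below are hypothetical stand-ins for the program's real calls:

```cpp
#include <chrono>
#include <cstdio>
#include <thread>

using Clock = std::chrono::steady_clock;

// Stand-ins for the real work -- replace with the program's own blit and flip calls.
static void DoBlits()      { std::this_thread::sleep_for(std::chrono::milliseconds(2)); }
static void PresentFrame() { std::this_thread::sleep_for(std::chrono::milliseconds(4)); }

// Run one stage and add its elapsed time (in seconds) to the accumulator.
template <typename F>
static void Timed(double& accum, F&& stage)
{
    const auto t0 = Clock::now();
    stage();
    accum += std::chrono::duration<double>(Clock::now() - t0).count();
}

int main()
{
    double blitTime = 0.0, presentTime = 0.0;
    const int kFrames = 200;

    for (int i = 0; i < kFrames; ++i)
    {
        Timed(blitTime,    DoBlits);        // CPU/GPU copy work
        Timed(presentTime, PresentFrame);   // waiting on the driver/display
    }

    std::printf("blits:   %.2f ms/frame\n", 1000.0 * blitTime / kFrames);
    std::printf("present: %.2f ms/frame\n", 1000.0 * presentTime / kFrames);
    return 0;
}
```

If most of the time is spent in the present/flip, you are waiting on the display or driver; if it is in the blits, you are limited by the copy itself.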

I think most of these bottlenecks (except waiting on disk, which would be a "Duh!" thing) can be ruled out rather safely because of how different these computers are in some key respects.

I happen to own an i7-6500U notebook (though from HP) as well as an old Conroe Core 2 Quad (which is the same generation as the Celeron, but way faster). The Skylake notebook easily outperforms the Conroe desktop in every respect except accelerated 3D graphics (HD 520 / R7 M340 hybrid versus an nVidia 650 Ti). The same is (rather obviously) true when comparing a 4-core-with-HT non-U Skylake i7 desktop system to that laptop.

Much to my surprise, the Intel graphics card isn't nearly as bad as I would have expected -- on the contrary, if you turn the AMD hybrid off to save a few watts, it's only very slightly worse. Good enough to use Blender, and even render with Cycles (on a notebook, mind you!). I've even used HD Graphics on the desktop for a while, and unless you really do hefty 3D stuff, you don't even notice something is missing. I eventually plugged in an nVidia 1060 because there's no such thing as reliable Vulkan support on Intel (so far, at least as far as I could make out). Vulkan aside, if you want to see a difference, you have to do serious stuff at really high resolutions. I remember 10 years ago when gamers were boasting how their big top-of-the-line graphics cards could run Oblivion fullscreen (that is, 1024x768) at almost acceptable rates. Well, Intel integrated graphics do Oblivion at maximum level of detail (which admittedly isn't much compared to present games) just fine on a 3840x2160 display. Anyway, that's stunning for a GPU that just happens to be accidentally tucked on top of the CPU. My compliments to Intel for making something that doesn't suck. Now if only they supported Vulkan, too...

Is it a surprise that the i7 Skylake laptop wins every comparison hands down? The Conroe Celeron has a single core and an FSB rated at 800 MT/s. The i7 has 2 cores with HT, a higher clock rate, and uses DDR4-2133 through a system agent that moves 256 bits per cycle (huh... FSB, what's that?!). So it's shuffling around roughly 3x as much memory in the same time, with twice the number of cores (not counting the extra hyperthreads), a higher clock speed, and a better instructions-per-clock ratio. Not to mention twice as much L2 cache, and 4 MB of L3 cache compared to zero on the Celeron.

The only explanations that, in my opinion, make sense would be:

  1. The blitting happens on a dedicated graphics card (seeing how the Celeron doesn't have an on-chip GPU, there must be one!) of which we know nothing so far. Presumably this dedicated card is an "enthusiast" card or similar, and is faster, at least at blitting, than the HD 520. Presumably the whole thing happens within the graphics card's dedicated memory and is done by dedicated texturing/ROP hardware, so the computer's speed really doesn't matter all that much.
  2. Windows 10, Cortana, and all the espionage stuff that's running in the background are sucking out your machine's life. No surprise there; I've seen identical corporate laptops in a direct side-by-side comparison, one still running Windows 7, the other having been "upgraded" to Windows 10 a week or two earlier. No matter what amount of propaganda Microsoft spreads about how awesome Windows 10 is, it cannot be argued away that the people with the (identical) Windows 10 systems, looking at their peers using Windows 7, were shouting: "Fuck, why is your computer so much faster????" No kidding.

All of that is true, but a program written for a single thread gets no advantage from multiple cores. In this case the Celeron's 2.2 GHz single core versus the i7's 2.5-2.6 GHz cores with HT makes very little difference, since either way the program runs on one core at a broadly similar clock speed. And since the described program works with large images in main memory, the caches will have very little effect; both machines' prefetchers will stream from main memory as fast as they can, and a poorly written algorithm that doesn't cooperate with the cache or the prefetchers will perform nearly equally badly on both. The shared cache on the hyperthreaded cores, combined with all the extra background load described above, could easily drag the newer machine down.
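To illustrate the kind of access pattern that defeats the cache and the prefetchers on any machine (a generic example, not code from the program in question), compare walking a large image row by row versus column by column:

```cpp
#include <chrono>
#include <cstdio>
#include <vector>

int main()
{
    const int W = 4096, H = 4096;                       // 64 MB of 32-bit pixels
    std::vector<unsigned> pixels(static_cast<size_t>(W) * H, 1);
    using Clock = std::chrono::steady_clock;

    // Row-major: touches memory sequentially, so the prefetcher keeps up.
    auto t0 = Clock::now();
    unsigned long long sumRow = 0;
    for (int y = 0; y < H; ++y)
        for (int x = 0; x < W; ++x)
            sumRow += pixels[static_cast<size_t>(y) * W + x];
    const double rowSec = std::chrono::duration<double>(Clock::now() - t0).count();

    // Column-major: jumps W*4 bytes per step, so nearly every access misses cache.
    t0 = Clock::now();
    unsigned long long sumCol = 0;
    for (int x = 0; x < W; ++x)
        for (int y = 0; y < H; ++y)
            sumCol += pixels[static_cast<size_t>(y) * W + x];
    const double colSec = std::chrono::duration<double>(Clock::now() - t0).count();

    std::printf("row-major: %.3f s, column-major: %.3f s (checksums %llu, %llu)\n",
                rowSec, colSec, sumRow, sumCol);
    return 0;
}
```

The second loop is bound by main-memory latency on an old Celeron and a new i7 alike, which is exactly the situation where the newer machine stops looking faster.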

Again, it comes down to finding the actual bottleneck, but I've worked with several programs that saw no significant performance improvement from adding cores, memory, faster disks, or anything else, because they were constrained by a single processor and tons of main-memory manipulation.

Adobe Illustrator in particular comes to mind. After all these years they still haven't made the thing support multiprocessing, and for large data files it runs just as slowly on 6-year-old machines as it does on brand-new high-end developer workstations. They write that, at best, it benefits from a faster clock rather than more cores: "However, that’s the extent of Illustrator’s multi-threaded support. As such, a CPU that features multiple cores won’t offer any significant advantage over a 2-core processor. However, you will see the most significant performance increases with Illustrator when you use a CPU with a faster clock speed (a faster bus speed helps too). In other words, if you’re wondering if you should spend money on a faster chip with fewer cores or a slower chip with more cores, go with the faster chip." That isn't a no-name product from a no-name company; it's Adobe Illustrator.

Even though Illustrator is graphics-heavy and compute-intensive, all the extra cores and faster video cards don't make a bit of difference to its performance. Perhaps that is the same situation being described here, where a new machine performs about the same as an 8-year-old single-core machine.

You know a lot about technology, but I'm wondering if there might be something simpler going on here (of course I may be wrong). The Acer's display supports many resolutions, but they're all 32 bit. The Dell supports 32 and 16 bit. I'm thinking that maybe, even though SetDisplayMode isn't failing, the application still isn't working right in 16-bit mode. I'm assuming that a monitor color depth of 32 bit only means the program shouldn't work in 16-bit or 8-bit mode, which seems to be the case.

8 bit would probably work well, but the program only displayed the 32-bit images when that mode was set (flawed).

Does this sound accurate, or should a 32-bit display be able to display graphics at a lower bpp too?
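One quick way to see which color depths the adapter actually reports is to enumerate the display modes with the plain Win32 call EnumDisplaySettings. This is only a diagnostic sketch, not the SetDisplayMode path the old game code uses; if no 16-bit modes show up in the list, the system would have to emulate that depth for the program:

```cpp
#include <windows.h>
#include <cstdio>

// Print every mode the primary display adapter reports, including its bit depth.
int main()
{
    DEVMODE dm = {};
    dm.dmSize = sizeof(dm);

    for (DWORD i = 0; EnumDisplaySettings(nullptr, i, &dm); ++i)
    {
        std::printf("%4lu x %4lu  %2lu bpp  %lu Hz\n",
                    dm.dmPelsWidth, dm.dmPelsHeight,
                    dm.dmBitsPerPel, dm.dmDisplayFrequency);
    }
    return 0;
}
```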

Sincerely,

Josheir

All the application does is blit a 300 x 300 image and then a 100 x 100 image on top of it, over and over again.

I just looked at the Windows 7 machine and it does support the listed resolutions, so I'm going to play with that now.

Here is more data like you asked for:

In particular I noticed the system video memory being 0, and an option for a digital display on the Acer.

I have some performance data to compare the three systems:

Dell: 6.3 frames/sec
eMachines: 4.2 frames/sec
Acer: 3.8 frames/sec

I used a sample of 5 repeated actions, 6 times each, and took the average.

I just assumed the new Acer would be the best... is there a way to improve it?

Acer:
(The monitor displays the 1024 x 768 mode as part of the screen.)
Adapter: Intel HD Graphics 520, up to 4160 MB dynamic video memory
8 GB DDR4 memory
Graphics memory: 4160 MB
Dedicated memory: 128 MB
System video memory: 0 MB (?)
Shared system memory: 4032 MB
The display is "built in", with separate options listed for Digital Display and Display Television (?)

eMachines:
System memory: 2048 MB
Graphics card: NVIDIA GeForce 6150 / nForce 430

Dell:
Intel 645/643 Express Chipset
Memory installed: 2048 MB
Memory available: 2014 MB
Memory speed: 800 MHz
Memory technology: DDR3 SDRAM

I guess the Dell wins it!

Thank you everyone,

Josheir
Yes, it's plugged in.

I guess one moral of the story is: if you don't understand the hardware, have someone who does choose the system for you... But geez, I don't know what a salesman would say about 2D blitting.

Josheir

Well it looks like I've got it now.

Thanks,

Josheir

