About computer instructions in relation to RAM consumption


1) Why does a program (e.g. an Internet browser) get loaded into RAM (I think this is called the L1 cache of the CPU, correct me if I am wrong) from the hard disk? Why not just access it from the hard disk, since that is where the program lives after installation?

2) Does a computer instruction that takes longer to execute use more RAM?

3) If an instruction needs to create a stack frame, does it use RAM, and when the stack frame gets popped off, is that when RAM gets released?

4) Since RAM does not seem to be an issue (I assume even people with a tight budget use Windows 7 or a Mac), why is there a need to optimize a function when there is a bottleneck?

5) Are all bottlenecks linked to consuming too much RAM, or is it much more than that? I have not experienced a bottleneck (maybe because I have never used a profiler before, or maybe I just know how to write computer instructions that use less RAM, if that makes any sense).

6) To access data in RAM, the processor takes time on the nanosecond scale. Meanwhile, accessing data on the HDD takes time on the millisecond scale. I'm confused: I thought a millisecond, or 0.001, was MORE than a nanosecond, or 0.000000001. Shouldn't accessing RAM take less time than accessing the HDD? Here is the source: http://new-ones.blogspot.com/2012/08/ram-function-for-performance-pc-and.html

7) How much CPU usage should a video game generally use up? I programmed a 2D RPG and it uses 15-17% CPU and 100 MB of RAM, based on the data I see in my MacBook Air's Activity Monitor.

Edit: I hate anonymous downvotes... I just want to learn on a deep level

Your question number 1 means you don't understand anything about how a computer works. Even a Google search for "how does a computer work" will probably teach you that a processor reads its program from RAM. So a program needs to be loaded there.

I think you should probably do some reading and try to answer the other questions on your own.

I don't know what to tell you about question number 6.
#1: In order to execute an instruction, the instruction needs to be loaded into the CPU. If the instruction is on a hard drive, in the worst case it means all of the following have to occur: Hard drive spin up, hard drive seek, hard drive read, bus data transfer to RAM, RAM transfer to CPU cache(s), instruction load from cache, instruction execution. The hard drive is a physical machine with moving parts, which makes it slower than purely electronic systems like RAM. If the program is already in RAM then the first, slowest parts of the process can be skipped and the process goes much faster.
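If you want to feel the difference yourself, here's a minimal C++ sketch (the file name data.bin is made up; point it at any large file you have). The exact numbers will vary wildly by machine, but the disk pass should lose badly on a cold read:

```cpp
#include <chrono>
#include <cstdio>
#include <fstream>
#include <vector>

int main() {
    using clock = std::chrono::steady_clock;

    // Sum a buffer that is already resident in RAM.
    std::vector<char> buffer(1 << 20, 1);   // 1 MB already in memory
    auto t0 = clock::now();
    long long sum = 0;
    for (char c : buffer) sum += c;
    auto ramTime = clock::now() - t0;

    // Read the same amount from disk first, then sum it.
    // Assumes a file "data.bin" of at least 1 MB exists (hypothetical name).
    std::ifstream file("data.bin", std::ios::binary);
    std::vector<char> fromDisk(1 << 20);
    t0 = clock::now();
    file.read(fromDisk.data(), (std::streamsize)fromDisk.size());
    for (char c : fromDisk) sum += c;
    auto diskTime = clock::now() - t0;

    auto us = [](auto d) {
        return (long long)std::chrono::duration_cast<
            std::chrono::microseconds>(d).count();
    };
    std::printf("RAM pass:  %lld us\n", us(ramTime));
    std::printf("disk pass: %lld us (includes the read)\n", us(diskTime));
    return sum == 0;  // use the result so the loops aren't optimized away
}
```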

#2: Ignoring instruction fetch bandwidth and cache misses, the speed of instruction execution does not depend on its encoding length.

#3: Usually a lot of stack space is reserved, and stack activity is VERY rapidly moving the stack pointer up/down within the reserved space as pushing and popping occurs - the space reserved does not typically change unless the reserved area needs to expand, and then the expansion is typically permanent (until the process exits).
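To make that concrete, here's a small C++ sketch of what a frame "pop" actually is. The addresses will differ on every run; the point is that returning just moves the stack pointer back, it doesn't give memory back to the OS:

```cpp
#include <cstdio>

void inner() {
    int local[256];           // this array lives in inner()'s stack frame
    local[0] = 42;
    std::printf("inner frame at roughly %p\n", (void*)local);
}   // returning "pops" the frame: the stack pointer moves back up,
    // but the reserved stack memory stays with the process

void outer() {
    int local[256];           // a separate frame, deeper on the stack
    local[0] = 7;
    std::printf("outer frame at roughly %p\n", (void*)local);
    inner();
}

int main() {
    outer();
    inner();  // typically reuses the region outer()'s frame just vacated
    return 0;
}
```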

#4: RAM is fast, but not infinitely fast. Compilers are smart, but aren't perfect. Algorithms may be inefficient due to human mistakes.

#5: There are bottlenecks everywhere. You'll eventually run into some. :)

#6: That's right. More time taken per operation = fewer operations per time = slower.

#7: Real-time games are typically written to consume 100% of each core they can run on. It sounds like you have either a 4-core hyperthreading processor or a 6-core processor and are using one core.

Ignoring instruction fetch bandwidth and cache misses, the speed of instruction execution does not depend on its encoding length.

What do you mean by encoding length? So what does the speed of an instruction depend on? Is it based on algorithm analysis?


Real-time games are typically written to consume 100% of each core they can run on.

Wait, won't using up 100% of the core (aka the CPU) be bad for real-time games, even for a simple game like Pac-Man? The games I wrote update, draw, and then sleep the application for some time to give the CPU some breathing room.


That's right. More time taken per operation = fewer operations per time = slower.

I don't know why the source says access from RAM takes nanoseconds and access from the HDD takes milliseconds. I thought a nanosecond was a much longer time than a millisecond. Shouldn't the lengths of time for RAM and HDD be swapped?


Usually a lot of stack space is reserved

Is how much stack space a program gets dependent on the amount of RAM in the computer? How can I find out how much stack space is reserved? Is it important to know the amount?

I thought a nanosecond was a much longer time than a millisecond

http://en.wikipedia.org/wiki/Metric_prefix

As Alvaro said, you should read up on how computers work. But here goes.

1. Theoretically I'm sure someone could make a CPU that gets instructions from an HDD instead of RAM, but in that case the CPU would just be treating the HDD as very slow RAM. You may be thinking that the program should be separate from the data instead of treated as data. This is actually done in some architectures, mostly for microcontrollers where there will only ever be one program and no OS. In that case the entire program would be stored in flash (often set to read-only after the program is loaded) and the CPU would treat flash as spanning memory addresses 0xEFFF to 0xFFFF, RAM as 0x0000 to 0x1000, and IO and configuration registers as 0x1000 to 0x1100 (all values made up and not based on any particular architecture).
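If it helps, here's a toy C++ model of that kind of flat address space. The region boundaries are as made up as the ones above, and a real chip does this in hardware on the bus rather than with an array:

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdio>

// Toy model of a single flat address space (all values made up).
constexpr std::size_t kRamStart   = 0x0000;  // general-purpose RAM
constexpr std::size_t kIoStart    = 0x1000;  // IO/configuration registers
constexpr std::size_t kFlashStart = 0xE000;  // program storage
uint8_t addressSpace[0x10000];               // stands in for the real bus

int main() {
    addressSpace[kFlashStart]  = 0x90;  // the "program": a made-up opcode
    addressSpace[kIoStart + 4] = 0xFF;  // writing an IO register = a store
    addressSpace[kRamStart]    = 42;    // an ordinary variable in RAM

    // The CPU fetches instructions the same way it reads any other byte.
    std::printf("fetched opcode 0x%02X from flash\n",
                addressSpace[kFlashStart]);
    return 0;
}
```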

2. There really isn't a relationship between the size of an instruction and the time to execute it, especially with modern processors, which do very complicated things with reordering instructions anyway. There might be a benefit to smaller instructions, though: if the code itself takes up less RAM, there are fewer cache misses.

3. Adding something to the stack takes up stack RAM, but the RAM used by the stack is dedicated to the stack, usually at a preallocated size. Your program isn't supposed to ever use the RAM allocated to the stack except as the stack. So even after your program pops data from the stack, it still won't use that memory as generic RAM. There are ways to change your program's stack size at the link stage. For example, read this MSDN article for the MSVC++ way to do this.
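As an illustration only (the 8 MB figure is arbitrary, and this pragma is MSVC-specific; other toolchains have their own mechanisms):

```cpp
// MSVC-specific sketch: ask the linker to reserve an 8 MB stack instead
// of the usual 1 MB default. Equivalent to linking with /STACK:8388608.
#pragma comment(linker, "/STACK:8388608")

#include <cstdio>

int main() {
    char big[4 * 1024 * 1024];  // 4 MB of locals: fine with the larger
    big[0] = 1;                 // stack, likely an overflow with the default
    std::printf("%d\n", big[0]);
    return 0;
}
```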

4. RAM is always a potential issue. RAM is probably not the first place most people will look when worrying about speed, but it is potentially something that could help. Sometimes using more RAM will make things faster (by using an algorithm which needs more RAM); sometimes using less will help (for example, if touching too much memory causes extra cache misses).
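A classic example of spending RAM to buy speed is memoization; a quick sketch:

```cpp
#include <cstdint>
#include <cstdio>

// Trading RAM for speed: cache fib(n) results so each value is computed
// once, instead of being re-derived exponentially many times.
uint64_t memo[94];   // fib(93) is the largest value that fits in 64 bits

uint64_t fib(int n) {
    if (n < 2) return (uint64_t)n;
    if (memo[n] == 0) memo[n] = fib(n - 1) + fib(n - 2);  // spend RAM...
    return memo[n];                                       // ...save time
}

int main() {
    std::printf("fib(90) = %llu\n", (unsigned long long)fib(90));
    return 0;
}
```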

5. Bottlenecks can be processor related or GPU related or bandwidth related or RAM related. They all play their parts depending on what you're doing and the system the code is running on.

6. I think you're blatantly misunderstanding this. A millisecond is more than a nanosecond. You're confusing speed (a rate) with the time to accomplish something. If I can run a mile in 4 minutes, that's the same as saying I can run a mile at 15 MPH, which is faster than someone who can run a mile in 6 minutes, or 10 MPH. Lower time, faster rate.
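If it helps to see the arithmetic, here's a tiny sketch (the access times are illustrative orders of magnitude, not exact specs for any device):

```cpp
#include <cstdio>

int main() {
    // Time per access (illustrative orders of magnitude only).
    const double ramSeconds  = 100e-9;   // ~100 ns per RAM access
    const double diskSeconds = 10e-3;    // ~10 ms per HDD seek

    // Lower time per operation means MORE operations per second.
    std::printf("RAM:  %.0f accesses/sec\n", 1.0 / ramSeconds);   // ~10 million
    std::printf("disk: %.0f accesses/sec\n", 1.0 / diskSeconds);  // ~100
    return 0;
}
```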

7. It could be that your program is very efficient or very inefficient. Since we don't know what it's doing or even what kind of processor it is we can't possibly guess. If you made a basic clone of Super Mario Bros from the NES with the same resolution and type of pixel art and it was taking up 100MB and used 14% of a modern CPU that might be a sign your code isn't very efficient. If your 'clone' used much higher resolution sprites with 3d effects, more enemies, larger maps, and more eye candy then 100MB and 14% CPU might be decent. We can't know given the information you gave.

C++: A Dialog | C++0x Features: Part1 (lambdas, auto, static_assert) , Part 2 (rvalue references) , Part 3 (decltype) | Write Games | Fix Your Timestep!


Ignoring instruction fetch bandwidth and cache misses, the speed of instruction execution does not depend on its encoding length.

What do you mean by encoding length? So what does the speed of an instruction depend on? Is it based on algorithm analysis?

The encoding length is the byte length of the instruction. The time it takes to execute an instruction depends on how complex it is. Simple bit-wise operations and integer instructions are typically fast, while more complex operations such as floating-point square roots take longer to execute.
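You can see this for yourself with a crude micro-benchmark; a sketch below (micro-benchmarks are notoriously misleading and optimizing compilers can distort them, so treat the numbers as a rough illustration only):

```cpp
#include <chrono>
#include <cmath>
#include <cstdio>

// Time many cheap integer adds against many floating-point square roots.
int main() {
    using clock = std::chrono::steady_clock;
    const int N = 50'000'000;

    volatile long long acc = 0;          // volatile keeps the loop honest
    auto t0 = clock::now();
    for (int i = 0; i < N; ++i) acc = acc + i;
    auto addTime = clock::now() - t0;

    volatile double s = 0.0;
    t0 = clock::now();
    for (int i = 0; i < N; ++i) s = s + std::sqrt((double)i);
    auto sqrtTime = clock::now() - t0;

    auto ms = [](auto d) {
        return (long long)std::chrono::duration_cast<
            std::chrono::milliseconds>(d).count();
    };
    std::printf("adds: %lld ms, sqrts: %lld ms\n", ms(addTime), ms(sqrtTime));
    return 0;
}
```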


Real-time games are typically written to consume 100% of each core they can run on.

Wait, won't using up 100% of the core (aka the CPU) be bad for real-time games, even for a simple game like Pac-Man? The games I wrote update, draw, and then sleep the application for some time to give the CPU some breathing room.

Unless you have bad cooling causing the CPU to overheat if you go at full speed for a long time, there's no reason to conserve CPU usage as far as the CPU itself is concerned. On the other hand, over-utilizing the CPU for no reason does consume, say, more battery power, or take CPU time from other applications. But it's not like you have to pause your program to let the CPU recover from a short sprint, the way we humans have to.
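For comparison, here's a sketch of both loop styles (update() and draw() are placeholders for whatever your game does, and the 16 ms budget is just a common choice for roughly 60 FPS):

```cpp
#include <chrono>
#include <cstdio>
#include <thread>

void update() { /* advance the game state */ }
void draw()   { /* render one frame */ }

int main() {
    using namespace std::chrono;
    const auto frameBudget = milliseconds(16);   // ~60 FPS

    // Capped loop: sleep away the leftover frame time. Low CPU usage and
    // kinder to laptop batteries; this is what the sleep in your games does.
    for (int frame = 0; frame < 60; ++frame) {   // one second's worth
        auto start = steady_clock::now();
        update();
        draw();
        std::this_thread::sleep_until(start + frameBudget);
    }
    std::printf("capped loop done\n");

    // Uncapped loop: run flat out and use 100% of one core. Common in
    // games that want minimum latency or do as much work as time allows.
    // while (true) { update(); draw(); }
    return 0;
}
```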


That's right. More time taken per operation = fewer operations per time = slower.

I don't know why the source says access from RAM takes nanoseconds and access from the HDD takes milliseconds. I thought a nanosecond was a much longer time than a millisecond. Shouldn't the lengths of time for RAM and HDD be swapped?

You got it right in your first post: RAM access is on the order of nanoseconds, which you said was 0.000000001; HDD access is on the order of milliseconds, which you said was 0.001. Clearly RAM access is faster if it takes less time.


Usually a lot of stack space is reserved

Is how much stack space a program gets dependent on the amount of RAM in the computer? How can I find out how much stack space is reserved? Is it important to know the amount?

Typically on the order of 1 MB or so. I'd say you don't need to know the exact size, other than a ballpark figure so you know what to allocate where (on the stack or via dynamic allocation).
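A rule of thumb that follows from that figure, sketched in C++ (the sizes are illustrative):

```cpp
#include <memory>
#include <vector>

// Given a ~1 MB stack: small, fixed-size locals go on the stack;
// anything large or variable-sized goes on the heap.
void example() {
    int small[64];                         // fine: 256 bytes of stack
    small[0] = 0;

    // char huge[4 * 1024 * 1024];         // risky: could overflow the stack

    std::vector<char> huge(4 * 1024 * 1024);          // heap-backed: safe
    auto alsoHeap = std::make_unique<char[]>(1 << 20);  // also heap-backed
    huge[0] = alsoHeap[0] = 0;
}

int main() { example(); return 0; }
```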

Moderator note: I've merged a few posts made by the OP in reply - please don't quote single segments of a post like that in future, it isn't needed.

It could be that your program is very efficient or very inefficient. Since we don't know what it's doing or even what kind of processor it is we can't possibly guess. If you made a basic clone of Super Mario Bros from the NES with the same resolution and type of pixel art and it was taking up 100MB and used 14% of a modern CPU that might be a sign your code isn't very efficient. If your 'clone' used much higher resolution sprites with 3d effects, more enemies, larger maps, and more eye candy then 100MB and 14% CPU might be decent. We can't know given the information you gave.

It's an RPG: everything in the game (2 maps, 2 NPCs on map #1, 9 monsters on map #2, the quest and dialogue systems, the save system, character data, and animation files) is loaded ahead of time and is drawn and executed based on key or mouse commands.

get loaded into RAM (I think this is called the L1 cache of the CPU, correct me if I am wrong)


Level 1 Cache is not at all the same thing as RAM.

A CPU has a small amount of very, very fast memory embedded in it. That's the CPU cache. RAM is a separate bank of memory that is much slower (though still orders of magnitude faster than disk or network access).

Programs do not control what is in cache or not. RAM is accessed in larger chunks than individual bytes, and those chunks get saved in the cache. This results in there being a strong preference for accessing memory locations that are close together as that can result in a single RAM access instead of multiple; since cache is so much faster than RAM, that makes the program faster overall.
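You can see this preference in action with a simple traversal-order experiment; a sketch (the timings depend on your cache sizes, but the row-major order should win comfortably at this array size):

```cpp
#include <chrono>
#include <cstdio>
#include <vector>

// The same 2D sum is much faster when it walks memory in order
// (row by row) than when it strides across it (column by column).
int main() {
    const int N = 4096;
    std::vector<int> grid(N * N, 1);    // row-major: grid[row * N + col]
    using clock = std::chrono::steady_clock;

    long long sum = 0;
    auto t0 = clock::now();
    for (int row = 0; row < N; ++row)           // sequential: cache-friendly
        for (int col = 0; col < N; ++col)
            sum += grid[row * N + col];
    auto rowTime = clock::now() - t0;

    t0 = clock::now();
    for (int col = 0; col < N; ++col)           // strided: a cache miss on
        for (int row = 0; row < N; ++row)       // nearly every access
            sum += grid[row * N + col];
    auto colTime = clock::now() - t0;

    auto us = [](auto d) {
        return (long long)std::chrono::duration_cast<
            std::chrono::microseconds>(d).count();
    };
    std::printf("row-major: %lld us, column-major: %lld us (sum=%lld)\n",
                us(rowTime), us(colTime), sum);
    return 0;
}
```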

is that when RAM gets released?


RAM is almost never released from a program once it starts running. The memory that a program allocates typically stays in its internal memory allocator for later reuse. This is one of the reasons you see programs' memory usage grow over time but very rarely see it shrink.

5) Are all bottlenecks linked to consuming too much RAM, or is it much more than that?


Much, much more than that. Using too much RAM is almost never a performance problem, but _accessing_ RAM (see the above comments on cache) is a common one.

I have never used a profiler before


Learn. Stop writing forum posts. Stop writing code. Stop doing anything other than learning to use a profiler. The only tool more important to you is a debugger.
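While you're learning a real profiler (Instruments ships with Xcode on your Mac), even a crude scoped timer will teach you something; a sketch (the loop bodies are stand-in work):

```cpp
#include <chrono>
#include <cstdio>

// Not a substitute for a real profiler, but a first step: a timer that
// prints how long the enclosing scope took when it is destroyed.
struct ScopedTimer {
    const char* label;
    std::chrono::steady_clock::time_point start =
        std::chrono::steady_clock::now();
    explicit ScopedTimer(const char* l) : label(l) {}
    ~ScopedTimer() {
        auto us = std::chrono::duration_cast<std::chrono::microseconds>(
            std::chrono::steady_clock::now() - start).count();
        std::printf("%s: %lld us\n", label, (long long)us);
    }
};

int main() {
    ScopedTimer frame("whole frame");
    {
        ScopedTimer t("update");
        for (volatile int i = 0; i < 1000000; ++i) {}  // stand-in work
    }
    {
        ScopedTimer t("draw");
        for (volatile int i = 0; i < 2000000; ++i) {}  // stand-in work
    }
    return 0;
}
```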

6) To access data in RAM, the processor takes time on the nanosecond scale. Meanwhile, accessing data on the HDD takes time on the millisecond scale. I'm confused: I thought a millisecond, or 0.001, was MORE than a nanosecond, or 0.000000001.


That's correct.

1/1,000 (milli-, or one thousandth) is _bigger_ than 1/1,000,000,000 (nano-, or one billionth).

RAM access operates at a smaller scale (nanoseconds, meaning that you can access RAM roughly billions of times every second) compared to disk access (milliseconds, meaning that you can access disk roughly thousands of times every second). Smaller operation time means you can do more operations in a given time (i.e. that you're faster).

Sean Middleditch – Game Systems Engineer – Join my team!

This topic is closed to new replies.
