Question about Data segment/Code segment


Old topic!
Guest, the last post of this topic is over 60 days old and at this point you may not reply in this topic. If you wish to continue this conversation start a new topic.

5 replies to this topic

#1 TheComet   Members   -  Reputation: 1603


Posted 28 January 2014 - 06:55 AM

Consider the following piece of code running on a 16-bit microcontroller (16-bit address bus). My questions all concern what happens before main() is called.

unsigned short foo = 156;
const unsigned short bar = 12;

int main()
{
   return 0;
}

Is my understanding correct that:

  1. The address locations in the heap for foo and bar are known at compile time and are stored in the data segment of the binary file
  2. The sizes (memory consumption) of foo and bar are known at compile time and are stored in the data segment of the binary file
  3. The values of foo and bar are stored in the data segment of the binary file, separate from the size declaration of foo and bar
  4. Before main() is called, the memory for foo and bar is allocated and filled with their respective values on the heap, always at the same offsets
  5. In the case of an embedded system (microcontroller), memory space for "bar" is not allocated in RAM, but is directly read from ROM when required, since it was declared const and cannot change its value.
  6. In the case of an embedded system (microcontroller), memory space for "foo" is allocated in RAM and its value is copied from ROM into RAM before main() is called, from which the value can be read and written to when required later on.

My two main questions are:

  1. Without starting this program, how much "disk space" do "foo" and "bar" actually consume? There obviously has to be information on what they are, what value they have, and where they will be stored in memory.
  2. When hard-coding values (such as "foo=2"), how is that number "2" stored?

Thanks


YOUR_OPINION >/dev/null



#2 Bluebat   Members   -  Reputation: 409


Posted 28 January 2014 - 08:43 AM

Hi,

I assume you are talking about a microcontroller similar to AVR or PIC. Those have separate address spaces for code and data: when the program executes, instructions are fetched from flash, and they can read and write SRAM. To read data from flash you have to use a special, slightly slower instruction: LPM (load program memory).

Since you only program the flash memory and RAM is volatile, you do not really have a 'data segment' in the final binary image. So some notes:

1, 2: They are not on the heap (the heap is what malloc uses internally). They are given locations and sizes in SRAM, which you can see if you objdump the binary.

3, 4: Since SRAM loses its contents without power, they must be initialized to 156 and 12 each time just before main() starts. The compiler will generate code that copies those values to the correct locations and then calls main().

5: The compiler will optimize it and embed the constant into the instruction. No LPM will be used unless you instruct the compiler to do so (there is a special macro, PROGMEM, to define a constant or string that is stored in flash).

6: Correct
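
As a rough illustration of notes 3, 4 and the (src_flash, dst_sram, size) idea, here is a minimal host-side sketch of what such startup code conceptually does. Everything in it is an assumption made for illustration: the arrays `flash_init_image` and `sram` merely simulate the two memories, the names `init_record`, `init_table`, `startup_copy_data` and `read_foo` are hypothetical, and the value is assumed to be stored little-endian. Real startup code is generated by the toolchain and works on linker-assigned addresses instead.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for the real memories. On hardware these would be
   regions placed by the linker, not ordinary C arrays. */
static const uint8_t flash_init_image[] = { 156, 0 }; /* initial value of foo, little-endian (assumed) */
static uint8_t sram[16];                               /* simulated SRAM, starts out as zeros here */

/* One record per initialized block: where the values live in flash,
   where they belong in SRAM, and how many bytes to copy. */
struct init_record {
    const uint8_t *src_flash;
    uint8_t       *dst_sram;
    size_t         size;
};

static const struct init_record init_table[] = {
    { flash_init_image, &sram[0], sizeof flash_init_image },
};

/* What a startup routine conceptually does before it calls main(). */
void startup_copy_data(void)
{
    for (size_t i = 0; i < sizeof init_table / sizeof init_table[0]; ++i) {
        const struct init_record *r = &init_table[i];
        /* Guard against a record that would write past the simulated SRAM. */
        assert(r->dst_sram + r->size <= sram + sizeof sram);
        memcpy(r->dst_sram, r->src_flash, r->size);
    }
}

/* Read foo back from simulated SRAM as a 16-bit little-endian value. */
uint16_t read_foo(void)
{
    return (uint16_t)(sram[0] | (sram[1] << 8));
}
```

In this sketch the "disk space" cost of foo is the two bytes of its initial value plus a share of one table record, which is the point of keeping all initialized variables in one contiguous block.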

 

And your questions:

1: All initialized variables will usually be close together, so there is just one (src_flash, dst_sram, size) record for the whole block, plus the actual values. I'd say that is typical of any binary.

2: If you use some constant values / numbers in code, their value is stored inside the instruction itself. For example LDI reg, 8-bit-value:

1 1 1 0 K K K K h h h h K K K K LDI Rh,K
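
To make the bit pattern above concrete, here is a small sketch that packs a register number and an 8-bit constant into the 16-bit LDI opcode. The function name `encode_ldi` is hypothetical, and the sketch assumes the AVR instruction set, where LDI only accepts r16 to r31, so the four register bits (written h above) hold the register number minus 16.

```c
#include <assert.h>
#include <stdint.h>

/* Encode AVR "LDI Rd, K" as 1110 KKKK dddd KKKK.
   The constant K is split: its high nibble goes into bits 11..8,
   its low nibble into bits 3..0. The register field holds reg - 16. */
uint16_t encode_ldi(unsigned reg, uint8_t k)
{
    assert(reg >= 16 && reg <= 31);        /* LDI cannot target r0..r15 */
    uint16_t d = (uint16_t)(reg - 16);
    return (uint16_t)(0xE000u
                      | ((uint16_t)(k & 0xF0u) << 4)
                      | (uint16_t)(d << 4)
                      | (uint16_t)(k & 0x0Fu));
}
```

For instance, `encode_ldi(16, 2)` yields 0xE002: the number 2 is not stored anywhere as data, it is simply four of the bits of the instruction word.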

 



#3 Álvaro   Crossbones+   -  Reputation: 13322


Posted 28 January 2014 - 10:00 AM

If you never take its address, `bar' could end up not occupying any memory at all.



#4 Bregma   Crossbones+   -  Reputation: 5133


Posted 28 January 2014 - 11:13 AM

The address locations in the heap for foo and bar are known at compile time and are stored in the data segment of the binary file

Namespace-level variables have static storage duration, not dynamic storage duration (the heap). A typical compiler on most systems I've seen will place initialized static storage values in the DATA segment, but it's possible they would have values in the DATA segment that are used to initialize locations in the BSS segment. That's pretty specific to the compiler and the binary file format used. Also, constants may be folded by the compiler into inline code or even completely elided.
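
A short sketch of the distinction, reusing foo and bar from the question. The section names in the comments are what a GCC-style toolchain typically uses and are an assumption about the toolchain, not a guarantee; `zeroed` and `make_heap_copy` are hypothetical names added for contrast.

```c
#include <stdlib.h>

unsigned short foo = 156;       /* static storage, initialized: typically .data */
const unsigned short bar = 12;  /* static storage, read-only: typically .rodata, or folded away entirely */
unsigned short zeroed;          /* static storage, no initializer: typically .bss, zero at startup */

/* Dynamic storage (the heap) only comes from an explicit run-time allocation.
   Returns NULL if the allocation fails; the caller must free() the result. */
unsigned short *make_heap_copy(unsigned short value)
{
    unsigned short *p = malloc(sizeof *p);
    if (p != NULL) {
        *p = value;
    }
    return p;
}
```

Only the object returned by `make_heap_copy` lives on the heap. The three file-scope variables get their addresses at link time and exist for the whole run of the program.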

The sizes (memory consumption) of foo and bar are known at compile time and are stored in the data segment of the binary file

Generally speaking, the size of the variables is stored implicitly in the generated code, and possibly in the BSS segment, depending on the specifics of the compiler and binary format.

The values of foo and bar are stored in the data segment of the binary file, separate from the size declaration of foo and bar

See the above two answers. The initial values are stored in the DATA segment.

Before main() is called, the memory for foo and bar is allocated and filled with their respective values on the heap, always at the same offsets

See above. The BSS segment may cause the loader to allocate memory. How that works is specific to the particular loader.

In the case of an embedded system (microcontroller), memory space for "bar" is not allocated in RAM, but is directly read from ROM when required, since it was declared const and cannot change its value.

Yes, maybe.

In the case of an embedded system (microcontroller), memory space for "foo" is allocated in RAM and its value is copied from ROM into RAM before main() is called, from which the value can be read and written to when required later on.

Yes, maybe.

Without starting this program, how much "disk space" do "foo" and "bar" actually consume? There obviously has to be information on what they are, what value they have, and where they will be stored in memory.

Possibly some, possibly none, likely a few bytes.

When hard-coding values (such as "foo=2"), how is that number "2" stored?

Probably inline in a machine-language instruction or series of instructions (depending on the architecture).
Stephen M. Webb
Professional Free Software Developer

#5 frob   Moderators   -  Reputation: 21323


Posted 28 January 2014 - 11:54 AM

As Bregma points out, that information is highly dependent on other factors.


For example, a constant that is used in the code may be nearly eliminated by optimization. Instead of showing up as a single number in memory, it may show up as an additional calculation in the code rather than an object in memory, or as a value left on the stack as a side effect of another operation. Just because you have a value in your C or C++ code does not mean it exists in any particular location within the executable.

Also note that compilers sometimes make seemingly strange tradeoffs. You might think it is better to keep a constant as a loop's iteration count, but the compiler and optimizer might decide it is more efficient to unroll the loop; that requires more space but runs faster. Your single byte of '12' might vanish into twelve copies of a loop body requiring 280 bytes of space.
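
To picture the unrolling case, here is a hand-written sketch of the two forms an optimizer might choose between. The function names `sum_rolled` and `sum_unrolled` are hypothetical, and whether a given compiler actually unrolls this loop depends on its settings and target; the second function only shows by hand what such a transformation amounts to.

```c
/* Rolled form: the constant 12 appears once, as the loop bound. */
unsigned sum_rolled(const unsigned char *v)
{
    unsigned total = 0;
    for (int i = 0; i < 12; ++i) {
        total += v[i];
    }
    return total;
}

/* Hand-unrolled form of the same loop: the constant 12 no longer exists
   as a value anywhere; it has become twelve repeated additions. */
unsigned sum_unrolled(const unsigned char *v)
{
    return (unsigned)(v[0] + v[1] + v[2] + v[3] + v[4] + v[5]
                    + v[6] + v[7] + v[8] + v[9] + v[10] + v[11]);
}
```

Both functions compute the same sum over twelve elements; the difference is only in how the count of 12 is represented in the generated code.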

Is there a reason you are asking this question?
Check out my personal indie blog at bryanwagstaff.com.

#6 TheComet   Members   -  Reputation: 1603


Posted 29 January 2014 - 05:54 AM

Thanks for all the feedback! I see it's highly hardware and compiler dependent, but I'm satisfied with the answers.

 

Is there a reason you are asking this question?

 

General curiosity. Programming microcontrollers makes you realise how sparing you suddenly have to be compared with programming for PCs, given the limited hardware, and I simply wondered how the binary file was structured.

 

The particular device I'm working with is the dsPIC33FJ06GS001, in case anyone was wondering. It belongs to one of Microchip's newer lines of controllers for digital signal processing.


YOUR_OPINION >/dev/null




