Ned_K

Theory question on C and array passing

Ned_K    175
C is purely pass by value, though some say it is pass by reference when it comes to passing arrays. But not really, correct? Pointers are passed by value, i.e. fun(int *p) receives a copy of the caller's pointer (pointing to the same location as the caller's pointer). So when an array decays into a pointer, the pointer passed into the function is also a copy (like any other pointer that is passed). The only magic is the array notation decaying into a pointer behind the scenes. Right? Thanks.

Ned_K    175
Quote:
Original post by Hodgman
Yes, arrays are just pointers behind the scenes. So when passing an array by value, you're really just passing the pointer by value, which means the array-data itself is not copied.


And the pointer to the elements of the array received from the caller is a normal pointer (that can be legally assigned to), i.e. not a second class array-type "pointer" that cannot be assigned to, correct? And this pointer can be made to point anywhere like any other pointer, not just to the first element of the array?

[Edited by - Ned_K on February 28, 2008 11:47:33 PM]

ToohrVyk    1595
Quote:
Original post by Hodgman
Yes, arrays are just pointers behind the scenes.


This is not entirely correct: at the implementation level, arrays will behave slightly differently from pointers. Accessing a[i] when a is an array adds i to the address of a and dereferences, whereas p[i] when p is a pointer will dereference the address of p, adds i to that value, and dereferences again.

gsg    157
Quote:
whereas p[i] when p is a pointer will dereference the address of p, adds i to that value, and dereferences again.

Sorry, this is not correct. a[i] is nothing more than syntactic sugar for *(a + i). C requires that for this to make sense, one of those things be a pointer and the other an integer value. That's it.

Note that *(a + i) is symmetrical, meaning that you can do strange things like this:


char str[] = "foo";
printf("%c %c %c", 0[str], 1[str], 2[str]);



Which would be impossible if what you were saying was correct.

exwonder    100
Quote:
Original post by ToohrVyk
Quote:
Original post by Hodgman
Yes, arrays are just pointers behind the scenes.


This is not entirely correct: at the implementation level, arrays will behave slightly differently from pointers. Accessing a[i] when a is an array adds i to the address of a and dereferences, whereas p[i] when p is a pointer will dereference the address of p, adds i to that value, and dereferences again.


lolwut?

You're saying that p[i] is *((*p) + i). It's *(p + i).

I suppose you mean that the compiler knows that a[i] will always be in the same place, while p[i] will not, so it can skip the actual addition in the array's case but not in the pointer's case?

ToohrVyk    1595
Quote:
Original post by gsg
Sorry, this is not correct.


#include "discussion.hpp"

Quote:
a[i] is nothing more than syntactic sugar for *(a + i). C requires that for this to make sense, one of those things be a pointer and the other an integer value. That's it.


The existence of array decay completely obviates your argument: since arrays automatically become pointers when used in a pointer context, any pointer-only construct you exhibit (including *(a+i)) has no bearing on the actual nature of arrays, merely on their ability to become pointers on demand.

ToohrVyk    1595
Quote:
Original post by exwonder
You're saying that p[i] is *((*p) + i). It's *(p + i).


Nope, this is not what I'm saying. My post stated that I was discussing the implementation level of the language, and p[i] is not implemented as either *((*p) + i) or *(p + i), because it would make absolutely no sense for a C compiler to generate C code.

p[i] is usually implemented as:
    mov eax, DWORD PTR _i$[esp-4]
    mov ecx, DWORD PTR _p$[esp-4]
    mov eax, DWORD PTR [ecx+eax*4]
    ret 0


Whereas a[i] is usually implemented as:
    mov eax, DWORD PTR _i$[esp-4]
    mov eax, DWORD PTR _a$[esp+eax*4]
    ret 0


Notice the extra dereference in the first case.

exwonder    100
Quote:
Original post by ToohrVyk
Nope, this is not what I'm saying.


Yes, I realized between posts that you were correct, but your terminology was poorly chosen. To the casual reader, "dereference" when talking about pointers in C implies something quite different from what you meant.

I guess I'd restate what you're saying as "accessing a pointer by index creates one additional add instruction and one additional load from memory when compared to array indexing", but the assembly you pasted clears it up as well.

Edit: "And possibly one extra multiply depending on the type of your pointer."

[Edited by - exwonder on February 29, 2008 4:07:46 AM]

King Mir    2490
Quote:
Original post by Hodgman
Yes, arrays are just pointers behind the scenes.
This is not accurate. Array identifiers decay to pointers when used in a pointer context, but your statement makes it sound like an array is a pointer on the stack that points to a memory block on the heap. This is not the case. Only the array's memory is on the stack, and no separate pointer to it is stored; the compiler only needs the offset from the function's stack frame.

In fact, it is possible to pass arrays as array references (in C++), which do not decay to pointers.

chairthrower    440
Quote:
Yes, arrays are just pointers behind the scenes.


This is a perfectly fine statement that accurately captures what goes on - excepting pretty complex stuff about a c/c++ compiler's handling of declarator syntax.

Quote:
[...] makes it sound like an array is a pointer on the stack that points to a memory block in the heap. This is not the case. Only the memory is on the stack, and no special pointer is saved to it, except the offset from function stack frame.


void func() { static int ia[100]; }

The contiguous memory for the symbol 'ia' is definitely not on the stack.

rip-off    10976
Quote:
Original post by chairthrower

void func() { static int ia[100]; }

The contiguous memory for the symbol 'ia' is definitely not on the stack.


So? "static" variables are different, was that your point? The most common use case for arrays is on the stack, or as part of an aggregate type.

Enigma    1410
Quote:
Original post by chairthrower
Quote:
Yes, arrays are just pointers behind the scenes.
This is a perfectly fine statement that accurately captures what goes on - excepting pretty complex stuff about a c/c++ compiler's handling of declarator syntax.
An array is a contiguously allocated non-empty set of objects of a particular type. The location of the array is implicitly known to the program during execution, usually relative to the current stack. A pointer is an indirect descriptor of an object at an unknown location. While arrays and pointers share many similarities in many contexts (often due to the ease with which an array will decay to a pointer to its first element) there are also many important differences. They are most assuredly not the same thing, either behind the scenes or in front of them.

Σnigma

chairthrower    440
Maybe it would be better to think of decay as applying to the expression operators & and =

int ia[ 100];
int *pa = ia; // decay
void func() { }
void (*pfunc)();
pfunc = func;
pfunc = &func; // decay

And think of arrays in function signatures such as,

void func( int ia[ 100]) { }

as 'mislabeled pointers' masquerading as arrays, since the location of the contiguous memory is *not* known at compile time or run time, unlike other use cases of arrays where the array memory is predetermined either in the stack frame or in some binary data section of the object file. The argument that "parameter arrays" are not really arrays but are pointers seems to be pretty reasonable if we are considering the problem from a low level assembly/implementation point of view.

The counter argument to this is from a language and higher level semantic perspective. From this point of view we can consider that the parameter declarator typing is fully explicit (it looks just like an array). eg sizeof( ia) is known and behaves just as we expect. In all cases it is a normal array declarator and responds in the same way as normal arrays to the application of & [] and = operators.

Irrespective of whether a parameter array declarator should properly be characterised as an 'array' or not, the compiler is distinguishing (and hiding from the programmer) a parameter array's behaviour from 'normal' arrays at both high (semantic) and low (assembly output) levels.
<edit for clarity>

Enigma    1410
Quote:
The argument that "parameter arrays" are not really arrays but are pointers seems to be pretty reasonable if we are considering the problem from a low level assembly/implementation point of view.

The argument that "parameter arrays" are not really arrays but are pointers is explicit in the C++ Standard:

Quote:
C++ Standard, Section 8, Paragraph 3
<snip /> After determining the type of each parameter, any parameter of type "array of T" or "function returning T" is adjusted to be "pointer to T" or "pointer to function returning T," respectively. <snip />

Quote:
The counter argument to this is from a language and higher level semantic perspective. From this point of view we can consider that the parameter declarator typing is fully explicit (it looks just like an array). eg sizeof( ia) is known and behaves just as we expect. In all cases it is a normal array declarator and responds in the same way as normal arrays to the application of & [] and = operators.

As evidenced above, this counter argument is incorrect.

Σnigma

Enigma    1410
Apologies for the rather brief argument above; I shall expand on it now that I have a bit more time. Firstly, I forgot we were dealing with C in this thread, not C++, so the above quote from the C++ Standard is only relevant in that it is a rephrasing of the language in the C standard (or at least the draft of which I have a copy):
Quote:
C Standard Draft, Section 3.7.1
<snip /> A declaration of a parameter as "array of type" shall be adjusted to "pointer to type," and a declaration of a parameter as "function returning type" shall be adjusted to "pointer to function returning type," as in 3.2.2.1. <snip />
Quote:
Original post by chairthrower
The counter argument to this is from a language and higher level semantic perspective. From this point of view we can consider that the parameter declarator typing is fully explicit (it looks just like an array). eg sizeof( ia) is known and behaves just as we expect. In all cases it is a normal array declarator and responds in the same way as normal arrays to the application of & [] and = operators.
Applying the above quote from the standard we can analyse the claims of this argument:
  • claim: sizeof(ia) is known and behaves just as we expect.
    refutation: sizeof(ia) behaves according to the adjusted type of ia, which is int *. I compiled the following program under Visual C++ 2008, MinGW gcc 3.3.1 and Borland 5.8.2:
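    #include <stdio.h>

    void func(int array[7])
    {
        /* array has been adjusted to int *, so this is the size of a pointer */
        printf("%d\n", (int)sizeof(array));
    }

    int main()
    {
        int array[7];
        /* here array is a real array: 7 * sizeof(int) */
        printf("%d\n", (int)sizeof(array));
        func(array);
        return 0;
    }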
    The results were unanimous:
    28
    4
  • claim: In all cases it is a normal array declarator and responds in the same way as normal arrays to the application of & [] and = operators.
    refutation: The following program demonstrates that ia as a function parameter does not behave as a normal array:
    /*  1 */ int func(int ia[1000], int ib[2])
    /* 2 */ {
    /* 3 */ int * * iap = &ia;
    /* 4 */ ia = ib;
    /* 5 */ ++ia;
    /* 6 */ return **iap + *ia; // avoid unused variable warnings
    /* 7 */ }
    /* 8 */
    /* 9 */ int main()
    /* 10 */ {
    /* 11 */ int ia[1000];
    /* 12 */ int ib[2];
    /* 13 */ int * * iap = &ia;
    /* 14 */ ia = ib;
    /* 15 */ ++ia;
    /* 16 */ return **iap; // avoid unused variable warnings
    /* 17 */ }

    Visual C++ 2008:
    array_parameter2.c(13) : warning C4047: 'initializing' : 'int **' differs in levels of indirection from 'int (*)[1000]'
    array_parameter2.c(14) : error C2106: '=' : left operand must be l-value
    array_parameter2.c(15) : error C2105: '++' needs l-value

    MinGW gcc 3.3.1:
    array_parameter2.c: In function `main':
    array_parameter2.c:13: warning: initialization from incompatible pointer type
    array_parameter2.c:14: error: incompatible types in assignment
    array_parameter2.c:15: error: wrong type argument to increment

    Borland 5.82:
    Warning W8075 array_parameter2.c 13: Suspicious pointer conversion in function main
    Error E2277 array_parameter2.c 14: Lvalue required in function main
    Error E2277 array_parameter2.c 15: Lvalue required in function main
Σnigma

chairthrower    440
Quote:
claim: sizeof(ia) is known and behaves just as we expect.
refutation: sizeof(ia) behaves according to the adjusted type of ia, which is int *. I compiled the following program under Visual C++ 2008, MinGW gcc 3.3.1 and Borland 5.8.2:



I had the belief that parameter array declarators behaved like normal arrays so ingrained in my mind that I didn't even bother to test the counter-example I gave before posting. Hmm, and I even used to work on C/C++ compiler front ends about 10 years ago.

Thanks for taking the time to correct misinformation.

SiCrane    11839
Quote:
Original post by rip-off
Quote:
Original post by polymorphed
So what you're basically saying, Enigma, is that:
Right?


Exactly.


Well, in theory anyways. In practice ... well, welcome to the world of compiler bugs. In MSVC, if you use:

void func (int myarray[5]) {}
void func (int * myarray) {}

in a single file, the compiler will complain that the function already has a body, as it should.

However, if you put the two definitions in different translation units, they will link despite the multiple definitions. The problem is that MSVC name mangles the two function signatures differently, so even though they should be the same function the different prototypes will resolve to different names.

