
Are C arrays passed by reference?

This topic is 3424 days old which is more than the 365 day threshold we allow for new replies. Please post a new topic.

If you intended to correct an error in the post then please contact us.


Quote:
Original post by Iftah
Thanks.


I have to apologize, I got things wrong again. I updated my post (not "above", but on the previous page), and it is now quoted below >_<.

I can't even refer to the location of my own posts relative to each other properly >_<

Quote:
Quote:
As for the 3rd example, it will work if you wrote ptr_to_ptr[0] = (int*)42 right?

In this case, on my platform, yes.
Ugh, I'm really talking out of my ass tonight. On my platform, the 3rd line, with my updated snippet, behaves similarly to:

(*(int*)((*ptr_to_array)[0])) = 42;
or:
(*(int*)(array[0])) = 42;
or:
*((int*)42) = 42;
However, since both invoke undefined behavior, it'd be entirely valid for your compiler to make your program do something different.

Quote:
Original post by MaulingMonkey
On my platform, the 3rd line, with my updated snippet, behaves similarly to:

(*(int*)(array[0])) = 42;
or:
(*(int*)((*ptr_to_array)[0])) = 42;

However, since both invoke undefined behavior, it'd be entirely valid for your compiler to make your program do something different.


Hmmm type punning. That has the nice tendency to break as soon as some compilers (*cough*gcc*cough*) optimize at non-trivial levels. The net result is reading from uninitialized memory at best.

Quote:
Original post by fpsgamer
The key phrase in that quote is "decays to". "Array" is a distinct type in C++, and most importantly is not a pointer. Arrays however are implicitly convertible to a pointer to their first element. This is the same reason why an int is not a double. The language simply provides an implicit int->double conversion.


The "int vs double" argument is actually quite an unfortunate and wrong example. Ints and doubles have very different layouts in memory, and to convert from one to the other requires actual work.
A pointer-to-int and an array-to-int are nearly indistinguishable in memory except for the level of indirection. The array points to a memory location where integer values are stored, the pointer points to an address that points to a memory location where an integer value is stored. In the pointer case, the application first has to fetch the actual address stored in the pointer, while the array case has the address encoded right into the array variable.

The source example and its disassembly below help to demonstrate that.

Example:

int int_array[4] = { 1, 2, 3, 4};
int (*ptr_to_array)[4] = &int_array;
int *int_ptr = int_array;

int_array[0] = 4;
int_array[1] = 3; // To show the difference between the first and other elements.
(*ptr_to_array)[1] = 3;
*(int_ptr+2) = 2;
int_ptr[3] = 1;



Disassembled example:

; 13 : int int_array[4] = { 1, 2, 3, 4};

mov DWORD PTR _int_array$[ebp], 1
mov DWORD PTR _int_array$[ebp+4], 2
mov DWORD PTR _int_array$[ebp+8], 3
mov DWORD PTR _int_array$[ebp+12], 4

; 14 : int (*ptr_to_array)[4] = &int_array;

lea eax, DWORD PTR _int_array$[ebp]
mov DWORD PTR _ptr_to_array$[ebp], eax

; 15 : int *int_ptr = int_array;

lea eax, DWORD PTR _int_array$[ebp]
mov DWORD PTR _int_ptr$[ebp], eax

; 16 :
; 17 : int_array[0] = 4;

mov DWORD PTR _int_array$[ebp], 4

; 18 : int_array[1] = 3; // To show the difference between the first and other elements.

mov DWORD PTR _int_array$[ebp+4], 3

; 19 : (*ptr_to_array)[1] = 3;

mov eax, DWORD PTR _ptr_to_array$[ebp]
mov DWORD PTR [eax+4], 3

; 20 : *(int_ptr+2) = 2;

mov eax, DWORD PTR _int_ptr$[ebp]
mov DWORD PTR [eax+8], 2

; 21 : int_ptr[3] = 1;

mov eax, DWORD PTR _int_ptr$[ebp]
mov DWORD PTR [eax+12], 1

Quote:
Original post by MadKeithV
A pointer-to-int and an array-to-int are nearly indistinguishable in memory except for the level of indirection. The array points to a memory location where integer values are stored, the pointer points to an address that points to a memory location where an integer value is stored.


No. The array is a memory location (rather, a sequence of adjacent memory locations) where integer values are stored. The pointer is an address of a memory location where an integer value is stored (with the note that this location might actually be within an array, especially at the beginning). Alternatively said, it "points at" that memory location. There is also no array-to-int; it is an array-of-int.
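The distinction can be checked directly in C. The following is a minimal sketch, not taken from the thread: the function names are made up for illustration. It relies only on the fact that an array and its first element start at the same address, while a pointer variable is a separate object that stores an address.

```c
#include <assert.h>
#include <stddef.h>

/* Returns 1 if the array's own address equals the address of its first
   element: the array *is* the storage, not a variable pointing at it. */
int array_is_its_own_storage(void)
{
    int values[4] = { 1, 2, 3, 4 };
    return (void *)&values == (void *)&values[0];
}

/* Returns 1 if a pointer variable lives at a different address than the
   element it points at: the pointer is a separate object holding an address. */
int pointer_is_separate_object(void)
{
    int values[4] = { 1, 2, 3, 4 };
    int *first = values;              /* decays to &values[0] */
    assert(first == &values[0]);
    return (void *)&first != (void *)first;
}
```

Both functions return 1 when called. That matches the disassembly earlier in the thread, where access through a pointer needs one extra load to fetch the stored address.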

Quote:
Original post by DevFred
Quote:
Original post by gharen2
Personally, I wish more classes taught about pointers early.

The problem is: IMHO there are no simple examples where pointers make sense (without arrays and dynamic allocation), except for emulating pass by reference (by passing pointers by value). Maybe you have a good example?


I think the best solution is to not start with a language like C. Start with something like Ruby where everything is a pointer. Then, when you get to variables in C, people will be wondering why you'd bother with something that wasn't a pointer. At least, that's what I thought when I started learning C after Smalltalk and Java.

If you want to stay in C, I think a good way would be to not start with stuff like declaring a variable in main(). Have them interact with an API from the start and have the API only interact through pointers. Maybe it doesn't "make sense", but it doesn't need to because the only reason it needs to make sense is to avoid questions like, "Why use pointers when variables are so much easier?" If you never told them about hammers, they won't question why you're solving the problem with a screwdriver.
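For reference, "emulating pass by reference (by passing pointers by value)" from the quote above might look like the sketch below. All function names are hypothetical. The last case is the one behind the topic title: an array argument decays to a pointer to its first element, and that pointer is passed by value.

```c
#include <assert.h>

/* The callee gets a copy of the int; the caller's variable is unaffected. */
void set_by_value(int n)
{
    n = 42;
    (void)n;                          /* silence "set but not used" warnings */
}

/* The callee gets a copy of a pointer; writing through it changes the
   caller's int. */
void set_through_pointer(int *n)
{
    *n = 42;
}

/* "int values[]" as a parameter means "int *values": the caller's array
   decayed to a pointer, so the write lands in the caller's array. */
void set_first_element(int values[], int x)
{
    values[0] = x;
}

int scalar_after_by_value_call(void)
{
    int n = 1;
    set_by_value(n);
    return n;                         /* still 1 */
}

int scalar_after_pointer_call(void)
{
    int n = 1;
    set_through_pointer(&n);
    return n;                         /* now 42 */
}

int array_after_call(void)
{
    int values[3] = { 1, 2, 3 };
    set_first_element(values, 42);
    return values[0];                 /* now 42 */
}
```

Strictly speaking, nothing here is passed by reference. The array case only looks that way because the value that gets copied is an address.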

Quote:
Original post by Way Walker
I think the best solution is to not start with a language like C. Start with something like Ruby where everything is a pointer.


This is a very... disturbing way to describe the language. I think what you mean to say is that Ruby variables have reference semantics rather than value semantics. (The same is true in Python, and for non-primitives in Java, and, if I understand correctly, for class-as-opposed-to-struct types in C#.)

Quote:
Then, when you get to variables in C, people will be wondering why you'd bother with something that wasn't a pointer. At least, that's what I thought when I started learning C after Smalltalk and Java.

If you want to stay in C, I think a good way would be to not start with stuff like declaring a variable in main(). Have them interact with an API from the start and have the API only interact through pointers. Maybe it doesn't "make sense", but it doesn't need to because the only reason it needs to make sense is to avoid questions like, "Why use pointers when variables are so much easier?" If you never told them about hammers, they won't question why you're solving the problem with a screwdriver.


I really don't think this is a good idea. The whole idea of reference and value semantics is actually quite simple to explain: reference semantics mean you share things by default, and value semantics mean you copy things by default. It so happens that in C and C++, when you want to share objects instead, you have to manage memory in addition to simply requesting that objects be shared. In the other languages, copying objects is straightforward, but it's a lot harder to realize that you actually need to copy something instead of sharing it. :)
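In C terms, "copy by default" versus "share on request" can be sketched as follows. The struct and function names are invented for this illustration.

```c
#include <assert.h>

struct point { int x, y; };

/* Value semantics: assignment copies the whole struct, so changing the
   copy leaves the original alone. */
int copy_then_modify(void)
{
    struct point a = { 1, 2 };
    struct point b = a;               /* independent copy */
    b.x = 99;
    return a.x;                       /* still 1 */
}

/* Sharing has to be requested explicitly with a pointer: both names now
   reach the same object. */
int share_then_modify(void)
{
    struct point a = { 1, 2 };
    struct point *shared = &a;
    shared->x = 99;
    return a.x;                       /* now 99 */
}
```

Here the shared object has automatic storage, so no memory management is involved. The extra bookkeeping mentioned above appears once the shared object has to outlive the function that created it.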

Quote:
Original post by Zahlman
Quote:
Original post by MadKeithV
A pointer-to-int and an array-to-int are nearly indistinguishable in memory except for the level of indirection. The array points to a memory location where integer values are stored, the pointer points to an address that points to a memory location where an integer value is stored.


No. The array is a memory location (rather, a sequence of adjacent memory locations) where integer values are stored. The pointer is an address of a memory location where an integer value is stored (with the note that this location might actually be within an array, especially at the beginning). Alternatively said, it "points at" that memory location. There is also no array-to-int; it is an array-of-int.


I agree that my wording was unfortunate and confusing - the main point of my post was to explain the disassembled code which is pretty unambiguous and supports what I meant (and what you said much more clearly).
-- for myself for not spending enough time editing my post, ++ for you for correcting my language :)

Quote:
Original post by Way Walker
I think the best solution is to not start with a language like C.

I was specifically asked to give a two-week introductory course in C. The students have a Java background, so I don't have to spend too much time on explaining control structures - everything else pretty much starts at zero.


My outline for the first week is

day 1: K&R chapter 1 (Tutorial Introduction) minus external variables, I kicked #define in favor of constants, #include is treated as magic, plus some history on C

day 2: K&R chapter 2 (Types, Operators, Expressions) plus repetition of binary representation of numbers, two's complement and bitwise arithmetic

day 3: K&R chapter 3&4 (Control Flow & Functions and Program Structure), especially internal variables and recursion (Java programmers tend to not understand low level concepts like stack frames), minus the preprocessor

day 4: K&R chapter 5 (Pointers and Arrays), focus on the C declarator syntax (is that what it's called?) and typedefs

day 5: storage classes, external variables, dynamic memory allocation, multiple compilation units, linkage, header files (#include magic resolved)


I don't have an outline for the second week yet, chapters 6&7 (structures & IO) will surely find their way in, chapter 8 (The UNIX system interface) seems pretty useless to me.

I will spend one day on the preprocessor, and some time on legacy stuff like macros for constants or implicit function declarations, because sooner or later, the students will be confronted with legacy code.

Oh and of course I have to talk about security issues and undefined behavior a lot.

Quote:
Original post by rip-off
If an array is really a pointer, then why does sizeof() lie? Why is pointer arithmetic disallowed? The only conclusion is that the array is not a pointer.


Arrays are pointers in ANSI C. The major difference between char *array, and char array[10] is that char *array is a 4 byte auto variable on the stack used to reference memory that's generally allocated on the heap. So sizeof(array), when array is declared as char *array, will return the size used on the stack, so sizeof(char *) in this instance.

The other declaration, char array[10], allocates 10*sizeof(char) on the stack, so sizeof(array) will return 10*sizeof(char). This is because variables on the stack can have their size determined at compile time and this information is stored in the binary, which is where sizeof gets its information.

Quote:
Original post by jpilon
Arrays are pointers in ANSI C.


This subject has already been discussed to death. In ANSI C, pointers and arrays behave differently: you can determine the size of an array even if it wasn't allocated on the stack, the differing memory layouts prevent casting an array to a pointer outside of the very strict semantics of decay, and an array cannot take part in pointer arithmetic without being decayed first. Given these differences, stating that arrays are pointers at the level of the language is just mistaken.
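A small sketch of those differences, under the assumption of a hosted C99 compiler. The macro and function names are illustrative only.

```c
#include <assert.h>
#include <stddef.h>

/* Number of elements in a true array; gives a meaningless result if
   handed a pointer. */
#define ELEMENT_COUNT(a) (sizeof (a) / sizeof (a)[0])

size_t size_of_array_object(void)
{
    char array[10];
    return sizeof array;              /* 10 * sizeof(char), i.e. 10 */
}

size_t size_after_decay(void)
{
    char array[10];
    char *p = array;                  /* decay: p == &array[0] */
    return sizeof p;                  /* sizeof(char *), whatever that is here */
}

size_t element_count_of_int_array(void)
{
    int values[4] = { 1, 2, 3, 4 };
    /* values++;   would not compile: an array is not a modifiable lvalue */
    return ELEMENT_COUNT(values);     /* 4 */
}
```

The same loss of size information happens at every function call that takes an array argument, which is why C APIs usually pass the element count alongside the pointer.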

Quote:
Original post by ToohrVyk
the inability for an array to support pointer arithmetic without being decayed first.

Arrays don't even support array indexing without decay to a pointer :)
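This follows from how the subscript operator is defined: a[i] means *(a + i), so the array operand decays before the addition happens. A minimal check, with a made-up function name:

```c
#include <assert.h>

/* Returns 1 if all the spellings of "third element" agree. */
int index_forms_agree(void)
{
    int a[4] = { 10, 20, 30, 40 };
    int *p = a;                       /* explicit version of the decay */

    return a[2] == *(a + 2)           /* subscript is pointer arithmetic */
        && *(a + 2) == 2[a]           /* addition commutes, so this is legal too */
        && p[2] == a[2]               /* a pointer indexes the same way */
        && &a[2] == a + 2;            /* same address either way */
}
```

The 2[a] form is only there to show the definition at work; nobody should write it in real code.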

Quote:
Original post by MadKeithV
[...]A pointer-to-int and an array-to-int are nearly indistinguishable in memory except for the level of indirection.[...]

Where in the C standard is that stated? Implementation details are a wonderful thing, but it's important to distinguish what is (de facto) from what must be (de jure).

The same is true for those saying that pointers are integers holding addresses - addresses are semantically different from integers in a significant way, and the standard (C++ at least, but probably also C) doesn't require them to be manipulable in integer form (only that they can be converted to certain types and back again and still point to the same thing {if possible, depending on type sizes etc}).
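In C99 and later, the "convert and convert back" guarantee is spelled uintptr_t. The sketch below assumes the implementation provides that type (it is optional, though nearly universal), and the function name is invented. The only thing the standard promises is the round trip through void *; what the integer looks like in between is implementation-defined.

```c
#include <assert.h>
#include <stdint.h>

/* Returns 1 if a pointer survives a trip through an integer type. */
int round_trip_preserves_pointer(void)
{
    int value = 7;
    int *p = &value;

    uintptr_t bits = (uintptr_t)(void *)p;  /* representation is implementation-defined */
    int *q = (int *)(void *)bits;           /* guaranteed to compare equal to p */

    return q == p && *q == 7;
}
```

Doing arithmetic on bits and converting the result back is exactly the kind of thing the standard does not promise to make meaningful.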

Quote:
Original post by Zahlman
Quote:
Original post by Way Walker
I think the best solution is to not start with a language like C. Start with something like Ruby where everything is a pointer.


This is a very... disturbing way to describe the language. I think what you mean to say is that Ruby variables have reference semantics rather than value semantics. (The same is true in Python, and for non-primitives in Java, and, if I understand correctly, for class-as-opposed-to-struct types in C#.)


If I had said it again, I'd probably have said, "where every variable is a pointer," but I don't know if that's more or less disturbing. [smile]

But, really, I don't see what's so bad about saying that. Do you think anyone misunderstood? My intent was only to emphasize the comparison between variables with reference semantics in Ruby et al. and understanding pointers (variables with reference semantics) in C. And I certainly don't think it's any worse than the claims I used to hear about Java not having pointers.

Quote:

I really don't think this is a good idea. The whole idea of reference and value semantics is actually quite simple to explain: reference semantics mean you share things by default, and value semantics mean you copy things by default.


Simple to explain, sure, but apparently hard to understand. Really, I'm just saying what worked for me. Pointers never gave me any trouble because that's where I started. I guess it was a trade off since it took me a little to understand why one would bother with something that didn't have reference semantics (they just seemed like neutered variables to me), but it's a trade off I'm satisfied with.

Quote:
Original post by Way Walker
If I had said it again, I'd probably have said, "where every variable is a pointer,"

Wouldn't that be -sorry for the pun- pointless? If pointers point to variables, and all variables are pointers, then you have no payload data anymore, only pointers pointing to other pointers pointing to other pointers ad infinitum :)

Quote:
Original post by DevFred
Quote:
Original post by Way Walker
If I had said it again, I'd probably have said, "where every variable is a pointer,"

Wouldn't that be -sorry for the pun- pointless? If pointers point to variables, and all variables are pointers, then you have no payload data anymore, only pointers pointing to other pointers pointing to other pointers ad infinitum :)


I think the language of the C standard would be to say that pointers point to objects.

Also, if you're teaching to a group that already knows Java, then they should already be familiar with the concept of value vs. reference semantics and why one would need reference semantics since, last I checked (it's been a while) all primitives in Java have value semantics, all classes have reference semantics, and one wraps a primitive in an object (float vs. Float) to get reference semantics (so the idea of pointer-to-float should be familiar). The only sticking point should be the array vs. "pointer to the first element of an array" thing.

Quote:
Original post by jpilon
Quote:
Original post by rip-off
If an array is really a pointer, then why does sizeof() lie? Why is pointer arithmetic disallowed? The only conclusion is that the array is not a pointer.


Arrays are pointers in ANSI C. The major difference between char *array, and char array[10]


Are you even reading what you're writing? If arrays were pointers, there wouldn't be a difference between the two things. There is, so they aren't.

Quote:
is that char *array is a 4 byte auto variable on the stack used to reference memory that's generally allocated on the heap. So sizeof(array), when array is declared as char *array, will return the size used on the stack, so sizeof(char *) in this instance.


Not really. There are a lot of reasons why it would refer to memory that isn't heap-allocated. In particular, if it points at a string literal, the allocation is in yet another place, "static storage".

Quote:
The other declaration, char array[10], allocates 10*sizeof(char) on the stack, so sizeof(array) will return 10*sizeof(char). This is because variables on the stack can have their size determined at compile time and this information is stored in the binary, which is where sizeof gets its information.


No. If the 'char array[10]' is a member of a struct, and the struct is allocated on the heap (with malloc(); with new, which technically allocates on the free store rather than the heap; indirectly, by being a member of something else; or in whatever other way), then the array is allocated on the heap too. Parts of a struct can't magically teleport themselves to the stack. And sizeof(array) will still yield 10 in this case. (By the way, sizeof(char) is 1 by definition.)

And you also completely missed the point about pointer arithmetic. You can't assign a pointer to an array, which is a pretty good sign that it's something different. You could try to just pretend it's a const pointer, but by the time you're done explaining all the differences, you're left with "See? All an array is, is a pointer... which is const... and magically carries its pointed-at data around with it... and doesn't actually take up space for a value to point at the pointed-at data... and reports its size as the size of the pointed-at data." It may indeed be the case that it gets converted into an ordinary pointer if you so much as breathe on it, but it's very, very clearly a different thing.
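The struct case is easy to try out in C. This is a sketch with hypothetical names (struct record and both functions are made up); it uses malloc() so the array member ends up wherever the allocator puts the struct.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct record {
    int  id;
    char name[10];                    /* array stored inside the struct */
};

/* Returns sizeof the member array of a heap-allocated struct
   (or 0 if the allocation fails). */
size_t heap_member_array_size(void)
{
    struct record *r = malloc(sizeof *r);
    if (r == NULL)
        return 0;

    r->name[0] = '\0';
    size_t n = sizeof r->name;        /* 10: comes from the type, not from the stack */
    free(r);
    return n;
}

/* Returns 1 if the member array sits inside the struct's own storage. */
int member_array_lives_in_struct(void)
{
    struct record r = { 0, "" };
    return (char *)&r + offsetof(struct record, name) == r.name;
}
```

sizeof gets its answer from the declared type at compile time, so it makes no difference whether the object ends up on the stack, on the heap, or in static storage.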

Quote:
Original post by Way Walker
But, really, I don't see what's so bad about saying that. Do you think anyone misunderstood?


It leads to DevFred's objection, and also leaves people wondering exactly what a pointer is in that context. There are more things you can do with a C pointer than are done behind the scenes by any pointers in the Ruby implementation. It also doesn't make logical sense, because it confuses the map for the territory: the object isn't actually the pointer.

It would be more accurate to say "everything is indirected through a reference implicitly", but at that point, you might as well just say "everything uses reference semantics".

Quote:
and understanding pointers (variables with reference semantics) in C.


But they aren't. C's pointers are far hairier beasts.

Quote:
And I certainly don't think it's any worse than the claims I used to hear about Java not having pointers.


LOL. Marketing is marketing. I still wonder why they decided to call it a NullPointerException at the same time.

Quote:
Original post by Zahlman
Quote:
Original post by Way Walker
But, really, I don't see what's so bad about saying that. Do you think anyone misunderstood?


It leads to DevFred's objection, and also leaves people wondering exactly what a pointer is in that context. There are more things you can do with a C pointer than are done behind the scenes by any pointers in the Ruby implementation. It also doesn't make logical sense, because it confuses the map for the territory: the object isn't actually the pointer.


Well, DevFred's objection comes from seeing the variable as the object itself, which I think is the main trouble people face in wrapping their mind around reference semantics. I will admit, though, that what I suggested probably leads to the opposite problem. It's the path I took (I cut my teeth on Smalltalk) and I had trouble wrapping my mind around value semantics.

Also, however hairy pointers may be, I believe they're the correct answer to, "How do I get something like a Java reference in C?" (C++ is another matter, with its 31 flavors of reference semantics.)

Quote:
Original post by Extrarius
Quote:
Original post by MadKeithV
[...]A pointer-to-int and an array-to-int are nearly indistinguishable in memory except for the level of indirection.[...]

Where in the C standard is that stated? Implementation details are a wonderful thing, but it's important to distinguish what is (de facto) from what must be (de jure).

I fully understand where you are coming from and continuing the discussion is a bit foolhardy of me, but it's summer and I like getting deep and dirty with the technical details and learning new things about my most love/hate programming language :) !

I had to look all that's below up, my original statement was pretty "seat of my pants", but I believe the standard does support my gut feeling.

The example below concerns multidimensional arrays but since multidimensional arrays are collections of single-dimension arrays I believe the comment extends to those.
"The C++ Standard", Appendix C, Section 7.2:
"The array is simply 15 ints that we access as if it were 3 arrays of 5 ints. In particular, there is no single object in memory that is the matrix - only the elements are stored."
As discussed here already, the name of the array can be used as a pointer to the first element (actual language from The C++ Standard book), and also taking a pointer to the element one beyond the end of an array is guaranteed to work though you cannot read or write the data at that location. Combine that with the definition of pointers and pointer arithmetic rules and that makes a pretty solid case that in any non-esoteric C++ implementation the main part of an array is a contiguous set of its contained type.
The C++ Standard even mentions that taking a pointer to one element before the start of the array is undefined behavior because certain architectures would allocate arrays at the beginning of a memory segment making "one less than the start" an illegal address.

This seems to be a case of "if it moves like a duck and quacks like a duck...". Once you get down and dirty and past the compiler's type system, arrays and pointers should be pretty similar, because you're dealing with "a bunch of memory" or "a bit of memory", and even the standard seems to support that notion.
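The layout claims above can be checked with a short sketch. It is written in C, which shares these array rules with C++, and the function names are invented. The 3-by-5 shape mirrors the "15 ints" in the quoted passage.

```c
#include <assert.h>
#include <stddef.h>

/* Returns 1 if a 3x5 int matrix occupies exactly 15 ints, with no
   hidden header or row pointers. */
int matrix_is_fifteen_ints(void)
{
    int ma[3][5];
    return sizeof ma == 15 * sizeof(int);
}

/* Distance in bytes from the start of row 0 to the start of row 1. */
ptrdiff_t row_stride_in_bytes(void)
{
    int ma[3][5];
    return (char *)&ma[1][0] - (char *)&ma[0][0];
}

/* Sums an array using a one-past-the-end pointer as the loop bound.
   That pointer may be formed and compared, but never dereferenced. */
int sum_with_end_pointer(void)
{
    int a[4] = { 1, 2, 3, 4 };
    int *end = a + 4;
    int sum = 0;

    for (int *p = a; p != end; ++p)
        sum += *p;
    return sum;                       /* 1 + 2 + 3 + 4 */
}
```

There is no matching "one before the start" pointer: forming a - 1 is undefined behavior, which is the case described in the post above.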

On the other hand I (and Bjarne Stroustrup) would still advocate using std::vector instead, simply because arrays are so low level and only lightly protected by the type system. I was pleasantly surprised at how many times in the book Bjarne warns against using arrays in favor of the std library.

[Edited by - MadKeithV on July 31, 2008 4:16:04 AM]
