Pointers/References

Started by
48 comments, last by WillC 19 years, 3 months ago
Hi all. There was recently a thread on this forum in which a few people seemed fairly adamant that references were preferable to pointers when passing mutable parameters by reference, so a function can modify them. I think there's probably some good grounds for their opinions, so I'm interested to know why. Personally, I generally prefer pointers for this as it seems to make the code I write more self-documenting. For example, a piece of example code was offered in response to a question in a different thread recently:
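template <typename T>
byte* read(T& into, byte* buffer)
{
   // Copy a T out of the front of the buffer (assumes buffer is suitably aligned for T).
   into = *(reinterpret_cast<T*>(buffer));
   // Return the position just past the bytes that were read.
   return buffer + sizeof(T);
}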

//example use:
byteBuffer = read(intVal, byteBuffer);
I hope the original poster of the code doesn't mind me borrowing it :o) The code is a good solution to a simple problem, but personally I would normally pass 'into' as a pointer, rather than a reference, so it looks like this when used:
byteBuffer = read(&intVal, byteBuffer);
The extra ampersand makes it clear when I (or anyone else) scan the code in 3 months' time that intVal is likely to be getting modified by the function. This is a fairly minor preference; there doesn't seem to be any big technical advantage to either pointer or reference. References can't be null, but that seems as much a detriment as a benefit. So, given that I'm not particularly attached to either pointer or reference for this purpose, I'm curious as to why someone would hold a strong opinion on the subject. What am I missing? Thanks.
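
For anyone who wants to compare the two call sites side by side, here is a rough sketch of both variants. It is not the original poster's code: the names read_ref, read_ptr and read_example are made up for illustration, the byte typedef is an assumption (the thread never shows how byte is defined), and memcpy is swapped in for the cast so the sketch does not depend on alignment.

```cpp
#include <cassert>
#include <cstring>

typedef unsigned char byte;  // assumption: byte is an unsigned char

// Reference version (hypothetical name): called as read_ref(value, buf).
template <typename T>
byte* read_ref(T& into, byte* buffer)
{
    std::memcpy(&into, buffer, sizeof(T));  // byte-wise copy, no alignment requirement
    return buffer + sizeof(T);
}

// Pointer version (hypothetical name): called as read_ptr(&value, buf).
template <typename T>
byte* read_ptr(T* into, byte* buffer)
{
    assert(into != nullptr);  // a pointer can be null, so this version has to care
    std::memcpy(into, buffer, sizeof(T));
    return buffer + sizeof(T);
}

// Hypothetical usage: read two ints back out of a byte buffer.
inline bool read_example()
{
    const int source[2] = { 7, 42 };
    byte raw[sizeof(source)];
    std::memcpy(raw, source, sizeof(source));

    int first = 0, second = 0;
    byte* cursor = raw;
    cursor = read_ref(first, cursor);    // nothing at the call site hints at mutation
    cursor = read_ptr(&second, cursor);  // the & hints that second may change
    return first == 7 && second == 42 && cursor == raw + sizeof(source);
}
```

Both versions do the same work; the only visible difference for the reader of the calling code is the ampersand.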
Harry.
Quote:There was recently a thread on this forum in which a few people seemed fairly adamant that references were preferable to pointers when passing mutable parameters by reference, so a function can modify them. I think there's probably some good grounds for their opinions, so I'm interested to know why.

There are a few reasons, and I'll go into them a bit further on. Passing parameters that get modified by reference is mostly a matter of consistency, though, once you're passing all the parameters that don't get modified by constant reference rather than by value. Passing by constant reference is a hell of a lot better than passing by value. Passing parameters that are in fact modified inside a function by non-constant reference just kind of follows from that.

Quote:The extra ampersand makes it clear when I (or anyone else) scan the code in 3 months' time that intVal is likely to be getting modified by the function.

If you adhere to passing by constant reference for all parameters that are not modified in a function, and passing by non-constant reference for all parameters that are, the function signature tells anyone using it whether a parameter will or will not be modified within that function, which acts as a method of self-documentation as well.

Quote:This is a fairly minor preference; there doesn't seem to be any big technical advantage to either pointer or reference. References can't be null, but that seems as much a detriment as a benefit. So, given that I'm not particularly attached to either pointer or reference for this purpose, I'm curious as to why someone would hold a strong opinion on the subject. What am I missing?

It seems you are missing the wonders of passing by constant reference. Consider the example below:

int Foo(int val1, int val2)
{
   return val1 * val2;
}


This simple function takes two parameters by value. It doesn't modify these parameters anywhere in the function, it just makes a calculation based on their values. When passing these values to the function however, copies of them have to be made, which are local to the scope of the Foo function, and are destroyed when it exits. This may seem minor for a few integers, but when you're passing classes around with complicated constructors, large numbers of data members, or large memory blocks allocated to them, you'll be wasting a lot of time creating and deleting copies of these objects when you pass them to functions, and thrashing your memory in the process. Instead, you can pass them by constant reference:

int Foo(const int& val1, const int& val2)
{
   return val1 * val2;
}


This will not create the temporary copies. Instead, the original versions of the parameters will be referenced from within the function. This can lead to a major increase in performance over passing by value, as well as avoid errors from someone not implementing a copy constructor. If a class, for example, allocates a block of memory which is freed in the destructor, but you have not provided a copy constructor that makes a deep copy by allocating a new buffer and copying the contents into it, then when the copy is destroyed, it will free the buffer that is still held by the original object. This won't manifest itself immediately, and it'll be a real pain in the ass to track down if it happens.
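
To make the copying visible, here is a small sketch under made-up names (Tracked, size_by_value, size_by_const_ref and copy_count_demo are all hypothetical). It only counts copy-constructor calls; it says nothing about how expensive a copy is for any particular class.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical class that counts how many times it has been copied.
struct Tracked
{
    static int copies;
    std::vector<int> data;

    explicit Tracked(std::size_t n) : data(n, 1) {}
    Tracked(const Tracked& other) : data(other.data) { ++copies; }
};
int Tracked::copies = 0;

// By value: the argument is copy-constructed for the call.
inline std::size_t size_by_value(Tracked t) { return t.data.size(); }

// By const reference: the function works on the caller's own object.
inline std::size_t size_by_const_ref(const Tracked& t) { return t.data.size(); }

inline void copy_count_demo()
{
    Tracked big(1000);
    Tracked::copies = 0;

    size_by_value(big);
    assert(Tracked::copies == 1);  // one copy was made for the by-value call

    size_by_const_ref(big);
    assert(Tracked::copies == 1);  // the const-reference call made no further copy
}
```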

The fact that you're passing by constant reference prevents the function from modifying that parameter (casting voodoo notwithstanding). You know it will not be modified, and you can rely on that being the case, rather than just trusting that the guy who wrote the function stuck to the conventions.

You probably pass a pointer when you're throwing around large classes currently, but that has some disadvantages. First of all, I have no way of knowing for sure if that parameter will be modified within the function. Second, inside the function, I have to worry about whether I was passed a valid pointer. Third, the syntax is different. You can get some ugly looking stuff with pointers when you want to use operators on the thing they reference. You often end up with things like (*pointer)++ or the like.


Passing by reference for parameters that will be modified, and constant reference for ones that won't, is just better in general. Not only are there practical reasons like the ones I've listed here, it also makes it easier to check and enforce design models, through const correctness, which this is a part of.
I'm going to direct you to this thread, which covers pretty much everything, and in not too many posts either. Read it through, there's a lot of good information in there.
This is a touchy subject, so I'll refrain from any potentially inflammatory remarks ;)
Quote:Nemesis2k2
The fact that you're passing by constant reference prevents the function from modifying that parameter within the function (casting voodoo notwithstanding). You know it will not be modified, and you can rely on that being the case, rather than just trusting the guy who wrote the function stuck to the conventions.

Pointers to const work the same way; there's really no language feature here that you can only get with references.
void do_const(const int* p) { ... }
In the code above, you gain nothing by passing by reference. This is because, as far as the machine code goes, those parameters are no longer ints but pointers (addresses). You don't save much by putting a pointer on the stack instead of an int.

The benefits start showing when you're passing around non-trivial types, which require complex copy construction semantics.
daerid@gmail.com
Thanks, but I'm already aware of the benefits of const references. I also know how references are implemented by the compiler. My question is specifically about:

Quote:passing mutable parameters by reference, so a function can modify them


i.e. why this:
template <typename T> byte* read(T& into, byte* buffer)

and not this:
template <typename T> byte* read(T* into, byte* buffer)

Harry.
Since Nemesis2k2 took the time to write such a lot, I think it might be rude not to respond to some specific points other than the const references thing :o)

Quote:Original post by Nemesis2k2
If you adhere to passing by constant reference for all parameters that are not modified in a function, and passing by non-constant reference for all parameters that are, the function signature tells anyone using it whether a parameter will or will not be modified within that function, which acts as a method of self-documentation as well.

This is fine in principle, but in practice I don't want to look up every function prototype when I'm scanning code. I'd rather the self-documenting code was the bit I'm reading.

Quote:You probably pass a pointer when you're throwing around large classes currently, but that has some disadvantages. First of all, I have no way of knowing for sure if that parameter will be modified within the function.

If it's not modified then I'd rather pass it as const reference.

Quote:Second, inside the function, I have to worry about if I was passed a valid pointer.

Sometimes it's useful to pass in a null pointer. Like I said, this seems as much a benefit as a detriment.

Quote:Third, the syntax is different. You can get some ugly looking stuff with pointers when you want to use operators on the thing they reference. You often end up with things like (*pointer)++ or the like.
This is more convincing, although the way I write code I'm generally not worried about a little extra typing for the sake of clarity. At that level of complexity, it's not ugly enough to make the code hard to read, but I can imagine that it could get quite awkward if you had enough going on all in one statement. Still, if I'm doing something complicated I'm fairly happy if my code reflects that.

Quote:Passing by reference for parameters that will be modified, and constant reference for ones that won't, is just better in general.

This is less convincing :o)

I noted from that other thread you linked that you can pass literals as const references, whereas taking the address of a literal to pass it as a pointer generally requires a separate declaration statement. That seems like a feature worth having, although again I'm more concerned with clarity than with extra typing.
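
A quick sketch of that literal case, with hypothetical names (doubled_ref, doubled_ptr, literal_demo):

```cpp
#include <cassert>

inline int doubled_ref(const int& value) { return value * 2; }
inline int doubled_ptr(const int* value) { return *value * 2; }

inline int literal_demo()
{
    // A literal can be passed straight to a const reference parameter:
    // a temporary int is created and bound to the reference.
    int a = doubled_ref(21);

    // A literal has no address, so the pointer version needs a named object first.
    const int twenty_one = 21;
    int b = doubled_ptr(&twenty_one);

    assert(a == b);
    return a;
}
```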
Harry.
There are a few reasons that I stated above:

"You probably pass a pointer when you're throwing around large classes currently, but that has some disadvantages. First of all, I have no way of knowing for sure if that parameter will be modified within the function. Second, inside the function, I have to worry about whether I was passed a valid pointer. Third, the syntax is different. You can get some ugly looking stuff with pointers when you want to use operators on the thing they reference. You often end up with things like (*pointer)++ or the like."

The third one is the main concern for me. This has caused me real problems a few times, when I'm passing around a class using a pointer, but that class has overloaded operators. You have to dereference the pointer first before you can call the operators, otherwise you're trying to do the operations on the pointer, not the object it references. This looks quite ugly in code, and can lead to errors if I forget to dereference. When I pass by reference, I can just use the operator normally without worrying about it.
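
Roughly what that looks like, using a made-up Counter class (all names here are hypothetical, not taken from any real code in this thread):

```cpp
#include <cassert>

// Hypothetical class with an overloaded operator.
struct Counter
{
    int value;
    Counter() : value(0) {}
    Counter& operator++() { ++value; return *this; }
};

// Reference parameter: the operator is used directly on the object.
inline void bump_ref(Counter& c) { ++c; }

// Pointer parameter: the pointer has to be dereferenced first.
// Writing ++c here would still compile, but it would advance the local
// pointer copy and leave the Counter itself untouched.
inline void bump_ptr(Counter* c) { ++(*c); }

inline int bump_demo()
{
    Counter c;
    bump_ref(c);
    bump_ptr(&c);
    assert(c.value == 2);  // each call incremented the counter once
    return c.value;
}
```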

Basically though, I pass by reference because I pass by const reference. It's mostly a matter of consistency. I only appreciated some of the other small perks of doing so once I actually started using them in real situations. They all really come back to questions of style though, rather than the obvious issues related to passing by const reference, so it's really just whatever you prefer.
Quote:This is fine in principle, but in practice I don't want to look up every function prototype when I'm scanning code. I'd rather the self-documenting code was the bit I'm reading.

Fair enough. I've only really thought about this from the point of view of writing code, not reading it later.

Quote:If it's not modified then I'd rather pass it as const reference.

I thought he didn't know about const references when I was writing that post.

Quote:Sometimes it's useful to pass in a null pointer. Like I said, this seems as much a benefit as a detriment.

Heh, I agree. I have yet to encounter a situation where I felt a function needed a parameter that could be null, but I have quite a few classes that use pointers internally for a few things, where references do not suit. This lack of consistency could be considered a negative I suppose, but there are times where I require the conveniences of one, without the hassles of the other.

Quote:This is more convincing, although the way I write code I'm generally not worried about a little extra typing for the sake of clarity. At that level of complexity, it's not ugly enough to make the code hard to read, but I can imagine that it could get quite awkward if you had enough going on all in one statement. Still, if I'm doing something complicated I'm fairly happy if my code reflects that.

Overloaded operators are all about providing simplicity though. Having to worry about this pointer dereferencing destroys that. I would more likely scrap overloaded operators in certain circumstances where I would normally use them (iterators, a recent example that I encountered), and just provide a regular member function instead, because the pointer made the operator harder to use and read than a regular function call.
As has been stated, a reference can never legally point to 0, i.e. there is no such thing as a null reference, unlike with pointers. If you make a pointer parameter instead of a reference parameter, you open up more possibilities for error since now people can pass a null pointer. When working with references, this is simply not a possibility so you avoid a possible misuse of your function. Take your function as an example -- what happens when you pass a null pointer? Kaboom! References get rid of this problem without needing any runtime checks. This is the main reason to prefer references over pointers. If your function can validly take a null pointer, then it doesn't make sense to use a reference anyway, so a choice of reference or pointer in that case shouldn't even come up.
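
As a rough illustration of the difference, with hypothetical names (reset_ptr, reset_ref, reset_demo): the pointer version has to decide what a null argument means and check for it at runtime, while the reference version has nothing to check.

```cpp
#include <cassert>

// Pointer parameter: null is a legal argument, so it has to be handled.
inline bool reset_ptr(int* target)
{
    if (target == nullptr)
        return false;  // nothing to write to
    *target = 0;
    return true;
}

// Reference parameter: the caller must supply an object, so there is no check.
inline void reset_ref(int& target) { target = 0; }

inline void reset_demo()
{
    int a = 5, b = 5;
    assert(reset_ptr(&a) && a == 0);
    assert(!reset_ptr(nullptr));  // compiles fine, only caught by the runtime check
    reset_ref(b);
    assert(b == 0);
}
```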

If you are concerned about people not knowing whether the parameter is modified, then the problem is not with the code. Your point hinges on the notion that the person calling the function simply doesn't know what it does, since if he knew what it did, it would be clear whether the object being passed would or would not be modified. For instance, with your "read" function and the parameter called into, which is declared as a reference to a const-unqualified object type, the caller should know that, since the parameter is a non-const reference, its state will potentially be modified by the operation. As a side note, he should assume the same of the buffer parameter: if the buffer will never be changed it should point to const data, otherwise you can't use the function with a const buffer, which is a design flaw (and one your current implementation has).
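
A minimal sketch of what a const-correct buffer parameter might look like. The name read_from and the byte typedef are assumptions for illustration, and memcpy is used in place of the cast from the original snippet.

```cpp
#include <cassert>
#include <cstring>

typedef unsigned char byte;  // assumption: byte is an unsigned char

// The buffer is only read, so it is taken (and returned) as pointer-to-const.
template <typename T>
const byte* read_from(T& into, const byte* buffer)
{
    std::memcpy(&into, buffer, sizeof(T));
    return buffer + sizeof(T);
}

inline int const_buffer_demo()
{
    static const byte stored[sizeof(int)] = {};  // const buffer of zero bytes
    int value = -1;
    const byte* next = read_from(value, stored); // would not compile with a byte* parameter
    assert(next == stored + sizeof(int));
    return value;
}
```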

Finally, explicitly using & at the call site doesn't actually make the code more self-documenting in the way you claim. When you pass a pointer to the object, you have exactly the same possibilities as when passing by reference: the object can potentially be modified, or not. The constness of the pointed-to type is what matters. While you might personally associate passing a pointer to a function with making the target object modifiable, that is not what passing by pointer necessarily means, and if I, or anyone else not familiar with your convention, come across your code, we are going to have the same questions as with reference parameters. In fact, using pointers opens up even more questions. Remember, again, a pointer can be a null pointer, and as well, a pointer can have its address taken. A reference does not have these properties and therefore limits its use much more appropriately to the purpose at hand: to get a way to reference a valid object and potentially modify its value.
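
To illustrate with two made-up functions (peek, poke and ampersand_demo are hypothetical names): both calls below look the same at the call site, but only one of them modifies the object.

```cpp
#include <cassert>

// Takes a pointer but only reads through it.
inline int peek(const int* p) { return *p; }

// Takes a pointer and writes through it.
inline void poke(int* p) { *p = 99; }

inline void ampersand_demo()
{
    int x = 1;

    int seen = peek(&x);  // & at the call site, yet x is left unchanged
    assert(seen == 1 && x == 1);

    poke(&x);             // same-looking call, and this time x changes
    assert(x == 99);
}
```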

So, using a reference more concisely represents what you are doing by appropriately limiting use, it prevents more possible errors such as null pointer parameters, and it even takes less code to call the function.

This topic is closed to new replies.
