
Efficiency: Pointers vs. References


I was surprised to find out today that a very large company disallows the use of references for efficiency reasons.

void DoSomething( Stuff& stuff );
vs.
void DoSomething( Stuff* stuff );

The person I was talking to agreed with the policy, yet couldn't convince me. Something about constructors being called more than you intended, but the person couldn't give me any concrete examples.

I personally use references when I can, pointers when I must. Pointers are more complicated and more error-prone. They can be null, which you should check for before use. References don't have this problem (unless somebody was bad and did something like dereferencing a null pointer and binding the result to a reference). Pointers can also be uninitialized, which you can't really guard against other than by remembering to initialize them.

So, enough ranting about references. How are they inefficient, and is it enough of a reason to flat-out prohibit them? My first guess is that the policy was made by somebody who has been in the industry for 15 years and is afraid of new developments (the old "virtual pointers being slow" arguments, etc.).

In the example you give (passing in a parameter), they have 100% identical performance characteristics.

It would be a violation of every rule to call a constructor extra times, so how exactly could these differ in efficiency:


void DoSomethingRef(Stuff &stuff);
void DoSomethingPtr(Stuff *stuff);

Stuff obj1;

DoSomethingRef(obj1);   // binds the reference directly to obj1
DoSomethingPtr(&obj1);  // passes the address of obj1



Disassemble the compiler output and you will find they are 100% identical in every way (compare the release builds, not the debug builds).

Quote:
Original post by Thrump
The person I was talking to agreed with the policy, yet couldn't convince me. Something about constructors being called more than you intended, but the person couldn't give me any concrete examples.


The person you talked to was misinformed. Passing parameters by value can cause extra copying. That is one of the reasons why passing (const) references is preferable. Using pointers is definitely not a good idea.

Quote:
I personally use references when I can, pointers when I must. [...]


Correct on all points

Quote:
My first guess is the policy was made by somebody in the industry for 15 years who is afraid of new developments. (the old virtual pointers being slow arguments, etc...)


I think you nailed it. Someone coming from a C background would only know about passing by value and by pointer and would promote the latter. It then becomes enshrined policy and references get ruled out because passing pointers is the norm.

Perhaps they are referring to situations such as passing a string literal to a function that expects a const std::string reference. That will call a std::string constructor which, with a naive implementation, will allocate memory and do an array copy. Yes, this is a lot less efficient than passing a const char*, but really, I doubt it matters in just about all cases. I'd call this a form of premature optimization, and would potentially consider actually writing a semi-formal paper arguing for a reversal of the policy. If that didn't convince anyone, then I'd probably drop it and just go with what they insist, 'cause sometimes you just gotta accept what higher-ups assert, but I'd definitely try at least once to provide a substantial argument to change their minds.

Agony, do you have any more information on that? That is not how I would have expected a const std::string reference to be passed. Why would it behave differently than passing any other class by reference?

Quote:
Original post by Agony
Perhaps they are referring to situations such as passing a string literal to a function that expects a const std::string reference. That will call a std::string constructor which, with a naive implementation, will allocate memory and do an array copy. Yes, this is a lot less efficient than passing a const char*...

No. Certainly not a "lot" less efficient. Consider that passing a string literal to a function that takes const char * requires all sorts of housekeeping/setup within the function to be performed in order to safely make use of that string. So even while the memory may not be copied, the overhead is similar.

And any non-naïve implementation of std::string won't bother to allocate memory in that instance. It'll simply point at the memory that contains the string literal (since the std::string instance is immutable).

In other words, in all circumstances, the performance characteristics of pointers vs references are a wash. References, however, have the upside of syntactic simplicity and more assistance in statically determining correctness from the compiler.

Quote:
Original post by Oluseyi
And any non-naïve implementation of std::string won't bother to allocate memory in that instance. It'll simply point at the memory that contains the string literal (since the std::string instance is immutable).


Allow me to have some doubts about that. Unless the compiler is made aware of the situation, I can't see how a difference can be made between constructing a const and non-const object. Move constructors may change that, but they're not there yet... and even if they were, I wouldn't expect companies with such policies in place to adopt their usage.



At any rate, Agony, I don't think char* vs. std::string really is at issue here, rather std::string vs. std::string* vs. std::string&. It is a problem of education and, in all likelihood, of a policy written when the choice was only between values and pointers, with the argument that pointers had better performance. Implicit here would be "than values". Since references aren't on the list of allowed things, they are therefore forbidden.

Quote:
Original post by Fruny
Allow me to have some doubts about that. Unless the compiler is made aware of the situation, I can't see how a difference can be made between constructing a const and non-const object.

Point. I suppose copy-on-write semantics would have to be implemented to see the sort of behaviors I suggested - and those are rarely, if ever, used in common std::string implementations.

Passing by reference makes it hard to tell when a function might modify the value, just by glancing at the caller. That's about the only benefit I can see for passing pointers.

Quote:
Original post by phil_t
Passing by reference makes it hard to tell when a function might modify the value, just by glancing at the caller. That's about the only benefit I can see for passing pointers.

It's a false benefit. You can pass a pointer by value, for instance, but you won't be able to modify the original pointer. You'd need to pass the pointer by address (i.e., a pointer to pointer, or a reference to pointer). Truth is, with a decent IDE that shows function prototypes, it becomes trivial to tell when a function is liable to modify the value (i.e., makes no assurances that it won't): if it takes a reference, the argument may be modified; if it takes a const reference, it certainly won't be.

With pointers, you also have to mentally resolve the ambiguities around const pointer vs pointer to const value...

Ah, that old refrain. :)

You know, I really do pity people who figure everything has to be made "explicit" in order for code to be "clear". The usual resulting context is that you write lots of extra code that has no effect, sometimes interfering with standard idioms and/or reducing readability because of the added verbosity.

My favourite examples of this phenomenon are explicit comparison of a pointer to NULL in an if-statement, and explicit comparison of booleans to literal boolean values (or worse, an explicit if/else test on a boolean condition to decide what literal boolean to return! In this case, the most I would ever do "explicitly" would be a *cast* to bool - probably a conversion actually, i.e. bool(expression).)

But "pass reference parameters by pointer in order to make the caller explicitly reference the variable, so as to mark the parameter as changeable" certainly qualifies, too. Presumably, the intention is to avoid violating the Principle of Least Surprise (PLS) - i.e. allowing a function to do something unexpected, like modify what was passed to it. But I think you have to be pretty paranoid in order to want to write things like that, considering that there are already at least three plain-common-sense coding guidelines that are applicable here to help defend the PLS:

- Prefer return values to "out parameters".
- Document your code; consult documentation when necessary.
- Give functions descriptive names, such that any "out parameters" would be obvious.

Not to mention, this "explicitness" actually *loses* information when you go to look at the actual function prototype (which is the first place you should be looking if you can't guess what something does - not the call site!): Is that "char**" a C-style string being passed by pointer, or is it an array of C-style strings?

The OP's philosophy is sound, and echoed by a really darned smart guy who just happens to be on the ANSI C++ standardization committee. Of course, he pwns my rambling by just saying "it's a form of information hiding" (which is true) and leaving it at that. [wink]

I think I know what's going on. Consider this:
void foo( int& n ){}
void bar( const int& n ){ foo( n ); }

Normally this would be an error. But some old compilers (e.g. C++Builder 6) would create a temporary object instead:
Quote:
W8031 Temporary used for parameter 'parameter' Compiler warning

In C++, a variable or parameter of reference type must be assigned a reference to an object of the same type.

If the types do not match, the actual value is assigned to a temporary of the correct type, and the address of the temporary is assigned to the reference variable or parameter.

The warning means that the reference variable or parameter does not refer to what you expect, but to a temporary variable, otherwise unused.

It's never stopped me from using references though. Even though THIS compiler reduces the error to a warning, I don't think that's a valid reason not to use references at all (at least there's a warning).

[Edited by - izhbq412 on October 18, 2006 12:16:24 AM]

Quote:
Original post by Zahlman
You know, I really do pity people who figure everything has to be made "explicit" in order for code to be "clear".


You pity them eh? Those poor, poor people. Glad you've got it all figured out though.

New to this board, but damn, there is a lot of elitism here!

Quote:
Original post by phil_t
Quote:
Original post by Zahlman
You know, I really do pity people who figure everything has to be made "explicit" in order for code to be "clear".


You pity them eh? Those poor, poor people. Glad you've got it all figured out though.

New to this board, but damn, there is a lot of elitism here!
You certainly are new! Stick around and you just might find out that you were quoting one of the leetest ppl around on ANY forum, IMHO!

Pointers and references are nearly identical as far as performance goes when used in similar simple cases. References get the upper hand for:
  • Accessing the passed object: in typical use, a pointer can be NULL, which requires an additional check before using it (because if the pointer could not be NULL, then a reference would have sufficed). If your code is unsafe (it dereferences pointers which are known to sometimes be NULL) then you have bigger problems to worry about.

  • Temporary values: you cannot take their address, which forces you to store them in a variable before passing them around through pointers. On the contrary, a constant reference can be used to pass a temporary value, resulting in one less copy.

  • Giving the compiler more knowledge: reference arguments can be optimized away in inline functions by even the most dense compiler (for instance, not requiring a dereference at all). An optimizer might sometimes avoid taking chances with pointers, although this becomes increasingly rare.

Guest Anonymous Poster
Quote:
Original post by Oluseyi
And any non-naïve implementation of std::string won't bother to allocate memory in that instance. It'll simply point at the memory that contains the string literal (since the std::string instance is immutable).
How can the "non-naïve" implementation know that the passed string is a literal? You do know that std::string is implemented in the language itself, as part of the standard library, i.e. it's not a special language construct? And if it's not a literal, it can be modified without the string knowing about it.
Quote:
I suppose copy-on-write semantics would have to be implemented to see the sort of behaviors I suggested - and those are rarely, if ever, used in common std::string implementations.

Even a COW string implementation wouldn't help, because its constructor would only receive a const char*. It could never know if the contents of that C-string changed (just because it is a const char* in the function doesn't mean there can't be a char* pointer to the same memory).

Good point.
A super-optimising compiler could perhaps do this; I don't know if any currently do. It would have to realise that memcpy (or whatever other method of copying memory is used, which may have side effects) is not necessary for string literals, which may be hard to determine.

So it seems it would be up to the compiler and not the string implementation.

But I don't think that this is a big enough issue to recommend not using references!

[Edited by - stevenmarky on October 18, 2006 6:41:49 AM]

Quote:
Original post by Anonymous Poster
Quote:
I suppose copy-on-write semantics would have to be implemented to see the sort of behaviors I suggested - and those are rarely, if ever, used in common std::string implementations.

Even a COW string implementation wouldn't help...

It wouldn't have to allocate memory until the string was modified.

Quote:
Original post by Oluseyi
Quote:
Original post by Anonymous Poster
Quote:
I suppose copy-on-write semantics would have to be implemented to see the sort of behaviors I suggested - and those are rarely, if ever, used in common std::string implementations.

Even a COW string implementation wouldn't help...

It wouldn't have to allocate memory until the string was modified.
The COW string would still need to allocate memory when it was constructed. Example:

char szBuffer[256];
strcpy(szBuffer, "Hello World!");
CowString str(szBuffer);
strcpy(szBuffer, "Goodbye, cruel world");
// str's pointer still points to szBuffer, but the data has changed
// Only real solution is to copy the const char* into a buffer that the COW string knows
// is safe from modification

I imagine a compiler writer could add a hardcoded rule allowing it to use a std::string like interface around a string literal in that case, however I'd assume that the value added to the compiler by doing that would be so small that it wouldn't be worth bothering.

It's amazing what template magic can achieve (works in VisualC++ 8):
#include <iostream>
#include <boost/utility/enable_if.hpp>

template < typename type >
struct is_char_pointer
{
    static bool const result = false;
};

template <>
struct is_char_pointer< char * >
{
    static bool const result = true;
};

template <>
struct is_char_pointer< char const * >
{
    static bool const result = true;
};

class string
{

    public:

        template < typename type >
        string(type string, typename boost::enable_if_c< is_char_pointer< type >::result >::type * = 0)
        {
            std::cout << "mutable: " << string << '\n';
        }

        template < std::size_t length >
        string (char const (&string)[length])
        {
            std::cout << "literal: " << string << '\n';
        }

        template < std::size_t length >
        string (char (&string)[length])
        {
            std::cout << "mutable: " << string << '\n';
        }

};

void function(const string & string)
{
    std::cout << string << '\n';
}

int main()
{
    function("literal");
    char local[] = "local";
    function(local);
    char * local_ptr = local;
    function(local_ptr);
}

Unfortunately not quite amazing enough. I'll leave it as an exercise for the reader to determine the fatal flaw in this implementation (besides only working on one of the three compilers I tested it on).

If anyone else is wondering why I bothered with this, the answer is that I like interesting challenges.

Σnigma

Quote:
Original post by phil_t
Quote:
Original post by Zahlman
You know, I really do pity people who figure everything has to be made "explicit" in order for code to be "clear".


You pity them eh? Those poor, poor people. Glad you've got it all figured out though.

New to this board, but damn, there is a lot of elitism here!


My style of rhetoric doesn't appeal to everyone, true. But really, these ideas are pretty obvious.

Which would you be more likely to say in real life?

1) "I may need my umbrella, depending on whether it is raining."
2) "If it is true that it is raining, I will need my umbrella. Otherwise, I will not need my umbrella."

Now, which would you be more likely to write in code?

3) "need_umbrella = raining();"
4) "if (raining() == true) { need_umbrella = true; } else { need_umbrella = false; }"

Almost everyone would pick 1) over 2). (Actually, given free choice, most people would probably just say "if it is raining, I will need my umbrella", with the opposite case being implied. Of course, the computer lacks that kind of common sense, but we can still state things in an idiomatic way that still covers both situations.) But a stunningly large percentage of people seem to pick 4) over 3), and seem to be blissfully unaware of the vast logical disconnect. I'm trying to figure out what causes this way of thinking.

You don't need to *enumerate* cases in order to *cover* them all.

I think VC8 tries to optimize out the std::string copying, and I also think the implementation is buggy. It works fine, as far as I know, if you only use it in one module (exe or dll), but if you try to pass a std::string to a dll from an exe, it sometimes crashes (I think the dll then decides it should deallocate the literal in the destructor). I remember having some problems with it, although I'm not at all sure that is what happens. All I know is that the application crashed with an access violation on the literal's address, and that it happened somewhere after the last line of the function, leading me to think it was the destructor. The problem disappeared when I used my own string implementation or C-style strings. And in case anyone wonders, the exe and the dll were built with the same libraries and settings (triple-checked).

Quote:
Original post by Zahlman
But a stunningly large percentage of people seem to pick 4) over 3), and seem to be blissfully unaware of the vast logical disconnect. I'm trying to figure out what causes this way of thinking.


Possibly their training, or the fact that they're not using Prolog.
