
Having trouble getting my head around pointers

This topic is 3876 days old which is more than the 365 day threshold we allow for new replies. Please post a new topic.


I've been reading quite a bit on pointers recently and I still can't get my head around them. Why are they so powerful? Are there any good examples which can be given? I'm aware they point to a place in memory, so does this mean they can be used as a sort of global variable? Like, say I create a variable in one class, can a variable in another class be a pointer, and point to that variable in the other class...?

They allow you to allocate memory during run-time.


int n_names;
std::string names[ ??? ]; // how large should this be?



How many names do we allocate? We don't know at compile time, since we'll read them from a file. We can allocate 100 names, but then we'll either waste memory, or not have enough, if there are 17,331 names in the file.

Dynamic memory allocation allows you to decide that during run-time. Pointers are the supporting syntax.
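
As a rough sketch of the point above (not code from the original post): the helper names `read_names` and `total_length` are made up for illustration, and a `std::istringstream` can stand in for the file. The first function lets `std::vector` do the run-time allocation; the second shows the raw `new[]`/`delete[]` that the pointer syntax supports.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: reads one name per line until the stream ends.
// The vector grows at run time, so no size has to be guessed up front.
std::vector<std::string> read_names(std::istream& in)
{
    std::vector<std::string> names;
    std::string line;
    while (std::getline(in, line)) {
        if (!line.empty()) names.push_back(line);
    }
    return names;
}

// The same idea with a raw pointer: the size is only known at run time,
// so the array is allocated with new[] and must be released with delete[].
std::size_t total_length(const std::vector<std::string>& source)
{
    const std::size_t n_names = source.size();
    std::string* names = new std::string[n_names]; // run-time sized array
    std::size_t total = 0;
    for (std::size_t i = 0; i < n_names; ++i) {
        names[i] = source[i];
        total += names[i].size();
    }
    delete[] names; // forgetting this line would leak the array
    return total;
}
```

Whether the file holds 17 names or 17,331, the same code allocates exactly as much as it needs.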

Another aspect is sharing data. Consider an Image object (1 MB in memory). Now you need to use this image as a texture on 500 different models. Without pointers, you need to make 500 copies of the same image. With pointers, you just pass the pointer around, and every model uses the same Image object.
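
A small sketch of that sharing, under the assumption that `Image` and `Model` look roughly like this (both types and `make_models` are hypothetical stand-ins, not from any real engine):

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for the Image and model types mentioned above.
struct Image {
    std::vector<unsigned char> pixels; // roughly 1 MB of pixel data
};

struct Model {
    const Image* texture; // points at the shared image, does not own it
};

// Builds `count` models that all refer to the same Image object.
std::vector<Model> make_models(const Image& shared, std::size_t count)
{
    std::vector<Model> models;
    models.reserve(count);
    for (std::size_t i = 0; i < count; ++i) {
        Model m;
        m.texture = &shared; // copies an address, not the pixels
        models.push_back(m);
    }
    return models;
}
```

The catch is that the Image has to outlive every Model that points at it.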

The last use, and the most dangerous one, is direct memory access. Variables, and even code, are stored in memory as a sequence of bytes. If the system you're using supports the concept of memory pointers (Java doesn't), then you can manipulate these bytes manually, creating code on the fly (a self-modifying program) or accessing the data your application uses directly (the pixels of an image).

Quote:
Original post by Antheus
Another aspect is sharing data. Consider an Image object (1 Mb in memory). Now you need to use this image as texture on 500 different models. Without pointers, you need to make 500 copies of the same image. With pointers, you just pass the pointer around, and use the same image object.


And this is good because instead of having to pass around that 1 MB object, you pass around a pointer that is 4 bytes, 8 bytes, or whatever size your platform uses. Smaller and faster.

Oh ok, so for a 2D tile game, it's pretty much necessary to use pointers for the image tile being used? What about my last question, about pointers between two classes?

Quote:
Original post by Side Winder
Oh ok, so for a 2D tile game, it's pretty much necessary to use pointers for the image tile being used? What about my last question, about pointers between two classes?


Sure. As long as you're perfectly clear on who allocates and de-allocates the memory, and when.

Let's say you pass a pointer from class A to class B. What happens if class A is de-allocated? Will B be aware that it needs to de-allocate that pointer? What if A is still alive, and is using that variable?

Or, simply put, make sure you understand dynamic allocation mechanisms, then use them very sparingly. In C++ that's the biggest challenge of all: managing object life-cycle.

If you're using C# or Java, then this is a non-issue (most of the time), since the VM will do the de-allocation for you.

For object references, you should also look into smart pointers once you want to use dynamically allocated objects on a larger scale.

It's just a very big and complex topic with thousands of pitfalls.

But generally, you do not *need* to use pointers. In C++ you can use references (the & syntax). They give you most of the functionality of pointers, but with almost no problems.
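
To give a feel for the & syntax, here is a minimal sketch. The `Tile` type and both functions are invented for the example; only the reference mechanics are the point.

```cpp
#include <string>

// Hypothetical example type.
struct Tile {
    std::string name;
    int hits;
};

// Takes the tile by reference: no copy is made, and the caller's
// object is modified directly. A reference cannot be null and cannot
// be reseated to refer to a different object later.
void register_hit(Tile& tile)
{
    ++tile.hits;
}

// A const reference gives read-only access without copying.
bool is_worn_out(const Tile& tile)
{
    return tile.hits >= 3;
}
```

Calling `register_hit(grass)` looks like passing by value at the call site, but it changes the caller's `grass`.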

Hmm, yeah it does seem quite complicated.

OK, let's say, for example, that I'm making a Win32 program and I have two classes. In the first class I have the window handle. I want to get a hold of that window handle from the second class.


class Window
{

HWND hwnd;
};





class Yes
{
...
};



Something like that. What would I have to do to get that window handle? And what would be the syntax...? Would I have to use inheritance and have an accessor function, or could it all be done with references (or pointers)?

A pointer is a variable that contains the memory address of another variable or a block of RAM. Pointers allow you to access that space without having to copy the original variable or memory block. There are many reasons why this is useful:
1) Polymorphic behavior is only possible through reference types (references or pointers)
2) Copying a pointer is often cheaper than copying a structure/object/block of RAM
3) Since they are reference types, they also allow a function to return multiple values by referencing objects the caller created and filling them with data. For example:

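void Add2to3(int *one, int *two, int *three)
{
    // each parameter holds the address of one of the caller's variables
    *one += 2;
    *two += 2;
    *three += 2;
}

int main()
{
    int a = 8, b = 11, c = 5;
    Add2to3(&a, &b, &c); // pass the addresses, not the values
    // now a=10, b=13, and c=7
}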





Pointers are not only powerful but they are dangerous, here are some of the pitfalls:
1) Dynamic memory allocation can lead to a memory leak: If you assign a pointer the result of "new" or "malloc", you have to call "delete" or "free" on that same pointer before it goes out of scope. If you do not, you have just leaked memory. Failing to do this consistently is so notoriously common that we now have things like smart pointers to help us. Unfortunately, smart pointers are even more complex than ordinary pointers, but they are much safer.
2) Dangling pointer: Sometimes you do remember to call "delete" or "free" on the pointer data but forget to stop using other pointers that were referencing that same data. Other times this happens when you have a pointer to data on the stack, and that data goes out of scope before all of the pointers to it do. This results in a dangling pointer referring to data that can no longer be safely accessed. Trying to access that data can cause all sorts of weird errors, although often it will just crash your program.
3) Null pointer: If you are really smart, after freeing or deleting your pointer you assign it a null value (0=null), indicating that it no longer points to anything. However, you must still be careful, because trying to access data through a null pointer will cause a crash (although it's usually safer than a dangling pointer crash).
4) Buffer overrun: This is just like a dangling pointer except worse. Occasionally, doing pointer arithmetic ( *(pointer+x) OR array[x] ) can result in a reference to data that you did not allocate. Trying to access this data could crash your program with an access violation right there, or if you are really unlucky it could corrupt data elsewhere in RAM and possibly cause a crash in some random place that has nothing to do with the buffer overrun.

Smart pointers can fix problems 1 & 2 and often times 3, and smart programmers using safe methods can correct number 4.

Smart pointers are safer but they are also complex. If you want to try them I would suggest the boost version.
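
For a taste of what that buys you, here is a hedged sketch. It uses `std::unique_ptr` from the C++11 standard library rather than the boost classes suggested above (that substitution is mine), and `Buffer`, `use_buffer`, and `index_is_safe` are hypothetical names made up for the example.

```cpp
#include <cstddef>
#include <memory>
#include <stdexcept>
#include <vector>

// Hypothetical type that counts live instances so the cleanup is visible.
struct Buffer {
    static int live;
    std::vector<int> data;
    explicit Buffer(std::size_t n) : data(n, 0) { ++live; }
    ~Buffer() { --live; }
};
int Buffer::live = 0;

// Problem 1 (leaks): the unique_ptr deletes the Buffer when it goes out
// of scope, on every path out of the function, without a manual delete.
int use_buffer(std::size_t n)
{
    std::unique_ptr<Buffer> buf(new Buffer(n));
    return Buffer::live; // 1 while buf is alive
}

// Problem 4 (overruns): vector::at checks the index and throws
// std::out_of_range instead of touching memory that was never allocated.
bool index_is_safe(const std::vector<int>& v, std::size_t i)
{
    try {
        (void)v.at(i);
        return true;
    } catch (const std::out_of_range&) {
        return false;
    }
}
```

After `use_buffer` returns, `Buffer::live` is back to 0 even though nobody wrote `delete`.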

Whatever you decide, use pointers but do so carefully. Also avoid "void *" at all costs.

Quote:
Original post by Side Winder
Hmm, yeah it does seem quite complicated.

OK, let's say, for example, that I'm making a Win32 program and I have two classes. In the first class I have the window handle. I want to get a hold of that window handle from the second class.

*** Source Snippet Removed ***

*** Source Snippet Removed ***

Something like that. What would I have to do to get that window handle? And what would be the syntax...? Would I have to use inheritance and have an accessor function, or could it all be done with references (or pointers)?


You don't want to do that.

HWND is a handle, and as such has lifecycle beyond your control.

The proper solution in your case would be to pass a reference (or pointer) to the Window instance to the other classes. The alternative would be to just pass the HWND itself into the classes that need it.


class Window {
public:
    Window( HWND hwnd )
        : m_hwnd( hwnd )
    {}
    HWND GetHWND() { return m_hwnd; }
private:
    HWND m_hwnd;
};

typedef Window * WindowPtr;

class Yes
{
public:
    Yes( WindowPtr wPtr )
        : m_window( wPtr )
    {}
    void foo()
    {
        m_window->GetHWND();
    }
private:
    WindowPtr m_window;
};



Of course, you now need to ensure that Window will get properly allocated and de-allocated, and that Yes will not use the pointer once it's no longer valid.


...
WindowPtr wPtr = new Window( hwnd );

Yes y( wPtr );
y.foo();

delete wPtr;
y.foo(); // <-- Here there be dragons



But bad things will happen if you delete wPtr while Yes can still use it. On the second call to foo(), the behaviour is undefined. It might work as before, it might crash, it might do weird things.
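
One possible way to tame those dragons, sketched under some assumptions: this uses `std::shared_ptr` and `std::weak_ptr` from C++11 (not part of the code above), swaps HWND for a plain int so it compiles without the Win32 headers, and picks -1 as an arbitrary "window is gone" value.

```cpp
#include <memory>

// Stand-in for the Window class above; an int replaces HWND so the
// sketch compiles without the Win32 headers.
class Window {
public:
    explicit Window(int handle) : m_handle(handle) {}
    int GetHandle() const { return m_handle; }
private:
    int m_handle;
};

class Yes {
public:
    explicit Yes(const std::shared_ptr<Window>& window) : m_window(window) {}

    // Returns the handle, or -1 if the Window has already been destroyed.
    int foo() const
    {
        // lock() yields an empty shared_ptr once the Window is gone,
        // so the dead object is never dereferenced.
        if (std::shared_ptr<Window> w = m_window.lock()) {
            return w->GetHandle();
        }
        return -1;
    }
private:
    std::weak_ptr<Window> m_window; // observes the Window, does not own it
};
```

Here the second call to foo() after the Window is released has a defined result instead of undefined behaviour.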

Quote:
Original post by Side Winder
Ah ok... Hmm.. Maybe I should switch to C# so I wouldn't be having these problems :p


++rating, for enlightenment and for being one of the few beginners who didn't have to be beaten to death to gain such enlightenment.

There seems to have already been quite a bit of discussion on this topic, but I'd still like to horn in. Specifically to answer the "good examples" question. It doesn't seem that anyone has given any really good answers (at least it doesn't to me) as to why pointers are the single greatest thing to happen to programming since Turing was alive.

Look at this struct:

struct node {
int someData;
node * next;
};




What this is is one node of a linked list. It is an object that will reside in memory, and it is composed of a bit of data, and what appears to be a pointer to itself. What good is that? Why are these three lines of code what could be the greatest bits of code ever?

Because the bit of data called next doesn't point to itself. It points to the next node in the link. You get to the next node by the line (for this example we've named a node pointer temp) temp->next. Or the next node's data by temp->next->someData.

This is a dynamically sized sequence, better known as a linked list. But this is one dimensional. Think of it like this (where D is data, P is pointer, NULL signifies end of list):
[D|P]->[D|P]->[D|P]->NULL
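
To make the picture concrete, here is a minimal sketch built on the same struct. The functions `push_front`, `sum`, and `destroy` are names I made up for illustration; treat it as one way to do it, not the way.

```cpp
#include <cstddef>

struct node {
    int someData;
    node* next;
};

// Allocates a new node and puts it at the front of the list.
// Returns the new head.
node* push_front(node* head, int value)
{
    node* n = new node;
    n->someData = value;
    n->next = head;
    return n;
}

// Follows the next pointers until NULL and adds up the data.
int sum(const node* head)
{
    int total = 0;
    for (const node* temp = head; temp != NULL; temp = temp->next) {
        total += temp->someData;
    }
    return total;
}

// Every node came from new, so every node must be deleted exactly once.
void destroy(node* head)
{
    while (head != NULL) {
        node* next = head->next; // save the link before deleting
        delete head;
        head = next;
    }
}
```

Pushing 3, then 2, then 1 gives the list [1|P]->[2|P]->[3|P]->NULL.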


You can also make a node two dimensional like this:

struct node {
int someData;
node * right;
node * left;
};




And suddenly, you've got a tree. Like this:
    O
   / \
  O   O

One node has two children, and their children can have two children, etc. ad nauseam. In this specific example it's a binary tree. With a few rules applied to your insert function it becomes a binary search tree or a red-black tree, and similar node-and-pointer building blocks show up in spanning trees, chained hash tables, etc. etc. etc.
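
Here is a small sketch of what "a few rules applied to your insert function" can look like for a binary search tree. The struct is renamed `tree_node` to keep it apart from the list node, and `bst_insert`, `contains`, and `destroy` are hypothetical names; no balancing is attempted.

```cpp
#include <cstddef>

struct tree_node {
    int someData;
    tree_node* left;
    tree_node* right;
};

// The rule: smaller values go left, everything else goes right.
// Returns the (possibly new) root of the subtree.
tree_node* bst_insert(tree_node* root, int value)
{
    if (root == NULL) {
        tree_node* n = new tree_node;
        n->someData = value;
        n->left = NULL;
        n->right = NULL;
        return n;
    }
    if (value < root->someData) {
        root->left = bst_insert(root->left, value);
    } else {
        root->right = bst_insert(root->right, value);
    }
    return root;
}

// The same rule lets a search discard half of the remaining tree per step.
bool contains(const tree_node* root, int value)
{
    while (root != NULL) {
        if (value == root->someData) return true;
        root = (value < root->someData) ? root->left : root->right;
    }
    return false;
}

// Frees both subtrees before the node itself.
void destroy(tree_node* root)
{
    if (root == NULL) return;
    destroy(root->left);
    destroy(root->right);
    delete root;
}
```

Inserting 5, 3, 8, 1, 4 in that order puts 5 at the root with 3 on its left and 8 on its right.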

But you're not just restricted by dimension and one bit of data. You can stick anything in a node, and as many pointers as you want. One of the projects I had in school was to create a spell-checker using a shortest path algorithm to determine exactly what word you were TRYING to spell. It used a data structure called a graph (yeah, I know there are MUCH more efficient algorithms for suggesting spellings, but the point was to make us use the shortest path algorithm).

The point is that pointers enable you to make the data structures that currently run our world. Without these structures there would be NO WAY to do some of the things we do in software. They allow you to find solutions to problems that you CANNOT solve otherwise.

That is the beauty of pointers.

The downside is that pointers, when strung up together wrong, will allow you, as the famous saying goes, "to shoot yourself in the foot."

Hope this diatribe helps. Hope you look up some of the names of the data structures I mentioned here. And I hope you come back to C++, no matter what Alpha_ProgDes says [smile].

- Goishin

But just for fun, to see just how confusing pointers can get and exactly WHY you can shoot yourself in the foot with them, I had a job interview yesterday which included the following question:


char * c[] = {"this", "is", "a", "test"};
char **pc[] = {c + 4, c + 3, c + 2, c + 1};
char ***ppc= pc;
printf("%s*",**++ppc + 2);



What gets printed out? And before you say it won't compile, it will. I actually copied and pasted this from a test I did later on at home. It compiles, and when you run it, it actually prints something out. The question is WHAT does it print out?

I got this one right. But I didn't get the job...
<sobs>

- Goishin

Looks complicated! Obviously I have absolutely no idea what it prints out. And cheers for the example, Goishin. I think I may stick with C++ purely because I've already started learning it, and beginning to learn a new language (no matter how similar they may be) just seems wrong at this point.

Glad to hear it!

C++ is a language you can spend ten years with and never master. And to me, that is the beauty of it...


- Goishin

Yeah, C# you can learn later, but it is always good to stick to something and not just bail out because you don't understand it. Perhaps later you will learn to appreciate C# even more because of your prior problems with C++, but you'll feel a whole lot better in that you understand the underlying concepts of memory management. Same goes for lots of things.

Quote:
Original post by Goishin
But just for fun, to see just how confusing pointers can get and exactly WHY you can shoot yourself in the foot with them, I had a job interview yesterday which included the following question:

*** Source Snippet Removed ***

What gets printed out? And before you say it won't compile, it will. I actually copied and pasted this from a test I did later on at home. It compiles, and when you run it, it actually prints something out. The question is WHAT does it print out?

I got this one right. But I didn't get the job...
<sobs>

- Goishin

Pity about the job.

Yes, I managed a correct guess too.

But since the OP is using C++, code like that shouldn't occur (yay for std::containers and std::string). Any sane person would use array access instead of pointer + offset anyway, even in C. The purpose of that snippet was to see how familiar you are with various aspects of the C language.

Anyone who tried to write real code like that would have someone else shooting at them, and they may not be aiming for their feet. [grin]

Quote:
Original post by Jemburula
Yeah, C# you can learn later, but it is always good to stick to something and not just bail out because you don't understand it. Perhaps later you will learn to appreciate C# even more because of your prior problems with C++, but you'll feel a whole lot better in that you understand the underlying concepts of memory management. Same goes for lots of things.


I don't know about that. I love C++, but if I had started with C# I'd probably have more finished projects to my name than I do now. Of course now I'm too masochistically addicted to C++ to just drop it :) Plus I keep finding excuses as to why I need its greater flexibility.

Quote:

The point is that pointers enable you to make the data structures that currently run our world. Without these structures there would be NO WAY to do some of the things we do in software. They allow you to find solutions to problems that you CANNOT solve otherwise.


Correction: It's references (not C++ references specifically) that enable all these structures. Every language that supports references, like Python (actually in Python, *everything* is a reference), Java or C#, can of course implement those structures, and more safely than with C++'s dangerous raw pointers. Actually, in good C++ code you wouldn't see any use of pointers; strings would be handled with std::string, dynamic memory allocation by the Standard C++ Library (SC++L) containers, pass-by-reference with C++ references, and resettable references with boost's smart pointers. Learn pointers well if you decide to stay with C++, and then avoid them like the plague when you can; use the above structures instead.

As for an example, imagine you're making an FPS deathmatch game. The player class would be:


class Player
{
private:
    Player* m_target;
    int health;
public:
    Player():m_target(NULL),health(100){}
    void SetTarget(Player* target){m_target=target;}
    void GetHit(){health=health-10;}
    void ShootTarget(){if (m_target) m_target->GetHit();}
};

...
Player mikeman,zombie,ogre;

mikeman.SetTarget(zombie);//point gun to zombie
mikeman.ShootTarget();//hit zombie

mikeman.SetTarget(ogre);//now point gun to ogre
mikeman.ShootTarget();//hit ogre
...





Got it? The pointer is used to hold the current "target" of a Player. It's a reference to another object, that allows you to operate on it. And you can change that reference as many times as you want during the lifetime of the program, that is you can switch to new targets. Think of m_target as an alias, you can use it to refer to any Player object, it can be "zombie","ogre","mikeman" or any Player you create. If m_target is set to "zombie" then m_target->GetHit() results in zombie getting hit. If m_target is set to "ogre", the same code results in ogre getting hit. You couldn't do this without pointers(or references in other languages).

[Edited by - mikeman on June 13, 2007 11:50:37 PM]


void SetTarget(Player& target){m_target=&target;}


Fixed. :) (Although that design doesn't - as is - allow you to return to the "not targeting anyone" state once you've targeted someone.)

In general, use references where you can and pointers where you have to. In particular, (normally) use pointers in implementations (the data members of a class) and references in interfaces (arguments to, and return values from, member functions of a class).
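
Putting that rule of thumb together with the Player example, a sketch might look like the following. `ClearTarget` and `Health` are hypothetical additions of mine (the first addresses the "not targeting anyone" state mentioned above, the second just makes the effect observable).

```cpp
#include <cstddef>

class Player {
public:
    Player() : m_target(NULL), m_health(100) {}

    // Interface takes a reference: the caller cannot pass "no player".
    void SetTarget(Player& target) { m_target = &target; }

    // Hypothetical addition: returns to the "not targeting anyone" state.
    void ClearTarget() { m_target = NULL; }

    void GetHit() { m_health -= 10; }

    // Does nothing when no target is set.
    void ShootTarget() { if (m_target != NULL) m_target->GetHit(); }

    int Health() const { return m_health; }

private:
    Player* m_target; // implementation uses a pointer: reseatable, may be NULL
    int m_health;
};
```

With this version, `mikeman.SetTarget(zombie);` compiles as written, and the pointer stays an implementation detail.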

For those who are curious but aren't at a computer where they can compile it themselves, or who compiled it but couldn't figure out why it printed what it did, here's an analysis of the code snippet Goishin posted:


char * c[] = {"this", "is", "a", "test"};
char **pc[] = {c + 4, c + 3, c + 2, c + 1};
char ***ppc= pc;
printf("%s*",**++ppc + 2);


First line is straightforward. Creates an array of C-strings and initializes it with c[0] = "this", c[1] = "is", c[2] = "a", and c[3] = "test".

Second line creates an array of pointers to strings and initializes it with pc[0] = &c[4], pc[1] = &c[3], pc[2] = &c[2], pc[3] = &c[1]. Note pc[0] is actually an invalid pointer at this point (references unallocated memory beyond the end of c[]), but since that pointer doesn't actually ever get dereferenced in the rest of the program, no "bad mojo" ends up happening.

Third line looks more confusing than it is. It just creates a pointer to a pointer to a pointer to char and initializes it with the address of pc[0] (the array pc decays to a pointer to its first element). So basically *ppc == pc[0] now.

Then the print statement... It'll print its second argument as a C-string followed by *, but what does its second argument do? The unary operators bind tighter than the binary +, and they apply right to left, so ++ goes first: it moves ppc from &pc[0] to &pc[1]. The first * then yields pc[1], which from above is &c[3], and the second * yields c[3], which is "test". Finally + 2 moves two characters into the string (so we get the same result as using c[3] + 2 would give). %s prints it as a C-string, so it starts at s (2 characters beyond the first char location) and keeps going until it hits the terminating \0, then prints the * and calls it a day.

End result:
st*
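
For anyone who wants to poke at it, here is the same expression unrolled one step at a time. This is my own rearrangement, not the interview code: `puzzle` and the `step` variables are made-up names, and const is added so the string literals are legal in C++ as well as C.

```cpp
#include <string>

// Same pointer layout as the snippet above, with const added so the
// string literals are legal in C++ as well as C.
std::string puzzle()
{
    const char* c[] = {"this", "is", "a", "test"};
    const char** pc[] = {c + 4, c + 3, c + 2, c + 1};
    const char*** ppc = pc;          // ppc == &pc[0]

    ++ppc;                           // ppc == &pc[1]
    const char** step1 = *ppc;       // pc[1], i.e. c + 3
    const char* step2 = *step1;      // c[3], i.e. "test"
    const char* step3 = step2 + 2;   // skips 't' and 'e'
    return std::string(step3);       // what %s would print
}
```

Each local corresponds to one operator in **++ppc + 2, read from the inside out.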

I posted this as much to teach myself as anyone else, so if anything is wrong, PLEASE correct me. Stepping through with my debugger seemed to confirm my concept of what the code was doing though...

Quote:
Original post by Xentropy
Second line creates an array of pointers to strings and initializes it with pc[0] = &c[4], pc[1] = &c[3], pc[2] = &c[2], pc[3] = &c[1]. Note pc[0] is actually an invalid pointer at this point (references unallocated memory beyond the end of c[]), but since that pointer doesn't actually ever get dereferenced in the rest of the program, no "bad mojo" ends up happening.

Actually, pc[0] points one past the end of c, and is therefore a perfectly valid pointer. It does not point to a valid object, but the pointer is valid. Point it two (or more) past the end, and the pointer is no longer valid and you're taking a step into the land of undefined behaviour (in theory... in practice, just setting the pointer is likely not going to hurt anything, as you say).

Used the standard C++ library? Noticed how all iterator ranges are specified as from start to one past the end of the desired range (for example, vector::end() returns an iterator one past the end of the vector, and is not dereferenceable but is a valid iterator)? Same with pointers; a one-past-the-end pointer is a valid pointer.
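
A short sketch of that idiom with plain pointers (`sum_range` and `sum_all` are hypothetical names for the example):

```cpp
// Sums the half-open range [first, last). `last` may point one past the
// final element; it is compared against but never dereferenced.
int sum_range(const int* first, const int* last)
{
    int total = 0;
    for (const int* p = first; p != last; ++p) {
        total += *p;
    }
    return total;
}

// Hypothetical helper showing how the one-past-the-end pointer is formed.
int sum_all()
{
    int values[4] = {1, 2, 3, 4};
    const int* begin = values;
    const int* end = values + 4; // one past the end: valid to form and compare
    return sum_range(begin, end);
}
```

The loop only ever uses `end` in a comparison, which is exactly what the one-past-the-end guarantee is for.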

Quote:
Original post by Brother Bob
Actually, pc[0] points one past the end of c, and is therefore a perfectly valid pointer. It does not point to a valid object, but the pointer is valid. Point it two (or more) past the end, and the pointer is no longer valid and you're taking a step into the land of undefined behaviour (in theory... in practice, just setting the pointer is likely not going to hurt anything, as you say).


What would you define as the semantics of a valid pointer? If my understanding is correct, you could walk 50 steps past the end of an array and still have a pointer to *an* address, just one 50 * sizeof(object) bytes beyond the last valid object. One step beyond or fifty steps beyond, it's all unallocated memory and thus, to me, an invalid pointer. (You can change the program to initialize **pc[] as {c + 5, c + 4, c + 3, c + 2} and change the last line to printf("%s*", **(ppc + 2) + 2) and it'll result in the same output, and it won't crash just because you assigned c[5] and never used it; what makes c[5] invalid but c[4] valid when they both point to a memory location but neither can be dereferenced?)

I'm aware of how the STL works, and have used it many times, as well as read great books on the subject, and to my understanding, the reason ranges are half-open is so an empty set can be easily defined as one in which .begin() == .end(), instead of messy semantics where the beginning is *after* the end or similar. That doesn't mean that an end iterator is a valid pointer either; it can be used to determine you're done stepping through a container, but if you can't dereference it and end up with defined behavior as a result I hesitate to call it "valid".

Quote:

One step beyond or fifty steps beyond, it's all unallocated memory and thus, to me, an invalid pointer.


It may be that way to you, but the C++ Standard disagrees with you. Guess who wins the argument :)

Remember: Just because your program happens not to crash on your system doesn't mean it's well-defined. That's not how undefined behaviour is detected. In the example, (c+4) is a well-defined pointer. (c+5) is not; merely computing it leads to undefined behaviour. Undefined means you can't be sure what it points to. Pointer arithmetic is not integer arithmetic. Pointers are not just integers that contain a memory address. Pointer arithmetic works as long as the pointers stay valid, that is in the range [c,c+4]. Beyond that, you're in undefined territory. What happens then is not specified by the standard at all. The program may crash, corrupt the memory, blow up the computer. Unfortunately, the worst thing usually happens: It doesn't crash, leaving you to believe the program is well-defined according to the C++ standard. It's not.


[Edited by - mikeman on June 14, 2007 11:39:46 AM]

Quote:
Original post by Xentropy
Quote:
Original post by Brother Bob
Actually, pc[0] points one past the end of c, and is therefore a perfectly valid pointer. It does not point to a valid object, but the pointer is valid. Point it two (or more) past the end, and the pointer is no longer valid and you're taking a step into the land of undefined behaviour (in theory... in practice, just setting the pointer is likely not going to hurt anything, as you say).


What would you define as the semantics of a valid pointer? If my understanding is correct, you could walk 50 steps past the end of an array and still have a pointer to *an* address, just one 50 * sizeof(object) bytes beyond the last valid object. One step beyond or fifty steps beyond, it's all unallocated memory and thus, to me, an invalid pointer. (You can change the program to initialize **pc[] as {c + 5, c + 4, c + 3, c + 2} and change the last line to printf("%s*", **(ppc + 2) + 2) and it'll result in the same output, and it won't crash just because you assigned c[5] and never used it; what makes c[5] invalid but c[4] valid when they both point to a memory location but neither can be dereferenced?)

I'm aware of how the STL works, and have used it many times, as well as read great books on the subject, and to my understanding, the reason ranges are half-open is so an empty set can be easily defined as one in which .begin() == .end(), instead of messy semantics where the beginning is *after* the end or similar. That doesn't mean that an end iterator is a valid pointer either; it can be used to determine you're done stepping through a container, but if you can't dereference it and end up with defined behavior as a result I hesitate to call it "valid".


This & this.
