Sign in to follow this  
ajm113

Why Do I Lose Data In a Char Pointer When Passed Into or Over a Function?

Recommended Posts

Hello, I'm writing a library using low level C++ methods so my library can easily be built in C. So I'm staying away from C++ standard templates. So vectors and strings are a bit limit. (Plus I think it's fun to have a bit of a challenge. ;) ). I've mostly been using strings in C++ for ease of use, and I decided to stick with char pointers, but I appear to run into odd problems, of which I cant explain.. I'll give a example.

//Issue One

char buffer[MAX_LENGTH];
sprintf_s(buffer, MAX_LENGTH, "Hello World!")

const char* bufferP = doLogic(buffer);

//bufferP = "Goodbye World!" after doLogic


//Some random function in my class that has nothing to do with bufferP at all.
doMoreMath();

//Wut? Now bufferP equals random garbage data?
//bufferP = "/////////tx///"

I know this problem can be fixed doing this:

const char* bufferP = doLogin();

char buffer_a[MAX_NAME_LENGTH];
memset(buffer_a, 0, MAX_NAME_LENGTH);
strncpy(buffer_a, bufferP, strlen(bufferP));

But why does this happen? This issue never seems to happen on any other pointers I've created, like objects. Is it necessary I do everything in char arrays? So I don't lose data randomly?

 

Thank you, Andrew.

Edited by ajm113

Share this post


Link to post
Share on other sites
it's hard to tell since you didn't give an actual working example, but my guess is your returning stack memory from doLogic, and when you call doMoreMath that stack memory is being overwritten, because you never should return stack allocated memory.

Share this post


Link to post
Share on other sites

What do your doLogic and doMoreMath do? (too many 'do's!)

 

In particular, what exactly are you returning from doLogic? If you're creating a buffer with local scope in doLogic and returning a pointer to it, you're only ever going to get nonsense back.

 

ph34r.png

Edited by mark ds

Share this post


Link to post
Share on other sites

Ah, that would make sense. So I need to make the memory heap then so I can use the variable globally. I didn't think I would need a working example since I thought putting comments of the output would be enough. I really need to get back to the basics again, but at lower level.

 

@Mark

 

DoLogic is anything that returns char*. I.E

 

Example of a substr function I made:

char* awesome_ini::aw_subStr(const char* line, size_t start, size_t length)
{
	char subbuff[AWI_MAX_KEY_NAME_LENGTH];

	memset(subbuff, 0, AWI_MAX_KEY_NAME_LENGTH);

	size_t p = 0;
	for (size_t i = start; i < length+1; i++)
	{
		subbuff[p] = line[i];
		p++;
	}

return (char*)subbuff;
}

That also causes the pointer in the example code to go away after a few lines of code. And do math is simply a function that does anything really...

 

I.E

void doMath(void)
{
//Do some long calculation
int x = 5 - 2;
printf("%i\n", x);
}

Share this post


Link to post
Share on other sites

A better way would be

char srcBuffer[MAX_LENGTH]
sprint_s(srcBuffer, MAX_LENGTH, "Hello World!");

char dstBuffer[MAX_LENGTH]

bool errorVal = doLogic(srcBuffer, dstBuffer); // return false if dstBuffer is too short

Share this post


Link to post
Share on other sites

 

A better way would be

char srcBuffer[MAX_LENGTH]
sprint_s(srcBuffer, MAX_LENGTH, "Hello World!");

char dstBuffer[MAX_LENGTH]

bool errorVal = doLogic(srcBuffer, dstBuffer); // return false if dstBuffer is too short

 

That is certainly the "C" way of doing things and it sounds like this is what the OP's goal is.

Share this post


Link to post
Share on other sites

Yes, a very common C idiom is to avoid allocation but instead to pass a buffer to a pre-allocated (possibly uninitialised) buffer / structure of sufficient size. In some cases, a library will have a function for determining the size of a buffer that needs to be passed when this would not be known.

Share this post


Link to post
Share on other sites

A better way would be

char srcBuffer[MAX_LENGTH]
sprint_s(srcBuffer, MAX_LENGTH, "Hello World!");

char dstBuffer[MAX_LENGTH]

bool errorVal = doLogic(srcBuffer, dstBuffer); // return false if dstBuffer is too short


`doLogic' should also take an argument that is the length of `dstBuffer', so it can make sure to not write past the end of the buffer. Notice how `sprint_s' already does that.

Share this post


Link to post
Share on other sites

 

A better way would be

char srcBuffer[MAX_LENGTH]
sprint_s(srcBuffer, MAX_LENGTH, "Hello World!");

char dstBuffer[MAX_LENGTH]

bool errorVal = doLogic(srcBuffer, dstBuffer); // return false if dstBuffer is too short

`doLogic' should also take an argument that is the length of `dstBuffer', so it can make sure to not write past the end of the buffer. Notice how `sprint_s' already does that.

 

To be even safer, it should take the length of srcBuffer as well

Share this post


Link to post
Share on other sites

 

`doLogic' should also take an argument that is the length of `dstBuffer', so it can make sure to not write past the end of the buffer. Notice how `sprint_s' already does that.

 

 

I was looking at the problem in isolation (thinking the compiler would know the value of sizeof). But the OP mentioned a library - so you are 100% correct.

 

Something like this would be ideal:

bool errorVal = doLogic(srcBuffer, sizeof(srcBuffer), dstBuffer, sizeof(dstBuffer));
Edited by mark ds

Share this post


Link to post
Share on other sites

char* awesome_ini::aw_subStr(const char* line, size_t start, size_t length)
{
char subbuff[AWI_MAX_KEY_NAME_LENGTH];

memset(subbuff, 0, AWI_MAX_KEY_NAME_LENGTH);

size_t p = 0;
for (size_t i = start; i < length+1; i++)
{
subbuff[p] = line[i];
p++;
}

return (char*)subbuff;
}

char subbuff is stack allocated memory, you NEVER return pointers to stack memory, this is because exactly what you are seeing is happening, instead you pass in a buffer that the function should write to(and you should pass in the size of that buffer).


The reason why you don't see this with objects is because when you return an object, you are copying the value from the function, so if i do:
 
struct Foo{
int x;
};

Foo Func(void){
Foo p;
p.x = 10;
return p;
}

int main(int argc, char **argv){
Foo f = Func();
printf("%d\n", f.x);
return 0;
}
what's happening is that when p is returned from func, the values of p are copied into the values of f, so your copying the values off the stack into another part of the stack that is still valid in the program context(note: this is a very simiplied explanation).

however if we do:
struct Foo{
int x;
};

Foo *Func(void){
Foo p[2];
p[0].x = 10;
p[1].x = 12;
return p;
}

int main(int argc, char **argv){
Foo *f = Func();
printf("%d\n", f[0].x);
return 0;
}
this is undefined behavior, and is very likely to crash the program, this is because you are returning a pointer into stack allocated memory that became invalid the moment you returned from Func. in many cases using the data immediately after would technically still work, but it is undefined and very bad to rely on such behavior.

one of the correct solutions is to do like so:
struct Foo{
int x;
};

bool Func(Foo *p, int psize){
if(psize<2) return false; //failed
p[0].x = 10;
p[1].x = 12;
return true; //succeded!
}

int main(int argc, char **argv){
Foo f[2];
Func(f, 2); // we can pass stack allocated memory into functions, we just should never return such memory!
printf("%d\n", f[0].x);
return 0;
}
edit:

I didn't think I would need a working example since I thought putting comments of the output would be enough.

always try to provide the minimalist working example to demonstrate your problem, you'll get more concise answers that way, then us trying to understand what the hell your doing. Edited by slicer4ever

Share this post


Link to post
Share on other sites

 

 

`doLogic' should also take an argument that is the length of `dstBuffer', so it can make sure to not write past the end of the buffer. Notice how `sprint_s' already does that.

 

 

I was looking at the problem in isolation (thinking the compiler would know the value of sizeof). But the OP mentioned a library - so you are 100% correct.

 

Something like this would be ideal:

bool errorVal = doLogic(srcBuffer, sizeof(srcBuffer), dstBuffer, sizeof(dstBuffer));

No. Something like that is not ideal.

 

Use of sizeof in that manner will, in the end, introduce subtle bugs, as you will depend on a behavior that does not exist  outside of that function. Specifically, arrays decay to pointers. Just pass in the damn size of the string, and track with proper variables.

Share this post


Link to post
Share on other sites

In the context of my original post, it would be fine:

char srcBuffer[MAX_LENGTH];
sprintf_s(srcBuffer, MAX_LENGTH, "Hello World!");

char dstBuffer[MAX_LENGTH];

bool errorVal = doLogic(srcBuffer, sizeof(srcBuffer), dstBuffer, sizeof(dstBuffer)); // return false if dstBuffer is too short etc

 But point taken.

Edited by mark ds

Share this post


Link to post
Share on other sites
Aside from just explaining that returning stack-allocated member is bad mkay, these slides will help you have a better understanding of why it's not supported and why you can see the behavior you get without just getting an outright crash or compiler error:

http://www.slideshare.net/olvemaudal/deep-c

Share this post


Link to post
Share on other sites


In the context of my original post, it would be fine

 

I think you're missing the point of Washu's post. Using sizeof() when you have MAX_LENGTH staring you in the face is just lazy. sizeof() is resolved at compile-time, not run-time. IMHO, using sizeof with a variable as the argument should be avoided whenvever possible, as it inevitably leads to:

char *dstBuffer = new char[MAX_LENGTH];
some_function( ..., sizeof( dstBuffer ) ...); // where MAX_LENGTH is intended, but sizeof(char*) is what it compiles to

Share this post


Link to post
Share on other sites

We're all drifting off topic here, but what about:

char srcBuffer[] = {"hello"};
char dstBuffer[] = {"World"};

bool errorVal = doLogic(srcBuffer, ?, dstBuffer, ?);

In which case you need to use either sizeof or strlen.

 

Please note I fully agree about the case of

char *dstBuffer = new char[MAX_LENGTH];

But the OP specifically asked about "C", meaning [] or malloc.

Edited by mark ds

Share this post


Link to post
Share on other sites

We're all drifting off topic here, but what about:

char srcBuffer[] = {"hello"};
char dstBuffer[] = {"World"};

bool errorVal = doLogic(srcBuffer, ?, dstBuffer, ?);
In which case you need to use either sizeof or strlen.
 
Please note I fully agree about the case of
char *dstBuffer = new char[MAX_LENGTH];

But the OP specifically asked about "C", meaning [] or malloc.


char srcBuffer[] = {"hello"};
size_t srcLen = sizeof(srcBuffer);
char dstBuffer[] = {"World"};
size_t dstLen = sizeof(dstBuffer);

bool errorVal = doLogic(srcBuffer, srcLen, dstBuffer, dstLen);
Oh hey, look, now when I refactor out that doLogic call to another function it doesn't magically start seeing 4 or 8 or whatever the pointer size is randomly. All because I didn't use sizeof in an easly breakable place! Edited by Washu

Share this post


Link to post
Share on other sites

"Oh hey, look, now when I refactor out that doLogic call to another function it doesn't magically start seeing 4 or 8 or whatever the pointer size is randomly. All because I didn't use sizeof in an easly breakable place!"

          

 

Wow, a lot of assumptions! I suggest you read the OP. The takeaways are really simple: "library" + "C". Not c++.

Share this post


Link to post
Share on other sites

"Oh hey, look, now when I refactor out that doLogic call to another function it doesn't magically start seeing 4 or 8 or whatever the pointer size is randomly. All because I didn't use sizeof in an easly breakable place!"
          
 
Wow, a lot of assumptions! I suggest you read the OP. The takeaways are really simple: "library" + "C". Not c++.

First off, the language marker is C++, he's already mentioned that he's doing this IN C++ and desires C compatibility. That means the public facing interface, not the internal implementation details, need to be something C can handle.

Using sizeof in a manner that is almost universally recognized as being seriosly error prone is A Code Smell(tm) and should be avoided. One common example of this is exactly what was demonstrated above.

I assume the "library" + "C" thing was in reference to refactoring??? In which case: Refactoring is something you should do in any language, it's a means of cleaning up, enhancing, and generally making code better. Which can include moving where a function call happens.

As an additional note I've also removed your other post, keep it constructive and technical. I should also note that this topic is in General Programming. Not For Beginners, as such we generally allow the discussion to range a bit further from the original topic. Note that if this was for beginners, I would be extra hard on the sizeof issue. Edited by Washu

Share this post


Link to post
Share on other sites

I would string suggest turning the warning levels on your compiler up all the way.  Most compilers written in the last few decades would have tagged this error with an appropriate warning.  Such warnings are often a good way to help you learn the bad habits to avoid.

 

In fact, if possible, never turn the warnings off.

Share this post


Link to post
Share on other sites

 

return (char*)subbuff;

Crank up the warning level of your compiler until it complains about that line. Returning a pointer to something in the stack is almost universally an error, and your compiler should be able to tell you about it.

If you are going to do without std::string, I suggest you reimplement it, as something like this:
struct String {
  char *data;
  unsigned size;
  unsigned reserved;
};
Then write functions to perform all the operations you need on them.

 

 

Ah, thats a very good post. If it means a better programmer at the end of the day, then all warnings are errors.

 

 

A better way would be

char srcBuffer[MAX_LENGTH]
sprint_s(srcBuffer, MAX_LENGTH, "Hello World!");

char dstBuffer[MAX_LENGTH]

bool errorVal = doLogic(srcBuffer, dstBuffer); // return false if dstBuffer is too short

`doLogic' should also take an argument that is the length of `dstBuffer', so it can make sure to not write past the end of the buffer. Notice how `sprint_s' already does that.

 

 

Yes, that would be better to use as well for my error handling system if something went wrong so the user's application doesnt crash because of a small memory error.

 

 

char* awesome_ini::aw_subStr(const char* line, size_t start, size_t length)
{
char subbuff[AWI_MAX_KEY_NAME_LENGTH];

memset(subbuff, 0, AWI_MAX_KEY_NAME_LENGTH);

size_t p = 0;
for (size_t i = start; i < length+1; i++)
{
subbuff[p] = line[i];
p++;
}

return (char*)subbuff;
}
char subbuff is stack allocated memory, you NEVER return pointers to stack memory, this is because exactly what you are seeing is happening, instead you pass in a buffer that the function should write to(and you should pass in the size of that buffer).


The reason why you don't see this with objects is because when you return an object, you are copying the value from the function, so if i do:
 
struct Foo{
int x;
};

Foo Func(void){
Foo p;
p.x = 10;
return p;
}

int main(int argc, char **argv){
Foo f = Func();
printf("%d\n", f.x);
return 0;
}
what's happening is that when p is returned from func, the values of p are copied into the values of f, so your copying the values off the stack into another part of the stack that is still valid in the program context(note: this is a very simiplied explanation).

however if we do:
struct Foo{
int x;
};

Foo *Func(void){
Foo p[2];
p[0].x = 10;
p[1].x = 12;
return p;
}

int main(int argc, char **argv){
Foo *f = Func();
printf("%d\n", f[0].x);
return 0;
}
this is undefined behavior, and is very likely to crash the program, this is because you are returning a pointer into stack allocated memory that became invalid the moment you returned from Func. in many cases using the data immediately after would technically still work, but it is undefined and very bad to rely on such behavior.

one of the correct solutions is to do like so:
struct Foo{
int x;
};

bool Func(Foo *p, int psize){
if(psize<2) return false; //failed
p[0].x = 10;
p[1].x = 12;
return true; //succeded!
}

int main(int argc, char **argv){
Foo f[2];
Func(f, 2); // we can pass stack allocated memory into functions, we just should never return such memory!
printf("%d\n", f[0].x);
return 0;
}
edit:

I didn't think I would need a working example since I thought putting comments of the output would be enough.

always try to provide the minimalist working example to demonstrate your problem, you'll get more concise answers that way, then us trying to understand what the hell your doing.

 

 

Very well explained, makes sense now. I appreciate your help and everyone elses! And next time I'll be more specific so it's not a guest that code game.

Share this post


Link to post
Share on other sites

Very well explained, makes sense now. I appreciate your help and everyone elses! And next time I'll be more specific so it's not a guest that code game.


That is an important lesson. If you have a bug you can't track down, you are probably not a good judge of where the problem is located.

One good way to proceed in these situations is to transform your program into a tiny program that does as little as possible and still shows the odd behavior. You will very often discover the bug yourself during this process, and if you don't you now have a 20-line program that you can post here, with a guarantee that you didn't leave anything important out. In the rare event that you have discovered a compiler bug, you can also send this 20-line program to the people that write your compiler.

Share this post


Link to post
Share on other sites

Funny enough you said that, I do actually already do something like that.

 

My process...

 

1. Get idea.

 

2. Create console application in MVS.

 

3. Write base classes that do logic in my "main". (Keeps code clean, and easy to look at, plus I don't cause global pollution.)

 

4. Test in main.

//..Includes, etc...

int main(...)
{
     //For debugging..
     bool printToScreen = true;
     myClass *myPtr = new myClass("Do something cool!", printToScreen);
     myPtr->explode();

    return 0;
}

5. Import library / code to larger project.

 

6. Drink more coffee.

Edited by ajm113

Share this post


Link to post
Share on other sites

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now

Sign in to follow this