Sign in to follow this  

C sizeof problem

This topic is 3457 days old which is more than the 365 day threshold we allow for new replies. Please post a new topic.

If you intended to correct an error in the post then please contact us.

Recommended Posts

I had an interview today for a possible internship with a company supplying IPTV solutions. The interviewer asked a technical question, which I got with a little bit of help here and there but he said it was just as important to communicate how to solve the problem as well as actually implementing it. No idea whether I got the job, but we'll see. The question he asked was write a C function to reverse a string. So I came up with something on a whiteboard which he seemed relatively happy with. I wanted to be sure I could always do it so I tested the code again at home, at which point I hit a weird issue (as I usually do when working with C, I might add). Here's the code I wrote ('scuse the messiness of it, I was just hacking about):
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

void strRev(char* originalString);

int main()
{

	char* source;

	source = "hello";

	strRev(source);

	return (0);
}


void strRev(char* originalString)
{	
	char* tempString = NULL;
	int i = 0;
	int j = 0;
	int lengthOfTempString = 0;
	int lengthOfOriginalString = 0;
	
	lengthOfOriginalString = strlen(originalString);

	tempString = malloc(sizeof(char) * lengthOfOriginalString);

	lengthOfTempString = strlen(tempString);

	printf("Length of tempString: %i\n",lengthOfTempString);
	printf("Length of originalString: %i\n",lengthOfOriginalString);
	
	printf("sizeof(char): %i\n",sizeof(char));
	
	for (i = strlen(originalString); i >= 0; i--)
	{
		if (originalString[i] == '\0')
		{
			printf("Null terminator detected\n");
		}
		else
		{
			tempString [j] = originalString[i];
			j++;
		}
	}
	

	*tempString+='\0';

	printf("Original string: %s",originalString);
	printf("\n");
	printf("Newly reversed string: %s",tempString);
		
	free(tempString);
}


Now, when I run this, I see something interesting. The strlen of originalString is 5, as it should be, and this is what the code reports. So why is the strlen for tempString 9? That doesn't make any real sense to me, where's the 9 coming from? I know the size of a char* pointer is 4 bytes but that isn't important or relevant here from what I can see, even still there's no way that you could get 9 out of that. When it prints out tempString, I see the right output followed by 4 extra bytes of utter crap because the buffer is too big. I'm now getting 9, which is an improvement on the even more random 16 and 24 I was getting earlier. Similarly, when I do:
tempString = malloc(sizeof(char));
without the additional sizeof as shown above, the length of tempString is apparently 16?? I really don't get what's going on here, I'm not exactly Brian Kernaghan when it comes to C but that just seems plain wrong. How does having a char*, then assigning it dynamically to a standard char produce 16? Shouldn't it be 1, or 4? I'm compiling using VS2005 if that helps, maybe something weird about the compiler? This is actually bugging me, it doesn't seem right. Where did I go wrong? TIA, ukd.

Share this post


Link to post
Share on other sites
strlen works by reading memory at a pointer until it finds a null-terminator. malloc just allocates data, it does not initialize it. The solution to your problem is to initialize your buffer.

	tempString = malloc(sizeof(char) * lengthOfOriginalString);
memset(tempString, 0, sizeof(char) * lengthOfOriginalString);
lengthOfTempString = strlen(tempString); // = 0

Share this post


Link to post
Share on other sites
	lengthOfOriginalString = strlen(originalString);
tempString = malloc(sizeof(char) * lengthOfOriginalString);

lengthOfTempString = strlen(tempString); /* UH-OH! */

printf("Length of tempString: %i\n",lengthOfTempString);
printf("Length of originalString: %i\n",lengthOfOriginalString);

See the line marked with the "UH-OH" comment? Since we know that strlen works by seeking sequentially through the character array for null termination, but we just allocated the memory and have no idea what's in it... where's the null? 1 character away? 9? 16? 1015?

Yeah, exactly.

Share this post


Link to post
Share on other sites
You didnt initialize your buffer. Use calloc for your allocation, or malloc it then memset it. I prefer calloc, because thats what its for.


lengthOfOriginalString = strlen(originalString);

tempString = calloc(sizeof(char) * lengthOfOriginalString + 1, 1);

lengthOfTempString = strlen(tempString);

You should still however expect the value of lengthOfTempString to be zero, because you haven't put a string into it yet.

[edit]

Damnit, why do I have to be such a slow typist!?

[edit, again]

You also forgot to add 1 byte for the null terminator.

Share this post


Link to post
Share on other sites
The C standard defines sizeof(char) as 1. I think your problem here is that there isn’t a '\0' at the end of the string when you calculates its length. strlen calculates the numbers of chars before the first '\0' in memory.

It’s possible to reverse a string without using additional space (I haven’t tested it):

void strRev(char *str)
{
size_t len = strlen(str);
size_t n;

for (n = 0; n < len/2; ++n) {
char tmp = str[n];
str[n] = str[len - n - 1];
str[len - n - 1] = tmp;
}
}





Edit: Too slow... [bawling] [lol]
Edit 2: I thought your objective was to modify the string passed reversing it. But your code don’t do that.

Share this post


Link to post
Share on other sites
Quote:
Original post by apatriarca
The C standard defines sizeof(char) as 1. I think your problem here is that there isn’t a '\0' at the end of the string when you calculates is length. strlen calculates the numbers of chars before the first '\0' in memory.

It’s possible to reverse a string without using additional space (I haven’t tested it):
*** Source Snippet Removed ***

Edit: Too slow... [bawling] [lol]


This would be a good place to use XOR swap. Some interviewers seem to love this kind of bullshit: [grin]


void strRev(char *str)
{
size_t len = strlen(str);
size_t n;

for (n = 0; n < len/2; ++n) {
str[n] ^= str[len - n - 1];
str[len - n - 1] ^= str[n];
str[n] ^= str[len - n - 1];
}
}

Share this post


Link to post
Share on other sites
Quote:
Original post by fpsgamer
This would be a good place to use XOR swap. Some interviewers seem to love this kind of bullshit: [grin]

I never understand why the XOR swap is supposed to be faster than the simpler and easier to read and understand one (not considering my favorite swap: std::swap). I have tested the two versions on my machine once and the XOR swap was slower. But I doubt is worth optimizing things like that.

Share this post


Link to post
Share on other sites
Quote:
Original post by apatriarca
Quote:
Original post by fpsgamer
This would be a good place to use XOR swap. Some interviewers seem to love this kind of bullshit: [grin]

I never understand why the XOR swap is supposed to be faster than the simpler and easier to read and understand one (not considering my favorite swap: std::swap). I have tested the two versions on my machine once and the XOR swap was slower. But I doubt is worth optimizing things like that.


They prefer it because the interview would end too quickly if you could just use std::reverse like a sane developer :)

Share this post


Link to post
Share on other sites
I was actually told another way to do it by pointing to the start and end of the string, swapping the values until you reach the middle of the string. Might try that one next.

calloc solved the problem, thanks for your help! It's now screamingly obvious that I didn't initialise the buffer, I didn't think of zeroing it for some reason :S

It all seems so clear now, thanks for your help.

Share this post


Link to post
Share on other sites
Quote:
Original post by ukdeveloper
I was actually told another way to do it by pointing to the start and end of the string, swapping the values until you reach the middle of the string. Might try that one next.

It’s exactly the method I have implemented earlier. It use array indices instead of pointers but the logic is the same.

Quote:

calloc solved the problem, thanks for your help! It's now screamingly obvious that I didn't initialise the buffer, I didn't think of zeroing it for some reason :S

calloc initialize the entire buffer even if you only need to set the last char of the string to '\0'. A simple tempString[lengthOfOriginalString] = '\0' is enough.

Quote:

They prefer it because the interview would end too quickly if you could just use std::reverse like a sane developer :)

:) This is also the reason they have asked for a C implementation. In C++ it would have been too easy.

Share this post


Link to post
Share on other sites
Quote:
Original post by ukdeveloper
I was actually told another way to do it by pointing to the start and end of the string, swapping the values until you reach the middle of the string. Might try that one next.


The "middle" is just going to confuse you with even/odd length (even though it doesn't matter)


void swap(char* a, char* b)
{
char tmp = *a;
*a = *b; *b = tmp;
}

void reverse(char *str)
{
char* ptr = str + strlen(str) - 1;
while (str < ptr)
swap(str++, ptr--);
}



As for the XOR-swap. There might not be many reasons to use it, chances are, the compiler will optimize away the tmp variable anyway. However, it's one of those "tricks" every programmer seems to be supposed to know. In an interview, I might be torn. Some might appreciate that you know it, others might figure you're trying to show off with lame tricks and others might even worry that you'll use that nonsense in production code for no reason other than being cool.

Share this post


Link to post
Share on other sites
Quote:
Original post by Trienco
As for the XOR-swap. There might not be many reasons to use it, chances are, the compiler will optimize away the tmp variable anyway. However, it's one of those "tricks" every programmer seems to be supposed to know. In an interview, I might be torn. Some might appreciate that you know it, others might figure you're trying to show off with lame tricks and others might even worry that you'll use that nonsense in production code for no reason other than being cool.
I once had to take an interview-test which was almost entirely composed of "clever" questions, like how to swap two variables without using a temporary.

I didn't get the job, but 6 months later the guy in charge of writing the recruitment test was fired. Moral of the story - don't obsess over clever code ;)

Share this post


Link to post
Share on other sites
Is the XOR swap actually faster though??? I was always under the impression (from what I'd been taught) that it was useful on embedded systems when you were working under extremely tight memory constraints (where you might not even have memory to put the temp variable on the stack).

I've never actually profiled this stuff (and I really don't care to at the moment).

Share this post


Link to post
Share on other sites
I'd even expect a good compiler to notice that the tmp variable has such limited scope that it won't even put it on the stack and just use a register for it. Should be faster than all the XOR-trickery. That's why I wouldn't hesitate declaring the variable inside the loop instead of outside. Don't know how many compilers actually care about "register char tmp = *str;", but it probably won't hurt.

Share this post


Link to post
Share on other sites
In C++ the more efficient and elegant way to solve that is to simply use a range adaptor that reverses the range.

make_reversed_range(your_range) and there you go, you get a range that you traverse in reverse order.
The old variable is kept unmodified, and it is O(1). An access to the first element will simply automatically be translated to an access to the last element.

A pipe syntax is nice too, your_range | adaptors::reversed.

All of this is doable with the range algorithms and adaptors available in Boost.

(For those who do not know about the range concepts, a range can be any container, a pair of iterators, or any other type that allows extracting of a begin and an end iterator)

By the way, note that your code is not const-correct, and that's bad.

Share this post


Link to post
Share on other sites
Quote:
Original post by loufoque
In C++ the more efficient and elegant way to solve that is to simply use a range adaptor that reverses the range.

make_reversed_range(your_range) and there you go, you get a range that you traverse in reverse order.
The old variable is kept unmodified, and it is O(1). An access to the first element will simply automatically be translated to an access to the last element.

A pipe syntax is nice too, your_range | adaptors::reversed.

All of this is doable with the range algorithms and adaptors available in Boost.

(For those who do not know about the range concepts, a range can be any container, a pair of iterators, or any other type that allows extracting of a begin and an end iterator)

By the way, note that your code is not const-correct, and that's bad.


So.... I'm assuming you didn't read any parts of this thread at all? I mean, how else could you have missed all the references to "this is done in C, not C++"

Share this post


Link to post
Share on other sites
Quote:
Original post by loufoque
In C++ the more efficient and elegant way to solve that is to simply use a range adaptor that reverses the range.

I don't know how many job interviews you've gone on, but in my experience interviewers prefer solutions that actually do what the interviewer asked.

Share this post


Link to post
Share on other sites
Plus, for a simple task like "reverse a string" the above sounds like pure over-engineering. Something I've learned to hate during endless hours of going through peoples code to track down bugs that wouldn't even be there without a pointlessly complicated approach. Cool and fancy things are fine, IF they are needed to achieve a goal and IF they are appropriate. Using them for no reason other than being cool will just annoy the hell out of your coworkers.

In other words and as someone has said before: if you do anything more complicated than calling reverse(someString) in C++, you didn't understand the task ,-)

Share this post


Link to post
Share on other sites
Pro-tip: If you're ever asked for a zero-temporary swap, for maximum points, explain to the interviewer exactly _why_ it's slower on modern processors.

Share this post


Link to post
Share on other sites
Quote:
Original post by Sneftel
Pro-tip: If you're ever asked for a zero-temporary swap, for maximum points, explain to the interviewer exactly _why_ it's slower on modern processors.


The problem for someone like myself is that I lack certain lowel-level hacker knowledge :) I can simply relate that the "compiler can do better".

I wouldn't mind hearing a more precise answer.

Share this post


Link to post
Share on other sites
"The compiler can do better" is one compelling argument. To the compiler, where possible, std::swap means "switch the register allocations", and masses-of-XOR means "masses of XOR". The big thing, though, is pipelining. a^=b unavoidably ties up both a and b for one clock cycle, and then unavoidably keeps a tied up for the next clock cycle. Since the next instruction needs to read back from a, this means that in the simplest possible case, the three instructions are completely serialized, taking six cycles.

Now consider the temp case. The first op, t=a, takes two cycles as well, but as soon as the first cycle is complete, a is no longer tied up, and a=b can start executing. Likewise, after one more cycle, b is no longer tied up, and b=t can start executing. In this same simple pipelined processor, the total is four cycles.

Share this post


Link to post
Share on other sites
Quote:
Original post by ukdeveloper
This is actually bugging me, it doesn't seem right. Where did I go wrong?


Pretty much everywhere, sorry to say.

Here's a reasonable "answer key" for C. Note how few lines of actual code there are.


#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Just because the standard library uses cryptic names for functions doesn't
mean you should. On the other hand, there's not much sense elaborating that
the char* you're passing in is a "String" in the variable name... */


void reverse(char* original) {
/* We reverse the string in place, by swapping around the chars within it. */
/* First, set up two pointers, to the first and last characters (not
including the null terminator). There will be no need to do anything
special with the terminator; we just leave it in place, because the
reverse of a string is the same length as the original. */

char* begin = original;
char* end = original + strlen(original) - 1;
/* Notice that C allows you to initialize variables at declaration, even
with complicated expressions. Don't declare something and then assign it
on the next line, or "initialize" to a "zero" value and then assign on the
next line. The point of *initial*ization is to initialize something with
its *initial* value. :) */

/* Also notice the pointer arithmetic used to find the end pointer. */
/* Now we can iteratively swap pairs of characters: the first and last,
second and second-last, etc. until we get to the middle. You should be
able to convince yourself that this reverses the string. Notice that we
don't iterate each pointer over the whole string, since that would
re-reverse it. ;) So how do we find the middle? We could do some math,
but the simple way is to note that both pointers "move" at the same rate,
so 'begin' will reach or overtake 'end' exactly in the middle. So now we
are ready: */

while (begin < end) {
/* To swap, we assign the pointed-at characters to each other. But we can't
just do one assignment and then the other, because the first assignment
would mess up the second. Instead, we need to go through a temporary
variable. Its type is 'char', because those are what we're swapping. */

char temp = *begin;
*begin = *end;
*end = temp;
/* Move the pointers towards each other. */
begin++;
end--;
}
}

int main() {
/* Again, initialize when you can. */
/* Note that we are not allowed to modify string literals, so we can't pass
"hello" to reverse() directly. But we can do it if the string is copied
into an array. In C, the following syntax implicitly does the trick: */

char[] source = "hello";
/* There is a string literal which is "baked" into the executable, but at
runtime, the program will make an array on the stack and copy in the
literal. */

printf("original: %s\n", source);
reverse(source);
printf("reversed: %s\n", source);
/* We don't need to return 0 explicitly. But even then, don't put parentheses
around return values: return is a keyword, not a function. (If it were a
function, how would it return? */

return 0;
}




EDIT: Yeah, I suppose actually moving the pointers used for iteration would be a good idea. And of course someone noticed before I could correct it. That's what you get for writing that much by way of explanation, I guess. X_X

Share this post


Link to post
Share on other sites
Quote:
Original post by Zahlman
Here's a reasonable "answer key" for C. Note how few lines of actual code there are.

*** Source Snippet Removed ***

There seems to be a distinct lack of modification of the begin and/or end pointers in the loop in your function.

Share this post


Link to post
Share on other sites
I'd also add that Visual Studio 2005 (what I used) barked like a pack of dogs when I didn't declare everything at the top of the function in advance of actually initialising and/or assigning it. It wouldn't compile if I tried to do inline initialisation like you did apart from setting the tempString pointer to null.

Apparently it's a requirement of C99 but maybe it's just the VS compiler being picky? I don't normally use C so I probably had the compiler options set up wrong.

Share this post


Link to post
Share on other sites

This topic is 3457 days old which is more than the 365 day threshold we allow for new replies. Please post a new topic.

If you intended to correct an error in the post then please contact us.

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now

Sign in to follow this