Sign in to follow this  
Sfpiano

Garbage being copied in between strings during concatenation

Recommended Posts

char* process() {
	FILE* fp;
	char* str = NULL, *newstr = NULL;
	fp = fopen("file.txt", "r");

	while( (str = getaline(fp)) != NULL ) {
		if( str[0] == '>' ) {

		}
		else {
			if( newstr == NULL ) {
				newstr = (char*)realloc(newstr, strlen(str)*sizeof(char));
				strcpy(newstr, str);
			}
			else {
				newstr = (char*)realloc(newstr, strlen(str)*sizeof(char));
				strcat(newstr, str);
			}
			free(str);
		}
	}

	fclose(fp);
	return newstr;
}
I have a file where entries span multiple lines and are marked with a starting character. My problem is when I try concatenating the strings as I read them in, they get corrupted along the way. Ex. File: >blah abcde abcde When I run this function the string I'll get back is something like: abcde@#$*@#abcde

Share this post


Link to post
Share on other sites
You're not providing space for the terminator.

Also, moved to For Beginners.

Share this post


Link to post
Share on other sites
^^ what he said ;)
Also:
-- sizeof(char) is always 1, so you can remove that bit of code.
-- When (newstr==NULL) you should be using malloc, not realloc
-- When (newstr!=NULL) you should realloc to strlen(newstr)+strlen(str) (plus 1 for the terminator)
			if( newstr == NULL ) {
newstr = (char*)malloc(strlen(str)+1);
strcpy(newstr, str);
}
else {
newstr = (char*)realloc(newstr, strlen(newstr)+strlen(str)+1);
strcat(newstr, str);
}

Share this post


Link to post
Share on other sites
Ah, the problem was the first realloc. I though if the buffer you passed in was null, it would just act as a malloc.

Share this post


Link to post
Share on other sites
Quote:
Original post by Sfpiano
Ah, the problem was the first realloc. I though if the buffer you passed in was null, it would just act as a malloc.


well, it might on some compilers, but its not good style :)

I'd be more worried about using "alloc(strlen(x))" instead of "alloc(strlen(x)+1)"

Share this post


Link to post
Share on other sites
Quote:
Original post by Sfpiano
...I though if the buffer you passed in was null, it would just act as a malloc.


Yes, you can use realloc in place of both malloc and free.

Share this post


Link to post
Share on other sites
0) The obligatory "why aren't you doing this in C++ where it's infinitely easier and the overhead is provably negligible or even negative (since a smarter algorithm is implemented than what you were thinking of), look, I'll show you" rant goes here :) But even in C we can learn tips and tricks that are applicable to programming in general, such as (1) below:

1) You don't need a special case at all. Handling NULL as a special case smells; consider "null objects" instead. Granted, you don't really have "objects" in C, but the natural "null object" for a C-string is... an empty string. If we just initialize 'newstr' as "" instead of NULL, then we can realloc the first time around just fine, as the strlen() of the "existing" string is 0. But it does seem that we need to allocate this string so that realloc() doesn't try to free() a literal :s

2) As noted, you need space for the null terminator. Also, are you sure your getaline() function does null termination?

3) I assume you are going to have multiple entries in the file eventually. You probably will need more robust handling of the '>' characters in that case. Right now, with the empty if clause, the code will concatenate everything it encounters until the end of the file, simply skipping over the '>' lines.

I assume that the text on these lines is intended to "label" the entries. You should consider if you need support for a multi-line label ;) But otherwise, the process you really want is something like:


- read lines until we find a label.
- Starting with the line after that, loop until a peek fails:
- If the peeked character is a '>', immediately return what we have so far.
- Otherwise, read the next line and concatenate it.


4) Just because C requires you to declare variables at top of scope doesn't mean you shouldn't still initialize instead of assigning as much as possible.


int peek(FILE* f) {
int result = getc(fp);
ungetc(c, fp); /* doesnt matter if this is EOF. */
return result;
}

/* The loop body seems useful enough in general to extract: */
char* append_reallocated(char* base, char* suffix) {
/* You shouldn't cast malloc() results in modern C, and I
assume that applies to realloc() too. It hides the error of
failing to #include <stdlib.h>, IIRC. */


base = realloc(base, strlen(suffix) + 1);
strcat(base, suffix);
return base; // for chaining, I guess... seems consistent with how
// the standard library behaves.
}

/* Since this function just reads one item of many, it shouldn't do the file
opening/closing now. */

char* process(FILE* fp) {
char c;
char* line;
char* result = malloc(1);

assert(result); *result = '\0';

/* Skip to the next item. */
while ((c = peek(fp)) != EOF && c != '>') {
free(getaline(fp));
}

/* Skip the label (safe if we're at EOF; we just free NULL) */
free(getaline(fp));

/* Read the item (returns "" if we're at EOF OR an item is empty...
hmm, if you need empty items then you'll need to fix this ;) */

while ((c = peek(fp)) != EOF && c != '>') {
append_reallocated(result, (str = getaline(fp)));
free(str);
}
return result;
}

int main() {
FILE* fp = fopen("file.txt", "r");
char* str;
/* Check for the "null object", instead of a null pointer... */
while((str = process(fp))[0]) {
printf("Read an item: %s", str);
}
fclose(fp);
}

Share this post


Link to post
Share on other sites

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now

Sign in to follow this