Best way to remove a substring from a C string?

Started by
4 comments, last by Bacterius 9 years, 9 months ago
I'm poking around with the C string functions. Don't ask; I'd personally just use C++ for the task, but it has to be pure C in this case. Anyway, here's the problem: I have a null-terminated string, and I have to remove a certain substring from it that begins at location a and ends at location b. I'm doing this on Windows with MinGW. Consider this sample:

char * mystr = "sometestingtennisbabesjunk";
const int a = 4;
const int b = 11;
removesubstring(mystr, a, b);
The result would be "sometennisbabesjunk"; the characters removed, from location 4 up to (but not including) location 11, are "testing".

What is the best way in C to do this? Is there a dedicated C string function for this task? I couldn't find one.

memmove is what you need.

You need to copy everything from b to the end of the string down to position a.

so something like this:

memmove(mystr + a, mystr + b, strlen(mystr) - b + 1);

The +1 is for also copying the null terminator.
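For reference, here is a minimal sketch of that one-liner wrapped in a function. The name removesubstring comes from the question; the size_t parameters and the half-open range [a, b) are my assumptions, and the sketch assumes the buffer is writable and that a <= b <= strlen(s).

```c
#include <stddef.h>
#include <string.h>

/*
 * Remove the characters in the half-open range [a, b) from s, in place.
 * Assumption: s is a writable buffer and a <= b <= strlen(s).
 * No bounds checking is done in this sketch.
 */
void removesubstring(char *s, size_t a, size_t b)
{
    /* Slide the tail, including the terminating '\0', down to position a.
       memmove is used because source and destination overlap. */
    memmove(s + a, s + b, strlen(s) - b + 1);
}
```

Called on a writable array holding "sometestingtennisbabesjunk" with a = 4 and b = 11, this leaves "sometennisbabesjunk" in the buffer.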

That will leave some unused space at the end of the string.

If this is a problem, you could copy the result to a newly allocated string which has the right number of bytes, maybe using strdup.
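If you want the right-sized copy, one possible sketch is to build it directly with malloc and memcpy instead of editing in place and then duplicating. The helper name removesubstring_copy is hypothetical, and I use malloc here only because strdup is not available in older ISO C standards. The caller owns the result and must free() it.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: returns a newly allocated string equal to src with
 * the range [a, b) removed. Returns NULL on bad arguments or if the
 * allocation fails. The caller must free() the result.
 */
char *removesubstring_copy(const char *src, size_t a, size_t b)
{
    size_t len, tail;
    char *out;

    if (src == NULL)
        return NULL;
    len = strlen(src);
    if (b < a || b > len)
        return NULL;

    tail = len - b;                      /* characters after the removed range */
    out = malloc(a + tail + 1);          /* exact size, +1 for the '\0' */
    if (out == NULL)
        return NULL;

    memcpy(out, src, a);                 /* head: src[0..a) */
    memcpy(out + a, src + b, tail + 1);  /* tail plus the terminator */
    return out;
}
```

Because the source is never modified, this version is also safe to call on a string literal.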

edit: Fixed it. As vstrakh points out below, memmove is the one to use when regions overlap.

Also if you implement it manually make sure you validate your bounds, that is:


if ((b < a) || (b > strlen(str)))
    /* abort! */

Otherwise you're just setting yourself up for scribbling all over your stack or heap.
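Put together with the memmove call, a checked version might look roughly like this. The function name and the 0 / -1 return convention are my own choices for illustration, and the NULL test is an extra check on top of the bounds test.

```c
#include <stddef.h>
#include <string.h>

/*
 * Sketch of a checked in-place removal of the range [a, b).
 * Returns 0 on success, -1 if the arguments are rejected.
 * (Name and return convention are hypothetical.)
 */
int removesubstring_checked(char *str, size_t a, size_t b)
{
    size_t len;

    if (str == NULL)
        return -1;

    len = strlen(str);              /* computed once and reused */
    if ((b < a) || (b > len))       /* reject nonsensical ranges */
        return -1;

    memmove(str + a, str + b, len - b + 1);
    return 0;
}
```

On rejection the string is left untouched, so the caller can decide how to "abort".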

“If I understand the standard right it is legal and safe to do this but the resulting value could be anything.”

memcpy behaviour is undefined when the regions overlap, so you actually need memmove.

And be careful not to operate on string constants. The mystr variable only holds a pointer to string literal data, which is constant.
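To illustrate the difference, here is a small sketch. The variable and function names are made up for the example; the point is only that an array initialised from a literal is a private, writable copy, while a pointer to the literal is not safe to write through.

```c
#include <string.h>

/* Pointer to a string literal: treat the data as read-only. */
static const char *literal = "sometestingtennisbabesjunk";

/* Array initialised from the literal: a private, writable copy
   (27 bytes here, including the terminating '\0'). */
static char writable[] = "sometestingtennisbabesjunk";

/* Hypothetical demo: refreshes the array from the literal, removes
   positions 4 up to (not including) 11 in place, and returns the array. */
char *edit_writable_copy(void)
{
    strcpy(writable, literal);      /* reading the literal is fine */
    memmove(writable + 4, writable + 11, strlen(writable) - 11 + 1);
    return writable;
}
```

Doing the same memmove through a pointer to the literal itself would be undefined behaviour, and in practice it often crashes.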

Also if you implement it manually make sure you validate your bounds, that is:

if ((b < a) || (b > strlen(str)))    /* abort! */
Otherwise you're just setting yourself up for scribbling all over your stack or heap.
Shouldn't the sanity checks be like:
len = strlen(str);
if ( (!len) || (b <= a) || (a >= len) ) {... abort}
if (b > len) b = len;
EDIT: Stupid HTML


Depends how you want to use your function. I personally prefer to not allow nonsensical input at all, thus if b is beyond the string, I would reject it. If you prefer to clamp it to the string's length instead like some string functions do, that's fine too, as long as you document that behaviour. If you don't want to allow a == b, that's fine as well. And of course you should reject null char* pointers, forgot that one (though it is obvious). Notice that !((b < a) || (b > len)) implies len > 0 (or a = b = 0) since a and b are unsigned.
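For comparison, the clamping policy could be sketched like this. The function name and return values are hypothetical, and this variant deliberately treats a == b and an empty string as errors, as in the checks suggested in the previous post.

```c
#include <stddef.h>
#include <string.h>

/*
 * Sketch of the clamping variant: removes [a, b) in place, and an end
 * position beyond the string is clamped to the string's length.
 * Returns 0 on success, -1 if there is nothing sensible to remove.
 * (Name and return convention are hypothetical.)
 */
int removesubstring_clamped(char *str, size_t a, size_t b)
{
    size_t len;

    if (str == NULL)                /* reject null pointers */
        return -1;

    len = strlen(str);
    if ((!len) || (b <= a) || (a >= len))
        return -1;
    if (b > len)
        b = len;                    /* clamp: remove through the end */

    memmove(str + a, str + b, len - b + 1);
    return 0;
}
```

Whichever policy you pick, the important part is that it is documented, since the two versions disagree on inputs such as b beyond the end of the string.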

