Removing All Whitespace From a String

This topic is 3664 days old which is more than the 365 day threshold we allow for new replies. Please post a new topic.


I'm trying to remove all whitespace from a string for a project I'm working on and I need some help. I've tried a few different approaches but none have worked how I need. I found a way to trim off the whitespace from the beginning and end of a string but not from the middle. The most recent way I tried was this function I wrote:
string EatSpace(string a)
{
    for (unsigned int i = 0; i < a.size(); i++)
    {
        if(isspace(a[i]))
            a.erase(i, 1);
    }
    return a;
}
The function manages to trim some of the beginning whitespace but not all of the whitespace in the string. Any help would be greatly appreciated.

This has a problem:

Quote:
for (unsigned int i = 0; i < a.size(); i++)
{
    if(isspace(a[i]))
        a.erase(i, 1);
}

When a character is erased, the string gets shorter and everything after it shifts down by one, so the character that followed the erased space now sits at index i. The loop then increments i and never checks that character, which is why consecutive whitespace is only partly removed. Try:

while (hasspace(str)) {
    str = removefirstspace(str);
}

where hasspace tells you whether the string still contains any whitespace, and removefirstspace returns the string with its first whitespace character removed.
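
One way the pseudocode above could be fleshed out, as a minimal sketch. hasspace and removefirstspace are the hypothetical helpers named in the post, not standard functions, and firstspace is a further helper invented here for illustration; the bodies are an assumption about what was intended.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper: index of the first whitespace character, or npos if none.
std::size_t firstspace(const std::string& s)
{
    for (std::size_t i = 0; i < s.size(); ++i)
        if (std::isspace(static_cast<unsigned char>(s[i])))
            return i;
    return std::string::npos;
}

// Hypothetical helper: true if the string contains any whitespace.
bool hasspace(const std::string& s)
{
    return firstspace(s) != std::string::npos;
}

// Hypothetical helper: returns a copy with the first whitespace character removed.
std::string removefirstspace(std::string s)
{
    std::size_t pos = firstspace(s);
    if (pos != std::string::npos)
        s.erase(pos, 1);
    return s;
}

// Repeats until no whitespace is left, as in the loop above.
std::string EatSpace(std::string str)
{
    while (hasspace(str))
        str = removefirstspace(str);
    return str;
}
```

This rescans the string from the start on every pass, so it is easy to follow but does quadratic work on long strings with a lot of whitespace.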

You could try this:

[QUOTE]
for (unsigned int i = 0; i < a.size(); i++)
{
    if(isspace(a[i]))
    {
        a.erase(i, 1);
        i--; // stay on this index: the next character has shifted into it
    }
}
[/QUOTE]

Edit: Why don't my "quote" tags work?

Roughly speaking:

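#include <algorithm>
#include <string>

// Removes every occurrence of cWhat from rString, in place.
void Remove(std::string & rString, char cWhat)
{
    // std::remove moves the kept characters forward; erase trims the leftover tail.
    rString.erase(std::remove(rString.begin(), rString.end(), cWhat), rString.end());
}

//...

std::string Data(...);

Remove(Data, ' ');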



I guess there are enough responses without me butting in, but I feel compelled.

// Don't rely on a's dynamically changing size. Make your own variable you can control.
int size = static_cast<int>(a.size());

// For all the elements in a...
for( int i=0; i < size; ) // increment i yourself. Don't rely on unconditional incrementation.
{
    if( isspace(a[i]) )
    {
        a.erase(i, 1);
        size--; // Change size yourself.
        // If everything moves back once, i is already in the right place!
    }
    else
    {
        i++; // Move i up ONLY if the string did not change.
    }
}





(Code is untested.)

There's another possibility here.

erase will likely copy all the characters that follow the erased position. So for a string full of spaces, the worst-case cost of the original loop is roughly (n-1)+(n-3)+(n-5)+..., which is quadratic in n.

While it's possible that some optimizations are done in some algorithms of the standard library, this might be algorithmically optimal:
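#include <algorithm>
#include <cctype>
#include <iterator>
#include <string>

// Removes all whitespace from s in place and returns a copy of the result.
std::string full_trim( std::string &s )
{
    std::string::iterator left = s.begin();   // next position to write to
    std::string::iterator right = s.begin();  // next character to examine

    // Skip the leading run of non-whitespace; nothing needs to move yet.
    while (left != s.end() && !std::isspace(static_cast<unsigned char>(*left))) {
        std::advance(left, 1);
        std::advance(right, 1);
    }
    // Copy each remaining non-whitespace character down to the write position.
    while (right != s.end()) {
        if (std::isspace(static_cast<unsigned char>(*right))) {
            std::advance(right, 1);
        } else {
            *left = *right;
            std::advance(left, 1);
            std::advance(right, 1);
        }
    }
    s.resize(left - s.begin());
    return s;
}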


Ideally, no reallocation occurs, and each kept character is copied at most once.

The first loop handles the degenerate case where the string contains no whitespace, or none near the beginning: it skips ahead without copying characters onto themselves.

You could also use another string as a buffer and transfer only the non-whitespace characters to it, something like this:


!!WARNING!! untested, and from someone with not a whole lot of experience


#include <cstring>

// buffer must be at least as large as original, including the terminating '\0'.
char* remove_white(char* original, char* buffer)
{
    int bufferpos = 0;

    for (int i = 0; original[i] != '\0'; i++)
    {
        if (original[i] == ' ')
        {
            continue; // skip spaces
        }

        buffer[bufferpos] = original[i];
        bufferpos++;
    }

    buffer[bufferpos] = '\0'; // terminate the buffer before copying it back

    strcpy(original, buffer);

    return original;
}

Quote:
Original post by Antheus
While it's possible that some optimizations are done in some algorithms of the standard library, this might be algorithmically optimal:
*** Source Snippet Removed ***

Congratulations, you just rewrote std::remove_if().
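
For reference, a minimal sketch of the same in-place compaction written with std::remove_if. RemoveWhitespace is a name chosen here for illustration, and the lambda assumes a C++11 or later compiler; it takes unsigned char so that negative char values are never passed to std::isspace.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Removes every character that std::isspace classifies as whitespace, in place.
void RemoveWhitespace(std::string& s)
{
    // std::remove_if shifts the kept characters to the front and returns an
    // iterator to the new logical end; erase then drops the leftover tail.
    s.erase(std::remove_if(s.begin(), s.end(),
                           [](unsigned char c) { return std::isspace(c) != 0; }),
            s.end());
}
```

Calling RemoveWhitespace on a string holding " a b\t\nc " should leave it as "abc", with each kept character moved at most once.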

I think I like godsenddeath's solution the best right now.

Yes, the problem can be solved in one line, but is the goal here to solve Celephix's problem, or teach him? I like godsenddeath's solution because it's simple and thus shows HOW the problem is resolved. The solutions that do it in one line of code are way too complex (in my opinion) for someone who's confused enough already.

Quote:
Original post by Splinter of Chaos
I think I like godsenddeath's solution the best right now.


I don't. An extra buffer is unnecessary. Using char pointers in favour of real C++ string objects is a bad habit to get into. It only checks for space characters, which is not the same as isspace (which counts: single space, tab, vertical tab, form feed, carriage return, or newline).
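
To make the difference concrete, here is a small sketch that counts both kinds of match in a string. CountSpaces and CountWhitespace are names made up for this illustration, and the whitespace set described assumes the default "C" locale.

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <string>

// Counts characters equal to the single space character ' '.
std::size_t CountSpaces(const std::string& s)
{
    return static_cast<std::size_t>(std::count(s.begin(), s.end(), ' '));
}

// Counts characters that std::isspace classifies as whitespace.
std::size_t CountWhitespace(const std::string& s)
{
    return static_cast<std::size_t>(
        std::count_if(s.begin(), s.end(),
                      [](unsigned char c) { return std::isspace(c) != 0; }));
}
```

For a string such as "a b\tc\vd\fe\rf\ng", the first function should find one match and the second six, so a filter that only compares against ' ' would leave five whitespace characters behind.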

Quote:

Yes, the problem can be solved in one line, but is the goal here to solve Celephix's problem, or teach him?


Both. He can use the simpler ones for reference, but in practice use the one-liners.

It's like std::vector (or any container): sane people avoid looking at the implementation. When we are learning, we write our own version so we can understand what goes on inside the Standard Library containers. When writing real code, we do not use our hand-rolled version.


Quote:
Original post by SiCrane
Quote:
Original post by Antheus
While it's possible that some optimizations are done in some algorithms of the standard library, this might be algorithmically optimal:
*** Source Snippet Removed ***

Congratulations, you just rewrote std::remove_if().


After checking the source, I really did. Although the MVS implementation is a bit more clever.

Like I said, an in-place copy will be algorithmically optimal. It's good to know that the standard implementation takes all such aspects into consideration, making those "performance" concerns a moot point.

Quote:
Original post by rip-off
Quote:

Yes, the problem can be solved in one line, but is the goal here to solve Celephix's problem, or teach him?


Both. He can use the simpler ones for reference, but in practice use the one-liners.


Fair enough, but that means whoever posts a one-liner should also include a multi-line equivalent, with an explanation of how each version works and why the two behave the same.

Quote:
Original post by dalleboy
Go with raz0r's implementation or:

str.erase(std::remove_if(str.begin(), str.end(), isspace), str.end());


Winner.

(Antheus, std::remove and std::remove_if do handle this.)

Here's a more "creative" solution - still linear running time but I suspect it will be a few times slower:


istringstream iss(str);
ostringstream oss;
string word;
while (iss >> word) { oss << word; }
str = oss.str();


(Same basic idea as godsenddeath had, but more C++-idiomatic and simpler, and also checking all whitespace.)
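
Wrapped up as a complete function, the stream version might look like the sketch below. StripWhitespace is a name invented here. It works because operator>> on a std::string skips leading whitespace and then reads up to the next whitespace character, so each extracted word is already free of whitespace.

```cpp
#include <sstream>
#include <string>

// Returns a copy of str with all whitespace removed, using stream extraction.
std::string StripWhitespace(const std::string& str)
{
    std::istringstream iss(str);
    std::ostringstream oss;
    std::string word;
    // Each extraction yields one whitespace-delimited word; append them back to back.
    while (iss >> word)
        oss << word;
    return oss.str();
}
```

Unlike the erase/remove_if form, this builds a second string instead of modifying the original in place.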

Quote:
Original post by Splinter of Chaos
I think I like godsenddeath's solution the best right now.

Yes, the problem can be solved in one line, but is the goal here to solve Celephix's problem, or teach him? I like godsenddeath's solution because it's simple and thus shows HOW the problem is resolved. The solutions that do it in one line of code are way too complex (in my opinion) for someone who's confused enough already.


Thanks :) As of 8 months ago I had no idea what C++ was, let alone how to program in it. I respect the criticism in the following posts; I'm completely open to constructive criticism, since it helps me grow as a programmer.

P.S. I just figured using char* (aka char[]) would be more universal and proper than using C++ strings, although I could rewrite it using std::string, as I normally would. Personally I only use char[] when it's required by an API, to avoid .c_str() and other complications.
