gretty

[C++] Imitate Split() from Python


Hi, I am trying to write my own split function, like the one in Python, but in C++. If you don't know what the split function does, you use it like this:
s = "hello. hi. Yo."
result = s.split('.')

# result[0] == "hello"
# result[1] == " hi"
# etc.
My problem is that I can't come up with an efficient algorithm. Right now my split function seems really inefficient: I check every character in the string and see whether it equals the target split character. What do you think Python's algorithm for split() is? Can you suggest a better algorithm for splitting a string at a target character? Also, in C++ is it possible to return an array from a function, or can I only return a pointer to an array?
#include <iostream>
#include <cstdlib>
#include <cctype>
#include <vector>
#include <string>

using namespace std;

vector<string> split(string s, char target);

int main() {
    
    string a = "m kdfjkdmdfjdhf Mhkfjdjkfh M hnkjsjkdf m jkdsjcfkm sjfjkdsh";
    
    vector <string> splitted = split(a,'m');
    
    for (int i=0; i<splitted.size(); i++) 
    {
        cout << splitted.at(i) << endl;
        
    }
    
    system("pause");
    return 0;
}

vector<string> split(string s, char target)
{
    // Post: This function is the same as the split() function in python. 
    //       A string is split at every occurrence of target & each section 
    //       is stored in a vector element.
    
    vector <string> result;
    string tempStr;
    
    //if (!islower(target)) {
    target = tolower(target);
    //}
    
    for (int i=0; i<s.length(); i++) 
    {
        s[i] = tolower(s[i]);
        if (s[i] == target) {
           string res = "'"+tempStr+"'";
           result.push_back(res);
           tempStr = "";
        }
        else tempStr += s[i];
    }
    
    if (tempStr.length() > 0) {
         string res = "'"+tempStr+"'";
         result.push_back(res);
    }
    
    return result; 
}


Here are some examples.
Quote:
Also in C++ is it possible to return an array from a function? Maybe I can only return a pointer to an array?
Returning a vector is a pretty standard thing to do if you want to return an array, especially if the size of the array isn't known at compile-time.

On a side note, when passing strings around in C++ it's best to pass them by const-reference, like:
vector<string> split(const string& s, char target)
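To make the two suggestions above concrete, here is a minimal sketch of a character-based split that takes its input by const reference and returns a vector by value. The name split_char is made up for this example, and it assumes a C++11 compiler for the range-based for loop. It is case-sensitive and keeps empty pieces, which is how Python's split behaves when given an explicit separator.

```cpp
#include <string>
#include <vector>

// Hypothetical name for illustration. The input is taken by const reference
// (no copy is made) and the pieces are returned by value in a std::vector.
std::vector<std::string> split_char(const std::string& s, char target) {
    std::vector<std::string> result;
    std::string current;
    for (char c : s) {
        if (c == target) {
            result.push_back(current);  // close the current piece
            current.clear();
        } else {
            current += c;
        }
    }
    result.push_back(current);  // last piece, possibly empty
    return result;
}
```

With the input "hello. hi. Yo." and '.' as the target this yields four pieces: "hello", " hi", " Yo" and a trailing empty string.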

The Python split() allows you to specify a string as the delimiter (Python does not distinguish a separate character type from the string type; '.' is a string of length 1 rather than a character, and 'hi mom'[0][0][0][0][0][0][0][0] will not raise an exception). It is also case-sensitive, so you shouldn't be doing anything with tolower().

Also, the Python split function does not add quotes to the substrings. That's part of Python's built-in formatting for displaying a representation of strings. You should definitely not add them in the C++ version.

Finally, there are a bunch of 'find' functions in the std::string interface that you seem to be unaware of. Don't make life hard for yourself.
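As a rough illustration of the 'find' approach, the sketch below uses std::string::find to jump from one occurrence of the target character to the next and std::string::substr to copy out the text in between. The function name split_find is an assumption made for this example only.

```cpp
#include <string>
#include <vector>

// Hypothetical name for illustration: splits s at every occurrence of target
// by letting std::string::find locate the next delimiter.
std::vector<std::string> split_find(const std::string& s, char target) {
    std::vector<std::string> result;
    std::string::size_type start = 0;
    std::string::size_type pos;
    while ((pos = s.find(target, start)) != std::string::npos) {
        result.push_back(s.substr(start, pos - start));  // text before the delimiter
        start = pos + 1;                                 // continue after the delimiter
    }
    result.push_back(s.substr(start));  // remainder after the last delimiter
    return result;
}
```

Every character still has to be examined once, so the asymptotic cost is the same as a hand-written loop; the gain is mostly in readability, since the searching is left to the library.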

Quote:
Also in C++ is it possible to return an array from a function?


You can wrap an array in a structure and return an instance of the structure. Or you can use a pre-made structure for that purpose: boost::array (which is designed to let you treat it just like an array, with [] subscripting and everything).

However, for the current task, an array is inappropriate because you do not know ahead of time how many substrings there will be.

Here's what I came up with:


// Assumes <string>, <vector>, <stdexcept> and "using namespace std;"
vector<string> split(const string& source, const string& delimiter, int limit = -1) {
    string::size_type position = 0;
    const string::size_type delimiter_size = delimiter.size();
    if (delimiter_size == 0) { throw invalid_argument("empty delimiter"); }
    vector<string> result;
    // Note the use of '!=' here rather than '<' which allows us to treat a -1
    // limit value as "infinity". The loop will still break when the string can't
    // be found any more.
    for (int i = 0; i != limit; ++i) {
        string::size_type found_at = source.find(delimiter, position);
        if (found_at == string::npos) { break; }
        result.push_back(string(source, position, found_at - position));
        position = found_at + delimiter_size;
    }
    result.push_back(string(source, position));
    return result;
}
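
For anyone who wants to try the limit parameter, here is a self-contained sketch. The split function is restated from the post above (with std:: qualifiers) so the block compiles on its own; join_with_bar is a hypothetical helper added only to make the results easy to inspect.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Restated from the post above so that this example compiles on its own.
std::vector<std::string> split(const std::string& source,
                               const std::string& delimiter,
                               int limit = -1) {
    if (delimiter.empty()) { throw std::invalid_argument("empty delimiter"); }
    std::vector<std::string> result;
    std::string::size_type position = 0;
    // '!=' rather than '<' lets a limit of -1 mean "no limit".
    for (int i = 0; i != limit; ++i) {
        std::string::size_type found_at = source.find(delimiter, position);
        if (found_at == std::string::npos) { break; }
        result.push_back(source.substr(position, found_at - position));
        position = found_at + delimiter.size();
    }
    result.push_back(source.substr(position));  // unsplit remainder
    return result;
}

// Hypothetical helper for the demonstration: joins the pieces with '|'.
std::string join_with_bar(const std::vector<std::string>& parts) {
    std::string out;
    for (std::size_t i = 0; i < parts.size(); ++i) {
        if (i != 0) { out += '|'; }
        out += parts[i];
    }
    return out;
}
```

Calling split("a,b,c,d", ",") gives the four pieces a, b, c and d, while split("a,b,c,d", ",", 2) performs at most two splits and leaves "c,d" as the last piece, which matches what Python's split does with a maxsplit argument.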


Our library's split function looks something like this:

vector<string> split(const string& in, const string& delim) {
    string::size_type start = in.find_first_not_of(delim), end = 0;

    vector<string> out;
    while(start != in.npos) {
        end = in.find_first_of(delim, start);
        if(end == in.npos) {
            out.push_back(in.substr(start));
            break;
        } else {
            out.push_back(in.substr(start, end-start));
        }
        start = in.find_first_not_of(delim, end);
    }
    return out;
}



But of course there are probably a dozen different ways you could go about writing a split function, with a ton of different parameters you could potentially pass in to tweak the behavior to precisely your needs.
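
One difference in behavior is worth spelling out with a small sketch. Because the function above is built on find_first_of and find_first_not_of, the delimiter argument acts as a set of characters rather than a single substring, and runs of delimiters never produce empty pieces. The code below restates that function under the hypothetical name split_tokens so that it compiles on its own.

```cpp
#include <string>
#include <vector>

// Restated from the post above under a hypothetical name (split_tokens).
// Any character contained in delims ends a token; empty tokens are skipped.
std::vector<std::string> split_tokens(const std::string& in,
                                      const std::string& delims) {
    std::vector<std::string> out;
    std::string::size_type start = in.find_first_not_of(delims);
    while (start != std::string::npos) {
        std::string::size_type end = in.find_first_of(delims, start);
        if (end == std::string::npos) {
            out.push_back(in.substr(start));  // last token runs to the end
            break;
        }
        out.push_back(in.substr(start, end - start));
        start = in.find_first_not_of(delims, end);  // skip the delimiter run
    }
    return out;
}
```

With this version, split_tokens("a,,b", ",") returns two pieces (a and b), and split_tokens("hello. hi. Yo.", ". ") returns hello, hi and Yo with no leading spaces and no trailing empty string. A substring-based split keeps the empty pieces instead, so which one is right depends on the data being parsed.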
