Splitting a string

This topic is 408 days old which is more than the 365 day threshold we allow for new replies. Please post a new topic.

If you intended to correct an error in the post then please contact us.

Hi all,

I've been struggling to find 'the best' way to split a string into several variables/values.

To illustrate the situation, let's assume I have a std::string with this content:

MAX_PLIGHT:4;NORMALMAP:1;

Now, what I want to do is extract two strings and two UINTs from it:

string1 = MAX_PLIGHT
uint1 = 4

string2 = NORMALMAP
uint2 = 1

The solution I'm looking for should of course be independent of the number of 'shader defines' in the complete string.

You can assume that each define always has both a name and a value, and that there is always a ';' at the end of the string.

Any ideas on how to approach this?

(Note: I got quite far extracting the names, but I got stuck on the values/UINTs, unless I manually run atoi on the characters between the ':' and the ';'.)

Edited by cozzie


Thanks, I really have to study this code; it looks like it splits the name and value from each other.

In the meantime I've come up with a working but, to be honest, not very clean solution:

bool CD3dShaderPack::CreateDefines(const std::string &pDefineLine)
{
	std::vector<size_t> defineNameStartPos;
	std::vector<size_t> defineNameEndPos;
	std::vector<size_t> defineValStartPos;
	std::vector<size_t> defineValEndPos;

	// the first name starts at the beginning of the line
	defineNameStartPos.push_back(0);

	for(size_t i=0;i<pDefineLine.length();++i)
	{
		if(pDefineLine[i] == ':')
		{
			defineNameEndPos.push_back(i);
			defineValStartPos.push_back(i+1);
		}
		if(pDefineLine[i] == ';')
		{
			defineNameStartPos.push_back(i+1);
			defineValEndPos.push_back(i);
		}
	}
	
	// define line always ends with ';', so the last start position points past the end
	defineNameStartPos.pop_back();

	// push the names and values into the defines std::vector
	C_SHADER_MACRO newDefine;
	for(size_t defs=0;defs<defineNameStartPos.size();++defs)
	{
		std::string tval = pDefineLine.substr(defineValStartPos[defs], defineValEndPos[defs] - defineValStartPos[defs]);
		int def = atoi(tval.c_str());

		// note: only safe if Name is a std::string; a raw const char* would dangle here
		newDefine.Name = pDefineLine.substr(defineNameStartPos[defs], defineNameEndPos[defs] - defineNameStartPos[defs]).c_str();
		newDefine.Definition = def;

		mDefines.push_back(newDefine);
	}

	return true;
}
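
One caveat about the atoi call above: it returns 0 when the text is not a number, so a malformed value cannot be told apart from a real 0. Below is a minimal sketch of a stricter conversion, assuming C++11's std::stoul is available. The helper name parseUint is made up for illustration and is not part of the original code.

```cpp
#include <cctype>
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper: converts text such as "4" to an unsigned value.
// Returns false instead of silently yielding 0 when the text is not a
// plain decimal number.
bool parseUint(const std::string &text, unsigned long &out)
{
	// std::stoul would accept leading whitespace and a sign, so insist
	// on a leading digit first.
	if (text.empty() || !std::isdigit(static_cast<unsigned char>(text[0])))
		return false;

	try
	{
		std::size_t used = 0;
		unsigned long value = std::stoul(text, &used);
		if (used != text.size())
			return false; // trailing junk such as "4x"
		out = value;
		return true;
	}
	catch (const std::logic_error &)
	{
		// std::out_of_range and std::invalid_argument both derive from logic_error
		return false;
	}
}
```

The value substring between ':' and ';' could be passed to such a helper, and the define skipped or reported when it returns false.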

Quick and dirty C code (just to get the point across):

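/* str must be a writable char buffer: strtok overwrites the delimiters */
char *tok = strtok( str, ";" );
while( tok ){
    char var[256], val[256];

    /* %[^:] stops at the colon; a plain %s would swallow "name:value" whole */
    if( sscanf( tok, "%255[^:]:%255s", var, val ) == 2 )
        do_something( var, val );
    tok = strtok( NULL, ";" );
}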


Álvaro's solution is great because it's easy to extend; really, he's just using a split function twice. The first split separates the elements at the ';' character, and the second split separates each substring at the ':' character. You could rework that into a single function that returns a vector of strings split by a delimiter: split by ';' first, then split each element of the resulting vector by ':'.
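
A minimal sketch of that idea, and not Álvaro's actual code: the helper split and the function parseDefines are names made up here, and the values are kept as strings for simplicity.

```cpp
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: splits text at every occurrence of delim.
// std::getline does not report a trailing empty piece, so "a;b;" yields {"a", "b"}.
std::vector<std::string> split(const std::string &text, char delim)
{
	std::vector<std::string> parts;
	std::istringstream stream(text);
	std::string part;
	while (std::getline(stream, part, delim))
		parts.push_back(part);
	return parts;
}

// Splits "NAME:VALUE;NAME:VALUE;" into (name, value) pairs.
std::vector<std::pair<std::string, std::string>> parseDefines(const std::string &line)
{
	std::vector<std::pair<std::string, std::string>> defines;
	for (const std::string &entry : split(line, ';'))
	{
		const std::vector<std::string> fields = split(entry, ':');
		if (fields.size() == 2) // skip anything that is not exactly name:value
			defines.emplace_back(fields[0], fields[1]);
	}
	return defines;
}
```

Calling parseDefines("MAX_PLIGHT:4;NORMALMAP:1;") gives two pairs, and the number of defines in the line does not matter.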


In the interests of completeness (and maybe overkill), here is how to do it with C++11 and regular expressions:

#include <regex>
#include <iterator>
#include <iostream>
#include <string>

int main()
{
  const std::string s = "MAX_PLIGHT:4234;NORMALMAP:17789;";

  // group 1 captures the name, group 2 the digits of the value
  std::regex words_regex("(\\w+)\\:(\\d+);");
  auto words_begin = std::sregex_iterator(s.begin(), s.end(), words_regex);
  auto words_end = std::sregex_iterator();

  for (std::sregex_iterator i = words_begin; i != words_end; ++i) {

    const std::smatch &match = *i;

    std::string match_str = match[1];
    int match_int = std::stoi(match[2].str());

    std::cout << match_str << "=" << match_int  << '\n';
  }
}

I've modified the code from http://en.cppreference.com/w/cpp/regex/regex_iterator to iterate through your text. It's pretty short and sweet and very robust. If you need to cope with white space around the strings you can change the regular expression to ignore those and only capture the text you need.
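
As a sketch of that whitespace-tolerant variant, \s* can be placed around each captured group. The pattern and the function name parseDefinesRegex are my own assumptions here, not taken from the cppreference example.

```cpp
#include <regex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: collects (name, value) pairs, ignoring blanks
// around the name, the ':' and the ';'.
std::vector<std::pair<std::string, int>> parseDefinesRegex(const std::string &s)
{
	// \s* sits outside the capture groups, so the groups hold only the text itself.
	static const std::regex pair_regex("\\s*(\\w+)\\s*:\\s*(\\d+)\\s*;");

	std::vector<std::pair<std::string, int>> result;
	for (std::sregex_iterator it(s.begin(), s.end(), pair_regex), end; it != end; ++it)
	{
		const std::smatch &match = *it;
		result.emplace_back(match[1].str(), std::stoi(match[2].str()));
	}
	return result;
}
```

Text that does not fit the pattern is simply skipped by the iterator, so a stricter parser would need an extra check that nothing was left unmatched.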


maybe this:

 

[spoiler]

#ifndef stringlistH
#define stringlistH

#include "FileHandling.h"
#include <vector>
#include <android/log.h>
#include <fstream>
#include <sstream>
#include "DxcMath.h"
#include <string>
#include <stdlib.h>
#include <iostream>

#define APP_LOG2 "WNB_TXTLOG"
typedef std::string AnsiString;

extern std::vector<AnsiString> w_nawiasie[10];

enum tFloatConversionRule { tFCDot, tFCcomma };

extern tFloatConversionRule FLOAT_CONVERSION;
extern void  init_string_float_conversion_rule();

inline AnsiString IntToStr(int i)
{
	std::stringstream s;

	s << i;

	AnsiString converted(s.str());
	return converted;
}

inline AnsiString FloatToStr(float i)
{
	std::stringstream s;

	s << i;

	AnsiString converted(s.str());
	return converted;
}



inline AnsiString POINT_TO_TEXT(t3dpoint<float> p)
{
	return "X "+FloatToStr(p.x) + " Y "+FloatToStr(p.y)+" Z "+FloatToStr(p.z);
}

inline AnsiString iPOINT_TO_TEXT(t3dpoint<int> p)
{
	return "X "+IntToStr(p.x) + " Y "+IntToStr(p.y)+" Z "+IntToStr(p.z);
}

inline int Pos(AnsiString sub, AnsiString str)
{
	//1-based position of sub in str, or 0 if not found
	std::size_t found = str.find(sub,0);
	if (found!=std::string::npos)
		return int(found)+1;
	else
		return 0;
}


inline AnsiString booltostring(bool hue)
{
if (hue) return "true"; else return "false";
}

inline int pstrtoint(AnsiString num)
{
return ::atoi(num.c_str());
}


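inline AnsiString StringReplace(AnsiString str, AnsiString substr, AnsiString with)
{
	AnsiString s = str;
	if (substr.empty()) return s;

	//resume the search after each replacement, so a 'with' that contains 'substr' cannot loop forever
	std::size_t pos = s.find(substr);
	while (pos != std::string::npos)
	{
		s.replace(pos, substr.length(), with);
		pos = s.find(substr, pos + with.length());
	}
	return s;
}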

inline float pstrtofloat(AnsiString num)
{
	AnsiString temp = num;
	if (FLOAT_CONVERSION == tFCDot)     //English rule (floats are written as 10.25)
		temp = StringReplace(num,",",".");
	else                                //Polish rule (floats are written as 10,25)
		temp = StringReplace(num,".",",");

	return ::atof(temp.c_str());
}


inline int StrToInt(AnsiString num)
{
return ::atoi(num.c_str());
}

inline AnsiString stddelete(AnsiString str, int pos, int len) //this is for std::string only because i will call only Pos()-1 from it
{
	AnsiString s = str;
	 s.erase(pos, len);
	 return s;
}






extern AnsiString LowerCase(AnsiString str);
extern AnsiString UpperCase(AnsiString str);


inline int booltoint(bool k)
{
	if (k) return 1; else return 0;
}

inline bool stringtobool(AnsiString hue)
{
	if (LowerCase(hue) == "false") return false;
	if (LowerCase(hue) == "true") return true;
	if (LowerCase(hue) == "0") return false;
	if (LowerCase(hue) == "1") return true;
		return false;
}

inline AnsiString get_text_between2(AnsiString b1, AnsiString b2, AnsiString original_str)
{
	//if either marker is missing, return the string unchanged
	if (Pos(b1,original_str) == 0) return original_str;
	if (Pos(b2,original_str) == 0) return original_str;

	//drop everything up to and including b1, then cut from b2 onwards
	AnsiString temp1 = stddelete(original_str, 0, Pos(b1,original_str) + b1.length() - 1 );
	int k = Pos(b2,temp1) - 1;
	AnsiString temp2 = stddelete(temp1, k, temp1.length() - k);
	return temp2;
}



inline AnsiString get_before_char(AnsiString text, AnsiString sign, bool casesensitive)
{
AnsiString s = text;
AnsiString tmp;
if (casesensitive == false)
	tmp = stddelete(s,Pos(LowerCase(sign),LowerCase(s))-1,s.length()-Pos(LowerCase(sign),LowerCase(s))+1);
else
	tmp = stddelete(s,Pos(sign,s)-1,s.length()-Pos(sign,s)+1);


return tmp;
}


inline AnsiString get_after_char(AnsiString text, AnsiString sign, bool casesensitive)
{
AnsiString s = text;
AnsiString tmp;
if (casesensitive == false)
	tmp = stddelete(s,0,Pos(LowerCase(sign),LowerCase(s)));
	else
		tmp = stddelete(s,0,Pos(sign,s));


return tmp;
}



inline AnsiString get_after_char2(AnsiString text, AnsiString sign, bool casesensitive)
{
AnsiString s = text;
AnsiString tmp;
if (casesensitive == false)
	tmp = stddelete(s,0,Pos(LowerCase(sign),LowerCase(s))+sign.length()-1);
	else
		tmp = stddelete(s,0,Pos(sign,s)+sign.length()-1);


return tmp;
}




inline void get_all_in_nawias(AnsiString ainput, AnsiString aznak, int index)
{
	std::string input = ainput;
	std::string delimiter = aznak;

	w_nawiasie[index].clear();

	AnsiString pikok;

	std::size_t start = 0U;
	std::size_t end = input.find(delimiter);

	while (end != std::string::npos)
	{
		pikok = input.substr(start, end - start);
		w_nawiasie[index].push_back(pikok);

		start = end + delimiter.length();
		end = input.find(delimiter, start);
	}

	//remainder after the last delimiter
	pikok = input.substr(start);
	w_nawiasie[index].push_back(pikok);
}


inline AnsiString ExtractFileName(AnsiString pikok)
{
	get_all_in_nawias(pikok,"/",0);
	return w_nawiasie[0][ w_nawiasie[0].size()-1 ];
}


inline AnsiString ExtractFilePath(AnsiString pikok)
{
	get_all_in_nawias(pikok,"/",0);
	if (w_nawiasie[0].size() <= 0) return "";
	AnsiString ahue = "";
	for (int i=0; i < w_nawiasie[0].size()-1; i++)
	ahue = ahue + w_nawiasie[0][i]+"/";

	return ahue;
}


inline AnsiString get_filename_ext(AnsiString pikok)
{
	AnsiString filename = ExtractFileName(pikok);
	return get_after_char(filename, ".", false);
}


inline AnsiString change_filename_ext(AnsiString pikok, AnsiString next) //push .extension important with dot ex. ".tga"
{
AnsiString path 	= ExtractFilePath(pikok);
AnsiString filename = ExtractFileName(pikok);
AnsiString fname	= get_before_char(filename,".", false);

return path+fname+next;

}



struct TStringList
{
	int Count;
	std::vector<AnsiString> Strings;

	void Add(AnsiString text)
	{
		Strings.push_back(text);
		Count = Count + 1;
	}

	AnsiString GetText()
	{
	AnsiString res = "";
	int i;
	for (i=0; i < Count; i++)
		res = res + Strings[i] + "\n";
	return res;
	}

	void AddLineAtPos(int atline, int atpos)
	{
		//split line 'atline' into two lines at character position 'atpos'
		std::vector<AnsiString> tmp;
		for (int i=0; i < atline; i++)
			tmp.push_back(Strings[i]);

		if (Strings[atline].length() > 0)
		{
			AnsiString prefix = stddelete(Strings[atline], atpos, 100000);
			AnsiString suffix = stddelete(Strings[atline], 0, atpos);
			tmp.push_back(prefix);
			tmp.push_back(suffix);
		}
		else
		{
			tmp.push_back("");
			tmp.push_back("");
		}

		for (int i=atline+1; i < Count; i++)
			tmp.push_back(Strings[i]);

		Strings = tmp;
		Count = Count + 1;
	}

	void Clear()
	{
		Count = 0;
		Strings.clear();
	}

	TStringList()
	{
		Clear();
	}

	~TStringList()
	{
		Clear();
	}


void LoadFromFile(AnsiString fname)
{

	std::ifstream file(fname.c_str());
	if (file.good() == false) {
file.close();
return;
	}

	AnsiString str;

	Count = 0;
	if (Strings.size() > 0) Strings.clear();

	while (std::getline(file, str))
	{
		Strings.push_back(str);
		Count = Count + 1;
	}

	file.close();

	//now remove a trailing line-ending character (\r or \n) from each string
	for (int i=0; i < Count; i++)
	{
		AnsiString str = Strings[i];
		if (str.empty()) continue; //at() would throw on an empty line

		char lastChar = str.at( str.length() - 1 );
		if ( (lastChar == '\r') || (lastChar == '\n') )
		{
			str.erase(str.length()-1);
			Strings[i] = str;
		}
	}
}

//std::string p;
AnsiString pc;

void SaveToFile(AnsiString fname)
{
	pc = GetText();
	std::ofstream outfile (fname.c_str(),std::ofstream::binary);
	//write straight from the string, no temporary buffer needed
	outfile.write (pc.c_str(), pc.length());
	outfile.close();
}




};

#endif

// --- cpp file ---

#include "stringlist.h"
#include <algorithm>    // std::transform


tFloatConversionRule FLOAT_CONVERSION;

std::vector<AnsiString> w_nawiasie[10];

AnsiString LowerCase(AnsiString str)
{
	AnsiString s = str;
	std::transform(s.begin(), s.end(), s.begin(), ::tolower);
return s;
}

AnsiString UpperCase(AnsiString str)
{
	AnsiString s = str;
	std::transform(s.begin(), s.end(), s.begin(), ::toupper);
return s;
}

void init_string_float_conversion_rule()
{
	//format a known float and check which decimal separator FloatToStr produces
	AnsiString f;
	float p = 10.250f;
	f = FloatToStr(p);
	if (Pos(".",f) > 0)
		FLOAT_CONVERSION = tFCDot;
	else
		FLOAT_CONVERSION = tFCcomma;
}


[/spoiler]

 

 

 

Search for

inline void get_all_in_nawias(AnsiString ainput, AnsiString aznak, int index)

A little explanation:

ainput is your original string
aznak is the delimiter character (or string)
index selects which of the result arrays to fill

extern std::vector<AnsiString> w_nawiasie[10];

so you can pass 0..9.

The function works like this. If you call

get_all_in_nawias("www.gamedev.net", ".", 9);

the function will produce:

w_nawiasie[9][0] will be "www"
w_nawiasie[9][1] will be "gamedev"
w_nawiasie[9][2] will be "net"

Since it's a vector, you can use w_nawiasie[9].size() to find out how many pieces were cut.

Alternatively, use the get_after_char or get_before_char functions.
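
The find/substr loop in get_all_in_nawias can also return its result instead of writing into the global w_nawiasie array. A minimal sketch under that assumption; splitBy is a made-up name and not part of the header above.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical variant of get_all_in_nawias that returns the pieces by value.
// The delimiter may be longer than one character; an empty delimiter
// returns the input unsplit.
std::vector<std::string> splitBy(const std::string &input, const std::string &delimiter)
{
	std::vector<std::string> pieces;
	if (delimiter.empty())
	{
		pieces.push_back(input); // nothing to split on
		return pieces;
	}

	std::size_t start = 0;
	std::size_t end = input.find(delimiter);
	while (end != std::string::npos)
	{
		pieces.push_back(input.substr(start, end - start));
		start = end + delimiter.length();
		end = input.find(delimiter, start);
	}
	pieces.push_back(input.substr(start)); // remainder after the last delimiter
	return pieces;
}
```

As with get_all_in_nawias, a trailing delimiter produces a final empty piece, so splitting "MAX_PLIGHT:4;NORMALMAP:1;" at ";" gives three entries, the last one empty.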

Edited by WiredCat

