# optimizing my code (memory usage, postfix and prefix notation, "continue")


Hello all

I don't usually write posts here, as I can find most information in the books I have, but I really have an interesting question here.

I have the following code (a tiny part of a bigger function, of course).

line is a string, ll is an int, j is an int and splitchar is a char.
The object of this piece of code is to increase j by 1 as long as the character of line at index j equals splitchar.
Before the inner while loop runs, the character of line at index j always equals splitchar, guaranteed. So j will be increased by at least one.

    //some code
    int ll = line.length();
    int j(0);
    while (j < ll)
    {
        //some code that uses and changes j, but nothing very important. It doesn't influence the following code or the subject of this topic.
        while (j < ll - 1 && line[j] == splitchar)
            ++j;
        //some code
    }


I changed the inner while loop to

while (j++ < ll - 1 && line[j] == splitchar)
continue;


or

while (j < ll - 1 && line[++j] == splitchar)
continue; 

which are pretty much the same. Trust me, I've tested everything in detail ;)
Note: postfix or prefix notation is very important here!
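
To make the difference between the three spellings concrete, here is a small sketch that wraps each inner loop in a helper and returns the final value of j. The helper names (`skip_original`, `skip_postfix`, `skip_prefix`) are made up for illustration and are not part of the code in this thread.

```cpp
#include <string>

// Hypothetical helpers: each runs one of the three inner loops
// and returns the value of j after the loop has finished.

// Original: test line[j] first, then advance.
int skip_original(const std::string& line, int j, char splitchar)
{
    int ll = static_cast<int>(line.length());
    while (j < ll - 1 && line[j] == splitchar)
        ++j;
    return j;
}

// Postfix: j is incremented every time the condition is evaluated,
// including the final evaluation where the bounds check fails.
int skip_postfix(const std::string& line, int j, char splitchar)
{
    int ll = static_cast<int>(line.length());
    while (j++ < ll - 1 && line[j] == splitchar)
        continue;
    return j;
}

// Prefix: j is incremented only when the bounds check passes,
// and the character tested is the one after the old j.
int skip_prefix(const std::string& line, int j, char splitchar)
{
    int ll = static_cast<int>(line.length());
    while (j < ll - 1 && line[++j] == splitchar)
        continue;
    return j;
}
```

With line = "a..b" and j = 1, all three return 3. With line = "a.." and j = 1, the original and prefix forms return 2 while the postfix form returns 3. Starting on a non-separator ("ab", j = 0), the original leaves j at 0 and the other two return 1. So the variants only agree under the stated precondition that line[j] equals splitchar, and even then only when the run of separators does not reach the end of the string.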

Now, my questions are:

1. Does it matter if I write "continue", or empty braces ("{}") in the last two examples, and how does this affect the performance?
2. Which one of these three would have the best performance?
3. Is there an even better possibility?

Performance is not very important in this program, but I really want to know the little details, because it WILL matter in future projects.

Nick

Edited by Nick C.

##### Share on other sites
Edit: Ignore this post, I didn't think it through properly and SiCrane already corrected me.


while (j++ < ll - 1 && line[j] == splitchar)
continue;

or


while (j < ll - 1 && line[++j] == splitchar)
continue;


Both versions are not well-defined C++. The value of j is undefined (see this Wikipedia page). A colleague of mine recently ran into problems because they were working on code that relied on that but then had to change compiler version.

Edited by BitMaster

##### Share on other sites
Really? Well, you keep learning. Still, I would agree with SiCrane and avoid the loops even if they are well defined.

##### Share on other sites

Oh, those quirky optimization attempts could well go wrong, because it can be undefined behavior when a variable that's used twice inside a statement is incremented. The last one is always different from the original, as j would be incremented before accessing the array, which it was not in the original.

Also abusing the continue statement in this way when there is no jump needed looks ugly and a few years ago some compilers without good optimization could even have added a useless jump instruction.

Edit: slight clarification for language lawyers

Edited by wintertime

##### Share on other sites

Nick C., on 27 Mar 2013 - 07:54, said:

Performance is not very important in this program, but I really want to know the little details, because it WILL matter in future projects.

It really won't.

QFE. It won't.

It'd be better[note] to use the std::string member function find_first_of like so:

[source='cpp']size_t pos = line.find_first_of(splitchar);

if(pos != string::npos)
{
}[/source]

[Note] Err. Sorry, reading comprehension fail. Nonetheless the above is good advice for finding the first occurrence of splitchar that I'll build on in a bit, so I'll leave it be.

It'd be better[really, this time] to use the std::string member function find_first_not_of like so (if you know that line begins with splitchar):

[source='cpp']size_t pos = line.find_first_not_of(splitchar);

if(pos != string::npos)
{
}[/source]

If you don't know that line begins with splitchar, then you can combine these two member functions like so:

[source='cpp']size_t pos = line.find_first_not_of(splitchar, line.find_first_of(splitchar));

if(pos != string::npos)
{
}[/source]
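
As a rough illustration of how the combined call behaves, here is a minimal sketch. The wrapper name `first_after_run` is hypothetical; only `find_first_of`, `find_first_not_of` and `npos` come from std::string itself.

```cpp
#include <string>

// Hypothetical helper: index of the first character that follows the
// first run of splitchar, or std::string::npos if there is none.
std::string::size_type first_after_run(const std::string& line, char splitchar)
{
    // Position of the first splitchar, or npos if the line has none.
    std::string::size_type run_start = line.find_first_of(splitchar);

    // Searching from npos (or from any position past the end) yields npos,
    // so the "no splitchar at all" case needs no separate check here.
    return line.find_first_not_of(splitchar, run_start);
}
```

For "ab..cd" and '.', the first separator is at index 2 and the helper returns 4. For "abcd" or "ab..", there is nothing after a run of separators, so it returns npos.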

Edited by Ravyne

##### Share on other sites

while (j < ll - 1 && line[j] == splitchar)
++j;



while (j < ll - 1 && line[++j] == splitchar)
continue; 

Even though I got -2 for pointing out how this is a failed optimization attempt, that code is not equivalent and you should stop trying to microoptimize such irrelevant things when you are likely to introduce bugs.

Edited by wintertime

##### Share on other sites

Even though I got -2 for pointing out how this is a failed optimization attempt, that code is not equivalent and you should stop trying to microoptimize such irrelevant things when you are likely to introduce bugs.

I don't think you were downvoted for suggesting the optimization was useless and an irrelevant micro-optimization. You were downvoted because you said it's invoking undefined behavior. It is well defined. I made the same mistake more than four hours before you, because I didn't think it through properly, and I had already been corrected.

##### Share on other sites

Also abusing the continue statement in this way when there is no jump needed looks ugly and a few years ago some compilers without good optimization could even have added a useless jump instruction.

It's hardly abusing continue. The fact that a compiler might have mishandled it in the past is not an indication of some terrible practice going on. It's probably good to avoid using looping structures solely for side effects, as someone else pointed out, and there are better tools, as I pointed out myself, but if you did it anyway, using continue is probably a better option than empty braces or a semicolon. At least it stands out and says precisely what the intent was. Empty braces might invite the thought that the programmer forgot to fill in the loop body, and an empty statement (a single semicolon) is first of all very easy to miss, and causes really strange errors if you should ever forget or mistakenly delete it.

##### Share on other sites

Empty braces might invite the thought that the programmer forgot to fill in the loop body, and an empty statement (a single semicolon) is first of all very easy to miss, and causes really strange errors if you should ever forget or mistakenly delete it.

I pretty much never find myself needing a loop without a body, but when I do, I put a single semicolon indented on the next line. It stands out pretty well and doesn't look like you forgot to write a body.
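
For reference, the three spellings being compared look like this side by side. This is only a sketch with made-up helper names. Each helper counts the leading occurrences of a character using a loop whose body does nothing, and as far as the language is concerned all three mean the same thing, so I would expect an optimizing compiler to treat them identically.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helpers: all three count leading occurrences of c in s.
// Only the spelling of the empty loop body differs.

std::size_t leading_with_continue(const std::string& s, char c)
{
    std::size_t i = 0;
    for (; i < s.size() && s[i] == c; ++i)
        continue;   // explicit "nothing to do here"
    return i;
}

std::size_t leading_with_braces(const std::string& s, char c)
{
    std::size_t i = 0;
    for (; i < s.size() && s[i] == c; ++i)
    {}              // empty compound statement
    return i;
}

std::size_t leading_with_semicolon(const std::string& s, char c)
{
    std::size_t i = 0;
    for (; i < s.size() && s[i] == c; ++i)
        ;           // empty statement on its own indented line
    return i;
}
```

Which one to pick is a readability question rather than a performance one.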

##### Share on other sites

Okay, thanks everyone for the replies (and sorry for my late reply).
It seems that I should have included my entire function to prevent confusion here, as some people don't really know what I want to achieve here.

So: the object of this function was to split a string on a character. However, the function that Ravyne showed here (yes, I know there are even more possible solutions) doesn't really do what I want, as I have a few more requirements xd. That function was actually one of the first I came up with.
Let me explain a few other things.
- If the string doesn't contain that character, the function stores the entire string in a vector (size 1)
- If the string is empty, the function stores an empty string ("") in a vector (also size 1)
- If the character occurs as the first character, an empty string will be stored as the first element in the vector
- If there are multiple duplicates of that character in a row, it ignores all of those.

An example:
vector<string> splittedString;
char splitChar = '.';
string stringToSplit = ".test...12..3.";

After the instruction
SplitLine(stringToSplit, splittedString, splitChar);
splittedString contains the strings
0.  (empty)
1. test
2. 12
3. 3
4.  (empty)

Not trying to argue, just saying how it is... Like wintertime said, those two loops don't exactly do the same thing, but they do if you actually see the entire function. As I said before, I've tested all the possibilities comprehensively. So, without further ado, my entire function. You can use all my possibilities, they all do the same thing.



    inline void DaeToAniConverter::SplitLine(string line, vector<string>& splittedline, char splitchar)
    {
        //vector with string segments should contain at least one string
        if (line == "")
        {
            splittedline.push_back("");
            return;
        }
        int ll = line.length();
        splittedline.clear();
        int prevIndex(0);
        int j(0);
        while (j < ll)
        {
            int tempJ = line.find_first_of(splitchar, j);
            j = tempJ >= 0 ? tempJ : ll;
            splittedline.push_back(line.substr(prevIndex, (j - prevIndex)));
            //skip multiple equal characters
            while (j < ll - 1 && line[++j] == splitchar)
            {}
            prevIndex = j;
        }
    }


As you can see here, the main loop is only executed as many times as there will be string segments (in my last example, 5 times), so not character per character. I actually do use line.find_first_of.
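
For comparison, here is a minimal sketch of a split that follows the four rules listed above, built on `find` and `find_first_not_of` instead of manual index arithmetic. It is not the code from the converter: `split_collapsing` is a hypothetical free function, and returning the vector by value rather than filling an output parameter is an assumption made for the example. It compares against `std::string::npos` directly instead of storing the search result in an int, and it handles a trailing separator explicitly so the loop always terminates.

```cpp
#include <string>
#include <vector>

// Hypothetical sketch: split line on splitchar, treating a run of
// consecutive separators as a single separator. A leading or trailing
// separator produces an empty first or last segment.
std::vector<std::string> split_collapsing(const std::string& line, char splitchar)
{
    std::vector<std::string> parts;
    std::string::size_type start = 0;

    while (true)
    {
        std::string::size_type sep = line.find(splitchar, start);
        if (sep == std::string::npos)
        {
            // No more separators: the rest of the line is the last segment.
            // This also covers the empty string and a line with no separator.
            parts.push_back(line.substr(start));
            break;
        }

        parts.push_back(line.substr(start, sep - start));

        // Skip the whole run of separators in one call.
        start = line.find_first_not_of(splitchar, sep);
        if (start == std::string::npos)
        {
            // The line ends with separators: store the trailing empty segment.
            parts.push_back("");
            break;
        }
    }
    return parts;
}
```

For the example input ".test...12..3." split on '.', this gives the five segments listed earlier: empty, test, 12, 3, empty.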

And about that performance issue, you are all right. It really doesn't make a big difference, and a good structure is sometimes more important than performance.
Thanks to everyone else for your constructive remarks.
Again (but now you know what I want to achieve here), if there is a better way to achieve this functionality, I would gladly hear about it.

I'm just a 19 year old student, programming in my free time. Everyone makes mistakes, right?

gz
Nick

Edit: This may be important. This function is part of a program that reads a file, converts it to a custom format, and writes it to a new file. I'm talking about pretty big files, so this function is usually executed at least a hundred times (usually a few hundred). Right now it only takes 2-5 seconds for the program to do its thing, so that's not too bad.

Edited by Nick C.

##### Share on other sites

Let me explain a few other things.

1. If the string doesn't contain that character, the function stores the entire string in a vector (size 1)
2. If the string is empty, the function stores an empty string ("") in a vector (also size 1)
3. If the character occurs as first character an empty string will be stored as first element in the vector
4. If there are multiple duplicates of that character following, it ignores all those.

The second rule is redundant. An empty string won't contain the character, and therefore rule 1 applies, and since the string is empty, rule 1 will store an empty string.

I get why you want to eat multiple occurring characters, but I don't get why you want an empty string if it's the first character (and last, by your example?)
