

King Mir

Member Since 11 Jun 2006
Offline Last Active Jun 08 2015 06:56 AM

Posts I've Made

In Topic: Write ToLower As Efficient, Portable As Possible

19 April 2015 - 08:39 PM

 

I imagine the most performance-relevant use of toLower is a case-insensitive compare over a large container or database of records. That could be noticeable, even if each string is short.

I doubt it would be noticeable in that case. Once you start talking about a large number of different strings, you get into the realm where memory dominates computation speeds. You might shave off a few clock cycles in your computation, but your overall processing time is still going to look pretty much identical to memory used divided by memory bandwidth.

 

That's a good point, you're probably right.


In Topic: Write ToLower As Efficient, Portable As Possible

19 April 2015 - 05:11 PM

 

At this level it really becomes a matter of writing your code as plainly as possible, so that the compiler can optimize it.
 
What you really want your compiler to do here is inline toLower and vectorize it, so that for large strings it can process up to 32 bytes at once. (This is like packing 4 characters into an int and having a special tolower function for that.) For short strings you may not want to vectorize (pack into large chunks), but they are a special case, because the string is then stored inside the string object itself.
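As a minimal sketch of what "plain" could look like here, assuming ASCII-only input: the helper name `toLowerAscii` and both signatures are made up for illustration, not taken from the thread. Whether a given compiler actually vectorizes this loop depends on the compiler version and flags.

```cpp
#include <string>

// Hypothetical helper: lower-cases one ASCII character and leaves every
// other byte untouched. Small and side-effect free, so it is easy to inline.
inline char toLowerAscii(char c) {
    return (c >= 'A' && c <= 'Z') ? static_cast<char>(c + ('a' - 'A')) : c;
}

// Takes the string by value and returns it, so the loop only ever touches
// one buffer. The loop has a fixed trip count and no early exit, which
// leaves the optimizer free to try auto-vectorizing it.
inline std::string toLower(std::string s) {
    for (char& c : s) {
        c = toLowerAscii(c);
    }
    return s;
}
```

Usage would be something like `std::string lower = toLower(name);`, with the by-value parameter moved from when the caller passes a temporary.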
 
If you were a library writer with a particular string implementation and compiler locked in, you could do better. You could hand-write the vectorized toLower and take advantage of the fact that you can write past the end of the string, up to its capacity, which would always be a multiple of the vector width (the vector width is the size of the integer you're packing your string into for your special tolower operation).
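For illustration only, here is a rough sketch of the "pack characters into an integer" idea using a 64-bit word (8 characters at a time) instead of real vector registers. The names `toLowerPacked8` and `toLowerInPlace` are invented for this example. Since ordinary user code cannot assume anything about the string's capacity, this version handles the leftover tail one byte at a time rather than writing past the end.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>

// Lower-cases 8 ASCII characters packed into one 64-bit word.
// Bytes >= 0x80 and non-letters pass through unchanged.
inline std::uint64_t toLowerPacked8(std::uint64_t w) {
    const std::uint64_t ones = 0x0101010101010101ULL;
    const std::uint64_t low7 = w & (0x7F * ones);           // drop the top bit of each byte
    const std::uint64_t geA = low7 + (0x80 - 'A') * ones;   // bit 7 set where byte >= 'A'
    const std::uint64_t gtZ = low7 + (0x7F - 'Z') * ones;   // bit 7 set where byte >  'Z'
    const std::uint64_t ascii = ~w & (0x80 * ones);         // bit 7 set where byte <  0x80
    const std::uint64_t upper = ascii & (geA ^ gtZ);        // bit 7 set only for 'A'..'Z'
    return w | (upper >> 2);                                // 0x80 >> 2 == 0x20, the case bit
}

inline void toLowerInPlace(std::string& s) {
    const std::size_t n = s.size();
    std::size_t i = 0;
    // Whole 8-byte chunks: memcpy avoids alignment and aliasing problems.
    for (; i + 8 <= n; i += 8) {
        std::uint64_t w;
        std::memcpy(&w, &s[i], 8);
        w = toLowerPacked8(w);
        std::memcpy(&s[i], &w, 8);
    }
    // Remaining tail, one character at a time.
    for (; i < n; ++i) {
        const char c = s[i];
        if (c >= 'A' && c <= 'Z') {
            s[i] = static_cast<char>(c + ('a' - 'A'));
        }
    }
}
```

A library writer who controls the allocation could drop the tail loop by padding the capacity to a multiple of 8, which is the point made above; that part is deliberately left out here because it is not something portable user code can rely on.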
 

Move construction and such will eliminate the temporary. Returning vs passing in a reference really provides no difference in this case.

I think it could make a big difference, because passing in a reference introduces the possibility that the input and output alias. So passing in a reference could actually be slower, unless you explicitly tell the compiler that the fields cannot alias, with the restrict keyword (C99) or a similar language extension such as __restrict in C++.

This is a great idea if you want to throw all portability out of the window. I can imagine vectorizing it through a 64 Bit integer processing 8 bytes of a string per loop iteration. If you're processing huge strings ("ropes") this has massive benefit but for most uses will you notice the difference of nanoseconds when processing the average length of say 32 byte ascii string?

 

The idea is not to throw portability out the window, but to use an auto-vectorizing compiler. Clang, GCC, and MSVC will all auto-vectorize, although I cannot say how well they will do in this case.

 

The bottom line is your compiler can optimize better than you can imagine, so focus on writing clear code, and let the compiler handle the micro-optimization.

 

 

Re:"but for most uses will you notice the difference of nanoseconds when processing the average length of say 32 byte ascii string?"

I imagine the most performance-relevant use of toLower is a case-insensitive compare over a large container or database of records. That could be noticeable, even if each string is short. My guess is that even for short strings, vectorization would make a difference if you apply toLower to the capacity instead of up to the size of the string (but admittedly, that wouldn't follow the above maxim).
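For reference, a case-insensitive compare of the kind I have in mind might look roughly like this. It is an ASCII-only sketch, and `toLowerAscii` and `equalsIgnoreCase` are names invented for the example.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: lower-cases one ASCII character.
inline char toLowerAscii(char c) {
    return (c >= 'A' && c <= 'Z') ? static_cast<char>(c + ('a' - 'A')) : c;
}

// Compares two strings while ignoring ASCII case. No temporary lower-cased
// copies are made; each pair of characters is folded on the fly.
inline bool equalsIgnoreCase(const std::string& a, const std::string& b) {
    if (a.size() != b.size()) {
        return false;
    }
    for (std::size_t i = 0; i < a.size(); ++i) {
        if (toLowerAscii(a[i]) != toLowerAscii(b[i])) {
            return false;  // early exit on the first mismatch
        }
    }
    return true;
}
```

The early exit makes this loop harder to vectorize, but when scanning many records most comparisons fail within the first few characters, so it is not obvious which form wins without measuring.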


In Topic: Write ToLower As Efficient, Portable As Possible

19 April 2015 - 04:09 PM

At this level it really becomes a matter of writing your code as plainly as possible, so that the compiler can optimize it.

 

What you really want your compiler to do here is inline toLower and vectorize it, so that for large strings it can process up to 32 bytes at once. (This is like packing 4 characters into an int and having a special tolower function for that.) For short strings you may not want to vectorize (pack into large chunks), but they are a special case, because the string is then stored inside the string object itself.

 

If you were a library writer with a particular string implementation and compiler locked in, you could do better. You could hand-write the vectorized toLower and take advantage of the fact that you can write past the end of the string, up to its capacity, which would always be a multiple of the vector width (the vector width is the size of the integer you're packing your string into for your special tolower operation).

 

Move construction and such will eliminate the temporary. Returning vs passing in a reference really provides no difference in this case.

I think it could make a big difference, because passing in a reference introduces the possibility that the input and output alias. So passing in a reference could actually be slower, unless you explicitly tell the compiler that the fields cannot alias, with the restrict keyword (C99) or a similar language extension such as __restrict in C++.


In Topic: Array realloc size

19 April 2015 - 02:25 PM

The question is moot, because you should use a library to implement such operations instead of writing them yourself, or if you do write them yourself, you should test extensively instead of asking on a forum.


In Topic: Template or Macro

19 April 2015 - 01:53 PM

Prefer templates over macros when both are usable, because templates offer better type safety and interact more intuitively with other language features. For example, safe delete could be written as a template in such a way that you get a compiler error when you try to delete an int, but you can't do that with a macro.
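A minimal sketch of such a template, assuming "safe delete" means "delete the object and null the pointer"; the name `safeDelete` is just for illustration. The `static_assert` on `sizeof(T)` is the same trick `boost::checked_delete` uses to reject incomplete types.

```cpp
// Hypothetical helper: deletes the pointee and resets the pointer.
// The parameter type T*& only matches pointer lvalues, so a call such as
// safeDelete(someInt) fails to compile at the call site.
template <typename T>
void safeDelete(T*& p) {
    // Deleting an incomplete type is undefined behavior; sizeof forces
    // the type to be complete here and turns that mistake into an error.
    static_assert(sizeof(T) > 0, "safeDelete requires a complete type");
    delete p;
    p = nullptr;
}
```

Calling it twice on the same pointer is harmless, because the second call deletes a null pointer, which is a no-op.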

 

I view it as a hierarchy of tools for code reuse. It goes roughly as follows:

1) loops

2) functions

3) templates

4) macros

 

You use a loop when you need to do the same operation for different values in sequence.

You use a function when you need to do the same operation for different values out of sequence.

You use a template function when you need to do the same operation for different values of different types.

You use a macro when you need to do the same operation, but there are differences in structure.

 

(Polymorphic classes could also fit in this hierarchy, between functions and templates.)

 

There are other uses of macros too where templates obviously cannot be used.
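To give one example of each of the last two rungs, here is a small sketch; `maxOf`, `describeImpl`, and `DESCRIBE` are all made-up names. The macro relies on the preprocessor's `#` stringizing operator, which gives access to the source text of an expression, something a template cannot see.

```cpp
#include <sstream>
#include <string>

// Template: the same operation for values of different types.
template <typename T>
T maxOf(T a, T b) {
    return (a < b) ? b : a;
}

// Helper that does the real work for the macro below.
template <typename T>
std::string describeImpl(const char* text, const T& value) {
    std::ostringstream out;
    out << text << " = " << value;
    return out.str();
}

// Macro: #expr turns the argument's source text into a string literal.
#define DESCRIBE(expr) describeImpl(#expr, (expr))
```

With this, `DESCRIBE(1 + 2)` yields the string `1 + 2 = 3`, while `maxOf` works unchanged for ints, doubles, or std::string.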

