Catafriggm

Good Idea or Bad Idea?


So I had this idea to make a "really fast" strupr function. The idea: use MMX (8 bytes per register) or SSE2 (16 bytes per register) vector instructions to operate on 8 or 16 characters at a time, using branch-free conditional logic. (Of course, this would only work on strings whose length is known up front.) Here's the question: would this be significantly faster than the usual char-by-char method, given the extra instructions needed for the branch-free logic and the cost of unaligned memory accesses? (Just how likely is it that your strings will be aligned on 16-byte boundaries?)
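For reference, here is a minimal sketch of what the SSE2 variant could look like. It assumes plain ASCII data, an x86 target with SSE2, and a known length; the function name `ascii_upper_sse2` is made up for illustration. It uses unaligned loads and stores so the buffer does not need 16-byte alignment, and it finishes the last few bytes with a scalar loop. Whether it actually beats the char-by-char loop is something you would have to measure.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stddef.h>

/* Hypothetical helper: uppercase ASCII a-z in buf[0..len) in place.
   All other bytes (including bytes >= 0x80) are left untouched. */
static void ascii_upper_sse2(char *buf, size_t len)
{
    const __m128i below_a  = _mm_set1_epi8('a' - 1);
    const __m128i above_z  = _mm_set1_epi8('z' + 1);
    const __m128i case_bit = _mm_set1_epi8(0x20);
    size_t i = 0;

    /* 16 bytes per iteration, no branches on the character values. */
    for (; i + 16 <= len; i += 16) {
        __m128i v = _mm_loadu_si128((const __m128i *)(buf + i));
        /* Signed compares: bytes >= 0x80 are negative, so they fail the
           "> 'a' - 1" test and end up with a zero mask. */
        __m128i ge_a = _mm_cmpgt_epi8(v, below_a);
        __m128i le_z = _mm_cmplt_epi8(v, above_z);
        __m128i mask = _mm_and_si128(ge_a, le_z);
        /* Clear bit 5 only in the lanes that hold a lowercase letter. */
        v = _mm_xor_si128(v, _mm_and_si128(mask, case_bit));
        _mm_storeu_si128((__m128i *)(buf + i), v);
    }

    /* Scalar tail for the remaining 0..15 bytes, also branch-free per byte. */
    for (; i < len; i++) {
        unsigned char c = (unsigned char)buf[i];
        unsigned char is_lower = (unsigned char)(c - 'a') < 26;
        buf[i] = (char)(c ^ (is_lower << 5));
    }
}
```

The per-block cost is one load, two compares, two ANDs, one XOR and one store, which is the "extra instructions" overhead the question is about.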

Is turning a string to uppercase normally something you need to do very rapidly?

Not to be overly pragmatic, but even though it might be a neat, quick implementation, it would have some drawbacks, and the speed you'd be gaining wouldn't be worth much.

Case-insensitive hash algorithms were what I had in mind. The hashing itself doesn't lend itself to parallelism (usually), but maybe the conversion can.
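To make the use case concrete, here is a rough sketch of a case-insensitive hash that folds ASCII case one byte at a time inside the hash loop. FNV-1a is only a stand-in chosen for brevity, not something named in this thread, and `fnv1a_upper` is a hypothetical name. The serial dependency on `h` is why the hashing step itself resists vectorization, while the case fold is independent per byte.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: 32-bit FNV-1a over the ASCII-uppercased bytes of s.
   Assumes ASCII-only case folding; bytes outside a-z are hashed unchanged. */
static uint32_t fnv1a_upper(const char *s, size_t len)
{
    uint32_t h = 2166136261u;              /* FNV-1a 32-bit offset basis */
    for (size_t i = 0; i < len; i++) {
        unsigned char c = (unsigned char)s[i];
        /* Branch-free fold: clear bit 5 only when c is in 'a'..'z'. */
        c ^= (unsigned char)(((unsigned char)(c - 'a') < 26) << 5);
        h ^= c;
        h *= 16777619u;                    /* FNV 32-bit prime */
    }
    return h;
}
```

With this, `fnv1a_upper("Hello", 5)` and `fnv1a_upper("HELLO", 5)` return the same value, so keys that differ only in ASCII case land in the same bucket.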

In the "real world", text data gets internationalized and has to use Unicode, maybe even characters from beyond the Basic Multilingual Plane. Doing things with "branch-free logic" then means, at the very least, encoding in UCS-4 (UTF-16 won't cut it because of surrogate pairs) and a lookup table covering the whole Unicode code space: U+0000 to U+10FFFF, about 1.1 million entries, which takes 21 bits to index. (Actually, for the time being I think you can get away with less, since only something like 100k values have been assigned, but I don't think they are the lowest 100k either.) After all, you need to handle lots of weird cases.

But actually, really proper uppercasing can change the length of the text string, so a fixed-width, in-place conversion can't do it properly at all. For example, the German 'ß' (sharp s, which looks a lot like a Greek beta) is properly uppercased as "SS", two characters. (And woe to the author who uses the look-alike character that's actually intended for writing Greek!)
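A toy sketch of the length problem, assuming UCS-4 input and handling only ASCII a-z plus U+00DF ('ß'); a real implementation would need the full Unicode case mapping data, and `upper_utf32_toy` is a made-up name. The point is just that the output needs its own buffer, because one input code point can become two.

```c
#include <stddef.h>
#include <stdint.h>

/* Toy uppercasing over UCS-4 code points: handles ASCII a-z and U+00DF only.
   out must have room for 2 * n code points. Returns the count written. */
static size_t upper_utf32_toy(const uint32_t *in, size_t n, uint32_t *out)
{
    size_t w = 0;
    for (size_t i = 0; i < n; i++) {
        uint32_t c = in[i];
        if (c == 0x00DF) {                 /* sharp s expands to "SS" */
            out[w++] = 'S';
            out[w++] = 'S';
        } else if (c >= 'a' && c <= 'z') { /* plain ASCII letter */
            out[w++] = c - 0x20;
        } else {                           /* everything else passes through */
            out[w++] = c;
        }
    }
    return w;
}
```

Feeding it the six code points of "Straße" produces the seven code points of "STRASSE", which is exactly the case a same-length SIMD pass cannot express.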
