• Advertisement
Sign in to follow this  

Unity Fixing 'strlen' Code [Resolved - Now Discussion Stuff]

This topic is 4681 days old which is more than the 365 day threshold we allow for new replies. Please post a new topic.

If you intended to correct an error in the post then please contact us.

Recommended Posts

So I've been browsing though some of the older posts in the forums and I cam across this from Magmai Kai Holmlor:
  
  int strlen(const char* cszStr)
  {
	DWORD* p = (DWORD*) cszStr;
	DWORD k,kk;
	
	while(DWORD(p)&3)
		{
		if(*(char*)(p)==0)
			return (char*)(p)-cszStr;
		p = (DWORD*) (((char*)(p)) + 1);
		}
	
	do
		{
		k = *p;
		kk = k + 0x7efefeff;
		k ^= -1;
		k ^= kk;
		p++;
		} while(!(k&0x81010100));
	
	k = *(--p);
	if(!(k&0x000000ff))
		return (char*)(p)-cszStr;
	
	//if(!(k&0x0000ffff)) Thanks Kippesoep! [smile]
        if(!(k&0x0000ff00))
		return (char*)(p)-cszStr+1;
	
	if(!(k&0x00ff0000))
		return (char*)(p)-cszStr+2;
	
	if(!(k&0xff000000))
		return (char*)(p)-cszStr+3;
	
	return (char*)(p)-cszStr;
  }
  


Looked fairly interesting so I gave it a try. Worked good for the most part, except when the string sent in is one character, as seen in this demo program:
int main( int argc, char* argv[] )
{
	char* str = "";
	// Returns 0 (as expected)
	printf("Length of %s: %i\n", str, __strlen__(str) );

	// Returns 2 (not as expected)
	str = "a";
	printf("Length of %s: %i\n", str, __strlen__(str) );

	// Returns 2 (as expected)
	str = "ab";
	printf("Length of %s: %i\n", str, __strlen__(str) );

	// Returns 100 (as expected)
	str = "1111111111__________1111111111__________1111111111__________1111111111__________1111111111__________";
	printf("Length of %s: %i\n", str, __strlen__(str) );

        // Returns 3 (as expected)
	char temp[256] = { '1', '2', '3', '\0' };
	printf("Length of %s: %i\n", temp, __strlen__(temp) );
}



So does anyone have any ideas to how it can work for a character of length 1? In particular, is this algorithm 'acceptable' to find the length of a string? This is just for knowledge and understanding, so please no "use strlen()" or "use std::string.size()" [wink]. Thanks! [Edited by - Drew_Benton on April 26, 2005 2:34:33 AM]

Share this post


Link to post
Share on other sites
Advertisement
Is that really that much more efficient than just doing:


int strlen(char* cszStr)
{
int ctr=0;
while (*cszStr++)
++ctr;
return ctr;
}



Especially considering that most calls to strlen aren't really the cause for a slowdown in a game :P. You could even do something like this if you're concerned about comparing chars vs. integers:


int strlen(char* cszStr)
{
int ctr=0;
unsigned int *ptrStr = (unsigned int*)cszStr;
do
{
if (!(ptrStr&0x000000ff))
break;
++ctr;
if (!(ptrStr&0x0000ff00))
break;
++ctr;
if (!(ptrStr&0x00ff0000))
break;
++ctr;
if (!(ptrStr&0xff000000))
break;
++ctr;
}
return ctr;
}



I don't really think that'll be any more efficient though :P.

Share this post


Link to post
Share on other sites
Quote:
Original post by Kippesoep
The line

if(!(k&0x0000ffff))

should be

if(!(k&0x0000ff00))


[smile] Thanks! That was it. I hope Magmai sees this...

Oh, Ready4Dis, I was only interested in this method because I've never seen it before [wink]. I have yet to play around with other techniques that are the most efficient or fast. I just wanted to see if this does indeed work, as it does thanks to the fix Kippesoep found.

I will be doing further 'research' just for fun. I'll be glad to post my findings on how good this is, but so far, it looks excelent! it's definitly not O( n ) as is when you go though looking for a NULL [grin].

Share this post


Link to post
Share on other sites
Sheeeshhh, optimized strlen!!! Use Pascal type strings then, where you store lenght in 1st n bytes...

Share this post


Link to post
Share on other sites
unsigned int strlen(const char* str) 
{
const char *s;

for (s = str; *s; ++s)
;

return (s - str);
}


Works well for me.

Share this post


Link to post
Share on other sites
Optimizing strlen is just silly. Needing to go through the entire string just to see how long it is will always be slow no matter how clever you are. Real string implementations store the string's length or a pointer to the end of the string.

Share this post


Link to post
Share on other sites
Quote:
Original post by _Madman_
Sheeeshhh, optimized strlen!!! Use Pascal type strings then, where you store lenght in 1st n bytes...


I have a shortstring template for that.

Share this post


Link to post
Share on other sites
Guest Anonymous Poster
Since changing strings can be expensive (like appending when you don't have a string length ;) )why not just do something like struct { int len, char* str } like other people have said.

Share this post


Link to post
Share on other sites
Thats a really nifty method magmai.

Any chance you could tell us how it actually works?

From,
Nice coder

Share this post


Link to post
Share on other sites
Guest Anonymous Poster
Looks dependent on byte ordering, and the ability to read unaligned data, although just the if(*(char*)(p)==0), it looks like it gets properly aligned after that.

Let's see how fast they are (using std::strlen for reference):


Magmai_strlen: 11390 ms 95.92%
Ready4Dis_strlen: 29234 ms 246.18%
DerAnged_strlen: 28078 ms 236.45%
std::strlen: 11875 ms 100%


Tested with 10x 1-65536 character strings of various alignments.

[Edited by - Magmai Kai Holmlor on April 26, 2005 12:02:36 PM]

Share this post


Link to post
Share on other sites
And I mysteriously got logged out somehow. And the code tags that were supposed to make that all nice and pretty vanished from existance.

Share this post


Link to post
Share on other sites
Quote:
Original post by smart_idiot
Optimizing strlen is just silly. Needing to go through the entire string just to see how long it is will always be slow no matter how clever you are. Real string implementations store the string's length or a pointer to the end of the string.
Yes I agree. This is talked about a lot in some article I read a little while ago.

Share this post


Link to post
Share on other sites
Just to play the Devil's advocado... the true way to fix strlen is to remove it completely :-P (although I understand some poor SOBs have to work with legacy code, or just don't feel like rewriting it all...)

To hijack this thread, bastard that I am:

template < typename char_t >
class literal
{
public:
typedef char_t char_type;
typedef size_t size_type;
private:
size_type length;
const char_type * data;
public:
literal( const literal< char_type > & copy )
: length( copy.length )
, data( copy.data )
{
}
template < size_type array_size >
literal( const char_type (& array) [ array_size ] )
: length( array_size )
, data( array )
{
}

operator const char_type * ( void ) const
{
return data;
}

template < typename char_type >
friend std::basic_string< char_type > & operator=( std::basic_string< char_type > & lhs , const literal< char_type > & rhs )
{
string value( rhs.data , rhs.data + rhs.length );
lhs.swap( value );
return lhs;
}
};

template < typename char_type , size_t length >
literal< char_type > make_literal( const char_type (& l)[ length ] )
{
return literal< char_type >( l );
}

#define DEFINE_THAT_EVIL_UNDERSCORE_THINGY "pretty please with suger on top"

#ifdef DEFINE_THAT_EVIL_UNDERSCORE_THINGY
#define _( x ) make_literal( x )
#endif //def DEFINE_THAT_EVIL_UNDERSCORE_THINGY

int main ( int argc , char ** argv )
{
string a_string;
a_string = _("I like pie, oh yes I do, lots of pie, for me and you");
}


Very basic, but you could extend it as desired.. :).

Share this post


Link to post
Share on other sites
Quote:
Original post by Anonymous Poster
Magmai_strlen: 11390 ms 95.92%
Ready4Dis_strlen: 29234 ms 246.18%
DerAnged_strlen: 28078 ms 236.45%
std::strlen: 11875 ms 100%


I had done some primitive testing earlier:

#include <windows.h>
#include <stdio.h>

int __strlen__(const char* cszStr)
{
DWORD* p = (DWORD*) cszStr;
DWORD k,kk;

while(DWORD(p)&3)
{
if(*(char*)(p)==0)
return (char*)(p)-cszStr;
p = (DWORD*) (((char*)(p)) + 1);
}
do
{
k = *p;
kk = k + 0x7efefeff;
k ^= -1;
k ^= kk;
p++;
}
while(!(k&0x81010100));

k = *(--p);
if(!(k&0x000000ff))
return (char*)(p)-cszStr;

if(!(k&0x0000ff00))
return (char*)(p)-cszStr+1;

if(!(k&0x00ff0000))
return (char*)(p)-cszStr+2;

if(!(k&0xff000000))
return (char*)(p)-cszStr+3;

return (char*)(p)-cszStr;
}

unsigned int strlen2(const char* str)
{
const char *s;

for (s = str; *s; ++s)
;

return (s - str);
}

#define SIZE 1073741823

int main( int argc, char* argv[] )
{
char* str = new char[SIZE];

for( int x=0;x<SIZE;x++)
{
str[x] = rand() % 26 + 65;
}
str[SIZE-1] = '\0';

/*DWORD start = ::GetTickCount();
printf("Length: %i\n", strlen(str) );
DWORD final = GetTickCount() - start;
printf("strlen time: %i\n", final );*/


/*DWORD start = ::GetTickCount();
printf("Length: %i\n", __strlen__(str) );
DWORD final = GetTickCount() - start;
printf("__strlen__ time: %i\n", final );*/


DWORD start = ::GetTickCount();
printf("Length: %i\n", strlen2(str) );
DWORD final = GetTickCount() - start;
printf("strlen2 time: %i\n", final );

delete [] str;

return 0;
}

/*
Length: 1073741823
__strlen__ time: 39281
Press any key to continue
*/


/*
Length: 1073741823
strlen time: 50266
Press any key to continue
*/


/*
Length: 1073741823
strlen time: 40703
Press any key to continue
*/


/*
Length: 1073741823
__strlen__ time: 36562
Press any key to continue
*/


/*
Length: 1073741823
__strlen__ time: 37343
Press any key to continue
*/


/*
Length: 1073741822
strlen2 time: 52782
Press any key to continue
*/


/*
Length: 1073741822
strlen2 time: 33985
Press any key to continue
*/







Weird results when I ran them one after another though, that's why I did them by themselves. Overall looks like Magmai did win [wink].

@MaulingMonkey, very interesting, I will take a look at that later. [smile]

Share this post


Link to post
Share on other sites
Guest Anonymous Poster
i'm really no expert when it comes to 64 bit platforms.. but is this code safe for 64 bit? is someone able to comment on this issue?

Share this post


Link to post
Share on other sites
Most C/C++ runtime implementations would most like alreayd be optimized.

For instance, if you're using VC7.1, this is what the source looks like (provided in asm .. for anyone with VC installed: vc\crt\src\[cpu vendor]\strlen.asm)


;***
;strlen - return the length of a null-terminated string
;
;Purpose:
; Finds the length in bytes of the given string, not including
; the final null character.
;
; Algorithm:
; int strlen (const char * str)
; {
; int length = 0;
;
; while( *str++ )
; ++length;
;
; return( length );
; }
;
;Entry:
; const char * str - string whose length is to be computed
;
;Exit:
; EAX = length of the string "str", exclusive of the final null byte
;
;Uses:
; EAX, ECX, EDX
;
;Exceptions:
;
;*******************************************************************************

CODESEG

public strlen

strlen proc

.FPO ( 0, 1, 0, 0, 0, 0 )

string equ [esp + 4]

mov ecx,string ; ecx -> string
test ecx,3 ; test if string is aligned on 32 bits
je short main_loop

str_misaligned:
; simple byte loop until string is aligned
mov al,byte ptr [ecx]
add ecx,1
test al,al
je short byte_3
test ecx,3
jne short str_misaligned

add eax,dword ptr 0 ; 5 byte nop to align label below

align 16 ; should be redundant

main_loop:
mov eax,dword ptr [ecx] ; read 4 bytes
mov edx,7efefeffh
add edx,eax
xor eax,-1
xor eax,edx
add ecx,4
test eax,81010100h
je short main_loop
; found zero byte in the loop
mov eax,[ecx - 4]
test al,al ; is it byte 0
je short byte_0
test ah,ah ; is it byte 1
je short byte_1
test eax,00ff0000h ; is it byte 2
je short byte_2
test eax,0ff000000h ; is it byte 3
je short byte_3
jmp short main_loop ; taken if bits 24-30 are clear and bit
; 31 is set

byte_3:
lea eax,[ecx - 1]
mov ecx,string
sub eax,ecx
ret
byte_2:
lea eax,[ecx - 2]
mov ecx,string
sub eax,ecx
ret
byte_1:
lea eax,[ecx - 3]
mov ecx,string
sub eax,ecx
ret
byte_0:
lea eax,[ecx - 4]
mov ecx,string
sub eax,ecx
ret

strlen endp




- EDIT -
Changed from code to source tags

Share this post


Link to post
Share on other sites
Quote:
Original post by Anonymous Poster
i'm really no expert when it comes to 64 bit platforms.. but is this code safe for 64 bit? is someone able to comment on this issue?


It's not safe for any bits.

MSN

Share this post


Link to post
Share on other sites
Quote:
Original post by Drew_Benton@MaulingMonkey, very interesting, I will take a look at that later. [smile]


The powerful aspect of it is extremely simple, really. It just takes an argument (specifically, an array, by reference), and uses a template argument so that any size array can be passed. The alternative, taking a pointer to the starting element, requires counting - aka strlen. The nice thing about it is that the length is calculated at compile time with this method rather than at run time with an O(n) function.

On a side note, I never actually use that template. I spend my time optimizing things that actually need optimizing, not string manipulation routines :P.

As a side note, assigning a string to a literal tagged in this manner has one minor difference: it assigns all of the array, rather than just until the last null. That is:

str::string test_string_1 , test_string_2;

const char[] test_string_1_value = "Test string";
const char[] test_string_2_value = "Test string\0Testy testy testy!";

test_string_1 = _( test_string_1_value );
test_string_2 = _( test_string_2_value );

assert( strcmp( test_string_1.c_str() , test_string_2.c_str() ) == 0 ); //will succeed, only compares up to the null terminator.
assert( test_string_1 == test_string_2 ); //will fail, as it compares the entire contents of the string, even past null


Whereas both asserts will succeed if the _(...) wrapper is removed.

Share this post


Link to post
Share on other sites
Quote:
Original post by Anonymous Poster
i'm really no expert when it comes to 64 bit platforms.. but is this code safe for 64 bit? is someone able to comment on this issue?


It needs to be extended to work optimally on a 64bit register machine, instead of &3 you'd need &7, the various bit-mask would need to be extended as well, e.g. 0x7efefeff would become 0x7efefefefefefeff.

I've wanted to sit down and explain how it works, but I just don't have the time right now to go in detail. The first loop aligns the string, then the seocnd loop checks 4 bytes at a time. If someone wants to understand it, you should look at the binary encoding, and see how the bits flip as the algorithm progresses, it essentially makes a sequence of 'catch' bits and filter the data in the such a way to set these bit if the byte is not null. By the nature of the algorithm, we can only tell if a byte is not null, but cannot tell what the first null byte is, so we have to go find it with the last 4 if statements (those statements can be optimized further, that's a slow way to check).

Share this post


Link to post
Share on other sites
Guest Anonymous Poster
You could iterator 2 or more each loop and perform additional checks at the end to get the length -this would be better for long strings.

Share this post


Link to post
Share on other sites
Sign in to follow this  

  • Advertisement
  • Advertisement
  • Popular Tags

  • Advertisement
  • Popular Now

  • Similar Content

    • By Florencia Sylvester Anelli
      Hi everyone! I've developed a game called The Last Dog (ZOMBIES + DOGS) and was wondering if you could give it a try in order for me to analyse your behaviour in analytics. Give it a shot! Please download it here: https://florsyl.wixsite.com/thelastdog2 and let me know if you detect any bugs. VIDEO.3gp
       

       
       
    • By Manuel Berger
      Hello fellow devs!
      Once again I started working on an 2D adventure game and right now I'm doing the character-movement/animation. I'm not a big math guy and I was happy about my solution, but soon I realized that it's flawed.
      My player has 5 walking-animations, mirrored for the left side: up, upright, right, downright, down. With the atan2 function I get the angle between player and destination. To get an index from 0 to 4, I divide PI by 5 and see how many times it goes into the player-destination angle.

      In Pseudo-Code:
      angle = atan2(destination.x - player.x, destination.y - player.y) //swapped y and x to get mirrored angle around the y axis
      index = (int) (angle / (PI / 5));
      PlayAnimation(index); //0 = up, 1 = up_right, 2 = right, 3 = down_right, 4 = down

      Besides the fact that when angle is equal to PI it produces an index of 5, this works like a charm. Or at least I thought so at first. When I tested it, I realized that the up and down animation is playing more often than the others, which is pretty logical, since they have double the angle.

      What I'm trying to achieve is something like this, but with equal angles, so that up and down has the same range as all other directions.

      I can't get my head around it. Any suggestions? Is the whole approach doomed?

      Thank you in advance for any input!
       
    • By devbyskc
      Hi Everyone,
      Like most here, I'm a newbie but have been dabbling with game development for a few years. I am currently working full-time overseas and learning the craft in my spare time. It's been a long but highly rewarding adventure. Much of my time has been spent working through tutorials. In all of them, as well as my own attempts at development, I used the audio files supplied by the tutorial author, or obtained from one of the numerous sites online. I am working solo, and will be for a while, so I don't want to get too wrapped up with any one skill set. Regarding audio, the files I've found and used are good for what I was doing at the time. However I would now like to try my hand at customizing the audio more. My game engine of choice is Unity and it has an audio mixer built in that I have experimented with following their tutorials. I have obtained a great book called Game Audio Development with Unity 5.x that I am working through. Half way through the book it introduces using FMOD to supplement the Unity Audio Mixer. Later in the book, the author introduces Reaper (a very popular DAW) as an external program to compose and mix music to be integrated with Unity. I did some research on DAWs and quickly became overwhelmed. Much of what I found was geared toward professional sound engineers and sound designers. I am in no way trying or even thinking about getting to that level. All I want to be able to do is take a music file, and tweak it some to get the sound I want for my game. I've played with Audacity as well, but it didn't seem to fit the bill. So that is why I am looking at a better quality DAW. Since being solo, I am also under a budget contraint. So of all the DAW software out there, I am considering Reaper or Presonus Studio One due to their pricing. My question is, is investing the time to learn about using a DAW to tweak a sound file worth it? Are there any solo developers currently using a DAW as part of their overall workflow? If so, which one? I've also come across Fabric which is a Unity plug-in that enhances the built-in audio mixer. Would that be a better alternative?
      I know this is long, and maybe I haven't communicated well in trying to be brief. But any advice from the gurus/vets would be greatly appreciated. I've leaned so much and had a lot of fun in the process. BTW, I am also a senior citizen (I cut my programming teeth back using punch cards and Structured Basic when it first came out). If anyone needs more clarification of what I am trying to accomplish please let me know.  Thanks in advance for any assistance/advice.
    • By Yosef BenSadon
      Hi , I was considering this start up http://adshir.com/, for investment and i would like a little bit of feedback on what the developers community think about the technology.
      So far what they have is a demo that runs in real time on a Tablet at over 60FPS, it runs locally on the  integrated GPU of the i7 . They have a 20 000 triangles  dinosaur that looks impressive,  better than anything i saw on a mobile device, with reflections and shadows looking very close to what they would look in the real world. They achieved this thanks to a  new algorithm of a rendering technique called Path tracing/Ray tracing, that  is very demanding and so far it is done mostly for static images.
      From what i checked around there is no real option for real time ray tracing (60 FPS on consumer devices). There was imagination technologies that were supposed to release a chip that supports real time ray tracing, but i did not found they had a product in the market or even if the technology is finished as their last demo  i found was with a PC.  The other one is OTOY with their brigade engine that is still not released and if i understand well is more a cloud solution than in hardware solution .
      Would there  be a sizable  interest in the developers community in having such a product as a plug-in for existing game engines?  How important  is Ray tracing to the  future of high end real time graphics?
    • By bryandalo
      Good day,

      I just wanted to share our casual game that is available for android.

      Description: Fight your way from the ravenous plant monster for survival through flips. The rules are simple, drag and release your phone screen. Improve your skills and show it to your friends with the games quirky ranks. Select an array of characters using the orb you acquire throughout the game.

      Download: https://play.google.com/store/apps/details?id=com.HellmodeGames.FlipEscape&hl=en
       
      Trailer: 
       
  • Advertisement