strtok not threadsafe on Windows

Mathucub

Anyone who, like me, started off in the land of C, or before the STL really
got standardized, probably uses strtok for tokenization or has legacy code
that does.

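On Unix/Mac platforms, strtok is often replaced by strsep.
To me, this is more for convenience: strsep keeps its position in a pointer
you pass in rather than in hidden state. Watch the difference with double
delimiters, though: strtok skips a run of them, while strsep returns an empty
field for each one.

E.g., with a writable buffer s holding "--Hi!-Dash-Separated":
strtok(s, "-") = "Hi!" (the two leading delimiters are skipped)
strsep(&s, "-") = "" (an empty field; "Hi!" arrives on the third call)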
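
To make the double-delimiter behavior concrete, here is a minimal sketch in C.
Because strsep is not available everywhere, sep_next below is a hypothetical
stand-in written for this example that mimics strsep's documented behavior; the
two counting helpers are also made up for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for strsep, built on strcspn so it also compiles
   where strsep is missing. Returns the next field (possibly empty) and
   advances *cursor; returns NULL once the input is exhausted. */
static char *sep_next(char **cursor, const char *delims)
{
    char *field = *cursor;
    if (field == NULL)
        return NULL;

    char *end = field + strcspn(field, delims);
    if (*end != '\0') {
        *end = '\0';        /* terminate this field */
        *cursor = end + 1;  /* resume just past the delimiter */
    } else {
        *cursor = NULL;     /* last field: nothing left afterwards */
    }
    return field;
}

/* Counts tokens the strtok way: runs of delimiters are skipped. */
static size_t count_strtok_tokens(char *text, const char *delims)
{
    size_t n = 0;
    for (char *t = strtok(text, delims); t != NULL; t = strtok(NULL, delims))
        n++;
    return n;
}

/* Counts fields the strsep way: every delimiter ends a field, even an
   empty one. */
static size_t count_sep_fields(char *text, const char *delims)
{
    size_t n = 0;
    char *cursor = text;
    while (sep_next(&cursor, delims) != NULL)
        n++;
    return n;
}
```

On a writable copy of "--Hi!-Dash-Separated" with "-" as the delimiter,
count_strtok_tokens finds 3 tokens, while count_sep_fields finds 5 fields,
the first two of them empty. Both functions modify the buffer, so neither
should be handed a string literal.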

That's all great and wonderful, except if you are developing on Windows:
strsep is not available in the MSVC runtime.

The ANSI definition for strtok is a bit vague, as it does not say
what it ought to do when it comes to threads and processes:


char * strtok ( char * str, const char * delimiters );
Split string into tokens

A sequence of calls to this function split str into tokens, which are sequences
of contiguous characters separated by any of the characters that are part of
delimiters.

On a first call, the function expects a C string as argument for str, whose
first character is used as the starting location to scan for tokens. In
subsequent calls, the function expects a null pointer and uses the position
right after the end of last token as the new starting location for scanning.

To determine the beginning and the end of a token, the function first scans
from the starting location for the first character not contained in delimiters
(which becomes the beginning of the token). And then scans starting from this
beginning of the token for the first character contained in delimiters, which
becomes the end of the token.

This end of the token is automatically replaced by a null-character by the
function, and the beginning of the token is returned by the function.

Once the terminating null character of str has been found in a call to strtok,
all subsequent calls to this function with a null pointer as the first
argument return a null pointer.
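
(An aside on the description above: the call sequence it lays out, the buffer
on the first call and a null pointer afterwards, can be sketched as follows.
split_tokens is a hypothetical helper name, not part of any library.)

```c
#include <stddef.h>
#include <string.h>

/* Splits text in place and stores up to max_tokens token pointers in out.
   Returns the number of tokens stored. text is modified: each delimiter
   that ends a token is overwritten with '\0'. */
static size_t split_tokens(char *text, const char *delims,
                           char **out, size_t max_tokens)
{
    size_t count = 0;
    /* First call passes the buffer; later calls pass NULL to continue
       from the position strtok saved internally. */
    for (char *tok = strtok(text, delims);
         tok != NULL && count < max_tokens;
         tok = strtok(NULL, delims)) {
        out[count++] = tok;
    }
    return count;
}
```

Called on a writable buffer holding "alpha, beta,,gamma" with ", " as the
delimiters, this stores three tokens: "alpha", "beta" and "gamma". Note that
the position between calls lives inside strtok, not in anything the caller
holds, which is what the rest of this entry is about.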


Seems okay, right? What does MSDN say about threads and strtok?
Almost all of the above plus this little note:


Note:
Each function uses a thread-local static variable for parsing the string into
tokens. Therefore, multiple threads can simultaneously call these functions
without undesirable effects. However, within a single thread, interleaving
calls to one of these functions is highly likely to produce data corruption and
inaccurate results. When parsing different strings, finish parsing one string
before starting to parse the next. Also, be aware of the potential for danger
when calling one of these functions from within a loop where another function
is called. If the other function ends up using one of these functions, an
interleaved sequence of calls will result, triggering data corruption.
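
(The interleaving hazard that note warns about is easy to reproduce in a
single thread. The sketch below uses two hypothetical helpers: an outer loop
over ';'-separated records that calls a function which itself uses strtok.)

```c
#include <stddef.h>
#include <string.h>

/* Counts comma-separated fields in one record using strtok. This resets
   the hidden strtok state that the caller's loop is relying on. */
static size_t count_fields(char *record)
{
    size_t n = 0;
    for (char *f = strtok(record, ","); f != NULL; f = strtok(NULL, ","))
        n++;
    return n;
}

/* Intended to visit every ';'-separated record, but the nested strtok
   calls in count_fields clobber the outer parse position. */
static size_t count_records_buggy(char *text)
{
    size_t records = 0;
    for (char *r = strtok(text, ";"); r != NULL; r = strtok(NULL, ";")) {
        records++;
        (void)count_fields(r);
    }
    return records;
}
```

Given a writable buffer holding "a,b;c,d", count_records_buggy returns 1
instead of 2: the inner loop runs strtok to the end of "a,b", so the outer
strtok(NULL, ";") has nothing left to resume from and never reaches "c,d".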


Well, that looks good as well. Looks like I can use it and not worry about
thread interactions.

But lo and behold, what does the C runtime documentation say?


Using the statically linked CRT implies that any state information saved by the
C runtime library will be local to that instance of the CRT. For example, if
you use strtok, _strtok_l, wcstok, _wcstok_l, _mbstok, _mbstok_l when using a
statically linked CRT, the position of the strtok parser is unrelated to the
strtok state used in code in the same process (but in a different DLL or EXE)
that is linked to another instance of the static CRT. In contrast, the
dynamically linked CRT shares state for all code within a process that is
dynamically linked to the CRT. This concern does not apply if you use the new
more secure versions of these functions; for example, strtok_s does not have
this problem.


Wow, so as I read it, it's only safe if you statically link to the C runtime.
To make matters worse: if you create your COM connections in-proc, that is
exactly what they are, in process. So you and all your threads, plus all the
threads from any libraries you use that are dynamically linked to the C
runtime, have the chance of a collision.

Also, if you throw the /clr switch to use C++/CLI code, you are required
to link dynamically. /clr and the static CRT option are mutually exclusive,
so you can't do a quick fix and just switch to static.

Looks like the only solution on Windows is to switch to strtok_s. Luckily
it works almost the same way as strtok. Almost: it takes an extra context
argument that holds the parse position, so every call site needs some
tweaking to get the exact functionality.
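
As a rough sketch of what that tweaking looks like: MSVC's strtok_s takes the
same three arguments as POSIX strtok_r, with the parse position stored in a
caller-owned pointer instead of inside the runtime. The example below assumes
that equivalence and maps one name onto the other so it builds on both; the
function names are hypothetical. (The optional C11 Annex K strtok_s has a
different, four-argument signature and is not what is used here.)

```c
#define _POSIX_C_SOURCE 200809L  /* ask POSIX headers to declare strtok_r */
#include <stddef.h>
#include <string.h>

#ifdef _MSC_VER
/* Assumption: MSVC's three-argument strtok_s matches strtok_r. */
#define strtok_r strtok_s
#endif

/* Counts comma-separated fields using a context local to this loop. */
static size_t count_fields_r(char *record)
{
    char *ctx = NULL;  /* parse position for this loop only */
    size_t n = 0;
    for (char *f = strtok_r(record, ",", &ctx); f != NULL;
         f = strtok_r(NULL, ",", &ctx))
        n++;
    return n;
}

/* Walks ';'-separated records and sums their fields. The outer context is
   independent of the one inside count_fields_r, so nesting is safe. */
static size_t count_records(char *text, size_t *total_fields)
{
    char *ctx = NULL;
    size_t records = 0;
    *total_fields = 0;
    for (char *r = strtok_r(text, ";", &ctx); r != NULL;
         r = strtok_r(NULL, ";", &ctx)) {
        records++;
        *total_fields += count_fields_r(r);
    }
    return records;
}
```

On a writable buffer holding "a,b;c,d" this reports 2 records and 4 fields.
Since each loop owns its context pointer, neither another thread nor another
module sharing the same CRT can move the parse position.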

Good luck, and beware. If you are multi-threaded, then this code
has the potential to throw an exception and, since there is no handler,
cause a crash.


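...
lFileLength = File->GetLength();

// One extra element for the terminator that strtok needs.
Data = new TCHAR[lFileLength + 1];
File->Read(Data, lFileLength);

// Terminate with a character, not the NULL pointer macro.
Data[lFileLength] = '\0';

File->Close();

char *buffer;

// Start our tokenization. strtok takes char*, so TCHAR must be char here
// (a non-Unicode build). This call relies on strtok's hidden state.
buffer = strtok(Data, "\n\r");
...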


Literally, just after this call went on the stack and started to execute,
a background thread called strtok, which changed the value of
its internal static state pointer. When our call resumed, that pointer was at
the end of the other call's buffer, and strtok tried to read past it.

If it's not at the end of memory you own, then the call could simply fail,
returning NULL, or leave you in a garbled state where you are now tokenizing
the wrong buffer.

If you are at the end of the buffer, and the next address is not owned
by you, it will throw an exception.

1 Comment


Huh. I've been in C-land for a long time, and I've always found myself reaching for strstr() instead of strtok(). I guess I probably end up more or less re-implementing strtok(), but at least it's thread-safe, and allows me to deal with the double-delimiter issue easily enough.
