stl really got standardized probably uses, or has legacy code that has strtok
On Unix/Mac platforms--strtok has been replaced by strsep.
To me, this is more for convenience: strsep is smart enough to know
to skip double delimiters.
Eg, strtok("--Hi!-Dash-Seperated", "-") = failure, 2 delimeters back to back.
strsep("--Hi!-Dash-Seperated", "-") = "Hi!"
Thats all great and wonderful, except if you are developing on windows,
strsep is not available in the msvc runtime.
The ANSI definition for strtok is a bit vague as it does not reference
what it ought do when it comes to threads and processes:
char * strtok ( char * str, const char * delimiters );
Split string into tokens
A sequence of calls to this function split str into tokens, which are sequences
of contiguous characters separated by any of the characters that are part of
On a first call, the function expects a C string as argument for str, whose
first character is used as the starting location to scan for tokens. In
subsequent calls, the function expects a null pointer and uses the position
right after the end of last token as the new starting location for scanning.
To determine the beginning and the end of a token, the function first scans
from the starting location for the first character not contained in delimiters
(which becomes the beginning of the token). And then scans starting from this
beginning of the token for the first character contained in delimiters, which
becomes the end of the token.
This end of the token is automatically replaced by a null-character by the
function, and the beginning of the token is returned by the function.
Once the terminating null character of str has been found in a call to strtok,
all subsequent calls to this function with a null pointer as the first
argument return a null pointer.
Seems okay, right? What does MSDN say about threads and strtok?
Almost all of the above plus this little note:
Each function uses a thread-local static variable for parsing the string into
tokens. Therefore, multiple threads can simultaneously call these functions
without undesirable effects. However, within a single thread, interleaving
calls to one of these functions is highly likely to produce data corruption and
inaccurate results. When parsing different strings, finish parsing one string
before starting to parse the next. Also, be aware of the potential for danger
when calling one of these functions from within a loop where another function
is called. If the other function ends up using one of these functions, an
interleaved sequence of calls will result, triggering data corruption.
Well, that looks good as well. Looks like I can use it and not worry about
But low and behold, what does the C Runtime say?
Using the statically linked CRT implies that any state information saved by the
C runtime library will be local to that instance of the CRT. For example, if
you use strtok, _strtok_l, wcstok, _wcstok_l, _mbstok, _mbstok_l when using a
statically linked CRT, the position of the strtok parser is unrelated to the
strtok state used in code in the same process (but in a different DLL or EXE)
that is linked to another instance of the static CRT. In contrast, the
dynamically linked CRT shares state for all code within a process that is
dynamically linked to the CRT. This concern does not apply if you use the new
more secure versions of these functions; for example, strtok_s does not have
Wow, so its only thread safe if you static link to the c runtime.
To make matters worse: If you create your COM connections in proc, that is
what they are--in process. So you and all your threads + all the threads
from any libraries you are using that are dynamically linked to the C
runtime have the chance of a collision.
Also--if you throw the /clr switch to use C++/CLI code: you are required
to dynamic link. The static option is mutually exclusive, so you can't
do a quick fix and just switch to static.
Looks like the only solution on windows is to switch to strtok_s--luckily
it works almost the same way as strtok. Almost... it still takes some tweaking
to get the exact functionality.
Good luck, and beware. If you are multi-threaded then this code
has the potential to throw an exception, and since there is no handler:
cause a crash.
lFileLength = File->GetLength();
Data = new TCHAR[lFileLength+1];
Data[lFileLength] = NULL;
// start our tokenization
buffer = strtok(Data, "\n\r");
Literally, after this call went on the stack and started to execute:
We had a background thread strtok which changed the value of
its static internal state pointer. When this went to execute, it was at
the end of the other call's buffer and tried to read past it.
If its not at the end of your address space, then the call could simply fail,
returning NULL, or get a garbled state where you are now tokenizing the wrong
If you are at the end of the buffer, and the next address is not owned
by you: it will throw an exception.