C#/.NET - Include delimiter in each element from String.Split()

3 comments, last by BTownTKD 15 years, 6 months ago
Basically, I want to split a string (likely using String.Split or Regex.Split) and have each element keep the delimiter that ended it.

For example, if the delimiters are ' ' and '-' (space and hyphen) and the string is "Hello I'm testing out this sweet new function, and I want to split-up hyphenated words too!", then the results would be:

"Hello " "I'm " "testing " **(etc)** "to " "split-" "up " "hyphenated " "words " "too!"

Notice that every element keeps its delimiter at the end. As far as I can tell, String.Split() always removes the delimiters from the resulting array. Not cool. Is there a simple workaround?

EDIT: A MatchCollection would work too, if anybody knows a regular expression that would achieve the same result with Regex.Matches().

[Edited by - BTownTKD on April 22, 2009 2:45:26 PM]
Deep Blue Wave - Brian's Dev Blog.
You could probably get fancy with String.Split: split the string on one delimiter at a time, re-insert that delimiter, then split on the next delimiter, all while maintaining the proper string ordering. But that sounds like more trouble than it's worth. I would probably just implement my own splitting function that does exactly what I want: iterate over the characters, check each one against the delimiters, and leave the delimiters in when building the list of strings.
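A hand-rolled splitter along those lines might look like the sketch below. The class and method names (DelimiterSplitter, SplitKeepingDelimiters) are made up for illustration and are not part of .NET; it assumes every delimiter is a single character.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class DelimiterSplitter
{
    // Hypothetical helper (not part of .NET): splits on any of the given
    // single-character delimiters and keeps each delimiter at the end of
    // the element it terminates.
    public static List<string> SplitKeepingDelimiters(string input, params char[] delimiters)
    {
        var parts = new List<string>();
        var current = new StringBuilder();

        foreach (char c in input)
        {
            current.Append(c);

            // A delimiter closes the current element and stays attached to it.
            if (Array.IndexOf(delimiters, c) >= 0)
            {
                parts.Add(current.ToString());
                current.Clear();
            }
        }

        // Whatever follows the last delimiter becomes the final element.
        if (current.Length > 0)
        {
            parts.Add(current.ToString());
        }

        return parts;
    }
}
```

Calling DelimiterSplitter.SplitKeepingDelimiters("to split-up hyphenated words too!", ' ', '-') yields "to ", "split-", "up ", "hyphenated ", "words ", "too!". Two delimiters in a row produce an element that holds only the second delimiter, which may or may not be what you want.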
Not with String.Split alone. It's fairly trivial, though, to get the same behavior by using Contains or the match location (IndexOf and friends) on increasingly small substrings.
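One way to read the match-location idea is sketched below, using String.IndexOfAny. Instead of literally creating smaller and smaller strings, this version advances a start index, which amounts to the same thing. The names MatchLocationSplitter and SplitAtMatchLocations are hypothetical.

```csharp
using System;
using System.Collections.Generic;

public static class MatchLocationSplitter
{
    // Hypothetical helper: repeatedly finds the next delimiter with
    // IndexOfAny and cuts the element so that it ends with that delimiter.
    public static List<string> SplitAtMatchLocations(string input, char[] delimiters)
    {
        var parts = new List<string>();
        int start = 0;

        while (start < input.Length)
        {
            // Search only the not-yet-consumed tail of the string.
            int hit = input.IndexOfAny(delimiters, start);

            if (hit < 0)
            {
                // No delimiter left: the rest is the last element.
                parts.Add(input.Substring(start));
                break;
            }

            // Include the delimiter itself (hence the + 1).
            parts.Add(input.Substring(start, hit - start + 1));
            start = hit + 1;
        }

        return parts;
    }
}
```

For example, SplitAtMatchLocations("Hello world-wide", new[] { ' ', '-' }) gives "Hello ", "world-", "wide".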
I can't speak for the .Net regular expression library, but in some libraries I've used, if you capture the delimiter using the appropriate symbols, it will be included in the result of a split. For example "( |-)" would keep the delimiter because () captures the text, whereas " |-" would not.
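For what it's worth, .NET's Regex.Split does behave this way: text matched by a capturing group is included in the returned array. A small sketch comparing the two patterns follows; the class and method names are invented for the example.

```csharp
using System;
using System.Text.RegularExpressions;

public static class CaptureSplitDemo
{
    // No capturing group: the delimiters are dropped from the result.
    public static string[] SplitDroppingDelimiters(string input)
    {
        return Regex.Split(input, " |-");
    }

    // Capturing group: each matched delimiter is returned as its own
    // array element, between the pieces it separated.
    public static string[] SplitKeepingDelimiters(string input)
    {
        return Regex.Split(input, "( |-)");
    }
}
```

With the input "split-up words", the first method returns "split", "up", "words", while the second returns "split", "-", "up", " ", "words".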
"Walk not the trodden path, for it has borne its burden." -John, Flying Monk
Quote:Original post by Extrarius
I can't speak for the .Net regular expression library, but in some libraries I've used, if you capture the delimiter using the appropriate symbols, it will be included in the result of a split. For example "( |-)" would keep the delimiter because () captures the text, whereas " |-" would not.


Ding-ding-ding!

I used the pattern ( |-) and it worked perfectly! Thanks so much! Coincidentally, this is now a fantastic word-wrap function in my custom XNA "Text" class.

Note that the resulting array stores each delimiter in its own index rather than at the end of the preceding word, i.e.:

"Hello,"
" "
"world!"

instead of
"Hello, "
"world!"

But it still works great. Thanks again.
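If the delimiter really needs to sit at the end of each word, two regex variations should do it. This is only a sketch, not something from the thread: one splits on a zero-width lookbehind, the other uses Regex.Matches as asked about in the original post. The class and method names are hypothetical, and the patterns assume the delimiters are just space and hyphen.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class TrailingDelimiterSplit
{
    // Splits at the zero-width position right after a space or hyphen,
    // so nothing is consumed and the delimiter stays with the left piece.
    public static string[] WithLookbehind(string input)
    {
        return Regex.Split(input, "(?<=[ -])");
    }

    // Matches either "any non-delimiters followed by one delimiter" or,
    // for the tail of the string, a run of non-delimiters.
    public static string[] WithMatches(string input)
    {
        return Regex.Matches(input, "[^ -]*[ -]|[^ -]+")
                    .Cast<Match>()
                    .Select(m => m.Value)
                    .ToArray();
    }
}
```

Both methods turn "Hello, world!" into "Hello, " and "world!". One difference to be aware of: when the input ends with a delimiter, the lookbehind split also returns a trailing empty string, while the Matches version does not.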

[Edited by - BTownTKD on April 22, 2009 2:52:55 PM]
