Niksan2

[.net] Regex woes

Niksan2    265
Hello, I've been looking at regex to tokenize some script so that I can translate it into C# and use it via COM with an external application, but I'm having trouble, no doubt because I've not even hit amateur status at regex yet. I have the following code, and have tried many variants of it to solve the problem.
Regex r = new Regex(@"((""[^""]*"")|([^ ]*))*");
MatchCollection mc = r.Matches("this  = \"that and something else\"");
foreach( Match m in mc )
{
  Console.WriteLine(m.Value);
}

Now, as can probably be seen, I just want to split the string at whitespace unless the whitespace is contained within "". This kind of works, except that every non-match yields an entry in the match collection whose value is an empty string. I would have thought that non-matches wouldn't make it into the list, it being called a MatchCollection and all that :) So, am I doing something wrong, or is this normal behaviour? And if it is normal, is there a better solution to my task that doesn't need several passes over the string using string.Split etc.?

PS: I also tried a variant using Regex.Split, but that included the delimiters in the result. I know string.Split lets you strip out the empty entries, but there doesn't seem to be an equivalent for Regex. Any help much appreciated.
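
For what it's worth, the empty entries look like normal behaviour rather than a bug: every part of the pattern above is allowed to match zero characters (`*`), so at a position where no token starts the engine still reports a successful, zero-length match. A minimal sketch of one way around it, assuming it is acceptable to require at least one character per token (`+` instead of `*`). The `Tokenizer` class and `Tokenize` method are names made up for this illustration, not part of the original code.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class Tokenizer
{
    // Either a double-quoted run (quotes kept) or a run of characters that
    // are neither whitespace nor a quote. Both alternatives need at least
    // one character, so zero-length matches cannot occur.
    private static readonly Regex TokenPattern =
        new Regex(@"""[^""]*""|[^\s""]+");

    public static List<string> Tokenize(string text)
    {
        var tokens = new List<string>();
        foreach (Match m in TokenPattern.Matches(text))
        {
            tokens.Add(m.Value);
        }
        return tokens;
    }

    public static void Main()
    {
        // Same sample input as in the post above.
        foreach (string token in Tokenize("this  = \"that and something else\""))
        {
            Console.WriteLine(token);
        }
    }
}
```

One caveat with this sketch: a quote that is never closed is silently skipped, and the words after it come back as separate tokens. On the PS, Regex.Split includes the text of any capturing groups in its result, which would explain the delimiters showing up when the pattern contains parentheses.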

Bob Janova    769
This did the trick for me:
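public static string[] Parse(string text){
    // No quotes at all: a plain split on spaces is enough.
    if(text.IndexOf('"') < 0) return text.Split(' ');
    else{
        // Either a quoted run (captured without its quotes) or a run of word characters.
        // Note: \w+ only matches word characters, so a token such as "=" is skipped.
        MatchCollection mc = Regex.Matches(text, "\"(?<word>[^\"]*)\" *|(?<word>\\w+)");
        int len = mc.Count;
        string[] res = new string[len];
        // Both alternatives capture into the same named group, "word".
        for(int i = 0; i < len; i++) res[i] = mc[i].Groups["word"].Value;
        return res;
    }
}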

Niksan2    265
Cheers for that. It still seems OTT for how you'd expect it to work, but a co-worker here suggested another hack which seems to work, with just a subtle change:

Regex r = new Regex(@"[\s]*((""[^""]*"")|([^ ]*))*");
MatchCollection mc = r.Matches("this = \"that and something else\"");
foreach( Match m in mc )
{
  Console.WriteLine(m.Value.Trim());
}

Thanks again.
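
One thing to watch with the hack above: the pattern can still match zero characters, so as far as I can tell it still reports a zero-length match at the end of the input, and a run of trailing spaces would come out as an empty string after Trim(). A hedged sketch that keeps the same pattern and simply drops those entries. `TrimmedTokenizer` and `Tokenize` are hypothetical names for this example only.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class TrimmedTokenizer
{
    // Same pattern as in the post above: optional leading whitespace,
    // then any number of quoted runs or runs of non-space characters.
    private static readonly Regex TokenPattern =
        new Regex(@"[\s]*((""[^""]*"")|([^ ]*))*");

    public static List<string> Tokenize(string text)
    {
        var tokens = new List<string>();
        foreach (Match m in TokenPattern.Matches(text))
        {
            string token = m.Value.Trim();
            // Skip matches that are empty once the whitespace is trimmed.
            if (token.Length > 0)
            {
                tokens.Add(token);
            }
        }
        return tokens;
    }

    public static void Main()
    {
        foreach (string token in Tokenize("this = \"that and something else\""))
        {
            Console.WriteLine(token);
        }
    }
}
```

Filtering after the fact keeps the original pattern untouched; a pattern that cannot match zero characters avoids the check altogether.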
