[.net] C# string manipulation question

Started by
5 comments, last by Xai 17 years, 6 months ago
I have a System.String object, and need to retrieve an array of all substrings that match the pattern <*>. For example, if I was parsing this string:

<Hello> I am a <little> confused from the <transition> from C++ to C#

I would get an array of these strings:

<Hello>
<little>
<transition>

Any help is appreciated. Thanks!
Genius is 1% inspiration and 99% perspiration
This should give you a general idea of how to do it. I'm not writing all the code for you, but this is roughly the approach:


string myString = "Hello I am a little confused from the transition from C++ to C#";

// Split() with no arguments splits on whitespace.
foreach (string test in myString.Split())
{
    // This is where you check each word against the ones you are looking for.
    if (test.ToLower() == "hello")
        Console.WriteLine(test);
    else if (test.ToLower() == "little")
        Console.WriteLine(test);
    else if (test.ToLower() == "transition")
        Console.WriteLine(test);
}

Console.ReadKey();

You should use Regular Expressions, something like this (I'm not an expert but this works):

using System.Text.RegularExpressions;

MatchCollection col = Regex.Matches("<Hello> I am a <little> confused from the <transition> from C++ to C#", "<[a-zA-Z]+>");
foreach (Match myString in col)
{
    Console.WriteLine(myString.Value);
}
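
Since the question asked for an array rather than console output, here is a minimal sketch of one way to collect the matches into a string[]. It assumes LINQ (System.Linq) is available; the names TagFinder and FindTags are made up for illustration and are not from the post above.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class TagFinder
{
    // Hypothetical helper: returns every <letters> token in the input as an array.
    public static string[] FindTags(string input)
    {
        return Regex.Matches(input, "<[a-zA-Z]+>")
                    .Cast<Match>()          // MatchCollection is a non-generic collection on older frameworks
                    .Select(m => m.Value)   // keep only the matched text
                    .ToArray();
    }

    public static void Main()
    {
        string[] tags = FindTags("<Hello> I am a <little> confused from the <transition> from C++ to C#");
        Console.WriteLine(string.Join(" ", tags)); // prints: <Hello> <little> <transition>
    }
}
```

Note that [a-zA-Z]+ only accepts letters, so something like <item 2> would be skipped.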
Quote: Original post by pucis
You should use Regular Expressions, something like this (I'm not an expert but this works):

using System.Text.RegularExpressions;
MatchCollection col = Regex.Matches("<Hello> I am a <little> confused from the <transition> from C++ to C#", "<[a-zA-Z]+>");
foreach(Match myString in col)
{
Console.WriteLine(myString.Value);
}


Thanks very much!
Or better:

using System.Text.RegularExpressions;

MatchCollection col = Regex.Matches("<Hello> I am a <little> confused from the <transition> from C++ to C#", "<.+>");
foreach (Match myString in col)
{
    Console.WriteLine(myString.Value);
}

I think . means any character.
That's on UNIX :p
Quote: Original post by clauchiorean
MatchCollection col = Regex.Matches("<Hello> I am a <little> confused from the <transition> from C++ to C#", "<.+>");

Since that would be a greedy match, you'd only get a single match of "<Hello> I am a <little> confused from the <transition>".

The correct regex would be "<.*?>". The ? after the * makes the match non-greedy: instead of running on to the last > in the string, it stops at the first > that completes a match.
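
To see the difference, here is a small sketch that runs both patterns over the example string and counts the matches. The class and method names (GreedyVsLazy, CountMatches) are just placeholders for this illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class GreedyVsLazy
{
    // Hypothetical helper: how many matches does a pattern produce on the input?
    public static int CountMatches(string input, string pattern)
    {
        return Regex.Matches(input, pattern).Count;
    }

    public static void Main()
    {
        const string input = "<Hello> I am a <little> confused from the <transition> from C++ to C#";

        // Greedy: .+ grabs as much as it can, so one match runs from the first < to the last >.
        Console.WriteLine(CountMatches(input, "<.+>"));   // prints: 1

        // Non-greedy: .*? grabs as little as it can, so each tag is its own match.
        Console.WriteLine(CountMatches(input, "<.*?>"));  // prints: 3

        foreach (Match m in Regex.Matches(input, "<.*?>"))
        {
            Console.WriteLine(m.Value);
        }
    }
}
```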

--AnkhSVN - A Visual Studio .NET Addin for the Subversion version control system.
The non-greedy method above is a good one. I'm just adding the other way you could make a regex do this correctly: "<[^>]*>", which is '<' followed by 0 or more characters that are not '>', followed by the '>' character. Arild's answer is better in this case, but knowing how to do both is important for other situations.
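
As one example of where the two patterns behave differently: by default . in .NET does not match a newline, so "<.*?>" misses a tag that spans two lines, while "<[^>]*>" still finds it. This is only a sketch; the input string and the names NegatedClassDemo and Extract are invented for the illustration.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class NegatedClassDemo
{
    // Hypothetical helper: returns all matches of the pattern as a string array.
    public static string[] Extract(string input, string pattern)
    {
        return Regex.Matches(input, pattern)
                    .Cast<Match>()
                    .Select(m => m.Value)
                    .ToArray();
    }

    public static void Main()
    {
        // Made-up input: the first tag contains a line break.
        string input = "<multi\nline> and <ok>";

        // Non-greedy dot: '.' stops at the newline, so only <ok> is found.
        Console.WriteLine(Extract(input, "<.*?>").Length);    // prints: 1

        // Negated class: [^>] accepts any character except '>', newlines included.
        Console.WriteLine(Extract(input, "<[^>]*>").Length);  // prints: 2
    }
}
```

Passing RegexOptions.Singleline to Regex.Matches would make . match newlines as well, so the non-greedy pattern could be made to cover this case too.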

