silverphyre673

[.net] C# string manipulation question

I have a System.String object and need to retrieve an array of all substrings that match the pattern <*>. For example, if I were parsing this string:

<Hello> I am a <little> confused from the <transition> from C++ to C#

I would get an array of these strings:

<Hello>
<little>
<transition>

Any help is appreciated. Thanks!

This will give you a general idea of how to do it. I am not writing all the code for you, but this is roughly the approach:

using System;

string myString = "<Hello> I am a <little> confused from the <transition> from C++ to C#";

// Split() with no arguments splits on whitespace, so this only works
// for tags that contain no spaces.
foreach (string test in myString.Split())
{
    // This is where you check each token against the tags you expect
    if (test.ToLower() == "<hello>")
        Console.WriteLine(test);
    else if (test.ToLower() == "<little>")
        Console.WriteLine(test);
    else if (test.ToLower() == "<transition>")
        Console.WriteLine(test);
}

Console.ReadKey();

You should use Regular Expressions, something like this (I'm not an expert but this works):

using System;
using System.Text.RegularExpressions;

// Matches one or more ASCII letters wrapped in angle brackets
MatchCollection col = Regex.Matches("<Hello> I am a <little> confused from the <transition> from C++ to C#", "<[a-zA-Z]+>");
foreach (Match myString in col)
{
    Console.WriteLine(myString.Value);
}
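
Since the original question asked for an array rather than console output, here is a minimal sketch of collecting the matches into a string[]. The class and method names (TagExtractor, ExtractTags) are made up for illustration; the pattern is the one from the snippet above.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

static class TagExtractor
{
    // Hypothetical helper: returns every <word> substring as an array.
    public static string[] ExtractTags(string input)
    {
        return Regex.Matches(input, "<[a-zA-Z]+>")
                    .Cast<Match>()          // MatchCollection yields Match objects
                    .Select(m => m.Value)   // keep only the matched text
                    .ToArray();
    }

    static void Main()
    {
        string[] tags = ExtractTags("<Hello> I am a <little> confused from the <transition> from C++ to C#");
        Console.WriteLine(string.Join(", ", tags)); // prints "<Hello>, <little>, <transition>"
    }
}
```

Note that [a-zA-Z]+ only accepts ASCII letters, so something like <tag1> or <my tag> would be skipped.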

Quote:
Original post by pucis
You should use Regular Expressions, something like this (I'm not an expert but this works):


Thanks very much!

Or better:

using System.Text.RegularExpressions;
MatchCollection col = Regex.Matches("<Hello> I am a <little> confused from the <transition> from C++ to C#", "<.+>");
foreach (Match myString in col)
{
    Console.WriteLine(myString.Value);
}

I think . means any character.
That's how it is on UNIX :p

Quote:
Original post by clauchiorean
MatchCollection col = Regex.Matches("<Hello> I am a <little> confused from the <transition> from C++ to C#", "<.+>");

Since that would be a greedy match, you'd only get a single match of "<Hello> I am a <little> confused from the <transition>".

The correct regex would be "<.*?>". The ? makes the quantifier non-greedy: it consumes as few characters as possible, so each match ends at the first '>' after its '<'.
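
To see the difference, here is a small sketch that counts the matches each pattern produces on the string from the original question. The class and method names (GreedyVersusLazy, CountMatches) are just for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

static class GreedyVersusLazy
{
    const string Input = "<Hello> I am a <little> confused from the <transition> from C++ to C#";

    // Hypothetical helper: number of matches a pattern finds in Input.
    public static int CountMatches(string pattern)
    {
        return Regex.Matches(Input, pattern).Count;
    }

    static void Main()
    {
        // Greedy: .+ runs to the last '>' in the string, giving one long match.
        Console.WriteLine(CountMatches("<.+>"));   // prints 1

        // Non-greedy: .*? stops at the first '>' it can, giving one match per tag.
        Console.WriteLine(CountMatches("<.*?>"));  // prints 3
    }
}
```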

The non-greedy method above is a good one. I'm just adding the other way you could make a regex do this correctly: "<[^>]*>", which is '<' followed by zero or more characters that are not '>', followed by '>'. Arild's answer is better in this case, but knowing how to do both is important for other situations.
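
One such situation, as a rough sketch: by default . does not match a newline in .NET regexes, while a negated character class like [^>] does. The input string and the names below (NegatedClassDemo, ExtractTags) are invented for the example.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

static class NegatedClassDemo
{
    // Hypothetical helper: all matches of the pattern as an array.
    public static string[] ExtractTags(string input, string pattern)
    {
        return Regex.Matches(input, pattern)
                    .Cast<Match>()
                    .Select(m => m.Value)
                    .ToArray();
    }

    static void Main()
    {
        // The first tag is split across two lines.
        string input = "<first\ntag> and <second>";

        // '.' stops at the newline (no RegexOptions.Singleline), so the first tag is missed.
        Console.WriteLine(ExtractTags(input, "<.*?>").Length);   // prints 1

        // [^>] matches any character except '>', newlines included.
        Console.WriteLine(ExtractTags(input, "<[^>]*>").Length); // prints 2
    }
}
```

On the single-line string from the original question, both patterns return the same three tags.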
