Dragon_Strike

[.net] regex removing HTML tags with Match


I would like to create a regex that matches everything in a string except the characters inside "<" ">" tags. I know this would be quite simple with Regex.Replace("<[^>]*>"), but I need to do it with Regex.Match... Is there any way to do this? Something like Match(/*negate*/"<[^>]*>"). EDIT: I would also like to point out that there isn't always whitespace between the tags and the text.

I can't be sure without knowing what you're trying to accomplish in the end, but I would recommend avoiding regex (which is really overkill here) and using text parsing functions instead. You should be able to write an enumerator function (using yield return, etc.) that does the job in fewer than 5 lines. You'll also end up with something much more readable if you can avoid regexes.
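
For illustration, here is a rough sketch of what such an enumerator could look like. The names TagStripper and TextOutsideTags are made up for this example, and it assumes that every "<" opens a tag and the next ">" closes it, so it is not a real HTML parser. It also comes out a bit longer than five lines once the edge cases are handled.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of any library.
public static class TagStripper
{
    // Yields each run of text that lies outside <...> tags, in order.
    public static IEnumerable<string> TextOutsideTags(string input)
    {
        int start = 0;
        while (start < input.Length)
        {
            // Find the next tag opener; if none is left, the rest is plain text.
            int open = input.IndexOf('<', start);
            if (open < 0) open = input.Length;

            if (open > start)
                yield return input.Substring(start, open - start);
            if (open == input.Length)
                yield break;

            // Skip to the end of the tag. An unterminated tag is dropped
            // together with everything after it (an assumption of this sketch).
            int close = input.IndexOf('>', open);
            if (close < 0)
                yield break;
            start = close + 1;
        }
    }
}
```

Joining the pieces with string.Concat(TagStripper.TextOutsideTags("<p>Hello <b>world</b>!</p>")) gives "Hello world!". It does not matter whether there is whitespace between tags and text, because the scan only looks at the angle brackets.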

Quote:
Original post by Dragon_Strike
I know this would be quite simple with Regex.Replace("<[^>]*>"),

but I need to do it with Regex.Match...

Is there any way to do this?

No. You can't remove text with Match. A match only tells you whether (and where) your input text matches the specified pattern, optionally capturing subgroups. You'll need additional processing to get the equivalent of Replace: match all the desired pieces of text and then concatenate them.
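
As a sketch of that match-and-concatenate idea, something along these lines should work. MatchStripper and StripTags are hypothetical names chosen for this example, and the pattern assumes that "<" and ">" only ever appear as tag delimiters. It relies on a lookbehind so that a run of text only counts if it starts at the beginning of the input or directly after a ">".

```csharp
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of any library.
public static class MatchStripper
{
    // One or more non-bracket characters that begin either at the start of
    // the input or immediately after a closing '>'. Tag names and attributes
    // are skipped because they are preceded by '<' or by other tag characters.
    private static readonly Regex TextRun = new Regex(@"(?<=^|>)[^<>]+");

    public static string StripTags(string input)
    {
        // Collect every matched text run and glue the runs back together.
        return string.Concat(
            TextRun.Matches(input).Cast<Match>().Select(m => m.Value));
    }
}
```

MatchStripper.StripTags("<p>Hello <b>world</b>!</p>") returns "Hello world!". The Cast<Match>() call is there so the code also compiles on older frameworks where MatchCollection is not a generic collection.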

If I understand correctly, you want to remove HTML tags from text.
I usually do it with:
stringWithHtml = Regex.Replace(stringWithHtml, @"<(.|\n)*?>", string.Empty);


Matching is for finding parts of text. Replacing is for manipulating the matched fragments.
