deadlydog

"Simple" regular expression


I need a JavaScript regular expression that searches for a word within a string and matches only if it is a whole word and is not contained within double quotes.

Finding the _Word itself is simple. I just use: new RegExp("\\b" + _Word + "\\b", "i"); which creates a regular expression that matches the given _Word in a string, making sure it is a whole word (not part of another word).

My problem is also making the regular expression not match the word when it is contained in quotes, such as "_Word" or "some times _Word happens". I'm sure this is a simple problem for those who speak regexp often. Any help would be appreciated. Thanks in advance.

[Edited by - deadlydog on July 12, 2007 10:09:12 AM]

You should be able to use negative lookahead to make sure it matches properly.

This seems to work fine in Ruby:

regexp = /\b(?!\")myfancyword(?!\")\b/
regexp.match 'I like myfancyword'
=> Match
regexp.match 'I like "myfancyword"'
=> No Match

I don't know if JavaScript supports lookahead, though...
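
For what it's worth, JavaScript regular expressions do support negative lookahead, so the pattern ports directly. The sketch below is only an illustration (the helper name matchesFancyWord is made up for this example). It also shows that the check only looks at the character right after the word, so a word in the middle of a quoted phrase still matches.

```javascript
// Hypothetical helper: a JavaScript port of the Ruby pattern above.
function matchesFancyWord(text) {
  // (?!") is a negative lookahead: it succeeds only when the next
  // character is not a double quote.
  var pattern = /\b(?!")myfancyword(?!")\b/;
  return pattern.test(text);
}

console.log(matchesFancyWord('I like myfancyword'));         // true
console.log(matchesFancyWord('I like "myfancyword"'));       // false
console.log(matchesFancyWord('I like "myfancyword a lot"')); // true, although the word is quoted
```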

Quote:
Original post by rollo
You should be able to use negative lookahead to make sure it matches properly.

This seems to work fine in Ruby:

regexp = /\b(?!\")myfancyword(?!\")\b/
regexp.match 'I like myfancyword'
=> Match
regexp.match 'I like "myfancyword"'
=> No Match

I don't know if JavaScript supports lookahead, though...


I believe I tried that regular expression, and it worked only when the _Word had a double quote directly in front of or directly behind it. For example, it would correctly skip "_Word ..." and "..._Word", but not "..._Word...". I will try your method again tomorrow when I get into work just to be sure, but I'm pretty sure it has that problem. Are there any other suggestions? Thanks.

I doubt your problem is really as simple as that. In most situations where you care about "whether something is inside a double-quoted string", it's because you're parsing something source-code-like - which means you also have to handle escaped quotes within the string.

What I would do is first write a regexp that detects double-quoted strings:

"(\\.|[^\\"])*"


That is, a quote, followed by (zero or more things which are either a backslash followed by any character - as that would always be part of the string - or a non-quote-non-backslash character), followed by a quote. (Actually, detecting escape sequences properly might be a *little* more complicated.)

Replace all instances of this pattern with nothing (in a new string, if you need to leave the original intact). Then search the *remaining* text for the word.

If you have to replace the word in the original string - good luck :)
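
A minimal sketch of the strip-then-search idea in JavaScript. The function names are made up for this example, and `word` is assumed to contain no regular expression metacharacters.

```javascript
// Hypothetical helper: removes every double-quoted string from the text,
// using the quoted-string pattern from the post above.
function stripQuotedStrings(text) {
  // A quote, then any run of escaped characters or non-quote,
  // non-backslash characters, then a closing quote.
  return text.replace(/"(\\.|[^\\"])*"/g, "");
}

// Hypothetical helper: true if `word` occurs as a whole word outside quotes.
function containsWordOutsideQuotes(text, word) {
  var wholeWord = new RegExp("\\b" + word + "\\b", "i");
  return wholeWord.test(stripQuotedStrings(text));
}

console.log(containsWordOutsideQuotes('say "word" now', "word")); // false
console.log(containsWordOutsideQuotes('a word "x"', "word"));     // true
```

Replacing with nothing can glue the text on either side of a quoted string together, so replacing with a single space instead may be the safer choice.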

Thanks for the replies. Zahlman, while your idea might work, I can't use it for what I am trying to do. I have a function that returns a regular expression which can find the given word in a string; I am not actually given the string to search through or anything like that. So my function to get the regular expression looks like:

function GetRegularExpressionToFindWholeWord(_Word)
{
    return new RegExp("\\b" + _Word + "\\b", "i");
}

Now, I know how to find the whole _Word in a string (by simply using the regular expression above), and I know how to find whether the given _Word is between quotes, using:

new RegExp("\".*" + _Word + ".*\"", "i");

I am just not sure how to combine the two into one regular expression, since regular expressions don't seem to have an AND operator (which seems odd, given that they have an OR operator). Any other suggestions on my problem would be greatly appreciated. Thanks.
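
One way to get the effect of an AND inside a single expression is a lookahead assertion, since a lookahead adds a second condition at the same position without consuming any text. The sketch below is an assumption about how that could be applied here, not something taken from the posts in this thread. It accepts the whole word only if the rest of the string contains an even number of double quotes, which assumes the quotes are balanced and never escaped. Both function names are made up for the example.

```javascript
// Hypothetical helper: escapes regular expression metacharacters in `text`.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical helper: whole word AND an even number of quotes after it.
function getRegExpForUnquotedWord(word) {
  // (?= ... $) looks ahead to the end of the string: zero or more pairs
  // of quotes, then only non-quote characters.
  var evenQuotesAhead = '(?=(?:[^"]*"[^"]*")*[^"]*$)';
  return new RegExp("\\b" + escapeRegExp(word) + "\\b" + evenQuotesAhead, "i");
}

var re = getRegExpForUnquotedWord("_Word");
var sample = '"This is quoted" Match this _Word "But not this one! _Word"';
console.log(re.exec(sample).index === sample.indexOf("_Word")); // true
console.log(re.test('"_Word"'));                                // false
```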

Your expression to find the word between quotes won't work if there are multiple quoted sections within the string being searched. For example:

"This is quoted" Match this _Word "But not this one! _Word"

Both _Words are between quotes, but you only want to match the one that is outside of matching quotes.
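
The failure is easy to reproduce. In this sketch (the helper name findQuotedSpan is made up), the greedy .* runs from the first quote in the string to the last one, so the whole example is reported as a single quoted span, including the _Word that sits outside the quotes.

```javascript
// Hypothetical helper: returns the text that the "between quotes"
// expression matches, or null if it does not match at all.
function findQuotedSpan(text, word) {
  var betweenQuotes = new RegExp("\".*" + word + ".*\"", "i");
  var match = betweenQuotes.exec(text);
  return match === null ? null : match[0];
}

var example = '"This is quoted" Match this _Word "But not this one! _Word"';
console.log(findQuotedSpan(example, "_Word") === example); // true
```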

I think this can be done by matching 0 or more pairs of quotes, followed by 0 or more non-quote characters, followed by the word you are trying to match.

Quote:
Original post by Vorpy
Your expression to find the word between quotes won't work if there are multiple quoted sections within the string being searched. For example:

"This is quoted" Match this _Word "But not this one! _Word"

Both _Words are between quotes, but you only want to match the one that is outside of matching quotes.

I think this can be done by matching 0 or more pairs of quotes, followed by 0 or more non-quote characters, followed by the word you are trying to match.

Ahh, thank you. I did not notice that, but it is a feature I will want. I am still not sure how to get it to work with finding a whole word, though. Any more suggestions, anyone? Thanks.

Nobody here knows regular expressions well enough to solve this problem? I figured this would be a simple problem, but I guess it's harder than I thought. I'm still open to any suggestions anyone might have. Thanks.

This seems to work:

^([^"]|("(([^"\\]|\\.)+)"))*\b(_Word)\b

It matches _Word on word boundaries, preceded by an even (possibly zero) number of unescaped quotation marks. That will ensure that _Word isn't in a string.
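
A sketch of how this pattern might be wrapped in JavaScript, assuming the character class is meant to exclude both quotes and backslashes. The function name is made up for this example, and `word` is assumed to contain no regular expression metacharacters. The sketch uses * where the posted pattern has +, so that an empty string "" also counts as a quoted string.

```javascript
// Hypothetical wrapper around the pattern above.
function getRegExpSkippingQuotedStrings(word) {
  // Prefix: any mix of non-quote characters and complete quoted strings
  // (a backslash inside a string always takes the next character with it),
  // then the whole word, captured in group 1.
  return new RegExp('^(?:[^"]|"(?:[^"\\\\]|\\\\.)*")*\\b(' + word + ')\\b');
}

var skipRe = getRegExpSkippingQuotedStrings("_Word");
var escaped = 'say "a \\" _Word" then _Word';
var found = skipRe.exec(escaped);
// The match always starts at the beginning of the string, so the position
// of the word is the total match length minus the length of the word.
console.log(found[0].length - found[1].length === escaped.lastIndexOf("_Word")); // true
console.log(skipRe.exec('"_Word"')); // null
```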

I think this regex works:

^([^"]|("[^"]*"))*\b(word)\b

It matches any number of non-quote characters or quoted strings and then the word you are looking for.
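
Plugged into a function like the one posted earlier in the thread, this might look as follows. It is a sketch only: both function names are made up, `_Word` is assumed to contain no regular expression metacharacters, and escaped quotes are not handled.

```javascript
// Hypothetical variant of GetRegularExpressionToFindWholeWord.
function GetRegularExpressionToFindUnquotedWord(_Word) {
  // Non-quote characters or complete quoted strings, then the whole word.
  return new RegExp('^(?:[^"]|"[^"]*")*\\b(' + _Word + ')\\b', "i");
}

// Hypothetical helper: position of the matched word, or -1 if there is none.
function indexOfUnquotedWord(text, _Word) {
  var match = GetRegularExpressionToFindUnquotedWord(_Word).exec(text);
  // The match starts at the beginning of the string, so subtract the
  // length of the captured word from the length of the whole match.
  return match === null ? -1 : match[0].length - match[1].length;
}

var thread = '"This is quoted" Match this _Word "But not this one! _Word"';
console.log(indexOfUnquotedWord(thread, "_word") === thread.indexOf("_Word")); // true
console.log(indexOfUnquotedWord('"word"', "word"));                            // -1
```

Because the prefix is greedy, this reports the last unquoted occurrence when the word appears outside quotes more than once.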
