deadlydog

Regular expression problem

This topic is 4107 days old which is more than the 365 day threshold we allow for new replies. Please post a new topic.


I currently have a JavaScript regular expression that searches a string for a given Word. It finds a match only when the Word is not a substring of another word (it must match the whole word) and is not inside double quotes. For example, a search for "dog" in:

Hello, "My dog is" the best doggy that ever was a dog

would only find a match against the very last word (dog). The regular expression I'm using to do this is:

return new RegExp("(^(?:[^\"]|(?:\"(?:(?:[^\"])*)\"))*)\\b(" + _Word + ")\\b", "i");

My problem is that the regular expression doesn't handle apostrophes the way I would like. For example, a search for "dog" in "My dog's rule" finds a match, because the apostrophe is a non-word character. I don't want that. I want it to match only the exact string, so a search for "dog's" should match "My dog's rule", but a search for just "dog" shouldn't.

I'm fairly sure the only part of the regular expression that needs to change is:

\\b(" + _Word + ")\\b

but I'm not certain. If anyone has any ideas on how to get it to do what I want, I would greatly appreciate it. Thanks in advance.
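For anyone who wants to reproduce this, here is a minimal sketch of the behaviour described above. The function name buildWordRegex and the variable names are made up for illustration; only the pattern itself comes from the post.

```javascript
// Builds the regular expression from the post for a given search word.
// buildWordRegex is a hypothetical name; the pattern is the one quoted above.
function buildWordRegex(word) {
  return new RegExp("(^(?:[^\"]|(?:\"(?:(?:[^\"])*)\"))*)\\b(" + word + ")\\b", "i");
}

const text = 'Hello, "My dog is" the best doggy that ever was a dog';
const match = buildWordRegex("dog").exec(text);

// Group 1 is everything before the hit, group 2 is the word itself,
// so the length of group 1 is the position of the matched word.
const matchIndex = match[1].length;
console.log(matchIndex === text.lastIndexOf("dog")); // true: only the last "dog" is hit

// The apostrophe case: \b sees a boundary between "g" and "'",
// so this is the unwanted match.
const apostropheHit = buildWordRegex("dog").test("My dog's rule");
console.log(apostropheHit); // true
```

Note that the word is concatenated into the pattern unescaped, so a search word containing regex metacharacters would change the pattern.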

So what I need, instead of just breaking on word boundaries, is to break on word boundaries except at apostrophes. Does anyone have any ideas how I can write that? I just can't seem to get it right.

Also, if you are wondering what any of the tokens in the pattern mean, just do a Google search for "javascript regular expression mozilla dev".

Something I read somewhere: "You have a problem. You decide to solve it using regular expressions. You now have two problems." I think this applies to your problem. Regexps are really powerful, and I don't doubt that they can do what you want. But something simpler might be better, especially with all the string handling JavaScript already gives you.

I'd do something more along the lines of:
1) Split the string along space boundaries
2) Split it again on double-quotes
3) Loop through the list, match each word against the word you're looking for, and make sure it's not in double-quotes when you do so.
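The steps above could be sketched roughly like this. It is only one way to do it: the helper name containsWordOutsideQuotes is made up, the order is swapped (it splits on double quotes first, then into words), and it splits on any character that is not a letter, digit, underscore, or apostrophe rather than strictly on spaces, so that punctuation such as commas doesn't stick to the words. It also assumes the double quotes in the text are balanced.

```javascript
// Hypothetical helper: returns true if `word` appears as a whole word
// outside double quotes in `text`. Apostrophes count as part of a word.
function containsWordOutsideQuotes(text, word) {
  const target = word.toLowerCase();
  // Splitting on double quotes leaves the unquoted text at even indexes
  // (assumption: the quotes in `text` are balanced).
  const segments = text.split('"');
  for (let i = 0; i < segments.length; i += 2) {
    // Break the unquoted segment into tokens on anything that is not
    // a letter, digit, underscore, or apostrophe.
    const tokens = segments[i].toLowerCase().split(/[^\w']+/);
    if (tokens.includes(target)) {
      return true;
    }
  }
  return false;
}

console.log(containsWordOutsideQuotes("My dog's rule", "dog"));   // false
console.log(containsWordOutsideQuotes("My dog's rule", "dog's")); // true
```

One side effect of treating the apostrophe as a word character is that a word wrapped in single quotes, like 'dog', is not found by a search for dog.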

The only real solution I see for it is to replace \b with your own character class which doesn't include the apostrophe. So something like [ \t\n\r] (you should probably look up how javascript defines \b, and replicate it except for the apostrophe). There's probably an easier way to do it, and probably a hackish way to do it with some extra lookahead/behinds.

Another vote to use a non-regex solution. I personally tend to shy away from them in any situation that requires a lookahead/behind, but that's just me.
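For what it's worth, the "replace \b with your own class" idea could look something like the sketch below. It keeps the quote-skipping prefix from the original pattern and swaps each \b for a lookaround that rejects both word characters and apostrophes. The helper names are made up, and the lookbehind assumes an engine with ES2018 support, so treat it as a starting point rather than a drop-in fix.

```javascript
// Hypothetical helper: escapes regex metacharacters so the search word
// is matched literally.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Same quote-skipping prefix as the original pattern, but the word must
// not be preceded or followed by a word character or an apostrophe.
// Assumption: the engine supports lookbehind (ES2018 or later).
function buildApostropheAwareRegex(word) {
  const prefix = "(^(?:[^\"]|\"[^\"]*\")*)";
  return new RegExp(
    prefix + "(?<![\\w'])(" + escapeRegExp(word) + ")(?![\\w'])",
    "i"
  );
}

console.log(buildApostropheAwareRegex("dog").test("My dog's rule"));   // false
console.log(buildApostropheAwareRegex("dog's").test("My dog's rule")); // true
```

The trade-off is the same as with any apostrophe-aware boundary: a word wrapped in single quotes, like 'dog', no longer counts as a match for dog.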

Quote:
Original post by Mushu
The only real solution I see for it is to replace \b with your own character class which doesn't include the apostrophe. So something like [ \t\n\r] (you should probably look up how javascript defines \b, and replicate it except for the apostrophe).


Yeah, I thought of doing this too, but I haven't been able to find how \b is defined anywhere on the net. I am really hoping someone can think of a way to solve this problem with regular expressions. Any suggestions would be appreciated. Thanks.
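On the definition itself: in JavaScript, \b matches at any position where one side is a \w character ([A-Za-z0-9_]) and the other side is not, with the start and end of the string counting as "not". A small sketch that checks this against an emulation built from lookarounds is below. The names are made up for illustration, and the lookbehind and matchAll assume a reasonably modern engine.

```javascript
// \b matches between a \w character and a non-\w character
// (or the start/end of the string).
const wordBoundary = /\b/g;
// The same rule spelled out with lookarounds (assumes ES2018 lookbehind).
const emulatedBoundary = /(?:(?<=\w)(?!\w)|(?<!\w)(?=\w))/g;

// Hypothetical helper: lists the indexes where a zero-width pattern matches.
function boundaryPositions(pattern, text) {
  return Array.from(text.matchAll(pattern), (m) => m.index);
}

console.log(boundaryPositions(wordBoundary, "dog's"));     // [0, 3, 4, 5]
console.log(boundaryPositions(emulatedBoundary, "dog's")); // [0, 3, 4, 5]
```

The boundaries at 3 and 4 sit on either side of the apostrophe, which is why a plain \b lets "dog" match inside "dog's". Replacing \w with [\w'] in the emulated version would remove those two.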

I took a whack at this; it's a little beyond me!

*but*, I do recommend you take a look at RegEx buddy*. It's helped me a lot, and might make it a little easier to work through your problem.

Hope that helps somewhat!




*I'm in no way associated with this product!

Quote:
Original post by _Sigma
I took a whack at this; it's a little beyond me!

*but*, I do recommend you take a look at RegEx buddy*. It's helped me a lot, and might make it a little easier to work through your problem.


Yeah, I actually already tried using RegEx Buddy a little bit, but I couldn't figure out how to get what I wanted from it. If anyone else has any ideas or suggestions I would be very appreciative. Thanks.
