
# RegEx question

## Recommended Posts

How does one use a regular expression to check that a string doesn't contain XYZ in that order? The string can contain Xs, Ys, Zs, XYs, YZs, and so on, but not XYZ. I can't seem to find the right regex pattern to use. Thanks.

In Perl style:

if ($string !~ /XYZ/) { }

I believe there's a more general style using the ^, but I don't often use regexes outside of Perl.

##### Share on other sites
There are a few ways to do it with (?!), a commonly supported regex extension. I can think of ((?!XYZ).)* for matching a sequence of text that doesn't contain XYZ, or ^((?!XYZ).)*$ for checking an entire string.

(Actually, the first one isn't perfect; given "ABCXYZ" it stops at "ABC" and refuses to match "ABCXY", even though "ABCXY" doesn't contain XYZ. But I'm sure it'll get you started.)
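
For anyone who wants to try these out, here is a rough sketch in Python (my choice of language, not the original poster's; the names `lacks_xyz`, `NO_XYZ` and `PREFIX` are made up for illustration). It assumes single-line strings, since `.` does not match a newline by default.

```python
import re

# Simplest check, mirroring the Perl !~ test: pass if "XYZ" is not found anywhere.
def lacks_xyz(s):
    return re.search(r"XYZ", s) is None

# Whole-string check using the negative lookahead suggested above.
NO_XYZ = re.compile(r"^((?!XYZ).)*$")

# Unanchored form: consumes characters until one of them would start "XYZ".
PREFIX = re.compile(r"((?!XYZ).)*")

for s in ["XYXZ", "ZYX XY YZ", "ABCXYZ"]:
    print(s, lacks_xyz(s), bool(NO_XYZ.match(s)))

# The unanchored form stops at "ABC", not "ABCXY".
print(PREFIX.match("ABCXYZ").group(0))
```

If all you need is a yes/no answer, the plain search is probably the easier one to read; the lookahead form is mainly useful when the check has to live inside a larger pattern.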

##### Share on other sites
I suppose I should be more specific...

Is there a way to do something like [^xyz] but that also takes order into account? I want to extract everything in a string in between two quotation marks, but within the quotation marks, the string can have a quote if it's preceded by a backslash.

Assuming that doing [^(xyz)] meant everything but xyz in that order, the regex would be:

"([^(\\")]*)"

Thank you.

##### Share on other sites
I think that "([^"\\]|\\\\|\\")*" might be more useful for that case. I assume that backslashes can be escaped, too; watch out for strings like "foo \\".
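
A quick way to check that pattern, sketched in Python with raw strings so the backslashes reach the regex engine unchanged. The sample text and the names `QUOTED` and `NAIVE` are assumptions for illustration; `NAIVE` is a hypothetical variant that leaves out the escaped-backslash alternative, to show why it matters.

```python
import re

# The pattern from the post above.
QUOTED = re.compile(r'"([^"\\]|\\\\|\\")*"')

# Two quoted strings: one with escaped quotes, one ending in an escaped backslash.
text = r'say "he said \"hi\"" then "foo \\" end'
matches = [m.group(0) for m in QUOTED.finditer(text)]
print(matches)

# Hypothetical variant without the \\\\ alternative: it cannot get past
# the escaped backslash, so it finds no complete string here.
NAIVE = re.compile(r'"(?:[^"\\]|\\")*"')
print(NAIVE.search(r'"foo \\" end'))
```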

##### Share on other sites
Quote:
 Original post by Beer Hunter
 I think that "([^"\\]|\\\\|\\")*" might be more useful for that case. I assume that backslashes can be escaped, too; watch out for strings like "foo \\".

Heh. I was going to make it more complicated by considering it as "runs of non-(quote/backslash) characters separated by (quote or backslash)", basically the same thing with a * after the [^"\\]. Silly, and slower unless the regex compiler is way better than I expect.

FWIW though, it might be nicer to collect the entire string as a group, and not capture the other items, thus: "((?:[^"\\]|\\\\|\\")*)". Then, instead of translating and pasting the individual things together, you could run a second pass of regexes on the captured result in order to translate the \\'s to \'s and \"'s to "'s. :)
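
To illustrate the difference, a small Python sketch (the names `per_item` and `whole` and the sample input are made up). With a capturing group under the `*`, Python's `re` keeps only the last item the group matched; wrapping non-capturing alternatives in a single group yields the whole body.

```python
import re

text = r'"a\"b"'

# Capturing group inside the repetition: group(1) holds only the last item.
per_item = re.compile(r'"([^"\\]|\\\\|\\")*"')
print(per_item.search(text).group(1))

# Non-capturing alternatives inside one capturing group: group(1) is the full body.
whole = re.compile(r'"((?:[^"\\]|\\\\|\\")*)"')
print(whole.search(text).group(1))
```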

(P.S. That's actually harder than it sounds to get right, because of e.g. the possibility of a large number of backslashes followed by a " - the correct translation depends on whether it's an odd or even backslash count... so actually, maybe you should just go with the other way :) )
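
One way to sidestep the odd/even counting question is to unescape in a single left-to-right pass rather than two separate substitutions. This is only a sketch in Python, assuming the two escapes discussed here (`\\` and `\"`) are the only ones; `unquote`, `BODY` and `ESCAPE` are hypothetical names.

```python
import re

BODY = re.compile(r'"((?:[^"\\]|\\\\|\\")*)"')
ESCAPE = re.compile(r'\\(["\\])')

def unquote(text):
    """Return the unescaped body of the first quoted string in text, or None."""
    m = BODY.search(text)
    if m is None:
        return None
    # Each match consumes one backslash plus the character it escapes,
    # so a run of backslashes is paired off from left to right.
    return ESCAPE.sub(lambda e: e.group(1), m.group(1))

# Three backslashes then a quote: an escaped backslash followed by an escaped quote.
print(unquote(r'"a \\\" b"'))
```

Because every escape is consumed together with the character it protects, the pass never has to look back and count how many backslashes came before a quote.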
