Dospro

Regular expressions to tokenize



Hi everyone.

 

I hope you can give me a hand with this:

 

I have a list of tuples like this:

table = [
("aaa", "a","b", 0),
("aaa", "a", "c", 1),
("aaa", "b", "a", 2),
("aaa", "b", "c", 3),
("aaa", "c", "a", 4),
("aaa", "c", "b", 5),
("aaa", "a", "*", 6),
("aaa", "b", "*", 7),
("aaa", "c", "*", 8),
...
]

More or less like that. It's huge.
Now, the * means one digit.
So, as you can see, this table holds some kind of (not so) regular expressions.
The idea is that if you write "aaa a b" you get 0. If you write "aaa c 1" you get 8.

 

The program actually works, but I want to change it to use Python regular expressions.

I managed to write a regular expression that matches the strings and keeps the parts in groups:

r'(?P<matcha>aaa[\s]+(a|b|c)[\s]+(a|b|c|[\d]))'

This matches all the tuples in the example table.

My question.

Is there a way to get a specific integer from a match (like the one in the table)?

 

Or maybe a way to translate the match back into the "table-regular-expression-format"?
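One possible way to get an integer out of a match, sketched here under assumptions that are not in the post: build one named group per table row, join them with `|`, and read `match.lastgroup` to see which row matched. The helper `row_to_pattern`, the `rowN` group names, and the four-row table are made up for illustration, and a very large table may make the combined pattern slow.

```python
import re

# Illustrative subset of the table from the post (the real one is much larger).
table = [
    ("aaa", "a", "b", 0),
    ("aaa", "a", "c", 1),
    ("aaa", "a", "*", 6),
    ("aaa", "c", "*", 8),
]

def row_to_pattern(index, row):
    """Turn one table row into a named regex alternative (hypothetical helper)."""
    *tokens, _value = row
    # "*" means exactly one digit; every other token is matched literally.
    parts = [r"\d" if tok == "*" else re.escape(tok) for tok in tokens]
    return "(?P<row%d>%s)" % (index, r"\s+".join(parts))

# One big alternation: (?P<row0>...)|(?P<row1>...)|...
combined = re.compile("|".join(row_to_pattern(i, row) for i, row in enumerate(table)))

def lookup(text):
    """Return the integer of the row that matches text, or None."""
    m = combined.fullmatch(text.strip())
    if m is None:
        return None
    # lastgroup is the name of the alternative that matched, e.g. "row3".
    return table[int(m.lastgroup[3:])][3]
```

With this subset, `lookup("aaa a b")` gives 0 and `lookup("aaa c 1")` gives 8, the same results described in the post.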

 

It looks like some sort of assignment, but why not throw a PLY scanner at it?
That does all the hard work for you.

Otherwise, I am not quite convinced that a RE is a good solution for sequence recognition when you're in a hurry.


I think I'm missing something here.

 

If you want to retrieve the fourth element of the tuple based on the other three, is there any reason for not using a good ol' dictionary and having "aaa a b" as the key and 0 as the value?

 

Also, how is a regular expression that matches all the possible strings you are using going to help to retrieve the number associated with a specific string?
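The dictionary lookup suggested above could look roughly like this. It is a minimal sketch: the three-row table and the name `by_text` are illustrative, and rows containing `*` are not handled here because the query would first need its digits normalized.

```python
# Illustrative subset of the table from the original post.
table = [
    ("aaa", "a", "b", 0),
    ("aaa", "a", "c", 1),
    ("aaa", "b", "a", 2),
]

# Key on the first three fields joined by spaces, store the fourth as the value.
by_text = {" ".join(row[:3]): row[3] for row in table}

def lookup(text):
    """Return the integer for text, or None if there is no such key."""
    # Collapse runs of whitespace so "aaa   a b" finds the same key as "aaa a b".
    return by_text.get(" ".join(text.split()))
```

A dictionary lookup takes roughly constant time however large the table gets, whereas scanning a list of tuples has to check rows one by one.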


Try to avoid regular expressions whenever possible. They are very powerful for what they are designed for (ensuring a text matches a pattern) but are often overused, hurting performance and readability. Your case is not what regex is for. There is no source text and no pattern to match. You will have problems later trying to extend your solution or debugging bizarre edge cases.


Try to avoid regular expressions whenever possible.

That's an error in the other direction. Use regex when it's the right tool and avoid it when it's not. This is a case where it's definitely not.

 

I think Avalander is on the right path here with just having an associative container, but I think OP may have a very wrong idea about how regex is typically used.


Ok. Thanks for the advice.

I think I will use regular expressions for matching some of the "generic characters": numbers to *, etc.

 

I thought I could use re to get the number.

By the way, thanks for helping me see that a dict is far better than a list of tuples.
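A rough sketch of that plan, assuming the only generic character is `*` for a single digit: use `re` just to turn each token into its generic form, then let the dict do the lookup. The names `rules`, `DIGIT`, and `normalize` are invented for this example.

```python
import re

# Dict version of a few table rows: (token, token, token) -> integer.
rules = {
    ("aaa", "a", "b"): 0,
    ("aaa", "a", "*"): 6,
    ("aaa", "c", "*"): 8,
}

# A token that is exactly one digit is treated as the "*" wildcard.
DIGIT = re.compile(r"\d")

def normalize(token):
    """Map a single-digit token to "*", leave anything else unchanged."""
    return "*" if DIGIT.fullmatch(token) else token

def lookup(text):
    """Return the integer for text, or None if no rule applies."""
    tokens = tuple(normalize(tok) for tok in text.split())
    return rules.get(tokens)
```

Here the regular expression only classifies individual tokens, so adding another generic character would mean one more pattern in `normalize` rather than a bigger table-wide expression.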


That's an error in the other direction


Yes, but I've just seen it too many times. A dev learns about regex, then "Whoa! Shiny! I can do so many things with that!" And you get abominations like parsing HTML to get the page title, bloated beyond repair to eliminate false positives in headers, comments and JS. Thanks, but no thanks :) I would rather err in this direction: use an old-fashioned search if it's viable, and use regex only when I actually gain something, since it always sacrifices readability.
