[.net] Regex insanity

6 comments, last by Machaira 15 years, 8 months ago
This isn't actually game related. I'm working on a project that has to rip file names out of a readout I got from a flat-file database for a CNC system; the goal is to convert the file names stored in the database. I have a ton of lines like the one below, and I need to pull only two values out of each:

"LTDADBCH_1621-002.PCH","001","PPGM"," "," "," ",2,466," ",0,"PROVEN","NONE","19351748.PCH","DNC","2007/07/12","D:\ITCM39\NEWFILES\LTDADBCH_1621-002.PCH","14:22:28"," "," "," "," "," "," "," "," "," "," "," "," "," ",1000000007,"N"," "," "," ",0,

There are something like ten thousand of these. I need the "LTDADBCH_1621-002.PCH" value and the ","D:\ITCM39\NEWFILES\LTDADBCH_1621-002.PCH" value. Any clue how to do this? I have been playing with regex for about a day trying to get it, and the "," separators keep messing me up. Any hints, or pointers to where I should look for more info?
\".*?\.PCH\" will match both "LTDADBCH_1621-002.PCH" and "D:\ITCM39\NEWFILES\LTDADBCH_1621-002.PCH". Why do you need the leading quotation mark and comma on the full path?
I have thousands of those exact lines as well as thousands of OTHER lines of gobbledygook. I ONLY want the values from lines like the one above, and only those two parts of each line, which means I need to exclude all the other rows. Other rows contain file names too, so I need to make sure I only catch rows that match this form, in this order.

I tested your regex and got these results:

"001","PPGM"," "," "," ",2,466," ",0,"PROVEN","NONE","19351748.PCH"

and

"DNC","2007/07/12","D:\ITCM39\NEWFILES\LTDADBCH_1621-002.PCH"

So that won't work, though it is closer than I was getting on my own.
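For anyone reading along, here is a minimal sketch of why the lazy pattern overshoots. It is written in Python rather than .NET (my assumption for illustration; this particular pattern behaves the same way in both engines), and `line` is the sample row from the first post, truncated after the time field.

```python
import re

# Sample row from the first post, truncated after the time field.
line = (r'"LTDADBCH_1621-002.PCH","001","PPGM"," "," "," ",2,466," ",0,'
        r'"PROVEN","NONE","19351748.PCH","DNC","2007/07/12",'
        r'"D:\ITCM39\NEWFILES\LTDADBCH_1621-002.PCH","14:22:28"')

# The lazy pattern suggested above. ".*?" ends each match as early as it can,
# but each match still STARTS at the first quote the scan reaches.
lazy = re.compile(r'".*?\.PCH"')
matches = lazy.findall(line)

for m in matches:
    print(m)
```

The first match is the bare file name because the row happens to start with it. Every later match begins at the next available quote, which belongs to some unrelated field, and then runs forward until it reaches a `.PCH"`.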
Why use a regex? Split on line, split on comma, pull the two indexes you need, strip quotes. Done.
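A rough sketch of that split approach, in Python rather than .NET (an assumption for illustration). The field positions 0 and 15 come from counting the sample row in the first post, and `extract_names` is a hypothetical helper name, not anything from the original project. The standard `csv` module handles the quote stripping.

```python
import csv

# Sample row from the first post, truncated after the time field.
line = (r'"LTDADBCH_1621-002.PCH","001","PPGM"," "," "," ",2,466," ",0,'
        r'"PROVEN","NONE","19351748.PCH","DNC","2007/07/12",'
        r'"D:\ITCM39\NEWFILES\LTDADBCH_1621-002.PCH","14:22:28"')

def extract_names(row_text):
    """Return (field 0, field 15) for a row shaped like the sample, else None."""
    fields = next(csv.reader([row_text]))  # splits on commas, strips the quotes
    # Assumed row test: enough fields, and the first one is a .PCH file name.
    if len(fields) <= 15 or not fields[0].upper().endswith(".PCH"):
        return None
    return fields[0], fields[15]

names = extract_names(line)
print(names)
```

Rows that are too short, or that do not start with a `.PCH` name, come back as `None`, which is one way to skip the other kinds of rows in the file.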
Because the boss is comfortable with regular expressions, he understands and trusts regular expressions, and he wants it done with regular expressions. I just work here, rofl.



WOOT you helped me figure this out.

^(?<filename1>\".*?\.PCH\").*(?<filename2>\".*?\.PCH\").*(?<filename3>\".*?\.PCH\")

This rips out the first, the middle, and the last file name (I'll just ignore the middle one, hehe).
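Here is the same idea as a small Python sketch (not the .NET code actually used; Python spells named groups `(?P<name>...)` where .NET uses `(?<name>...)`). The sample row is the one from the first post, truncated after the time field.

```python
import re

# Sample row from the first post, truncated after the time field.
line = (r'"LTDADBCH_1621-002.PCH","001","PPGM"," "," "," ",2,466," ",0,'
        r'"PROVEN","NONE","19351748.PCH","DNC","2007/07/12",'
        r'"D:\ITCM39\NEWFILES\LTDADBCH_1621-002.PCH","14:22:28"')

# Anchored at the start of the line, with three named captures.
anchored = re.compile(
    r'^(?P<filename1>".*?\.PCH").*'
    r'(?P<filename2>".*?\.PCH").*'
    r'(?P<filename3>".*?\.PCH")'
)

m = anchored.match(line)
if m:
    print(m.group("filename1"))  # bare file name at the start of the row
    print(m.group("filename3"))  # full path

# A row that does not begin with a quoted .PCH name is rejected outright.
other = anchored.match('0,"A.PCH","B.PCH","C.PCH"')
```

The greedy `.*` between the groups pushes each later capture as far right as it can go while still leaving room for the groups after it, so on this row the second and third groups land exactly on the second and third `.PCH` names.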
Or there's always \"[^".]*\.PCH\".

Of course, as with all things regex, there are a billion solutions to one problem.
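A quick Python sketch of the negated-class version (Python instead of .NET is my assumption for illustration; the character class works the same way in both). Because `[^".]` cannot cross a quote, each match stays inside a single quoted field.

```python
import re

# Sample row from the first post, truncated after the time field.
line = (r'"LTDADBCH_1621-002.PCH","001","PPGM"," "," "," ",2,466," ",0,'
        r'"PROVEN","NONE","19351748.PCH","DNC","2007/07/12",'
        r'"D:\ITCM39\NEWFILES\LTDADBCH_1621-002.PCH","14:22:28"')

# A quote, then anything except quotes and dots, then .PCH and a closing quote.
tight = re.compile(r'"[^".]*\.PCH"')
found = tight.findall(line)

for name in found:
    print(name)
```

One caveat: since the class also excludes dots, a name with an extra dot in it (say `"a.b.PCH"`) would not match at all, so this only fits if the real file names never contain one.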

In time the project grows, the ignorance of its devs it shows, with many a convoluted function, it plunges into deep compunction, the price of failure is high, Washu's mirth is nigh.

I went with ^ because there are other lines that contain the same pattern in the MIDDLE of the line, so anchoring makes sure I only catch it when it appears at the start of a line >.< The readout this program generates is horrible.

Thanks for the help, all. My boss will be happy. Now to idiot-proof the status messages (I just ran a test and it took 10 minutes to process one of these files, ugh!).
Quote: Original post by addtheice
because the boss is comfortable with regular expressions, he understands and trusts regular expressions, and he wants it done with regular expressions. I just work here rofl.

You must have a PHB. [grin]

Former Microsoft XNA and Xbox MVP | Check out my blog for random ramblings on game development

