# Unity [java] I need some help with regex...


## Recommended Posts

Hello all,

Alright, since most of you will suspect this, I'll just say it. This is for a homework problem, but regex isn't part of the homework. They want us to parse a file by reading in each line and then parse it manually by going through each character. I am trying to do it using regular expressions just because it seems to be a more elegant solution and a good opportunity to learn about them.

Ok, so the file contains info about books in this format:

"The title goes here" [publishing year] [price] "Author's name here"

The quotes are actually there around the title and the author. So, I'll start with the title. I am doing something like this:

String title = lineScan.next("\".*\"");

where lineScan is a Scanner object for the current line in the file. That expression \".*\" means a string with at least one character between quotes, right? But when I run this, it throws a java.util.InputMismatchException. It works if I do this, though:

    lineScan.useDelimiter("\"");
    String title = lineScan.next();

But then I try to change the delimiter back to whitespace like this:

lineScan.useDelimiter("\\s+");

That means any number of spaces, doesn't it? But then it throws a NumberFormatException, because it looks like it reads in extra whitespace.

Anyway, can anybody give me some help with this? I have read the Java API docs on regular expressions, the Scanner object, all that; but I could have missed or misunderstood something.

Thanks,
Svenjamin

##### Share on other sites
Ugh, the regex and string escapes...

The regex string you're looking for can be written in single line. As it happens, the Scanner tutorial gives you the source:
    Scanner s = new Scanner(input);
    s.findInLine( ??? );
    MatchResult result = s.match();
    for (int i=1; i<=result.groupCount(); i++)
        System.out.println(result.group(i));
The ??? is what you must figure out - and yes, it's a single string.

The problem with quoted text is that . matches the quote character too, so .* apparently runs straight past the closing quote. Hard to tell, since there's no output. So to match everything within quotes, search for just that - "everything that is not quote".

If you want to capture something, it'll usually be in the form: " (\\d+) " - everything outside of () will be matched literally.

So your final expression will look something like the Scanner example:
"\"(???)\" [(???)] [(???)] \"(???)\"";

??? is for you to find out - those are the regexes that will match the variable parts, I'm telling too much already.

Also - homework may or may not be about exceeding yourself. Sometimes, getting something done on your own matters more, so an elaborate solution you cannot fully explain (and regex is a very annoying topic) might not have the expected results.

##### Share on other sites
Hey, Thanks for the quick reply.
I think I can figure it out now, I didn't even think that the . might include quotes as well. Thanks for pointing that out. Thanks for not giving the solution as well, I will enjoy figuring it out myself.

Also, I appreciate your advice about homework, But this is just an intro to Java course. I have been doing C++ for a few years, so there is nothing new conceptually and I enjoy trying to make things harder this way. But again thanks for the advice.

Thanks again for the help,
Svenjamin

##### Share on other sites
Ok, I'm back. I think I've got the expressions I'm going to use figured out, but I'm having some trouble implementing them. I have been testing different expressions using this program.
So to extract the title which is surrounded by quotes I use this expression:

\"[\w[\s]]+\"

And that works like a charm in the test program, but then when I try and write it into my code like this:

String title = input.next("\"[\\w[\\s]]+\"");

I still get the input mismatch exception.
I also tried just printing out the results using the method in the example that you pointed out, but it throws another exception saying that no results were found.
Is there something I am missing when putting the expression into the code?

Thanks,
Svenjamin

##### Share on other sites
I think you have to escape your backslashes as well as the quotes, because \" is also an escape in C strings (which I assume is the same for Java; sorry about my Java ignorance [smile]).

so try:

String title = input.next('\\"[\\w[\\s]]+\\"');

or

String title = input.next("\\\"[\\w[\\s]]+\\\"");

Edit: This post has been heavily modified from the original because I found more problems than the ones I first noticed.

##### Share on other sites
Unfortunately, the first method you suggested doesn't compile in java, only one character can be between single quotes. And the second method still throws the exception. Thanks for trying though, I appreciate it.

Svenjamin

##### Share on other sites
Quote:
Original post by Kwizatz
I think you have to escape your backslashes as well as the quotes...

No.

The issue is that regex, by default, is greedy. For example:
    # this is Perl, because regex is most elegant in Perl [smile]
    $str = "\"Ponderous Book Title: Opaque Subtitle\" [2007] [100] \"Sonaiya, Oluseyi\"";
    $str =~ /\".*\"/;
    print "$&\n";

The output is "Ponderous Book Title: Opaque Subtitle" [2007] [100] "Sonaiya, Oluseyi". The regex will match any character, as indicated by .* until matching the next character would break the regex. What I need to do is tell it to match the smallest valid sequence, or make it non-greedy:

    # this is Perl, because regex is most elegant in Perl [smile]
    $str = "\"Ponderous Book Title: Opaque Subtitle\" [2007] [100] \"Sonaiya, Oluseyi\"";
    $str =~ /\".*?\"/; # NOTE THE QUESTION MARK!
    print "$&\n";

The output this time is "Ponderous Book Title: Opaque Subtitle".

Enlightenment beckons (with help, if your Perl-fu is weak).
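
For readers following along in Java rather than Perl, the same greedy versus non-greedy behaviour can be sketched with java.util.regex. The class name GreedyDemo and the helper firstMatch are made up for this illustration; the sample line is the one from the Perl snippet.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; not part of the original poster's program.
class GreedyDemo {
    // Returns the first substring of line matched by regex, or null if there is none.
    static String firstMatch(String regex, String line) {
        Matcher m = Pattern.compile(regex).matcher(line);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String line = "\"Ponderous Book Title: Opaque Subtitle\" [2007] [100] \"Sonaiya, Oluseyi\"";
        // Greedy .* keeps going to the last quote on the line.
        System.out.println(firstMatch("\".*\"", line));
        // Reluctant .*? stops at the first closing quote.
        System.out.println(firstMatch("\".*?\"", line));
    }
}
```

Note that in a Java string literal only the quote needs a string-level escape here; the dot, star and question mark pass through to the regex engine unchanged.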

##### Share on other sites
Alright, what you are saying about greediness makes sense, but I didn't think it would matter, since neither \w nor \s matches the double quote character, right? So wouldn't it terminate anyway as soon as it hits the second quote?

Aside from that, I tried this:

input.next("(\"[\w[\s]]+\"){1}");

but that still didn't work. I'm not sure it is correct syntax though. There seem to be a lot of subtle nuances with regex syntax.

Thanks for the help so far,
Svenjamin

EDIT:

The first title in the file is "The Poky Little Puppy", so I tried doing:

input.next("\"The Poky Little Puppy\"");

And even that threw the exception still. Any insights there?

EDIT 2:
I solved it! (sort of) I gave up on the next() method because it still threw the exception even if I used the exact string for the pattern. Anyway, I used the findInLine(Pattern p) method to extract the different parts.
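
A likely reason next() failed even with the exact title as the pattern, offered as a sketch rather than a confirmed diagnosis of the original program: Scanner.next(pattern) tests the pattern against the next token only, and with the default whitespace delimiter that token is just "The (opening quote plus first word). findInLine ignores delimiters, which is why it works. The class and method names below are invented for the illustration, and the year, price and author in the sample line are placeholder values.

```java
import java.util.InputMismatchException;
import java.util.Scanner;

// Hypothetical demo class comparing the two Scanner methods.
class NextVsFindInLine {
    // Calls Scanner.next(regex); returns the token, or null if next() throws InputMismatchException.
    static String viaNext(String line, String regex) {
        Scanner sc = new Scanner(line);
        try {
            return sc.next(regex);
        } catch (InputMismatchException e) {
            return null; // the next whitespace-delimited token did not match the pattern
        }
    }

    // Calls Scanner.findInLine(regex), which searches the line without regard to delimiters.
    static String viaFindInLine(String line, String regex) {
        return new Scanner(line).findInLine(regex);
    }

    public static void main(String[] args) {
        // Placeholder record: only the title comes from the thread.
        String line = "\"The Poky Little Puppy\" [2000] [5] \"Some Author\"";
        String exact = "\"The Poky Little Puppy\"";
        System.out.println(viaNext(line, exact));        // null: the token is only "The
        System.out.println(viaFindInLine(line, exact));  // the full quoted title
    }
}
```

By the same reasoning, next() with a quoted pattern would succeed only for a title with no spaces in it, since only then is the whole quoted title a single token.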

Svenjamin

[Edited by - Svenjamin on October 31, 2007 5:17:31 PM]

##### Share on other sites
Quote:
Original post by Oluseyi
Quote:
Original post by Kwizatz
I think you have to escape your backslashes as well as the quotes...

No.

The issue is that regex, by default, is greedy. For example:
    # this is Perl, because regex is most elegant in Perl [smile]
    $str = "\"Ponderous Book Title: Opaque Subtitle\" [2007] [100] \"Sonaiya, Oluseyi\"";
    $str =~ /\".*\"/;
    print "$&\n";

This is why I gave a pretty strong hint in:
Quote:
 "everything that is not quote"

Since this is solved now, the expression I used for the original problem was:
s.findInLine("\"([^\"]+)\" \\[(\\d+)\\] \\[(\\d+)\\] \"([^\"]+)\"");

The "any character" is implied, so for quoted strings, I simply match "not quote".
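
Put together with the Scanner tutorial loop from earlier in the thread, a whole-line parse might look like the sketch below. It assumes the year and price sit in literal square brackets and are plain integers, as in the sample line from the Perl post; BookLineParser and parse are hypothetical names, not part of anyone's actual homework code.

```java
import java.util.Scanner;
import java.util.regex.MatchResult;

// Hypothetical helper illustrating findInLine plus capture groups.
class BookLineParser {
    // "title" [year] [price] "author"; the brackets are escaped so they match literally.
    static final String BOOK = "\"([^\"]+)\" \\[(\\d+)\\] \\[(\\d+)\\] \"([^\"]+)\"";

    // Returns {title, year, price, author}, or null if the line does not fit the format.
    static String[] parse(String line) {
        Scanner s = new Scanner(line);
        if (s.findInLine(BOOK) == null) {
            return null; // no match, so calling s.match() would throw IllegalStateException
        }
        MatchResult result = s.match();
        String[] fields = new String[result.groupCount()];
        for (int i = 1; i <= result.groupCount(); i++) {
            fields[i - 1] = result.group(i); // groups are numbered from 1
        }
        return fields;
    }

    public static void main(String[] args) {
        String line = "\"Ponderous Book Title: Opaque Subtitle\" [2007] [100] \"Sonaiya, Oluseyi\"";
        for (String field : parse(line)) {
            System.out.println(field);
        }
    }
}
```

Checking the return value of findInLine before calling match() avoids the "no results found" exception mentioned earlier in the thread.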

##### Share on other sites
Quote:
Original post by Svenjamin
Aside from that, I tried this:
input.next("(\"[\w[\s]]+\"){1}");

You have a lot of superfluous nesting there. \"[\w\s]+\" is sufficient.

Glad you found the execution problem, though. [smile]
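
As a quick check of the nesting point, the sketch below compares the nested and flat character classes in Java, where a class nested inside another is treated as a union. It also shows a limitation of both: a title containing punctuation, such as the colon in the sample title from the Perl post, is rejected, whereas the "not quote" class accepts it. TitleClassDemo and its constants are names invented for this example.

```java
import java.util.regex.Pattern;

// Hypothetical demo class comparing three ways to match a quoted title.
class TitleClassDemo {
    static final String NESTED = "\"[\\w[\\s]]+\"";  // union of \w and a nested [\s]
    static final String FLAT = "\"[\\w\\s]+\"";      // same set of characters, written flat
    static final String NOT_QUOTE = "\"[^\"]+\"";    // anything except a double quote

    // True if the whole text matches the regex.
    static boolean matches(String regex, String text) {
        return Pattern.matches(regex, text);
    }

    public static void main(String[] args) {
        String[] titles = {
            "\"The Poky Little Puppy\"",
            "\"Ponderous Book Title: Opaque Subtitle\""
        };
        for (String title : titles) {
            System.out.println(title
                + " nested=" + matches(NESTED, title)
                + " flat=" + matches(FLAT, title)
                + " notQuote=" + matches(NOT_QUOTE, title));
        }
    }
}
```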
