Mage2k

Parsing an address...


OK, here's the situation I'm in: I need to parse a customer's address information, which is going to come to me like so:

Name, Address1[,][Address2[,]]Town/City[,][County/State][,]Postcode[,][Country]

Items in brackets are optional. Just about every item on there will be more than one word, and the spacing will probably be inconsistent, i.e. this is coming straight from the customer as one text field, and customers can never be trusted. I know I can probably do a regex on the postcode, but the rest I'm thinking is going to be pure guesswork... Any suggestions?

Mage2k

BTW, I'm doing this in PHP, but I don't think that should really matter...

Guest Anonymous Poster
I think PHP has strtok() that you can use to split one string into several. You could use that to split at ",". Finding out which parts in the middle are the ones missing is trickier though.
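
A rough sketch of the comma split, written in Python for brevity (in PHP, explode() plus trim() would do the same job). The function name split_fields and the sample address are made up for illustration. It only covers the easy case where the customer actually typed the commas.

```python
def split_fields(raw):
    """Split on commas, collapse runs of whitespace, drop empty pieces."""
    # " ".join(part.split()) trims each piece and squeezes inner spacing.
    pieces = (" ".join(part.split()) for part in raw.split(","))
    return [p for p in pieces if p]


# Hypothetical input with uneven spacing and a doubled comma.
print(split_fields("John  Smith, 12 High Street,, Leeds ,LS1 4AP"))
```

This says nothing about which optional field each piece is; it just produces clean tokens to reason about.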

Actually, yeah, PHP has a bunch of functions for splitting strings. The problem is not necessarily splitting up the text, but determining which tokens belong together as which field. If they input commas then it's easy, but they may not. Or, they may only put a comma between the city and postal code.

Mage2k

Guest Anonymous Poster
Well then it is really problematic. How do you for example tell the difference between a street address and a town/city? Is the second item an address or a city? An address usually is a street name and a number, correct? If there is a name, followed by a number, you have an address, else you have a town. UNLESS you have town directly followed by postcode.... If you can tell the difference between a postcode and a street number, your problem is solved.
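
The rule of thumb above could be sketched like this (Python, but nothing here is language-specific). It assumes the text has already been split into tokens, and it assumes a UK-style postcode shape purely so there is something concrete to tell apart from a street number; the function name classify and the labels are invented for the example.

```python
import re

# Assumed postcode shape: 1-2 letters, 1-2 digits, optional space,
# then a digit and 2 letters (e.g. "LS1 4AP"). Other countries differ.
POSTCODE = re.compile(r"^[A-Za-z]{1,2}\d{1,2} ?\d[A-Za-z]{2}$")


def classify(token):
    """Guess what a single token is: postcode, street line, or something else."""
    token = token.strip()
    if POSTCODE.match(token):
        return "postcode"
    if re.search(r"\d", token):
        # Contains a number but is not a postcode: treat it as a street address.
        return "street"
    return "town-or-other"


for t in ["12 High Street", "Leeds", "LS1 4AP"]:
    print(t, "->", classify(t))
```

It is still a guess: a house with a name instead of a number would come out as "town-or-other".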

Seems like an impossible problem if commas can't be guaranteed. It looks like the only guaranteed comma is the one after the name. That means you have no way to tell where Address1 ends and the next field begins. Since different countries are an option, you can't even guarantee the format of the postcode.

If there's always a comma between fields, then you can probably come up with something (unless fields can contain commas).

Well, I can match the postal code exactly, as it's in the UK (I think that, for now, my client is pretty much restricting orders to the UK) and there's pretty much a standard for how they look: "^\D\D?\d\d? ?\d\D\D$" Note that "^[a-zA-Z0-9 ]{5,8}$" will also match a postcode, but it accepts any 5-8 character combination of digits, letters, and spaces, whereas the first pattern is far more specific to the postcode's shape.
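
A quick way to compare the two patterns, sketched in Python (the same expressions should work unchanged with PHP's preg_match). One caveat: \D means "any non-digit", so it also accepts punctuation; the STRICT variant below swaps in [A-Za-z], which is my assumption about the intent. Neither version covers postcodes with a letter at the end of the first half, such as "EC1A 1BB". The sample strings are made up.

```python
import re

ORIGINAL = re.compile(r"^\D\D?\d\d? ?\d\D\D$")                   # pattern as posted
STRICT = re.compile(r"^[A-Za-z]{1,2}\d{1,2} ?\d[A-Za-z]{2}$")    # letters only (assumed intent)
LOOSE = re.compile(r"^[a-zA-Z0-9 ]{5,8}$")                       # any 5-8 letters/digits/spaces

for s in ["LS1 4AP", "M1 1AE", "12345", "--1 1!!", "EC1A 1BB"]:
    print(repr(s),
          "original:", bool(ORIGINAL.match(s)),
          "strict:", bool(STRICT.match(s)),
          "loose:", bool(LOOSE.match(s)))
```

The "--1 1!!" case is the one to watch: the posted pattern accepts it because of \D, while the letters-only variant does not.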

Let's see, from there I know that whatever comes after the postcode is the Country. I also know that whatever comes before the first number or the string "[Pp]\.?[Oo]\.?" is the name.

I'm left with a string consisting of the street portion of Address1, the Town/City, and a possible County. And, the same situation starting with Address2 if it is there.

So, I've got 3 definite matches and the rest is guesswork. Which isn't too bad, as my client realizes this; he has to read whatever I get anyway, and he's not stupid.
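
Roughly, the three definite matches could be pulled out like this. This is only a sketch in Python under a few assumptions of mine: the postcode pattern is unanchored and uses [A-Za-z] instead of \D, the last postcode-shaped match in the string is taken as the postcode, and the "PO" marker gets word boundaries so a name like "Porter" doesn't trigger it. The function and key names (rough_parse, "middle") and the sample addresses are invented.

```python
import re

# Unanchored UK-style postcode; [A-Za-z] replaces \D (assumption).
POSTCODE = re.compile(r"\b[A-Za-z]{1,2}\d{1,2} ?\d[A-Za-z]{2}\b")
# The name ends at the first digit or at a standalone "PO" / "P.O." marker.
NAME_END = re.compile(r"\d|\b[Pp]\.?[Oo]\.?(?=\W|$)")


def clean(s):
    """Collapse whitespace and trim stray commas/spaces from both ends."""
    return " ".join(s.split()).strip(" ,")


def rough_parse(raw):
    matches = list(POSTCODE.finditer(raw))
    if not matches:
        return None  # no postcode found: nothing definite to anchor on
    m = matches[-1]  # assume the last postcode-shaped match is the real one
    before, country = raw[:m.start()], raw[m.end():]
    n = NAME_END.search(before)
    name_end = n.start() if n else len(before)
    return {
        "name": clean(before[:name_end]),
        "middle": clean(before[name_end:]),  # street/town/county: still guesswork
        "postcode": clean(m.group()),
        "country": clean(country),
    }


print(rough_parse("John Smith, 12 High Street, Leeds LS1 4AP, UK"))
print(rough_parse("Jane Doe P.O. Box 7 York YO1 7HH"))
```

Everything in "middle" is left as one string on purpose, since that is the part a human has to read anyway.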

Mage2k
