## Recommended Posts

OK, here's the situation I'm in: I need to parse a customer's address information that's going to come to me like so:

Name, Address1[,][Address2[,]]Town/City[,][County/State][,]Postcode[,][Country]

Items in brackets are optional, just about every item on there will be more than one word, and the spacing will probably be inconsistent, i.e. this is coming straight from the customer as one text field, and they can never be trusted. I know I can probably do a regex on the postcode, but the rest I'm thinking is going to be pure guesswork... Any suggestions?

Mage2k

BTW, I'm doing this in PHP, but I don't think that should really matter...

##### Share on other sites
I think PHP has strtok() that you can use to split one string into several. You could use that to split at ",". Finding out which parts in the middle are the ones missing is trickier though.
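
As a rough illustration of the splitting step, here is a minimal sketch in Python rather than PHP (in PHP, `explode(',', $text)` followed by `trim()` on each piece would be the nearest equivalent). The helper name `split_fields` and the sample address are made up for the example.

```python
def split_fields(raw):
    """Split a one-line address at commas, normalising whitespace
    and dropping pieces that end up empty."""
    pieces = [" ".join(part.split()) for part in raw.split(",")]
    return [p for p in pieces if p]


fields = split_fields("John Smith,  12 High  Street, , Leeds,LS1 4AB")
print(fields)
```

This only helps when the customer actually typed the commas; it says nothing about which piece is which.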

##### Share on other sites
Actually, yeah, PHP has a bunch of functions for splitting strings. The problem is not necessarily splitting up the text, but determining which tokens belong together as what. If they input commas then it's easy, but they may not. Or they may only put a comma between the city and postal code.

Mage2k

##### Share on other sites
Well, then it is really problematic. How do you, for example, tell the difference between a street address and a town/city? Is the second item an address or a city? An address usually is a street name and a number, correct? If there is a name together with a number, you have an address, else you have a town. UNLESS you have a town directly followed by a postcode.... If you can tell the difference between a postcode and a street number, your problem is solved.
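
A minimal sketch of that rule of thumb, in Python for brevity (PHP's `preg_match()` would do the same job). The function name `guess_kind` and the postcode pattern are assumptions for illustration: the pattern only covers simple UK-style codes such as "LS1 4AB" and is not a full validator.

```python
import re

# Assumed UK-style postcode shape, e.g. "LS1 4AB"; not a full validator.
POSTCODE = re.compile(r"^[A-Za-z]{1,2}\d{1,2} ?\d[A-Za-z]{2}$")


def guess_kind(token):
    """Guess whether a token is a postcode, a street address, or a town."""
    token = token.strip()
    if POSTCODE.match(token):
        return "postcode"
    if re.search(r"\d", token):
        return "address"  # contains a number but is not a postcode
    return "town"  # no digits at all: could just as well be a name or county


for sample in ["12 High Street", "Leeds", "LS1 4AB"]:
    print(sample, "->", guess_kind(sample))
```

It will still guess wrong for things like a house name with no number, so treat the result as a hint only.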

##### Share on other sites
Quote:
 Original post by Mage2k: BTW, I'm doing this in PHP, but I don't think that should really matter...

It does. PHP has a BUNCH of functions. I point you to:

PHP: Manual > English > VI. Function Reference > CXX. String Functions (navigated from the php.net homepage).

Look into the str* series of functions, and possibly explode as well.

##### Share on other sites
Seems like an impossible problem if commas can't be guaranteed. It looks like the only guaranteed comma is after the name, which means you have no way to tell where Address1 ends and the next field begins. Since different countries are an option, you can't even guarantee the format of the postcode.

If there's always a comma between fields, then you can probably come up with something (unless fields can contain commas).

##### Share on other sites
Well, I can match the postal code pretty closely, as it's in the UK (I think that, for now, my client is pretty much restricting orders to the UK) and there's pretty much a standard for how they look: "^\D\D?\d\d? ?\d\D\D$". Note that "^[a-zA-Z0-9 ]{5,8}$" will also match that, but so will any 5-8 character combination of digits, letters, and spaces, whereas the first pattern follows the postcode layout much more closely. (Strictly, \D matches any non-digit, not just letters, so [A-Za-z] would be tighter.)
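
To pick the postcode out of the middle of a longer line, the `^` and `$` anchors have to go. Here is a hedged sketch in Python (PHP's `preg_match_all()` would be the rough equivalent). The name `find_postcode` is hypothetical, and swapping `\D` for `[A-Za-z]` is my own assumption. Like the pattern above, it does not cover codes such as "EC1A 1BB" that have a letter at the end of the first half.

```python
import re

# Unanchored variant of the pattern above, with [A-Za-z] in place of \D
# (an assumption: \D would also accept spaces and punctuation).
POSTCODE_RE = re.compile(r"\b[A-Za-z]{1,2}\d{1,2} ?\d[A-Za-z]{2}\b")


def find_postcode(text):
    """Return (start, end, postcode) for the last postcode-like match, or None."""
    matches = list(POSTCODE_RE.finditer(text))
    if not matches:
        return None
    last = matches[-1]  # the postcode sits near the end of the line
    return last.start(), last.end(), last.group(0).upper()


print(find_postcode("John Smith 12 High Street Leeds ls1 4ab UK"))
```

Taking the last match is a guess based on the field order in the first post, where the postcode comes after everything except the country.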

Let's see, from there I know that whatever is after that is the Country. I also know that whatever is before the first number or the string "[Pp]\.?[Oo]\.?" is the name.

I'm left with a string consisting of the street portion of Address1, the Town/City, and a possible County. And, the same situation starting with Address2 if it is there.
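
Put together, the steps above can be sketched like this (Python for brevity; the same regexes should carry over to PHP's `preg_match()`). `rough_split` and its field names are hypothetical, the postcode pattern is the simplified UK one, and the `\b` word boundaries around the "PO" marker are my own addition so that names like "Poppy" do not trigger it.

```python
import re

POSTCODE_RE = re.compile(r"\b[A-Za-z]{1,2}\d{1,2} ?\d[A-Za-z]{2}\b")
# The name ends at the first digit or at a "PO" / "P.O." marker.
NAME_END_RE = re.compile(r"\d|\b[Pp]\.?[Oo]\b\.?")


def _clean(piece):
    """Collapse runs of whitespace and trim stray commas and spaces."""
    return " ".join(piece.split()).strip(" ,")


def rough_split(text):
    """Split one address line into name, middle (guesswork), postcode, country."""
    matches = list(POSTCODE_RE.finditer(text))
    if not matches:
        return None  # no postcode found, nothing to anchor on
    pc = matches[-1]
    head, country = text[:pc.start()], text[pc.end():]
    name_end = NAME_END_RE.search(head)
    cut = name_end.start() if name_end else len(head)
    return {
        "name": _clean(head[:cut]),
        "middle": _clean(head[cut:]),  # street, town, county: still unsorted
        "postcode": pc.group(0).upper(),
        "country": _clean(country),
    }


print(rough_split("John Smith, 12 High Street, Leeds LS1 4AB, UK"))
```

The "middle" value is exactly the leftover string described above; nothing here tries to split it further.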

So, I've got 3 definite matches and the rest is guesswork. Which isn't too bad, as my client realizes this; he has to read whatever I get anyway, and he's not stupid.

Mage2k
