# doing searches based on address info

This topic is 4603 days old which is more than the 365 day threshold we allow for new replies. Please post a new topic.

## Recommended Posts

Hi, I have a huge database filled with address info (e.g. just house / apartment addresses), along with some other data. This table has like 10 million records. I need to allow it so the user can find that "other data" by searching based on a given address. This is kind of a tricky problem since you can't tell how people spell things, space things, abbreviate things, etc. So the way we decided to handle it was to base it on zip code and house number only. The user enters a house number and zip code, and we display any addresses that have that zip code AND contain that house number in the "address_1" field. We do the latter with a wildcard search. The problem is that with 10 million records, doing a wildcard search / full table scan takes forever. This is something that is important to be relatively fast (e.g. only a few seconds instead of 30). Anyway, I have come up with a solution and would be interested to hear input on it and solutions of your own. Basically, we change the database schema from just one table with the address / other info, into 2 tables. The first table would be called "address_and_other_info" and the second table would be called "house_number". The data comes in via a CSV file that we parse, so the process would do something like this:
for each entry
insert the address and other info into address_and_other_info table. grab the PK here
grab the address_1 field and split it by whitespace, put into array s
for each string in s
if(string contains a number)
insert this string into the "house_number" table. also assign its "parent" to be the PK of the row we inserted above


• 36
• 12
• 10
• 10
• 9
• ### Forum Statistics

• Total Topics
631359
• Total Posts
2999543
×