sql server - SQL address data is messy, how to clean it up in a query?
I have address data stored in a SQL Server 2000 database, and I need to pull out all the addresses for a given customer code. The problem is, there are a lot of misspelled addresses, missing parts, etc., so I need to clean this up somehow. I need to weed out the bad spellings, missing parts, etc., and come up with the "average" record. For example, if New York is spelled correctly in 4 out of 5 records, that should be the value returned.
I can't modify the data, validate it on input, or anything like that. I can only modify a copy of the data, or manipulate it through a query.
I got a partial answer here: addresses stored in SQL Server have many small variations (errors), but I need to allow multiple valid addresses per code.
sample data
code  | name                   | address1                   | address2    | city        | state      | zip   | timesused
10003 | american nutriton inc  | 2183 ball street           |             | olden       | utah       | 87401 | 177
10003 | ameican nutrition inc  | 2183 ball street           | po box 1504 | olden       | utah       | 87402 | 76
10003 | american nutrition inc | 2183 ball street           |             | olden       | utah       | 87402 | 24
10003 | american nutrition inc | 2183 ball street           | po box 1504 | olden       | utah       | 87402 | 17
10003 | samantha brooks        | 506 s. main street         |             | ellensburg  | washington | 98296 | 1
10003 | bemis company          | 1401 w. fourth plain blvd. |             | vancouver   | washington | 98660 | 1
10003 | cei                    | 597 vandyre boulevard      |             | wrightstown | wisconsin  | 54180 | 1
10003 | pacific pet            | 28th avenue                |             | olden       | utah       | 84401 | 1
10003 | petsmart, inc.         | 16091 north 25th street    |             | phoena      | arizona    | 85027 | 1
10003 | pet firm               | 16418 north 37th street    |             | phoena      | arizona    | 85503 | 1
desired output
code  | name                   | address1                   | address2 | city        | state      | zip
10003 | american nutrition inc | 2183 ball avenue           |          | olden       | utah       | 84401
10003 | samantha brooks        | 506 s. main street         |          | ellensburg  | washington | 98296
10003 | bemis company          | 1401 w. fourth plain blvd. |          | vancouver   | washington | 98660
10003 | cei                    | 975 vandyke road           |          | wrightstown | wisconsin  | 54180
10003 | pacific pet            | 29th street                |          | ogden       | utah       | 84401
10003 | petsmart, inc.         | 16091 north 25th avenue    |          | phoena      | arizona    | 85027
10003 | pet firm               | 16418 north 37th street    |          | phoena      | arizona    | 85503
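To make the "average record" idea concrete, here is a minimal Python sketch of per-field majority voting, assuming the rows have already been copied out of the database and that the records passed in are known to describe the same address. The function name `consensus_record` and the `FIELDS` list are made up for illustration. Each record counts once (as in the "4 out of 5 records" example) and blank values are ignored.

```python
from collections import Counter

# Hypothetical field list, taken from the column names in the sample data.
FIELDS = ["name", "address1", "address2", "city", "state", "zip"]

def consensus_record(records):
    """Build one record where each field is the most common non-blank value."""
    result = {}
    for field in FIELDS:
        # Ignore blank or missing values so they cannot win the vote.
        values = [r.get(field, "") for r in records if r.get(field, "")]
        result[field] = Counter(values).most_common(1)[0][0] if values else ""
    return result

# The four "american nutrition" rows from the sample data (timesused omitted).
rows = [
    {"name": "american nutriton inc", "address1": "2183 ball street",
     "address2": "", "city": "olden", "state": "utah", "zip": "87401"},
    {"name": "ameican nutrition inc", "address1": "2183 ball street",
     "address2": "po box 1504", "city": "olden", "state": "utah", "zip": "87402"},
    {"name": "american nutrition inc", "address1": "2183 ball street",
     "address2": "", "city": "olden", "state": "utah", "zip": "87402"},
    {"name": "american nutrition inc", "address1": "2183 ball street",
     "address2": "po box 1504", "city": "olden", "state": "utah", "zip": "87402"},
]

best = consensus_record(rows)
print(best["name"], best["zip"])  # american nutrition inc 87402
```

Voting can only pick among values that actually occur in the data, so it cannot produce a correction that appears in none of the grouped records (such as "2183 ball avenue" in the desired output). Weighting the votes by timesused is also risky here: the most-used row (177 uses) carries the misspelled name.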
The best solution is to use a CASS-certified address standardization program or service to format and validate each address. Beyond the USPS, which has tools for this, there are many third-party programs or services that provide this functionality. Address parsing is far more complicated than you might imagine, and trying to whip up a few queries to do it is fraught with peril.
Google's geocoding service is another place to look. Apparently, though, Google requires you to display the results if you use its geocoding service. That leaves using dedicated address parsers such as the USPS tools or a third-party program.
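If a dedicated service is not available, one rough fallback for keeping multiple valid addresses per code is to group near-duplicate records by string similarity before cleaning each group. The sketch below is only an illustration in Python using the standard library's `difflib`; the function name `group_similar`, the choice of fields in the comparison key, and the 0.8 threshold are all assumptions that would need tuning against real data.

```python
from difflib import SequenceMatcher

def group_similar(records, threshold=0.8):
    """Greedily group records whose name/address1/city text is nearly identical."""
    groups = []
    for rec in records:
        # Hypothetical comparison key: the fields most likely to identify an address.
        key = " ".join([rec["name"], rec["address1"], rec["city"]])
        for group in groups:
            # Compare against the first record that started the group.
            if SequenceMatcher(None, key, group["key"]).ratio() >= threshold:
                group["members"].append(rec)
                break
        else:
            # No existing group was close enough, so start a new one.
            groups.append({"key": key, "members": [rec]})
    return [g["members"] for g in groups]

# A few rows from the sample data, reduced to the fields used in the key.
sample = [
    {"name": "american nutriton inc", "address1": "2183 ball street", "city": "olden"},
    {"name": "ameican nutrition inc", "address1": "2183 ball street", "city": "olden"},
    {"name": "american nutrition inc", "address1": "2183 ball street", "city": "olden"},
    {"name": "samantha brooks", "address1": "506 s. main street", "city": "ellensburg"},
    {"name": "bemis company", "address1": "1401 w. fourth plain blvd.", "city": "vancouver"},
]

groups = group_similar(sample)
print([len(g) for g in groups])  # [3, 1, 1]
```

This is exactly the kind of home-grown matching the warning above applies to: records for different businesses on similarly named streets can score fairly close to the threshold, so the groups should be reviewed rather than trusted blindly.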