SQL Server - SQL address data is messy, how to clean it up in a query?


I have address data stored in a SQL Server 2000 database, and I need to pull out all the addresses for a given customer code. The problem is that there are a lot of misspelled addresses, missing parts, and so on, so I need to clean this up somehow. I need to weed out the bad spellings, missing parts, etc. and come up with the "average" record. For example, if New York is spelled correctly in 4 out of 5 records, that should be the value returned.

I can't modify the data, validate it on input, or anything like that. I can only modify a copy of the data or manipulate it through a query.

I got a partial answer in the question "Addresses stored in SQL Server have many small variations (errors)", but I need to allow for multiple valid addresses per code.

sample data

code    name                     address1                     address2      city          state        zip     timesused
10003   american nutriton inc    2183 ball street                           olden         utah         87401   177
10003   ameican nutrition inc    2183 ball street             po box 1504   olden         utah         87402   76
10003   american nutrition inc   2183 ball street                           olden         utah         87402   24
10003   american nutrition inc   2183 ball street             po box 1504   olden         utah         87402   17
10003   samantha brooks          506 s. main street                         ellensburg    washington   98296   1
10003   bemis company            1401 w. fourth plain blvd.                 vancouver     washington   98660   1
10003   cei                      597 vandyre boulevard                      wrightstown   wisconsin    54180   1
10003   pacific pet              28th avenue                                olden         utah         84401   1
10003   petsmart, inc.           16091 north 25th street                    phoena        arizona      85027   1
10003   pet firm                 16418 north 37th street                    phoena        arizona      85503   1

desired output

code    name                     address1                     address2      city          state        zip
10003   american nutrition inc   2183 ball avenue                           olden         utah         84401
10003   samantha brooks          506 s. main street                         ellensburg    washington   98296
10003   bemis company            1401 w. fourth plain blvd.                 vancouver     washington   98660
10003   cei                      975 vandyke road                           wrightstown   wisconsin    54180
10003   pacific pet              29th street                                ogden         utah         84401
10003   petsmart, inc.           16091 north 25th avenue                    phoena        arizona      85027
10003   pet firm                 16418 north 37th street                    phoena        arizona      85503

The best solution is to use a CASS-certified address standardization program or service to format and validate each address. Beyond the tools the USPS has for this, there are many third-party programs and services that provide this functionality. Address parsing is far more complicated than you might imagine, and trying to whip up a few queries to do it is fraught with peril.

Google's geocoding service is another place to look. Apparently, though, Google requires you to display the results in order to use its geocoding service. That leaves using a dedicated address parser, either from the USPS or a third-party program.
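If a dedicated parser is not an option, the "most common value wins" idea from the question can at least be sketched on a copy of the data. The following is a minimal Python sketch, not a substitute for real address validation. It assumes the rows have already been exported from the table, and it treats rows with the same code and address1 as the same address, which is an illustrative assumption and not something the question specifies. The function name `consensus` and its parameters are hypothetical. Each remaining field takes the value that appears in the most rows, with ties broken by total timesused.

```python
from collections import defaultdict


def consensus(rows, key_fields=("code", "address1"),
              vote_fields=("name", "address2", "city", "state", "zip")):
    """Collapse rows that share key_fields into one record per group.

    For every other field, keep the value seen in the most rows;
    ties are broken by the summed timesused of the rows voting for it.
    Grouping on (code, address1) is an assumption made for this sketch.
    """
    groups = defaultdict(list)
    for row in rows:
        groups[tuple(row[f] for f in key_fields)].append(row)

    cleaned = []
    for key, members in groups.items():
        record = dict(zip(key_fields, key))
        for field in vote_fields:
            # value -> [number of rows, total timesused]
            tally = defaultdict(lambda: [0, 0])
            for row in members:
                tally[row[field]][0] += 1
                tally[row[field]][1] += row["timesused"]
            record[field] = max(tally, key=lambda value: tuple(tally[value]))
        cleaned.append(record)
    return cleaned


COLUMNS = ("code", "name", "address1", "address2",
           "city", "state", "zip", "timesused")

# A few rows copied from the sample data in the question.
SAMPLE = [
    ("10003", "american nutriton inc", "2183 ball street", "", "olden", "utah", "87401", 177),
    ("10003", "ameican nutrition inc", "2183 ball street", "po box 1504", "olden", "utah", "87402", 76),
    ("10003", "american nutrition inc", "2183 ball street", "", "olden", "utah", "87402", 24),
    ("10003", "american nutrition inc", "2183 ball street", "po box 1504", "olden", "utah", "87402", 17),
    ("10003", "pacific pet", "28th avenue", "", "olden", "utah", "84401", 1),
]

rows = [dict(zip(COLUMNS, values)) for values in SAMPLE]
cleaned = consensus(rows)

for record in cleaned:
    print(record)
```

On these rows the vote settles on the name "american nutrition inc" and zip 87402 for the 2183 ball street group. Note the limits: the exact-match grouping would not merge "ball street" with "ball avenue", and a misspelling that happens to be the majority still wins, which is the kind of peril the answer warns about.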

