Microsoft SQL Server
--
Questions
--
Followers
Top Experts
I have 2 Tables namely AddressTable1 and AddressTable2. Both of them have same number of Columns.
Columns:
-------------
ProviderID
AddressLine1 VARCHAR(55)
AddressLine2 VARCHAR(55)
City
ZIP
State
What I would like to find out is something different here.
A provide(ProviderID) will have entries in both the tables(Only 1 row) and the address of the Provider in both tables may or may not match.
I need a query/proc to find out the Providers those addresses are not matching.
When I say NOT MATCHING, I simply;
The addresses may logically be matching. But not physically(actual data in the fields)
E.g:
The texts in the address fields might be loaded vice-versa. There could be space in between. There could be # instead of No.
The main reason of these data discrepancies is "Manual intervention". The data is loaded into database in 2 different modes.
User Mode: Business Users manually enter the data from a front-end application.
Batch Mode: Inbound data is fed to DB through certain interfaces
We may be able to teach the Business Users to enter the data as per the US standard. We may be able to request the source to send the data in a proper mode.
But, I need to find out the discrepancy in the data that is already there in the DB. I somehow need to compare the addresses by taking each of the words from the fields to compare against other fields to find out the match.
What I need is, Provider ID and the Percentage of match. I don't how to explain or define the Percentage of Match. If you can suggest, it would be great too.
Note:
Zero AI Policy
We believe in human intelligence. Our moderation policy strictly prohibits the use of LLM content in our Q&A threads.
We can look at some of the data but there are huge.
There are more than 100 Business Users intervene to update the addresses manually. On an avg. 2k records are processed on a Daily basis.
The DB has 2,74,498 providers as of now and the providers may have more than 1 addresses(Home, Remittance, Billing, Service 1,2,3 etc.)
What I am planning to do is to format the addresses in certain method;
Will it do good. I am pretty sure they may not be 100% but closest I can get to.
Is there an API/tool where I can pass the addresses and get their Latitude and Latitude so that I can compare against the other? How closer we can get? I will assume if the lang/long cannot be identified, I will have users looked at it manually.
??
Sounds like a decent method to me. I've used similar. Mentioned above, directional words cause issues (N,S,E,W,etc...). I try to keep city and state with the address when I've done it. How many cities have a "1 Main Street" in them? Add the city and state and they become unique.
>>Is there an API/tool where I can pass the addresses and get their Latitude and Latitude
Several. None free that I'm aware of. Look around for: Geo locating street address.
The suggestion of paying the USPS for their data is a good one. If they don't know about the street address then it probably doesn't exist (or is so new they haven't updated the data yet).
Another gotcha is "STREET" vs. "ST", vs. "st". It's not uncommon, particularly with people where English is their second language, to not butt letters and digits.
"21st Ave" is normally meant to be "Avenue 21". However, separate "21" and "st" and it reads "21 st Ave". Depending on the rest of the address it could be ambiguous, or at least confusing.






EARN REWARDS FOR ASKING, ANSWERING, AND MORE.
Earn free swag for participating on the platform.
Did you make any advances on this?
Microsoft SQL Server
--
Questions
--
Followers
Top Experts
Microsoft SQL Server is a suite of relational database management system (RDBMS) products providing multi-user database access functionality.SQL Server is available in multiple versions, typically identified by release year, and versions are subdivided into editions to distinguish between product functionality. Component services include integration (SSIS), reporting (SSRS), analysis (SSAS), data quality, master data, T-SQL and performance tuning.