Avatar of Sam OZ
Sam OZ
Flag for Australia asked on

Finding Rows breaking the data integrity rule

I have an Sql Server  table with following data   ( My data is not  exactly street data , but this is the sample format) 

Country    State   District   ZIPCode     Street  

Note:  I can take this to an excel sheet or MS Access database , A solution based on this is also fine

The current data  seems to be corrupt . How can I check the data integrity . What I mean by that is -  checking where it   breaks the top down approach 

 Ideally, if you visualize it in a form with picklist , Selecting a Country from the picklist for country will show states in the country and then  selecting State will show District ... then ZIPCode under District and Street under ZIPCode

How can I get the rows of  corrupt data breaking this rule ? 

SQLMicrosoft SQL ServerMicrosoft AccessMicrosoft ExcelMicrosoft Office

Avatar of undefined
Last Comment
Sam OZ

8/22/2022 - Mon
Saqib Husain

Your best starting point would be uploading a sample file. For me, I would request an excel file.
John Tsioumpris

Log in or sign up to see answer
Become an EE member today7-DAY FREE TRIAL
Members can start a 7-Day Free trial then enjoy unlimited access to the platform
Sign up - Free for 7 days
Learn why we charge membership fees
We get it - no one likes a content blocker. Take one extra minute and find out why we block content.
Not exactly the question you had in mind?
Sign up for an EE membership and get your own personalized solution. With an EE membership, you can ask unlimited troubleshooting, research, or opinion questions.
ask a question

If you import the data into Access, you can create a query that looks for Null values in any of the rows.
Rob Henson

I created something similar some time ago using Dynamic Data Validation; selection in one cell restricted the selection in the next cell to a limited number of options related to first cell; third cell was then restricted to entries related to second cell. This was quite cumbersome but worked quite well using various lists and INDIRECT function in the Data Validation selection.

Because it used Data Validation I was able to use the Data Validation checker on the Data Tab, Data tools group and it would add a ring around invalid data. There is an article about it on my profile if you want to take a look but that is somewhat after the horse has bolted.

Are you able to create a list of correct combinations? This would have repetitions in all but the last level. Sample Data below; each of the District references are unique but the levels above are repeated as necessary.
AState A1District A1.1
AState A1District A1.2
AState A1District A1.3
AState A1District A1.4
AState A2District A2.1
AState A2District A2.2
AState A2District A2.3
AState A3District A3.1
AState A3District A3.2
AState A3District A3.3
BState B1District B1.1
BState B1District B1.2
BState B1District B1.3
BState B1District B1.4
BState B2District B2.1
BState B2District B2.2
BState B2District B2.3
BState B3District B3.1
BState B3District B3.2
BState B3District B3.3
BState B3District B3.4
CState C1District C1.1
CState C1District C1.2
CState C1District C1.3
CState C1District C1.4
CState C2District C2.1
CState C2District C2.2
CState C2District C2.3
CState C2District C2.4
CState C2District C2.5

You could then do effectively a reverse lookup. You could find the listed State in the lookup and compare Country in the lookup list to make sure it is correct; find district in the list to make sure State is correct etc etc.

If it truly were Address data, I suspect there would be an issue with Street as the same Street could occur in multiple ZIPs. Could this occur in your real data, ie one value could occur in multiple options at the higher level?
This is the best money I have ever spent. I cannot not tell you how many times these folks have saved my bacon. I learn so much from the contributors.
Robert Berke

Are you sure you totally understand the underlying data about Country State District ZIPCode and Street?
Maps often seem straightforward and top down, but in reality they mix geography, history, political divisions and USPS conventions, and it is easy to get confused.
If you give us and excel download which demonstrates your idea of “corruption” we will certainly be able to help you.
In the meantime, do you think the following demonstrates “corruption”
  • Pick sequence A shows  zip code 89439 has exactly 2 streets  Main and Washington
  • But pick sequence B shows the same zip code 89439 has exactly 3 streets Adam, Lincoln and Main
If you answered “yes that is corruption” then you are wrong !
This is because zip codes do not follow political boundaries. For instance zip code 89439 contains several cities and some of those cities cross the border between California and Nevada.
And of course some of the streets also cross the border, and some do not.
For example “Main Street” might start in Reno NV, then pass through Reno CA and end in Verdi CA.
But Lincoln Street might only be in Reno NV
Pick sequence A might show
Level 1: Country Picked USA
Level 2: State picked Nevada
Level 3: City picked Reno
Level 4: ZIP Code picked 89439
Level 5: Street might have drop down with exactly two streets  Main and Washington
Pick sequence B might show
Level 1: Country Picked USA
Level 2: State picked California
Level 3: City picked Reno
Level 4: ZIPCode picked 89439
Level 5: Street might have drop down with exactly three streets  Adam, Lincoln and Main
And, for what it is worth, there are 13  US Census' ZIP Codes with two states. 42223, 59221, 63673, 71749, 73949, 81137, 84536, 86044, 86515, 88063, 89439 & 97635.
I am hope you realize that "Country    State   District   ZIPCode     Street   "  would normally NOT have a "top down approach".  

For instance if "Main Street" may go through districts "A", "M" and "Z" and each district would have many streets.

and near a go through OH " and

does not make much sense to say "Date integrity" means "checking where it   breaks the top down approach " .
For instance Main Street

Robert Berke

sam oz, it has been 3 weeks since you asked this question.  Please review the experts advice and see if you have more questions. If not, please close the thread.

Sam OZ

closing the question
Get an unlimited membership to EE for less than $4 a week.
Unlimited question asking, solutions, articles and more.