chalie001
asked on
return value not equal to another column
hi i have a table which got duplicate column
empno_db1
empno_db2
i what to write a query which will return value
when empno_db1 <> empno_db2
table script
value in table
empno_db1 empno_db2 deptno_db1 depatno_db2 transDate_db1 transDate_db2 receiptdate_db1 receiptdate_db2
001 001 100 100 11/12/2015 11/12/2015 12/12/2015 12/12/2015
002 002 001 002 13/12/2015 09/12/2015 12/12/2015 12/12/2015
i only what to return value like this
empno_db1 empno_db2 deptno_db1 depatno_db2 transDate_db1 transDate_db2 receiptdate_db1 receiptdate_db2
002 002 001 002 13/12/2015 09/12/2015 12/12/2015 12/12/2015
empno_db1
empno_db2
i what to write a query which will return value
when empno_db1 <> empno_db2
table script
create table empt(empno_db1 number(6),
empno_db2 number(6),
deptno_db1 varchar(30),
depatno_db2 varchar(30),
transDate_db1 date,
transDate_db2 date,
receiptdate_db1 date,
receiptdate_db2 date)
value in table
empno_db1 empno_db2 deptno_db1 depatno_db2 transDate_db1 transDate_db2 receiptdate_db1 receiptdate_db2
001 001 100 100 11/12/2015 11/12/2015 12/12/2015 12/12/2015
002 002 001 002 13/12/2015 09/12/2015 12/12/2015 12/12/2015
i only what to return value like this
empno_db1 empno_db2 deptno_db1 depatno_db2 transDate_db1 transDate_db2 receiptdate_db1 receiptdate_db2
002 002 001 002 13/12/2015 09/12/2015 12/12/2015 12/12/2015
ASKER
i have 50 columns with about 100 000 records
Am not sure what the implication of your statement is.... the number of rows should not matter to the SELECT. You can list all your columns on the SELECT statement as shown.
ASKER
wht if the other columns got null
must i do this in all 50 columns
where empno_db1 <> empno_db2
must i do this in all 50 columns
where empno_db1 <> empno_db2
ASKER
i also what to check all other columns not just empno_db1 empno_db2
Ah! Ok. Had not realised you wanted to do the rest of the columns. You can use the NVL function to help deal with NULL values. The NVL function converts a NULL to a known value. NVL(x, value) returns value if x is null, otherwise, x is returned. Note that 'value' has to be consistent with the data type of that column. Add an OR condition for each pair of columns you want to compare.
select empno_db1, empno_db2, deptno_db1, depatno_db2, transDate_db1, transDate_db2, receiptdate_db1, receiptdate_db2
from empt
where NVL(empno_db1, 0) <> NVL(empno_db2, 0)
or NVL(deptno_db1, '') <> NVL(deptno_db2, '')
or NVL(transDate_db1, to_date('19000101','YYYYMMDD')) <> NVL(transDate_db2, to_date('19000101','YYYYMMDD'))
ASKER
i have 50 columns with about 100 000 records, i can't hard code date values there is a lot
Your requirements are unclear, in the question you say:
1 i what to write a query which will return value when empno_db1 <> empno_db2
in a later comment you say:
2 i also what to check all other columns not just empno_db1 empno_db2
The solution to your first question is:
select * from empt where empno_db1 <> empno_db2;
What do you want with the other columns? Which other columns do you want to compare against which (other) columns?
1 i what to write a query which will return value when empno_db1 <> empno_db2
in a later comment you say:
2 i also what to check all other columns not just empno_db1 empno_db2
The solution to your first question is:
select * from empt where empno_db1 <> empno_db2;
What do you want with the other columns? Which other columns do you want to compare against which (other) columns?
ASKER
something like this
where empno_db1 <> empno_db2
or (empno_db1 is null and empno_db2 is not null)
or (empno_db1 is not null and empno_db2 is null)
but must i do this to all 50 columns
where empno_db1 <> empno_db2
or (empno_db1 is null and empno_db2 is not null)
or (empno_db1 is not null and empno_db2 is null)
but must i do this to all 50 columns
ASKER
this other columns
deptno_db1, depatno_db2, transDate_db1, transDate_db2, receiptdate_db1, receiptdate_db
deptno_db1, depatno_db2, transDate_db1, transDate_db2, receiptdate_db1, receiptdate_db
Ok, so empno, deptno, transDate and receipt date - these are 8 columns, how to we get to 50?
The version I gave you solves any issues related to having a null in either one or both columns for any of the comparisons you need to do. By using the NLV function, you simplify your query structure. There is a slight performance knock though. But it works well. You then don't need to do this
where empno_db1 <> empno_db2
or (empno_db1 is null and empno_db2 is not null)
or (empno_db1 is not null and empno_db2 is null)
>> NVL(deptno_db1, '') <> NVL(deptno_db2, '')
>>By using the NLV function, you simplify your query structure.
This will NEVER work. Two empty single quotes in Oracle is the same as a null.
So NVL(deptno_db1, '') will return a null if deptno_db1 is null. Then a null cannot '=' or '<>' anything so the statement is invalid.
>>but must i do this to all 50 columns
Is there a reason you do not use the MINUS that was suggested in several of your previous columns?
Hopefully you now see how difficult it is to code for ALL the columns when they are side-by-side.
>>By using the NLV function, you simplify your query structure.
This will NEVER work. Two empty single quotes in Oracle is the same as a null.
So NVL(deptno_db1, '') will return a null if deptno_db1 is null. Then a null cannot '=' or '<>' anything so the statement is invalid.
>>but must i do this to all 50 columns
Is there a reason you do not use the MINUS that was suggested in several of your previous columns?
Hopefully you now see how difficult it is to code for ALL the columns when they are side-by-side.
a "report" of 100 columns (50 columns for each table side by side) is just not going to be worth the effort you will have to devote the query that produces it.
What will you do with this output?
I also recommend you don't try NVL(column,'') = ''
it will not work e.g.
What will you do with this output?
I also recommend you don't try NVL(column,'') = ''
it will not work e.g.
**Oracle 11g R2 Schema Setup**:
CREATE TABLE test
(ID int, ASTRING VARCHAR(20))
;
INSERT ALL
INTO test (ID, ASTRING)
VALUES (1,'')
INTO test (ID, ASTRING)
VALUES (2,NULL)
SELECT * FROM dual
;
**Query 1**:
select *
from test
where NVL(astring,'') IS NULL
**[Results][2]**:
| ID | ASTRING |
|----|---------|
| 1 | (null) |
| 2 | (null) |
**Query 2**:
select *
from test
where NVL(astring,'') = ''
**[Results][3]**:
[1]: http://sqlfiddle.com/#!4/490bf/1
[2]: http://sqlfiddle.com/#!4/490bf/1/0
[3]: http://sqlfiddle.com/#!4/490bf/1/1
This will NEVER work. Two empty single quotes in Oracle is the same as a null.
With a little ingenuity, one can make NLV work. If '' is null, then just any other string marker which you do not have in your data. Use NLV(field, 'GruHasABigNose') if it helps. The idea is to eliminate the nulls to make life easier. It is a model worth building on.
Fiddle Playground: http://www.sqlfiddle.com/#!4/3e45c/11
Create table
Still using the NVL function. Am using a SYS_ID to generate a unique identifier and using that in the NLV, since the experts said that NVL(fld, '') will not work.
Create table
CREATE TABLE test (
ID int,
ValA1 VARCHAR(20), ValA2 VARCHAR(20),
ValB1 NUMERIC(10), ValB2 NUMERIC(10),
ValC1 DATE, ValC2 DATE
);
Create dummy data and test with queries.Still using the NVL function. Am using a SYS_ID to generate a unique identifier and using that in the NLV, since the experts said that NVL(fld, '') will not work.
INSERT ALL
INTO test (ID, ValA1, ValA2, ValB1, ValB2, ValC1, ValC2) VALUES( 1,'', 'Alpha', 12, 14, TO_DATE('2015-01-01', 'yyyy-mm-dd'), TO_DATE('2015-02-01', 'yyyy-mm-dd') )
INTO test (ID, ValA1, ValA2, ValB1, ValB2, ValC1, ValC2) VALUES( 2,'Alpha', 'Alpha', 12, 12, null, TO_DATE('2015-02-01', 'yyyy-mm-dd') )
INTO test (ID, ValA1, ValA2, ValB1, ValB2, ValC1, ValC2) VALUES( 3,'Omega', 'Alpha', 12, 14, TO_DATE('2015-02-01', 'yyyy-mm-dd'), TO_DATE('2015-02-01', 'yyyy-mm-dd') )
INTO test (ID, ValA1, ValA2, ValB1, ValB2, ValC1, ValC2) VALUES( 4, NULL, 'Alpha', null, null, null, null )
INTO test (ID, ValA1, ValA2, ValB1, ValB2, ValC1, ValC2) VALUES( 5,'Omega', 'Omega', 0, 14, TO_DATE('2015-01-01', 'yyyy-mm-dd'), null )
INTO test (ID, ValA1, ValA2, ValB1, ValB2, ValC1, ValC2) VALUES( 6,'Theta', 'Theta', null, 13, TO_DATE('2015-02-01', 'yyyy-mm-dd'), TO_DATE('2015-02-01', 'yyyy-mm-dd') )
INTO test (ID, ValA1, ValA2, ValB1, ValB2, ValC1, ValC2) VALUES( 7,'Beta', 'Alpha', 12, 14, null, TO_DATE('2015-02-01', 'yyyy-mm-dd') )
INTO test (ID, ValA1, ValA2, ValB1, ValB2, ValC1, ValC2) VALUES( 8,'Beta', 'Beta', 2, 14, TO_DATE('2015-01-01', 'yyyy-mm-dd'), TO_DATE('2015-02-01', 'yyyy-mm-dd') )
INTO test (ID, ValA1, ValA2, ValB1, ValB2, ValC1, ValC2) VALUES( 9,null,null, null, 14, TO_DATE('2015-02-01', 'yyyy-mm-dd'), TO_DATE('2015-02-01', 'yyyy-mm-dd') )
INTO test (ID, ValA1, ValA2, ValB1, ValB2, ValC1, ValC2) VALUES( 10,'Match', 'Match', 5, 5, TO_DATE('2015-02-01', 'yyyy-mm-dd'), TO_DATE('2015-02-01', 'yyyy-mm-dd') )
INTO test (ID, ValA1, ValA2, ValB1, ValB2, ValC1, ValC2) VALUES( 11,'Match', 'Match', 5, 5, TO_DATE('2015-02-01', 'yyyy-mm-dd'), TO_DATE('2015-02-01', 'yyyy-mm-dd') )
INTO test (ID, ValA1, ValA2, ValB1, ValB2, ValC1, ValC2) VALUES( 12,'Match', 'Match', 5, 5, TO_DATE('2015-02-01', 'yyyy-mm-dd'), TO_DATE('2015-02-01', 'yyyy-mm-dd') )
SELECT * FROM dual;
/*Creating a column to flag mismatches. Returns all data*/
SELECT
ID,
ValA1, ValA2, CASE WHEN NVL(ValA1, '57831A90-6723-432B-82D3-3ED93C835FFA') != NVL(ValA2, '57831A90-6723-432B-82D3-3ED93C835FFA') THEN 1 ELSE 0 END as ValAResult,
ValB1, ValB2, CASE WHEN NVL(ValB1, 0) != NVL(ValB2, 0) THEN 1 ELSE 0 END as ValBResult,
ValC1, ValC2, CASE WHEN NVL(ValC1, TO_DATE('1900-01-01', 'yyyy-mm-dd')) != NVL(ValC2, TO_DATE('1900-01-01', 'yyyy-mm-dd')) THEN 1 ELSE 0 END as ValCResult
FROM test;
/*Filter data to return only where there is a mismatch*/
SELECT
ID,
ValA1, ValA2,
ValB1, ValB2,
ValC1, ValC2
FROM test
WHERE
CASE WHEN NVL(ValA1, '57831A90-6723-432B-82D3-3ED93C835FFA') != NVL(ValA2, '57831A90-6723-432B-82D3-3ED93C835FFA') THEN 1 ELSE 0 END = 1
OR CASE WHEN NVL(ValB1, 0) != NVL(ValB2, 0) THEN 1 ELSE 0 END = 1
OR CASE WHEN NVL(ValC1, TO_DATE('1900-01-01', 'yyyy-mm-dd')) != NVL(ValC2, TO_DATE('1900-01-01', 'yyyy-mm-dd')) THEN 1 ELSE 0 END = 1;
ASKER
the reason am not using minus is i what to see both data from both diffirent database so i can make decision which data to update in production the minus just show the diffirent values but does not show what the other column data it is
To me, the issue here is generating the where clause. It seems like a daunting task to do by hand. However, there are really 3 things you need to generate:
From the sample table that is posted in the original question, that query generates these clauses:
1.
column1 != column22.
column1 is null and column2 is not null3.
column1 is not null and column2 is nullSo, why do it by hand? Have the database generate those statements for you. This code should generate those 3 clauses for you.SELECT 'or '
||column_name
|| '_db1 != '
|| column_name
|| '_db2'
FROM (SELECT DISTINCT Substr(column_name, 1, Instr(column_name, '_', -1) - 1)
column_name
FROM user_tab_columns
WHERE table_name = 'EMPT') c
UNION ALL
SELECT 'or '
||column_name
|| '_db1 is null and '
|| column_name
|| '_db2 is not null'
FROM (SELECT DISTINCT Substr(column_name, 1, Instr(column_name, '_', -1) - 1)
column_name
FROM user_tab_columns
WHERE table_name = 'EMPT') c
UNION ALL
SELECT 'or '
||column_name
|| '_db1 is not null and '
|| column_name
|| '_db2 is null'
FROM (SELECT DISTINCT Substr(column_name, 1, Instr(column_name, '_', -1) - 1)
column_name
FROM user_tab_columns
WHERE table_name = 'EMPT') c;
You hit USER_TAB_COLUMNS multiple times, but this is a one time thing and I don't think performance on generating the where clause is an issue. You are just going to have to change the word OR to WHERE on the first line and then add your SELECT and FROM clauses to the front.From the sample table that is posted in the original question, that query generates these clauses:
'OR'||COLUMN_NAME||'_DB1!='||COLUMN_NAME||'_DB2'
--------------------------------------------------------------------------------
or TRANSDATE_db1 != TRANSDATE_db2
or RECEIPTDATE_db1 != RECEIPTDATE_db2
or DEPTNO_db1 != DEPTNO_db2
or DEPATNO_db1 != DEPATNO_db2
or EMPNO_db1 != EMPNO_db2
or TRANSDATE_db1 is null and TRANSDATE_db2 is not null
or RECEIPTDATE_db1 is null and RECEIPTDATE_db2 is not null
or DEPTNO_db1 is null and DEPTNO_db2 is not null
or DEPATNO_db1 is null and DEPATNO_db2 is not null
or EMPNO_db1 is null and EMPNO_db2 is not null
or TRANSDATE_db1 is not null and TRANSDATE_db2 is null
or RECEIPTDATE_db1 is not null and RECEIPTDATE_db2 is null
or DEPTNO_db1 is not null and DEPTNO_db2 is null
or DEPATNO_db1 is not null and DEPATNO_db2 is null
or EMPNO_db1 is not null and EMPNO_db2 is null
15 rows selected.
>>With a little ingenuity, one can make NLV work
I do not disagree. I was just pointing out what was posted, as-is, would never work. If you use your 'trick' you just need to make sure the value you pick as filler will NEVER appear in the data itself. Not impossible but it takes some thought.
>>the reason am not using minus is i what to see both data from both diffirent database
What does your join look like?
What happens when db1 has a row db2 doesn't and db2 has a row db1 doesn't? Does what you have capture both ways?
I do not disagree. I was just pointing out what was posted, as-is, would never work. If you use your 'trick' you just need to make sure the value you pick as filler will NEVER appear in the data itself. Not impossible but it takes some thought.
>>the reason am not using minus is i what to see both data from both diffirent database
What does your join look like?
What happens when db1 has a row db2 doesn't and db2 has a row db1 doesn't? Does what you have capture both ways?
Not impossible but it takes some thought.How well does a SYS_ID score here? Theoretically, they are globally unique. My Oracle skills are not 100%, but one could declare a variable, initialized to a SYS_ID and pass that around.
the reason am not using minus is i what to see both data from both diffirent databaseThis suggests a different table structure from the one posted in the original script (with the question) where the data columns are sitting in the same table. Either way, the only way to fulfill all the objectives laid out here is to use value comparisons as opposed to the MINUS approach.
>>but one could declare a variable, initialized to a SYS_ID and pass that around.
I guess you mean SYS_GUID. That will work if you declare a variable. That requires code and not straight SQL.
Depending on how my question about missing rows is answered, everything so far can be done with straight SQL. Messy, yes, but it keeps everything in the SQL engine.
I guess you mean SYS_GUID. That will work if you declare a variable. That requires code and not straight SQL.
Depending on how my question about missing rows is answered, everything so far can be done with straight SQL. Messy, yes, but it keeps everything in the SQL engine.
>>"i what to see both data from both different database so i can make decision which data to update in production"
I would just like to suggest (again) that a "report" of 100 columns is quite unusable, you will spend a lot of time trying to figure out which of the columns is different. You are very likely to make mistakes using such a layout.
I would recommend breaking this into several reports that are easier to read and use (less columns). You might divide then by data types perhaps? (strings, dates, numbers and so on)
One other thing that concerns me, clearly the sample data and table structure (in the question) is way different to the actual situation. What I do not see in the sample is a unique row identifier. If you intend to update a production table, how will you ensure that you can update the correct row?
Please think through how you are going to perform the update. (e.g. will you create a table containing the wanted values?)
I would just like to suggest (again) that a "report" of 100 columns is quite unusable, you will spend a lot of time trying to figure out which of the columns is different. You are very likely to make mistakes using such a layout.
I would recommend breaking this into several reports that are easier to read and use (less columns). You might divide then by data types perhaps? (strings, dates, numbers and so on)
One other thing that concerns me, clearly the sample data and table structure (in the question) is way different to the actual situation. What I do not see in the sample is a unique row identifier. If you intend to update a production table, how will you ensure that you can update the correct row?
Please think through how you are going to perform the update. (e.g. will you create a table containing the wanted values?)
on the mater of using NVL() instead of OR column IS NULL
I would NOT use NVL() here no matter how clever you get with the substitute value. As Johnsone has rightly pointed out you can generate the SQL anyway.
My viewpoint on this is drawn from a general dislike of using functions in the where clause as they are not "sargable" (they remove access to indexes/make the query slower) and most functions impose extra calculations for each row. All one needs to do is use OR column IS NULL to avoid those issues.
I would NOT use NVL() here no matter how clever you get with the substitute value. As Johnsone has rightly pointed out you can generate the SQL anyway.
My viewpoint on this is drawn from a general dislike of using functions in the where clause as they are not "sargable" (they remove access to indexes/make the query slower) and most functions impose extra calculations for each row. All one needs to do is use OR column IS NULL to avoid those issues.
ASKER
am returning 0 rows is this query
SELECT 'or '
||column_name
|| '_S != '
|| column_name
|| '_T'
FROM (SELECT DISTINCT Substr(column_name, 1, Instr(column_name, '_', -1) - 1)
column_name
FROM user_tab_columns
WHERE table_name = 'mr_er_core_temp') c
UNION ALL
SELECT 'or '
||column_name
|| '_S is null and '
|| column_name
|| '_T is not null'
FROM (SELECT DISTINCT Substr(column_name, 1, Instr(column_name, '_', -1) - 1)
column_name
FROM user_tab_columns
WHERE table_name = 'mr_er_core_temp') c
UNION ALL
SELECT 'or '
||column_name
|| '_S is not null and '
|| column_name
|| '_T is null'
FROM (SELECT DISTINCT Substr(column_name, 1, Instr(column_name, '_', -1) - 1)
column_name
FROM user_tab_columns
WHERE table_name = 'mr_er_core_temp') c;
The table name needs to be in UPPERCASE
SELECT DISTINCT Substr(column_name, 1, Instr(column_name, '_', -1) - 1)
column_name
FROM user_tab_columns
WHERE table_name = 'MR_ER_CORE_TEMP'
SELECT DISTINCT Substr(column_name, 1, Instr(column_name, '_', -1) - 1)
column_name
FROM user_tab_columns
WHERE table_name = 'MR_ER_CORE_TEMP'
ASKER
this query is returning everything if i select * from MR_ER_CORE_TEMP i get the ame data as am using OR from the above query
ASKER
if value are the same i don't what to return tham only value which diffier must be return
ASKER CERTIFIED SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
ASKER
correct
Open in new window
This query would do it. However, in your example of what you want to return... both empno_db1 and empno_db2 seem to have a value of 002???... which actually means the query should not return this row?