AD1080
asked on
SQL Query to find groups of records with approximately same value in numeric field
Hi,
Using the sample data below, I am trying to build a query that returns all records from the VENDOR table where:
There are 2 or more records in the VENDOR table with a UNIT_COST that is up to 3% higher or lower than any other records.
The VEND_NO in the ITEM table matches the VEND_NO in the VENDOR table
The records in the VENDOR table have the same corresponding BRAND and SIZE values in the ITEM table.
In plain English, I am looking for cost records, in VENDOR, for items with the same brand, size, and approximately the same cost, regardless of what vendor we have them assigned to. (The vendor assignment being the value for VEND_NO in the item table.)
The query should return ITEM_NO's '1000001609' and '1000001610' as their assigned vendor costs are 10.8 and 10.7.
The query should not return ITEM_NO's '1000001612' or '1000001613' as their assigned vendor costs are 15.2 and 14.7, which is more than a 3% variance.
Using the sample data below, I am trying to build a query that returns all records from the VENDOR table where:
There are 2 or more records in the VENDOR table with a UNIT_COST that is up to 3% higher or lower than any other records.
The VEND_NO in the ITEM table matches the VEND_NO in the VENDOR table
The records in the VENDOR table have the same corresponding BRAND and SIZE values in the ITEM table.
In plain English, I am looking for cost records, in VENDOR, for items with the same brand, size, and approximately the same cost, regardless of what vendor we have them assigned to. (The vendor assignment being the value for VEND_NO in the item table.)
The query should return ITEM_NO's '1000001609' and '1000001610' as their assigned vendor costs are 10.8 and 10.7.
The query should not return ITEM_NO's '1000001612' or '1000001613' as their assigned vendor costs are 15.2 and 14.7, which is more than a 3% variance.
DECLARE @ ITEM
(ITEM_NO VARCHAR(10) NOT NULL,
BRAND VARCHAR(10) NOT NULL,
SIZE VARCHAR(10) NOT NULL,
VEND_NO VARCHAR(10) NOT NULL)
DECLARE @ VENDOR
(ITEM_NO VARCHAR(10) NOT NULL,
VEND_NO VARCHAR(10) NOT NULL,
UNIT_COST VARCHAR(10) NOT NULL)
INSERT @ITEM
(ITEM_NO,BRAND,SIZE,VEND_NO,)
VALUES
('1000001609','AMYS','12 OZ','NATUBEST'),
('1000001610','AMYS','12 OZ','UNITNATU'),
('1000001611','AMYS','14 OZ','UNITNATU'),
('1000001612','BEARITOS','10 OZ','NATUBEST'),
('1000001613','BEARITOS','10 OZ','UNITNATU'),
('1000001614','BEARITOS','10 OZ','NATUBEST'),
('1000001615','CADIA','10 OZ','NATUBEST'),
('1000001616','CADIA','10 OZ','NATUBEST'),
('1000001617','CADIA','12 OZ','NATUBEST'),
('1000001618','CADIA','12 OZ','NATUBEST'),
('1000001619','CADIA','9 OZ','NATUBEST')
INSERT @VENDOR
(ITEM_NO,VEND_NO,UNIT_COST)
VALUES
('1000001609','NATUBEST','10.8'),
('1000001609','UNITNATU','10.5'),
('1000001610','NATUBEST','10.8'),
('1000001610','UNITNATU','10.7'),
('1000001611','NATUBEST','11.9'),
('1000001611','UNITNATU','11.8'),
('1000001612','NATUBEST','15.2'),
('1000001612','UNITNATU','15.1'),
('1000001613','NATUBEST','15.2'),
('1000001613','UNITNATU','14.7'),
('1000001614','NATUBEST','14.6'),
('1000001614','UNITNATU','14.5'),
('1000001615','NATUBEST','12.7'),
('1000001615','UNITNATU','13.5'),
('1000001616','NATUBEST','12.7'),
('1000001616','UNITNATU','12.7'),
('1000001617','NATUBEST','10.9'),
('1000001617','UNITNATU','10.8'),
('1000001618','NATUBEST','10.9'),
('1000001618','UNITNATU','10.8'),
('1000001619','NATUBEST','8.5'),
('1000001619','UNITNATU','8.4')
Not sure, what you exactly want. E.g.
DECLARE @ITEM TABLE
(
ITEM_NO VARCHAR(10) NOT NULL ,
BRAND VARCHAR(10) NOT NULL ,
SIZE VARCHAR(10) NOT NULL ,
VEND_NO VARCHAR(10) NOT NULL
);
DECLARE @VENDOR TABLE
(
ITEM_NO VARCHAR(10) NOT NULL ,
VEND_NO VARCHAR(10) NOT NULL ,
UNIT_COST FLOAT NOT NULL
);
INSERT @ITEM
( ITEM_NO, BRAND, SIZE, VEND_NO )
VALUES ( '1000001609', 'AMYS', '12 OZ', 'NATUBEST' ),
( '1000001610', 'AMYS', '12 OZ', 'UNITNATU' ),
( '1000001611', 'AMYS', '14 OZ', 'UNITNATU' ),
( '1000001612', 'BEARITOS', '10 OZ', 'NATUBEST' ),
( '1000001613', 'BEARITOS', '10 OZ', 'UNITNATU' ),
( '1000001614', 'BEARITOS', '10 OZ', 'NATUBEST' ),
( '1000001615', 'CADIA', '10 OZ', 'NATUBEST' ),
( '1000001616', 'CADIA', '10 OZ', 'NATUBEST' ),
( '1000001617', 'CADIA', '12 OZ', 'NATUBEST' ),
( '1000001618', 'CADIA', '12 OZ', 'NATUBEST' ),
( '1000001619', 'CADIA', '9 OZ', 'NATUBEST' );
INSERT @VENDOR
( ITEM_NO, VEND_NO, UNIT_COST )
VALUES ( '1000001609', 'NATUBEST', 10.8 ),
( '1000001609', 'UNITNATU', 10.5 ),
( '1000001610', 'NATUBEST', 10.8 ),
( '1000001610', 'UNITNATU', 10.7 ),
( '1000001611', 'NATUBEST', 11.9 ),
( '1000001611', 'UNITNATU', 11.8 ),
( '1000001612', 'NATUBEST', 15.2 ),
( '1000001612', 'UNITNATU', 15.1 ),
( '1000001613', 'NATUBEST', 15.2 ),
( '1000001613', 'UNITNATU', 14.7 ),
( '1000001614', 'NATUBEST', 14.6 ),
( '1000001614', 'UNITNATU', 14.5 ),
( '1000001615', 'NATUBEST', 12.7 ),
( '1000001615', 'UNITNATU', 13.5 ),
( '1000001616', 'NATUBEST', 12.7 ),
( '1000001616', 'UNITNATU', 12.7 ),
( '1000001617', 'NATUBEST', 10.9 ),
( '1000001617', 'UNITNATU', 10.8 ),
( '1000001618', 'NATUBEST', 10.9 ),
( '1000001618', 'UNITNATU', 10.8 ),
( '1000001619', 'NATUBEST', 8.5 ),
( '1000001619', 'UNITNATU', 8.4 );
SELECT *
FROM @ITEM I
LEFT JOIN @VENDOR V ON V.ITEM_NO = I.ITEM_NO
WHERE ( SELECT COUNT(*)
FROM @VENDOR V2
WHERE V2.UNIT_COST BETWEEN V.UNIT_COST * 0.97 AND V.UNIT_COST * 1.03
) > 1;
ASKER
Hi,
The first solution offered above is returning records that vary by more than 3% in either direction.
The 2nd solution is returning millions of rows.
Please take another look. I can provide more sample data if that would help. In both suggestions, I don't see any reference to the ITEM.BRAND field. My goal is to return a list of VENDOR.COST records with same ITEM.BRAND, that have almost the same cost.
ITEM_NO is the primary key for the ITEM table. There can be any number of records in VENDOR. The primary key for VENDOR is a combination of ITEM_NO and VEND_NO.
The result set should only contain VENDOR records where the ITEM.VEND_NO matches the VENDOR.VEND_NO field.
Thanks in advance, and let me know if I can clarify further.
The first solution offered above is returning records that vary by more than 3% in either direction.
The 2nd solution is returning millions of rows.
Please take another look. I can provide more sample data if that would help. In both suggestions, I don't see any reference to the ITEM.BRAND field. My goal is to return a list of VENDOR.COST records with same ITEM.BRAND, that have almost the same cost.
ITEM_NO is the primary key for the ITEM table. There can be any number of records in VENDOR. The primary key for VENDOR is a combination of ITEM_NO and VEND_NO.
The result set should only contain VENDOR records where the ITEM.VEND_NO matches the VENDOR.VEND_NO field.
Thanks in advance, and let me know if I can clarify further.
How about:
This gives the result of:
SELECT Brand, Size, Min(cast(Unit_Cost as float)) as MinUnitCost, Max(cast(Unit_Cost as float)) as MaxUnitCost
FROM @vendor v
join @ITEM i on v.ITEM_NO = i.ITEM_NO
group by Brand, Size
having Max(cast(Unit_Cost as float))<=Min(cast(Unit_Cost as float))*1.03
This gives the result of:
Brand Size MinUnitCost MaxUnitCost
---------- ---------- ---------------------- ----------------------
AMYS 12 OZ 10.5 10.8
AMYS 14 OZ 11.8 11.9
CADIA 12 OZ 10.8 10.9
CADIA 9 OZ 8.4 8.5
ASKER CERTIFIED SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
ASKER
Hi Vitor,
Thanks for your help. I think your suggestion is more or less what I need.
I have adapted it into the following which is more or less working. I actually wanted to filter out any records for brands that only have one row in the result set, but I can live with this for now. Appreciate your help.
Thanks for your help. I think your suggestion is more or less what I need.
I have adapted it into the following which is more or less working. I actually wanted to filter out any records for brands that only have one row in the result set, but I can live with this for now. Appreciate your help.
SELECT V.ITEM_NO, V.UNIT_COST, I.ATTR_COD_2 AS BRAND, I.ATTR_COD_3 AS SIZE, I.DESCR
FROM PO_VEND_ITEM V, IM_ITEM I
WHERE
V.ITEM_NO = I.ITEM_NO AND V.VEND_NO = I.ITEM_VEND_NO AND
I.ATTR_COD_2 <> '' AND
(SELECT COUNT(*)AS BRAND_COUNT FROM PO_VEND_ITEM V2, IM_ITEM I2
WHERE CAST(V2.UNIT_COST AS FLOAT) BETWEEN CAST(V.UNIT_COST AS FLOAT) * .97 AND CAST(V.UNIT_COST AS FLOAT) * 1.03 AND
V2.ITEM_NO = I2.ITEM_NO AND V2.VEND_NO = I2.ITEM_VEND_NO AND I2.ATTR_COD_2 = I.ATTR_COD_2
GROUP BY I2.ATTR_COD_2 HAVING COUNT(I2.ATTR_COD_2) > 1) > 1
ORDER BY ATTR_COD_2, ATTR_COD_3
Open in new window