gregholl
asked on
Help with a MSSQL statement Select Query (Grouping)
Hi. I'm doing a Query, and I would like to group the result by the value "Products_Joined.ProductCo de"
Here is my code so far:
So for example, if there are several lines of the code "Product-A" and several lines of "Product-B", the query results should return only TWO lines, one for Product-A and one for Product-B. Furthermore, the quantity value for each ProductCode should be summed, i.e.: OrderDetails.Quantity.
As you can see, we are pulling 6 fields from the database:
Please help out with that. Thanks. Let me know if it all makes sense.
Here is my code so far:
SELECT Orders.BatchNumber AS Remesa
, Products_Joined.ProductCode AS CodigoProducto
, Products_Joined.ProductName AS Producto
, Products_Joined.warehousecustom AS Arancel
, Products_Joined.Vendor_Price AS PrecioUnitario
, OrderDetails.Quantity AS Cantidad
, (OrderDetails.Quantity * Products_Joined.Vendor_Price) AS TotalLinea
FROM ((Products_Joined WITH (NOLOCK) INNER JOIN OrderDetails ON Products_Joined.ProductCode = OrderDetails.ProductCode)
INNER JOIN Orders
ON OrderDetails.OrderID = Orders.OrderID)
INNER JOIN Customers
ON Orders.CustomerID = Customers.CustomerID
WHERE Orders.BatchNumber = 5304
So for example, if there are several lines of the code "Product-A" and several lines of "Product-B", the query results should return only TWO lines, one for Product-A and one for Product-B. Furthermore, the quantity value for each ProductCode should be summed, i.e.: OrderDetails.Quantity.
As you can see, we are pulling 6 fields from the database:
Orders.BatchNumber
Products_Joined.ProductCod e
Products_Joined.ProductNam e
Products_Joined.warehousec ustom
Products_Joined.Vendor_Pri ce
OrderDetails.Quantity
Products_Joined.ProductCod
Products_Joined.ProductNam
Products_Joined.warehousec
Products_Joined.Vendor_Pri
OrderDetails.Quantity
Please help out with that. Thanks. Let me know if it all makes sense.
ASKER
OK. Thanks. I tried that and I got this error:
Your SQL is invalid: Invalid column name 'ProductName'.
If you run your query that you posted do you get results?
ASKER
Yes. My query without the modification works fine.
I think you need to change the select x.productname to your alias name
Select X.Producto,sum(x.Cantidad)
From (SELECT Orders.BatchNumber AS Remesa
, Products_Joined.ProductCode AS CodigoProducto
, Products_Joined.ProductName AS Producto
, Products_Joined.warehousecustom AS Arancel
, Products_Joined.Vendor_Price AS PrecioUnitario
, OrderDetails.Quantity AS Cantidad
, (OrderDetails.Quantity * Products_Joined.Vendor_Price) AS TotalLinea
FROM ((Products_Joined WITH (NOLOCK) INNER JOIN OrderDetails ON Products_Joined.ProductCode = OrderDetails.ProductCode)
INNER JOIN Orders
ON OrderDetails.OrderID = Orders.OrderID)
INNER JOIN Customers
ON Orders.CustomerID = Customers.CustomerID
WHERE Orders.BatchNumber = 5304) as X
Group by x.Producto
ASKER
Excellent. We're getting there.
First of all, I would rather group by 'CodigoProducto' than 'Producto', but I can change that easy enough.
Now I get a list returned of only one column, namely: 'Producto'. How do I make it so that all the columns show?
First of all, I would rather group by 'CodigoProducto' than 'Producto', but I can change that easy enough.
Now I get a list returned of only one column, namely: 'Producto'. How do I make it so that all the columns show?
This can make sense only once you decide what value you want in other columns than CodigoProducto. e.g. for Quantity , you want it to be summed up , what about rest of columns? They have to be either aggregated ( sum etc. ) or present in group by .
Aggregation is needed since you need only 2 rows in output , else you can include them in group by and multiple rows will come if multiplle values are present in them for same CodigoProducto.
e.g. if u want max:
Select X.CodigoProducto,sum(x.Can tidad),
max(x.Remesa),
max(x.Producto),
max(x.Arancel),
max(x.PrecioUnitario),
sum(TotalLinea)
From (SELECT Orders.BatchNumber AS Remesa
, Products_Joined.ProductCod e AS CodigoProducto
, Products_Joined.ProductNam e AS Producto
, Products_Joined.warehousec ustom AS Arancel
, Products_Joined.Vendor_Pri ce AS PrecioUnitario
, OrderDetails.Quantity AS Cantidad
, (OrderDetails.Quantity * Products_Joined.Vendor_Pri ce) AS TotalLinea
FROM ((Products_Joined WITH (NOLOCK) INNER JOIN OrderDetails ON Products_Joined.ProductCod e = OrderDetails.ProductCode)
INNER JOIN Orders
ON OrderDetails.OrderID = Orders.OrderID)
INNER JOIN Customers
ON Orders.CustomerID = Customers.CustomerID
WHERE Orders.BatchNumber = 5304) as X
Group by x.Producto
Aggregation is needed since you need only 2 rows in output , else you can include them in group by and multiple rows will come if multiplle values are present in them for same CodigoProducto.
e.g. if u want max:
Select X.CodigoProducto,sum(x.Can
max(x.Remesa),
max(x.Producto),
max(x.Arancel),
max(x.PrecioUnitario),
sum(TotalLinea)
From (SELECT Orders.BatchNumber AS Remesa
, Products_Joined.ProductCod
, Products_Joined.ProductNam
, Products_Joined.warehousec
, Products_Joined.Vendor_Pri
, OrderDetails.Quantity AS Cantidad
, (OrderDetails.Quantity * Products_Joined.Vendor_Pri
FROM ((Products_Joined WITH (NOLOCK) INNER JOIN OrderDetails ON Products_Joined.ProductCod
INNER JOIN Orders
ON OrderDetails.OrderID = Orders.OrderID)
INNER JOIN Customers
ON Orders.CustomerID = Customers.CustomerID
WHERE Orders.BatchNumber = 5304) as X
Group by x.Producto
... the query results should return only TWO lines, one for Product-A and one for Product-B. Furthermore, the quantity value for each ProductCode should be summed ...GROUP BY produces a distinct set of rows for ALL columns included in that clause.
As you can see, we are pulling 6 fields from the database:
Orders.BatchNumber
Products_Joined.ProductCode
Products_Joined.ProductName
Products_Joined.warehousecustom
Products_Joined.Vendor_Price
OrderDetails.Quantity
So while it is probably quite reasonable to assume that there is only one product name per product code. With this:
GROUP BY
Products_Joined.ProductCod
, Products_Joined.ProductNam
the result of 2 rows, should be possible
Product-A Name of Product A
Product-B Name of Product B
However, let us assume you have many orders for your products, and that these have been processed over several batches. What then do you want from the column of Orders.BatchNumber ? If we include that column in the group by clause we get a new row when the batch number changes, like this:
Product-A Name of Product A Batch1
Product-A Name of Product A Batch2
Product-A Name of Product A Batch3
Product-B Name of Product B Batch1
Product-B Name of Product B Batch2
Product-B Name of Product B Batch3
We could instead COUNT(DISTINCT batches) which would look like this
Product-A Name of Product A 3
Product-B Name of Product B 3
Then maybe there are 2 vendors for each product? what do you want to do with the column Products_Joined.Vendor_Pri
perhaps the minimum and maximum price?, the average price? (but this may be difficult!)
The point I am trying to make is for EVERY COLUMN you include in this query you have to make a fundamental choice:
Do I want this column to define how rows are formed?
if Yes: then include it in the group by clause
if No: then you must aggregate it somehow, e.g. SUM() MIN() MAX() AVG() COUNT()
It is not our role to make those choices for you, although we can make suggestions. The MOST important thing to understand is that the choice exists and how to make it.
Suggestions:
Orders.BatchNumber seems to be irrelevant to a query that sums order quantity per product.
Vendor price, as I hinted at earlier this isn't always a one-to-one, they might also change over time. So I have no idea how you want to present this.
==========
one other comment.
None of the 6 columns you nominate as output come from the Customers table, so why join that table if it is not used. Similarly, if Orders.BatchNumber isn't relevant to this query then joining the Orders table isn't required either.
follow - up...
WHERE Orders.BatchNumber = 5304
ooops. You are filtering for a single batch number of orders. Sorry I missed that when making some comments above.
As long as you continue to filter thisi way, then you can include that in your GROUP BY without increasing the number of rows.
GROUP BY
Products_Joined.ProductCod e
, Products_Joined.ProductNam e
, Orders.BatchNumber --<< only one of these, so no extra rows
so, you have to know if these columns are "one to one" with each product or not
Products_Joined.warehousec ustom
Products_Joined.Vendor_Pri ce
WHERE Orders.BatchNumber = 5304
ooops. You are filtering for a single batch number of orders. Sorry I missed that when making some comments above.
As long as you continue to filter thisi way, then you can include that in your GROUP BY without increasing the number of rows.
GROUP BY
Products_Joined.ProductCod
, Products_Joined.ProductNam
, Orders.BatchNumber --<< only one of these, so no extra rows
so, you have to know if these columns are "one to one" with each product or not
Products_Joined.warehousec
Products_Joined.Vendor_Pri
recommendation since you seem to only care about CodigoProducto and not Producto I recommend that you reduce the number of columns in the original query, remove some of the extra option in your From inner join and then group and aggregate the two columns of interest.
Here is a modified version of your original statement with lines remarks out. Line 3 and 7 are the only two columns you are working with after you join all your tables together. This will reduce the noise you are looking at. As others have outlined the more columns you add the more you will need to group and aggregate the data. This may return an undesirable set of results. So by keeping very simple you should get what you are looking for.
Here is a modified version of your original statement with lines remarks out. Line 3 and 7 are the only two columns you are working with after you join all your tables together. This will reduce the noise you are looking at. As others have outlined the more columns you add the more you will need to group and aggregate the data. This may return an undesirable set of results. So by keeping very simple you should get what you are looking for.
SELECT
--Orders.BatchNumber AS Remesa ---not needed
Products_Joined.ProductCode AS CodigoProducto
--, Products_Joined.ProductName AS Producto ----Not Needed
--, Products_Joined.warehousecustom AS Arancel ---- Not Needed
--, Products_Joined.Vendor_Price AS PrecioUnitario ---- Not Needed
, Sum(OrderDetails.Quantity AS Cantidad)
--, (OrderDetails.Quantity * Products_Joined.Vendor_Price) AS TotalLinea
FROM Products_Joined
INNER JOIN OrderDetails ON Products_Joined.ProductCode = OrderDetails.ProductCode
INNER JOIN Orders ON OrderDetails.OrderID = Orders.OrderID
INNER JOIN Customers ON Orders.CustomerID = Customers.CustomerID
WHERE Orders.BatchNumber = 5304
Group by Product_Joined.CodigoProducto
ASKER
Thanks all. Let's not make it more complicated than necessary:
I will explain.
I need to see 6 columns of data for the output, namely:
Orders.BatchNumber - THIS WILL ALWAYS BE THE SAME, NO NEED TO SUM OR AGGREGATE
Products_Joined.ProductCod e
Products_Joined.ProductNam e - THIS WILL ALWAYS BE THE SAME, NO NEED TO SUM OR AGGREGATE
Products_Joined.warehousec ustom - THIS WILL ALWAYS BE THE SAME, NO NEED TO SUM OR AGGREGATE
Products_Joined.Vendor_Pri ce - THIS WILL ALWAYS BE THE SAME, NO NEED TO SUM OR AGGREGATE
OrderDetails.Quantity
Notice that there are four fields that simply will always stay the same. These are just the information I need in the output. None of the data in these columns will vary
Basically, I need to look up all the orders in:
Orders.BatchNumber
Then I need a summarized output based on the above-mentioned criteria.
I will explain.
I need to see 6 columns of data for the output, namely:
Orders.BatchNumber - THIS WILL ALWAYS BE THE SAME, NO NEED TO SUM OR AGGREGATE
Products_Joined.ProductCod
Products_Joined.ProductNam
Products_Joined.warehousec
Products_Joined.Vendor_Pri
OrderDetails.Quantity
Notice that there are four fields that simply will always stay the same. These are just the information I need in the output. None of the data in these columns will vary
Basically, I need to look up all the orders in:
Orders.BatchNumber
Then I need a summarized output based on the above-mentioned criteria.
Then use all columns except quantity in your GROUP BY clause
Use SUM() for quantity
& You do not need the Customer table in the query
Use SUM() for quantity
& You do not need the Customer table in the query
SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
ASKER CERTIFIED SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
ASKER
Thanks guys! I'm all set
You can build another cartesian then sum with a group by
Open in new window