Link to home
Start Free TrialLog in
Avatar of gregholl
greghollFlag for Ecuador

asked on

Help with a MSSQL statement Select Query (Grouping)

Hi. I'm doing a Query, and I would like to group the result by the value "Products_Joined.ProductCode"

Here is my code so far:

SELECT Orders.BatchNumber AS Remesa
     , Products_Joined.ProductCode AS CodigoProducto
     , Products_Joined.ProductName AS Producto
     , Products_Joined.warehousecustom AS Arancel
     , Products_Joined.Vendor_Price AS PrecioUnitario
     , OrderDetails.Quantity AS Cantidad
     , (OrderDetails.Quantity * Products_Joined.Vendor_Price) AS TotalLinea

FROM ((Products_Joined WITH (NOLOCK) INNER JOIN OrderDetails ON Products_Joined.ProductCode = OrderDetails.ProductCode) 
     INNER JOIN Orders 
     ON OrderDetails.OrderID = Orders.OrderID) 
          INNER JOIN Customers 
          ON Orders.CustomerID = Customers.CustomerID 
WHERE Orders.BatchNumber = 5304

Open in new window


So for example, if there are several lines of the code "Product-A" and several lines of "Product-B", the query results should return only TWO lines, one for Product-A and one for Product-B. Furthermore, the quantity value for each ProductCode should be summed, i.e.: OrderDetails.Quantity.

As you can see, we are pulling 6 fields from the database:
Orders.BatchNumber
Products_Joined.ProductCode
Products_Joined.ProductName
Products_Joined.warehousecustom
Products_Joined.Vendor_Price
OrderDetails.Quantity

Please help out with that. Thanks. Let me know if it all makes sense.
Avatar of yo_bee
yo_bee
Flag of United States of America image

If I am understanding your query you are getting results, just not able to get to the file stage where you want to group.

You can build another  cartesian then sum with a group by

Select X.ProductName,sum(x.quantity)

From (SELECT Orders.BatchNumber AS Remesa
     , Products_Joined.ProductCode AS CodigoProducto
     , Products_Joined.ProductName AS Producto
     , Products_Joined.warehousecustom AS Arancel
     , Products_Joined.Vendor_Price AS PrecioUnitario
     , OrderDetails.Quantity AS Cantidad
     , (OrderDetails.Quantity * Products_Joined.Vendor_Price) AS TotalLinea

FROM ((Products_Joined WITH (NOLOCK) INNER JOIN OrderDetails ON Products_Joined.ProductCode = OrderDetails.ProductCode) 
     INNER JOIN Orders 
     ON OrderDetails.OrderID = Orders.OrderID) 
          INNER JOIN Customers 
          ON Orders.CustomerID = Customers.CustomerID 
WHERE Orders.BatchNumber = 5304) as X
Group by x.ProductName

Open in new window

Avatar of gregholl

ASKER

OK. Thanks. I tried that and I got this error:

Your SQL is invalid: Invalid column name 'ProductName'.
If you run your query that you posted do you get results?
Yes. My query without the modification works fine.
I think you need to change the select x.productname to your alias name

Select X.Producto,sum(x.Cantidad)

From (SELECT Orders.BatchNumber AS Remesa
     , Products_Joined.ProductCode AS CodigoProducto
     , Products_Joined.ProductName AS Producto
     , Products_Joined.warehousecustom AS Arancel
     , Products_Joined.Vendor_Price AS PrecioUnitario
     , OrderDetails.Quantity AS Cantidad
     , (OrderDetails.Quantity * Products_Joined.Vendor_Price) AS TotalLinea

FROM ((Products_Joined WITH (NOLOCK) INNER JOIN OrderDetails ON Products_Joined.ProductCode = OrderDetails.ProductCode) 
     INNER JOIN Orders 
     ON OrderDetails.OrderID = Orders.OrderID) 
          INNER JOIN Customers 
          ON Orders.CustomerID = Customers.CustomerID 
WHERE Orders.BatchNumber = 5304) as X
Group by x.Producto

Open in new window

Excellent. We're getting there.

First of all, I would rather group by 'CodigoProducto' than 'Producto', but I can change that easy enough.

Now I get a list returned of only one column, namely: 'Producto'. How do I make it so that all the columns show?
Avatar of Saurabh Dua
Saurabh Dua

This can make sense only once you decide what value you want in other columns than CodigoProducto. e.g. for Quantity , you want it to be summed up , what about rest of columns? They have to be either aggregated ( sum etc. ) or present in group by .

Aggregation is needed since you need only 2 rows in output , else you can include them in group by and multiple rows will come if multiplle values are present in them for same CodigoProducto.

e.g. if u want max:

Select X.CodigoProducto,sum(x.Cantidad),
max(x.Remesa),
max(x.Producto),
max(x.Arancel),
max(x.PrecioUnitario),
sum(TotalLinea)


From (SELECT Orders.BatchNumber AS Remesa
     , Products_Joined.ProductCode AS CodigoProducto
     , Products_Joined.ProductName AS Producto
     , Products_Joined.warehousecustom AS Arancel
     , Products_Joined.Vendor_Price AS PrecioUnitario
     , OrderDetails.Quantity AS Cantidad
     , (OrderDetails.Quantity * Products_Joined.Vendor_Price) AS TotalLinea

FROM ((Products_Joined WITH (NOLOCK) INNER JOIN OrderDetails ON Products_Joined.ProductCode = OrderDetails.ProductCode)
     INNER JOIN Orders
     ON OrderDetails.OrderID = Orders.OrderID)
          INNER JOIN Customers
          ON Orders.CustomerID = Customers.CustomerID
WHERE Orders.BatchNumber = 5304) as X
Group by x.Producto
... the query results should return only TWO lines, one for Product-A and one for Product-B. Furthermore, the quantity value for each ProductCode should be summed ...

As you can see, we are pulling 6 fields from the database:
Orders.BatchNumber
Products_Joined.ProductCode
Products_Joined.ProductName
Products_Joined.warehousecustom
Products_Joined.Vendor_Price
OrderDetails.Quantity
GROUP BY produces a distinct set of rows for ALL columns included in that clause.

So while it is probably quite reasonable to assume that there is only one product name per product code. With this:

GROUP BY
        Products_Joined.ProductCode
      , Products_Joined.ProductName

the result of 2 rows, should be possible

Product-A Name of Product A
Product-B Name of Product B

However, let us assume you have many orders for your products, and that these have been processed  over several batches. What then do you want from the column of Orders.BatchNumber ? If we include that column in the group by clause we get a new row when the batch number changes, like this:

Product-A Name of Product A Batch1
Product-A Name of Product A Batch2
Product-A Name of Product A Batch3
Product-B Name of Product B Batch1
Product-B Name of Product B Batch2
Product-B Name of Product B Batch3

We could instead COUNT(DISTINCT batches) which would look like this

Product-A Name of Product A  3
Product-B Name of Product B  3

Then maybe there are 2 vendors for each product? what do you want to do with the column Products_Joined.Vendor_Price

perhaps the minimum and maximum price?, the average price? (but this may be difficult!)

The point I am trying to make is for EVERY COLUMN you include in this query you have to make a fundamental choice:

Do I want this column to define how rows are formed?
    if Yes: then include it in the group by clause
    if No: then you must aggregate it somehow, e.g. SUM() MIN() MAX() AVG() COUNT()

It is not our role to make those choices for you, although we can make suggestions. The MOST important thing to understand is that the choice exists and how to make it.

Suggestions:
Orders.BatchNumber seems to be irrelevant to a query that sums order quantity per product.

Vendor price, as I hinted at earlier this isn't always a one-to-one, they might also change over time. So I have no idea how you want to present this.

==========
one other comment.

None of the 6 columns you nominate as output come from the Customers table, so why join that table if it is not used. Similarly, if Orders.BatchNumber isn't relevant to this query then joining the Orders table isn't required either.
follow - up...

WHERE Orders.BatchNumber = 5304

ooops. You are filtering for a single batch number of orders. Sorry I missed that when making some comments above.

As long as you continue to filter thisi way, then you can include that in your GROUP BY without increasing the number of rows.

GROUP BY
        Products_Joined.ProductCode
      , Products_Joined.ProductName
      , Orders.BatchNumber                   --<< only one of these, so no extra rows

so, you have to know if these columns are "one to one" with each product or not

Products_Joined.warehousecustom
Products_Joined.Vendor_Price
recommendation since you seem to only care about CodigoProducto and not Producto I recommend that you reduce the number of columns in the original query, remove some of the extra option in your From inner join and then group and aggregate the two columns of interest.

Here is a modified version of your original statement with lines remarks out. Line 3 and 7 are the only two columns you are working with after you join all your tables together.  This will reduce the noise you are looking at.   As others have outlined the more columns you add the more you will need to group and aggregate the data.  This may return an undesirable set of results.  So by keeping very simple you should get what you are looking for.

SELECT 
	 --Orders.BatchNumber AS Remesa ---not needed
      Products_Joined.ProductCode AS CodigoProducto
     --, Products_Joined.ProductName AS Producto  ----Not Needed
     --, Products_Joined.warehousecustom AS Arancel ---- Not Needed
     --, Products_Joined.Vendor_Price AS PrecioUnitario  ---- Not Needed
     , Sum(OrderDetails.Quantity AS Cantidad)
     --, (OrderDetails.Quantity * Products_Joined.Vendor_Price) AS TotalLinea

FROM	Products_Joined
		INNER JOIN OrderDetails ON Products_Joined.ProductCode = OrderDetails.ProductCode 
		INNER JOIN Orders ON OrderDetails.OrderID = Orders.OrderID
        INNER JOIN Customers ON Orders.CustomerID = Customers.CustomerID 
WHERE Orders.BatchNumber = 5304

Group by Product_Joined.CodigoProducto

Open in new window

Thanks all. Let's not make it more complicated than necessary:

I will explain.

I need to see 6 columns of data for the output, namely:

Orders.BatchNumber - THIS WILL ALWAYS BE THE SAME, NO NEED TO SUM OR AGGREGATE
Products_Joined.ProductCode
Products_Joined.ProductName - THIS WILL ALWAYS BE THE SAME, NO NEED TO SUM OR AGGREGATE
Products_Joined.warehousecustom - THIS WILL ALWAYS BE THE SAME, NO NEED TO SUM OR AGGREGATE
Products_Joined.Vendor_Price - THIS WILL ALWAYS BE THE SAME, NO NEED TO SUM OR AGGREGATE
OrderDetails.Quantity

Notice that there are four fields that simply will always stay the same. These are just the information I need in the output. None of the data in these columns will vary

Basically, I need to look up all the orders in:
Orders.BatchNumber

Then I need a summarized output based on the above-mentioned criteria.
Then use all columns  except quantity in your  GROUP BY clause
Use SUM() for quantity

& You do not need the Customer table in the query
SOLUTION
Avatar of yo_bee
yo_bee
Flag of United States of America image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
ASKER CERTIFIED SOLUTION
Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Thanks guys! I'm all set