Link to home
Start Free TrialLog in
Avatar of dij8
dij8Flag for New Zealand

asked on

Order results by counting the number of times a value from the table appears in another table

I have a table I am succesfully (I think) getting results from until I use the option to order the results by the number of times it appears in another table.

In as short a way as possible, I have a table with content details. This table has a reference in an item table, which has a reference in a module table, which has a reference in a container table, which has a reference in a page table which has a reference in a page hit table.  Every time a page on my website is visited a new entry is made in the page hit table.  I want to return all records from the content table ordered by Author.  No problem.  For each author I want to return their results in the order of most popular which can be seen by the number of entries in the PageHit table.  This is where the problems occur.

As soon as I add a count function in the order by I get an aggregate or group by is required.  So I added a count in the select and I still get the same.  I still don't understand inner, outer, left, right, et c. joins so only think I have the best options below by the results I am getting.

My sql and the error returned are as follows:
SELECT      ModuleHTMLContent.ModuleHTMLContentId, COUNT(ALL ModuleHTMLContent.Author)
      FROM      ModuleHTMLContent
            LEFT JOIN LookupValue CLV ON CLV.LookupValueId = ModuleHTMLContent.CategoryTypeId
            LEFT JOIN #tmpLGV CtmpLGV ON CtmpLGV.LookupValueId = CLV.LookupValueId
            LEFT JOIN LookupGroup CLG ON CLG.LookupGroupId = CtmpLGV.LookupGroupId
            LEFT JOIN LookupValue SLV ON SLV.LookupValueId = ModuleHTMLContent.SubCategoryTypeId
            LEFT JOIN #tmpLGV StmpLGV ON StmpLGV.LookupValueId = SLV.LookupValueId
            LEFT JOIN LookupGroup SLG ON SLG.LookupGroupId = StmpLGV.LookupGroupId
            LEFT JOIN ModuleHTMLContentItem gMHCI ON gMHCI.HTMLContentId = ModuleHTMLContent.ModuleHTMLContentId
            LEFT JOIN PageContainerModule gPCM ON gPCM.PageContainerModuleId = gMHCI.PageContainerModuleId
            LEFT JOIN PageContainer gPC ON gPC.PageContainerId = gPCM.PageContainerId
            LEFT JOIN Page gP ON gP.PageId = gPC.PageId
            LEFT JOIN PageHit gPH ON gPH.PageId = gP.PageId
WHERE      ModuleHTMLContent.IsOnHold<>1
            AND ModuleHTMLContent.CategoryTypeId > 0
AND (ModuleHTMLContent.CategoryTypeId = 1007)
AND (YEAR(ModuleHTMLContent.ItemDate) = YEAR(GETDATE()) )
AND (ModuleHTMLContent.Author<>'' AND ModuleHTMLContent.Author IS NOT NULL )
GROUP BY ModuleHTMLContent.ModuleHTMLContentId ORDER BY ModuleHTMLContent.Author ASC  
, COUNT (gPH.PageId) DESC
Column name 'ModuleHTMLContent.Author' is invalid in the ORDER BY clause because it is not contained in either an aggregate function or the GROUP BY clause.

Any help on solving this would be fantastic.  And with all answers, consider me stupid.  I need detailed explanations.
Avatar of QPR
QPR
Flag of New Zealand image

When you have an aggregate (count) in your select you need to group by all non-aggregate fields.
In this case you are using it in your order by so try adding ModuleHTMLContent.Author to the list of grouped by fields.
Does this now work?
ASKER CERTIFIED SOLUTION
Avatar of David Todd
David Todd
Flag of New Zealand image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
if this is your actual query, your count will not work if it returns more then two even if you somehow are able to figure the order by part with count

you can only do "order by 1" (order by first column) or "order by 2" (order by second column)

if you try ordering by 3 or more you will get something like
"The ORDER BY position number xx is out of range of the number of items in the select list."


oh just like always I didnt read your complete comment...
anyway you need to change your group by to ... as pointed out by QPR

group by
ModuleHTMLContent.ModuleHTMLContentId,
ModuleHTMLContent.Author

and if you want to group by count(*) you have to go by dtodd's suggestion....
combinely your query will look like this


Select      ModuleHTMLContentId,
      Cnt_Author
from      
(      SELECT   ModuleHTMLContent.ModuleHTMLContentId,
            ModuleHTMLContent.Author,
            COUNT (gPH.PageId) Cnt_PageID,
            COUNT(ALL ModuleHTMLContent.Author) Cnt_Author
      FROM     ModuleHTMLContent
      LEFT JOIN
            LookupValue CLV
      ON       CLV.LookupValueId = ModuleHTMLContent.CategoryTypeId
      LEFT JOIN
            #tmpLGV CtmpLGV
      ON       CtmpLGV.LookupValueId = CLV.LookupValueId
      LEFT JOIN
            LookupGroup CLG
      ON       CLG.LookupGroupId = CtmpLGV.LookupGroupId
      LEFT JOIN
            LookupValue SLV
      ON       SLV.LookupValueId = ModuleHTMLContent.SubCategoryTypeId
      LEFT JOIN
            #tmpLGV StmpLGV
      ON       StmpLGV.LookupValueId = SLV.LookupValueId
      LEFT JOIN
            LookupGroup SLG
      ON       SLG.LookupGroupId = StmpLGV.LookupGroupId
      LEFT JOIN
            ModuleHTMLContentItem gMHCI
      ON       gMHCI.HTMLContentId = ModuleHTMLContent.ModuleHTMLContentId
      LEFT JOIN
            PageContainerModule gPCM
      ON       gPCM.PageContainerModuleId = gMHCI.PageContainerModuleId
      LEFT JOIN
            PageContainer gPC
      ON       gPC.PageContainerId = gPCM.PageContainerId
      LEFT JOIN
            Page gP
      ON       gP.PageId = gPC.PageId
      LEFT JOIN
            PageHit gPH
      ON       gPH.PageId = gP.PageId
      WHERE    ModuleHTMLContent.IsOnHold<>1
               AND ModuleHTMLContent.CategoryTypeId > 0
            AND (ModuleHTMLContent.CategoryTypeId = 1007)
            AND (YEAR(ModuleHTMLContent.ItemDate) = YEAR(GETDATE()) )
            AND (ModuleHTMLContent.Author<>'' AND ModuleHTMLContent.Author IS NOT NULL )
      GROUP BY
            ModuleHTMLContent.ModuleHTMLContentId,
            ModuleHTMLContent.Author ) as Data
ORDER BY
      CNT_Author ASC,
      CNT_PageId DESC
Avatar of dij8

ASKER

I was pulled away from this job for for a bit so sorry for the delayed response.  I am now back on it.  Here's hoping I can remember where I was at with it all.

Ordering by most popular for a selected Author seems to be working as suggested above.  However, as soon as I try to add a second field it gets more complicated and no longer works.  In this case I am trying to sort items by category, sub-category, and most popular.  It gets more complicated though.  I want, for example, the three most popular items from each sub-category.  One sub-category might have half a dozen items more popular than the most popular for another sub-category.  I don't care, I only want the top three from that sub-category and then the top three from the other sub-category.

I'm trying to picture the logic never mind the syntax.  For each item within a sub-category the items should be ordered by popularity.  Each sub-category will be ordered by either alphabetical or a predefined order number (both set in other tables (alphabetical is in another table because the items reference to a sub-category is by ID and the actual sub-category name is not stored in the items table)).  This is then the same for the categories.

Say for example my tree of items is as follows (ordered by category and sub-category here although I'm not sure how in SQL syntax):
Cat 1
    Sub-Cat 1
        Item 1 - hit 200 times
        Item 9 - hit 160 times
        Item 7 - hit 157 times
        Item 14 - hit 142 times
        Item 5 - hit 121 times
        Item 19 - hit 98 times
        Item 24 - hit 7 times
        Item 10 - hit 0 times (no record in the hit count table then)
    Sub-Cat 3
        Item 3 - hit 60 times
        Item 4 - hit 45 times
        Item 11 - hit 42 times
        Item 15 - hit 38 times
Cat 2
    Sub-Cat 4
        Item 16 - hit 100 times
        Item 2 - hit 57 times
        Item 17 - hit 42 times
        Item 18 - hit 21 times
        Item  - hit 8 times
    Sub-Cat 5
        Item 21 - hit 60 times
        Item 25 - hit 45 times
        Item 8 - hit 42 times
        Item 20 - hit 38 times
Cat 3
    Sub-Cat 2
        Item 13 - hit 4times
        Item 22 - hit 0 times (no record in the hit count table then)
    Sub-Cat 6
        Item 26 - hit 6 times
        Item 6 - hit 4 times
        Item 12 - hit 4 times
        Item 23 - hit 3 times

If you can follow this logic I would want the results in the following order (being the top three from each sub-category):
1, 9, 7, 3, 4, 11, 16, 2, 17, 21, 25, 8, 13, 22 (this one may not be possible and is OK to miss out), 26, 6, 12

What sort of logic do I need to consider if I want to achieve this?  Remembering too that this is just one example of the sorting option that may be chosen.  I could sort by publish date, author, release date, popularity, category, sub-category and pretty much any combination pair of them (although when ordered by category or sub-category they will both be chosen automatically plus the other choice to make the pair actually three)

Does this sound as messed up as I am thinking it is?  I really need to achieve this.  if it has to be done in 3000 lines of SQL then so be it.  But I would prefer it to be much less than that of course.  Currently I use a dynamic SQL query which I believe doesn't use the caching ability of SQL.  Which I guess means even more need for making this as smart as possible to reduce query loads on the server.
Hi,

order by
  Cat asc,
  subCat asc,
  CNT_PageId desc

Regards
  David
Avatar of dij8

ASKER

It's just not doing what I want.  I feel like I am getting close but I have thought that many times before and aren't.  One version of my generated code that isn't working is as below.  The LookupGroupValue table holds lookup groups for my categories and sub-categories.  In the results I am after I have five categories that each have the same sub-categories.  I should be getting the top three results from each sub-category for each category (is essence, 15 results for each sub-category but ordered by category first).

Can someone point out where the following is wrong:
SELECT LookupGroupValue.* INTO #tmpLGV FROM LookupGroupValue WHERE LookupGroupValue.SiteId = 1 OR LookupGroupValue.SiteId = 0

SELECT      ModuleHTMLContent.ModuleHTMLContentId, ModuleHTMLContent.CategoryTypeId, ModuleHTMLContent.SubCategoryTypeId
FROM      ModuleHTMLContent
      INNER JOIN LookupValue CLV ON CLV.LookupValueId = ModuleHTMLContent.CategoryTypeId
      INNER JOIN #tmpLGV tCLGV ON tCLGV.LookupValueId = CLV.LookupValueId
      INNER JOIN LookupGroup CLG ON CLG.LookupGroupId = tCLGV.LookupGroupId
      INNER JOIN LookupValue SLV ON SLV.LookupValueId = ModuleHTMLContent.SubCategoryTypeId
      INNER JOIN #tmpLGV tSLGV ON tSLGV.LookupValueId = SLV.LookupValueId
      INNER JOIN LookupGroup SLG ON SLG.LookupGroupId = tSLGV.LookupGroupId
      WHERE      (ModuleHTMLContent.SiteId = 1)
            AND ModuleHTMLContent.IsOnHold<>1
            AND ModuleHTMLContent.CategoryTypeId > 0
            AND ModuleHTMLContent.ModuleHTMLContentId
                  IN (SELECT TOP 100000 a2.ModuleHTMLContentId
                        FROM ModuleHTMLContent a2
                              INNER JOIN LookupValue xCLV ON xCLV.LookupValueId = a2.CategoryTypeId
                              INNER JOIN #tmpLGV xtCLGV ON xtCLGV.LookupValueId = xCLV.LookupValueId
                              INNER JOIN LookupGroup xCLG ON xCLG.LookupGroupId = xtCLGV.LookupGroupId
                        WHERE ModuleHTMLContent.ModuleHTMLContentId IN
                              (SELECT TOP 3 a3.ModuleHTMLContentId
                                    FROM ModuleHTMLContent a3
                                          INNER JOIN LookupValue xSLV ON xSLV.LookupValueId = a3.SubCategoryTypeId
                                          INNER JOIN #tmpLGV xtSLGV ON xtSLGV.LookupValueId = xSLV.LookupValueId
                                          INNER JOIN LookupGroup xSLG ON xSLG.LookupGroupId = xtSLGV.LookupGroupId
                                    WHERE a3.SubCategoryTypeId = ModuleHTMLContent.SubCategoryTypeId
                                    ORDER BY  xSLV.[Name] ASC)
                        ORDER BY  xCLV.Name ASC)
      ORDER BY  CLV.Name ASC, SLV.[Name] ASC , ModuleHTMLContent.HitCount  DESC

DROP TABLE #tmpLGV
since you basically want top 3 sub categorires, and all that data is coming from ModuleHTMLContent table I wonder why you have to do top3 in the sub-query of where clause

actually sorry about that last post...I meant you need to this simply, you are on right track

I dont know where you are getting "Hits" from so following query assumes you have "hits" saved in  ModuleHTMLContent table. If hits is not coming from same table then I would say just insert ModuleHTMLContentId , CategoryTypeId , SubCategoryTypeId, and Hits by joining on all the tables and insert that data in a temp table and then use that temp table instead of ModuleHTMLContent in following query


select       a.ModuleHTMLContentId ,
      a.CategoryTypeId ,
      a.SubCategoryTypeId,
      a.Hits
from       ModuleHTMLContent a
where       hits in (
      select       Top 3 Hits
      from       ModuleHTMLContent B
      where       a.ModuleHTMLContentId=b.ModuleHTMLContentId and
            a.CategoryTypeId=a.CategoryTypeId
      order by Hits desc)
Order by a.ModuleHTMLContentId,
      a.Hits Desc
Avatar of dij8

ASKER

Thanks guys.  I didn't really understand but as it turns out, I think the best option was one of the first given.  Sorry for dragging this one out.

I have now started a new question as it isn't really a part of this one even though it is in the same SQL query.  Have a look at https://www.experts-exchange.com/questions/22093872/Select-Top-X-usng-two-level-sub-queries-in-where-clause.html and see if you can help me out there too.

Thank you again.