Link to home
Start Free TrialLog in
Avatar of ccleebelt
ccleebeltFlag for United States of America

asked on

Find Count of Existence of Row not Null within data set

I have a table called "VisitDetails" which tracks pages viewed within a Visit.

Looks like this:

VisitID     PageID     CategoryID      ProductID  (all are int, VisitID FK to a table called Visits which hold data about the visitor)

A Visit consists of multiple page views - they may be Pages, Categories, or Products.

I am trying to figure out how many Visits have a ProductID in any rows.  For example data:

VisitID     PageID     CategoryID      ProductID
1               NULL          NULL                100
2                1000          NULL                NULL
1              1000            NULL                NULL
1               NULL           10000               NULL
2              NULL             NULL                101

My ideal results would show me that 100% (count 2 of 2 unique Visits had a ProductID).  This will tell me what proportion of Visits viewed any Product.

Struggling to write an effective query to do this.
ASKER CERTIFIED SOLUTION
Avatar of chaau
chaau
Flag of Australia image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Would it be possible to have data like this?
VisitID     PageID     CategoryID      ProductID
 1               NULL          NULL                100
 2                1000          NULL                NULL
 1              1000            NULL                NULL
 1               NULL           10000               NULL
 2              NULL             NULL                101
 3              1000            NULL                NULL  ==> No product id at all
 1               NULL          NULL                101     ==> Repeat VisitID for 2 ProductIDs
Avatar of ccleebelt

ASKER

Yes, it is very likely that a product is viewed in multiple visits.

I added a distinct to the visit ID and inner join to what Chaau offered and it seems to work.
I'm not sure where you inserted the distinct keyword, but I get some odd results when the data is like the possibility I presented
when using that query.

This query produced  a result of .66667 which would represent 2 visits with productids and 3 visitids -
select cast(sum(case when prodcnt = 0 then 0 else 1 end) as float)/cast(count(visitid) as float) as prodvisitratio from
(select visitid, count(productid) prodcnt
 from visitdetails
 group by visitid);

Note - I don't have sql server available to me so I didn't need to cast, but added it since I think it may be necessary.
The subquery needs an alias as well
(select visitid, count(productid) prodcnt
  from visitdetails
  group by visitid) v
Did you try both methods with those two records added to your sample? I'm not worried about any points, but I am curious about the results.