ccleebelt
asked on
Find Count of Existence of Row not Null within data set
I have a table called "VisitDetails" which tracks pages viewed within a Visit.
Looks like this:
VisitID PageID CategoryID ProductID (all are int, VisitID FK to a table called Visits which hold data about the visitor)
A Visit consists of multiple page views - they may be Pages, Categories, or Products.
I am trying to figure out how many Visits have a ProductID in any rows. For example data:
VisitID PageID CategoryID ProductID
1 NULL NULL 100
2 1000 NULL NULL
1 1000 NULL NULL
1 NULL 10000 NULL
2 NULL NULL 101
My ideal results would show me that 100% (count 2 of 2 unique Visits had a ProductID). This will tell me what proportion of Visits viewed any Product.
Struggling to write an effective query to do this.
Looks like this:
VisitID PageID CategoryID ProductID (all are int, VisitID FK to a table called Visits which hold data about the visitor)
A Visit consists of multiple page views - they may be Pages, Categories, or Products.
I am trying to figure out how many Visits have a ProductID in any rows. For example data:
VisitID PageID CategoryID ProductID
1 NULL NULL 100
2 1000 NULL NULL
1 1000 NULL NULL
1 NULL 10000 NULL
2 NULL NULL 101
My ideal results would show me that 100% (count 2 of 2 unique Visits had a ProductID). This will tell me what proportion of Visits viewed any Product.
Struggling to write an effective query to do this.
ASKER CERTIFIED SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
ASKER
Yes, it is very likely that a product is viewed in multiple visits.
I added a distinct to the visit ID and inner join to what Chaau offered and it seems to work.
I added a distinct to the visit ID and inner join to what Chaau offered and it seems to work.
I'm not sure where you inserted the distinct keyword, but I get some odd results when the data is like the possibility I presented
when using that query.
This query produced a result of .66667 which would represent 2 visits with productids and 3 visitids -
select cast(sum(case when prodcnt = 0 then 0 else 1 end) as float)/cast(count(visitid) as float) as prodvisitratio from
(select visitid, count(productid) prodcnt
from visitdetails
group by visitid);
Note - I don't have sql server available to me so I didn't need to cast, but added it since I think it may be necessary.
when using that query.
This query produced a result of .66667 which would represent 2 visits with productids and 3 visitids -
select cast(sum(case when prodcnt = 0 then 0 else 1 end) as float)/cast(count(visitid)
(select visitid, count(productid) prodcnt
from visitdetails
group by visitid);
Note - I don't have sql server available to me so I didn't need to cast, but added it since I think it may be necessary.
The subquery needs an alias as well
(select visitid, count(productid) prodcnt
from visitdetails
group by visitid) v
(select visitid, count(productid) prodcnt
from visitdetails
group by visitid) v
Did you try both methods with those two records added to your sample? I'm not worried about any points, but I am curious about the results.
VisitID PageID CategoryID ProductID
1 NULL NULL 100
2 1000 NULL NULL
1 1000 NULL NULL
1 NULL 10000 NULL
2 NULL NULL 101
3 1000 NULL NULL ==> No product id at all
1 NULL NULL 101 ==> Repeat VisitID for 2 ProductIDs