We help IT Professionals succeed at work.

Complex Cf query and order display question

CFbubu
CFbubu asked
on
Hi, this question has stumped me and I am hoping someone would be able to enlighten me. It is kinda complex(to me) and I hope I can explain it well.

Question:
1) Let's say I have a DB table with 100 lines. Each line has info for 1 item, thus, DB table has 100 line items.
2) Each of these line items has to be entered with the inclusion of 2 critical data fields. One is 'time-stamp' at the instant of entry into the DB, and the other is 'View-Count'.
3) I have a page that I want to display 4 line items. These 4 items would be drawn from a query from the above mentioned table with the 100 line items.
4) The 4 items chosen to be displayed on the page will have to be displayed in the order of 'time_stamp' of when it was displayed, (oldest first) and 'view_count'.
5) Each time the user refreshes the page, it will show the next 4 items based on criteria in point number 4. Each time an item is displayed, the 'view_count' for that item is to be increased by +1 and the ''time_stamp' field updated by Now().

(Example)
Therefore, the way this works is that any items that has NOT been 'displayed'('view_count' of '0')  will take priority over any items that have already been displayed. If all items have been displayed once with a 'view_count ' of '1', and a new item is entered into the DB table(new item view_count' of '0'), than the new item will take priority.  If all line items have already been displayed once, than the display of the items will be prioritised by which items are 'older' based on the displayed 'time_stamp' field.

This means if 4 items are displayed on the page(From above paragraph), 3 items will have a view count of '1' while one item(new DB entry) will have a 'view_count' of '0'. Of the 3 items with a view_count of '1', they are arranged in the order of  when they were last displayed from the 'time_stamp' field. Of course, after all these 4 items are displayed on the page, all their 'view_counts' would go up by +1 and their respective 'time_stamp' fields would be updated to Now().

6) Each items to be displayed are only to be displayed for 10 days from the date it was first displayed. So this will require something like a 'Time_Display_Startdate' for each line item. This field would probably store time_stamp when the item is first displayed and this will not have to be updated.

(Interesting twist)
7) If just starting out, and the DB table has less than 100 items, lets say only 2 items, BUT the page still has to display 4 items, than would it be possible to fill in the other 2 slots with a random query from another DB(No special requirements here)? Similarly, if the original table only had 3 items, than the page will display the 3 items from the first query, and than 'fill' in the 4th slot with an item from another source.


So, from the above theoratical questions, I hope to be able to 'cycle' through the items in a DB 'fairly' without repeating the items for a limited time(10 days), and be able to draw from another source(No special requirements for the second source as it will just be getting 2 random items) if there are not enough items from the first table.

I realise this is a fairly complex question, but hope someone here may be able to see what I am trying to achieve and help me out. I really would like to know how I would be able to write the CF code/queries  to achieve such a mechanism. For better clarity, maybe we could say that the table is named 'T_Product' with current fields 'View_Count', 'Display_Timestamp', and 'Time_Display_Startdate'. Do feel free to add any other fields or suggestions/changes to make this work.

Thank you so very much!
Comment
Watch Question

CERTIFIED EXPERT
Commented:
That's a lot of words for this question!

select top 4 * from t_product
where Time_Display_Startdate > #createODBCdate(dateAdd("d",-10,now()))#
order by display_timestamp desc, view_count desc


If your query comes up with fewer than 4, then you need to just add in others manually or fetch them from a "defaults" table...

<cfif theQuery.recordCount lt 4>
   ... add some ...

</cfif>


CERTIFIED EXPERT
Most Valuable Expert 2015
Commented:
Building on gdemaria's query, I'll try and address some of the other questions.

> a new item is entered into the DB table(new item view_count' of '0'), than the new item
> will take priority.

You can use a CASE statement to assign "new" records a higher priority.  I don't know your db type, so I'll assume MS SQL. Note, if you don't want to pull the same records each time, you'll need to use NewID() in the ORDER BY to randomize the results further

SELECT    TOP 4
                CASE WHEN view_count = 0 THEN 0 ELSE 1 END AS Priority
               , OtherColumns
FROM     T_Product
WHERE  Time_Display_Startdate > #createODBCdate(dateAdd("d",-10,now()))#
ORDE BY  Priority, Display_Timestamp ASC, ViewCount ASC

 

> with a random query from another DB(No special requirements here)?

Another database or table?  If it's in another table, or one accessible from your dsn, you can just use a UNION.  I can provide an example once I know the answer to that question.
CERTIFIED EXPERT
Most Valuable Expert 2015

Commented:
> Each time the user refreshes the page, it will show the next 4 items based on
> criteria in point number 4. Each time an item is displayed

Hm... logically I'm not clear on how that parts supposed to work. It's not like paging through a static set of records. Your "results" are dynamic and based on timestamps. So they could change every time another user(s) view the data. What might be results 13-16 for userA one moment, would change as soon as another user viewed any of the records.

But there's a lot of pieces to this question ... so maybe I'm misunderstanding the goal.

Author

Commented:
Wow...you guys are great :)

Btw, I am using MySQL DB.

What I am ultimately hoping to achieve is for 4 records to be displayed each time a user refreshes or hits on the page. Each different user may see 4 different records that are being displayed compered to a user on another different computer. The criteria is that these 4 records must all be different. So 'theoratically' if a user were to just keep pressing the refresh button on the same page, it will keep showing 4 different records, and cycle throught the whole DB line items without repeating. Once all have been viewed, than it will cycle thorugh the round all over again, but this time it will be looking for items with a view count of '1'? But if a new item is added, than it will display the new item(view count '0') as one of the 4 as priority, and than continue to pull up the remaining 3 that already is on view count '1'.  On top of all this, it will be choosing also to display the records by time_stamp of when it was last displayed.

Thanks, I see how your solutions could work, but I am still uncertain as how to code/structure it so that when the page is displayed, it will add +1 to the 'View_count' of the ones being displayed, update the time displayed timestamp field of the one's currently displayed, and how to code the page itself to display the 4 items with the filling in of the 'missing' items if there are fewer than 4 items from the main DB table.

Hope this makes it clearer.

Thanks again for your patience.

Author

Commented:
Sorry, not sure if I was clear...the view counts will keep going up
CERTIFIED EXPERT
Most Valuable Expert 2015
Commented:
(I have to get on the road for an hour or so But will be back later.)

> when the page is displayed, it will add +1 to the 'View_count' of the ones being displayed

That actually the easy part ;-) Once you've have the records to display in a query, you just feed the recordID's into an UPDATE query. Use cfqueryparam in the real query:


ie   <cfset idsToUpdate = valueList(recordToDisplayQuery.theUniqueRecordIDColumn)>
      <cfquery ...>
            UPDATE  YourTable
            SET        ViewCount = ViewCount + 1,
                          DateLastDisplayed = now()
            WHERE  theUniqueRecordIDColumn IN (
                     <cfqueryparam value="#idsToUpdate#" list="true" cfsqltype="cf_sql_integer">
            )
      </cfquery>

Author

Commented:
hhmm...about what you wrote _agx_

"Hm... logically I'm not clear on how that parts supposed to work. It's not like paging through a static set of records. Your "results" are dynamic and based on timestamps. So they could change every time another user(s) view the data. What might be results 13-16 for userA one moment, would change as soon as another user viewed any of the records.
"
I think you may have some idea of what I am trying to achieve. But even though each user may cause the time stamp to be updated for a line item, won't 'all' users still be viewing the 'next' 4 items from the one the last user had viewed?

Given your experience, how would you have handled such a requirement and made it possible? If not constrained by the above requirements? How ould I make it possible for 4 records to be displayed each time, same for each different user, and make it so that it will cycle through the records without repeating?  And to be able to fill in the missing records if there are less than 4 from the DB with a pull from another source? I am guessing some query caching may be involved somewhere.

Sorry again for all my questions. I am just at a lost for a good, efficient solution. I am thank ful for your help.
CERTIFIED EXPERT
Most Valuable Expert 2015

Commented:
> But even though each user may cause the time stamp to be updated for a line item,
> won't 'all' users still be viewing the 'next' 4 items from the one the last user had viewed?

No. The order of your results is determined by the timestamp value. Since query results are dynamic, if that value changes, the order of your results change - for every user.  If the data were static, I might suggest caching the query. But then it wouldn't reflect the changes in timestamp and view_count.  Unless maybe you cycled through a snapshot of the results. And only refreshed it when reaching the end.

I guess I'm still seeing a logic flaw in the requirements ;-) The results can't be dynamic and static at the same time.  If they need to be based on the latest timestamp date (from any user) you can't also maintain the order.  

Author

Commented:
I realize that now, but how would you design something similar that would not suffer from such logic flaws? I would welcome any suggestions. Thanks again.
CERTIFIED EXPERT
Most Valuable Expert 2015

Commented:
It's not perfect, but the closest I can think of would be to essentially take a snapshot of the results when a user starts viewing the 1st set.  Capture the ordered id's and store them in a work table (by user ID).   Then query the work table to display the results in sets of 4.  When the user has viewed all of the records, delete the records for that user and take another snapshot. Again, it's not perfect. But it'll the user get different records each time.  But still get the latest changes once they've cycled through all records.  

Also, I don't know if you saw my question about this "other" source of data.  It's relevant to how you'd fill in the missing records.

> with a random query from another DB(No special requirements here)?

Another database or table?  If it's in another table, or one accessible from your dsn, you can just use a UNION.  I can provide an example once I know the answer to that question.

CERTIFIED EXPERT
Most Valuable Expert 2015

Commented:
>  it'll the user get different records each time.  But still get the latest changes once they've cycled through all records.  

Gah... that should be  " it'll ENSURE the user gets 4 different records each time...."

CERTIFIED EXPERT

Commented:
Guys, this seems like an interesting conversation

How about this approach..

Fetch a list of ad IDs from the table in the correct order (new ones first, then by oldest touch).  Store these in a session variable.  On each page, pick off the top 4 IDs and show those ads, remove those IDs from the list.   Once the list has been exhausted, then simply refetch a new set of Ads based on the same oldest touch criteria.  

This will ensure that a user sees different ads and it will semi-dynamically force ads that don't have a lot of exposure to be viewed by a bunch of people earlier on.

Ok - I just read agx's idea again and it sounds like the same thing, except I'm suggesting a session variable instead of a table and I will reiterate that you're pulling a full set of IDs when the session starts...

I'm still posting this anyway because I guess if we both think it's a good idea, it may just be... :)
CERTIFIED EXPERT
Most Valuable Expert 2015

Commented:
> except I'm suggesting a session variable instead of a table

Exactly.  A session variable has the benefit of being a lot easier.  The sole reason I didn't push it is because I don't know how many records and users they're talking about.  If it's a lot, I'd go with a db table instead. Otherwise, session variables would be better/simpler.

Now the only thing we need to know is where this "other" default data is located, and we can throw together a few sample queries :)

Author

Commented:
Guys, yes...I am very impressed by your suggestions. I am trying to think up an Ad kind of system and have been stumped for quite sometime.

Yes, _agx_ i missed that part of your question, sorry. The way I envisioned the implementation to work is that when there are not enough line items in the table, when the page detects that there is less that 4 displayed items(In-house), it will just fill in the other 2 spots with feeds from an external party like yahoo, google etc.

I see where both of you have kinda come to the same kind of solution and think that it would be more logical.

But I am not understanding some of the details. In both the examples where either a snap shot of the table is used or if a session is used, you are saying that the server would have to create this each individual user. Won't it be really taxing for the server if there are like thousands of users viewing the pages? Also, if one uses the session method, if there are 1000 or more line items in the Ad table, will the session variable be able to hold so much data?(Or maybe I am not understanding)

Won't it be somewhat simpler if the implemetation could be such that the first person to hit the page will generate the query and thus display the 4 items, and than this query is cached for the next 5mins, so everyone else who browsed to the page will see the same 4 items no matter how many time they refreshed the page, than the person who hits the page on the fifth minute will create the new cached query that will display the next 4 items for 5 min...and so on? Or maybe I am not thinking through on this planning.

Next, I am not even sure to do the steps to go about creating such an implementation suggested by either of you. Or is that simpler than what I imagine it to be?

To be fair, I do think both of you have given me great solutions from my original posting, but I still am hoping to find a way to build such a query and display page. Will there be a way to show how either of you would implement your concept in code? Thanks.

I would like click the buttons to show that you guys have indeed answered my original question, but am not sure that after I do so, will you guys still be able to continue posting your solutions here?

You guys do give me some hope that what is in my mind may be possible to do...Thank you.
CERTIFIED EXPERT
Most Valuable Expert 2015

Commented:
> Yes, _agx_ i missed that part of your question, sorry. The way I envisioned the
> implementation to work ...

I think you misunderstood me :) I totally understand the goal.  What I was asking is where are those default records stored? You said they're stored in another database, which seemed odd. Normally you'd keep them in a different table - but still in the same database as T_Product.

> you are saying that the server would have to create this each individual user.

Yes

>  will the session variable be able to hold so much data?

Keep in mind you'd only be storing the record id's, not everything in the table.  So all that's held in memory is a list of numbers: "5,15,66,246,13" . But like anything else, session variables are limited by how much memory the server has. So if you expected really high volume or a really large recordset, storing them in the db might be better. However, it's more complicated than the session variable approach.


so everyone else who browsed to the page will see the same 4 items no matter how many time they refreshed the page, than the person who hits the page on the fifth minute will create the new cached query that will display the next 4 items for 5 min...and so on?


Well .. I didn't think that would meet your initial requirements.  But if it's acceptable that a user sees the same items for 5 minutes, that could work.  However, it wouldn't guarantee all users will see the entire list.  A user might visit the site midstream, when you were displaying records 900-904.  So they would never see the first 899 records, including any priority items.

But if that's okay, I'd tweak the approach to be a bit to be more like what  gdemaria suggested.  Just using the "application" scope instead of each user's session. So there'd only be one list of id's - shared by everyone.  

> I would like click the buttons to show that you guys have indeed answered my question

Don't worry, gd and I hang out here all the time. We're not going anywhere ;-) But as we start to get more detailed, you might want to split things into smaller questions.  So you can tackle one thing at a time.  

Author

Commented:
> _agx_"Well .. I didn't think that would meet your initial requirements.  But if it's acceptable that a user sees the same items for 5 minutes, that could work.  However, it wouldn't guarantee all users will see the entire list.  A user might visit the site midstream, when you were displaying records 900-904.  So they would never see the first 899 records, including any priority items. "

Well, I only suggested that as I was worried that a server may be too 'stressed' by the method you originally envisioned. I would of course prefer a solution whereby the 4 items are 'refreshed' each time a user hits the page so that more items can be 'cycled' through, exposed.

A friend refered mt to join this site and I was initially hesitant but now I have come to realise why he only had glowing things to say about it and about individuals who try to work out solutions here :).

Please pardon my initial long question as I have kept this issue in my head for sometime, and wanted to give you guys as complete a picture of what I envisioned I wanted to try to accomplish with you guys help.

Thank you for staying with me.


CERTIFIED EXPERT
Most Valuable Expert 2015

Commented:
> Well, I only suggested that as I was worried that a server may be too 'stressed'
> by the method you originally envisioned.

Well, obviously server memory has limits. So it's good that you want to avoid storing large objects in the session scope.  But in this case it's a simple list of numbers.  Let's say each list is 4000 characters long (ie the total length of all the id's separated by commas "1,2,3,4,5,6,7,....1000").  It would take up about 8kb of memory (roughly).  Multiply that by 1000 users and the total memory is about 8M.  Not too bad.  Keep in mind the estimate's very rough, but should give you an idea.

As far as which approach to use, I'd consider what's most important

            1) frequent cycling of data
            2) memory usage
            3) database hits
            4) simplicity
            5) scalability

... and pick the approach that strikes the best balance.  

          A) The application is strong on simplicity, memory and database activity. It's also the
          most scalable.  But it has the poorest cycling of data.
          B) Session scope is strong on simplicity and cycling of data. But has more frequent
          database hits and requires more memory
          C) The work table approach is strong on cycling of data and memory. But is more
          complex than any of the other methods.

> why he only had glowing things to say about it and about individuals who try to work out solutions here

Thanks, and welcome to EE :) I think the CF zone crew really enjoys the language, we like tackling tricky design/code problems.

Author

Commented:
Well, of the above choices you have provided, and I think you did so clearly give a good breakdown analysis of priorities, I would think approach 'A' would be what will be the best at this time. I am not too well versed in CF, and coding, and the extra complexity that is required in approach 'C' not be in my best interest even though I think it's the best solution. I also do not think approach 'B' would be suitable as I am hoping to minimize DB hits and memory usage.

Therefore, I am very interested to see how you will approach method 'A' even though it has the poorest cycling of data. I am hoping it will still be suitable/acceptable rate :). I am guessing this cycling time is easily adjustable.

It's been a great pleasure of mine to be introduced so nicely to the CF zone crew!
CERTIFIED EXPERT
Most Valuable Expert 2015

Commented:
Ok, sounds like a plan.  (Btw: Unless you're expecting 100K visitors, the session scope would probably work as well. Fortunately, if you design things right it wouldn't be hard to switch.  As it's very similar to the application scope approach.)

> I am guessing this cycling time is easily adjustable.

Yep.

> I am very interested to see how you will approach method 'A'

Ok. So getting back to the earlier question, where are your "default" values (used to fill in missing values) stored? In another table or another database? The answer effects what code you'll need.

Fyi, I'm going to have to head off to sleep soon. But I'll be back tomorrow morning ...or I guess it's this morning already ;)
CERTIFIED EXPERT
Most Valuable Expert 2015

Commented:
> the other 2 spots with feeds from an external party like yahoo, google etc

Ok.. now I know it's time for me to go to sleep ;-) Re-reading the comments,  I see I missed that you mentioned the data comes from an external "feed" wa-ay back.  Sorry about that.  But feel free to elaborate while I'm gone.  I'll check back later.

Night!

Author

Commented:
Thanks for your clarifications.
The reason behind the 'filling' in of spots by outside ads feeds like yahoo, google etc. is that if for some reason there is not enough internal line items to fill all 4 spots, than the 'vacant' spots will be filled by the 'random' yahoo/google ads. This way, all the spots on the page are always utilized.

Do have a good evening.
CERTIFIED EXPERT

Commented:
You guys were up late!     So, I'm catching up again here...

>    A) The application is strong on simplicity, memory and database activity. It's also the
          most scalable.  But it has the poorest cycling of data.

Not sure what approach this is?  Is it to use application scope variables?  Or perhaps the application scoped cached query so that everyone is using the same ads for X number of minutes.

If I understand the approach, the downside is that an individual user will see the same ads repeatedly.  Often users aren't even on a website for more than 5 minutes, in this scenario, the user will see the same ads again and again.

I think some of the possible problems we've discussed can be mitigated...

Session variables - perhaps there is no need to store the entire table of ad IDs in a session variable.  Grabbing 100 ad IDs for a user will give him 25 page views with different ads.  At the end of the 25 pages (if the user is still around), grab another 100 ad IDs.

Database Hits - Unless the ad table is really huge, I don't think grabbing the top 4 records of the table, once for every page view, would be an issue.   Using cfqueryparam binding, the DB will return the results in a nano second... you could query it 50 times on a page and not see a performance hit.

Here's a variation on the application scope method that may help...
Use a session variable to keep track of the number of page views the user has done during the session.   <cfset session.pageCount = session.pageCount + 1>.   Use this number to "walk through" the application scoped IDs.    If we are talking about a list of IDs in an applicaton scope variable, then start grabbing the ID at position "session.pageCount x 4"   If we are talking about an application scoped query, then start grabbing at record count "session.pageCount x 4" and take 4 records.    This approach would still give the user a different view each page and reduce session memory and database hits (although I'm not sure how necessary that is).


So, once we settle an approach, we can code it without much effort.

I think we should save the "filling in" ads (if there are fewer than 4) either until the end or a different question.


CERTIFIED EXPERT
Most Valuable Expert 2015

Commented:
> Database Hits - Unless the ad table is really huge, I don't think grabbing the top 4
> records of the table, once for every page view, would be an issue.

Remember it's not just select's that will be involved. The viewcount and timestamp need to be updated on every view.  Updates take an exclusive lock. So if there's a lot of concurrency there's a high potential for blocking or deadlocks.  I remember running into that in one application using ms sql and it took me by surprise. There's things you can do to mitigate it, like using the right index. But that's something to keep in mind if they're planning on very high concurrency.

CERTIFIED EXPERT

Commented:

True, but I don't think we addressed the update issue in any scenario.   I was thinking that the ad count would be updated on each page view regardless of the fetching option chosen.  

Did you have different update options as well?

Also, perhaps we should find out the real proposed size of the application.   A thousand users a day wouldn't be much to worry about..  

CFbubu, what could be the number of users and the number of ads?

Author

Commented:
Wow...you guys are up and about and active so early! :)

I try to always plan for high concurrency which in most cases makes me try to develop/code in a more efficient manner. However, more often than not, I am not well versed enough to code that way, and am very certain much stuff I do can be optimized much further. I am hoping to learn from you guys a great deal, that is if I am able to understand and apply your solutions to my vision ..:P

I was hoping to have the external "fill- in" concept from the get-go as I know that at the start, the DB will probably be empty or only have a few lines, and I hope to be able to fill in the empty 'spots' with generic yahoo/google ads until there is an adequate enough line items in the DB. However, I am building this out with the mind that it has to sustain high traffic(who doesn't hope..hee), and that was why I do plan on high concurrency and want to code with that in mind.

Again, _agx_ and gdmaria, thx for both your insight and time.

I can't wait to see what is finally brewed!

Author

Commented:
It definately has to sustain more than a 1000 users a day. Hoping alot more.
CERTIFIED EXPERT
Most Valuable Expert 2015

Commented:
> True, but I don't think we addressed the update issue in any scenario.   I was thinking
> that the ad count would be updated on each page view regardless of the fetching option chosen.  

Right. But if the updates are static, the potential for blocking is directly related to how frequent the selects are.  ie Application scope (fewest selects/lower blocking potential).  Session scope (more selects/more blocking potential). Database table (most selects/highest blocking potential). And again, I'm referring to concurrent users. In the app where I hit deadlocking problems it was only a few hundred users. Granted it was very complex legacy app with LOTS of sql.  But getting a deadlock on a very simple select was a real eye opener ;-)
CERTIFIED EXPERT

Commented:

I don't see how to get around updating the ad count once per page view, we'll see what agx says about that.  It seems any other way would be inaccurate.  I don't see it as a problem though either.  I update a sessions table with every page view (last active date for example)  and the site gets 90,000 views/month.

> I was hoping to have the external "fill- in" concept from the get-go

I was just suggesting that we figure out the first part before moving on.   We need to know the horse before we can fit a cart to it..

Author

Commented:
Sorry...you

Author

Commented:
Oops....hit the submit button by mistake....by hitting the 'Enter' key

gdmaria- "I was just suggesting that we figure out the first part before moving on.   We need to know the horse before we can fit a cart to it.."

Yes..I agree.. Sorry, I just wanted you to know my intentions. Not sure if you guys needed that info in your approach when trying to help me.
 
CERTIFIED EXPERT

Commented:
agx, our posts crossed.  Makes sense if we keep the selects down we can do any update on every page.

So what's your preference guys?  

1)   session variables (limited to 100 IDs)

2)    application scope variables (with IDs or Query)
        -- possibly using session counter?

3)    query every page --- ruled out

4)    other?

CERTIFIED EXPERT
Most Valuable Expert 2015

Commented:
Option 1) sounds like it strikes a good balance. But I'm not clear on how you propose to get the next 100 id's, without duplication.

> It definately has to sustain more than a 1000 users a day. Hoping alot more.

Keep in mind with a lot of this stuff I'm talking about concurrent users ie 1000 users all hitting the app at the same time - not 1000 users throughout the course of 24hours.

Author

Commented:
HHmm.....I will be hoping to implement the solution here to several pages. Each page will have it's own query to pull a different set of line items and than display a specified number of ads on that page.
For example, in a kids section it will draw line items related to kids while in the flower section it will draw line items that relate to plants. Of course all the items will be stored inside the same table, but it will have a foreign_key to a another table that tells if it belongs to kids or plants.

gdemaria, I am not sure whats the best preference from your list. I am thinking either 1) or 2)?

Author

Commented:
_agx_, yes, I am looking for it to withstand the barrage of much more than 1000 uses to hit at the same time if everything pans out. Thanks.
CERTIFIED EXPERT

Commented:
>  Each page will have it's own query to pull a different set of line items and than display a specified number of ads on that page.

CFbubu, I think you need to share all your requirements up front... knowing that would have narrowed things down more quickly.   This statement rules out #2.


> Option 1) sounds like it strikes a good balance. But I'm not clear on how you propose to get the next 100 id's, without duplication.

The number could be higher, this is the way I look at it...

There are 100 IDs,ever page view, we pull off the top 4.  After 25 pages, the list is empty.   When the list is empty, we do another fetch.   After 25 pages, there will likely be a lot of movement in the view counts and the last touched fields.   While it is true that some ads may be duplicated in the next 25 pages, the question is.. does it matter if an ad repeats at that frequency?   The number can be changed to 200 ads which would be 50 page views....

What is the acceptable repeat, how many pages?


Author

Commented:
I do apologize for not writing all stuff up-front. I just didn't want to fill up the page with details that I presumed I could use the solution here across the board. I will try to be more clear and adequate with details from now on.

Yes, so if I have 10 pages on the site, than each page will have a different set of 4 ads that will relate to the 'content' of the page., and this will be logically tied to the query for that page.

I am not experience with envisioning an Ad engine like this as it's the first time for me trying to get one going. Let's say that at the given moment, there are 1000 line items that were drawn from the query on a specific page. Than if each time a page shows 4 items, than it would mean it would take 250 page views to see all of the ads for a single user. I am guessing all users will be able to view all ads from the first line item to the last if they refreshed the page 250 times. Once all the ads are 'seen', than the cycle will repeat itself and the user will go through the 1000 ads again or whatever the updated query will bring? I think the importance will have to ensure that all ads are shown in the cycle with minimal repeats so that it will give an equal chance for all ads to be shown in the cycle.

Author

Commented:
If the same user on page 3 of the the website does 3 page refreshes, and thus sees 12 ads(4 x3), and the user goes to another page and than clicks back to page 3 of the website, will that user see those same initial 12 ads or will it be possible to 'continue' to the 13th-15th ad for that page? I am guessing this means the session for how many ads the user has seen will have to be kept alive for a time period for each page to be able to keep track of this.

Man....I know how difficult it is to think in a 'virtual' setting how to code something...you passion for this field is admirable.
CERTIFIED EXPERT
Most Valuable Expert 2015

Commented:
What about gd's questions:  What is the acceptable repeat, how many pages?
CERTIFIED EXPERT

Commented:
CFbubu, I was suggesting that any given user would be guaranteed not to see duplicate ads for 25 pages or perhaps 50 pages.  After those pages, the new list of IDs would be fetched based on the low volume/last touch principle.  So, the next set of ads would be the lesser touched ads, but some of those ads could be duplicated from the first batch.   Are you suggesting that you want to ensure 250 page views before duplication?    I don't know about your sites, but my experience is that many sites will only get at most 10-25 page views before the person leaves the site.  

How many page views do you want to ensure before the possibility of a repeat ad?

Another factor is of course the number of "qualified" ads even available.   How many "flower" related ads will you have to present on a  page about flowers... you're gonna have to hae a ton of ads in your dattabase to bring up 1,000 ads related to flowers.   And not to start another path, but how can you even determine what the category should be... but that's for another post.

Author

Commented:
Hi Guys,

I am not sure what the best repeat rate would be but I would go along with your suggestion that the site will get a maxumin of 25 page views before a person leaves. In this case, what would be the optimal balance of method to be used?

Regarding having enough qualified ads for each page, that was why I wanted to have the 'fill's from the external source ads like yahoo/google if the quesry finds that there are currently no ads that relate to that category.

Please feel free to keep asking me details as I am trying my best to get this idea working and know that you guys need as much info as possible.

Thank you.

Author

Commented:
I forgot to ask, in a 'real' world setting, I know some sites 'guarantee' an ad will be displayed at least 3x per 24 hours...how do you think this type of logic was accomplished?

CERTIFIED EXPERT

Commented:
> I forgot to ask, in a 'real' world setting, I know some sites 'guarantee' an ad will be displayed at least 3x per 24 hours...how do you think this type of logic was accomplished

They have a lot of page views... or their page view to ad ratio allows for it...  if you get 30 hits per day and have 1,000 ads you won't make it.  But if you get 2,000 page views per day and have 50 ads you will..


You may want to consider pulling your own ads as backups rather than going to 3rd party ads (google/yahoo)

Author

Commented:
gdemaria_"They have a lot of page views... or their page view to ad ratio allows for it...  if you get 30 hits per day and have 1,000 ads you won't make it.  But if you get 2,000 page views per day and have 50 ads you will..
"
Ahh...yes that makes sense!

Well, the reason why I require to pull ads from external sources is that some of the pages may not have any/enough 'in-house' ads to fill the spots. So it would make logical sense to just use the random feeds.

I hope I did not cause you guys any more trouble. Still very appreciative that you guys are still putting your thoughts on this one.
CERTIFIED EXPERT
Commented:
anyway, time to code it...

In your onSessionStart function, predefine the session variable...

<cfset session.adIDs = "">

In your onRequestStart function...

<cfif len(session.adIDs) eq 0>
    <cfquery name="getAds"  ....>
       select top 100 adID
       from T_Product
       WHERE  Time_Display_Startdate > #createODBCdate(dateAdd("d",-10,now()))#
       order by isnull(Display_Timestamp,'1/1/1900') ASC
    </cfquery>
    <cfset session.adIDs = valueList(getAds.adID)>
</cfif>
<cfset request.thisPageAdIDs = "">
<cfloop index="kk" from="1" to="#min(listLen(session.adIDs,4)#">
   <cfset request.thisPageAdIDs = listAppend(request.thisPageAdIDs listFirst(session.adIDs))>
   <cfset session.adIDs = listRest(session.adIDs)>
</cfloop>



Then in your page code for viewing ads, show the ads represented by the IDs in this variable:  

   request.thisPageAdIDs

Author

Commented:
Hi gdemaria,

By using both the OnSessionStart, and onRequestStart function, I guess all this code will be included inside the application.cfc file. Would this mean that all the Ads to be displayed on all the pages will be driven by the same single query? However, I was hoping that the mechanism would be triggered by the user when the user actaully lands on a specific page. By this, I mean, if the user lands on a page with kids related content, the query would be dynamically generated by using the page URL as one of the query conditions. This is because on different pages, I would like the quesry to be generated by different URL conditions based on the content of the page.

Would the solution you show also work in these conditions?

I have just modifed your query with the inclusion of the URL conditions as an example below.

Example Query:

<cfquery name="getAds"  ....>
       select top 100 adID
       from T_Product
       WHERE  Time_Display_Startdate > #createODBCdate(dateAdd("d",-10,now()))#
       AND T_Product.category_ID = T_Category.category_ID
       AND tCategory = URL.category_ID
       order by isnull(Display_Timestamp,'1/1/1900') ASC
    </cfquery>

In the above example, sometime it will be driven by URL.category_ID, sometimes if may be driven by other URL parameters. All dependent on where the user is in site.

Thanks for all your valuable time so far.
CERTIFIED EXPERT

Commented:

Right, we have to take the category into account.  If the category will always be available via the URL or Form variable, then keeping the code in the application.cfc will work because those variables are available to the application.cfc -  just as you have shown in your code

(of course you need to put ##s around your variable and add cfqueryparam...)


> In the above example, sometime it will be driven by URL.category_ID, sometimes if may be driven by other URL parameters. All dependent on where the user is in site.

Sounds like this logic can be included in the application.cfc if it's all based on query string parameters.

Author

Commented:
sorry for my obtuse questions, just trying to wrap my mind around your solution.

So, inside the application.cfc file I will insert the code below:

(Application.cfc )
<cffunction name="onSessionStart">
       <cfset session.adIDs = "">
</cffunction>

 <cffunction name="onRequestStart" output="false" returnType="void">
         <cfif len(session.adIDs) eq 0>
               <cfquery name="getAds"  ....>
                select top 100 adID
                from T_Product
                WHERE  Time_Display_Startdate > #createODBCdate(dateAdd("d",-10,now()))#
                order by isnull(Display_Timestamp,'1/1/1900') ASC
                </cfquery>
                <cfset session.adIDs = valueList(getAds.adID)>
          </cfif>
          <cfset request.thisPageAdIDs = "">
          <cfloop index="kk" from="1" to="#min(listLen(session.adIDs,4)#">
           <cfset request.thisPageAdIDs = listAppend(request.thisPageAdIDs listFirst(session.adIDs))>
           <cfset session.adIDs = listRest(session.adIDs)>
          </cfloop>

</cffunction>

On any given page that I want to pull the ads, I use:

<cfoutput>#request.thisPageAdIDs#</cfoutput>

The above would therefore output the 4 Ad_IDs, with each refresh showing 4 different ads? How does this tie to the use of the page URL paremeters?

How do I dynamically alter the original query written inside the application.cfc file on the pages that has the Ad content being displayed that is determined by the URL parameters?

I assume that for the final step, on each of the pages that I want the Ads to be displayed, I would create another 'dynamic' query to further query the results of the original query in the application.cfc page and 'plug-in' the URL parameters as a final 'filter'? How do I do this step?

Thanks for your patience.



CERTIFIED EXPERT
Commented:
> with each refresh showing 4 different ads

Yes, check out the logic of the code, see how it takes 4 ids off the top of the list every run


> How does this tie to the use of the page URL paremeters?

You illustrated this in your previous post...

  WHERE  Time_Display_Startdate > #createODBCdate(dateAdd("d",-10,now()))#
  AND tCategory = #URL.category_ID#   <===================


> , I would create another 'dynamic' query to further query the results of the original query in the application.cfc page and 'plug-in' the URL parameters as a final 'filter'? How do I do this step?

Yes and No.

You need a query to fetch the details of the 4 ad IDs and be able to display them on the page.    You already have the 4 ids of the ads you're going to show, so there is no further filtering... you can't filter down the 4 IDs, those are the ones you're going to use.

Author

Commented:
Oh..I think I may be seeing what you are saying.....I will need a little time to digest your logic.

So, because the original 'Master' query sits on the application.cfc file, each time the page is refreshed, because of the OnRequestStart function, it will run the query on whatever page the user is hitting. However, if I did plug into the 'master' query the URL.parameter requirement, what if the URL paremeters were sometimes different on some of the site pages? For example, sometimes it uses URL.category_id , and sometimes it uses URL.destination_ID, or even some other URL parameter. Because of this, I would not be able to just hardcode a specific URL parameter into the 'master' query located inside the application.cfc file. Because if I did, than it will throw an error if it was looking for the URL.category_ID when the requirement for the page was URL.destination_ID?

I feel bad for bugging you so much this whole day. Sorry.
CERTIFIED EXPERT
Commented:
Ok, for some reason, I was thinking that the list of ads chosen was going to be by a user's session, not on a page-by-page basis.   Now that that's more clear, the conversation is all fairly moot.    The option (3) that we ruled out, is the option you need, because you HAVE to fetch the ads for every page view since one page view will be different from another page view.

Now, it's as simple as

select top 4 ...
  from (your ads table)
where --- your category matches page category

then display those ads...

We can forget about the session and application variables storing everything because we have to do a query every time.
CERTIFIED EXPERT
Most Valuable Expert 2015

Commented:
Looks like you guys have been busy since yesterday ..

( I'm in agreement so far, so I'm in lurking mode right now ;-)
CERTIFIED EXPERT

Commented:
Lol, well you had the night shift yesterday

Author

Commented:
Hi guys,

Sorry, I was away this afternoon. Just got back.

As from the last reply:

"Now, it's as simple as
select top 4 ...
  from (your ads table)
where --- your category matches page category

then display those ads...

We can forget about the session and application variables storing everything because we have to do a query every time."

Does this mean that the whole discussion we had about this topic cannot work if the queries have to be dynamic as I have to fetch the ads for different pages? I am thinking this way as your solution made use of session variables inside the application.cfc file.

Hope you guys are having a good weekend so far.
 
CERTIFIED EXPERT

Commented:
> if the queries have to be dynamic

Let's clear up the definition of dynamic.  Every query we discussed in this thread is dynamic because it has some variable that changes.   A static query would have no changes such as pulling a list of categories or roles for a select tag.


> Does this mean that the whole discussion we had about this topic cannot work if the queries have to be dynamic FOR EACH PAGE as I have to fetch the ads for different pages?

Right, there is no sense in storing the ad IDs in session or application variables to be used on future pages if you just have to pull a different list of IDs on the next page.

> I am thinking this way as your solution made use of session variables inside the application.cfc file.

No longer using session variables.

Just fetch the 4 Ads you need for the current page.. on every page


Author

Commented:
So if I do not use session varaiables, how do I 'cycle' through the Ads on each page to ensure that the ads displayed do not get repeated during each cycle? Would I than be just be having a fields like 'View-Count', 'Time_Displayed' and other necessary fields inside the main Ad_Table and somehow cause some kind of update mechanism to update these fields? In fact, I think, judging from the complexity of the whole discussion, I am thinking that if we do if this way, than each user will always be seeing different ads since each 4 ads will only be seen by 1 user, but this will allow the ads to be cycled very quickly...and I would be able to 'keep track" of how many times each ad was 'viewed'...For the query on each page itself, I would than think of caching that query for maybe like 5mins, so that the hits on the server will be reduced to maintain server performance.

I know I have dragged this one out long enough, and trust me, I feel very bad in constantly badgering you with my questions about this. You guys have hung on for so long trying to help me make this vision work. I was just hoping to be able to creating a feasible working solution.

Thank you sincerely.
CERTIFIED EXPERT
Commented:

I guess you need to decide your requirements, you can' have all these different things at once.   If your ads are going to be fetched depending on the page content, then that pretty much defines how you are going to do things.    If you want to change the requirement, we can explore the other options.

We've done a lot of talking about improving performance, but it's likely waaaay too premature.  If you're getting over 10 or 20,000 page views a day then we can revisit performance.

But give your requirements, the nano second that it will take to fetch 4 ads that are applicable to the page seems to be the most logical solution.   The code still resides in the application.cfc file so you can modify it in a central location when the site goes viral.

Simply update the view count and last touch as described above for the 4 ads, that will keep pushing them down the queue and allow you to show the oldest ones first.    But it's not that relevant to keep track of what ads have been seens since every page is going to show different ads based on category.

Author

Commented:
Thanks for all your help and comments _agx_ and gdemaria. I do need some time to digest all that has been discussed the last few days, and will try to use your solutions. I am very appreciative of your time spent on this issue, and as stated, have been very impressed and happy with the support I have received here on EE. I am not sure how to grade on multiple solutions and hope I have done the correct selections.

Thank you very much again!

Explore More ContentExplore courses, solutions, and other research materials related to this topic.