Link to home
Start Free TrialLog in
Avatar of Synsthe
Synsthe

asked on

Threaded engine and number of posts

Well I've run into a problem it appears. I spent quite awhile working out a threaded model for a comments system I'm launching on my site, that would let me grab entire trees et al with single queries. What I've come up with works awesome.

All but for one thing. I want to inform everybody of how many posts exist beneath a certain thread. I do *not* want to do additional queries for each top level post just to see what's all beneath it. I want to figure out a way to do this with the original query, or at most with one additional query.

A general idea of how my threaded model works is that all top level posts are given an id of XXXXX - every level of thread below that is added on to that id. One level deep would be XXXXX.XXX, two levels deep XXXXX.XXXXXX and so on. This is a very rough explanation of course, since I'm not providing very detailed examples, however it should be enough to work on.

You can poke around the system as I build it at http://www.projectlinux.org/results.phtml - please *expect* to run into problems if you are playing with it, it's in beta and I'm constantly adding things and ripping things out right now. =)
ASKER CERTIFIED SOLUTION
Avatar of gravity
gravity

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Avatar of gravity
gravity

Incidentally, you can check out phorum @ http://www.phorum.org .
Avatar of Synsthe

ASKER

Yup I know about Phorums existence, and I've played with it a little - if you look at the url I provided, you'll notice what I'm doing isn't a regular type of forum set up. Phorum is also, in this coders opinion, a hack job, with very little security. It's got promise, but it's got a long ways to go.

As for your previous comment, that may be something I can look into.. I could grab all the sub levels first and use them to create an array with the counts in them. I'll give that a try in a moment here. That requires two sql statements on the page, which is something I could live with - if anybody can think of a way I can keep it at one query however, do let me know.
I must say that I'm not too keen on the way the forum has been written... I think that it would be better to have all the threads available to one post, as opposed to just the ones on that level.

Two SQL statements ? Not quite... one for to get the top-level posts, and then x statements according to how many posts there have been below that. However, unless you're going to be dealing with 5000+ posts on one page, MySQL is fast enough to not slow it down too much.

If you want a speed comparasin, my stats page for my poor excuse for a web site currently has around 80,000 records in it with which I do a few calculations. It takes around three seconds more than normal to load... and that's for quite a few records. It's at http://www.atari.co.uk/stats.html .
Avatar of Synsthe

ASKER

Heh, don't go beating up my forums just yet, they're far, far from done. I'm going to be adding the ability for people to view it all collapsed, fully threaded, or in flat/linear fashion. I'll also add the entire thread, from top level down, to the view posts pages, with some sort of indicator of "you are here". Various other navigational constructs will be added as well. I'm just trying to determine a way to get all posts below a certain level for the collapsed view.

My site revolves around an SQL database, and I use as little statements as possible. Doing an additional query for each top level post to see what's underneath it, is in my opinion, death. If enough people hit the site on pages with large amounts of posts in the future, they'll bring the server to its knees. Two SQL queries is my tops, one is best.

Here's what I do, I think, I haven't tried it yet:

first statement:
SELECT pin FROM comments_poll WHERE pin LIKE '%.%';

That will get all non-top level posts. From there I have to iterate through each with a while loop and create an array:


while ($postdata = mysql_fetch_array($result)) {
$currentpin = $postdata[pin]
$countarray[$currentpin][count] = $countarray[$currentpin][count] + 1;
}

Then when going through the second time and getting the top levels I can reference it via $countarray[$pin]][count] and get the number of messages below itself.

I'm not sure how the above code snippet will format itself, and it's of course untested code, but it may yet work. I guess I should go try it out.
Sorry, but this is one I'll argue to my death ;)

MySQL is a superbly efficient database, IMHO the best free one going which also beats some of the more commercial ones. Most of the speed differences are just down to search algorithms which cost £££ so...

Anyway, MySQL will not slow down your site even if you get a few thousand requests within a few seconds of each other. Trust me. I work for the company (Planet Three Ltd.) which also runs the Christmas Island Domain domain (.cx).
As you can imagine the site , www.nic.cx, gets a *lot* of hits, yet it's all written (very craply, though) in PHP with a MySQL database. Speed is not an issue (though I wish it was... I did most of the code behind it, and it is all one big hack... I really want to re write the whole thing, but I'm not allowed ;(

Avatar of Synsthe

ASKER

Welll, I'll argue you back and forth as long as you want. ;) All the php on my site is now home brewed, but previously I was using phpPolls from phpwizard.net - this poll did about 12 queries on my front page each time it showed a poll (mine now does.. none. ;). I got pcworld-dotted one morning by a large amount of traffic which took my site down, and I didn't know why. It was then that I fixed the poll. Since then any large surges of traffic haven't been able to take my site down anywhere near as easily. So I just try to keep my queries to an absolute minimum.

Even if that wasn't the issue, it's still not bad practice to do things in as little queries as possible.

In this case I've accomplished it inside of two queries. You can see the results at http://www.projectlinux.org/results.phtml - note that the plus buttons don't work yet, that's coming up. However it is an accurate count of everything below each top level. Two queries. That's all.

Since you were involved while I worked it all out, I'll hand the answer over to your first comment.

Now to write a function to strip all html tags but a select few..
No worries... any more probs with PHP in general, feel free to mail me : tom@atari.co.uk .

BTW, I took a look at your site and I'm certainly interested in helping. I doubt I could put in much more than eight hours a week, but if you want help ...
Avatar of Synsthe

ASKER

Any help at all is appreciated. I keep having people come and go and promising to help, but it never really comes through. Everything that's there right now was done by me as a result.

There'd just be one major requirement.. ;) all database stuff is done in as little queries as possible. lol

I'll get a hold of you as soon as I have a free moment from working on this, and we can discuss it further in email.
Groovy, hear from you then !
Avatar of Synsthe

ASKER

WARNING: Ego has hit critical. Following post merely a side effect, we apologize for anybody who gets hit in the process.

Oh yeah, I rule. =) Additional queries for each post? I think not.. two queries as per my original solution? I THINK NOT. I just pulled it off with a *single* query. I should've thought of this a long time ago.

I just queried the database, grabbed all the data I needed, placed it in a multi-dimensional array and worked on that. I can grep back and forth, since I don't lose any data from the array, and as a result I'm able to do a lot more than before.
Oh you son of an illegitamate...


:)