I am using count(1) not * sorry for the typo...
Main Topics
Browse All TopicsHi,
We are having too slow response times for a simple count on a fulltext in SQL Server 2005. Even using the most simple query on a good server (64 bit Dual Opteron 4GB Ram with high speed 16 raid disk storage)):
select count(*) from content_books where contains(searchData,'"engl
Takes 4 seconds to count the avg 300.000 results
I have removed all the joins with real table data so that the query is only inside the fulltext engine..
I would expect this to be down to 10 milli seconds. Isn't it just getting the size of the "english" word result index?
It seems the engine is going through all the results because if a do a more complex search that returns less results the performance is better.
Any clues of how to do this faster? We never need to read the thousands of records BUT we ALWAYS need to count them. The 4 seconds of CPU/IO usage can't go live as it won't scale up.
Has anyone gone through this?
Thank you very much,
Filipe
This Question has been solved and asker verified All Experts Exchange premium technology solutions are available to subscription members.
Experts Exchange has been collecting answers to technology questions since 1996…3 million and counting! If you have a question, chances are we already have your answer.
If you can't find the exact answer you're looking for, ask our exclusive community of 50,000 experts. You’ll get a personalized answer from a trusted professional.
Thousands of free tech tips, tricks, how-to’s and tutorials are available in our peer reviewed articles section. See for yourself how smart our experts are, no login required.
Access the answers to your technology questions today.
30-day free trial. Register in 60 seconds.
Members of the expert community talk about why the experience at Experts Exchange is different than what you will find anywhere else.

Try it out and discover for yourself.
30-day free trial. Register in 60 seconds.
Join the community of experts here and help other tech pros by answering question in your area of expertise. You can earn FREE access to all Experts Exchange's premium features and resources.
<<<
We never need to read the thousands of records BUT we ALWAYS need to count them.
<<<
However using the count property is also reading trough the record the difference is you just only display the count. Therefore if you are counting a large amount of data the speed will also get affected.
BUt you can try to use the GROUP BY might speed up the query...
select count(field_ id) from content_books
GROUP BY fieldname
where contains(searchData,'"engl
or
select count(field_ id), commonfieldname from content_books
GROUP BY commonfieldname
where contains(searchData,'"engl
Please note that in each commonfieldname, there will be a seperate count. Assuming that there are few common field this might work.
HTH.
One of most common problems on VLDB when users try to exact count huge values, for example your "english" word occured 284,327 times among 100mln entries.
This is almost always nonsense, very resource-consuming operation.
You can speed-up all searches to one second, when only returns "More than 10000" occurences, provided by SELECT TOP (N) clause.
Hi,
Thank you for the replies.
Cecrid D: I cannot do a work-around and the "more than N" can only be shown if i can know how many there are in a fast way, and that is THE problem...
Cecrid D & gladxml: the suggestions you make will INCREASE cost of processing this query (when testing it might not seem so because of internal SQL Server cache).
To improve it i need to get the info directly from the fulltext catalog. Extracting info from it will always have more cost and won't scale up.
See you!
Not to start a contraversy that is totally beside the point, but you will not find any difference between doing COUNT(*), COUNT(1), COUNT(YourColumnName). The SQL Server optimizer is intelligent enough to know what you want. That is not to say that it is always better to do IF EXISTS (SELECT 1 FROM ...
But now for the real problem. Have you done a Full Population recently?
If the answer is yes, than there is not much more you can do. It is sequentially counting the returned results. Incidentally, there is a limit as to the total number of rows returned when using Full-Text Search. With SQL Server 2000 it was only 20,000, but this max has been increased with SQL Server 2005
>> Cecrid D: I cannot do a work-around and the "more than N" can only be shown if i can know how many >> there are in a fast way, and that is THE problem...
when you learn SQL storage fundamentals well you'll be surprising that counting of "more than X" can be many times faster than determining exact value of COUNT(*) :)
because: to determine exact COUNT server should read much more data, but for TOP N it stops counting after N matches.
have you tried creating a Clustered Index?
http://www.devx.com/dbzone
or possibly shringking the field sizes to make the database smaller, thus quicker easier to sort because of using less cache?
Hi,
I don't know your experience with fulltext, but i it seems to me that you are missing the point. Please read this http://www.microsoft.com/t
Resuming, there is a boundary between the fulltext catalog and the sql server so to have the top performance that i need i am looking for some additional parameters to pass to the fulltext, not inside sql server 2005. These
The optimization i did so far (althoug it has a huge cost on disk space) that is giving better results is to create a seperate table with the same data (?) only for the search. I don't know why, but IO and CPU usage is 1/10 less than with the original table. We are still trying to figure out what is hapenning (yes we rebuilt all indexes from scratch).
I am still looking for the solution and it should be in the area of "telling" the fulltext engine that we only need the count (inverted index UNSORTED queries with count is of milliseconds for millions of results).
Thank you again.
Now back to the question. Have you implemented any of the recommendations from the link you posted? If so, what were they and were there any improvements?
My guess is that you are now down to bigger and better hardware, I cannot possible see how you can significantly improve the performance to retrieve a count over so many rows. This is no different than getting a count on a large table withour any indexes.
@acperkins:
Sometimes writen words sound "agressive" but they aren't (or where not when i wrote them :-) ).
Back to the subject:
We are testing the query recommendations found here: http://technet.microsoft.c
The ones in the first document i posted that can be applied to our case had already been done/tested.
"This is no different than getting a count on a large table withour any indexes."
I hope this is not true (for the sake of IO :-) ). For UNSORTED queries (i am only asking for a count) the fulltext engines usually do reverse index matching that is VERY high performance (pure pointer navigation on "compressed" arrays).
Thks.
>>The ones in the first document i posted that can be applied to our case had already been done/tested.<<
The two that I found interesting were using CONTAINSTABLE instead of CONTAINS and partitioning the table into multiple tables and then doing a UNION ALL to get the total.
>>I hope this is not true (for the sake of IO :-) ) <<
Try it some time. :)
<quote>
From the original link you posted, this is the part that you may want to highlight:
Full-text query performance is roughly proportional to the number of matching keys returned from the catalog. On a large catalog, querying for words that are only rarely present in the indexed column returns fewer keys than querying for words that are contained in many columns, and accordingly the queries that return fewer matching keys generally perform better.
</quote>
Well, but not exactly :) even for perfect indexing case, to determine N=535,456,735,986 count server should read LOG2(N) ~ 26 pages, but to determine "More than 500,000,000,000" it should read only one page...
Anyway, 26 times faster, and for non-perfect index matching (for ex LIKE '%WORD%') it can be thousands times.
Business Accounts
Answer for Membership
by: gladxmlPosted on 2008-02-15 at 08:08:30ID: 20903002
try to count only one field and do not use *.
ish"')
Try to use the SQL below where field_id is an index in your table
select count(field_ id) from content_books
where contains(searchData,'"engl