Avatar of Member_2_1348041
Member_2_1348041
Flag for Ireland asked on

Fulltext Search Mystery Result

I'm performing a fulltext search like so:

SELECT *
  FROM CONTAINSTABLE([dbo].[FulltextDiary], ([DocumentText]), '"Shaun*"') CTT

It returns a whole set of results, but one of them seems a bit odd. I check the [DocumentText] column in the relevant table, and, indeed, the string "Shaun" is nowhere to be found, let alone anything starting with the string "Shaun". I also cannot find anything looking remotely like Shaun, e.g. Shawn, Séan, etc etc etc.

How can I find out what it was that led to a particular row being included in the search result?
Microsoft SQL ServerMicrosoft SQL Server 2005Microsoft SQL Server 2008

Avatar of undefined
Last Comment
Member_2_1348041

8/22/2022 - Mon
Guy Hengel [angelIII / a3]

I think it is not mentioned in the full text index anywhere where the indexed value was found actually.
so, without having the document/file itself, impossible to reproduce the issue ...
if it is a office document, anything in the file properties?
Member_2_1348041

ASKER
Hey Guy, long time .... :-)

No, it's plain text in the [DocumentText] column. Tis a mystery to confound....

Cheers


Pino
ASKER CERTIFIED SOLUTION
Guy Hengel [angelIII / a3]

THIS SOLUTION ONLY AVAILABLE TO MEMBERS.
View this solution by signing up for a free trial.
Members can start a 7-Day free trial and enjoy unlimited access to the platform.
See Pricing Options
Start Free Trial
GET A PERSONALIZED SOLUTION
Ask your own question & get feedback from real experts
Find out why thousands trust the EE community with their toughest problems.
Member_2_1348041

ASKER
Ok... I think I know what the problem is. The text in the [DocumentText] column is the content of a .msg file. How it works, exactly, is irrelevant. Suffice to say that we extract the plain text from the .msg file and then store it in that column for full-text indexing.

The problem is that in our original run we extracted the COMPLETE text from the .msg. That included not only the text in the .msg file itself, but also ALL the text that was contained in ALL its attachments, and lo and behold there is the string "Shaun" in one of the attachments.

However, we since fixed the procedure and re-ran the process over all the .msg files, and now it only extracts the body text, leaving the attachments alone. So the [DocumentText] column now no longer contains the text "Shaun".

So the real issue is: why has the fulltext index not updated itself?
I started with Experts Exchange in 2004 and it's been a mainstay of my professional computing life since. It helped me launch a career as a programmer / Oracle data analyst
William Peck
Guy Hengel [angelIII / a3]

the full text index is not updated real-time, but only on intervals.
check up the full text indexing jobs related to the full text index(es), normally configured by the full text index catalog
https://msdn.microsoft.com/en-us/library/ms142575.aspx
Member_2_1348041

ASKER
I had change tracking switched on, but I found an error in the LOG, so I ran

ALTER FULLTEXT INDEX ON [dbo].[FullTextDiary] RESUME POPULATION

and for good measure I've also done

ALTER FULLTEXT INDEX ON [dbo].[FullTextDiary] SET CHANGE_TRACKING AUTO

I guess time will tell whether that will lead to it updating the index. If not, I will drop and re-create.
Member_2_1348041

ASKER
The only thing that I'm a little worried about is that it told me "Warning: Full-text auto propagation is currently enabled for table or indexed view" when I ran the second statement. Is that cause for concern?
⚡ FREE TRIAL OFFER
Try out a week of full access for free.
Find out why thousands trust the EE community with their toughest problems.
Member_2_1348041

ASKER
Also.... [dbo].[FullTextDiary] is actually a view. Please tell me that Change Tracking still works in that situation?
Member_2_1348041

ASKER
Well... it seems to be doing something. When I run this

SELECT COUNT(CASE WHEN CTT.[KEY] = 5256811 THEN 1 ELSE NULL END) AS [StillThere],
       COUNT(CASE WHEN CTT.[KEY] = 5256811 THEN NULL ELSE 1 END) AS [TheRest]
  FROM CONTAINSTABLE([dbo].[FulltextDiary], ([DocumentText]), '"Shaun*"') CTT

Open in new window


(the offending row was one with key value 5256811)

The TheRest value is steadily decreasing now. I am guessing that at some point StillThere will also go back to 0.
Member_2_1348041

ASKER
Yup. That sorted the problem. Gee whiz.
All of life is about relationships, and EE has made a viirtual community a real community. It lifts everyone's boat
William Peck
Member_2_1348041

ASKER
Here's a few poinks for old time's sake :P