Link to home
Start Free TrialLog in
Avatar of bxglxbxglx2000
bxglxbxglx2000

asked on

I have very abnomal Fulltext Search mysql result:

I have very abnomal fulltext search mysql result, like in this image:

SELECT * FROM `music_file` WHERE MATCH(`parent_all`) AGAINST('18094'); ==> No result
SELECT * FROM `music_file` WHERE MATCH(`parent_all`) AGAINST('18095'); ==> Have result

While: 18094 and 18095 is the same in rows

User generated image
User generated image
Avatar of Vijaya Kumar
Vijaya Kumar

why cant't you try like query
Avatar of bxglxbxglx2000

ASKER

Please access this :

---REMOVED---

and run 2 SQL to compare:

1. Return extract result:
SELECT * FROM `music_file` WHERE MATCH(parent_all) AGAINST('18095');
2. Return wrong result:
SELECT * FROM `music_file` WHERE MATCH(parent_all) AGAINST('18094');

I can not understand why? a new bug?
If you're debugging a fulltext query, you should start with the myisam_ftdump tool:
https://dev.mysql.com/doc/refman/5.7/en/myisam-ftdump.html

...which should dump a bunch of information about your fulltext index.
On another note, I get the impression you might be using the FULLTEXT index to solve a problem that is better solved in another way.

It looks like you're trying to match parent IDs for a given music file. If that's accurate, then you should really have a separate table for defining those relationships. For example, create a 2 column table called "music_file_relationships" where one field is music_file_id and the other is parent_id (both integers), and index both fields separately. Then instead of storing a comma-separate ID list in string form  on the music_file record, just create the records in this new relationships table.

Then you can do a query like:
SELECT music_file_id FROM music_file_relationships WHERE parent_id=18094;

The performance from that should be MUCH faster, especially as your database grows.

MySQL can do a numeric comparison EXTREMELY fast, but searching against text is actually a little bit complicated to do, so it's much slower in comparison. Plus, a FULLTEXT index adds a lot of unnecessary overhead, and it's also less accurate. For example, even if you got your original query to work:

SELECT * FROM `music_file` WHERE MATCH(parent_all) AGAINST('18095');

...it would be matching against a substring. So if you did this query later on:

SELECT * FROM `music_file` WHERE MATCH(parent_all) AGAINST('18');

...it would ALSO pick up "18095" because it contains "18". To avoid false positives, you would have to match with delimiters like this:

SELECT * FROM `music_file` WHERE MATCH(parent_all) AGAINST(',18095,');
SELECT * FROM `music_file` WHERE MATCH(parent_all) AGAINST(',18,');

...which means that you have a bunch of extra commas sitting in the database for no reason.
@gr8gonzo: seem full text search work with seperate WORD, not work like "LIKE AS" command ?
so SELECT * FROM `music_file` WHERE MATCH(parent_all) AGAINST('18');  will NOT return 18095 ?

(Only like return SELECT * FROM `music_file` WHERE (parent_all) LIKE '%18%'  ===> will return 18095 )


I need do fulltext search because one item belong multi parent, not one parent to use command "parent = " like your guide!
(In fact : one product belong multi catagories)

- And I still can not fix my case
music_file--3-.sql
My guide assumes that one item has multiple parents. That is why I suggested a separate table - to hold the relationships between items and parents. This way of separating the DB structure is called normalization.
ASKER CERTIFIED SOLUTION
Avatar of bxglxbxglx2000
bxglxbxglx2000

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
@bxglxbxglx2000
First, please do not go out of your way to insult other people. I removed your initial sentence.

I had presumed you had read the documentation where the 50% threshold is stated:
https://dev.mysql.com/doc/refman/5.5/en/fulltext-search.html

...and it seemed highly unlikely that over 9500 of your rows (your screenshot indicated you had over 19,000 rows) all had this same parent ID, especially considering you're dealing with something called "music_files" and there's only a handful of artists who have ever recorded more than 1,000 songs, so I did not bring this up.

Second, we provided you with a far more efficient long-term solution, which was a normalized structure. If you keep going down the path of using a full text boolean search to find IDs, it will not perform as well as a normalized structure, and it will be harder to manage as the database grows. So if you don't like that advice, then that's your choice, but you never responded any further to it.