[Webinar] Learn how to a build a cloud-first strategyRegister Now

x
  • Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 533
  • Last Modified:

Best way to store large number of keywords

I am working on a site where thousands of user-submitted URLs will be submitted and keywords (50-75 each) automatically generated for these URLs and stored in the database for use with targeted advertising. I need to be able to check with a mysql query if at least one keyword from an advertiser matches a keyword for a URL. What is the best way to store this large amount of data and access it at high speeds?
0
jls33fsls
Asked:
jls33fsls
  • 3
  • 2
2 Solutions
 
mstrelanCommented:
foreach url you could have a textarea of comma separated keywords. then setup a fulltext mysql index on the keywords column. then use the match ... against mysql syntax to rank the urls based on their keywords. let me know if you need more detail otherwise google "fulltext mysql index" and "mysql match against"
0
 
Steve BinkCommented:
The best approach is to compare numeric data.

tableKeywords
--------------------
id (PK,autoincrement)
keyword (varchar)

tableAdKeywords
-----------------------
id (PK,autoincrement)
keywordID (FK to tableKeywords.id)
advertiserID (FK to advertisers table)

tableURLKeywords
-------------------------
id (PK,autoincrement)
keywordID (FK to tableKeywords.id)
URLID (FK to URL table)

To find all advertisers hitting a keyword associated with a particular URL:

SELECT * FROM tableURLKeywords a INNER JOIN tableAdKeywords b ON a.keywordID=b.keywordID INNER JOIN tableKeywords c ON a.keywordID=c.id WHERE a.URLID=##

This means when you parse the keywords, you need to handle them as separate managed data points.  If they don't exist, add them.  If they do exist, track the ID field to add them to the URL or advertiser.
0
 
jls33fslsAuthor Commented:
routinet, thanks for the great explanation!  I have one final question.  For the FK's, do I put anything for an action on update or delete?
0
Concerto Cloud for Software Providers & ISVs

Can Concerto Cloud Services help you focus on evolving your application offerings, while delivering the best cloud experience to your customers? From DevOps to revenue models and customer support, the answer is yes!

Learn how Concerto can help you.

 
Steve BinkCommented:
Not unless you believe you have a need for it.  I only showed the FK to bring your attention to the relationships between the tables; you do not actually need to explicitly define the foreign relationships, and no enforcement is necessary.  If you're using MyISAM tables, those will only be implicit relationships anyways.  Do make sure you index the related fields, though.  It will speed things up quite a bit for you.
0
 
jls33fslsAuthor Commented:
So do you recommend using MyISAM or InnoDB?  Which will be faster with a higher volume?
0
 
Steve BinkCommented:
I normally use MyISAM tables unless I have a need for transactions or enforced/cascading relationships, which requires InnoDB, but that has nothing to do with a performance evaluation.  That's just me being lazy.  :)

Overall, InnoDB shows higher/better performance that MyISAM, and does so consistently.  There are some advantages to using MyISAM, but in high volume environments, InnoDB tends to scale better.  When the new Falcon engine debuted, I read this article:

http://www.mysqlperformanceblog.com/2007/01/08/innodb-vs-myisam-vs-falcon-benchmarks-part-1/

What I really pulled away from the article, aside from "wow, falcon sucks", was that InnoDB really nails MyISAM when it comes to scaling into large applications.  A related article I found when I started looking for more information for this response:

http://tag1consulting.com/MySQL_Engines_MyISAM_vs_InnoDB

Very slanted towards InnoDB, but the reasons for it have merit.  I believe it is a fair assessment of both engines.
0

Featured Post

Concerto's Cloud Advisory Services

Want to avoid the missteps to gaining all the benefits of the cloud? Learn more about the different assessment options from our Cloud Advisory team.

  • 3
  • 2
Tackle projects and never again get stuck behind a technical roadblock.
Join Now