  • Status: Solved
  • Priority: Medium
  • Security: Public

MySQL regexp / regular expressions help - scrabble solver

I'm trying to build a Scrabble solver. I can make it work, but not the way I want it to: I fetch all words no longer than the input string, then walk through the result set in my PHP script, checking each word against the letter array built from the input string.

I know this is a bad (and slow) way to do it, and I'm sure it can be done in the query alone, but I'm struggling with this.

The example below assumes the input string is ABCDE, so it finds all words of 5 characters or fewer that use only the letters ABCDE. It has two problems: it allows letters to be repeated (how often a letter may repeat should depend on the input string), and it can't handle wildcards.

SELECT WORD FROM WORDLIST WHERE CHARACTER_LENGTH(WORD) <= 5 AND WORD NOT REGEXP '[^ABCDE]'

It would be quick enough to get the resultset and then filter out any words that use letters more than once (except of course if the input string contained eg. more than one 'A' and this is allowed), but this doesn't solve the wildcard problem.

I've looked at the mysql reference and I'm sure there must be a way to do this, but I've tried what seems like every combination and I'm not getting anywhere.

Does anyone have any suggestions? Is this actually possible through mysql alone?
Asked:
rockers07
1 Solution
 
akshah123 Commented:
You should try the following:

SELECT WORD 
FROM WORDLIST 
WHERE WORD REGEXP '^[ABCDE]{5}$'

 
rockers07 (Author) Commented:
Thanks, but that only selects 5 letter words, and doesn't solve the duplicate letters or wildcard problems...
 
akshah123 Commented:
You will not be able to solve the duplication problem within the query. That's just not possible unless you want to provide all possible combinations.

If you need to solve this problem multiple times, it might make more sense to have this data in memory instead.

In any case, your only option is to take care of that in PHP.
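
For illustration, the in-memory check could look roughly like the sketch below. It is written in Python rather than PHP, and the names `can_form` and `solve`, plus the choice of `?` as the wildcard character, are assumptions made for this example, not anything from the thread. It counts the tiles in the input string and lets wildcard tiles cover whatever letters the rack is short of.

```python
from collections import Counter

def can_form(word, rack, wildcard="?"):
    """Return True if `word` can be spelled with the tiles in `rack`.

    `wildcard` (assumed to be "?") marks a blank tile that may stand
    in for any single letter.
    """
    tiles = Counter(rack.upper())
    blanks = tiles.pop(wildcard, 0)
    needed = Counter(word.upper())
    # Every letter the rack cannot supply must be covered by a blank.
    shortfall = sum(max(0, n - tiles[ch]) for ch, n in needed.items())
    return shortfall <= blanks

def solve(words, rack):
    """Filter an in-memory word list down to the words the rack can form."""
    return [w for w in words if len(w) <= len(rack) and can_form(w, rack)]
```

With a rack of ABCDE this accepts BAD and CAB but rejects DEED (two D's and two E's needed); with a rack of AB?? it accepts ABBA, because the two blanks cover the second A and the second B.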
 
rockers07 (Author) Commented:
Duplication problem solved:

SELECT WORD FROM WORDLIST
WHERE CHARACTER_LENGTH(WORD) <= 5
AND WORD NOT REGEXP '[^ABCDE]'
AND LENGTH(WORD) - LENGTH(REPLACE(WORD, 'A', '')) <= 1
AND LENGTH(WORD) - LENGTH(REPLACE(WORD, 'B', '')) <= 1
AND LENGTH(WORD) - LENGTH(REPLACE(WORD, 'C', '')) <= 1
AND LENGTH(WORD) - LENGTH(REPLACE(WORD, 'D', '')) <= 1
AND LENGTH(WORD) - LENGTH(REPLACE(WORD, 'E', '')) <= 1

Change a letter's limit to <= 2 if that letter can appear twice, and so on.

Not sure that this is the most efficient method though - easy enough to process the duplications in PHP using substr_count(), and this is feasible if the result has already been filtered to the right length and characters - but it still feels like this should be possible through regexp.

I'm also still left with the wildcard problem.....
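
One possible way to extend this to wildcards, sketched below, is to generate the query from the input string. This is only a sketch under assumptions: `build_query` is a hypothetical helper, `?` is assumed to mark a wildcard tile, and WORDLIST is assumed to hold upper-case A-Z words only (REPLACE is case-sensitive). Without wildcards it emits the same clauses as the query above, with each limit taken from the letter's count in the input. With wildcards it sums, over all 26 letters, how far the word exceeds the rack (using MySQL's GREATEST) and requires that total to be no more than the number of wildcards.

```python
from collections import Counter
import string

def build_query(rack, wildcard="?"):
    """Build a MySQL query for the words that can be formed from `rack`."""
    rack = rack.upper()
    if not rack:
        raise ValueError("rack must not be empty")
    tiles = Counter(rack)
    blanks = tiles.pop(wildcard, 0)
    # Only A-Z is interpolated into the SQL, so no quoting is needed.
    if any(ch not in string.ascii_uppercase for ch in tiles):
        raise ValueError("rack may contain only A-Z and the wildcard")

    def count_of(ch):
        # Occurrences of `ch` in WORD, same trick as the hand-written query.
        return f"LENGTH(WORD) - LENGTH(REPLACE(WORD, '{ch}', ''))"

    clauses = [f"CHARACTER_LENGTH(WORD) <= {len(rack)}"]
    if blanks == 0:
        letters = "".join(sorted(tiles))
        clauses.append(f"WORD NOT REGEXP '[^{letters}]'")
        clauses += [f"{count_of(ch)} <= {n}" for ch, n in sorted(tiles.items())]
    else:
        # Letters used beyond what the rack holds must be paid for by blanks.
        excess = [f"GREATEST({count_of(ch)} - {tiles[ch]}, 0)"
                  for ch in string.ascii_uppercase]
        clauses.append("(" + " + ".join(excess) + f") <= {blanks}")
    return "SELECT WORD FROM WORDLIST\nWHERE " + "\nAND ".join(clauses)
```

Calling `build_query("ABCDE")` reproduces the query above, and `build_query("AAB")` raises the limit for A to 2. The wildcard form evaluates 26 REPLACE calls per row, so it is likely to be slow on a large word list.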
 
HonorGod Commented:
Use a GADDAG data structure...

http://www.cs.ubc.ca/local/reading/proceedings/spe91-95/spe/vol24/issue2/spe880.pdf

It is a way of storing the entire (US English) Scrabble dictionary so that it:

- consumes much less space
- is very quick to search
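
To give a feel for searching a dictionary structure instead of scanning every word, here is a rough sketch in Python. It is not a GADDAG: it uses a plain trie, a much simpler structure, and it ignores board placement entirely. The names `build_trie` and `words_from_rack` and the `?` wildcard are assumptions made for this example.

```python
from collections import Counter

END = "$"  # marker key: a word ends at this node

def build_trie(words):
    """Store the words in a nested-dict trie, one level per letter."""
    root = {}
    for word in words:
        node = root
        for ch in word.upper():
            node = node.setdefault(ch, {})
        node[END] = True
    return root

def words_from_rack(trie, rack, wildcard="?"):
    """Return every stored word that the rack can spell, sorted."""
    tiles = Counter(rack.upper())
    found = set()

    def walk(node, prefix):
        if END in node:
            found.add(prefix)
        for ch, child in node.items():
            if ch == END:
                continue
            # Prefer a real tile; fall back to a blank if none is left.
            key = ch if tiles[ch] > 0 else wildcard
            if tiles[key] > 0:
                tiles[key] -= 1
                walk(child, prefix + ch)
                tiles[key] += 1

    walk(trie, "")
    return sorted(found)
```

The walk only follows branches the remaining tiles can pay for, so words that share no prefix with anything the rack can spell are never visited.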
 
ee_auto Commented:
Question PAQ'd, 500 points refunded, and stored in the solution database.