[Okta Webinar] Learn how to a build a cloud-first strategyRegister Now

x
  • Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 736
  • Last Modified:

Scrabble Match

Hi, I'm trying to write a regular expression that will search a word list for available words based on letters given (think scrabble).

Suppose my letters are "scuton", how would I match the word/words that contain as many of these letters are possible? Remember a letter can be used only once. Also, what if I have two s, such as "scutons", how would I match that case too, now s can be used twice.

Does that make sense?

Thanks,
Brian
0
BrianGEFF719
Asked:
BrianGEFF719
  • 3
  • 2
1 Solution
 
ozoCommented:
you won't know how many are possible until you search the entire list, so the best you can do with a regular expression is to look for a word containing a certain number of letters.  when you succeed with one number and fail at another number, that is the mist possible.
If you want to match a words that contain no other letters, that would be
\b[scuton]+\b
If you want to contain at least <number> of those letters, where it can also contain other letters, that would be
([scuton].*){<number>}
If you want to exclude words containing one of those letters twice that would be
(?!.*([scuton]).*\1)
If you want  exclude words containing 3 or more s
(?!(.*s){3})
0
 
Bill PrewCommented:
This feels more like a permutations problem than a regex problem.  There's no way to represent permutations in regex that I know of.  You can certainly generate a regex that handles all permutations, for example if you had the letters abc then this would match words in a dictionary using those letters.

^(a*b*c*|a*c*b*|b*a*c*|b*c*a*|c*a*b*|c*b*a*)$

But the length of the expression will increase exponentially with the number of characters.

~bp
0
 
ozoCommented:
^(a*b*c*|a*c*b*|b*a*c*|b*c*a*|c*a*b*|c*b*a*)$
is equivalent to
^(?!.*(a[^a]+a|b[^b]+b|c[^c]+c|[^abc]))
which increases linearly with the number of characters.
0
Concerto Cloud for Software Providers & ISVs

Can Concerto Cloud Services help you focus on evolving your application offerings, while delivering the best cloud experience to your customers? From DevOps to revenue models and customer support, the answer is yes!

Learn how Concerto can help you.

 
ozoCommented:
although strictly speaking, (?!...) is not a regular expression,
even if it is recognized by many programs that process regular expressions.
0
 
Bill PrewCommented:
Yes, I was trying to use the most basic regex syntax for my example, but do appreciate that things like Perl etc offer extensions to that.

~bp
0
 
BrianGEFF719Author Commented:
Thank you for your help thus far. Here is what I'm doing using grep:

 brian$ cat MASTER | grep -E '\b[nsfroeak]+\b' | grep -Ev '\b.*(.).*(\1).*\b'

In this case my letters are nsfroeak, and I can have no repeated letters, if I can repeat an e, I am doing:

 brian$ cat MASTER | grep -E '\b[nsfroeak]+\b' | grep -Ev '\b.*([^e]).*(\1).*\b'

Can this be improved in anyway? I'm having troubles with the wildcard, meaning a tile that can be used for any letter. Is there anyway to integrate one "wildcard" letter into this regular expressions?


Thanks.
Brian

0

Featured Post

Independent Software Vendors: We Want Your Opinion

We value your feedback.

Take our survey and automatically be enter to win anyone of the following:
Yeti Cooler, Amazon eGift Card, and Movie eGift Card!

  • 3
  • 2
Tackle projects and never again get stuck behind a technical roadblock.
Join Now