July 24, 2008 01:29pm pdt

Anagram Script taking too long/dying while solving big words

Hi,

I'm trying to make a simple anagram solver using existing modules: Games::Word and Games::Word::Wordlist. A very simple version is online at http://www.naveeng.com/fun/anagram.html. It works fine except when a big word, i.e. more than 8-9 characters, is provided. In that case the process running the script stays in memory for a very long time at 99.9% CPU usage, and the webhost kills it. For 6 or even 7 characters, however, the script finishes in seconds with hardly any CPU usage.

Also, the module Games::Word had a loop of the form for (0..upper limit), used to calculate the permutations of the given string by calling the factorial() function of the module Math::Combinatorics, which gave the error "Range iterator outside integer range" when a big word was given. I changed the loop to the form for (my $i = 0; $i <= limit; $i++), after which the error no longer appears. I suppose it is a system limitation.

Any pointers on which part of the whole process might be the reason for the script taking so long and eventually dying? I understand that with a high number of characters the factorial is huge and the overhead increases, but I guess something else is wrong here.

Or can anyone suggest a method to debug the issue, and/or a better alternative for solving anagrams?

I know that writing everything from scratch, without using any module, is a possible solution, which I will try once all hope for this one is gone. I just wanted to know how likely I am to face the same roadblocks in that case too.

Regards,
Naveen
Question Stats
Zone: Programming
Question Asked By: nkrgupta
Solution Provided By: ozo
Participating Experts: 2
Solution Grade: A
Open Discussion
 
Comment by nkrgupta
Thanks a lot, ozo! Your code is not only blazing fast, it also finds words that are in the dictionary but that, strangely, my earlier code didn't give as solutions!

Just for academic interest, can anyone throw light on what was wrong with my original code, such that it died and didn't give all the words?

Thanks
 
 
Comment by nkrgupta
Also, can you please explain what's happening here:

if join('',sort split//,uc) =~ $r;

I've understood the rest.
 
 
Comment by ozo
There are 9864101 subpermutations of a 10-letter word, which can take more time than the 13700 subpermutations of a 7-letter word.
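
For anyone who wants to check those figures, here is a minimal sketch. It assumes all letters of the word are distinct and counts every ordered arrangement of 0 to n of them (the empty arrangement included, which is what makes the totals come out to the numbers quoted above). The function name subpermutation_count is made up for illustration and is not part of Games::Word.

```perl
use strict;
use warnings;

# Hypothetical helper: number of ordered arrangements of 0..n letters
# taken from n distinct letters, i.e. the sum over k of n!/(n-k)!.
sub subpermutation_count {
    my ($n) = @_;
    my ($total, $term) = (0, 1);    # $term holds n!/(n-k)!, starting at k = 0
    for my $k (0 .. $n) {
        $total += $term;
        $term  *= $n - $k;          # one more position, one fewer letter to pick from
    }
    return $total;
}

printf "%2d letters: %d\n", $_, subpermutation_count($_) for 7, 10;
```

Going from 7 to 10 letters multiplies the work by roughly 720, which fits a script that finishes in seconds for 7 letters but gets killed for 10.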
 
 
Comment by ozo
if $word is 'abcdefg'
$r will be qr/^A?B?C?D?E?F?G?$/
which is a regular expression matching
the beginning of the string
an optional A
an optional B
an optional C
an optional D
an optional E
an optional F
an optional G
the end of the string
join('',sort split//,uc)
will be a line from $dict converted to upper case with the letters sorted alphabetically, so for example
join('',sort split//,uc 'caged') would be 'ACDEG'
which matches /^A?B?C?D?E?F?G?$/
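
Put together, the idea described above might look roughly like the sketch below. The accepted solution's code is not shown in this thread, so this is only a reconstruction from the explanation: the helper names signature and letters_regex are hypothetical, and the short @dict list stands in for lines read from a real dictionary file.

```perl
use strict;
use warnings;

# Hypothetical helper: upper-case a word and sort its letters ("caged" -> "ACDEG").
sub signature {
    my ($word) = @_;
    return join '', sort { $a cmp $b } split //, uc $word;
}

# Hypothetical helper: one optional letter per letter of the input,
# in sorted order ("abcdefg" -> qr/^A?B?C?D?E?F?G?$/).
sub letters_regex {
    my ($word) = @_;
    my $body = join '', map { quotemeta($_) . '?' } split //, signature($word);
    return qr/^$body$/;
}

my $r = letters_regex('abcdefg');

# Stand-in for the dictionary (assumption: one word per entry).
my @dict = qw(caged badge face cab bee fig);

# Keep the words whose sorted letters fit inside the pattern.
my @found = grep { signature($_) =~ $r } @dict;
print "@found\n";
```

Here 'bee' is rejected because the pattern allows only one E, and 'fig' is rejected because I is not among the letters. Each dictionary word is examined once, so the cost grows with the size of the dictionary rather than with the factorial of the word length.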
 
 
Comment by nkrgupta
Awesome. Thanks a ton :-)
 
 
Comment by ozo
which could be unnecessarily slow if there are a lot of repeated letters, since
E?E?E?E?E?E?
may try to backtrack over all combinations of optional E's before giving up.
It may be faster to convert that to E{0,6},
but a better speedup may be to preprocess the dictionary so that it stores ACDEG along with caged, so that you don't need to recompute join('',sort split//,uc) for each line.
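
Both suggestions can be sketched as follows. This is an illustration under assumptions, not the code from the accepted solution: letters_regex_counted and signature are hypothetical names, and the in-memory @indexed list stands in for a preprocessed dictionary that would in practice be written to a file once and reused.

```perl
use strict;
use warnings;

# Hypothetical helper: upper-case a word and sort its letters.
sub signature {
    my ($word) = @_;
    return join '', sort { $a cmp $b } split //, uc $word;
}

# Hypothetical helper: collapse repeated letters into one bounded quantifier,
# e.g. "geese" -> qr/^E{0,3}G{0,1}S{0,1}$/ rather than ^E?E?E?G?S?$.
sub letters_regex_counted {
    my ($word) = @_;
    my %count;
    $count{$_}++ for split //, uc $word;
    my $body = join '',
        map { quotemeta($_) . "{0,$count{$_}}" } sort keys %count;
    return qr/^$body$/;
}

# Preprocess once: keep each word's signature next to the word itself.
my @dict    = qw(geese see egg seeing gees);
my @indexed = map { [ signature($_), $_ ] } @dict;

# Each lookup now only runs the regex against the stored signatures.
my $r     = letters_regex_counted('geese');
my @found = map { $_->[1] } grep { $_->[0] =~ $r } @indexed;
print "@found\n";
```

With the signatures stored ahead of time, the split, sort and join are paid once per dictionary word instead of once per word per query.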
 
 
Comment by nkrgupta
Sure, I'll now work on that as well :-)
 
 