KinnaerD
asked on
scan a dictionary for a certain pattern
What is the best/easiest way for scanning a dictionary when given a certain pattern - a pattern like "ABCD" should result in a list of words like 'meat' 'fear' 'word', where a pattern like "ABAB" should result in a list of words like 'haha' 'hoho' and so on... All help on pattern matching is very welcome.
Best regards,
Daniel
Best regards,
Daniel
ASKER
Hi J,
This ABC-thing is the pattern which a word follows. For example, the pattern for 'there' is 'ABCDC'. Another word that matches this pattern is 'where'. The idea is to use a dictionary (a very long list of words) which is scanned with a given pattern (like this ABCDC) and which returns the words that fit the given pattern here 'there', 'where' ...)
regards,
Daniel
This ABC-thing is the pattern which a word follows. For example, the pattern for 'there' is 'ABCDC'. Another word that matches this pattern is 'where'. The idea is to use a dictionary (a very long list of words) which is scanned with a given pattern (like this ABCDC) and which returns the words that fit the given pattern here 'there', 'where' ...)
regards,
Daniel
As long as you use no '*' wildcard in your pattern the length of the pattern and the length of candidates must be the same.
Apart from that you will have to check all words (assuming that the words are sorted alphabetically).
The match itself is a very interesting one.
If the above pattern description is complete then it describes only equal vs unequal positional char matches.
So transform the pattern in a list of compares.
ABAB transforms to
1<>2
1= 3
1<>4
2<>3
2= 4
3<>4
This list contains unneeded compares.
Optimize it if you encounter speed problems.
Add new <> compares by substituting the other position from the = compares. Normalize the
new <> compare by having the smaller position on the left side.
Now remove all <> compares which are not unique.
In the above sample you would get 2<>3 at least twice meaning that it is implied by the other compares.
I hope this optimization is correct.
Apart from that you will have to check all words (assuming that the words are sorted alphabetically).
The match itself is a very interesting one.
If the above pattern description is complete then it describes only equal vs unequal positional char matches.
So transform the pattern in a list of compares.
ABAB transforms to
1<>2
1= 3
1<>4
2<>3
2= 4
3<>4
This list contains unneeded compares.
Optimize it if you encounter speed problems.
Add new <> compares by substituting the other position from the = compares. Normalize the
new <> compare by having the smaller position on the left side.
Now remove all <> compares which are not unique.
In the above sample you would get 2<>3 at least twice meaning that it is implied by the other compares.
I hope this optimization is correct.
//Here is a function... Please note! Pattern is case sensitive, word is case insensitive...
//======================== ========== ========== ======
function match(Pattern,Word:string) :Boolean;
var
i:Integer;
j:Integer;
begin
Result := False;
if Length(pattern)<>Length(Wo rd) then Exit;
Word := AnsiUpperCase(Word); // Word is case insensitive, Pattern is case sensitive.
for i:=1 to Length(Pattern) do
for j:=1 to Length(Pattern) do
if (Word[j]=Word[i]) xor (Pattern[i]=Pattern[j]) then
Exit;
Result := True;
end;
//======================== ========== ========== ======
// That's ALL!!! Dmn. :)
//======================== ========== ========== ======
// p.s. Nice task. Nice solution. You can even optimize it...
//========================
function match(Pattern,Word:string)
var
i:Integer;
j:Integer;
begin
Result := False;
if Length(pattern)<>Length(Wo
Word := AnsiUpperCase(Word); // Word is case insensitive, Pattern is case sensitive.
for i:=1 to Length(Pattern) do
for j:=1 to Length(Pattern) do
if (Word[j]=Word[i]) xor (Pattern[i]=Pattern[j]) then
Exit;
Result := True;
end;
//========================
// That's ALL!!! Dmn. :)
//========================
// p.s. Nice task. Nice solution. You can even optimize it...
ASKER CERTIFIED SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Oop's. I'm so sorry, but "FOR I" has a little mistake too... More faster solution:
function match(Pattern,Word:string) :Boolean;
var
i:Integer;
j:Integer;
begin
Result := False;
if Length(pattern)<>Length(Wo rd) then Exit;
Word := AnsiUpperCase(Word); // Word is case insensitive, Pattern is case sensitive.
for i:=2 to Length(Pattern) do
for j:=1 to i-1 do
if (Word[j]=Word[i]) xor (Pattern[i]=Pattern[j]) then
Exit;
Result := True;
end;
function match(Pattern,Word:string)
var
i:Integer;
j:Integer;
begin
Result := False;
if Length(pattern)<>Length(Wo
Word := AnsiUpperCase(Word); // Word is case insensitive, Pattern is case sensitive.
for i:=2 to Length(Pattern) do
for j:=1 to i-1 do
if (Word[j]=Word[i]) xor (Pattern[i]=Pattern[j]) then
Exit;
Result := True;
end;
ASKER
Hi DMN,
>for i:=2 to Length(Pattern) do
> for j:=1 to i-1 do
> if (aWord[j]=aWord[i]) xor (Pattern[i]=Pattern[j]) >then
> Exit;
> Result := True;
thanks for this wonderful function! As you see, I changed the variable Word into aWord. I was fiddling with XOR as well, but your answer is the best. Thanks.
Daniel
>for i:=2 to Length(Pattern) do
> for j:=1 to i-1 do
> if (aWord[j]=aWord[i]) xor (Pattern[i]=Pattern[j]) >then
> Exit;
> Result := True;
thanks for this wonderful function! As you see, I changed the variable Word into aWord. I was fiddling with XOR as well, but your answer is the best. Thanks.
Daniel
(aWord[j]=aWord[i]) xor (Pattern[i]=Pattern[j])
is equivalent of:
(aWord[j]<>aWord[i]) and (Pattern[i]=Pattern[j])
or
(aWord[j]=aWord[i]) and (Pattern[i]<>Pattern[j])
Good luck!
is equivalent of:
(aWord[j]<>aWord[i]) and (Pattern[i]=Pattern[j])
or
(aWord[j]=aWord[i]) and (Pattern[i]<>Pattern[j])
Good luck!
SELECT Word FROM Dictionary
WHERE Word LIKE '*a*'
OR Word LIKE '*A*'
This will give you any word with a capital or lowercase A in it at any position. You can reiterate this for all the letters to search for.
J.