Solved

Search string for whole words in Oracle

Posted on 2011-02-22
24
651 Views
Last Modified: 2012-05-11
Hi,

I am looking for the most simple and down-to-earth formulation of an SQL search query for whole words in a stored string field.

The input will be a single word, e.g. 'foo'. And the string stored in the DB will be a semicolon-delimited string that might contain the single search word.

How should I write the SQL so that I will get a hit if the field looks like
'foo; foo1; foo2'
...but NO hit if the field looks like
'foo1; foo2''
?

Obviously, this won't work:
SELECT *
FROM thetable
WHERE thesemicolonstring LIKE '%foo%'
...since that will get me a hit in both cases.

Thanks,

Cos
0
Comment
Question by:thecosimist
[X]
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
  • 10
  • 7
  • 4
  • +1
24 Comments
 
LVL 77

Accepted Solution

by:
slightwv (䄆 Netminder) earned 150 total points
ID: 34951119
If you can guarantee the spacing:

select * from thetable where ';' || thesemicolonstring || ';' like '%;foo;%'
0
 
LVL 77

Expert Comment

by:slightwv (䄆 Netminder)
ID: 34951140
You might also research Oracle Text indexes.  They require a little more maintenance because of how they work.  You will want to optimize them on a regular basis.

create index thetable_idx on thetable(thesemicolonstring) indextype is ctxsys.context;

Then you can do:
select * from thetable where contains(thesemicolonstring,'foo) > 0;
0
 
LVL 74

Expert Comment

by:sdstuber
ID: 34951226
where regexp_like(thesemicolonstring,' ?foo;')
0
Industry Leaders: We Want Your Opinion!

We value your feedback.

Take our survey and automatically be enter to win anyone of the following:
Yeti Cooler, Amazon eGift Card, and Movie eGift Card!

 
LVL 74

Expert Comment

by:sdstuber
ID: 34951238
or, if the semicolon might not be present at the end

where regexp_like(thesemicolonstring,' ?foo;?')
0
 
LVL 74

Expert Comment

by:sdstuber
ID: 34951242
in your example you have spaces after your semi-colons,  if that's not guaranteed then

where regexp_like(thesemicolonstring,'[ ;]?foo;?')
0
 

Author Comment

by:thecosimist
ID: 34951245
Thanks!

However, the semicolon-delimited string cannot be guaranteed to have semicolons on each side of the values. So two values may be stored like this: 'foo; foo1'

For the other solution - this SQL is going to be used in a system that is not maintained by very database-savvy people. So preferably nothing that needs regular optimizing.

Aren't there special characters that can be used for quasi-regexp purposes? Like the underscore character '_' or percent '%'.
0
 
LVL 74

Expert Comment

by:sdstuber
ID: 34951253
or maybe this would be better

where regexp_like(thesemicolonstring,'([ ;]?foo;)|([ ;]?foo$)')
0
 
LVL 77

Expert Comment

by:slightwv (䄆 Netminder)
ID: 34951260
>>However, the semicolon-delimited string cannot be guaranteed

This is why I pre-pended and appended my own semi colon in the very first post.
0
 
LVL 77

Expert Comment

by:slightwv (䄆 Netminder)
ID: 34951270
>>'([ ;]?foo;)|([ ;]?foo$)

Won't this also hit:  many_foo
0
 
LVL 74

Assisted Solution

by:sdstuber
sdstuber earned 100 total points
ID: 34951277
jeez, not enough caffeine yet this morning,
not sure why I was trying to make the space and semicolon optional


where regexp_like(thesemicolonstring,'(^foo[;])|([ ;]foo;)|([ ;]foo$)')
0
 

Author Comment

by:thecosimist
ID: 34951301
sdstuber,

I tried your solution but realized that a search for "foo" will result in a hit where thesemicolonstring is just "foo" but also if it's "foo1", i.e., if there is just one value and no semicolons.

So to clarify, a search for 'foo' should return a hit if thesemicolonstring looks like this:
'foo'
- or -
'xyz; foo; abc'

...but result in NO hits if the thesemicolonstring looks like this:
'foo1'
- or -
'foo1; foo2'
0
 
LVL 77

Expert Comment

by:slightwv (䄆 Netminder)
ID: 34951322
Did you not try my LIKE select above?
0
 
LVL 77

Expert Comment

by:slightwv (䄆 Netminder)
ID: 34951327
I would still suggest Oracle Text if you have a large amount of data to search through.  It is what it was created to do.
0
 
LVL 74

Expert Comment

by:sdstuber
ID: 34951342
>> I tried your solution

which one?  I posted several

anyway, I forgot the simplest case of just 'foo'
also, you never mentioned the space-semicolon,  is that reliable or not?


 WHERE REGEXP_LIKE(thesemicolonstring, '^foo$|(^foo[;])|([ ;]foo;)|([ ;]foo$)')
0
 

Author Comment

by:thecosimist
ID: 34951347
:-)

Thanks for the answers!... I think I've had too much caffeine actually.

However, sorry for being picky, but I would prefer to not use the advanced (and neat!) regexp solution you posted sdstuber, since the system is used by people who are not very good in Oracle/advanced SQL, or very good at reading regexps. Is there a simpler solution?

If there's no simpler solution, that's an answer as well of course. Part of the question is to find out what's best practice, and curiosity about the simplest possible SQL.
0
 
LVL 74

Expert Comment

by:sdstuber
ID: 34951401
simplest - use the concatenated semicolon string suggested by slightwv, unless the ";" vs " ;" isn't reliable

next - use text indexes, also suggested by slightwv

other suggestion, use a function that will parse yoru string into a collection and check for membership in the collection (search EE for str2tbl)
0
 

Author Comment

by:thecosimist
ID: 34951408
Ooops, I misread your first suggestion, slightwv. That will work perfectly. Too much caffeine, indeed. However, I will also have use for the regexp you posted, sdstuber.

Since I'm kinda new to experts-exchange, is it possible to accept two posts as solutions? I remember trying it once before and failing miserably.

Thanks a lot for the fast and reliable input!!!

/Cos
0
 
LVL 74

Expert Comment

by:sdstuber
ID: 34951417
another option, DON'T use a concatenated string,  instead split the values and put them into a child table keyed back to that row
0
 
LVL 74

Expert Comment

by:sdstuber
ID: 34951430
yes you can accept multiple solutions and divide the points however you want among all accepted answers
0
 
LVL 77

Expert Comment

by:slightwv (䄆 Netminder)
ID: 34951435
>> but I would prefer to not use the advanced

This would be Oracle Text.  It's a simple 'contains' query.

>> is it possible to accept two posts as solutions?

Yes.  Click the 'Accept Multiple Solutions link.
0
 
LVL 20

Expert Comment

by:gatorvip
ID: 34951467
How about replacing punctuation with a space, then searching with INSTR (LIKE should work too):

instr(
   translate( ' '|| lower(thesemicolonstring), ';', ' '),
   ' foo '
)>0

Due to the space before and after 'foo', neither foo1 nor foo2 will be returned.
0
 
LVL 74

Expert Comment

by:sdstuber
ID: 34951472
gatorvip,  that doesn't find "foo"
0
 
LVL 20

Expert Comment

by:gatorvip
ID: 34951507
True, would have needed another space at the end. I should have refreshed the page before posting as I missed a bunch of replies!

instr(
   translate( ' '|| lower(thesemicolonstring) || ' ', ';', ' '),
   ' foo '
)>0
0

Featured Post

PeopleSoft Has Never Been Easier

PeopleSoft Adoption Made Smooth & Simple!

On-The-Job Training Is made Intuitive & Easy With WalkMe's On-Screen Guidance Tool.  Claim Your Free WalkMe Account Now

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Working with Network Access Control Lists in Oracle 11g (part 1) Part 2: http://www.e-e.com/A_9074.html So, you upgraded to a shiny new 11g database and all of a sudden every program that used UTL_MAIL, UTL_SMTP, UTL_TCP, UTL_HTTP or any oth…
I remember the day when someone asked me to create a user for an application developement. The user should be able to create views and materialized views and, so, I used the following syntax: (CODE) This way, I guessed, I would ensure that use…
This video shows syntax for various backup options while discussing how the different basic backup types work.  It explains how to take full backups, incremental level 0 backups, incremental level 1 backups in both differential and cumulative mode a…
This video explains what a user managed backup is and shows how to take one, providing a couple of simple example scripts.

738 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question