ipaman

Selecting a distinct column that is a CLOB datatype

I have the following table:

create table foo(
base_id      number(12),
threat_id    number(12),
description CLOB,
Unique(base_id,threat_id))

My first attempt at my query is the following:

    select distinct description,threat_id from foo where threat_id=1234567;

...and I got the error:
   Inconsistent datatypes: expected - got CLOB

Since I know I can't select distinct on a CLOB datatype, how can I select from this
table so that I do not get duplicate descriptions?

ipaman
Mark Geerlings

The supplied PL/SQL package DBMS_LOB has lots of procedures and functions that you can use on CLOB columns, but no, you cannot use most standard SQL commands on CLOB columns.
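
As one illustration of the DBMS_LOB route, here is a minimal sketch that uses DBMS_LOB.COMPARE (which returns 0 when two LOBs match) to keep only the first row of each set of identical descriptions. Using ROWID as the tie-breaker is an assumption, not something from the question, and rows with a NULL description are never treated as duplicates here. It compares every pair of rows for the threat_id, so expect it to be slow on large sets.

```sql
-- Sketch only: return one row per distinct description for a threat_id.
-- DBMS_LOB.COMPARE returns 0 when both CLOBs are identical,
-- and NULL when either one is NULL (so NULL descriptions are all kept).
-- ROWID is used as an arbitrary tie-breaker (an assumption).
SELECT f.description, f.threat_id
FROM   foo f
WHERE  f.threat_id = 1234567
AND    NOT EXISTS (
         SELECT 1
         FROM   foo g
         WHERE  g.threat_id = f.threat_id
         AND    g.rowid < f.rowid
         AND    DBMS_LOB.COMPARE(g.description, f.description) = 0
       );
```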
BobMc

Just a thought: do what Oracle does when you enter a query. Hash the description into something more manageable, put it in another column, and select distinct on that instead.

Assuming you aren't updating these descriptions continuously, it will be much more efficient than scanning your CLOBs again and again.

I haven't got a decent hash algorithm at hand, but simply converting it all to ASCII codes character by character and adding them up isn't a bad starting point.

HTH
Bob
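
The hashing idea above could be sketched roughly as follows. Everything beyond the foo table is an assumption: the description_hash column name is made up, and the sketch relies on DBMS_CRYPTO.HASH (which needs an EXECUTE grant on DBMS_CRYPTO) with SHA-1 rather than the add-up-the-ASCII-codes scheme mentioned. The literal 3 stands in for DBMS_CRYPTO.HASH_SH1, since package constants generally cannot be referenced directly from plain SQL.

```sql
-- Hypothetical column to hold a SHA-1 digest (20 bytes) of each description.
ALTER TABLE foo ADD (description_hash RAW(20));

-- Populate it once; 3 is the value of DBMS_CRYPTO.HASH_SH1.
-- NULL descriptions are skipped and keep a NULL hash.
UPDATE foo
SET    description_hash = DBMS_CRYPTO.HASH(description, 3)
WHERE  description IS NOT NULL;

-- Keep one row per (threat_id, hash); the CLOB itself is never compared.
SELECT description, threat_id
FROM  (SELECT f.description,
              f.threat_id,
              ROW_NUMBER() OVER (PARTITION BY f.threat_id, f.description_hash
                                 ORDER BY f.base_id) AS rn
       FROM   foo f
       WHERE  f.threat_id = 1234567)
WHERE  rn = 1;
```

The hash column would need refreshing whenever a description changes (a trigger is one option), and two different descriptions could in principle collide on the same hash, so treat this as a starting point.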
A perhaps not-so-elegant solution (one that works, if slowly) would be to take the first 4000 characters of your CLOB, cast them to a VARCHAR2, and then do a DISTINCT on that. Fewer characters would be preferable, and this only works if the differences occur within the first 4000 characters.

SELECT DISTINCT CAST (SUBSTR(description, 1, 4000) AS VARCHAR2(4000)) AS description, threat_id
FROM   foo
WHERE  threat_id = 1234567;
Also, be careful with the DISTINCT operator in Oracle queries. If you have SQL Server experience, you may be used to including it in most queries out of habit. ***DON'T DO THAT IN ORACLE QUERIES!*** If you select only one column it is OK, but if you want to select multiple columns based on a distinct value in one or more of them, test the results and the response times thoroughly before you deploy. DISTINCT forces a sort (or, in newer Oracle releases, a hash) operation that you may not need. Make sure that Oracle's implementation of DISTINCT matches your knowledge of the data; the results may surprise you. It usually works better in Oracle to use GROUP BY with a group function (MIN, MAX, AVG, COUNT, etc.) when you want a distinct result set.
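
To illustrate the GROUP BY alternative on this table, a rough sketch follows. It groups on the first 4000 characters via DBMS_LOB.SUBSTR (amount first, then offset), so differences past that point still go unnoticed. The copies alias is just illustrative, and 4000 assumes a single-byte character set; lower it for multibyte data.

```sql
-- Sketch: one row per distinct leading 4000 characters of description,
-- with a count of how many rows share that text.
SELECT DBMS_LOB.SUBSTR(description, 4000, 1) AS description,
       threat_id,
       COUNT(*) AS copies
FROM   foo
WHERE  threat_id = 1234567
GROUP  BY DBMS_LOB.SUBSTR(description, 4000, 1), threat_id;
```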
ASKER CERTIFIED SOLUTION
paquicuba