Avatar of diannagibbs
diannagibbs
Flag for United States of America asked on

Oracle TO_NUMBER from varchar2 datatype

I'm trying to write a select for a report to clean up bad data.  The source is varchar2 and the target is varchar2 but goal is to load into a number datatype.  I have used CASE for the most part but when I add the ELSE to give me the good numbers, it fails.  So, I have read that probably reading from bottom up so this will not work.  Any ideas?  I was also thinking spaces but I've tried TRIM, LTRIM,RTRIM, DECODE and TRANSLATE and nothing works.  Any direction appreciated.

SQL> select rowid DI_ROW_ID, NAME_FIRST_MIDDLE_LAST, PROVIDERS_DATA_DZ_COL,
  2  to_number(CASE
  3  WHEN PROVIDERS_DATA_DZ_COL like UPPER('%A%') then 9999999999
  4  WHEN PROVIDERS_DATA_DZ_COL like UPPER('%B%') then 9999999999
  5  WHEN PROVIDERS_DATA_DZ_COL like UPPER('%C%') then 9999999999
  6  WHEN PROVIDERS_DATA_DZ_COL like UPPER('%D%') then 9999999999
  7  WHEN PROVIDERS_DATA_DZ_COL like UPPER('%E%') then 9999999999
  8  WHEN PROVIDERS_DATA_DZ_COL like UPPER('%F%') then 9999999999
  9  WHEN PROVIDERS_DATA_DZ_COL like UPPER('%G%') then 9999999999
 10  WHEN PROVIDERS_DATA_DZ_COL like UPPER('%H%') then 9999999999
 11  WHEN PROVIDERS_DATA_DZ_COL like UPPER('%I%') then 9999999999
 12  WHEN PROVIDERS_DATA_DZ_COL like UPPER('%J%') then 9999999999
 13  WHEN PROVIDERS_DATA_DZ_COL like UPPER('%K%') then 9999999999
 14  WHEN PROVIDERS_DATA_DZ_COL like UPPER('%L%') then 9999999999
 15  WHEN PROVIDERS_DATA_DZ_COL like UPPER('%M%') then 9999999999
 16  WHEN PROVIDERS_DATA_DZ_COL like UPPER('%N%') then 9999999999
 17  WHEN PROVIDERS_DATA_DZ_COL like UPPER('%O%') then 9999999999
 18  WHEN PROVIDERS_DATA_DZ_COL like UPPER('%P%') then 9999999999
 19  WHEN PROVIDERS_DATA_DZ_COL like UPPER('%Q%') then 9999999999
 20  WHEN PROVIDERS_DATA_DZ_COL like UPPER('%R%') then 9999999999
 21  WHEN PROVIDERS_DATA_DZ_COL like UPPER('%S%') then 9999999999
 22  WHEN PROVIDERS_DATA_DZ_COL like UPPER('%T%') then 9999999999
 23  WHEN PROVIDERS_DATA_DZ_COL like UPPER('%U%') then 9999999999
 24  WHEN PROVIDERS_DATA_DZ_COL like UPPER('%V%') then 9999999999
 25  WHEN PROVIDERS_DATA_DZ_COL like UPPER('%W%') then 9999999999
 26  WHEN PROVIDERS_DATA_DZ_COL like UPPER('%X%') then 9999999999
 27  WHEN PROVIDERS_DATA_DZ_COL like UPPER('%Y%') then 9999999999
 28  WHEN PROVIDERS_DATA_DZ_COL like UPPER('%Z%') then 9999999999
 29  WHEN PROVIDERS_DATA_DZ_COL like UPPER('%.%') then 9999999999
 30  WHEN PROVIDERS_DATA_DZ_COL like UPPER('%?%') then 9999999999
 31  WHEN PROVIDERS_DATA_DZ_COL like UPPER('%_%') then 9999999999
 32  WHEN PROVIDERS_DATA_DZ_COL like UPPER('%/%') then 9999999999
 33  WHEN PROVIDERS_DATA_DZ_COL like UPPER('%*%') then 9999999999
 34  WHEN PROVIDERS_DATA_DZ_COL like UPPER('%@%') then 9999999999
 35  WHEN PROVIDERS_DATA_DZ_COL like UPPER('%-%') then 9999999999
 36  WHEN PROVIDERS_DATA_DZ_COL like UPPER(' ') then 9999999999
 37  WHEN PROVIDERS_DATA_DZ_COL IS null then 9999999999
 38  ELSE PROVIDERS_DATA_DZ_COL
 39  END) DI_PROVIDERS_FIX
 40  from DI_PROVIDERS
 41  where rownum < 2
 42  order by 2
 43  /
ELSE PROVIDERS_DATA_DZ_COL
     *
ERROR at line 38:
ORA-00932: inconsistent datatypes: expected NUMBER got CHAR



SQL> select PROVIDERS_DATA_DZ_COL from DI_PROVIDERS where rownum < 2;

PROVIDERS_DATA_DZ_COL
----------------------------------------
962399
Oracle DatabaseDB Reporting ToolsDatabases

Avatar of undefined
Last Comment
awking00

8/22/2022 - Mon
Sean Stuber

ELSE to_number(PROVIDERS_DATA_DZ_COL)


and then remove the to_number around the CASE
diannagibbs

ASKER
That resolved the error but I would think I would see my PROVIDERS_DATA_DZ_COL value in the fix field, not 9999999999?    Which is my goal.  The FIX_DATA field with all the nine's would be the data to correct.  I feel like I'm missing something simple.


DI_ROW_ID         |NAME_FIRST_MIDDLE_LAST                  |PROVIDERS_DATA_DZ_COL                   |  FIX_DATA
------------------|----------------------------------------|----------------------------------------|----------
AAZ9BIAL0AAAAMjAAA|Test Patient                      |962399                                  |9999999999
awking00

select case when regexp_replace(providers_data_dz_col,'[[:digit:]]','') is null
                   then 9999999999
                   else to_number(providers_data_dz_col)
           end fix_data
Experts Exchange has (a) saved my job multiple times, (b) saved me hours, days, and even weeks of work, and often (c) makes me look like a superhero! This place is MAGIC!
Walt Forbes
Sean Stuber

the regexp_replace above assumes your providers_data_dz_col is an integer if it's numeric.

rather than changing the value and looking at the result you could use regexp_like to simply evaluate it as-is


 case when not regexp_like(providers_data_dz_col,'[[:digit:]]')
                   then 9999999999
                   else to_number(providers_data_dz_col)
end

if your value might not be an integer then try this...


 case when not regexp_like(providers_data_dz_col,'^(\+|\-)?[0-9]*\.?[0-9]*$')
                   then 9999999999
                   else to_number(providers_data_dz_col)
end
flow01

In your original query the
WHEN PROVIDERS_DATA_DZ_COL like UPPER('%_%') then 9999999999
is the cause of the  9999999999 because _ means any single character in the like operation
you can change it to
WHEN PROVIDERS_DATA_DZ_COL like UPPER('%\_%') escape '\' then 9999999999

the regexp_like is of course a much nicer solution ,
though it lacks some natural  readability.
diannagibbs

ASKER
Thanks for all the great solutions, but I'm still not seeing what I would expect.  I would expect my PROVIDERS_DATA_DZ_COL  value to be in my FIX column.  So I'm guessing that there are spaces in this value and that is why it is not working correctly?  As I've mentioned, I've tried to use TRIM, LTRIM, RTIM, TRANSLAATE, SET TRIM ON, etc.  None of these seems to working for me.


PROVIDERS_DATA_DZ_COL                           FIX
------------------------------ --------------------
962399                                   9999999999
900324                                   9999999999
936633                                   9999999999
911941                                   9999999999
Get an unlimited membership to EE for less than $4 a week.
Unlimited question asking, solutions, articles and more.
Sean Stuber

where/when are you applying trim

something like this?


case when not regexp_like(trim(providers_data_dz_col),'[[:digit:]]')
                   then 9999999999
                   else to_number(trim(providers_data_dz_col))
end
Sean Stuber

if you want to see invisible characters in that column use DUMP


select PROVIDERS_DATA_DZ_COL,dump(PROVIDERS_DATA_DZ_COL) from DI_PROVIDERS
where rownum < 10
diannagibbs

ASKER
LTRIM, RTIM example:
SQL> select
  2  PROVIDERS_DATA_DZ_COL,
  3  (case when not regexp_like(LTRIM(RTRIM(providers_data_dz_col)),'^(\+|\-)?[0-9]*\.?[0-9]*$\')
  4                     then 9999999999
  5                     else to_number(providers_data_dz_col)
  6  end ) FIX
  7  from DI_MIDAS_PROVIDERS where rownum < 5;

PROVIDERS_DATA_DZ_COL                           FIX
------------------------------ --------------------
962399                                   9999999999
900324                                   9999999999
936633                                   9999999999
911941                                   9999999999
I started with Experts Exchange in 2004 and it's been a mainstay of my professional computing life since. It helped me launch a career as a programmer / Oracle data analyst
William Peck
Sean Stuber

what does DUMP show you?
diannagibbs

ASKER
This DUMP is very, interesting and found my issue.  What would ^M,^J be?  Is that when you move data as binary instead of ascii?


PROVIDERS_DATA_DZ_COL                    DUMP(PROVIDERS_DATA_DZ_COL,1017)
---------------------------------------- -------------------------------------------------------------
962399                                   Typ=1 Len=8 CharacterSet=US7ASCII: 9,6,2,3,9,9,^M,^J
900324                                   Typ=1 Len=8 CharacterSet=US7ASCII: 9,0,0,3,2,4,^M,^J
936633                                   Typ=1 Len=8 CharacterSet=US7ASCII: 9,3,6,6,3,3,^M,^J
911941                                   Typ=1 Len=8 CharacterSet=US7ASCII: 9,1,1,9,4,1,^M,^J
9119587                                  Typ=1 Len=9 CharacterSet=US7ASCII: 9,1,1,9,5,8,7,^M,^J
915623                                   Typ=1 Len=8 CharacterSet=US7ASCII: 9,1,5,6,2,3,^M,^J
014159                                   Typ=1 Len=8 CharacterSet=US7ASCII: 0,1,4,1,5,9,^M,^J
13755                                    Typ=1 Len=7 CharacterSet=US7ASCII: 1,3,7,5,5,^M,^J
936369                                   Typ=1 Len=8 CharacterSet=US7ASCII: 9,3,6,3,6,9,^M,^J
Sean Stuber

here's a small example showing the code does work  note the "9" in the 1234 column

SQL> create table di_midas_providers (PROVIDERS_DATA_DZ_COL varchar2(50));

Table created.

SQL> insert into di_midas_providers (PROVIDERS_DATA_DZ_COL) values ('962399');

1 row created.

SQL> insert into di_midas_providers (PROVIDERS_DATA_DZ_COL) values ('900324');

1 row created.

SQL> insert into di_midas_providers (PROVIDERS_DATA_DZ_COL) values ('936633');

1 row created.

SQL> insert into di_midas_providers (PROVIDERS_DATA_DZ_COL) values ('911941');

1 row created.

SQL> insert into di_midas_providers (PROVIDERS_DATA_DZ_COL) values ('invalid');

1 row created.

SQL> insert into di_midas_providers (PROVIDERS_DATA_DZ_COL) values ('1234'||chr(9));

1 row created.

SQL> SELECT providers_data_dz_col,
  2         DUMP(providers_data_dz_col) dump_output,
  3         (CASE
  4              WHEN NOT REGEXP_LIKE(LTRIM(RTRIM(providers_data_dz_col)), '^(\+|\-)?[0-9]*\.?[0-9]*$\')
  5              THEN
  6                  9999999999
  7              ELSE
  8                  TO_NUMBER(providers_data_dz_col)
  9          END)
 10             fix
 11    FROM di_midas_providers;

PROVIDERS_DATA_DZ_COL                              DUMP_OUTPUT                                            FIX
-------------------------------------------------- -------------------------------------------------- ----------
962399                                             Typ=1 Len=6: 57,54,50,51,57,57                      962399
900324                                             Typ=1 Len=6: 57,48,48,51,50,52                      900324
936633                                             Typ=1 Len=6: 57,51,54,54,51,51                      936633
911941                                             Typ=1 Len=6: 57,49,49,57,52,49                      911941
invalid                                            Typ=1 Len=7: 105,110,118,97,108,105,100            9999999999
1234                                                  Typ=1 Len=5: 49,50,51,52,9                      9999999999

6 rows selected.

Open in new window

Get an unlimited membership to EE for less than $4 a week.
Unlimited question asking, solutions, articles and more.
Sean Stuber

^M = chr(13) =  carriage return

^J=chr(10) = new line
ASKER CERTIFIED SOLUTION
Sean Stuber

Log in or sign up to see answer
Become an EE member today7-DAY FREE TRIAL
Members can start a 7-Day Free trial then enjoy unlimited access to the platform
Sign up - Free for 7 days
or
Learn why we charge membership fees
We get it - no one likes a content blocker. Take one extra minute and find out why we block content.
Not exactly the question you had in mind?
Sign up for an EE membership and get your own personalized solution. With an EE membership, you can ask unlimited troubleshooting, research, or opinion questions.
ask a question
awking00

Strangely enough, if there is a linefeed but no carriage return, the expression will match, but the to_number function will fail.
SQL> select '-123.456'||chr(10) as prov_data,
  2  case when not regexp_like('-123.456'||chr(10),'^(\+|\-)?[0-9]*\.?[0-9]*$')
  3       then 9999999999
  4       else to_number('-123.456')  ==> without the linefeed
  5  end fix
  6  from dual;

PROV_DATA        FIX
--------- ----------
-123.456    -123.456

SQL> select '-123.456'||chr(10) as prov_data,
  2  case when not regexp_like('-123.456'||chr(10),'^(\+|\-)?[0-9]*\.?[0-9]*$')
  3       then 9999999999
  4       else to_number('-123.456'||chr(10))
  5  end fix
  6  from dual;
     else to_number('-123.456'||chr(10)) ==> with the linefeed
                               *
ERROR at line 4:
ORA-01722: invalid number

However, the replace function (could also use translate) will remedy that in any case. I just found that match somewhat surprising.
Sean Stuber

check  the previous post where I explicitly check for any combination of space, carriage return or new line at the end of the string
Your help has saved me hundreds of hours of internet surfing.
fblack61
diannagibbs

ASKER
This is exactly what I was looking for - thanks so much!!!!!
awking00

I did check the previous post, which is why I said the replace function will remedy that in any case. I should have said the replace function(s) that sdstuber employed will remedy that in any case. I was just pointing out the fact that I thouhght it was unusual that the regexp_like(expression,'^(\+|\-)?[0-9]*\.?[0-9]*$') would match any expression
that contained a linefeed as long as the rest of the expression could be converted to a number.