jknj72
asked on
Non ASCII character
I have a function that cleans up NoN Ascii characaters but I came across this bad boy (┬╡)and when I try to run the function that does the cleaning of Non ASCII characters it prompts me for a replacement value for the word MARBLE?
Select clean_non_ascii('‘JOHNS TILE ┬╡& MARBLE INSTALLATION INC ‘') From Dual
The function looks like this
function clean_non_ascii(input_str in varchar2,
usetable in number default 0,
use90 in number default 0)
return varchar2 as
str VARCHAR2(100 BYTE); --Extra 2 Bytes for ^
str60 VARCHAR2(70 BYTE); -- Extra 2 BYTES for ^
act number :=0;
cnt number :=0;
askey number :=0;
OUTPUT_STR VARCHAR2(90 BYTE);
OUTPUT_STR_60 VARCHAR2(60 BYTE);
err_code NUMBER;
err_msg VARCHAR2(200);
begin
str:= '^'||TO_CHAR(INPUT_STR)||' ^';
cnt:=length(str);
if length(cnt) = 0 then
for i in 1 .. cnt loop
askey :=0;
select ascii(substr(str,i,1)) into askey
from dual;
if askey < 32 or askey >=127 then
str := '^'||REPLACE(str, CHR(askey),'');
end if;
end loop;
OUTPUT_STR := trim(ltrim(rtrim(trim(str) ,'^'),'^') );
else
OUTPUT_STR := INPUT_STR;
end if;
RETURN (OUTPUT_STR);
end clean_non_ascii;
Select clean_non_ascii('‘JOHNS TILE ┬╡& MARBLE INSTALLATION INC ‘') From Dual
The function looks like this
function clean_non_ascii(input_str in varchar2,
usetable in number default 0,
use90 in number default 0)
return varchar2 as
str VARCHAR2(100 BYTE); --Extra 2 Bytes for ^
str60 VARCHAR2(70 BYTE); -- Extra 2 BYTES for ^
act number :=0;
cnt number :=0;
askey number :=0;
OUTPUT_STR VARCHAR2(90 BYTE);
OUTPUT_STR_60 VARCHAR2(60 BYTE);
err_code NUMBER;
err_msg VARCHAR2(200);
begin
str:= '^'||TO_CHAR(INPUT_STR)||'
cnt:=length(str);
if length(cnt) = 0 then
for i in 1 .. cnt loop
askey :=0;
select ascii(substr(str,i,1)) into askey
from dual;
if askey < 32 or askey >=127 then
str := '^'||REPLACE(str, CHR(askey),'');
end if;
end loop;
OUTPUT_STR := trim(ltrim(rtrim(trim(str)
else
OUTPUT_STR := INPUT_STR;
end if;
RETURN (OUTPUT_STR);
end clean_non_ascii;
Can you not use regular expressions?
regexp_replace(input_str,' [^[:print: ]]')
regexp_replace(input_str,'
Just to demonstrate slightwv's solution -
SQL> select 'JOHNS TILE ┬╡& MARBLE INSTALLATION INC' from dual;
Enter value for marble: GRANITE
old 1: select 'JOHNS TILE ┬╡& MARBLE INSTALLATION INC' from dual
new 1: select 'JOHNS TILE ┬╡GRANITE INSTALLATION INC' from dual
'JOHNSTILE┬┐GRANITEINSTALL ATIONINC'
-------------------------- ---------- ---
JOHNS TILE ┬╡GRANITE INSTALLATION INC
SQL> SET DEFINE OFF
SQL> select 'JOHNS TILE ┬╡& MARBLE INSTALLATION INC' from dual;
'JOHNSTILE┬┐&MARBLEINSTALL ATIONINC'
-------------------------- ---------- ----
JOHNS TILE ┬╡& MARBLE INSTALLATION INC
No points please.
SQL> select 'JOHNS TILE ┬╡& MARBLE INSTALLATION INC' from dual;
Enter value for marble: GRANITE
old 1: select 'JOHNS TILE ┬╡& MARBLE INSTALLATION INC' from dual
new 1: select 'JOHNS TILE ┬╡GRANITE INSTALLATION INC' from dual
'JOHNSTILE┬┐GRANITEINSTALL
--------------------------
JOHNS TILE ┬╡GRANITE INSTALLATION INC
SQL> SET DEFINE OFF
SQL> select 'JOHNS TILE ┬╡& MARBLE INSTALLATION INC' from dual;
'JOHNSTILE┬┐&MARBLEINSTALL
--------------------------
JOHNS TILE ┬╡& MARBLE INSTALLATION INC
No points please.
ASKER
Where would I put the SET DEFINE OFF in the function? And slight I sure can use Regex_Replace. DO I just use that instead of Replace? Keep all the other code?
ASKER CERTIFIED SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
ASKER
I guess I dont need the function and Im wondering why I went through all the characters.
If this....
return regexp_replace(input_str,' [^[:print: ]]');
will bring me back only ASCII characters and take out NON ASCII chars then Im good to go...
If this....
return regexp_replace(input_str,'
will bring me back only ASCII characters and take out NON ASCII chars then Im good to go...
>>will bring me back only ASCII characters and take out NON ASCII chars then Im good to go...
The print class is all printable characters:
https://en.wikipedia.org/wiki/Regular_expression
The print class is all printable characters:
https://en.wikipedia.org/wiki/Regular_expression
[\x20-\x7E] Visible characters and the space character.
ASKER
I would want to replace the NON ASCII with a whitespace though. That was the logic I was trying to accomplish
>>I would want to replace the NON ASCII with a whitespace though.
I understand that. The '^' inside the brackets means 'NOT'. So, what I posted replaces 'NOT printable'.
Did you try what I posted and compare it with what you had?
I understand that. The '^' inside the brackets means 'NOT'. So, what I posted replaces 'NOT printable'.
Did you try what I posted and compare it with what you had?
ASKER
Yes and it seems to work and if your saying thats what I need then Im good to go. I didnt know if it was also doing an ASCII check as well?
It should do the same thing that you had.
ASKER
Thanks Slight
ASKER
Thanks again!
set define off
It sees the & as a substitution variable.