Link to home
Start Free TrialLog in
Avatar of riaancornelius
riaancornelius

asked on

Replace text in returned query

Hi, I have a table that stores a message that has been sent to a user. In the message there is information in the format:

somechar:#####         (Example - ' Pin is P:1234 reference is R:123456')

What I want to do is hide this information in my frontend. IE, When I select the message, I want the returned result to contain (from example above) 'Pin is P:**** reference is R:******'

The message is stored in an nText field and both pin and reference numbers can be dynamic length, but will always be followed by a single space(unless the number is last element of message - like the R:123456 above).

Is it possible to write a select query to format the strings like this before returning a resultset. Because of the way the data is returned, it will be extremely slow to format it in the java app.

Avatar of puranik_p
puranik_p
Flag of India image

You can use xp_pcre_replace mentioned here.
http://www.codeproject.com/database/xp_pcre.asp?df=100&forumid=16452&exp=0&select=687483

look at this....

To replace all numbers (regardless of length) with "###":

DECLARE @out VARCHAR(8000)
EXEC xp_pcre_replace
   '12345 is less than 99999, but not 1, 12, or 123',
   '\d+',
   '###',
   @out OUTPUT

PRINT @out
prints out:

### is less than ###, but not ###, ###, or ###

All the best!
Pura
Avatar of riaancornelius
riaancornelius

ASKER

Thanks puranik_p , but I'd prefer it if I could do this using only sql, not anything external to the server.
Its only SQL.
The link gives a stored procedure named 'xp_pcre_replace', which you can use, instead of writing your own.
yes, but then i need to install that dll on each client machine.
NO.
Its just a procedure which will be compiled and stored on SQL server.
YES
If I fire up my trusty Query Analyzer and type:

DECLARE @out VARCHAR(8000)
EXEC xp_pcre_replace
   '12345 is less than 99999, but not 1, 12, or 123',
   '\d+',
   '###',
   @out OUTPUT

PRINT @out

Then hit execute, I get the following:

Server: Msg 2812, Level 16, State 62, Line 2
Could not find stored procedure 'xp_pcre_replace'.

I cannot change the installation procedure of the java app in any way. For this to work, I will still need to copy the dll to the clients server and execute the commands to add the stored procedures to MS SQL Server. I can't/Wont do that.
This code works, but it's kind of clumsy. Also, I'm not sure about performance when you have a lot of records. Maybe you can give it a try and let us know?

declare @Test varchar(8000)

set @Test = 'Pin is P:1234 reference is R:123456'

select replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(@Test,'0','*'),'1','*'),'2','*'),'3','*'),'4','*'),'5','*'),'6','*'),'7','*'),'8','*'),'9','*')

problem is that I only want to replace when it starts with somechar:numbers, not necessarily all numbers.
Is the number of occurrences of somechar:numbers specified? Or can you have varying, and possibly a large number of occurrences?
IMHO You can't use replace on a nText column,
you'd have to convert to nvarchar(4000) first, thus "losing" any information after the 4000th character, and then use either a regular expression (slow) or multiple replace

ie instead of
select ntextColumn, .... from yourtable

select replace(replace(replace(convert(ntextColumn as nvarchar(4000)), '0', '#'), '1', '#'), '2', '#')  , ....
from yourtable

the code above would replace 0,1,2, you get the idea of how to replace up to 9 ....

I'd stick to Java side :
did you try the replaceAll method ?? (in the following exemple, '\d' matches any numeric digit)

myString = myString.replaceAll('\d', '#')
Hilaire, the problem is that I load all the records into a resultset, and then load the first 100 records into a JTable. When the user scrolls to the bottom of the table, I load the next 100. I'd prefer not to have to iterate through a hundred rows replacing values each time the user gets to the bottom of the list. Furthermore, I can't replace all numeric digits, as we need to leave the ref number in a readable format in some cases which means i cant just use replaceAll indiscriminately.

It looks like I'll need to do it in java, But I'd really prefer to move that to SQL.
Another approach..

Can you have two columns instead of one?
First one will contain..
'Pin is P:**** reference is R:******'
and, second one will contain..
P:1234,R:123456

You can take care of this while inserting itself.
we already have clients with million of rows of existing data. I can change structure of tables if absolutely necessary, but I can't change existing data. I need a dynamic solution that will be able to deal with any message that contains char:digits as we use this approach for several different formats.
If you give more details on when you want to hide REF or not,
I can post sample code using regular expressions in T-SQL with the sp_OA* stored procedures and the VBScript.RegExp object

Not sure it would be fatser than doing it on the front-end side (Java), but it might be worth a try ...

well, let's assume we never want to show the ref number, but that there is some numbers that needs to be displayed. Example:

Your ID is 7744, Pin is P:1234 and ref is R:123456. This info is accurate as of 14/03/2005.

I know it can be done, but I don't have the time right now to research it.

Thanks in advance.
Please give this code a try,
can be tweaked to fit your requirements better if need be

CREATE FUNCTION dbo.ufn_CustomReplace(@strIn nvarchar(4000))
RETURNS varchar(8000) AS
BEGIN
DECLARE @objRegExp integer
EXECUTE sp_OACreate 'VBScript.RegExp', @objRegExp OUTPUT
EXECUTE sp_OASetProperty @objRegExp, 'Global', 1
EXECUTE sp_OASetProperty @objRegExp, 'Pattern', '\s([PR]):\d*'
EXECUTE sp_OAMethod @objRegExp, 'Replace', @strIn OUTPUT, @strIn, ' $1:####'
EXECUTE sp_OADestroy @objRegExp
RETURN @strIn
END
GO

-- how to use it
select dbo.ufn_CustomReplace('Your ID is 7744, Pin is P:1234 and ref is R:123456. This info is accurate as of 14/03/2005.')

-- to use it on a nText column, you'll have to cast the text as nvarchar first

select dbo.ufn_CustomReplace(cast(YourNTextColumn as nvarchar(4000))) as ObfuscatedCol from YourTable

HTH

Hilaire
Excellent! It works Hilaire.
Can you please give me a short explanation of whats happening in the stored procedure. Also, the P and R characters is defined by client and can be dynamic, so I assume I can just take another input variable with the chars when calling this (can I do something like  '\s(['+@chars+']):\d*' ).
But what happens if it is set up to be a string instead of a char (ie - pin: instead of p:) can I set up a pattern for that?
Avatar of Mehul Shah
you can try the below code. It still has some gaps but other experts can add their suggestions for the same.

declare @Test varchar(8000)
set @Test = 'Pin is P:1234 reference 1234 is R:654321'
select replace(@Test,substring(@Test,patindex('%_:%',@Test)+2,charindex(' ',@Test,patindex('%_:%',@Test)-patindex('%_:%',@Test))),':**********')
quite weird MehulS78, I get:
Could not find stored procedure 'patindex'.
Replaced patindex with PATINDEX and it works. Is stored procedure names case sensitive?

in any case, I get:
Pin is P::********** reference :********** is R:654321
>>so I assume I can just take another input variable with the chars when calling this (can I do something like  '\s(['+@chars+']):\d*' ). <<
Definitely !
You'll have to declare an additional @variable, cause you can't concatenate and pass a parameter to sp_oamethod on the same line.


CREATE FUNCTION dbo.ufn_CustomReplace(@strIn nvarchar(4000), @chars nvarchar(10))
RETURNS nvarchar(4000) AS
BEGIN
DECLARE @objRegExp integer, @strPattern nvarchar(200)
EXECUTE sp_OACreate 'VBScript.RegExp', @objRegExp OUTPUT
EXECUTE sp_OASetProperty @objRegExp, 'Global', 1
set @strPattern = N's([' + @chars + N']):\d+'
EXECUTE sp_OASetProperty @objRegExp, 'Pattern', @strPattern
EXECUTE sp_OAMethod @objRegExp, 'Replace', @strIn OUTPUT, @strIn, ' $1:####'
EXECUTE sp_OADestroy @objRegExp
RETURN @strIn
END
GO

>>But what happens if it is set up to be a string instead of a char (ie - pin: instead of p:) can I set up a pattern for that?<<
Sure

'\sP(in)?:\d*' would match either ' P:123546' or ' Pin:123456'

the function can even be tweaked to make several replaces and be more robust ...
Awesome, This looks like What I Need, will do some tests tonight and look at the T-SQL Reference to see what this is actually doing, and will get back to you guys with some results tomorrow. unfortunately I'm a bit swamped with other stuff right now.
I'm trying this with the new stored procedure:
select dbo.ufn_CustomReplace(cast(message as nvarchar(4000)),'PR') as ObfuscatedCol from messages
-and
select dbo.ufn_CustomReplace('Your ID is 7744, Pin is P:1234 and ref is R:123456. This info is accurate as of 14/03/2005.','PR')

and nothing is getting replaced? Am I missing something obvious?
Oops, sorry typo in this line

set @strPattern = N's([' + @chars + N']):\d+'

should be  (missing '\' at the beginning)

set @strPattern = N'\s([' + @chars + N']):\d+'

thanks, that sorted it.

bit confused though, what would the select look like if I wanted to match 'pin:1234' and 'ref:1234' ? or would I need to change the sp? maybe pass in the whole pattern?
Sorry, after I asked this question I realised that our settings actually allows a string as ref or pin character. So it is always a string, it's just usually set up to be a string of length 1. Doesn't need to be though...
ASKER CERTIFIED SOLUTION
Avatar of Hilaire
Hilaire
Flag of France image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Thanks Hillaire, this is what I was looking for. Need to do some tests to see if its fast enough, so I might not use it, but it is exactly what I was looking for.

Glad I could help.

Feel free to post a follow-up if you the function needs more tweaking.

For complicated string processing regular expressions really rock.

I suggest you post a question in the Java Topic Area to see you this would write in Java, but I think you could reuse the same patterns and logic in any programming langage that has a decent regular expressions support.

Regards
Hilaire
I have actually done it in java, it just became a bit cumbersome to use the java code after I redesigned the way I fetch data from the db and display it (my app was taking way too long to load all the rows into the tables.).
I just wanted a way to do it in SQL because then I can run the query and continue to do any other stuff that needs to be done while SQL Server can sort out the details of obfuscating the necessary stuff.