perl search and replace

Hi - My file looks like below

ISDFFS*00*          *00*          *30*7984581235      *ZZ*AKLSJFHADSJF   *624654*1805*^*00501*3465465*1*T*:~RS*GS*5145454121*AKLSJFHADSJF*624654*1805*7984581235*X*005010X306~ST*820*7984581235*005010X306~SDF*I*0*C*NON************624654~SDFS*3*99999~SDF*582****RD8*20141201-20141231~SWRE*1121326~REF*DOL*100000037~RSRRE*RER*ERER**-2132154021.14~DRERF*582****RD8*20150701-20150731~REF*DOL*100000257~SE*8*7984581235~RT*1*7984581235~IAS*1*7984581235~

Just to explain my problem clearly I split the file into multiple lines using '~' and it looks like below

ISDFFS*00*          *00*          *30*7984581235      *ZZ*AKLSJFHADSJF   *624654*1805*^*00501*3465465*1*T*:~
RS*GS*5145454121*AKLSJFHADSJF*624654*1805*7984581235*X*005010X306~
ST*820*7984581235*005010X306~
SDF*I*0*C*NON************624654~
SDFS*3*99999~
SDF*582****RD8*20141201-20141231~
SWRE*1121326~
REF*DOL*100000037~
RSRRE*RER*ERER**-2132154021.14~
DRERF*582****RD8*20150701-20150731~
REF*DOL*100000257~
SE*8*7984581235~
RT*1*7984581235~
IAS*1*7984581235~

Now I want to replace all occurences line starting with REF*DOL* ....... with other values

first i want to find the occurence of REF*DOL*
example: REF*DOL*100000037~
now I want the number 100000037 and use it for a query like below
select memid from emp where eid = '100000037'

Open in new window

once i got the memid I want to use it for my replace option, lets say the memid returned from the query is 100000123
then i want to replace REF*DOL*100000037~
with
REF*DOL*100000123~REF*AP*100000123~

I want to do this for all occurances, this is a simple file so i have just two REF*DOL* but for big files i have these in 1000's

Can some one provide me perl script for this.

Thanks,
shragiAsked:
Who is Participating?

[Product update] Infrastructure Analysis Tool is now available with Business Accounts.Learn More

x
I wear a lot of hats...

"The solutions and answers provided on Experts Exchange have been extremely helpful to me over the last few years. I wear a lot of hats - Developer, Database Administrator, Help Desk, etc., so I know a lot of things but not a lot about one thing. Experts Exchange gives me answers from people who do know a lot about one thing, in a easy to use platform." -Todd S.

ozoCommented:
use DBI;
$dbh = DBI->connect("DBI:mysql:database=$database", $user, $password) or die $!;
while( <> ){
    s/(?<=REF\*DOL\*)(\d+)~/@{[@r=$dbh->selectrow_array( "select memid from emp where eid = '$1'")]}~REF*AP*@r~/g;
}
wilcoxonCommented:
Can this be done before you split the file into individual lines (or just on the multi-line version of the file data)?  Is the multi-line version of the file available in a single var or is it in a list (one line per entry) or something else?

Can a single eid appear in more than one record within the file?  If so, for efficiency, you want to only do the db access once.  Will all of the eids appear within a file?  Is the emp table large?  If not, I'd think about just caching the table.
shragiAuthor Commented:
@wilcoxon
your answers
Can this be done before you split the file into individual lines (or just on the multi-line version of the file data)?
YES THERE IS NO NEED TO SPLIT - I JUST SPLITTED HERE TO EXPLAIN MY PROBLEM
 Is the multi-line version of the file available in a single var or is it in a list (one line per entry) or something else?
THE FILE I GET HAS ONLY ONE LINE AND ONCE AGAIN NO NEED TO SPLIT, BUT EACH LINE IS SEPARATED BY ~

Can a single eid appear in more than one record within the file?  
NO
If so, for efficiency, you want to only do the db access once.  Will all of the eids appear within a file?
IF THAT IS POSSIBLE I WANT TO ACCESS DB ONCE BUT IS IT POSSIBLE
 Is the emp table large?  If not, I'd think about just caching the table.
YES IT IS AROUND 500K RECORDS AND 60 COLUMNS, ALSO ITS JUST NOT EMPID TABLE
I GAVE THAT AS EXAMPLE THERE IS A BIG QUERY THAT I WILL RUN FOR THIS.


Thanks,
Determine the Perfect Price for Your IT Services

Do you wonder if your IT business is truly profitable or if you should raise your prices? Learn how to calculate your overhead burden with our free interactive tool and use it to determine the right price for your IT services. Download your free eBook now!

shragiAuthor Commented:
@ozo
i got compilation errors, can you please check my attached perl script and let me know where did i go wrong

ERRORS:
Possible unintended interpolation of @r in string at line 40
global symbol @r requires explicit package name at lien 40
perlscript.txt
ozoCommented:
my @r;
while (<INFILE>) {
    s/(?<=REF\*DOL\*)(\d+)~/@{[@r=$dbh->selectrow_array( "select DISTINCT member_id    
from relation as cr
join Contacts as c on c.id = cr.id and c.status = cr.status  
join Coverage as ec on ec.id = cr.id and ec.status = cr.status
where cr.status = 'A'  and ec.member_id = '$1'")]}~REF*AP*@r~/g;
   print OUTFILE;
}
# but that looks like it may be a complicated query to perform 1000's of times
# could it be faster to perform one query to select all ec.member_id, member_id replacements  and have perl cache the results?

Experts Exchange Solution brought to you by

Your issues matter to us.

Facing a tech roadblock? Get the help and guidance you need from experienced professionals who care. Ask your question anytime, anywhere, with no hassle.

Start your 7-day free trial
shragiAuthor Commented:
thanks ozo - that worked.
I want a small extension to this, when i ran the query i got multiple results so to filter out and get just one result i thought of adding one add more condition to where in sql.
but for that i want to grab a value from the text file.

Going back to previous example:
ISDFFS*00*          *00*          *30*7984581235      *ZZ*AKLSJFHADSJF   *624654*1805*^*00501*3465465*1*T*:~
RS*GS*5145454121*AKLSJFHADSJF*624654*1805*7984581235*X*005010X306~
ST*820*7984581235*005010X306~
SDF*I*0*C*NON************624654~
SDFS*3*99999~
SDF*582****RD8*20141201-20141231~
SWRE*1121326~
NM1*IL*1*BOURNE*BOND*L***C1*100000257~
REF*DOL*100000037~
RSRRE*RER*ERER**-2132154021.14~
DRERF*582****RD8*20150701-20150731~
REF*DOL*100000257~
SE*8*7984581235~
RT*1*7984581235~
IAS*1*7984581235~

There is this below line before 4 lines of REF*DOL which gives user last name and first name
NM1*IL*1*BOURNE*BOND*L***C1*100000257~

here bourne is lastname
and bond is first name
i want both to filter out, so that i can write query as below

select DISTINCT member_id    
from relation as cr
join Contacts as c on c.id = cr.id and c.status = cr.status  
join Coverage as ec on ec.id = cr.id and ec.status = cr.status
where cr.status = 'A'  and ec.member_id = '$1' and lastname = 'bourne' and firstname = 'bond'

the important point is i want only the name from NM1*IL* which is 4 lines before REF*DOL*
these things can be repeated and i mean in file there are multiple sections like this...
so each NM1*IL* is associated to REF*DOL*
so if i find NM1*IL* then i use the name from this to update the immediate REF*DOL*

is this a quick fix.
It's more than this solution.Get answers and train to solve all your tech problems - anytime, anywhere.Try it for free Edge Out The Competitionfor your dream job with proven skills and certifications.Get started today Stand Outas the employee with proven skills.Start learning today for free Move Your Career Forwardwith certification training in the latest technologies.Start your trial today
Perl

From novice to tech pro — start learning today.