Solved

regular expression

Posted on 2004-09-15
Last Modified: 2010-03-05
Hi All,

I need help modifying a regular expression. My current one is:

if (/^ATOM\s+\d+\s+\CA\s+(\S+)\s+/)

which gets the "VAL" code from the line below in a large file:

ATOM      2  CA  VAL A   3      17.591  48.101  25.416  1.00 27.93           C
 

However, sometimes the file contains a line like this:
ATOM   1358  CA ALEU A 199      -3.698 -19.821 -32.696  0.50 21.71           C  

and I can't match the "LEU" code.

Can someone help modify my expression so it can accommodate both of these lines in the input and return the desired code, like VAL and LEU in the examples above?

Thanks

Sarah
Question by:sarahJo
12 Comments
 
LVL 19

Accepted Solution

by:
Kim Ryan earned 125 total points
ID: 12071903
Try this one instead; it ignores an optional A before the VAL or LEU:
/^ATOM\s+\d+\s+CA\s+A?(\S+)\s+/
 
LVL 10

Assisted Solution

by:rj2
rj2 earned 125 total points
ID: 12071913
Sample code below matches both VAL and LEU.

# First sample line: plain three-letter residue code
$_ = 'ATOM      2  CA  VAL A   3      17.591  48.101  25.416  1.00 27.93           C';

if (/^ATOM\s+\d+\s+CA\s+(\S+)\s+/) {
    print "Match: $1\n";
}

# Second sample line
$_ = 'ATOM   1358  CA LEU A 199      -3.698 -19.821 -32.696  0.50 21.71           C';
if (/^ATOM\s+\d+\s+CA\s+(\S+)\s+/) {
    print "Match: $1\n";
}
 

Author Comment

by:sarahJo
ID: 12071936
Hi Teraplane,

Thanks for that. It might be any letter before the VAL or LEU, not just A. Can it be changed to accommodate this?

Thanks!!!

Author Comment

by:sarahJo
ID: 12071943
Hi rj2,

the second line is

ATOM   1358  CA ALEU A 199      -3.698 -19.821 -32.696  0.50 21.71           C  

with an A before the LEU. That was the problem, so the expression needs a little tinkering!

Thanks
Sarah
 
LVL 19

Expert Comment

by:Kim Ryan
ID: 12071955
Yes, you can, but is it always followed by LEU or VAL? If so, this would work:
/^ATOM\s+\d+\s+CA\s+\w?(VAL|LEU)\s+/
 

Author Comment

by:sarahJo
ID: 12071958
Oh, it's not always followed by LEU or VAL, I'm afraid... it could be any 3-letter code!
 
LVL 19

Expert Comment

by:Kim Ryan
ID: 12071965
Could you specify your data format more completely so I can define your problem more precisely? If you are looking for any character followed by 2 or 3 characters, we cannot filter out the first one consistently. For example, it would grab the V from VAL, but ALEU would be OK. We need to use a pattern based either on string data or on character position.
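
The ambiguity described above can be seen in a short sketch. It assumes a hypothetical pattern with an optional leading \w? followed by an unconstrained (\S+) capture; nobody in this thread proposed that exact pattern, it is only meant to show why an optional one-character prefix cannot be told apart from the first letter of the code.

```perl
use strict;
use warnings;

# Hypothetical pattern: optional one-character prefix, then "whatever is left".
my $re = qr/^ATOM\s+\d+\s+CA\s+\w?(\S+)\s+/;

# The two sample lines from the question.
my @lines = (
    'ATOM      2  CA  VAL A   3      17.591  48.101  25.416  1.00 27.93           C',
    'ATOM   1358  CA ALEU A 199      -3.698 -19.821 -32.696  0.50 21.71           C',
);

my @codes;
for my $line (@lines) {
    # \w? is greedy, so it swallows the V of VAL as well as the A of ALEU.
    push @codes, $1 if $line =~ $re;
}

print join(',', @codes), "\n";   # prints "AL,LEU"
```

The first line loses its V because nothing in the pattern says how long the remaining code has to be.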
 
LVL 84

Assisted Solution

by:ozo
ozo earned 125 total points
ID: 12076145
If you always want the last 3 letters of a 3- or 4-letter field, that would be:
/^ATOM\s+\d+\s+CA\s+\w?(\w\w\w)\s+/
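
As a quick check, the pattern above can be run against the two sample lines from the question. This is only a sketch: residue_code() is a hypothetical helper name introduced here for illustration, not something from the thread.

```perl
use strict;
use warnings;

# residue_code() is a hypothetical helper name.
# Returns the last three letters of the 3- or 4-letter field after "CA",
# or undef when the line does not match.
sub residue_code {
    my ($line) = @_;
    return $line =~ /^ATOM\s+\d+\s+CA\s+\w?(\w\w\w)\s+/ ? $1 : undef;
}

my @lines = (
    'ATOM      2  CA  VAL A   3      17.591  48.101  25.416  1.00 27.93           C',
    'ATOM   1358  CA ALEU A 199      -3.698 -19.821 -32.696  0.50 21.71           C',
);

for my $line (@lines) {
    my $code = residue_code($line);
    print defined $code ? "Match: $code\n" : "No match\n";
}
```

Because the capture is exactly three word characters, the optional \w? gives its character back when the field is only three letters long, so VAL is captured whole and the A of ALEU is dropped.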
 

Assisted Solution

by:joedundas
joedundas earned 125 total points
ID: 12089361
Hey Sarah,
When you are parsing PDB files, it is usually better to use substr(). The A in ALEU is in the Alternate Location Indicator column. There are many more instances where there won't be spaces between data columns.
The following link will give you the column information to use for substr():
http://www.rcsb.org/pdb/docs/format/pdbguide2.2/guide2.2_frame.html

Joe
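
A minimal sketch of the substr() approach, assuming the standard fixed-width ATOM record layout (1-based columns: atom name 13-16, alternate location indicator 17, residue name 18-20) and assuming the input lines keep their original column spacing. ca_residue() is a hypothetical helper name used only for illustration.

```perl
use strict;
use warnings;

# ca_residue() is a hypothetical helper name.
# Returns (residue name, alternate location indicator) for a CA atom line,
# or an empty list for any other line.
sub ca_residue {
    my ($line) = @_;
    return unless length($line) >= 20 && substr($line, 0, 6) eq 'ATOM  ';

    my $atom = substr($line, 12, 4);      # columns 13-16: atom name
    $atom =~ s/^\s+|\s+$//g;              # strip the padding spaces
    return unless $atom eq 'CA';

    my $alt_loc = substr($line, 16, 1);   # column 17: a space when unused
    my $residue = substr($line, 17, 3);   # columns 18-20: residue name
    return ($residue, $alt_loc);
}

my @lines = (
    'ATOM      2  CA  VAL A   3      17.591  48.101  25.416  1.00 27.93           C',
    'ATOM   1358  CA ALEU A 199      -3.698 -19.821 -32.696  0.50 21.71           C',
);

for my $line (@lines) {
    my ($residue, $alt_loc) = ca_residue($line);
    next unless defined $residue;
    print "Match: $residue (alt loc '$alt_loc')\n";
}
```

Reading by position keeps working when neighbouring columns run together, which is the case a whitespace-based pattern cannot handle.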