kousis

find the index of the string

HEADER    TOPOISOMERASE                           03-FEB-97   1AB4              
TITLE     59KDA FRAGMENT OF GYRASE A FROM E. COLI                              
COMPND    MOL_ID: 1;                                                            
COMPND   2 MOLECULE: GYRASE A;                                                  
COMPND   3 CHAIN: NULL;                                                        
COMPND   4 FRAGMENT: 59KDA FRAGMENT;                                            
COMPND   5 EC: 5.99.1.3;                                                        
COMPND   6 ENGINEERED: YES;                                                    
COMPND   7 BIOLOGICAL_UNIT: DIMER                                              
SOURCE    MOL_ID: 1;                                                            
SOURCE   2 ORGANISM_SCIENTIFIC: ESCHERICHIA COLI;                              
SOURCE   3 EXPRESSION_SYSTEM: ESCHERICHIA COLI                                  
KEYWDS    TOPOISOMERASE II, GYRASE, SUPERCOILING DNA                            
EXPDTA    X-RAY DIFFRACTION                                                    
AUTHOR    J.H.M.CABRAL,A.MAXWELL,R.C.LIDDINGTON                                
REVDAT   1   14-OCT-98 1AB4    0                                                
JRNL        AUTH   J.H.CABRAL,A.P.JACKSON,C.V.SMITH,N.SHIKOTRA,                
JRNL        AUTH 2 A.MAXWELL,R.C.LIDDINGTON                                    
JRNL        TITL   CRYSTAL STRUCTURE OF THE BREAKAGE-REUNION DOMAIN OF          
JRNL        TITL 2 DNA GYRASE                                                  
ATOM      1  N   VAL    30      43.915  76.424  29.465  1.00100.00           N  
ATOM      2  CA  VAL    30      43.454  77.777  29.905  1.00100.00           C  
ATOM      3  C   VAL    30      44.406  78.396  30.932  1.00100.00           C  
ATOM      4  O   VAL    30      44.773  79.571  30.829  1.00100.00           O  
ATOM      5  CB  VAL    30      42.003  77.729  30.486  1.00100.00           C  
ATOM      6  CG1 VAL    30      40.998  77.426  29.379  1.00 85.18           C  
ATOM      7  CG2 VAL    30      41.890  76.676  31.589  1.00 85.18           C  

The above is my file. I want to print everything from "HEADER" up to "GYRASE" and save it to another file.
How do I find the index?
TIA
venkateshwarr


int end=str.indexOf("GYRASE");
System.out.println(str.substring(0,end));
kousis

ASKER

int end=str.indexOf("GYRASE");
System.out.println(str.substring(0,end));
I've already used this code.
It gives me this error:
StringIndexOutOfBoundsException
How do I fix it?
int end=str.indexOf("GYRASE");
if (end<0) System.out.println("no GYRASE");
else System.out.println(str.substring(0,end));
You may actually need to search for "ATOM " instead:

int end=str.indexOf("ATOM ");
if (end<0) System.out.println(str);
else System.out.println(str.substring(0,end)); // ATOM not included
If "ATOM " can also appear in the middle of a header line, search for it at the start of a line:
int end=str.indexOf("\nATOM");
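A minimal sketch of the whole-file approach above, assuming the file has already been read into one String with '\n' line endings. The class and method names (HeaderCut, headerOf) are made up for illustration. It also handles a file whose very first line is an ATOM record, which a search for "\nATOM" alone would miss.

```java
final class HeaderCut {
    /** Returns everything before the first line that starts with "ATOM". */
    static String headerOf(String text) {
        if (text.startsWith("ATOM")) {
            return "";                       // the text begins with an ATOM record
        }
        int end = text.indexOf("\nATOM");    // "ATOM" at the start of a later line
        if (end < 0) {
            return text;                     // no ATOM record: keep everything
        }
        return text.substring(0, end + 1);   // keep the newline that ends the header
    }

    public static void main(String[] args) {
        // "ATOMIC" inside the TITLE line must not end the header.
        String text = "TITLE     ATOMIC MODEL\nATOM      1  N   VAL\n";
        System.out.print(headerOf(text));
    }
}
```

A plain indexOf("ATOM") would have cut this sample in the middle of the TITLE line, which is the case the "\nATOM" search is meant to avoid.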
zzynx
>>  it's giving me error- Stringindexoutof boundsException
If the string ("GYRASE") is not found then indexOf() returns -1
Using -1 in substring() gives the error.

>> how to rectify it?
Check for the -1 value, as Webstorm suggested:

>>int end=str.indexOf("GYRASE");
>>if (end<0) System.out.println("no GYRASE");
>>else System.out.println(str.substring(0,end));
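To make the failure concrete, here is a small sketch (the class name IndexCheck and the helper prefixBefore are illustrative, not from the thread). indexOf returns -1 when the text is absent, and passing -1 as the end index to substring throws StringIndexOutOfBoundsException.

```java
final class IndexCheck {
    /** Returns the part of s before marker, or null when marker does not occur. */
    static String prefixBefore(String s, String marker) {
        int end = s.indexOf(marker);          // -1 when marker is absent
        return end < 0 ? null : s.substring(0, end);
    }

    public static void main(String[] args) {
        String line = "HEADER    TOPOISOMERASE";
        System.out.println(line.indexOf("GYRASE"));       // -1: not on this line
        try {
            line.substring(0, line.indexOf("GYRASE"));    // same as substring(0, -1)
        } catch (StringIndexOutOfBoundsException e) {
            System.out.println("caught: " + e.getClass().getSimpleName());
        }
        System.out.println(prefixBefore("COMPND   2 MOLECULE: GYRASE A;", "GYRASE"));
    }
}
```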
>> it's giving me error-
>> Stringindexoutof boundsException

It's thrown when you try to access a String outside its limits. You must be getting StringIndexOutOfBoundsException: -1

That is because 'end' would be -1, perhaps because:

>> int end=str.indexOf("GYRASE");

- I guess "GYRASE" is not found in the String. The exception can also be thrown if the String is of 0 length, because then str.substring (0, anything) will also be invalid.... since str.charAt ( 0 ) is also not defined.
Aw.... sorry about the -1 thing, zzynx, didn't refresh.
No problem mayankeagle
Refreshing seems to be a common problem for us all
;)
>> The exception can also be thrown if the String is of 0 length, because then str.substring (0, anything) will also be invalid....
Except for anything == 0: in substring(start, end) the end index is not included (length = end - start), so substring(0, 0) is valid even on an empty String.

kousis

ASKER

ATOM exists in my file, even though indexOf is returning -1. I don't know why. Is there any other way to get the index?
Post your code for reading the file.
>> ATOM exists in my file, eventhough it's returning -1
Impossible. There must be something wrong.
Can you post the code where you process your file (line by line I suppose)

>> is any other way to get the index?
I would search for the reason "why?" instead of using another method
(that will give the same result)
It's always good to understand why things happen.

Are you reading it line by line (I guess so)?

In that case, not every line (String) that you read will contain "ATOM" or whatever String you're searching for.
We're multiprocessing
;)
kousis

ASKER

Yes, I am reading it line by line. I'll try to change my code.
kousis

ASKER

import java.io.*;
import java.lang.*;
import java.lang.String;
import java.util.*;
import java.text.*;
public class Copy3 {
    public static void main(String[] args) throws IOException {
        File inputFile = new File("pro.rtf");
        FileReader in = new FileReader(inputFile);
        String s;
        BufferedReader br=new BufferedReader(in);
        while((s=br.read())!=null)
        {
            int end=s.indexOf("ATOM");
            if(end<0)
                System.out.println("No atom");
            else
                System.out.println(s.substring(0,end));
        }
    }
}
This is my code.
kousis

ASKER

Sorry, that was br.readLine() != null.
Use:

boolean flag = false ;

while((s=br.readLine())!=null)
{
      int end=s.indexOf("ATOM");
      if ( end >= 0 )
      {
        flag = true ;
        System.out.println(s.substring(0,end));
      }
}

if ( flag == false )
  System.out.println ( "No ATOM" ) ;

What you're doing is - you're reading each line. If the line does not contain "ATOM", you're printing "No ATOM". That is wrong. Every line need not contain "ATOM". You have to read till you get "ATOM" and if you don't get "ATOM" till the end of the file, you should declare that "ATOM" was not found. That's why I've used a Boolean flag.

I guess you could add a break statement to the if ( end >= 0 ) block in my code.
I'm wondering why we talk about "ATOM" while your initial question was about "GYRASE"?
kousis, could you lead us to the right answering path again?
Whatever it is, just the String in the indexOf () call needs to be changed. Perhaps there was a question from the same person before this one which had some similar thing to do with "ATOM".... maybe that's why Webstorm suggested it (and indeed, the questioner too started asking about "ATOM" only, instead of "GYRASE").

Try this:

      while((s=br.readLine())!=null)
      {
          if (s.startsWith("ATOM")) break;
          System.out.println(s);
      }
The code I posted before (using indexOf) works if you read the whole file into one big String.
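Combining the line-by-line loop above with the original goal of saving the header to another file, a hedged sketch might look like this. The class name HeaderCopy, the method copyHeader, and the default file names are assumptions for illustration. It also assumes ATOM records start in the first column, as in the sample file, and uses try-with-resources (Java 7+) so that both files are closed even on error.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.FileReader;
import java.io.FileWriter;
import java.io.IOException;

final class HeaderCopy {
    /**
     * Copies lines from inPath to outPath until a line starting with "ATOM".
     * Returns true if an ATOM line was found.
     */
    static boolean copyHeader(String inPath, String outPath) throws IOException {
        try (BufferedReader br = new BufferedReader(new FileReader(inPath));
             BufferedWriter bw = new BufferedWriter(new FileWriter(outPath))) {
            String s;
            while ((s = br.readLine()) != null) {   // readLine() returns null at end of file
                if (s.startsWith("ATOM")) {
                    return true;                    // stop at the first ATOM record
                }
                bw.write(s);
                bw.newLine();
            }
        }
        return false;                               // reached end of file without ATOM
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical default file names; pass your own as arguments.
        String in = args.length > 0 ? args[0] : "pro.txt";
        String out = args.length > 1 ? args[1] : "header.txt";
        if (!copyHeader(in, out)) {
            System.out.println("No ATOM");
        }
    }
}
```

Returning a boolean plays the role of the flag discussed earlier: "No ATOM" is reported once, after the whole file has been read, rather than once per line.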
kousis

ASKER

I have to print the text from "HEADER" up to the first "ATOM".
import java.lang.String;
import java.util.*;
import java.text.*;
public class Copy3 {
    public static void main(String[] args) throws IOException {
        File inputFile = new File("pro.rtf");
        FileReader in = new FileReader(inputFile);
        String s;
        int i;
        boolean flag=false;
        BufferedReader br=new BufferedReader(in);
        while((i=br.read())!=0)
        {
            s=Integer.toString(i);

            int end=s.indexOf("ATOM");
            if ( end >= 0 )
            {
                flag = true ;
                System.out.println(s.substring(0,end));
            }
        }
        if ( flag == false )
            System.out.println ( "No ATOM" ) ;
    }
}
I modified the program with this code, but it hangs.
>>Try this:
>>
>>      while((s=br.readLine())!=null)
>>      {
>>          if (s.startsWith("ATOM")) break;
>>          System.out.println(s);
>>      }

Oh, I see, you want to print out all header lines till you reach your first ATOM line.
It's always good to explain as much as possible...
Webstorm talked about startsWith("ATOM") instead of indexOf("ATOM")
>> while((i=br.read())!=0)

Read a line, not an int.

while ( ( s = br.readLine () ) != null )
Why do you do this?

s=Integer.toString(i);
>> s=Integer.toString(i);

It gives you the String equivalent of the integer you read, which does you no good. You are not reading an entire line: instead of "ATOM .... (followed by the whole line)", you are reading ints one by one.
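A small sketch of what read() actually returns, using an in-memory StringReader in place of the file (the class name ReadDemo and its helper are illustrative only). It also shows why a loop testing != 0 never ends: at the end of input read() returns -1, not 0, and keeps returning -1.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;

final class ReadDemo {
    /** Returns Integer.toString() of the first value read() yields from text. */
    static String firstReadAsString(String text) throws IOException {
        try (BufferedReader br = new BufferedReader(new StringReader(text))) {
            return Integer.toString(br.read());
        }
    }

    public static void main(String[] args) throws IOException {
        BufferedReader br = new BufferedReader(
                new StringReader("ATOM      1  N\nATOM      2  CA\n"));
        System.out.println(Integer.toString(br.read())); // 65, the code of 'A', not "A"
        System.out.println(br.readLine());               // rest of line 1: TOM      1  N
        System.out.println(br.readLine());               // ATOM      2  CA
        System.out.println(br.readLine());               // null: end of input for readLine()
        System.out.println(br.read());                   // -1: end of input for read(), never 0
    }
}
```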

Try:

while ( ( s = br.readLine () ) != null )
while  ( s = br.readLine () ) != null )
      {
      int end=s.indexOf("ATOM");
      if ( end >= 0 )
      {
        flag = true ;
        System.out.println(s.substring(0,end));
      }
      else
        System.out.println ( s ) ;

}  
import java.io.*;
import java.lang.String;
import java.util.*;
import java.text.*;
public class Copy3 {
    public static void main(String[] args) throws IOException {
        File inputFile = new File("pro.rtf");
        FileReader in = new FileReader(inputFile);
        String s;
        boolean flag=false;
        BufferedReader br=new BufferedReader(in);

        while((s=br.readLine())!=null)
        {
            int end=s.indexOf("ATOM");
            if ( end >= 0 )
            {
                flag = true ;
                System.out.println(s.substring(0,end));
            }
        }

        if ( flag == false )
            System.out.println ( "No ATOM" ) ;
    }
}
>> startsWith("ATOM")

What if there is a blank space before the "ATOM"? Then you will have to trim () it. I guess indexOf () would be better for all cases.
Multiprocessing continues ;-)

>> while  ( s = br.readLine () ) != null )

Missed an extra ( there. Refer to my previous comment, before that one.

while ( ( s = br.readLine () ) != null )

Just noticed that you are reading an RTF file. If it is a real RTF document, readLine () will return the raw RTF markup (control words, braces) rather than just the visible text, so you would need a different API to extract the text. Is it that you have only plain text in that file and just gave it an RTF extension, or have you created the file in WordPad or Word?
Ignore my previous comment. This is it:

import java.io.*;
import java.lang.String;
import java.util.*;
import java.text.*;
public class Copy3 {
    public static void main(String[] args) throws IOException {
        File inputFile = new File("pro.rtf");
        FileReader in = new FileReader(inputFile);
        String s;
        boolean flag=false;
        BufferedReader br=new BufferedReader(in);

        while((s=br.readLine())!=null)
        {
            boolean stop = s.startsWith("ATOM");
            if ( !stop )
                System.out.println(s);
        }
    }
}
>> Multiprocessing continues ;-)
kousis blown away?
;-)
ASKER CERTIFIED SOLUTION
Mayank S
Better not continue reading when not needed so added an "else break":

      while((s=br.readLine())!=null)
      {
            boolean stop = s.startsWith("ATOM");
            if ( !stop )
               System.out.println(s);
            else
               break;
      }
>> // I GUESS THAT IS WHAT HE WANTS TO DO - not sure          
>> System.out.println(s.substring ( 0, s.indexOf ("ATOM" ) );

Now again.... why call startsWith () and also call indexOf ()...., which becomes an extra method call? So I would prefer:

while((s=br.readLine())!=null)
      {
            end = s.indexOf ("ATOM");
            if ( end < 0 )
               System.out.println(s);
            else
            {
               stop = true ;
               System.out.println(s.substring ( 0, end ));
               break ;
            }
      }

Also rules out the case when there is a blank-space at the start. In the previous code:

>> System.out.println(s.substring ( 0, s.indexOf ("ATOM" ) );

Missed an extra ) at the end.
I think we're driving kousis mad, don't we?
LOL :)
>> so added an "else break":

Yeah, already done ;-)

Anyway, kousis is perhaps offline.
>> I think we're driving kousis mad, don't we?

Perhaps. We might freeze posting comments for a while, till he returns with feedback ;-)
>> startsWith () and also call indexOf ()....
Where do you find *both* in my comment?

But, as for the starting with a space you're right.
>> Where do you find *both* in my comment?

I was not talking about your comment ;-) I was optimizing my previous comment.


      while((s=br.readLine())!=null)
      {
            if (s.trim().startsWith("ATOM")) break;
            // or  if (s.indexOf("ATOM ")>=0) break; // but it will not work for JRNL   TITL 2 DNA NATOM for example
            System.out.println(s);
      }
>> // but it will not work for JRNL   TITL 2 DNA NATOM for example

Why? It will yield true. I guess that is the kind of situation the questioner was addressing when he used:

>> System.out.println(s.substring(0,end));

I guess he wanted to do that.
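To compare the three checks debated above side by side, here is a minimal sketch. The class name AtomChecks, its method names, and the sample lines (including the "NATOM" journal line) are hypothetical, chosen only to show where the checks disagree.

```java
final class AtomChecks {
    static boolean byStartsWith(String s)     { return s.startsWith("ATOM"); }
    static boolean byTrimStartsWith(String s) { return s.trim().startsWith("ATOM"); }
    static boolean byIndexOf(String s)        { return s.indexOf("ATOM ") >= 0; }

    public static void main(String[] args) {
        String[] lines = {
            "ATOM      1  N   VAL    30",          // normal ATOM record
            "  ATOM      1  N   VAL    30",        // leading blanks
            "JRNL        TITL 2 DNA NATOM STUDY"   // "ATOM " inside another word
        };
        for (String s : lines) {
            System.out.println(byStartsWith(s) + " " + byTrimStartsWith(s)
                    + " " + byIndexOf(s) + " | " + s);
        }
    }
}
```

Only trim().startsWith("ATOM") accepts the indented record while rejecting the journal line. indexOf("ATOM ") matches the journal line too, which is fine if the intent is to cut a line at "ATOM" but wrong if the intent is to detect the first ATOM record.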