Solved

Reading in a text file and parsing it in Java

Posted on 2004-08-13
Last Modified: 2010-03-31

Hi

I have a large text file that I want to read into Java and parse. Can anyone tell me the best way of doing this?

Here is an example of one line of the file (each line has the same format).

There are 3 lines above it that are completely irrelevant and that I won't need either.

9406458572012631790140390464112    00000003500  textNotNeededhere          textnotneededhere   Joe Bloggs

I need to parse the numbers at the front into a database: the first 6 digits are one value, the next 8 are another, and so forth.

Any help would be great!!!
Thanks,
Suzy
Question by:fyness
 
LVL 86

Expert Comment

by:CEHJ
ID: 11791698
Use a StreamTokenizer. Here's an example; you only need to use the number bit:

http://javaalmanac.com/egs/java.io/ParseJava.html
 
LVL 86

Expert Comment

by:CEHJ
ID: 11791707
You need only this bit:

case StreamTokenizer.TT_NUMBER:
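
For readers who want to see that case in context, here is a minimal sketch of a loop that keeps only the numeric tokens. The class name NumberTokens, the method numbers and the sample input are made up for illustration. One caveat worth knowing: StreamTokenizer hands numbers back as a double in nval, so leading zeros are dropped and a very long digit run (like the 31-digit one in the question) would not survive intact.

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not from the linked example.
class NumberTokens {
    // Returns every numeric token in the text; word tokens are ignored.
    static List<Double> numbers(String text) {
        StreamTokenizer st = new StreamTokenizer(new StringReader(text));
        List<Double> result = new ArrayList<>();
        try {
            while (st.nextToken() != StreamTokenizer.TT_EOF) {
                switch (st.ttype) {
                    case StreamTokenizer.TT_NUMBER:
                        result.add(st.nval); // nval is a double
                        break;
                    default:
                        break; // words and other characters are skipped
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return result;
    }

    public static void main(String[] args) {
        // Made-up input, shorter than the real lines in the question.
        System.out.println(numbers("940645 00000003500 Joe Bloggs"));
    }
}
```

If the digits have to be kept exactly as written, reading them as text is probably the safer route.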
               
 
LVL 7

Accepted Solution

by:tomboshell (earned 250 total points)
ID: 11791761
Use BufferedReader, FileReader, and String.split (or a StringTokenizer)

like:

import java.io.BufferedReader;
import java.io.FileReader;

// FileReader and readLine() throw IOException, so the enclosing method must catch or declare it
BufferedReader br = new BufferedReader(new FileReader(yourTextFile));
String line = null;
int cntr = 0;
while ((line = br.readLine()) != null) {
  cntr++;
  if (cntr < 4) continue;  // skip the first three lines
  String[] items = line.split("\t");  // one array element per tab-separated field
  for (int i = 0; i < items.length; i++) {
    // now you have an array with each element in the array being one of the entries
    // this loop will go through the items, simply store the items based upon the value of 'i' with '0' being the first position
    // I don't know how you are storing the items so will leave that up to you...
  }
}
br.close();
 
LVL 86

Expert Comment

by:CEHJ
ID: 11791865
You'll find that the StreamTokenizer will give you better performance ;-)
 
LVL 7

Expert Comment

by:tomboshell
ID: 11791970
I have wondered about it. Every time I read the javadocs about StreamTokenizer I get the serious impression that it is ideally suited to parsing things like source code files and not really what I parse. But ya, I can see where it could be used. You would also have to track the line numbers to be able to skip the first three, and it provides the lineno() method for that. Then watch the tokens and positions, since it looks like most of the numbers on the lines are taken for storage, and all the text except the name is ignored. I would think that would mean a bit more checking of the values contained. But then if it had something like a 'tokenNumber' property or method then this would be no problem. But I am willing to learn :)

Could also use a StringTokenizer to parse the individual lines, but I kinda like the split method.  That way I work with arrays which make it easy to think of the data placed into table structures.
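
To make the comparison concrete, here is a small sketch of both approaches on one tab-separated line. The class LineFields, its two methods and the sample line are hypothetical. It also shows one behavioural difference: split keeps an empty field between two adjacent tabs, while StringTokenizer silently skips it, so positions can shift.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

// Hypothetical helper for comparing the two approaches.
class LineFields {
    // split: one array element per field, empty middle fields included.
    static String[] withSplit(String line) {
        return line.split("\t");
    }

    // StringTokenizer: consecutive delimiters are treated as one, so empty fields vanish.
    static List<String> withTokenizer(String line) {
        StringTokenizer st = new StringTokenizer(line, "\t");
        List<String> fields = new ArrayList<>();
        while (st.hasMoreTokens()) {
            fields.add(st.nextToken());
        }
        return fields;
    }

    public static void main(String[] args) {
        String line = "123\t\tJoe Bloggs"; // made-up line with an empty second field
        System.out.println(withSplit(line).length);      // number of fields from split
        System.out.println(withTokenizer(line).size());  // number of tokens from StringTokenizer
    }
}
```

With split the name stays at index 2 even though the middle field is empty, which is handy when fields are stored by position.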
 
LVL 86

Expert Comment

by:CEHJ
ID: 11791992
>>Everytime I read the javadocs about StreamTokenizer ...

Yes, I agree. The reason I recommended it in this case is that you can handily ignore everything except numbers.

 

Author Comment

by:fyness
ID: 11792751
Just one other thing on the code above: at the moment the array takes in each line of the file. How could I now split up the lines, i.e. take each array box and parse it?

Thanks
 
LVL 7

Expert Comment

by:tomboshell
ID: 11793057
It reads the file line by line in a loop until the end-of-file is reached. Each loop iteration breaks the line into an array, with each array element being one of the elements of that line. I was assuming that you were working with tab-separated files.
I would then assume that you were providing some easy way to set the values, like:

setColumnOne(items[0]);
setColumnTwo(items[1]);
// items 2 & 3 being not used.
setUser(items[4]);  


That way you can perform any special handling on the individual values as needed.  
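
For the leading run of digits, where the question says the first 6 are one value and the next 8 another, a substring-based sketch might look like the following. The class FixedWidth and the method slice are hypothetical, only the widths 6 and 8 come from the question (the remaining widths are not given), and the sketch assumes the fields in the sample are separated by runs of whitespace rather than single tabs. The pieces are kept as strings so leading zeros are not lost.

```java
// Hypothetical helper for cutting a digit run into fixed-width pieces.
class FixedWidth {
    // Returns consecutive substrings of the given widths, starting at position 0.
    static String[] slice(String digits, int... widths) {
        String[] parts = new String[widths.length];
        int pos = 0;
        for (int i = 0; i < widths.length; i++) {
            parts[i] = digits.substring(pos, pos + widths[i]);
            pos += widths[i];
        }
        return parts;
    }

    public static void main(String[] args) {
        String line = "9406458572012631790140390464112    00000003500  textNotNeededhere          textnotneededhere   Joe Bloggs";
        // Assumption: the first whitespace-delimited field is the digit run.
        String digits = line.split("\\s+")[0];
        String[] parts = slice(digits, 6, 8); // widths taken from the question
        System.out.println(parts[0] + " " + parts[1]);
    }
}
```

Each piece can then go to its own setter or database column, and be converted to a number only where that is really needed.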

Have a great weekend!