superfly18

Parsing Text into Tab Delimited File

I have some text data from a legacy system (an old mainframe) that I am trying to convert into
a tab-delimited file for import into a relational database.  The goal is to read in a file
of type *.DAT and output a tab-delimited *.TXT file.
I only have soft copies of the *.DAT files for input.  The data output is static,
and certain fields are identified by their line number, :#:.  I would like to parse these
files, but I am not sure how to go about it.  The file structure looks like this:
A5

A5543645674645646446
     :01:KI
:02:AMERA123456C897
:03:
:04:
:05:
:10:BIRDCAGE
:12:A50212USD1234,89
:74:LONG STRING HERE
:113:B
:245:123456XIX1234
-
A62354424334242423234
     :01:KI
:02:EURO123456C897
:03:
:04:
:05:
:06:
:10:BIRDCAGE
:12:A50212USD2345,89
:74:LONG STRING HERE
:113:B
:245:123456XIX1235
-

The values that I would like to import into SQL Server are only :02:, :10:, :12:, :74:,
:113:, and :245:. However, in some cases these values will be null, and
sometimes those lines and line numbers will not exist at all.  When parsing into
tab-delimited format, I want the output to look like this:


:02:     :10:     :12:     :74:     :113:     :245:
Value     Value     Value     Value     Value     Value
Value     Value     Null     Value     Value     Value

So each line number is a column, and the values for those columns are the strings
that follow those line numbers.  If the line number does not exist, or there is no data
after the line number, the value will be null.

As you can see, there are multiple records in one file, and there is not always
the same number of records in each file.

I have never done text parsing before, so I need a lot of help on this one. Any code,
suggestions, or pointers in the right direction will be most helpful.

Thank You!
aozarov

Here is half code / half pseudo code:

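// needs: import java.io.*; import java.util.*;

// Tags to extract, in output column order (LinkedHashSet keeps insertion order)
Set<String> itemsThatICare = new LinkedHashSet<String>();
itemsThatICare.add(":02:");
itemsThatICare.add(":10:");
// .. add the remaining tags (:12:, :74:, :113:, :245:) the same way

// tag -> values seen for that tag, in file order
Map<String, LinkedList<String>> values = new HashMap<String, LinkedList<String>>();

// read the file:
try {
    BufferedReader in = new BufferedReader(new FileReader("infilename"));
    String str;
    while ((str = in.readLine()) != null) {
        process(str, values, itemsThatICare);
    }
    in.close();
} catch (IOException e) {
    e.printStackTrace();
}

// write to new file:
try {
    BufferedWriter out = new BufferedWriter(new FileWriter("outfilename"));
    // write headers
    for (String item : itemsThatICare) {
        out.write(item);
        out.write("\t");
    }
    out.write("\n");

    // write values, one row per pass, until every list is drained.
    // Note: values are queued per tag, so if a tag is missing from one record,
    // later values for that tag move up a row.
    boolean more = true;
    while (more) {
        more = false;
        StringBuilder row = new StringBuilder();
        for (String item : itemsThatICare) {
            LinkedList<String> list = values.get(item);
            if (list == null || list.isEmpty()) {
                row.append("Null");
            } else {
                row.append(list.removeFirst());
                more = true;
            }
            row.append("\t");
        }
        if (more) {
            out.write(row.toString());
            out.write("\n");
        }
    }

    out.close();
} catch (IOException e) {
    e.printStackTrace();
}

// process logic:
static void process(String str, Map<String, LinkedList<String>> values, Set<String> itemsThatICare) {
    str = str.trim(); // some lines are indented, e.g. "     :01:KI"
    if (!str.startsWith(":"))
        return;

    int indexOfColon = str.indexOf(':', 1);
    if (indexOfColon <= 0)
        return;

    String token = str.substring(0, indexOfColon + 1);
    String value = str.substring(indexOfColon + 1);
    if (itemsThatICare.contains(token)) {
        LinkedList<String> list = values.get(token);
        if (list == null) {
            list = new LinkedList<String>();
            values.put(token, list);
        }
        list.add(value);
    }
}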
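
Because the approach above queues values per tag, a tag that is missing from one record can shift later values into the wrong row. A minimal record-aware sketch follows, assuming (from the sample in the question) that a line containing only "-" ends each record. The class and method names (DatRecordParser, toRows, toRow, join) are made up for illustration, and file reading is left out: the method takes the lines already read, for example with the BufferedReader loop shown earlier.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class DatRecordParser {
    // Columns to export, in output order (the tags listed in the question).
    static final List<String> TAGS =
            Arrays.asList(":02:", ":10:", ":12:", ":74:", ":113:", ":245:");

    /** Turns the lines of one .DAT file into tab-delimited rows, header row first. */
    static List<String> toRows(List<String> lines) {
        List<String> rows = new ArrayList<String>();
        rows.add(join(TAGS));
        Map<String, String> record = new LinkedHashMap<String, String>();
        boolean inRecord = false;
        for (String raw : lines) {
            String line = raw.trim();
            if (line.equals("-")) {
                // Assumption: "-" on its own line terminates the current record.
                if (inRecord) rows.add(toRow(record));
                record.clear();
                inRecord = false;
            } else if (line.startsWith(":")) {
                int close = line.indexOf(':', 1);
                if (close > 0) {
                    record.put(line.substring(0, close + 1),
                               line.substring(close + 1).trim());
                    inRecord = true;
                }
            }
            // Any other line (e.g. the "A55436..." record header) is ignored.
        }
        if (inRecord) rows.add(toRow(record)); // last record had no trailing "-"
        return rows;
    }

    /** Builds one output row; missing or empty fields become the text "Null". */
    static String toRow(Map<String, String> record) {
        List<String> cells = new ArrayList<String>();
        for (String tag : TAGS) {
            String value = record.get(tag);
            cells.add(value == null || value.isEmpty() ? "Null" : value);
        }
        return join(cells);
    }

    static String join(List<String> cells) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < cells.size(); i++) {
            if (i > 0) sb.append('\t');
            sb.append(cells.get(i));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        List<String> sample = Arrays.asList(
                "A5543645674645646446", "     :01:KI", ":02:AMERA123456C897", ":03:",
                ":10:BIRDCAGE", ":74:LONG STRING HERE", ":113:B", ":245:123456XIX1234", "-");
        for (String row : toRows(sample)) {
            System.out.println(row);
        }
    }
}
```

Keeping one map per record means each row is built only from that record's own lines, so a missing :12: shows up as Null in the row where it is actually missing.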
Is your intention to extract only the :02:     :10:     :12:     :74:     :113:     :245:
strings?
superfly18

ASKER

Yes, only those strings.
I've made a method for you. Try the following:

// define this globally in your class (needs import java.util.Vector;)

  Vector<String> requiredStringsArray = new Vector<String>();

// add the following method
  void getStr(String str)
  {
    // locate the opening and closing colon of the next ":NN:" tag
    int firstIndexOfColon = str.indexOf(':');
    if (firstIndexOfColon < 0)
      return; // no more tags in the string

    int lastIndexOfColon = str.indexOf(':', firstIndexOfColon + 1);
    if (lastIndexOfColon < 0)
      return; // unmatched colon, nothing left to extract

    requiredStringsArray.addElement(str.substring(firstIndexOfColon, lastIndexOfColon + 1));

    // recurse on whatever follows the tag just found
    String s = str.substring(lastIndexOfColon + 1);
    getStr(s.trim());
  }


// now call that method
    getStr(":02:     asdf:10:BFDF     343:12:adfs     :74:232     :113:     :245:");

// display the extracted strings
    for (int i = 0; i < requiredStringsArray.size(); i++)
      System.out.println(requiredStringsArray.elementAt(i));

// Tell me whether it works or not. I tested it here with the above string and it works fine.
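
The same tag extraction could also be done with a regular expression instead of walking the string by hand. This is only a sketch of that alternative, assuming a tag is always one or more digits between two colons (as in the sample data); TagExtractor and extractTags are illustrative names, not anything from the posts above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TagExtractor {
    // Assumption: a tag looks like ":02:" or ":113:", i.e. digits between two colons.
    private static final Pattern TAG = Pattern.compile(":\\d+:");

    /** Returns every tag found in the text, in the order it appears. */
    static List<String> extractTags(String text) {
        List<String> tags = new ArrayList<String>();
        Matcher m = TAG.matcher(text);
        while (m.find()) {
            tags.add(m.group());
        }
        return tags;
    }

    public static void main(String[] args) {
        String line = ":02:     asdf:10:BFDF     343:12:adfs     :74:232     :113:     :245:";
        for (String tag : extractTags(line)) {
            System.out.println(tag);
        }
    }
}
```

Unlike the hand-written scan, the pattern skips colons that are not part of a numeric tag, which may or may not be what you want for your data.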
superfly18, did you understand the half logic / half code I gave you above?
Still working on it. As soon as I get a chance to put it together I will post it.

Thanks!
As a side note, it was helpful to use | as the delimiter instead of \t when loading the data into a database.
superfly18, didn't you test my code?
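
For anyone switching delimiters the same way, here is a small sketch of keeping the delimiter as a parameter so that moving from a tab to | is a one-character change. DelimitedRow and its join method are hypothetical names, and the check that rejects cells containing the delimiter is an added assumption about how to keep columns from splitting on import.

```java
class DelimitedRow {
    /** Joins the cells of one output row with the given delimiter. */
    static String join(String[] cells, char delimiter) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < cells.length; i++) {
            // A cell that contains the delimiter would be read back as two columns.
            if (cells[i].indexOf(delimiter) >= 0) {
                throw new IllegalArgumentException(
                        "cell " + i + " contains the delimiter: " + cells[i]);
            }
            if (i > 0) sb.append(delimiter);
            sb.append(cells[i]);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String[] row = { "AMERA123456C897", "BIRDCAGE", "Null", "LONG STRING HERE" };
        System.out.println(join(row, '|'));  // pipe-delimited
        System.out.println(join(row, '\t')); // tab-delimited
    }
}
```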