Solved

Sorting of multiple log files

Posted on 2002-04-16
25
Medium Priority
217 Views
Last Modified: 2010-03-31
Let's assume the case with two files.

File1.txt
911     Wed Oct 17 12:33:44 2001     2
913     Wed Oct 17 12:44:43 2001     4

File2.txt
912     Wed Oct 17 12:36:44 2001     3
914     Wed Oct 17 12:54:43 2001     5

I would first read these two files into a buffer and then run a sorting algorithm (preferably quicksort or anything faster) to sort the combined entries by timestamp. The sorted information should be stored in the output file.

Output.txt
911     Wed Oct 17 12:33:44 2001     2
912     Wed Oct 17 12:36:44 2001     3
913     Wed Oct 17 12:44:43 2001     4
914     Wed Oct 17 12:54:43 2001     5

There can be a case where multiple files are fed to this Java class. How should I sort these files? Should I convert the date to some specific format inside the code itself for easier comparison?
Question by:alwayshunk
25 Comments
 
LVL 16

Expert Comment

by:heyhey_
ID: 6947067
> Should I convert the date in some specific format inside the code itself for easier comparison.

if you use timestamps like
20011017125443
it will be much easier to compare them
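
A minimal sketch of that conversion, assuming the timestamp text has already been cut out of the log line and that the log uses English day and month names. The class name SortableStamp and the method toSortable are made up for illustration. Because the result is fixed-width with the most significant field first, plain string comparison orders it chronologically.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;

// Hypothetical helper, not part of the original thread.
class SortableStamp {
    // Converts e.g. "Wed Oct 17 12:54:43 2001" into "20011017125443".
    static String toSortable(String logStamp) {
        // HH is the 24-hour clock; Locale.US is assumed for "Wed" / "Oct".
        SimpleDateFormat in = new SimpleDateFormat("EEE MMM d HH:mm:ss yyyy", Locale.US);
        SimpleDateFormat out = new SimpleDateFormat("yyyyMMddHHmmss", Locale.US);
        try {
            Date parsed = in.parse(logStamp);
            return out.format(parsed);
        } catch (ParseException e) {
            throw new IllegalArgumentException("Not a log timestamp: " + logStamp, e);
        }
    }
}
```

Two converted values can then be compared with String.compareTo() instead of parsing dates again.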
 
LVL 3

Expert Comment

by:saxaboo
ID: 6947086
The main issue here is to find a way not to code the quicksort, although you'll find it in virtually every computer science course ...
Java implements the mergesort algorithm for collections. All you have to do is make sure that the elements in your collection implement the Comparable interface.

The idea is :

import java.util.Date;

public class LogEntry implements Comparable<LogEntry>
{
     // getters/setters would be nicer ...
     // anyway you get the idea
     public int mID;
     public Date mDate;
     public int mSeverity;

     // Order entries chronologically by timestamp
     public int compareTo(LogEntry pOtherEntry)
     {
          return mDate.compareTo(pOtherEntry.mDate);
     }
}


Now in your main program :
- read the files and build a java.util.List (using the ArrayList class, for instance) containing all your entries. Let's call it theList

Now sort it :
java.util.Collections.sort(theList);

=> your list is sorted ! Isn't oo-programming great ?

Hope this helps,

-S
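
The approach above could be sketched end to end roughly as follows. This is only an illustration under assumptions: the class name LogLine and the helpers parseStamp and sortLines are invented here, the columns are assumed to be separated by whitespace, and the timestamp is assumed to use English names. Reading the files into the input list is left out.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Date;
import java.util.List;
import java.util.Locale;

// Hypothetical wrapper around one log line, ordered by its timestamp.
class LogLine implements Comparable<LogLine> {
    final Date stamp;
    final String text;

    LogLine(String text) {
        this.text = text;
        this.stamp = parseStamp(text);
    }

    static Date parseStamp(String line) {
        // Drop the leading id column; parse() reads up to the year and ignores what follows.
        String rest = line.trim().split("\\s+", 2)[1];
        try {
            return new SimpleDateFormat("EEE MMM d HH:mm:ss yyyy", Locale.US).parse(rest);
        } catch (ParseException e) {
            throw new IllegalArgumentException("Unparseable log line: " + line, e);
        }
    }

    public int compareTo(LogLine other) {
        return stamp.compareTo(other.stamp);
    }

    // Takes the lines of all files together and returns them in timestamp order.
    static List<String> sortLines(List<String> lines) {
        List<LogLine> entries = new ArrayList<LogLine>();
        for (String line : lines) {
            entries.add(new LogLine(line));
        }
        Collections.sort(entries); // stable sort, uses compareTo above
        List<String> sorted = new ArrayList<String>();
        for (LogLine entry : entries) {
            sorted.add(entry.text);
        }
        return sorted;
    }
}
```

Each line is parsed once when it is wrapped, so the sort itself only compares Date objects.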

 

Author Comment

by:alwayshunk
ID: 6947090
I'll convert the date to the format you suggested. But I still need to combine multiple files and sort the result in ascending order. Any idea?
 
LVL 2

Expert Comment

by:CSuvendra
ID: 6947118
Use TreeMap. Here is an Example. Pls. check for performance with regards to your requirement. I can post the full code if you are not clear about something.

/* Create TreeMap */
TreeMap tmp = new TreeMap();

/* Add Values */
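SimpleDateFormat sdf = new SimpleDateFormat("EEE MMM d HH:mm:ss yyyy", Locale.US); // HH = 24-hour clock

/* For each file, each Record */
/* s1 is input line from File(s) */
/* Skip the leading id column; parse() stops after the year and ignores the rest */
Date dt = sdf.parse(s1.trim().split("\\s+", 2)[1]);
/* Note: a record whose timestamp is already in the map replaces the earlier one */
tmp.put(dt, s1);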

/* Now write the collection to file */

Iterator i = ((Collection)tmp.values()).iterator();
while(i.hasNext()) // Write to File here
    System.out.println(""+i.next());

/* Here is the output */
911     Wed Oct 17 12:33:44 2001     2
912     Wed Oct 17 12:36:44 2001     3
913     Wed Oct 17 12:44:43 2001     4
914     Wed Oct 17 12:54:43 2001     5
 

Author Comment

by:alwayshunk
ID: 6947136
Suvendra, your solution sounds interesting. I would appreciate it if you could pass me the complete code, starting from reading the file, putting it in a buffer, changing the timestamp format, and sorting it. Thanks in advance.
 

Expert Comment

by:jodear
ID: 6947158
> Should I convert the date in some specific format inside the code itself for easier comparison.

You can convert each line of text in your log file into a Date object so that it would be sortable using saxaboo's idea above.

To convert your text to a Date object, use substring() or a StringTokenizer on the line of text so that you only get the date-time part as follows:

"Wed Oct 17 12:33:44 2001"

then use the Date.parse() method, which returns the time in milliseconds, and wrap the result in a Date as follows:

Date td = new Date(Date.parse("Wed Oct 17 12:33:44 2001"));

It's deprecated, but it still works better than the DateFormat.parse() method.  The date object td can now be inserted/appended to your list.  Do this for each line of text in your log files, then sort your list.
 
LVL 2

Expert Comment

by:CSuvendra
ID: 6947171
/* Syntax: java Test File1.txt File2.txt */

**********************************************
import java.io.*;
import java.util.*;
import java.text.*;

public class Test {
    public static void main(String [] args) {

          TreeMap tmp = new TreeMap();
          BufferedReader f1;
          String s1 = new String();
          SimpleDateFormat sdf = new SimpleDateFormat("EEE MMM d HH:mm:ss yyyy", Locale.US); // HH = 24-hour clock

          for (int j=0; j<args.length; j++) {
               try {
                    f1 = new BufferedReader(new FileReader(args[j]));
                    while((s1=f1.readLine()) != null) {
                         // Skip the leading id column; parse() ignores text after the year
                         Date dt = sdf.parse(s1.trim().split("\\s+", 2)[1]);
                         tmp.put(dt, s1);
                    }

                    f1.close();

               } catch (IOException e) {
                    e.printStackTrace();
               } catch (ParseException ex) {
                    ex.printStackTrace();
               }
          }

          Iterator i = ((Collection)tmp.values()).iterator();

          while(i.hasNext()) {
               System.out.println(""+i.next());
          }
    }
}
 
LVL 2

Expert Comment

by:CSuvendra
ID: 6947200
alwayshunk
this question has moved into the 'Answered' zone because 'iodear' has proposed not a comment but an answer. Since the suggestions mentioned by him were already proposed by me and/ or 'saxaboo', I would urge you to Reject the proposed answer.

iodear
Your answer did not seem to add any value to previous posts since you did not give any new ideas. See the 'Tips on Comments and Answers' at the bottom of this page, if you are unsure of the meanings.
 
LVL 2

Expert Comment

by:Andrey_Kulik
ID: 6947265
Hi
You could merge the two log files, provided the records within each log file are already in sorted order. Sorting takes much more time than merging.

Code:

java.io.BufferedReader r1 = new java.io.BufferedReader(new java.io.FileReader("d:/a.txt"));
java.io.BufferedReader r2 = new java.io.BufferedReader(new java.io.FileReader("d:/b.txt"));
java.io.Writer w = new java.io.FileWriter("d:/merged.txt");
try
{
     if (r1.ready() && r2.ready())
     {
          java.text.DateFormat dateParser = new java.text.SimpleDateFormat("EEE MMM d HH:mm:ss yyyy", java.util.Locale.US); // HH = 24-hour clock

          // initialization    
          String s1 = r1.readLine() + "\n";
          java.util.Date timeStamp1 = dateParser.parse(s1.substring(s1.indexOf(" "), s1.lastIndexOf(" ")).trim());
         
          String s2 = null;
          java.util.Date timeStamp2 = null;
         
          boolean flag = true;
         
          while (true)
          {
               if (flag)
               {
                    if (!r2.ready())
                         break;
                    s2 = r2.readLine() + "\n";
                    timeStamp2 = dateParser.parse(s2.substring(s2.indexOf(" "), s2.lastIndexOf(" ")).trim());
               } else {
                    if (!r1.ready())
                         break;
                    s1 = r1.readLine() + "\n";
                    timeStamp1 = dateParser.parse(s1.substring(s1.indexOf(" "), s1.lastIndexOf(" ")).trim());
               }
               w.write((flag = timeStamp1.after(timeStamp2)) ? s2 : s1);
          }

          w.write((flag) ? s1 : s2);
     }

     while (r1.ready())
          w.write(r1.readLine() + "\n");
     while (r2.ready())
          w.write(r2.readLine() + "\n");
} finally {
     r1.close();
     r2.close();
     w.close();
}

 

Author Comment

by:alwayshunk
ID: 6947387
Your answer did not seem to add any value to previous posts since you did not give any new ideas
 

Author Comment

by:alwayshunk
ID: 6947394
Suvendra, it seems to work fine, but I have some performance issues. Can I merge the two files and then run a sort on the result? Any ideas? Moreover, if the timestamp is the same in two files, I need to sort on the last column.
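
One way the tie-break on the last column could look, as a hedged sketch: it sorts a list rather than using a TreeMap keyed by Date, since a map keeps only one record per timestamp. The class name TieBreakSort and its helpers are invented for illustration, and the last column is assumed to be an integer.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.Date;
import java.util.List;
import java.util.Locale;

// Hypothetical example: order by timestamp, then by the last column.
class TieBreakSort {
    static Date stampOf(String line) {
        // Drop the id column; parse() reads up to the year and ignores the rest.
        String rest = line.trim().split("\\s+", 2)[1];
        try {
            return new SimpleDateFormat("EEE MMM d HH:mm:ss yyyy", Locale.US).parse(rest);
        } catch (ParseException e) {
            throw new IllegalArgumentException("Unparseable log line: " + line, e);
        }
    }

    static int lastColumn(String line) {
        String[] fields = line.trim().split("\\s+");
        return Integer.parseInt(fields[fields.length - 1]);
    }

    static final Comparator<String> BY_STAMP_THEN_LAST = new Comparator<String>() {
        public int compare(String a, String b) {
            int byStamp = stampOf(a).compareTo(stampOf(b));
            if (byStamp != 0) {
                return byStamp;
            }
            // Same timestamp: fall back to the numeric last column.
            return Integer.compare(lastColumn(a), lastColumn(b));
        }
    };

    static List<String> sorted(List<String> lines) {
        List<String> copy = new ArrayList<String>(lines);
        Collections.sort(copy, BY_STAMP_THEN_LAST);
        return copy;
    }
}
```

This version re-parses a line on every comparison, which keeps the sketch short but is slow on big logs; parsing each line once up front would be the obvious refinement.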
 
LVL 2

Expert Comment

by:Andrey_Kulik
ID: 6947427
alwayshunk, see my previous comment. My code merges two log files into one with time complexity O(n). Any sorter (for example TreeMap) has time complexity O(n log n).
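
Since the question mentions more than two files, the same merge idea can be extended with a priority queue that holds the next unread line of each file. The following is a rough sketch, not the code posted in this thread: the class KWayLogMerge and its members are made-up names, every input is assumed to be sorted already, and lines with equal timestamps come out in no particular order.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.List;
import java.util.Locale;
import java.util.PriorityQueue;

// Hypothetical k-way merge of already sorted log files.
class KWayLogMerge {
    // The next unread line of one input, ordered by its timestamp.
    private static final class Head implements Comparable<Head> {
        final Date stamp;
        final String line;
        final BufferedReader source;

        Head(Date stamp, String line, BufferedReader source) {
            this.stamp = stamp;
            this.line = line;
            this.source = source;
        }

        public int compareTo(Head other) {
            return stamp.compareTo(other.stamp);
        }
    }

    static Date stampOf(String line) {
        String rest = line.trim().split("\\s+", 2)[1]; // drop the id column
        try {
            return new SimpleDateFormat("EEE MMM d HH:mm:ss yyyy", Locale.US).parse(rest);
        } catch (ParseException e) {
            throw new IllegalArgumentException("Unparseable log line: " + line, e);
        }
    }

    // Writes all lines of all inputs to out in timestamp order.
    static void merge(List<BufferedReader> inputs, Writer out) {
        try {
            PriorityQueue<Head> heap = new PriorityQueue<Head>();
            for (BufferedReader in : inputs) {
                push(heap, in);
            }
            String eol = System.lineSeparator();
            while (!heap.isEmpty()) {
                Head smallest = heap.poll();
                out.write(smallest.line);
                out.write(eol);
                push(heap, smallest.source); // refill from the same file
            }
            out.flush();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    private static void push(PriorityQueue<Head> heap, BufferedReader in) throws IOException {
        String line = in.readLine();
        while (line != null && line.trim().isEmpty()) {
            line = in.readLine(); // skip blank lines
        }
        if (line != null) {
            heap.add(new Head(stampOf(line), line, in));
        }
    }
}
```

With k files and n lines in total this does O(n log k) work and keeps only one line per file in memory. The caller would wrap each file in a BufferedReader and close the readers and the writer afterwards.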
 
LVL 2

Expert Comment

by:Andrey_Kulik
ID: 6947435
You could try all the implementations on big log files and pick the best one :)

Best regards
 

Author Comment

by:alwayshunk
ID: 6947444
Andrey, I am new to Java. Can you pass me the code as a class file which I can compile and try out with two input files as parameters?
Thanks
 
LVL 9

Expert Comment

by:Venci75
ID: 6947473
Are the lines in each log file sorted?
 
LVL 2

Accepted Solution

by:
Andrey_Kulik earned 700 total points
ID: 6947474
java science.MergeLog logFile1 logFile2 mergedLog

package science;
/**
 * @author: Kulik Andrey
 */
public class MergeLog implements Cloneable {
private MergeLog() {
     super();
}
/**
 * Usage: logFile1 logFile2 mergedLogFile
 */
public static void main(java.lang.String[] args) throws java.io.IOException, java.text.ParseException {
     java.io.BufferedReader r1 = new java.io.BufferedReader(new java.io.FileReader(args[0]));
     java.io.BufferedReader r2 = new java.io.BufferedReader(new java.io.FileReader(args[1]));
     java.io.Writer w = new java.io.FileWriter(args[2]);
     try
     {
          if (r1.ready() && r2.ready())
          {
               java.text.DateFormat dateParser = new java.text.SimpleDateFormat("EEE MMM d HH:mm:ss yyyy", java.util.Locale.US); // HH = 24-hour clock

               // initialization    
               String s1 = r1.readLine() + "\n";
               java.util.Date timeStamp1 = dateParser.parse(s1.substring(s1.indexOf(" "), s1.lastIndexOf(" ")).trim());
               
               String s2 = null;
               java.util.Date timeStamp2 = null;
               
               boolean flag = true;
               
               while (true)
               {
                    if (flag)
                    {
                         if (!r2.ready())
                              break;
                         s2 = r2.readLine() + "\n";
                         timeStamp2 = dateParser.parse(s2.substring(s2.indexOf(" "), s2.lastIndexOf(" ")).trim());
                    } else {
                         if (!r1.ready())
                              break;
                         s1 = r1.readLine() + "\n";
                         timeStamp1 = dateParser.parse(s1.substring(s1.indexOf(" "), s1.lastIndexOf(" ")).trim());
                    }
                    // if timeStamps are equals then sort on last column
                    if (timeStamp1.equals(timeStamp2))
                         flag = (Integer.parseInt(s1.substring(s1.lastIndexOf(" ")).trim()) > Integer.parseInt(s2.substring(s2.lastIndexOf(" ")).trim()));
                    else
                         flag = timeStamp1.after(timeStamp2);
                         
                    w.write((flag) ? s2 : s1);
               }

               w.write((flag) ? s1 : s2);
          }

          while (r1.ready())
               w.write(r1.readLine() + "\n");
          while (r2.ready())
               w.write(r2.readLine() + "\n");
     } finally {
          r1.close();
          r2.close();
          w.close();
     }
}
} // end of class MergeLog
 

Author Comment

by:alwayshunk
ID: 6947558
Andrey
The sorting happens properly. But the output has some special characters if I open it in Notepad. Can you tell me how to remove them?
 
LVL 2

Expert Comment

by:Andrey_Kulik
ID: 6947645
What are the special characters? (hex code)
 

Author Comment

by:alwayshunk
ID: 6947673
If I open it in Notepad, the lines do not start on new lines. They are just separated by a box-like character. If I open it in MS Word, it works fine: each line starts on a new line. The problem seems to be caused by "\n" and shows up only in Notepad.
 
LVL 2

Expert Comment

by:Andrey_Kulik
ID: 6947674
OK
I see :)

Please change the source:
1.

try
{
     String separator = System.getProperty("line.separator");
     if (r1.ready() && r2.ready())
....

2.
replace all '"\n"' strings with 'separator' variable

Good luck
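
An alternative to passing the separator around, sketched under assumptions: wrapping the output in a BufferedWriter and calling newLine() writes the platform line separator after each record. The class LineEndings and the method writeLines are names invented for this illustration.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.io.Writer;

// Hypothetical helper that writes records with platform line endings.
class LineEndings {
    static void writeLines(Writer target, Iterable<String> lines) {
        try {
            BufferedWriter w = new BufferedWriter(target);
            for (String line : lines) {
                w.write(line);
                w.newLine(); // "\r\n" on Windows, "\n" on Unix-like systems
            }
            w.flush(); // the caller still owns and closes the target
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

On Windows this produces the line endings that Notepad of that era expected, which is the same effect as the line.separator change described above.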
 

Author Comment

by:alwayshunk
ID: 6947686
Andrey

Bingo... I don't have extra points, otherwise I would surely have given you some.

Thanks a ton.
 
LVL 2

Expert Comment

by:Andrey_Kulik
ID: 6947696
:) not at all ...