Solved

Sorting of multiple log files

Posted on 2002-04-16
25
194 Views
Last Modified: 2010-03-31
Lets assume the case with two files.

File1.txt
911     Wed Oct 17 12:33:44 2001     2
913     Wed Oct 17 12:44:43 2001     4

File2.txt
912     Wed Oct 17 12:36:44 2001     3
914     Wed Oct 17 12:54:43 2001     5

Now I would first read these two files in buffer and then like to run a sorting algorithm (preferably quicksort or anything faster than that) to sort the two files on basis of timestamp. The sorted information should be stored in the output file

Output.txt
911     Wed Oct 17 12:33:44 2001     2
912     Wed Oct 17 12:36:44 2001     3
913     Wed Oct 17 12:44:43 2001     4
914     Wed Oct 17 12:54:43 2001     5

There can be a case where multiple files are fed to this java class. How should I sort these files .Should I convert the date in some specific format inside the code itself for easier comparison.
0
Comment
Question by:alwayshunk
  • 11
  • 7
  • 3
  • +4
25 Comments
 
LVL 16

Expert Comment

by:heyhey_
ID: 6947067
> Should I convert the date in some specific format inside the code itself for easier comparison.

if you use timestamps like
20011017125443
it will be much easier to compare them
0
 
LVL 3

Expert Comment

by:saxaboo
ID: 6947086
The main issue here is to find a way not to code the quicksort, although you'll find it in virtually every computer science course ...
Java implements the mergesort algorithm for collections. All you have to do is make sure that the elements in your collection implement the Comparable interface.

The idea is :

public class LogEntry implements Comparable
{
     //getters/settters would be nicer ...
     //anyway you get the idea
     public int mID;
     public Date mDate;
     public int mSeverity;

    public int compareTo(Object pOtherEntry)
    {
         return  mDate.compareTo(pOtherEntry.mDate);
    }
}


Now in your main program :
- read the files and manage to build a java.util.List (using the arrayList class, for instance) containing all your entries. Let's call it theList

Now sort it :
java.util.Collections.sort(theList);

=> your list is sorted ! Isn't oo-programming great ?

Hope this helps,

-S

0
 

Author Comment

by:alwayshunk
ID: 6947090
I'll convert the date in thte format you have told. But still I need to sort multiple files and then sort it in ascending order. Any idea?
0
Announcing the Most Valuable Experts of 2016

MVEs are more concerned with the satisfaction of those they help than with the considerable points they can earn. They are the types of people you feel privileged to call colleagues. Join us in honoring this amazing group of Experts.

 

Author Comment

by:alwayshunk
ID: 6947095
I'll convert the date in thte format you have told. But still I need to sort multiple files and then sort it in ascending order. Any idea?
0
 

Author Comment

by:alwayshunk
ID: 6947103
I'll convert the date in thte format you have told. But still I need to sort multiple files and then sort it in ascending order. Any idea?
0
 
LVL 2

Expert Comment

by:CSuvendra
ID: 6947118
Use TreeMap. Here is an Example. Pls. check for performance with regards to your requirement. I can post the full code if you are not clear about something.

/* Create TreeMap */
TreeMap tmp = new TreeMap();

/* Add Values */
SimpleDateFormat sdf = new SimpleDateFormat("EEE MMM d hh:mm:ss y");

/* For each file, each Record */
/* s1 is input line from File(s) */
Date dt = sdf.parse(s1.substring(8, 31));
tmp.put(dt, s1);

/* Now write the collection to file */

Iterator i = ((Collection)tmp.values()).iterator();
while(i.hasNext()) // Write to File here
    System.out.println(""+i.next());

/* Here is the output */
911     Wed Oct 17 12:33:44 2001     2
912     Wed Oct 17 12:36:44 2001     3
913     Wed Oct 17 12:44:43 2001     4
914     Wed Oct 17 12:54:43 2001     5
0
 

Author Comment

by:alwayshunk
ID: 6947136
Suvendra your solution sounds interesting. I would appreciate if you can pass me the complete code starting from reading the file, putting it in buffer, changing the timestamp format and sorting it. Thanks in advance
0
 

Expert Comment

by:jodear
ID: 6947158
> Should I convert the date in some specific format inside the code itself for easier comparison.

You can convert each line of text in your log file into a Date object so that it would be sortable using saxaboo's idea above.

To convert your text to a Date object, substr or StringTokenize the line of text so that you only get the date-time part as follows:

"Wed Oct 17 12:33:44 2001"

then use the Date.parse() method as follows:

Date td = Date.parse("Wed Oct 17 12:33:44 2001")

Its deprecated though, but it still works better than the DateFormat.parse() method.  The date object td can now be inserted/appended to your list.  Do these for each line of text in your log files then sort your list.
0
 
LVL 2

Expert Comment

by:CSuvendra
ID: 6947171
/* Syntax: java Test File1.txt File2.txt */

**********************************************
import java.io.*;
import java.util.*;
import java.text.*;

public class Test {
    public static void main(String [] args) {

          TreeMap tmp = new TreeMap();
          BufferedReader f1;
          String s1 = new String();
          SimpleDateFormat sdf = new SimpleDateFormat("EEE MMM d hh:mm:ss y");

          for (int j=0; j<args.length; j++) {
               try {
                    f1 = new BufferedReader(new FileReader(args[j]));
                    while((s1=f1.readLine()) != null) {
                         Date dt = sdf.parse(s1.substring(8, 31));
                         tmp.put(dt, s1);
                    }

                    f1.close();

               } catch (IOException e) {
                    e.printStackTrace();
               } catch (ParseException ex) {
                    ex.printStackTrace();
               }
          }

          Iterator i = ((Collection)tmp.values()).iterator();

          while(i.hasNext()) {
               System.out.println(""+i.next());
          }
    }
}
0
 
LVL 2

Expert Comment

by:CSuvendra
ID: 6947200
alwayshunk
this question has moved into the 'Answered' zone because 'iodear' has proposed not a comment but an answer. Since the suggestions mentioned by him were already proposed by me and/ or 'saxaboo', I would urge you to Reject the proposed answer.

iodear
Your answer did not seem to add any value to previous posts since you did not give any new ideas. See the 'Tips on Comments and Answers' at the bottom of this page, if you are unsure of the meanings.
0
 
LVL 2

Expert Comment

by:Andrey_Kulik
ID: 6947265
Hi
you could merge two log files (all records in any log file are in sorted order). Sorting takes much more time then merging.

Code:

java.io.BufferedReader r1 = new java.io.BufferedReader(new java.io.FileReader("d:/a.txt"));
java.io.BufferedReader r2 = new java.io.BufferedReader(new java.io.FileReader("d:/b.txt"));
java.io.Writer w = new java.io.FileWriter("d:/merged.txt");
try
{
     if (r1.ready() && r2.ready())
     {
          java.text.DateFormat dateParser = new java.text.SimpleDateFormat("EEE MMMM d hh:mm:ss yyyy", java.util.Locale.US);

          // initialization    
          String s1 = r1.readLine() + "\n";
          java.util.Date timeStamp1 = dateParser.parse(s1.substring(s1.indexOf(" "), s1.lastIndexOf(" ")).trim());
         
          String s2 = null;
          java.util.Date timeStamp2 = null;
         
          boolean flag = true;
         
          while (true)
          {
               if (flag)
               {
                    if (!r2.ready())
                         break;
                    s2 = r2.readLine() + "\n";
                    timeStamp2 = dateParser.parse(s2.substring(s2.indexOf(" "), s2.lastIndexOf(" ")).trim());
               } else {
                    if (!r1.ready())
                         break;
                    s1 = r1.readLine() + "\n";
                    timeStamp1 = dateParser.parse(s1.substring(s1.indexOf(" "), s1.lastIndexOf(" ")).trim());
               }
               w.write((flag = timeStamp1.after(timeStamp2)) ? s2 : s1);
          }

          w.write((flag) ? s1 : s2);
     }

     while (r1.ready())
          w.write(r1.readLine() + "\n");
     while (r2.ready())
          w.write(r2.readLine() + "\n");
} finally {
     r1.close();
     r2.close();
     w.close();
}

0
 

Author Comment

by:alwayshunk
ID: 6947387
Your answer did not seem to add any value to previous posts since you did not give any new ideas
0
 

Author Comment

by:alwayshunk
ID: 6947394
Suvendra it seems to work fine but I have some performance issues. Can I merge  the two files and then try a sort on it. Any idea. Moreover if a case arises in which the timestamp is same in two files, I need to sort it on the last column.
0
 
LVL 2

Expert Comment

by:Andrey_Kulik
ID: 6947427
alwayshunk see my previous comment. My code merges two log files into one. Time complexity O(n). Any sorter(for example MapTree) have time complexity O(n*logn)
0
 
LVL 2

Expert Comment

by:Andrey_Kulik
ID: 6947435
You could try all implementation on big log files... for best choice :)

Best regards
0
 

Author Comment

by:alwayshunk
ID: 6947444
Andrey I am new to JAVA. Can u pass me the code as a class file which I can compile and try out with two input files as parameter.
Thanks
0
 
LVL 9

Expert Comment

by:Venci75
ID: 6947473
Are the lines in each log file sorted?
0
 
LVL 2

Accepted Solution

by:
Andrey_Kulik earned 175 total points
ID: 6947474
java science.MergeLog logFile1 logFile2 mergedLog

package science;
/**
 * @author: Kulik Andrey
 */
public class MergeLog implements Cloneable {
private MergeLog() {
     super();
}
/**
 * Usage: logFile1 logFile2 mergedLogFile
 */
public static void main(java.lang.String[] args) {
     java.io.BufferedReader r1 = new java.io.BufferedReader(new java.io.FileReader(args[0]));
     java.io.BufferedReader r2 = new java.io.BufferedReader(new java.io.FileReader(args[1]));
     java.io.Writer w = new java.io.FileWriter(args[2]);
     try
     {
          if (r1.ready() && r2.ready())
          {
               java.text.DateFormat dateParser = new java.text.SimpleDateFormat("EEE MMMM d hh:mm:ss yyyy", java.util.Locale.US);

               // initialization    
               String s1 = r1.readLine() + "\n";
               java.util.Date timeStamp1 = dateParser.parse(s1.substring(s1.indexOf(" "), s1.lastIndexOf(" ")).trim());
               
               String s2 = null;
               java.util.Date timeStamp2 = null;
               
               boolean flag = true;
               
               while (true)
               {
                    if (flag)
                    {
                         if (!r2.ready())
                              break;
                         s2 = r2.readLine() + "\n";
                         timeStamp2 = dateParser.parse(s2.substring(s2.indexOf(" "), s2.lastIndexOf(" ")).trim());
                    } else {
                         if (!r1.ready())
                              break;
                         s1 = r1.readLine() + "\n";
                         timeStamp1 = dateParser.parse(s1.substring(s1.indexOf(" "), s1.lastIndexOf(" ")).trim());
                    }
                    // if timeStamps are equals then sort on last column
                    if (timeStamp1.equals(timeStamp2))
                         flag = (Integer.parseInt(s1.substring(s1.lastIndexOf(" ")).trim()) > Integer.parseInt(s2.substring(s2.lastIndexOf(" ")).trim()));
                    else
                         flag = flag = timeStamp1.after(timeStamp2);
                         
                    w.write((flag) ? s2 : s1);
               }

               w.write((flag) ? s1 : s2);
          }

          while (r1.ready())
               w.write(r1.readLine() + "\n");
          while (r2.ready())
               w.write(r2.readLine() + "\n");
     } finally {
          r1.close();
          r2.close();
          w.close();
     }
}
0
 

Author Comment

by:alwayshunk
ID: 6947558
Andrey
The sorting happens properly. But the output has some special characters if I open it in notepad. Can u tell me how to remove them.
0
 

Author Comment

by:alwayshunk
ID: 6947606
Andrey
The sorting happens properly. But the output has some special characters if I open it in notepad. Can
u tell me how to remove them.
Thanks
0
 
LVL 2

Expert Comment

by:Andrey_Kulik
ID: 6947645
What the special characters ? (hex code)
0
 

Author Comment

by:alwayshunk
ID: 6947673
If I open it in Notepad the lines are not coming in new lines. They are just seperated by a box like character. If I open it in MS-Word, it works fine. Each new line comes in next line. The problem seems to be because of "\n". It happens in notepad.
0
 
LVL 2

Expert Comment

by:Andrey_Kulik
ID: 6947674
OK
I see :)

Please change the source:
1.

try
{
     String separator = System.getProperty("line.separator");
     if (r1.ready() && r2.ready())
....

2.
replace all '"\n"' strings with 'separator' variable

Good luck
0
 

Author Comment

by:alwayshunk
ID: 6947686
Andrey

Bingo... I dont have extra points otherwise I have surely given u some.

Thanks a ton.
0
 
LVL 2

Expert Comment

by:Andrey_Kulik
ID: 6947696
:) not at all ...
0

Featured Post

Courses: Start Training Online With Pros, Today

Brush up on the basics or master the advanced techniques required to earn essential industry certifications, with Courses. Enroll in a course and start learning today. Training topics range from Android App Dev to the Xen Virtualization Platform.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Suggested Solutions

Title # Comments Views Activity
Python Assistance 7 79
what is a "java.lang.System Property"   ? 20 65
JavaScript/Java - Changing an image background color 4 65
web services creation SOAP vs REST 5 37
Java contains several comparison operators (e.g., <, <=, >, >=, ==, !=) that allow you to compare primitive values. However, these operators cannot be used to compare the contents of objects. Interface Comparable is used to allow objects of a cl…
This was posted to the Netbeans forum a Feb, 2010 and I also sent it to Verisign. Who didn't help much in my struggles to get my application signed. ------------------------- Start The idea here is to target your cell phones with the correct…
Viewers will learn one way to get user input in Java. Introduce the Scanner object: Declare the variable that stores the user input: An example prompting the user for input: Methods you need to invoke in order to properly get  user input:
This video teaches viewers about errors in exception handling.

813 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

11 Experts available now in Live!

Get 1:1 Help Now