Solved

Sorting of multiple log files

Posted on 2002-04-16
Last Modified: 2010-03-31
Let's assume the case with two files.

File1.txt
911     Wed Oct 17 12:33:44 2001     2
913     Wed Oct 17 12:44:43 2001     4

File2.txt
912     Wed Oct 17 12:36:44 2001     3
914     Wed Oct 17 12:54:43 2001     5

Now I would first read these two files into a buffer and then run a sorting algorithm (preferably quicksort or anything faster) to sort their lines by timestamp. The sorted lines should be stored in the output file:

Output.txt
911     Wed Oct 17 12:33:44 2001     2
912     Wed Oct 17 12:36:44 2001     3
913     Wed Oct 17 12:44:43 2001     4
914     Wed Oct 17 12:54:43 2001     5

There can be a case where multiple files are fed to this Java class. How should I sort these files? Should I convert the date to some specific format inside the code itself for easier comparison?
Question by:alwayshunk
25 Comments
 
LVL 16

Expert Comment

by:heyhey_
ID: 6947067
> Should I convert the date in some specific format inside the code itself for easier comparison.

if you use timestamps like
20011017125443
it will be much easier to compare them
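
A minimal sketch of that conversion, assuming the log timestamps always look like "Wed Oct 17 12:54:43 2001" with English day and month names. The class and method names (SortableTimestamp, toSortable) are made up for illustration. Once every timestamp is in this form, plain string comparison gives chronological order.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;

public class SortableTimestamp {

    /** Converts "Wed Oct 17 12:54:43 2001" into "20011017125443". */
    public static String toSortable(String logTimestamp) {
        // HH is the 24-hour clock; Locale.US fixes the English day and month names.
        SimpleDateFormat in = new SimpleDateFormat("EEE MMM d HH:mm:ss yyyy", Locale.US);
        SimpleDateFormat out = new SimpleDateFormat("yyyyMMddHHmmss", Locale.US);
        try {
            Date parsed = in.parse(logTimestamp);
            return out.format(parsed);
        } catch (ParseException e) {
            throw new IllegalArgumentException("Unexpected timestamp: " + logTimestamp, e);
        }
    }

    public static void main(String[] args) {
        String a = toSortable("Wed Oct 17 12:33:44 2001");
        String b = toSortable("Wed Oct 17 12:54:43 2001");
        System.out.println(a);
        System.out.println(b);
        // A negative result means a is earlier than b.
        System.out.println(a.compareTo(b) < 0);
    }
}
```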
0
 
LVL 3

Expert Comment

by:saxaboo
ID: 6947086
The main issue here is to find a way not to code the quicksort, although you'll find it in virtually every computer science course ...
Java implements the mergesort algorithm for collections. All you have to do is make sure that the elements in your collection implement the Comparable interface.

The idea is :

import java.util.Date;

public class LogEntry implements Comparable<LogEntry>
{
     // getters/setters would be nicer ...
     // anyway you get the idea
     public int mID;
     public Date mDate;
     public int mSeverity;

    // Orders entries by timestamp, earliest first.
    public int compareTo(LogEntry pOtherEntry)
    {
         return mDate.compareTo(pOtherEntry.mDate);
    }
}


Now in your main program :
- read the files and build a java.util.List (using the ArrayList class, for instance) containing all your entries. Let's call it theList

Now sort it :
java.util.Collections.sort(theList);

=> your list is sorted ! Isn't oo-programming great ?
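
The "build the list, then sort it" step could look roughly like the sketch below. It is not saxaboo's code: the names LogSortSketch, Entry and sortLines are hypothetical, and it assumes every line has the whitespace-separated fields shown in the question (id, day name, month, day, time, year, last column). The lines are passed in as a list so the parsing and sorting can be tried without any files.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Date;
import java.util.List;
import java.util.Locale;

public class LogSortSketch {

    /** One parsed log line (hypothetical helper); ordered by timestamp only. */
    public static class Entry implements Comparable<Entry> {
        final Date date;
        final String line;

        Entry(Date date, String line) {
            this.date = date;
            this.line = line;
        }

        public int compareTo(Entry other) {
            return date.compareTo(other.date);
        }
    }

    /** Parses every line, sorts by timestamp, and returns the lines in that order. */
    public static List<String> sortLines(List<String> lines) {
        SimpleDateFormat fmt = new SimpleDateFormat("EEE MMM d HH:mm:ss yyyy", Locale.US);
        List<Entry> theList = new ArrayList<Entry>();
        for (String line : lines) {
            // Assumed fields: id, day name, month, day, time, year, last column.
            String[] f = line.trim().split("\\s+");
            String stamp = f[1] + " " + f[2] + " " + f[3] + " " + f[4] + " " + f[5];
            try {
                theList.add(new Entry(fmt.parse(stamp), line));
            } catch (ParseException e) {
                throw new IllegalArgumentException("Bad timestamp in: " + line, e);
            }
        }
        Collections.sort(theList); // uses Entry.compareTo
        List<String> sorted = new ArrayList<String>();
        for (Entry e : theList) {
            sorted.add(e.line);
        }
        return sorted;
    }

    public static void main(String[] args) {
        List<String> lines = new ArrayList<String>();
        lines.add("913     Wed Oct 17 12:44:43 2001     4");
        lines.add("912     Wed Oct 17 12:36:44 2001     3");
        lines.add("911     Wed Oct 17 12:33:44 2001     2");
        for (String line : sortLines(lines)) {
            System.out.println(line);
        }
    }
}
```

In a real program the input list would be filled by reading each file line by line with a BufferedReader.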

Hope this helps,

-S

0
 

Author Comment

by:alwayshunk
ID: 6947090
I'll convert the date to the format you suggested. But I still need to combine multiple files and sort the result in ascending order. Any ideas?
0
 

 
LVL 2

Expert Comment

by:CSuvendra
ID: 6947118
Use TreeMap. Here is an example. Please check the performance against your requirements. I can post the full code if something is not clear.

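/* Create TreeMap (keeps its keys in ascending order) */
TreeMap tmp = new TreeMap();

/* Add Values */
/* HH is the 24-hour clock used in these logs; hh is the 12-hour field */
SimpleDateFormat sdf = new SimpleDateFormat("EEE MMM d HH:mm:ss yyyy", Locale.US);

/* For each file, each Record */
/* s1 is input line from File(s) */
/* Assumes the sample layout: the 24-character timestamp starts at column 8 */
Date dt = sdf.parse(s1.substring(8, 32));
/* Note: a line whose timestamp is already in the map replaces the earlier line */
tmp.put(dt, s1);

/* Now write the collection to file */

Iterator i = tmp.values().iterator();
while(i.hasNext()) // Write to File here
    System.out.println(i.next());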

/* Here is the output */
911     Wed Oct 17 12:33:44 2001     2
912     Wed Oct 17 12:36:44 2001     3
913     Wed Oct 17 12:44:43 2001     4
914     Wed Oct 17 12:54:43 2001     5
0
 

Author Comment

by:alwayshunk
ID: 6947136
Suvendra your solution sounds interesting. I would appreciate if you can pass me the complete code starting from reading the file, putting it in buffer, changing the timestamp format and sorting it. Thanks in advance
0
 

Expert Comment

by:jodear
ID: 6947158
> Should I convert the date in some specific format inside the code itself for easier comparison.

You can convert each line of text in your log file into a Date object so that it would be sortable using saxaboo's idea above.

To convert your text to a Date object, use substring or a StringTokenizer on the line of text so that you get only the date-time part, as follows:

"Wed Oct 17 12:33:44 2001"

then use the Date.parse() method, which returns the time in milliseconds, as follows:

Date td = new Date(Date.parse("Wed Oct 17 12:33:44 2001"));

It's deprecated, but it still works better than the DateFormat.parse() method.  The date object td can now be inserted/appended to your list.  Do this for each line of text in your log files, then sort your list.
0
 
LVL 2

Expert Comment

by:CSuvendra
ID: 6947171
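/* Syntax: java Test File1.txt File2.txt */

import java.io.*;
import java.text.*;
import java.util.*;

public class Test {
    public static void main(String[] args) {

        // Key: parsed timestamp. Value: every line with that timestamp,
        // so lines that share a timestamp do not overwrite each other.
        TreeMap<Date, List<String>> tmp = new TreeMap<Date, List<String>>();
        // HH is the 24-hour clock used in these logs; hh is the 12-hour field.
        SimpleDateFormat sdf = new SimpleDateFormat("EEE MMM d HH:mm:ss yyyy", Locale.US);

        for (int j = 0; j < args.length; j++) {
            try {
                BufferedReader f1 = new BufferedReader(new FileReader(args[j]));
                try {
                    String s1;
                    while ((s1 = f1.readLine()) != null) {
                        // Assumes the sample layout: 24-character timestamp at column 8.
                        Date dt = sdf.parse(s1.substring(8, 32));
                        List<String> lines = tmp.get(dt);
                        if (lines == null) {
                            lines = new ArrayList<String>();
                            tmp.put(dt, lines);
                        }
                        lines.add(s1);
                    }
                } finally {
                    f1.close();
                }
            } catch (IOException e) {
                e.printStackTrace();
            } catch (ParseException ex) {
                ex.printStackTrace();
            }
        }

        // TreeMap iterates in ascending key order, so this prints oldest first.
        for (List<String> lines : tmp.values()) {
            for (String line : lines) {
                System.out.println(line);
            }
        }
    }
}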
0
 
LVL 2

Expert Comment

by:CSuvendra
ID: 6947200
alwayshunk
this question has moved into the 'Answered' zone because 'jodear' has proposed not a comment but an answer. Since the suggestions he mentioned were already proposed by me and/or 'saxaboo', I would urge you to reject the proposed answer.

jodear
Your answer did not seem to add any value to previous posts since you did not give any new ideas. See the 'Tips on Comments and Answers' at the bottom of this page if you are unsure of the meanings.
0
 
LVL 2

Expert Comment

by:Andrey_Kulik
ID: 6947265
Hi
you could merge the two log files, assuming the records in each log file are already in sorted order. Sorting takes much more time than merging.

Code:

java.io.BufferedReader r1 = new java.io.BufferedReader(new java.io.FileReader("d:/a.txt"));
java.io.BufferedReader r2 = new java.io.BufferedReader(new java.io.FileReader("d:/b.txt"));
java.io.Writer w = new java.io.FileWriter("d:/merged.txt");
try
{
     if (r1.ready() && r2.ready())
     {
          // HH is the 24-hour clock used in these logs; hh is the 12-hour field
          java.text.DateFormat dateParser = new java.text.SimpleDateFormat("EEE MMM d HH:mm:ss yyyy", java.util.Locale.US);

          // initialization    
          String s1 = r1.readLine() + "\n";
          java.util.Date timeStamp1 = dateParser.parse(s1.substring(s1.indexOf(" "), s1.lastIndexOf(" ")).trim());
         
          String s2 = null;
          java.util.Date timeStamp2 = null;
         
          boolean flag = true;
         
          while (true)
          {
               if (flag)
               {
                    if (!r2.ready())
                         break;
                    s2 = r2.readLine() + "\n";
                    timeStamp2 = dateParser.parse(s2.substring(s2.indexOf(" "), s2.lastIndexOf(" ")).trim());
               } else {
                    if (!r1.ready())
                         break;
                    s1 = r1.readLine() + "\n";
                    timeStamp1 = dateParser.parse(s1.substring(s1.indexOf(" "), s1.lastIndexOf(" ")).trim());
               }
               w.write((flag = timeStamp1.after(timeStamp2)) ? s2 : s1);
          }

          w.write((flag) ? s1 : s2);
     }

     while (r1.ready())
          w.write(r1.readLine() + "\n");
     while (r2.ready())
          w.write(r2.readLine() + "\n");
} finally {
     r1.close();
     r2.close();
     w.close();
}

0
 

Author Comment

by:alwayshunk
ID: 6947387
Your answer did not seem to add any value to previous posts since you did not give any new ideas
0
 

Author Comment

by:alwayshunk
ID: 6947394
Suvendra, it seems to work fine, but I have some performance issues. Can I merge the two files and then run a sort on the result? Any ideas? Moreover, if the same timestamp appears in both files, I need to sort those lines on the last column.
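
One way to express the "timestamp first, last column second" rule is a Comparator over the raw lines. This is only a sketch under assumptions: the names TieBreakSketch and LineComparator are invented here, the last column is taken to be an integer, and each line is assumed to have the seven whitespace-separated fields from the sample files.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.Date;
import java.util.List;
import java.util.Locale;

public class TieBreakSketch {

    /** Orders raw log lines by timestamp, then by the integer in the last column. */
    public static class LineComparator implements Comparator<String> {
        private final SimpleDateFormat fmt =
                new SimpleDateFormat("EEE MMM d HH:mm:ss yyyy", Locale.US);

        public int compare(String a, String b) {
            String[] fa = a.trim().split("\\s+");
            String[] fb = b.trim().split("\\s+");
            int byTime = timestamp(fa).compareTo(timestamp(fb));
            if (byTime != 0) {
                return byTime;
            }
            // Same timestamp: fall back to the last column (field index 6).
            return Integer.compare(Integer.parseInt(fa[6]), Integer.parseInt(fb[6]));
        }

        private Date timestamp(String[] f) {
            try {
                return fmt.parse(f[1] + " " + f[2] + " " + f[3] + " " + f[4] + " " + f[5]);
            } catch (ParseException e) {
                throw new IllegalArgumentException("Bad timestamp", e);
            }
        }
    }

    /** Returns a sorted copy; the input list is left untouched. */
    public static List<String> sort(List<String> lines) {
        List<String> copy = new ArrayList<String>(lines);
        Collections.sort(copy, new LineComparator());
        return copy;
    }

    public static void main(String[] args) {
        List<String> lines = new ArrayList<String>();
        lines.add("915     Wed Oct 17 12:54:43 2001     9");
        lines.add("914     Wed Oct 17 12:54:43 2001     5");
        lines.add("911     Wed Oct 17 12:33:44 2001     2");
        for (String line : sort(lines)) {
            System.out.println(line);
        }
    }
}
```

This version re-parses both lines on every comparison, which keeps the sketch short but costs time on large logs; parsing each line once into an object, as in the LogEntry idea earlier in the thread, avoids that.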
0
 
LVL 2

Expert Comment

by:Andrey_Kulik
ID: 6947427
alwayshunk, see my previous comment. My code merges two log files into one with time complexity O(n). Any sorter (for example TreeMap) has time complexity O(n log n).
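
The question also mentions more than two input files. A possible generalisation of the two-file merge is a k-way merge with a java.util.PriorityQueue, sketched below. This is not Andrey's code: KWayMergeSketch, Head and merge are hypothetical names, each input is assumed to be already sorted by timestamp, and lines are assumed to be non-blank with the sample layout. With k inputs and n lines in total the cost is O(n log k).

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.ArrayList;
import java.util.Date;
import java.util.List;
import java.util.Locale;
import java.util.PriorityQueue;

public class KWayMergeSketch {

    /** The current first unread line of one input, plus the reader it came from. */
    private static final class Head implements Comparable<Head> {
        final Date date;
        final String line;
        final BufferedReader source;

        Head(Date date, String line, BufferedReader source) {
            this.date = date;
            this.line = line;
            this.source = source;
        }

        public int compareTo(Head other) {
            return date.compareTo(other.date);
        }
    }

    /** Reads and parses the next line of a reader, or returns null at end of input. */
    private static Head next(BufferedReader r, SimpleDateFormat fmt) {
        try {
            String line = r.readLine();
            if (line == null) {
                return null;
            }
            String[] f = line.trim().split("\\s+");
            String stamp = f[1] + " " + f[2] + " " + f[3] + " " + f[4] + " " + f[5];
            return new Head(fmt.parse(stamp), line, r);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ParseException e) {
            throw new IllegalArgumentException("Bad timestamp", e);
        }
    }

    /** Merges inputs that are each already sorted by timestamp. */
    public static List<String> merge(List<BufferedReader> inputs) {
        SimpleDateFormat fmt = new SimpleDateFormat("EEE MMM d HH:mm:ss yyyy", Locale.US);
        PriorityQueue<Head> heap = new PriorityQueue<Head>();
        for (BufferedReader r : inputs) {
            Head h = next(r, fmt);
            if (h != null) {
                heap.add(h);
            }
        }
        List<String> merged = new ArrayList<String>();
        while (!heap.isEmpty()) {
            Head smallest = heap.poll();           // earliest pending line
            merged.add(smallest.line);
            Head refill = next(smallest.source, fmt); // pull the next line from the same input
            if (refill != null) {
                heap.add(refill);
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        List<BufferedReader> inputs = new ArrayList<BufferedReader>();
        inputs.add(new BufferedReader(new StringReader(
                "911 Wed Oct 17 12:33:44 2001 2\n913 Wed Oct 17 12:44:43 2001 4\n")));
        inputs.add(new BufferedReader(new StringReader(
                "912 Wed Oct 17 12:36:44 2001 3\n914 Wed Oct 17 12:54:43 2001 5\n")));
        for (String line : merge(inputs)) {
            System.out.println(line);
        }
    }
}
```

For real log files each BufferedReader would wrap a FileReader, and the merged lines would be written straight to a Writer instead of being collected in a list, so memory use stays proportional to the number of files rather than the number of lines.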
0
 
LVL 2

Expert Comment

by:Andrey_Kulik
ID: 6947435
You could try all the implementations on big log files and pick the best one :)

Best regards
0
 

Author Comment

by:alwayshunk
ID: 6947444
Andrey, I am new to Java. Can you pass me the code as a class file which I can compile and try out with two input files as parameters?
Thanks
0
 
LVL 9

Expert Comment

by:Venci75
ID: 6947473
Are the lines in each log file sorted?
0
 
LVL 2

Accepted Solution

by:
Andrey_Kulik earned 175 total points
ID: 6947474
java science.MergeLog logFile1 logFile2 mergedLog

package science;
/**
 * @author: Kulik Andrey
 */
public class MergeLog implements Cloneable {
private MergeLog() {
     super();
}
/**
 * Usage: logFile1 logFile2 mergedLogFile
 */
public static void main(java.lang.String[] args) throws java.io.IOException, java.text.ParseException {
     java.io.BufferedReader r1 = new java.io.BufferedReader(new java.io.FileReader(args[0]));
     java.io.BufferedReader r2 = new java.io.BufferedReader(new java.io.FileReader(args[1]));
     java.io.Writer w = new java.io.FileWriter(args[2]);
     try
     {
          if (r1.ready() && r2.ready())
          {
               // HH is the 24-hour clock used in these logs; hh is the 12-hour field
               java.text.DateFormat dateParser = new java.text.SimpleDateFormat("EEE MMM d HH:mm:ss yyyy", java.util.Locale.US);

               // initialization    
               String s1 = r1.readLine() + "\n";
               java.util.Date timeStamp1 = dateParser.parse(s1.substring(s1.indexOf(" "), s1.lastIndexOf(" ")).trim());
               
               String s2 = null;
               java.util.Date timeStamp2 = null;
               
               boolean flag = true;
               
               while (true)
               {
                    if (flag)
                    {
                         if (!r2.ready())
                              break;
                         s2 = r2.readLine() + "\n";
                         timeStamp2 = dateParser.parse(s2.substring(s2.indexOf(" "), s2.lastIndexOf(" ")).trim());
                    } else {
                         if (!r1.ready())
                              break;
                         s1 = r1.readLine() + "\n";
                         timeStamp1 = dateParser.parse(s1.substring(s1.indexOf(" "), s1.lastIndexOf(" ")).trim());
                    }
                    // if timeStamps are equals then sort on last column
                    if (timeStamp1.equals(timeStamp2))
                         flag = (Integer.parseInt(s1.substring(s1.lastIndexOf(" ")).trim()) > Integer.parseInt(s2.substring(s2.lastIndexOf(" ")).trim()));
                    else
                         flag = timeStamp1.after(timeStamp2);
                         
                    w.write((flag) ? s2 : s1);
               }

               w.write((flag) ? s1 : s2);
          }

          while (r1.ready())
               w.write(r1.readLine() + "\n");
          while (r2.ready())
               w.write(r2.readLine() + "\n");
     } finally {
          r1.close();
          r2.close();
          w.close();
     }
}
}
0
 

Author Comment

by:alwayshunk
ID: 6947558
Andrey
The sorting happens properly. But the output has some special characters if I open it in Notepad. Can you tell me how to remove them?
0
 

 
LVL 2

Expert Comment

by:Andrey_Kulik
ID: 6947645
What are the special characters? (hex codes)
0
 

Author Comment

by:alwayshunk
ID: 6947673
If I open it in Notepad, the lines do not start on new lines. They are just separated by a box-like character. If I open it in MS Word, it works fine: each new line starts on the next line. The problem seems to be caused by "\n". It happens only in Notepad.
0
 
LVL 2

Expert Comment

by:Andrey_Kulik
ID: 6947674
OK
I see :)

Please change the source:
1.

try
{
     String separator = System.getProperty("line.separator");
     if (r1.ready() && r2.ready())
....

2.
replace all '"\n"' strings with 'separator' variable
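
A small stand-alone sketch of the same fix, for anyone who wants to try it outside the merge program. Instead of appending the separator string by hand it uses BufferedWriter.newLine(), which writes the value of the line.separator system property. The class and method names (LineSeparatorSketch, writeLines) are invented for this example.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.util.Arrays;
import java.util.List;

public class LineSeparatorSketch {

    /** Writes each line followed by the platform line separator. */
    public static void writeLines(List<String> lines, Writer target) {
        try {
            BufferedWriter w = new BufferedWriter(target);
            for (String line : lines) {
                w.write(line);
                w.newLine(); // platform separator instead of a hard-coded "\n"
            }
            w.flush(); // push buffered text through to the target writer
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        StringWriter out = new StringWriter();
        writeLines(Arrays.asList(
                "911     Wed Oct 17 12:33:44 2001     2",
                "912     Wed Oct 17 12:36:44 2001     3"), out);
        System.out.print(out);
    }
}
```

In the merge program the target would be the FileWriter for the merged log rather than a StringWriter.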

Good luck
0
 

Author Comment

by:alwayshunk
ID: 6947686
Andrey

Bingo... I don't have extra points, otherwise I would surely have given you some.

Thanks a ton.
0
 
LVL 2

Expert Comment

by:Andrey_Kulik
ID: 6947696
:) not at all ...
0
