Want to protect your cyber security and still get fast solutions? Ask a secure question today.Go Premium

x
  • Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 399
  • Last Modified:

Syslog text file into database or into .CSV

I have a large Syslogchatchall text file and I want to convert it into database, I tried to open the file using Excel but all the information in a one log becomes one column. Is someone help me how I change that file into column based file. Thanks.
syslog.txt
0
Mohamed Mohamud
Asked:
Mohamed Mohamud
  • 8
  • 5
  • 3
  • +2
1 Solution
 
Haris DjulicCommented:
Hello,

you need to use the Excel import option from text and then select fixed widht and there you using the separators set the length of the column and after you finish process you will have the data in multiple columns..

I attached screenshots of the process and result.
Capture1.PNG
Capture.PNG
0
 
alicainCommented:
Hi,

This looks to be a TAB seperated file, there are two methods that you could use :

1. Open a bland spreadsheet and then import the data.  From the Data ribbon -> From File.... browse to the file ... Delimited -> select "Tab".

2. With the data opened in a spreadsheet an in one column, use the "Text to column" feature in Excel.

Regards,
Alastair.
0
 
Mohamed MohamudAuthor Commented:
Haris Djulic and alicain thanks for your help, but the problem is like  proto=17, policy_id=52 and others are in the same column is there a way I can separate or write into database? Thanks again.
0
Industry Leaders: We Want Your Opinion!

We value your feedback.

Take our survey and automatically be enter to win anyone of the following:
Yeti Cooler, Amazon eGift Card, and Movie eGift Card!

 
Haris DjulicCommented:
Can you post sample file and desired output?
0
 
alicainCommented:
If you specifyig both TAB and SPACE a delimiters (a tick against each on the import dialog) then Excel also places proto=17 and  policy_id=52 into seperate columns.

Have you come across LogParser?  It might be useful for what you are looking at doing.  There's an overhead in getting up to speed, but if you're familiar with SQL commands, that will help :
   http://technet.microsoft.com/en-gb/scriptcenter/dd919274.aspx


Regards,
Alastair.
0
 
Mohamed MohamudAuthor Commented:
Dear Experts, my problem still not solved, my file is very big and excel cant handle the file it exceeds the limit and I want to import it to SQL DB so I can filter and run some quires on it, but the first 4 columns are separated with tab and  there is a long message witch also some times different depending whether it is notice or warring as the attached sample shows and I can't handle that part.
Is there also other way I can run quires as file remains text file with out importing into DB?
0
 
Haris DjulicCommented:
Hello,

you can use SQL Server import wizard where you can choose to import from Text file and select that it is tab delimited and you can write query to narrow the number of imported rows to database..
0
 
FishMongerCommented:
Your comment on the "long message" indicates to me that we probably can't simply separate the fields based solely on the whitespace (tabs and spaces).

Please attach a new sample file which shows how/where you want the field separation.

You've tagged several different languages so I'm assuming that you know each of them.  In my case I'd use Perl to parse the log and insert the data.
0
 
Mohamed MohamudAuthor Commented:
FishMonger I attached the file, but I appreciate SQL if possible.
Sample-file.rtf
sample.txt
0
 
FishMongerCommented:
Your log file is not in the proper format to only use an sql statement to import it into the database fields you wish.  It will need to be parsed and either converted to csv for the sql import statement or the parsing script can load it without the need of a csv file.

How big is the file in both the number of lines and bytes (MB or GB)?
0
 
Mohamed MohamudAuthor Commented:
It is 500MB, I can't open it but it is more than 100000 lines
0
 
FishMongerCommented:
Here's a sample perl script that will parse your log and generate a csv file which can be imported into the db via an sql statement.

#!/usr/bin/perl

use strict;
use warnings;

open my $in_fh, '<', 'syslog.txt' or die "failed to open 'syslog.txt' <$!>";
open my $out_fh, '>', 'syslog.csv' or die "failed to open 'syslog.csv' <$!>";

while (my $line = <$in_fh>) {
    chomp $line;
    if ($line =~ /^(\S+) \s+
                   (\S+) \s+
                   (\S+) \s+
                   (\S+) \s+
                   (\w+:\s\S+) \s+
                   (.+?) \s+
                   (.+) \s+
                   (reason.+)$
                 /xi) {
        print {$out_fh} join(',', $1,$2,$3,$4,$5,$6,$7,$8) . "\n";
    }
}
close $in_fh;
close $out_fh;

Open in new window


If some of the lines already contain a comma within any of the fields, then we would need to make a small adjustment to make sure they are escaped.

If you want, adding the db code to the script would be very easy.
0
 
Mohamed MohamudAuthor Commented:
FishMonger;
Almost done, there is only one thing some fields and their content may become one column like these

 fuhfw: NetScreen,  device_id=fuhfw, reason=Creation<000>

fuhfw, device_id, reason are column names

also
SQL is not accepting this as column content:
,[Root]system-notification-00257(traffic): start_time="2014-10-21 00:10:26" duration=0 policy_id=52 service=dns proto=17 src zone=Trust dst zone=Untrust action=Permit sent=0 rcvd=0 src=10.8.114.161 dst=10.1.1.8 src_port=65128 dst_port=53 src-xlated ip=10.8.114.161 port=65128 dst-xlated ip=10.1.1.8 port=53 session_id=234313,
result.rtf
0
 
FishMongerCommented:
I suspected that the db import would fail due to the "loose" csv formatting.

I've added the use of the Text::CSV_XS module to fix the quoting of the fields and escaping of those quote marks within the field.

I also adjusted the regex to strip out those field names you mentiond.

#!/usr/bin/perl

use strict;
use warnings;
use Text::CSV_XS;

my $csv = Text::CSV_XS->new ({
                              binary         => 1,
                              eol            => $/,
                              always_quote   => 1,
                              escape_char    => "\\",
                              auto_diag      => 1
                            });

open my $in_fh, '<', 'syslog.txt' or die "failed to open 'syslog.txt' <$!>";
open my $out_fh, '>', 'syslog.csv' or die "failed to open 'syslog.csv' <$!>";

while (my $line = <$in_fh>) {

    if ($line =~ /^(\S+) \s+
                   (\S+) \s+
                   (\S+) \s+
                   (\S+) \s+
                   fuhfw:\s (\S+) \s+
                   device_id= (.+?) \s+
                   (.+) \s+
                   reason[:=] \s* (.+)
                 $/xi) {
        $csv->print( $out_fh, [ $1,$2,$3,$4,$5,$6,$7,$8 ] );
    }
}

close $in_fh;
close $out_fh;

Open in new window


If you need to post more samples, please use plain text files instead of the rtf file.  The rtf file makes it less clear on what the rows look like.
0
 
Mohamed MohamudAuthor Commented:
Dear FishMonger;
One last thing, is it possible to create different column within different lines I as showed in the sample, I tried but it escapes the line that has less columns. Because I didn't get the information I need I want to split one line into more columns.
#!/usr/bin/perl

use strict;
use warnings;
use Text::CSV_XS;



open my $in_fh, '<', 'c:\tijabo\am.txt' or die "failed to open 'c:\tijabo\am.txt' <$!>";
open my $out_fh, '>', 'c:\tijabo\am.csv.' or die "failed to open 'c:\tijabo\am.csv' <$!>";

while (my $line = <$in_fh>) {

    if ($line =~ /^(\S+) \s+
                   (\S+) \s+
                   (\S+) \s+
                   (\S+) \s+
                   fuhfw:\s (\S+) \s+
                   device_id= (.+?) \s+
                   duration= (.+?) \s+
                   policy_id= (.+?) \s+
                   service= (.+?) \s+
                   proto= (.+?) \s+                 
                   action= (.+?) \s+
                   sent= (.+?) \s+
                   rcvd= (.+?) \s+
                   src= (.+?) \s+
                   dst= (.+?) \s+
                   src_port= (.+?) \s+
                   dst_port= (.+?) \s+
                  
                   port= (.+?) \s+
                 
                   port= (.+?) \s+
                   session_id= (.+?) \s+
                  
                   reason[:=] \s* (.+)
                 $/xi) {
       print {$out_fh} join(',', $1,$2,$3,$4,$5,$6,$7,$8,$9,$10,$11,$12,$13,$14,$15,$16,$17,$18,$19,$20,$21) . "\n";
   }
 
}

close $in_fh;
close $out_fh;

Open in new window

also I got a problem with these  {src zone=, dst zone=} due to space
sample.txt
0
 
FishMongerCommented:
Creating a csv file with varying numbers of fields is a very very bad idea but you can do it with multiple regexs.  Be aware that you won't be able to do a direct import of that csv file into the db.

also I got a problem with these  {src zone=, dst zone=} due to space
What do you mean by that?
0
 
Mohamed MohamudAuthor Commented:
I am aware of that thanks, could you please give me an example how to put  multiple regexs ?
I meant  this {src zone=, dst zone=} I couldn't put like this     action= (.+?) \s+     in the code to truncate them.
0
 
Mohamed MohamudAuthor Commented:
In the attached sample there is three line which contains different fields I run this code
use strict;
use warnings;

open my $in_fh, '<', 'c:\tijabo\test.txt' 
or die "failed to open 'c:\tijabo\test.txt' <$!>";
open my $out_fh, '>', 'c:\tijabo\test.csv.' 
or die "failed to open 'c:\tijabo\test.csv' <$!>";

while (my $line = <$in_fh>) {
    if ($line =~ /^(\S+) \s+
                   (\S+) \s+                                                                       
                   service= (.+?) \s+
                   proto= (.+?) \s+     
                   port= (.+?) \s+                
                   session_id= (.+?) \s+
                   reason[:=] \s* (.+)
                 $/xi) {
       print {$out_fh} join(',', $1,$2,$3,$4,$5,$6,$7) . "\n"; 
       } 
      }

Open in new window

but it truncates the shorter line and displays this:

         21-10-2014,00:00:00,dns,17,65128,234313,Creation<000>
         21-10-2014,00:00:00,http,6,4334,249770,Close - TCP FIN<000>

But I want to display this:

       21-10-2014,00:00:00,dns,17,65128,234313,Creation<000>
       21-10-2014,00:00:00,REASON: BY_OTHER, , , ,
       21-10-2014,00:00:00,http,6,4334,249770,Close - TCP FIN<000>

Could you please help?
test.txt
0
 
Suhas .QA ManagerCommented:
No comment has been added to this question in more than 21 days, so it is now classified as abandoned.

I have recommended this question be closed as follows:

Accept: FishMonger (http:#a40417301)

If you feel this question should be closed differently, post an objection and the moderators will review all objections and close it as they feel fit. If no one objects, this question will be closed automatically the way described above.

suhasbharadwaj
Experts-Exchange Cleanup Volunteer
0

Featured Post

Technology Partners: We Want Your Opinion!

We value your feedback.

Take our survey and automatically be enter to win anyone of the following:
Yeti Cooler, Amazon eGift Card, and Movie eGift Card!

  • 8
  • 5
  • 3
  • +2
Tackle projects and never again get stuck behind a technical roadblock.
Join Now