Link to home
Start Free TrialLog in
Avatar of Mohamed Mohamud
Mohamed Mohamud

asked on

Syslog text file into database or into .CSV

I have a large Syslogchatchall text file and I want to convert it into database, I tried to open the file using Excel but all the information in a one log becomes one column. Is someone help me how I change that file into column based file. Thanks.
syslog.txt
Avatar of Haris Dulic
Haris Dulic
Flag of Austria image

Hello,

you need to use the Excel import option from text and then select fixed widht and there you using the separators set the length of the column and after you finish process you will have the data in multiple columns..

I attached screenshots of the process and result.
Capture1.PNG
Capture.PNG
Avatar of alicain
alicain

Hi,

This looks to be a TAB seperated file, there are two methods that you could use :

1. Open a bland spreadsheet and then import the data.  From the Data ribbon -> From File.... browse to the file ... Delimited -> select "Tab".

2. With the data opened in a spreadsheet an in one column, use the "Text to column" feature in Excel.

Regards,
Alastair.
Avatar of Mohamed Mohamud

ASKER

Haris Djulic and alicain thanks for your help, but the problem is like  proto=17, policy_id=52 and others are in the same column is there a way I can separate or write into database? Thanks again.
Can you post sample file and desired output?
If you specifyig both TAB and SPACE a delimiters (a tick against each on the import dialog) then Excel also places proto=17 and  policy_id=52 into seperate columns.

Have you come across LogParser?  It might be useful for what you are looking at doing.  There's an overhead in getting up to speed, but if you're familiar with SQL commands, that will help :
   http://technet.microsoft.com/en-gb/scriptcenter/dd919274.aspx


Regards,
Alastair.
Dear Experts, my problem still not solved, my file is very big and excel cant handle the file it exceeds the limit and I want to import it to SQL DB so I can filter and run some quires on it, but the first 4 columns are separated with tab and  there is a long message witch also some times different depending whether it is notice or warring as the attached sample shows and I can't handle that part.
Is there also other way I can run quires as file remains text file with out importing into DB?
Hello,

you can use SQL Server import wizard where you can choose to import from Text file and select that it is tab delimited and you can write query to narrow the number of imported rows to database..
Your comment on the "long message" indicates to me that we probably can't simply separate the fields based solely on the whitespace (tabs and spaces).

Please attach a new sample file which shows how/where you want the field separation.

You've tagged several different languages so I'm assuming that you know each of them.  In my case I'd use Perl to parse the log and insert the data.
FishMonger I attached the file, but I appreciate SQL if possible.
Sample-file.rtf
sample.txt
Your log file is not in the proper format to only use an sql statement to import it into the database fields you wish.  It will need to be parsed and either converted to csv for the sql import statement or the parsing script can load it without the need of a csv file.

How big is the file in both the number of lines and bytes (MB or GB)?
It is 500MB, I can't open it but it is more than 100000 lines
Here's a sample perl script that will parse your log and generate a csv file which can be imported into the db via an sql statement.

#!/usr/bin/perl

use strict;
use warnings;

open my $in_fh, '<', 'syslog.txt' or die "failed to open 'syslog.txt' <$!>";
open my $out_fh, '>', 'syslog.csv' or die "failed to open 'syslog.csv' <$!>";

while (my $line = <$in_fh>) {
    chomp $line;
    if ($line =~ /^(\S+) \s+
                   (\S+) \s+
                   (\S+) \s+
                   (\S+) \s+
                   (\w+:\s\S+) \s+
                   (.+?) \s+
                   (.+) \s+
                   (reason.+)$
                 /xi) {
        print {$out_fh} join(',', $1,$2,$3,$4,$5,$6,$7,$8) . "\n";
    }
}
close $in_fh;
close $out_fh;

Open in new window


If some of the lines already contain a comma within any of the fields, then we would need to make a small adjustment to make sure they are escaped.

If you want, adding the db code to the script would be very easy.
FishMonger;
Almost done, there is only one thing some fields and their content may become one column like these

 fuhfw: NetScreen,  device_id=fuhfw, reason=Creation<000>

fuhfw, device_id, reason are column names

also
SQL is not accepting this as column content:
,[Root]system-notification-00257(traffic): start_time="2014-10-21 00:10:26" duration=0 policy_id=52 service=dns proto=17 src zone=Trust dst zone=Untrust action=Permit sent=0 rcvd=0 src=10.8.114.161 dst=10.1.1.8 src_port=65128 dst_port=53 src-xlated ip=10.8.114.161 port=65128 dst-xlated ip=10.1.1.8 port=53 session_id=234313,
result.rtf
ASKER CERTIFIED SOLUTION
Avatar of FishMonger
FishMonger
Flag of United States of America image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Dear FishMonger;
One last thing, is it possible to create different column within different lines I as showed in the sample, I tried but it escapes the line that has less columns. Because I didn't get the information I need I want to split one line into more columns.
#!/usr/bin/perl

use strict;
use warnings;
use Text::CSV_XS;



open my $in_fh, '<', 'c:\tijabo\am.txt' or die "failed to open 'c:\tijabo\am.txt' <$!>";
open my $out_fh, '>', 'c:\tijabo\am.csv.' or die "failed to open 'c:\tijabo\am.csv' <$!>";

while (my $line = <$in_fh>) {

    if ($line =~ /^(\S+) \s+
                   (\S+) \s+
                   (\S+) \s+
                   (\S+) \s+
                   fuhfw:\s (\S+) \s+
                   device_id= (.+?) \s+
                   duration= (.+?) \s+
                   policy_id= (.+?) \s+
                   service= (.+?) \s+
                   proto= (.+?) \s+                 
                   action= (.+?) \s+
                   sent= (.+?) \s+
                   rcvd= (.+?) \s+
                   src= (.+?) \s+
                   dst= (.+?) \s+
                   src_port= (.+?) \s+
                   dst_port= (.+?) \s+
                  
                   port= (.+?) \s+
                 
                   port= (.+?) \s+
                   session_id= (.+?) \s+
                  
                   reason[:=] \s* (.+)
                 $/xi) {
       print {$out_fh} join(',', $1,$2,$3,$4,$5,$6,$7,$8,$9,$10,$11,$12,$13,$14,$15,$16,$17,$18,$19,$20,$21) . "\n";
   }
 
}

close $in_fh;
close $out_fh;

Open in new window

also I got a problem with these  {src zone=, dst zone=} due to space
sample.txt
Creating a csv file with varying numbers of fields is a very very bad idea but you can do it with multiple regexs.  Be aware that you won't be able to do a direct import of that csv file into the db.

also I got a problem with these  {src zone=, dst zone=} due to space
What do you mean by that?
I am aware of that thanks, could you please give me an example how to put  multiple regexs ?
I meant  this {src zone=, dst zone=} I couldn't put like this     action= (.+?) \s+     in the code to truncate them.
In the attached sample there is three line which contains different fields I run this code
use strict;
use warnings;

open my $in_fh, '<', 'c:\tijabo\test.txt' 
or die "failed to open 'c:\tijabo\test.txt' <$!>";
open my $out_fh, '>', 'c:\tijabo\test.csv.' 
or die "failed to open 'c:\tijabo\test.csv' <$!>";

while (my $line = <$in_fh>) {
    if ($line =~ /^(\S+) \s+
                   (\S+) \s+                                                                       
                   service= (.+?) \s+
                   proto= (.+?) \s+     
                   port= (.+?) \s+                
                   session_id= (.+?) \s+
                   reason[:=] \s* (.+)
                 $/xi) {
       print {$out_fh} join(',', $1,$2,$3,$4,$5,$6,$7) . "\n"; 
       } 
      }

Open in new window

but it truncates the shorter line and displays this:

         21-10-2014,00:00:00,dns,17,65128,234313,Creation<000>
         21-10-2014,00:00:00,http,6,4334,249770,Close - TCP FIN<000>

But I want to display this:

       21-10-2014,00:00:00,dns,17,65128,234313,Creation<000>
       21-10-2014,00:00:00,REASON: BY_OTHER, , , ,
       21-10-2014,00:00:00,http,6,4334,249770,Close - TCP FIN<000>

Could you please help?
test.txt
No comment has been added to this question in more than 21 days, so it is now classified as abandoned.

I have recommended this question be closed as follows:

Accept: FishMonger (http:#a40417301)

If you feel this question should be closed differently, post an objection and the moderators will review all objections and close it as they feel fit. If no one objects, this question will be closed automatically the way described above.

suhasbharadwaj
Experts-Exchange Cleanup Volunteer