• Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 604
  • Last Modified:

Log File Parsing with Script

Hello experts,

I am trying to figure how to parse this log file to a point I can import into a database table. Export to CSV would be helpful.

I can use Powershell, VBScript, or MS-DOS Batch.

Example Data:


111.111.111.11 - - [11/Dec/2013:07:24:09 -0500] "OPTIONS /repo/Goodrich-SISVT/767-2C_FQIS/S01590-0101%20767-2C%20BLSP/data/trunk HTTP/1.1" 401 401
111.111.111.11 - user.name [11/Dec/2013:07:24:09 -0500] "OPTIONS /repo/Goodrich-SISVT/767-2C_FQIS/S01590-0101%20767-2C%20BLSP/data/trunk HTTP/1.1" 200 143
111.111.111.11 - user.name [11/Dec/2013:07:24:09 -0500] "OPTIONS /repo/Goodrich-SISVT/767-2C_FQIS/S01590-0101%20767-2C%20BLSP/data/trunk HTTP/1.1" 200 96
111.111.111.11 - user.name [11/Dec/2013:07:24:09 -0500] "PROPFIND /repo/Goodrich-SISVT/767-2C_FQIS/S01590-0101%20767-2C%20BLSP/data/trunk HTTP/1.1" 207 255
111.111.111.11 - - [11/Dec/2013:07:24:09 -0500] "OPTIONS /repo/Goodrich-SISVT/767-2C_FQIS/S01590-0101%20767-2C%20BLSP/data/trunk HTTP/1.1" 401 401
111.111.111.11 - user.name [11/Dec/2013:07:24:09 -0500] "OPTIONS /repo/Goodrich-SISVT/767-2C_FQIS/S01590-0101%20767-2C%20BLSP/data/trunk HTTP/1.1" 200 143
111.111.111.11 - user.name [11/Dec/2013:07:24:09 -0500] "OPTIONS /repo/Goodrich-SISVT/767-2C_FQIS/S01590-0101%20767-2C%20BLSP/data/trunk HTTP/1.1" 200 96
111.111.111.11 - user.name [11/Dec/2013:07:24:09 -0500] "REPORT /repo/Goodrich-SISVT/767-2C_FQIS/S01590-0101%20767-2C%20BLSP/data/trunk HTTP/1.1" 200 115

I need to be able to remove the " - - ", " - ", and the last sets of numbers at the end.

In essence I just need the IPs, usernames, date, and paths (all info between the double quotes).
0
jb428j
Asked:
jb428j
  • 7
  • 5
2 Solutions
 
becraigCommented:
This might be high overhead but does what you want:
(gc file.txt) |%{$_.split('"')[1]} | out-file newfile.txt -append
0
 
jb428jAuthor Commented:
Yes, that does get the path information. Sorry if my question wasn't worded correctly. I also need the IP, username, the path which is the information between the double quotes. And need to remove the HTTP/1.1 within the double quotes. Finally, drop any line with " - - ".

Which is why a CSV export will help because of import to a database.

Overhead really isn't an issue honestly. It is a 4+ GB file and better than hand jamming. lol
0
 
becraigCommented:
Ok got you one minute.
0
Creating Active Directory Users from a Text File

If your organization has a need to mass-create AD user accounts, watch this video to see how its done without the need for scripting or other unnecessary complexities.

 
becraigCommented:
(gc file.txt) -replace 'HTTP/1.1', '' |%{$_.split('"')[0,1] -join " "} | out-file newfile.txt -append


We can then clean it up

(gc newfile.txt | select-string -notmatch "--")  | out-file cleanfile.txt -append
0
 
jb428jAuthor Commented:
Working great so far. How can I go about keeping the IPs, usernames, and date at the beginning of the file.

Date being in a better format would be helpful, removing both [ ] brackets and the -500. Drop the first : with a space.

11/Dec/2013 07:24:09
0
 
becraigCommented:
 (gc file.txt | select-string -notmatch "- -") -replace ' HTTP/1', '' -replace '\[', '' -replace '\]', '' -replace '\[', '' -replace '-0500', ''| %{$_.split('"')[0,1]
 -join " "} | out-file newfile.txt -append

Open in new window


That replaces all the characters you need gone, you can change what is used for the join (if you want to), currently I have a space but you can replace with a comma or tab if that helps
0
 
becraigCommented:
If you're gonna use this, delete the old output file created earlier so you do not have a dirty file, or rename the output file.
0
 
jb428jAuthor Commented:
It is getting there. Now there is a tough change that I need to do with the date.

111.111.111.11  user.name 11/Dec/2013:07:24:09  REPORT /repo/Goodrich-SISVT/767-2C_FQIS/S01590-0101%20767-2C%20BLSP/data/trunk
But I need:

111.111.111.11  user.name 11/Dec/2013 07:24:09  REPORT /repo/Goodrich-SISVT/767-2C_FQIS/S01590-0101%20767-2C%20BLSP/data/trunk
I can just had another column for time and the activity "Report" in the database to deal with column separation.

Maybe some sort of index, replace first : with space?
0
 
becraigCommented:
Simply run this on the new file created:
(gc newfile.txt) | % {$_.split(':',2)[0,1] -join " "} | out-file newfile.txt

Open in new window

0
 
jb428jAuthor Commented:
Works great!
0
 
jb428jAuthor Commented:
Instead of dumping to a file and modding the newfile1, I did the following:

$Var = (gc file.txt | select-string -notmatch "- -") -replace ' HTTP/1.1', '' -replace '\[', '' -replace '\]', '' -replace '\[', '' -replace '-0500', ''| %{$_.split('"')[0,1] -join " "} 
$Var | % {$_.split(':',2)[0,1] -join " "} | out-file newfile.txt 

Open in new window

0
 
becraigCommented:
Great I only added the  second run since I figured you might have been running the first already.   Good stuff
0

Featured Post

New feature and membership benefit!

New feature! Upgrade and increase expert visibility of your issues with Priority Questions.

  • 7
  • 5
Tackle projects and never again get stuck behind a technical roadblock.
Join Now