
How to extract data from a log and create a CSV in bash

Hi experts,
I am trying to parse the attached log in a bash script and extract some meaningful data in CSV format.
Please check the attached file with the full log.

Could you please help?

From the attached sample file:

Consider these lines:
#####################
Dec 17, 2017 6:08:10,621 PM /job/CBD_JENKINS_CONFIGURATION_AUDIT/configSubmit by p709375 (line 3)
Dec 17, 2017 6:08:21,363 PM CBD_JENKINS_CONFIGURATION_AUDIT #193 Started by timer, Started by user Mark Gabbie, Rebuilds build #192 on node Jenkins started at 2017-12-17T07:08:20Z completed in 252ms completed: SUCCESS (line 5)
Dec 18, 2017 3:43:52,729 PM /job/CBD_PRE_APP_DEPLOY_ARTIFACTORY/3658/rebuild/configSubmit by p771488
Dec 17, 2017 6:08:58,562 PM CBCD_PIPELINE_CREATE_TAG_CBDnaipsoa #868 Started by upstream project "CBCD_SCHEDULE_CBDnaipsoaGIT_INTGIT_MERGE_18_1_CBAD" build number 303 on node build_hud2 started at 2017-12-17T07:08:01Z completed in 56842ms completed: SUCCESS (line 8)

Ignore these lines since they are duplicates:
#########################################
Dec 17, 2017 6:08:01,422 PM job/CBCD_PIPELINE_CREATE_TAG_CBDnaipsoa/ #868 Started by upstream project "CBCD_SCHEDULE_CBDnaipsoaGIT_INTGIT_MERGE_18_1_CBAD" build number 303 (line 2)
Dec 17, 2017 6:08:20,665 PM job/CBD_JENKINS_CONFIGURATION_AUDIT/ #193 Started by timer, Started by user Mark Gabbie, Rebuilds build #192 (line 4)

Trying to get:
#################
Dec 17, 2017 6:08:10,621 PM|job/CBD_JENKINS_CONFIGURATION_AUDIT/configSubmit|p709375|
Dec 17, 2017 6:08:21,363 PM|CBD_JENKINS_CONFIGURATION_AUDIT|Mark Gabbie|SUCCESS
Dec 18, 2017 3:43:52,729 PM|job/CBD_PRE_APP_DEPLOY_ARTIFACTORY/3658/rebuild/configSubmit|p771488
Dec 17, 2017 6:08:58,562 PM|CBCD_PIPELINE_CREATE_TAG_CBDnaipsoa|upstream|SUCCESS
Attachment: run_jenkins_audit.log

Asked by: enthuguy
 
ozo commented:
sort -u jenkins_audit.log
 
enthuguy (Author) commented:
Thanks ozo, but that is not exactly what I would like to achieve :)
 
David Favor (Linux/LXD/WordPress/Hosting Savant) commented:
This depends on whether every record in your log has exactly the same format. If so, what you're asking is easy to accomplish in many languages. Here is a Perl example that reads from STDIN and writes to STDOUT.

The code is written for understandability rather than efficiency.

You can easily convert this to use bash arrays, awk, or whatever tool you prefer.

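#!/usr/bin/env perl

use strict;
use warnings;

# read log records from STDIN, one per line
while (my $rec = <STDIN>) {

    # chomp removes only a trailing newline; chop would remove any last character
    chomp $rec;

    # split the record on runs of whitespace
    my @parts = split /\s+/, $rec;

    # date parts
    print $parts[0], ' ';
    print $parts[1], ' ';
    print $parts[2], ' ';
    print $parts[3], ' ';
    print $parts[4], '|';

    # detail
    print $parts[5], '|';

    # id
    print $parts[6], '|';

    print "\n";

}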
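
As a rough illustration of the awk conversion mentioned above, here is a minimal sketch that mirrors the same field positions as the Perl loop. The function name to_pipe_fields is made up for this example and does not appear anywhere in the thread. It assumes every record starts with the five-token timestamp shown in the question.

```shell
#!/bin/bash
# Sketch only: awk equivalent of the Perl field-splitting loop.
# to_pipe_fields is a hypothetical helper name, not part of the thread.
to_pipe_fields() {
    awk '{
        # fields 1-5 form the timestamp, e.g. "Dec 17, 2017 6:08:10,621 PM"
        printf "%s %s %s %s %s|", $1, $2, $3, $4, $5
        # field 6 is the detail, field 7 the token that follows it
        printf "%s|%s|\n", $6, $7
    }' "$@"
}
```

It reads STDIN when called without arguments, for example: to_pipe_fields < run_jenkins_audit.log. Like the Perl version, it prints whatever token sits in the seventh position, so for "configSubmit by <user>" records that token is the literal word "by".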

 
Murugesan Nagarajan (Subject-matter expert at C++ C delivery, implementation, at UNIX oriented operating systems (Windows: CYGWIN_NT MINGW32_NT MINGW64_NT)) commented:
@enthuguy
Here is an approach using a shell script:
#!/bin/bash
# Locate an awk implementation, preferring gawk.
if [[ -f /bin/gawk ]]
then
	AWK="/bin/gawk"
elif [[ -f /bin/awk ]]
then
	AWK="/bin/awk"
else
	echo "/bin/awk /bin/gawk no such files"
fi
if [[ "" != "$AWK" ]] && [[ -f ./run_jenkins_audit.log ]]
then
	# Total line count, exported so awk can suppress the final newline.
	export NUMBEROFLINES="$($AWK 'END { print NR }' ./run_jenkins_audit.log)"
	$AWK 'BEGIN {
		NUMBEROFLINES=ENVIRON["NUMBEROFLINES"];
	}
	{
		# Fields 1-5 are the timestamp; terminate it with "|".
		for( CurCol=1;CurCol<=5;CurCol++)
		{
			if ( 5 != CurCol)
			{
				printf( "%s ", $CurCol) ;
			}
			else
			{
				printf( "%s|", $CurCol) ;
			}
		}
		# CurCol is now 6: the job path or name, minus any leading "/".
		if ( substr( $CurCol, 1,1) == "/")
		{
			printf( "%s|", substr( $CurCol, 2, length($CurCol) ) );
		}
		else
		{
			printf( "%s|", $CurCol );
		}
		# Field positions below assume a single-word job name in field 6.
		if ( ( "Started" == $11 ) && ( "by" == $12 ) && ( "user" == $13 ) )
		{
			printf( "%s %s|", $14, substr( $15, 1, index($15,",")-1 ) );
		}
		else if ( ( "Started" == $8 ) && ( "by" == $9 ) && ( "user" == $10 ) )
		{
			# The user name directly follows "user" in this layout.
			printf( "%s %s|", $11, substr( $12, 1, index($12,",")-1 ) );
		}
		else if ( ( "Started" == $8 ) && ( "by" == $9 ) && ( "project" == $11 ) )
		{
			printf( "%s|", $10 );
		}
		# Last field: the build result, or the user on configSubmit lines.
		printf( "%s", $NF );
		if ( NR != NUMBEROFLINES)
		{
			printf( "\n" );
		}
	}' ./run_jenkins_audit.log
else
	echo "./run_jenkins_audit.log No such file"
fi
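
A possible alternative, sketched here under several assumptions, matches on the text of each record instead of fixed field positions and skips the "job/NAME/ #N Started by ..." start records that the question asks to ignore. The function name extract_audit is hypothetical. The sketch assumes that finished builds always contain "completed: <RESULT>", that job names contain no spaces, and that a user name is followed either by a comma or by " on node". None of this has been checked against the full attached log.

```shell
#!/bin/bash
# Sketch only: extract_audit is a hypothetical helper name.
extract_audit() {
    awk '
    {
        ts = $1 " " $2 " " $3 " " $4 " " $5   # timestamp: first five fields
        job = $6
        sub(/^\//, "", job)                    # drop a leading slash
    }
    # configuration change: "<path>/configSubmit by <user>"
    $6 ~ /configSubmit$/ && $7 == "by" {
        print ts "|" job "|" $8 "|"
        next
    }
    # finished build: the record carries "completed: <RESULT>"
    / completed: / {
        who = "unknown"
        if (match($0, /Started by user [^,]+/)) {
            # skip the 16 characters of "Started by user "
            who = substr($0, RSTART + 16, RLENGTH - 16)
            sub(/ on node .*/, "", who)        # assumed end of a lone cause
        } else if ($0 ~ /Started by upstream project/) {
            who = "upstream"
        } else if ($0 ~ /Started by timer/) {
            who = "timer"
        }
        result = $0
        sub(/.* completed: /, "", result)      # keep text after the marker
        sub(/ .*/, "", result)                 # keep only the first word
        print ts "|" job "|" who "|" result
        next
    }
    # all other records (the start records) are skipped
    ' "$@"
}
```

It could be run as: extract_audit ./run_jenkins_audit.log > audit.csv. The configSubmit rows keep a trailing "|" so that every row has four columns; remove the final "|" in that print statement if the shorter form is wanted.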


Sample testing:
TEST1.
$ /bin/mv -i run_jenkins_audit.log run_jenkins_audit.log.Original.log
$ ./29074269.sh
./run_jenkins_audit.log No such file


TEST2. QUESTION1 on requirement
$ /bin/mv -i run_jenkins_audit.log.Original.log run_jenkins_audit.log
$ ./29074269.sh | /bin/grep -E "^Dec 17, 2017 6:08:21,363|^Dec 17, 2017 6:08:58,562 PM|^Dec 17, 2017 6:08:10,621 PM"
Dec 17, 2017 6:08:10,621 PM|job/CBD_JENKINS_CONFIGURATION_AUDIT/configSubmit|p709375
Dec 17, 2017 6:08:21,363 PM|CBD_JENKINS_CONFIGURATION_AUDIT|Mark Gabbie|SUCCESS
Dec 17, 2017 6:08:58,562 PM|CBCD_PIPELINE_CREATE_TAG_CBDnaipsoa|upstream|SUCCESS


A question about the requirement:
1. The requested output was:
Dec 17, 2017 6:08:10,621 PM|job/CBD_JENKINS_CONFIGURATION_AUDIT/configSubmit|p709375|
The script currently prints:
Dec 17, 2017 6:08:10,621 PM|job/CBD_JENKINS_CONFIGURATION_AUDIT/configSubmit|p709375
Is the output without the trailing "|" acceptable, or does the trailing "|" need to be added?
TEST3. QUESTION2 on requirement
The current output is attached:
$ ./29074269.sh > currentoutput.txt
The last line of the attached file does not end with a newline:
$ /usr/bin/tail -1 currentoutput.txt | od -bc | /usr/bin/tail -2
          C   |   S   U   C   C   E   S   S
0000071


The line before the last line does end with a newline:
$ /usr/bin/tail -2 currentoutput.txt  | /usr/bin/head -1 | od -bc | /usr/bin/tail -2
          _   S   Y   N   C   /   |   t   i   m   e   r  \n
0000075


Is a newline required at the end of the last line?
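
As a shorter way to make the same check without reading od output, the following sketch tests the last byte of a file directly. The function name ends_with_newline is made up for this example.

```shell
#!/bin/bash
# Sketch only: ends_with_newline is a hypothetical helper name.
# Returns 0 when the last byte of the file is a newline, 1 otherwise.
ends_with_newline() {
    # Command substitution strips trailing newlines, so the captured
    # text is empty when the last byte is "\n" (or the file is empty).
    [ -z "$(tail -c 1 "$1")" ]
}
```

For example: ends_with_newline currentoutput.txt || echo "no trailing newline". Note that an empty file is also reported as ending with a newline.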

TEST4. QUESTION3 on requirement
>> Comment from David Favor
>> Code written for understandability

In the Perl script, replace:
        print $parts[6], '|';
with:
        print $parts[7], '|';

/bin/cat ./run_jenkins_audit.log | ./usingperl.pl  | /bin/egrep "Dec 17, 2017 6:08:10,621 PM"
Dec 17, 2017 6:08:10,621 PM|/job/CBD_JENKINS_CONFIGURATION_AUDIT/configSubmit|p709375|

./29074269.sh | /bin/grep -E "Dec 17, 2017 6:08:10,621 PM"
Dec 17, 2017 6:08:10,621 PM|job/CBD_JENKINS_CONFIGURATION_AUDIT/configSubmit|p709375
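The Perl version keeps the leading "/" and the trailing "|", while the shell script drops both. Which is the expected output?
Attachment: currentoutput.txt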
 
Murugesan Nagarajan (Subject-matter expert at C++ C delivery, implementation, at UNIX oriented operating systems (Windows: CYGWIN_NT MINGW32_NT MINGW64_NT)) commented:
1. Question inactive for 14 days.
2. Provided assisted solutions.
3. Tested and provided sample results.
Question has a verified solution.
