How to extract data from a log and create a CSV in bash

Hi experts,
I am trying to parse the attached log in a bash script and extract meaningful data in CSV format.
Please check the attached file for the full log.

Could you please help?

From the attached sample file:

Consider these lines:
#####################
Dec 17, 2017 6:08:10,621 PM /job/CBD_JENKINS_CONFIGURATION_AUDIT/configSubmit by p709375 (line 3)
Dec 17, 2017 6:08:21,363 PM CBD_JENKINS_CONFIGURATION_AUDIT #193 Started by timer, Started by user Mark Gabbie, Rebuilds build #192 on node Jenkins started at 2017-12-17T07:08:20Z completed in 252ms completed: SUCCESS (line 5)
Dec 18, 2017 3:43:52,729 PM /job/CBD_PRE_APP_DEPLOY_ARTIFACTORY/3658/rebuild/configSubmit by p771488
Dec 17, 2017 6:08:58,562 PM CBCD_PIPELINE_CREATE_TAG_CBDnaipsoa #868 Started by upstream project "CBCD_SCHEDULE_CBDnaipsoaGIT_INTGIT_MERGE_18_1_CBAD" build number 303 on node build_hud2 started at 2017-12-17T07:08:01Z completed in 56842ms completed: SUCCESS (line 8)

Ignore these lines since they are duplicates:
#########################################
Dec 17, 2017 6:08:01,422 PM job/CBCD_PIPELINE_CREATE_TAG_CBDnaipsoa/ #868 Started by upstream project "CBCD_SCHEDULE_CBDnaipsoaGIT_INTGIT_MERGE_18_1_CBAD" build number 303 (line 2)
Dec 17, 2017 6:08:20,665 PM job/CBD_JENKINS_CONFIGURATION_AUDIT/ #193 Started by timer, Started by user Mark Gabbie, Rebuilds build #192 (line 4)

Trying to get:
#################
Dec 17, 2017 6:08:10,621 PM|job/CBD_JENKINS_CONFIGURATION_AUDIT/configSubmit|p709375|
Dec 17, 2017 6:08:21,363 PM|CBD_JENKINS_CONFIGURATION_AUDIT|Mark Gabbie|SUCCESS
Dec 18, 2017 3:43:52,729 PM|job/CBD_PRE_APP_DEPLOY_ARTIFACTORY/3658/rebuild/configSubmit|p771488
Dec 17, 2017 6:08:58,562 PM|CBCD_PIPELINE_CREATE_TAG_CBDnaipsoa|upstream|SUCCESS
run_jenkins_audit.log
enthuguy Asked:

ozo Commented:
sort -u jenkins_audit.log

enthuguy (Author) Commented:
Thanks ozo, but that is not exactly what I want to achieve :)

David Favor (Linux/LXD/WordPress/Hosting Savant) Commented:
This depends on whether your log has exactly the same format for every record. If it does, then what you're asking is easy to accomplish in many languages. I'll use a Perl example, which reads from STDIN and writes to STDOUT.

The code is written for understandability rather than efficiency.

You can easily convert this to bash arrays, awk, or whatever tool you prefer.

#!/usr/bin/env perl

use strict;
use warnings;

while (my $rec = <STDIN>) {

    chomp $rec;  # strip only a trailing newline (chop removes any last character)

    my @parts = split /\s+/, $rec;

    # date parts
    print $parts[0], ' ';
    print $parts[1], ' ';
    print $parts[2], ' ';
    print $parts[3], ' ';
    print $parts[4], '|';

    # detail
    print $parts[5], '|';

    # id
    print $parts[6], '|';

    print "\n";

}
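
The same field-based split can be sketched in awk, as suggested above. This is only a minimal sketch: the function name format_record is hypothetical, and the field positions are assumed from the configSubmit sample line in the question, where field 7 is the literal word "by" and field 8 is the user id.

```shell
#!/bin/bash
# Sketch: print the five date fields joined by spaces, then the detail
# (field 6) and the id (field 8, assumed position) separated by pipes.
format_record() {
    awk '{ printf "%s %s %s %s %s|%s|%s|\n", $1, $2, $3, $4, $5, $6, $8 }'
}

# Example run on one sample line from the question.
printf '%s\n' 'Dec 17, 2017 6:08:10,621 PM /job/CBD_JENKINS_CONFIGURATION_AUDIT/configSubmit by p709375' | format_record
```

Lines with a different layout (for example the "Started by ..." build records) would need their own field positions, so this only covers the simplest record type.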

murugesandins (Shell_script Automation /bin/bash /bin/bash.exe /bin/ksh /bin/mksh.exe AIX C C++ CYGWIN_NT HP-UX Linux MINGW32 MINGW64 SunOS Windows_NT) Commented:
@enthuguy
Here is a shell script that does this:
#!/bin/bash
if [[ -f /bin/gawk ]]
then
	AWK="/bin/gawk"
elif [[ -f /bin/awk ]]
then
	AWK="/bin/awk"
else
	echo "/bin/awk /bin/gawk no such files"
fi
if [[ "" != "$AWK" ]] && [[ -f ./run_jenkins_audit.log ]]
then
	export NUMBEROFLINES="$("$AWK" 'END { print NR }' ./run_jenkins_audit.log)"
	$AWK 'BEGIN {
		NUMBEROFLINES=ENVIRON["NUMBEROFLINES"];
	}
	{
		for( CurCol=1;CurCol<=5;CurCol++)
		{
			if ( 5 != CurCol)
			{
				printf( "%s ", $CurCol) ;
			}
			else
			{
				printf( "%s|", $CurCol) ;
			}
		}
		if ( substr( $CurCol, 1,1) == "/")
		{
			printf( "%s|", substr( $CurCol, 2, length($CurCol) ) );
		}
		else
		{
			printf( "%s|", $CurCol );
		}
		if ( ( "Started" == $11 ) && ( "by" == $12 ) && ( "user" == $13 ) )
		{
			printf( "%s %s|", $14, substr( $15, 1, index($15,",")-1 ) );
		}
		else if ( ( "Started" == $8 ) && ( "by" == $9 ) && ( "user" == $10 ) )
		{
			# "Started by user" sits in fields 8-10 here, so the name is in fields 11-12
			LastName = $12; sub( /,$/, "", LastName );
			printf( "%s %s|", $11, LastName );
		}
		else if ( ( "Started" == $8 ) && ( "by" == $9 ) && ( "project" == $11 ) )
		{
			printf( "%s|", $10 );
		}
		printf( "%s", $NF );
		if ( NR != NUMBEROFLINES)
		{
			printf( "\n" );
		}
	}' ./run_jenkins_audit.log
else
	echo "./run_jenkins_audit.log No such file"
fi
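
The question also asks to ignore the duplicate "job/NAME/ #N Started by ..." lines, which the script above still prints. A minimal sketch of a pre-filter follows. It assumes, based only on the two duplicate samples in the question, that a duplicate is any line whose sixth field starts with "job/" (no leading slash) and ends with "/". The function name drop_duplicates is hypothetical.

```shell
#!/bin/bash
# Sketch: skip lines whose 6th field looks like "job/NAME/" and pass
# every other line through unchanged.
drop_duplicates() {
    awk '$6 ~ /^job\/.*\/$/ { next } { print }'
}

# Example: the first line is dropped, the second is kept.
printf '%s\n' \
    'Dec 17, 2017 6:08:20,665 PM job/CBD_JENKINS_CONFIGURATION_AUDIT/ #193 Started by timer, Started by user Mark Gabbie, Rebuilds build #192' \
    'Dec 17, 2017 6:08:10,621 PM /job/CBD_JENKINS_CONFIGURATION_AUDIT/configSubmit by p709375' \
    | drop_duplicates
```

The same condition could instead be added as a first rule inside the main awk program, which avoids a second pass over the file.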

Sample testing:
TEST1.
$ /bin/mv -i run_jenkins_audit.log run_jenkins_audit.log.Original.log
$ ./29074269.sh
./run_jenkins_audit.log No such file

TEST2. QUESTION1 on requirement
$ /bin/mv -i run_jenkins_audit.log.Original.log run_jenkins_audit.log
$ ./29074269.sh | /bin/grep -E "^Dec 17, 2017 6:08:21,363|^Dec 17, 2017 6:08:58,562 PM|^Dec 17, 2017 6:08:10,621 PM"
Dec 17, 2017 6:08:10,621 PM|job/CBD_JENKINS_CONFIGURATION_AUDIT/configSubmit|p709375
Dec 17, 2017 6:08:21,363 PM|CBD_JENKINS_CONFIGURATION_AUDIT|Mark Gabbie|SUCCESS
Dec 17, 2017 6:08:58,562 PM|CBCD_PIPELINE_CREATE_TAG_CBDnaipsoa|upstream|SUCCESS

Please comment on the requirement:
1. Your requirement was:
Dec 17, 2017 6:08:10,621 PM|job/CBD_JENKINS_CONFIGURATION_AUDIT/configSubmit|p709375|
The current output is:
Dec 17, 2017 6:08:10,621 PM|job/CBD_JENKINS_CONFIGURATION_AUDIT/configSubmit|p709375
Is the current output (without the trailing "|") acceptable, or does this case need special handling?
TEST3. QUESTION2 on requirement
Attaching the current output:
./29074269.sh > currentoutput.txt
The last line in the attached file has no trailing newline.
$ /usr/bin/tail -1 currentoutput.txt | od -bc | /usr/bin/tail -2
          C   |   S   U   C   C   E   S   S
0000071

The line before the last one does end with a newline.
$ /usr/bin/tail -2 currentoutput.txt  | /usr/bin/head -1 | od -bc | /usr/bin/tail -2
          _   S   Y   N   C   /   |   t   i   m   e   r  \n
0000075

Is a newline required at the end of the last line?
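
A shorter way to run the same check as the od commands above is to look only at the last byte of the file. This is a minimal sketch. The helper name ends_with_newline is hypothetical, and it relies on command substitution stripping a trailing newline, so an empty result means the last byte was a newline.

```shell
#!/bin/bash
# Sketch: succeed (exit status 0) when the file is non-empty and its last
# byte is a newline; fail otherwise.
ends_with_newline() {
    [ -s "$1" ] && [ -z "$(tail -c 1 "$1")" ]
}

# Example with a temporary file standing in for currentoutput.txt.
demo="$(mktemp)"
printf 'C|SUCCESS' > "$demo"
if ! ends_with_newline "$demo"; then
    echo "no trailing newline, appending one"
    printf '\n' >> "$demo"
fi
rm -f "$demo"
```

If a trailing newline turns out to be required, the simplest change in the script is probably to print "\n" unconditionally instead of comparing NR with NUMBEROFLINES.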

TEST4. QUESTION3 on requirement
>> Comment from David Favor
>> Code written for understandability

In the Perl script, replace:
        print $parts[6], '|';
With:
        print $parts[7], '|';

/bin/cat ./run_jenkins_audit.log | ./usingperl.pl  | /bin/egrep "Dec 17, 2017 6:08:10,621 PM"
Dec 17, 2017 6:08:10,621 PM|/job/CBD_JENKINS_CONFIGURATION_AUDIT/configSubmit|p709375|

./29074269.sh | /bin/grep -E "Dec 17, 2017 6:08:10,621 PM"
Dec 17, 2017 6:08:10,621 PM|job/CBD_JENKINS_CONFIGURATION_AUDIT/configSubmit|p709375
Which of these is the expected output?
currentoutput.txt

murugesandins (Shell_script Automation /bin/bash /bin/bash.exe /bin/ksh /bin/mksh.exe AIX C C++ CYGWIN_NT HP-UX Linux MINGW32 MINGW64 SunOS Windows_NT) Commented:
1. Question inactive for 14 days.
2. Provided assisted solutions.
3. Tested and provided sample results.