imad imad
 asked on

Filtering a file to table

I have a file that contains many log entries:

at 10:00 carl 1 STR0 STR1 STR2 STR3 <STR4 STR5> [STR6 STR7] STR8:
academy/course1:oftheory:SMTGHO:nothing:
academy/course1:ofapplicaton:SMTGHP:onehour:

at 10:00 carl 2 STR0 STR1 STR2 STR3 <STR4 STR78> [STR6 STR111] STR8:
academy/course2:oftheory:SMTGHM:math:
academy/course2:ofapplicaton:SMTGHN:twohour:

at 10:00 david 1 STR0 STR1 STR2 STR3 <STR4 STR758> [STR6 STR155] STR8:
academy/course3:oftheory:SMTGHK:geo:
academy/course3:ofapplicaton:SMTGHL:halfhour:

at 10:00 david 2 STR0 STR1 STR2 STR3 <STR4 STR87> [STR6 STR74] STR8:
academy/course4:oftheory:SMTGH:SMTGHI:history:
academy/course4:ofapplicaton:SMTGHJ:nothing:

at 14:00 carl 1 STR0 STR1 STR2 STR3 <STR4 STR11> [STR6 STR784] STR8:
academy/course5:oftheory:SMTGHG:nothing:
academy/course5:ofapplicaton:SMTGHH:twohours:

at 14:00 carl 2 STR0 STR1 STR2 STR3 <STR4 STR86> [STR6 STR85] STR8:
academy/course6:oftheory:SMTGHE:music:
academy/course6:ofapplicaton:SMTGHF:twohours:

at 14:00 david 1 STR0 STR1 STR2 STR3 <STR4 STR96> [STR6 STR01] STR8:
academy/course7:oftheory:SMTGHC:programmation:
academy/course7:ofapplicaton:SMTGHD:onehours:

at 14:00 david 2 STR0 STR1 STR2 STR3 <STR4 STR335> [STR6 STR66] STR8:
academy/course8:oftheory:SMTGHA:philosophy:
academy/course8:ofapplicaton:SMTGHB:nothing:



Is there any way to get rid of the STR* and SMTGH* strings and produce this output using awk, Perl, or a shell script?

carl 1,10:00,14:00
applicaton,halfhour,onehours
theory,geo,programmation

carl 2,10:00,14:00
applicaton,nothing,nothing
theory,history,philosophy

david 1,10:00,14:00
applicaton,onehour,twohours
theory,nothing,nothing

david 2,10:00,14:00
applicaton,twohour,twohours
theory,math,music


Shell Scripting, Perl, System Programming, Unix OS, Scripting Languages


8/22/2022 - Mon
jmcg

This was more complicated than I first thought. You are not simply asking to have the unwanted fields suppressed; you want the fields completely reorganized. I also have to assume that the example result file shows how you want things to look in general, rather than the exact output expected from the offered input.

I don't know that this couldn't be accomplished in awk or shell, but Perl is up to it.

# perl

# for Experts-Exchange.com/questions/28693077

use strict;
use Data::Dumper;

# organize data with these
my %courseinfo = ();
my @namelist = ();

# these will carry over line boundaries
my ($keyname, $keytime);

while( <> ) {

	# for a line matching /^at/ we want to capture the name and time fields
	if( my @matches = m/^at (\d\d:\d\d) (\S+ \S+)/ ) {
		($keytime, $keyname) = @matches;
		unless (exists $courseinfo{$keyname}) {
			$courseinfo{$keyname} = {times=>[] };
			push @namelist, $keyname;
			}
		push @{$courseinfo{$keyname}{times}}, $keytime;
		next;
		}

	# for other lines, split on colons to find fields of interest, but first clean up SMTG fields
	s/:SMTG\w+//g;
	my ( $acad, $of, $wanted) = split /:/;
	next unless $acad =~ m/^academy/;
	
	(my $key2 = $of) =~ s/^of//; 
	$courseinfo{$keyname}{$key2} = [] unless exists $courseinfo{$keyname}{$key2};
	push @{$courseinfo{$keyname}{$key2}}, $wanted;
	}

	### print STDERR Data::Dumper->Dump( [ \%courseinfo], [ qw( *courseinfo ) ] ); ### DEBUG
	
# now, after all records have been read, put out the output...
foreach $keyname (@namelist) {
	printf "%s\n", join ',', $keyname, @{$courseinfo{$keyname}{times}};
	for my $key2 ( sort keys %{$courseinfo{$keyname}} ) {
		next if $key2 eq "times";
		printf "%s\n", join ',', $key2, @{$courseinfo{$keyname}{$key2}};
		}
	print "\n";
	}



So if I run that as imad1.pl against your offered input as imad1.txt

perl -f imad1.pl imad1.txt >imad3.txt

I get this:

carl 1,10:00,14:00
applicaton,onehour,twohours
theory,nothing,nothing

carl 2,10:00,14:00
applicaton,twohour,twohours
theory,math,music

david 1,10:00,14:00
applicaton,halfhour,onehours
theory,geo,programmation

david 2,10:00,14:00
applicaton,nothing,nothing
theory,history,philosophy



I had to make a number of generalizing assumptions about what variations might appear in your real input, so let me know if you run into trouble applying this script outside of the small sample input.
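To illustrate one of those assumptions: the script strips every `:SMTG...` field before splitting on colons, so a line that carries two such fields still yields the wanted value in the third position. A minimal sketch of that step in isolation (the helper name `fields_of` is made up for this illustration and is not part of the script above):

```perl
use strict;
use warnings;

# Hypothetical helper: apply the same clean-up and split as the script above
# to one "academy/..." line and return (course, key, value).
sub fields_of {
    my ($line) = @_;
    chomp $line;
    $line =~ s/:SMTG\w+//g;                      # drop every :SMTG... field
    my ($acad, $of, $wanted) = split /:/, $line;
    (my $key = $of) =~ s/^of//;                  # "oftheory" -> "theory"
    return ($acad, $key, $wanted);
}

# The sample line with two SMTG fields still parses cleanly.
print join(',', fields_of("academy/course4:oftheory:SMTGH:SMTGHI:history:\n")), "\n";
```

This prints `academy/course4,theory,history`. A real value that happens to start with `SMTG` would be dropped as well, which is one of the places where real input could break the assumption.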
ozo

perl -lan00e '
$n="@F[2,3]";
push @{$s{$n}{""}},$F[1];
push @{$s{$n}{$1}},$2 while/:of(\w+):.*:(\w+):/g;
END{
  print $k,map{join(",",$_,@{$v->{$_}}),"\n"}sort keys %$v  while ($k,$v)=each %s;
} ' <<HERE
at 10:00 carl 1 STR0 STR1 STR2 STR3 <STR4 STR5> [STR6 STR7] STR8:
academy/course1:oftheory:SMTGHO:nothing:
academy/course1:ofapplicaton:SMTGHP:onehour:

at 10:00 carl 2 STR0 STR1 STR2 STR3 <STR4 STR78> [STR6 STR111] STR8:
academy/course2:oftheory:SMTGHM:math:
academy/course2:ofapplicaton:SMTGHN:twohour:

at 10:00 david 1 STR0 STR1 STR2 STR3 <STR4 STR758> [STR6 STR155] STR8:
academy/course3:oftheory:SMTGHK:geo:
academy/course3:ofapplicaton:SMTGHL:halfhour:

at 10:00 david 2 STR0 STR1 STR2 STR3 <STR4 STR87> [STR6 STR74] STR8:
academy/course4:oftheory:SMTGH:SMTGHI:history:
academy/course4:ofapplicaton:SMTGHJ:nothing:

at 14:00 carl 1 STR0 STR1 STR2 STR3 <STR4 STR11> [STR6 STR784] STR8:
academy/course5:oftheory:SMTGHG:nothing:
academy/course5:ofapplicaton:SMTGHH:twohours:

at 14:00 carl 2 STR0 STR1 STR2 STR3 <STR4 STR86> [STR6 STR85] STR8:
academy/course6:oftheory:SMTGHE:music:
academy/course6:ofapplicaton:SMTGHF:twohours:

at 14:00 david 1 STR0 STR1 STR2 STR3 <STR4 STR96> [STR6 STR01] STR8:
academy/course7:oftheory:SMTGHC:programmation:
academy/course7:ofapplicaton:SMTGHD:onehours:

at 14:00 david 2 STR0 STR1 STR2 STR3 <STR4 STR335> [STR6 STR66] STR8:
academy/course8:oftheory:SMTGHA:philosophy:
academy/course8:ofapplicaton:SMTGHB:nothing:
                                 
HERE
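
For readers who find the one-liner dense, here is a rough expansion of the same idea as a plain subroutine. In the one-liner, `-n` loops over the input, `-00` reads it one blank-line-separated paragraph at a time, `-a` splits each paragraph into `@F`, and `-l` handles line endings. The sketch below assumes the whole log is passed in as one string; the name `summarize` is invented here, and unlike the original it prints names in first-seen order instead of the order returned by `each`.

```perl
use strict;
use warnings;

# Sketch of the one-liner's logic; takes the whole log as a single string.
sub summarize {
    my ($text) = @_;
    my (%s, @order);
    for my $para (split /\n{2,}/, $text) {       # like -00: paragraph records
        my @F = split ' ', $para;                # like -a: whitespace-split fields
        next unless @F >= 4 && $F[0] eq 'at';
        my $name = "@F[2,3]";                    # e.g. "carl 1"
        push @order, $name unless $s{$name};
        push @{ $s{$name}{''} }, $F[1];          # times are stored under the empty key
        # key = text after ":of", value = last colon-delimited field on that line
        while ($para =~ /:of(\w+):.*:(\w+):/g) {
            push @{ $s{$name}{$1} }, $2;
        }
    }
    my $out = '';
    for my $name (@order) {
        my $v = $s{$name};
        # the empty key sorts first, so $name . ",10:00,14:00" forms the header row
        $out .= $name . join('', map { join(',', $_, @{ $v->{$_} }) . "\n" } sort keys %$v) . "\n";
    }
    return $out;
}

my $sample = <<'HERE';
at 10:00 carl 1 STR0 STR1 STR2 STR3 <STR4 STR5> [STR6 STR7] STR8:
academy/course1:oftheory:SMTGHO:nothing:
academy/course1:ofapplicaton:SMTGHP:onehour:

at 14:00 carl 1 STR0 STR1 STR2 STR3 <STR4 STR11> [STR6 STR784] STR8:
academy/course5:oftheory:SMTGHG:nothing:
academy/course5:ofapplicaton:SMTGHH:twohours:
HERE

print summarize($sample);
```

For this two-record sample the output is `carl 1,10:00,14:00`, then `applicaton,onehour,twohours`, then `theory,nothing,nothing`, followed by a blank line.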
imad imad

ASKER
@jmcg Here is a slight modification of my input:


academy/course1:offdf5D:SMTGHP:twohur:
academy/course1:zfd6X:SMTGHP:nonehour:
academy/course1:sd99R:SMTGHP:somthing :
academy/course1:qs35H:SMTGHP:nothing:
academy/course1:odf33G:SMTGHP:onehour:

at 10:00 carl 2 STR0 STR1 STR2 STR3 <STR4 STR78> [STR6 STR111] STR8:
academy/course2:thefsf8A:SMTGHM:math:
academy/course2:fdf5B:SMTGHN:twohour:
academy/course2:offdf5D:SMTGHP:twohur:
academy/course2:zfd6X:SMTGHP:nonehour:
academy/course2:sd99R:SMTGHP:somthing :
academy/course2:qs35H:SMTGHP:nothing:
academy/course2:odf33G:SMTGHP:onehour:

at 10:00 david 1 STR0 STR1 STR2 STR3 <STR4 STR758> [STR6 STR155] STR8:
academy/course3:thefsf8A:SMTGHK:geo:
academy/course3:fdf5B:SMTGHL:halfhour:
academy/course3:offdf5D:SMTGHb:twohur:
academy/course3:zfd6X:SMTGHPx:nonehour:
academy/course3:sd99R:SMTGHw:somthing :
academy/course3:qs35H:SMTGHbP:nothing:
academy/course3:odf33G:SMTGHPs:onehour:

at 10:00 david 2 STR0 STR1 STR2 STR3 <STR4 STR87> [STR6 STR74] STR8:
academy/course4:thefsf8A:SMTGH:SMTGHI:history:
academy/course4:fdf5B:SMTGHJ:nothing:
academy/course4:offdf5D:SMTGHd:twohur:
academy/course4:zfd6X:SMTGHg:nonehour:
academy/course4:sd99R:SMTGHs:somthing :
academy/course4:qs35H:SMTGHb:nothing:
academy/course4:odf33G:SMTGHs:onehour:

at 14:00 carl 1 STR0 STR1 STR2 STR3 <STR4 STR11> [STR6 STR784] STR8:
academy/course5:thefsf8A:SMTGHG:nothing:
academy/course5:fdf5B:SMTGHH:twohours:
academy/course5:offdf5D:SMTGHf:twohur:
academy/course5:zfd6X:SMTGHgd:nonehour:
academy/course5:sd99R:SMTGHsf:somthing :
academy/course5:qs35H:SMTGHbs:nothing:
academy/course5:odf33G:SMTGHsf:onehour:

at 14:00 carl 2 STR0 STR1 STR2 STR3 <STR4 STR86> [STR6 STR85] STR8:
academy/course6:thefsf8A:SMTGHEx:music:
academy/course6:fdf5B:SMTGHF:twohours:
academy/course6:offdf5D:SMTGHdf:twohur:
academy/course6:zfd6X:SMTGHs:nonehour:
academy/course6:sd99R:SMTGHqf:somthing :
academy/course6:qs35H:SMTGHv:nothing:
academy/course6:odf33G:SMTGHw:onehour:
at 14:00 david 1 STR0 STR1 STR2 STR3 <STR4 STR96> [STR6 STR01] STR8:
academy/course7:thefsf8A:SMTGHC:programmation:
academy/course7:fdf5B:SMTGHDs:onehours:
academy/course7:thefsf8A:SMTGHdx:music:
academy/course7:fdf5B:SMTGHsF:twohours:
academy/course7:offdf5D:SMTGHqf:twohur:
academy/course7:zfd6X:SMTGHws:nonehour:
academy/course7:sd99R:SMTGHwf:somthing :
academy/course7:qs35H:SMTGHcv:nothing:
academy/course7:odf33G:SMTGHv:onehour:
at 14:00 david 2 STR0 STR1 STR2 STR3 <STR4 STR335> [STR6 STR66] STR8:
academy/course8:thefsf8A:SMTGHA:philosophy:
academy/course8:fdf5B:SMTGHhB:nothing:
academy/course8:offdf5D:SMTGeHqf:twohur:
academy/course8:zfd6X:SMTGHfws:nonehour:
academy/course8:sd99R:SMTGHdwf:somthing :
academy/course8:qs35H:SMTGHcvv:nothing:
academy/course8:odf33G:SMTGHbv:onehour:




The output I would like:


carl 1,10:00,14:00
A, --,--
B, --,--
D --,--
X --,--
R --,--
H --,--
G --,--

carl 2,10:00,14:00
A, --,--
B, --,--
D --,--
X --,--
R --,--
H --,--
G --,--

david 1,10:00,14:00
A, --,--
B, --,--
D --,--
X --,--
R --,--
H --,--
G --,--

david 2,10:00,14:00
A, --,--
B, --,--
D --,--
X --,--
R --,--
H --,--
G --,--




The '--' stands for the actual values ('onehour', 'nothing', 'math', ...), which should normally be displayed.
ozo

Are the blank lines between the last 3 entries really missing?

perl -ln00e 'for(split/(?=^at )/m){ ($t,$n)=/\s(\S+) (\S+ \S+)/;
push @{$s{$n}{""}},$t;
push @{$s{$n}{$1}},$2 while/:\w*(\w):.*:(\w+)</g;
}END{
  print $k,map{join(",",$_,@{$v->{$_}}),"\n"}sort keys %$v  while ($k,$v)=each %s;
} '
imad imad

ASKER
Oh, my bad. Here is the correct version of the input and output:
at 10:00 carl 1 STR0 STR1 STR2 STR3 <STR4 STR78> [STR6 STR111] STR8:
academy/course1:thefsf8A:SMTGHM:Philo:
academy/course1:fdf5B:SMTGHN:twohour:
academy/course1:offdf5D:SMTGHP:twohur:
academy/course1:zfd6X:SMTGHP:nonehour:
academy/course1:sd99R:SMTGHP:somthing:
academy/course1:qs35H:SMTGHP:nothing:
academy/course1:odf33G:SMTGHP:onehour:
at 10:00 carl 2 STR0 STR1 STR2 STR3 <STR4 STR78> [STR6 STR111] STR8:
academy/course2:thefsf8A:SMTGHM:math:
academy/course2:fdf5B:SMTGHN:twohour:
academy/course2:offdf5D:SMTGHP:twohur:
academy/course2:zfd6X:SMTGHP:nonehour:
academy/course2:sd99R:SMTGHP:somthing:
academy/course2:qs35H:SMTGHP:nothing:
academy/course2:odf33G:SMTGHP:onehour:
at 10:00 david 1 STR0 STR1 STR2 STR3 <STR4 STR758> [STR6 STR155] STR8:
academy/course3:thefsf8A:SMTGHK:geo:
academy/course3:fdf5B:SMTGHL:halfhour:
academy/course3:offdf5D:SMTGHb:twohur:
academy/course3:zfd6X:SMTGHPx:nonehour:
academy/course3:sd99R:SMTGHw:somthing:
academy/course3:qs35H:SMTGHbP:nothing:
academy/course3:odf33G:SMTGHPs:onehour:
at 10:00 david 2 STR0 STR1 STR2 STR3 <STR4 STR87> [STR6 STR74] STR8:
academy/course4:thefsf8A:SMTGH:SMTGHI:history:
academy/course4:fdf5B:SMTGHJ:nothing:
academy/course4:offdf5D:SMTGHd:twohur:
academy/course4:zfd6X:SMTGHg:nonehour:
academy/course4:sd99R:SMTGHs:somthing :
academy/course4:qs35H:SMTGHb:nothing:
academy/course4:odf33G:SMTGHs:onehour:
at 14:00 carl 1 STR0 STR1 STR2 STR3 <STR4 STR11> [STR6 STR784] STR8:
academy/course5:thefsf8A:SMTGHG:nothing:
academy/course5:fdf5B:SMTGHH:twohours:
academy/course5:offdf5D:SMTGHf:twohur:
academy/course5:zfd6X:SMTGHgd:nonehour:
academy/course5:sd99R:SMTGHsf:somthing:
academy/course5:qs35H:SMTGHbs:nothing:
academy/course5:odf33G:SMTGHsf:onehour:
at 14:00 carl 2 STR0 STR1 STR2 STR3 <STR4 STR86> [STR6 STR85] STR8:
academy/course6:thefsf8A:SMTGHEx:music:
academy/course6:fdf5B:SMTGHF:twohours:
academy/course6:offdf5D:SMTGHdf:twohur:
academy/course6:zfd6X:SMTGHs:nonehour:
academy/course6:sd99R:SMTGHqf:somthing:
academy/course6:qs35H:SMTGHv:nothing:
academy/course6:odf33G:SMTGHw:onehour:
at 14:00 david 1 STR0 STR1 STR2 STR3 <STR4 STR96> [STR6 STR01] STR8:
academy/course7:thefsf8A:SMTGHC:programmation:
academy/course7:fdf5B:SMTGHDs:onehours:
academy/course7:thefsf8A:SMTGHdx:music:
academy/course7:fdf5B:SMTGHsF:twohours:
academy/course7:offdf5D:SMTGHqf:twohur:
academy/course7:zfd6X:SMTGHws:nonehour:
academy/course7:sd99R:SMTGHwf:somthing:
academy/course7:qs35H:SMTGHcv:nothing:
academy/course7:odf33G:SMTGHv:onehour:
at 14:00 david 2 STR0 STR1 STR2 STR3 <STR4 STR335> [STR6 STR66] STR8:
academy/course8:thefsf8A:SMTGHA:philosophy:
academy/course8:fdf5B:SMTGHhB:nothing:
academy/course8:offdf5D:SMTGeHqf:twohur:
academy/course8:zfd6X:SMTGHfws:nonehour:
academy/course8:sd99R:SMTGHdwf:somthing:
academy/course8:qs35H:SMTGHcvv:nothing:
academy/course8:odf33G:SMTGHbv:onehour:





Here is how the output should look:

carl 1,10:00,14:00
A,--,--
B,--,--
D,--,--
X,--,--
R,--,--
H,--,--
G,--,--

carl 2,10:00,14:00
A,--,--
B,--,--
D,--,--
X,--,--
R,--,--
H,--,--
G,--,--

david 1,10:00,14:00
A,--,--
B,--,--
D,--,--
X,--,--
R,--,--
H,--,--
G,--,--

david 2,10:00,14:00
A,--,--
B,--,--
D,--,--
X,--,--
R,--,--
H,--,--
G,--,--



The '--' stands for the actual values ('onehour', 'nothing', 'math', ...), which should normally be displayed.
ASKER CERTIFIED SOLUTION
ozo

imad imad

ASKER
I wrote this:
perl -lane 'BEGIN{$/="at "} $n="@F[1,2]";
push @{$s{$n}{""}},$F[0];
push @{$s{$n}{$1}},$2 while/:\w*(\w):.*:(\w+):/g;
END{
  print $k,map{join(",",$_,@{$v->{$_}}),"\n"}sort keys %$v  while ($k,$v)=each %s;
} ' test20.txt



I got this:

String found where operator expected at 1.pl line 6, near "}'"
  (Might be a runaway multi-line '' string starting on line 1)
        (Missing semicolon on previous line?)
Bareword found where operator expected at 1.pl line 6, near "}' test20"
        (Missing operator before test20?)
syntax error at 1.pl line 6, near "}'"
Execution of 1.pl aborted due to compilation errors.



The command I executed is:

perl  1.pl  


ozo

The command to execute would be
perl -lane 'BEGIN{$/="at "}
$n="@F[1,2]";
push @{$s{$n}{""}},$F[0];
push @{$s{$n}{$1}},$2 while/:\w*(\w):.*:(\w+):/g;
END{
  print $k,map{join(",",$_,@{$v->{$_}}),"\n"}sort keys %$v  while ($k,$v)=each %s;
} ' test20.txt
not
perl  1.pl
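
The errors above look like what perl reports when the shell command line itself (beginning with `perl -lane '`) is saved into 1.pl and then compiled as Perl source. If a script file is preferred over a one-liner, the same logic could be written roughly as below. This is a sketch, not the one-liner verbatim: the subroutine name `summarize_fh` is invented, names are printed in first-seen order rather than `each` order, and the empty record before the first "at " is skipped explicitly.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Sketch: the one-liner's logic as a subroutine reading from any filehandle.
sub summarize_fh {
    my ($fh) = @_;
    local $/ = "at ";                    # same record separator as BEGIN{$/="at "}
    my (%s, @order);
    while (my $rec = <$fh>) {
        chomp $rec;                      # strips the trailing "at ", as -l does
        my @F = split ' ', $rec;         # like -a
        next unless @F >= 3;             # skip the empty record before the first "at "
        my $n = "@F[1,2]";               # e.g. "carl 1"
        push @order, $n unless $s{$n};
        push @{ $s{$n}{''} }, $F[0];     # times are stored under the empty key
        # label = last character of the second field, value = last field on the line
        push @{ $s{$n}{$1} }, $2 while $rec =~ /:\w*(\w):.*:(\w+):/g;
    }
    return join '', map {
        my ($k, $v) = ($_, $s{$_});
        $k . join('', map { join(',', $_, @{ $v->{$_} }) . "\n" } sort keys %$v) . "\n";
    } @order;
}

# Run as: perl 1.pl test20.txt   (file names taken from this thread)
if (@ARGV) {
    open my $fh, '<', $ARGV[0] or die "Cannot open $ARGV[0]: $!";
    print summarize_fh($fh);
}
```

As in the one-liner, the row labels come out sorted (A, B, D, G, H, R, X), not in the order they appear in the input, and any literal "at " inside a record would split it.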
jmcg

The relationship between input and output has become too obscure for me to work it out.
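
One possible reading of the asker's examples, offered as an assumption and not a confirmed rule: the row label is the last character of the second colon-separated field, and the value is the last non-empty field with surrounding whitespace removed. A small sketch of that reading (the helper name `label_and_value` is hypothetical):

```perl
use strict;
use warnings;

# Hypothetical helper: guess the (row label, value) pair for one "academy/..." line.
sub label_and_value {
    my ($line) = @_;
    chomp $line;
    my @f = grep { /\S/ } split /:/, $line;      # colon-separated, non-empty fields
    my $label = substr $f[1], -1;                # "sd99R" -> "R"
    (my $value = $f[-1]) =~ s/^\s+|\s+$//g;      # "somthing " -> "somthing"
    return ($label, $value);
}

print join(',', label_and_value($_)), "\n"
    for "academy/course4:sd99R:SMTGHs:somthing :",
        "academy/course4:thefsf8A:SMTGH:SMTGHI:history:";
```

This prints `R,somthing` and `A,history`. Whether repeated labels within one record (as in the 14:00 "david 1" entry) should add extra columns is not stated in the thread.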