imad imad
filter a file to a table
I have a file that contains the following lines. The file lists schedules, one after another:
at 12:00 the schedule of james version1 is :
first_task:eating:nothing
second_task:rest:onehour
third_task:watching:nothing
at 12:00 the schedule of james version2 is :
first_task:eating:fruits
second_task:rest:twohour
third_task:watching:manga
at 12:00 the schedule of alex version1 is :
first_task:eating:fruit
second_task:rest:halfhour
third_task:watching:horrorfilm
at 12:00 the schedule of alex version2 is :
first_task:eating:meal
second_task:rest:nothing
third_task:watching:nothing
at 18:00 the schedule of james version1 is :
first_task:eating:fastfood
second_task:rest:twohours
third_task:watching:series
at 18:00 the schedule of james version2 is :
first_task:eating:nothing
second_task:rest:onehours
third_task:watching:series
at 18:00 the schedule of alex version1 is :
first_task:eating:vegetals
second_task:rest:threehours
third_task:watching:manga
at 18:00 the schedule of alex version2 is :
first_task:eating:bread
second_task:rest:fivehours
third_task:watching:manga
at 22:00 the schedule of james version1 is :
first_task:eating:nothing
second_task:rest:sevenhours
third_task:watching:nothing
at 22:00 the schedule of james version2 is :
first_task:eating:meal
second_task:rest:sixnhours
third_task:watching:nothing
at 22:00 the schedule of alex version1 is :
first_task:eating:vegetals
second_task:rest:sevehours
third_task:watching:manga
at 22:00 the schedule of alex version2 is :
first_task:eating:icecream
second_task:rest:sevenhours
third_task:watching:nothing
I tried to filter it this way:
12:00 eating:fruit
18:00 eating:vegetals
22:00 eating:nothing
12:00 rest:onhour
18:00 rest:threehour
22:00 rest:sevenhour
12:00 watching:horrorfilm
18:00 watching:manga
22:00 watching:nothing
using these commands:
awk -F '[ :]' '/the schedule of/{h=$2;m=$3} /eating/{print " "h":"m" eating:"$3}' f.txt
awk -F '[ :]' '/the schedule of/{h=$2;m=$3} /rest/{print " "h":"m" rest:"$3}' f.txt
awk -F '[ :]' '/the schedule of/{h=$2;m=$3} /watching/{print " "h":"m" watching:"$3}' f.txt
Now I want to improve the filtered output by dropping all non-significant words and arranging the most valuable information in a table. I have tried to work out and search for how to get this format, but in vain:
james version1,12:00,18:00,22:00
eating,nothing,fastfood,nothing
rest,onehour,twohours,sevenhours
watching,nothing,series,nothing
james version2,12:00,18:00,22:00
eating,fruits,nothing,meal
rest,twohour,onehours,sixnhours
watching,manga,series,nothing
alex version1,12:00,18:00,22:00
eating,fruit,vegetals,vegetals
rest,halfhour,threehours,sevehours
watching,horrorfilm,manga,manga
alex version2,12:00,18:00,22:00
eating,meal,bread,icecream
rest,nothing,fivehours,sevenhours
watching,nothing,manga,nothing
What are the non-significant words, and what is the most valuable information?
ASKER
The non-significant words are:
first_task:
second_task:
third_task:
the schedule of
The most valuable information is, for example, the following, arranged as a CSV table:
james version1,12:00,18:00,22:00
eating,nothing,fastfood,nothing
rest,onehour,twohours,sevenhours
watching,nothing,series,nothing
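A minimal awk sketch of one way to produce the CSV layout requested above. It is not the accepted solution from this thread, only an illustration under these assumptions: every header line has exactly the form "at HH:MM the schedule of NAME VERSION is :", every task line has the form "xxx_task:activity:value", and the file has Unix line endings. The function name schedule_to_csv is hypothetical. People, times and activities are printed in the order in which they first appear in the file, and names are printed exactly as they are spelled in the input.

```shell
# Hypothetical helper: schedule_to_csv FILE
# Prints one CSV block per person: a header row with the times,
# then one row per activity with the value recorded at each time.
schedule_to_csv() {
  awk '
    # Header line, e.g. "at 12:00 the schedule of james version1 is :"
    /^at [0-9]+:[0-9]+ the schedule of / {
      time = $2
      who  = $6 " " $7
      if (!(time in seen_time)) { seen_time[time] = 1; times[++nt] = time }
      if (!(who in seen_who))   { seen_who[who] = 1;   people[++np] = who }
      next
    }
    # Task line, e.g. "first_task:eating:nothing"
    /^[a-z]+_task:/ {
      split($1, part, ":")            # part[2] = activity, part[3] = value
      act = part[2]
      if (!(act in seen_act)) { seen_act[act] = 1; acts[++na] = act }
      value[who, act, time] = part[3]
    }
    END {
      for (p = 1; p <= np; p++) {
        line = people[p]
        for (t = 1; t <= nt; t++) line = line "," times[t]
        print line
        for (a = 1; a <= na; a++) {
          line = acts[a]
          for (t = 1; t <= nt; t++)
            line = line "," value[people[p], acts[a], times[t]]
          print line
        }
      }
    }
  ' "$1"
}
```

With the file from the question it would be called as schedule_to_csv f.txt. A combination that never occurs in the input (a person without an entry at some time) is left as an empty field.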
Why is "james version1" one attribute and not two, like "james", "version1"?
Why is eating,nothing two (unrelated) attributes and not one, like eating:nothing?
Why did you turn relations like "James version2 12:00 eating:fruits" into unrelated values?
You could get two entities from the data:
entity 1: ID,Name,Version
1,james,1
2,james,2
3,alex,1
4,alex,2
entity 2: ID,Schedule,Task,Activity,What
1,12:00,1,eating,nothing
1,12:00,2,rest,onehour
1,12:00,3,watching,nothing
2,12:00,1,eating,fruits
2,12:00,2,rest,twohour
2,12:00,3,watching,manga
...
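The two-entity split described above could be sketched in awk roughly as follows. This is only an illustration, assuming the same input format as in the question (header lines "at HH:MM the schedule of NAME VERSION is :" followed by "xxx_task:activity:value" lines). The function name schedule_to_entities is hypothetical, and the mapping of first/second/third to 1/2/3 is an assumption made to match the Task column of the example rows; any other prefix is printed unchanged.

```shell
# Hypothetical helper: schedule_to_entities FILE
# Prints the person table, a blank line, then the task table.
schedule_to_entities() {
  awk '
    BEGIN { ord["first"] = 1; ord["second"] = 2; ord["third"] = 3 }
    # Header line: remember the time and assign an ID to each new person.
    /^at [0-9]+:[0-9]+ the schedule of / {
      time = $2
      who  = $6 " " $7
      if (!(who in id)) {
        id[who] = ++np
        version = $7
        sub(/^version/, "", version)      # "version1" -> "1"
        person[np] = np "," $6 "," version
      }
      cur = id[who]
      next
    }
    # Task line: one row of entity 2, linked to the current person ID.
    /^[a-z]+_task:/ {
      split($1, part, ":")
      prefix = part[1]
      sub(/_task$/, "", prefix)           # "first_task" -> "first"
      task = (prefix in ord) ? ord[prefix] : prefix
      row[++nr] = cur "," time "," task "," part[2] "," part[3]
    }
    END {
      print "ID,Name,Version"
      for (i = 1; i <= np; i++) print person[i]
      print ""
      print "ID,Schedule,Task,Activity,What"
      for (i = 1; i <= nr; i++) print row[i]
    }
  ' "$1"
}
```

Rows of the second table come out in input order, so all 12:00 entries precede the 18:00 ones. Piping that part through sort would group them by ID if that is preferred.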
But I am not good at Perl. Is there an awk command?
I would write a little parser program in C++.
Sara
ASKER
it works