I am new to awk, so any help is greatly appreciated.
My file has 4 columns: "Id#", "Version#", "Offset" and "Counter". For each Id#, I want to sum the 4th column ("Counter") over the first line, then over the first 2 lines, the first 6 lines, the first 12 lines, the first 72 lines, and finally the first 288 lines, if the Id# has that many lines; otherwise just move on to the next Id#.
So for Id# "113148", I should get "1 2 7 13 15",
and for Id# "113144", I should get "1 2 9 19 35".
.........
I have some generic code (attached below), but it does not give accurate results. In the code, "$CURRENT_VERSION" corresponds to the 1st line for a given Id#, "$TEN_MIN" to the first 2 lines, "$THIRTY_MIN" to the first 6 lines, "$ONE_HOUR" to the first 12 lines, and "$SIX_HOUR" to the first 72 lines.
This bash/awk script gives the following result:
for Id# 113148: 2 7 15
for Id# 113144: 1 1 14
These are not correct.
Any help? Thanks so much.
INDEX='partnerId.txt'
SUMMARY='partnerId_sum.txt'
CURRENT_VERSION=`awk '{ if ($2 > max) max = $2} END { print max }' $INDEX`
let "TEN_MIN = $CURRENT_VERSION - 1"
let "THIRTY_MIN = $CURRENT_VERSION - 5"
let "ONE_HOUR = $CURRENT_VERSION - 10"
let "SIX_HOUR = $CURRENT_VERSION - 60"
awk '$2 >= v1 {a[$1]++;b[$1]=b[$1]+$4} END {for (i in a) print i,b[i]}' v1=$CURRENT_VERSION $INDEX > $SUMMARY
echo "---" >> $SUMMARY
awk '$2 >= v1 {a[$1]++;b[$1]=b[$1]+$4} END {for (i in a) print i,b[i]}' v1=$TEN_MIN $INDEX >> $SUMMARY
echo "---" >> $SUMMARY
awk '$2 >= v1 {a[$1]++;b[$1]=b[$1]+$4} END {for (i in a) print i,b[i]}' v1=$THIRTY_MIN $INDEX >> $SUMMARY
echo "---" >> $SUMMARY
awk '$2 >= v1 {a[$1]++;b[$1]=b[$1]+$4} END {for (i in a) print i,b[i]}' v1=$ONE_HOUR $INDEX >> $SUMMARY
echo "---" >> $SUMMARY
awk '$2 >= v1 {a[$1]++;b[$1]=b[$1]+$4} END {for (i in a) print i,b[i]}' v1=$SIX_HOUR $INDEX >> $SUMMARY
echo "---" >> $SUMMARY
awk '{a[$1]++;b[$1]=b[$1]+$4} END {for (i in a) print i,b[i]}' $INDEX >> $SUMMARY
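
For comparison, here is a minimal sketch of the line-count approach described in the question: it counts lines per Id# instead of comparing "Version#" against a global maximum, as the script above does. It rests on two assumptions that the question does not confirm: the rows for each Id# already appear in the order in which they should be counted, and a cutoff is simply skipped when an Id# has fewer lines than that cutoff. The function name `cumulative_sums` and the `Id: sums` output layout are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: cumulative_sums FILE
# Prints, for each Id (column 1), the running sum of column 4 ("Counter")
# at the moment the Id reaches 1, 2, 6, 12, 72 and 288 lines.
# Assumption: rows of each Id are already in the desired order.
cumulative_sums() {
    awk '
    BEGIN { n = split("1 2 6 12 72 288", cut, " ") }   # line-count cutoffs
    {
        id = $1
        if (!(id in cnt)) order[++ids] = id   # remember first-seen order of Ids
        cnt[id]++                             # lines seen so far for this Id
        sum[id] += $4                         # running total of Counter
        for (k = 1; k <= n; k++)
            if (cnt[id] == cut[k])            # reached a cutoff: record the sum
                out[id] = out[id] " " sum[id]
    }
    END {
        for (j = 1; j <= ids; j++)
            print order[j] ":" out[order[j]]
    }' "$1"
}
```

With the file names from the script above, it could be called as `cumulative_sums partnerId.txt > partnerId_sum.txt`. Because everything is computed in a single pass, the six separate awk invocations and the `---` separators are no longer needed.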