Jason_Sutiono

Perl Script to count the number of elements in an array

Hi all,

I would really appreciate some input on how to do the following with a Perl script that processes a text file.

Here is my input file:
col1|col2|col3|col4|col5|col6|col7|col8|col9|col10|col11|col12
BLA|001036|S|3228|10|1|2|3|001036|W035|S|
BLA|001036|S|3228|0|0|0|0|001036|W035|S|08961029909655092918
BLA|001036|S|3228|0|0|0|0|001036|W035|S|08961029909655092926
BLA|001036|S|3228|0|0|0|0|001036|W035|S|08961029909655092934
BLA|001036|S|3228|0|0|0|0|001036|W035|S|08961029909655092942
BLT|600123|S|3437|0|20|0|0|001036|W035|S|
BRO|900177|S|3531|-1|0|0|0|001036|W035|S|
CHL|123777|S|3327|3|0|0|0|001036|W035|S|
CHL|123777|S|3327|0|0|0|0|001036|W035|S|08961029909655093791
CHL|123777|S|3327|0|0|0|0|001036|W035|S|08961029909655093775

The final output that I am trying to achieve:
BLA|001036|S|3228|10|1|2|3|001036|W035|S| |4
BLT|600123|S|3437|0|20|0|0|001036|W035|S| |0
BRO|900177|S|3531|-1|0|0|0|001036|W035|S| |0
CHL|123777|S|3327|3|0|0|0|001036|W035|S| |2

Basically, I am trying to count the number of strings that appear in the last column and append the count as a new column in the output file.

My references/main keys for the initial array are column 2 (001036) and column 4 (3228).

For each new occurrence of col 2 and col 4 (e.g. 001036 and 3228), the last column is always a space (" ").

So for the rows that follow it, where the last column is not " " (i.e. $col[11] ne " ", counting columns from 0), I need to count the number of strings that appear in the last column.
W035|S|
W035|S|08961029909655092918
W035|S|08961029909655092926
W035|S|08961029909655092934
W035|S|08961029909655092942

As such, the outcome for line 1 would be:
BLA|001036|S|3228|10|1|2|3|001036|W035|S| |4

In other words, $lastcol(001036)(3228)=4

The count of the strings is appended to the last column.

I would also need cols 5, 6, 7 and 8 from line 1.

Likewise for 123777 and 3327, since there are 2 strings that appear in the entries below it (08961029909655093791 and 08961029909655093775), the outcome is
CHL|123777|S|3327|3.0000|0.0000|0.0000|0.0000|001036|W035|S| |2

If there are no entries below it, I would just append a 0 at the end of it
e.g BLT|600123|S|3437|0|20|0|0|001036|W035|S| |0

I hope I am clear in my brief.

Looking forward to the responses!!

Thank you in advance!

Jason
Perl, Programming
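The accepted answer's code is not shown in this thread, so the following is only a minimal sketch of one way to produce the output described in the question, not necessarily ozo's approach. It assumes that a row with a blank col12 is a summary row, that any row with a non-blank col12 is a detail row belonging to the summary row with the same col2 and col4, and that detail rows are counted per key regardless of position. The helper name `summarize` is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): takes the data lines and returns
# one line per summary row, with the number of matching detail rows appended.
sub summarize {
    my @lines = @_;
    my (@order, %summary, %count);
    for my $line (@lines) {
        $line =~ s/\s+\z//;                  # drop trailing newline/whitespace
        next unless length $line;
        my @col = split /\|/, $line, -1;     # -1 keeps a trailing empty field
        next if @col < 11;                   # skip malformed lines
        my $key  = "$col[1]|$col[3]";        # col2 + col4 identify a group
        my $last = $col[11] // '';
        if ($last eq '') {                   # blank col12: a summary row
            push @order, $key unless exists $summary{$key};
            $summary{$key} = join '|', @col[0 .. 10];
            $count{$key} //= 0;
        }
        else {                               # non-blank col12: a detail row
            $count{$key}++;
        }
    }
    # Rebuild each summary row as "...|S| |N", in first-seen order
    return map { "$summary{$_}| |$count{$_}" } @order;
}

# A few rows from the question's sample input
my @input = qw(
    BLA|001036|S|3228|10|1|2|3|001036|W035|S|
    BLA|001036|S|3228|0|0|0|0|001036|W035|S|08961029909655092918
    BLA|001036|S|3228|0|0|0|0|001036|W035|S|08961029909655092926
    BLT|600123|S|3437|0|20|0|0|001036|W035|S|
    CHL|123777|S|3327|3|0|0|0|001036|W035|S|
    CHL|123777|S|3327|0|0|0|0|001036|W035|S|08961029909655093791
);
print "$_\n" for summarize(@input);
```

With a real file, the lines read from the filehandle could be passed in directly, e.g. `summarize(<FILE4>)`. The `col1|...|col12` header line does no harm in this sketch: its last field is non-blank, so it is treated as a detail row for a key that never gets a summary row.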


8/22/2022 - Mon
ASKER CERTIFIED SOLUTION
ozo

Jason_Sutiono

ASKER
Thanks Ozo for your help. I will be trying out your solution tomorrow.

Actually there is one thing I forgot to ask.

Input:
col1|col2|col3|col4|col5|col6|col7|col8|col9|col10|col11|col12
MCA|U8350WHT|S|3320|1|0|0|166.5400|U8350WHT|W007|S|
MCA|U8350WHT|S|3320|0|0|0|0|U8350WHT|W007|S|356899040614534
MEL|U8350WHT|S|3532|2|0|0|166.5400|U8350WHT|W007|S|
MEL|U8350WHT|S|3532|0|0|0|0|U8350WHT|W007|S|356899040614526
MEL|U8350WHT|S|3532|0|0|0|0|U8350WHT|W007|S|356899040614658
MOR|U8350WHT|S|3867|1|0|0|166.5400|U8350WHT|W007|S|
MOR|U8350WHT|S|3867|0|0|0|0|U8350WHT|W007|S|356899040614971
PEN|U8350WHT|S|3526|1|0|0|166.5400|U8350WHT|W007|S|
PEN|U8350WHT|S|3526|0|0|0|0|U8350WHT|W007|S|356899040614690

What should I do to get only the rows where the last column is equal to " "?

Outcome:
MCA|U8350WHT|S|3320|1|0|0|166.5400|U8350WHT|W007|S|
MEL|U8350WHT|S|3532|2|0|0|166.5400|U8350WHT|W007|S|
MOR|U8350WHT|S|3867|1|0|0|166.5400|U8350WHT|W007|S|
PEN|U8350WHT|S|3526|1|0|0|166.5400|U8350WHT|W007|S|

I have tried to filter on col12 being " ", as in the attached code, but it prints out everything, including the rows whose last column is not " ".

Help would be much appreciated.

Thank you!

foreach (<FILE4>) {
		
		if ($col[11] = ~/\S/; {
	push(@soh"$col[0]|$col[1]|$col[2]|$col[3]|$col[4]|$col[5]|$col[6]|$col[7]|$col[8]|$col[9]|$col[10]|$col[11]");

}
}


ozo

@soh = grep /\|\s*$/,<FILE4>;
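The pattern `/\|\s*$/` in the one-liner above matches a line whose last `|` is followed only by optional whitespace up to the end of the line, which is exactly the rows with a blank last column. The loop in the question fails for two reasons: `@col` is never filled by a `split`, and `= ~/\S/` (with a space) parses as an assignment of a bitwise complement, which is always true, rather than the match operator `=~`. The sketch below wraps the same pattern in a hypothetical helper, `blank_last_column`, and runs it on in-memory rows instead of the `FILE4` handle.

```perl
use strict;
use warnings;

# Hypothetical helper wrapping the pattern from the answer: keep only lines
# whose last "|" is followed by nothing but optional whitespace.
sub blank_last_column {
    return grep { /\|\s*$/ } @_;
}

# Sample rows from the follow-up question, with newlines as read from a file
my @rows = (
    "MCA|U8350WHT|S|3320|1|0|0|166.5400|U8350WHT|W007|S|\n",
    "MCA|U8350WHT|S|3320|0|0|0|0|U8350WHT|W007|S|356899040614534\n",
    "MEL|U8350WHT|S|3532|2|0|0|166.5400|U8350WHT|W007|S|\n",
    "MEL|U8350WHT|S|3532|0|0|0|0|U8350WHT|W007|S|356899040614526\n",
);

my @soh = blank_last_column(@rows);
print @soh;    # prints only the MCA and MEL rows that end in "|"
```

Because `\s*` also matches a single space, rows ending in `| ` are kept as well.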
Jason_Sutiono

ASKER
Hi Ozo,

Thank you for your help!!

It's almost there, just one thing though. The output that I get is:

BLA|001036|S|3228|10|1|2|3|001036|W035|S|
|4
CHL|123777|S|3327|3|0|0|0|001036|W035|S|
|2
BLT|600123|S|3437|0|20|0|0|001036|W035|S|
|0
BRO|900177|S|3531|-1|0|0|0|001036|W035|S|
|0


How do I get the count value to not print to a new line?

BLA|001036|S|3228|10|1|2|3|001036|W035|S| |4
CHL|123777|S|3327|3|0|0|0|001036|W035|S| |2
BLT|600123|S|3437|0|20|0|0|001036|W035|S| |0
BRO|900177|S|3531|-1|0|0|0|001036|W035|S| |0
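A likely cause, assuming the script appends the count to a line that was read straight from the file, is that the line still ends with its newline, so the count lands on the next line. The sketch below strips the trailing whitespace first and then appends the count; the helper name `append_count` is hypothetical, and the script that produced the output above is not shown in the thread.

```perl
use strict;
use warnings;

# Hypothetical helper: append a count column to a line read from a file.
sub append_count {
    my ($line, $count) = @_;
    $line =~ s/\s+\z//;            # remove the trailing newline (and any "\r" or spaces)
    return "$line |$count\n";      # blank col12, then the count as a new column
}

print append_count("BLA|001036|S|3228|10|1|2|3|001036|W035|S|\n", 4);
# prints: BLA|001036|S|3228|10|1|2|3|001036|W035|S| |4
```

A plain `chomp $line;` would also work for files with Unix line endings; the substitution is used here so that Windows line endings are handled too.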

Thank you in advance!!
Jason_Sutiono

ASKER
Thanks Ozo, you rock!!