Best way to split string

Hi

I am trying to split a string based on the content in the second column. Below is an example of the file.

Example%%%--14986.mtf, Example%%%--14986.aqm      AQM
Example%%%--14986.mtf, Example%%%--14986.aqm      MTF
Example%%%--20080.mtf, Example%%%--20080.aqm      AQM
Example%%%--20080.mtf, Example%%%--20080.aqm      MTF

I would like for my results to produce a file like this:
Example%%%--14986.aqm
Example%%%--14986.mtf
Example%%%--20080.aqm
Example%%%--20080.mtf

Is this possible?
 
woolmilkporc commented:
awk -F",| " '{if($NF=="MTF") printf "%s", $1; if($NF=="AQM") printf "%s",$3; print ""}' inputfile > outputfile
 
woolmilkporc commented:
More versatile (and shorter):

awk -F",| " '{for(n=1;n<=NF;n++) if($n~tolower($NF)"$") print $n}' inputfile > outputfile
 
DOCDGA (Author) commented:
Thanks, woolmilkporc.

I am new to Linux and its commands. Can you explain what the command is doing?

I just tried your solution and the outputfile was empty.
I appreciate your help.
 
woolmilkporc commented:
If there was no output, then the input data you used cannot be identical to the data you posted.
Please post the data you used.
 
woolmilkporc commented:
My second solution prints each field whose ending matches, in lowercase, the last field of the line.
So if a line has "AQM" in its last field, the code searches that line for a field ending in "aqm" and prints it if found, e.g. "Example%%%--14986.aqm".

My first solution is limited to "AQM" and "MTF" and depends on exact field positions. The second works with any string found in the last field of the input line and looks for any field with a matching (lowercase) ending.
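
For readers new to awk, the one-liner can be unpacked into a commented script. This is only a sketch of the same logic: the function name pick_by_suffix and the temporary sample file are made up for illustration, and the sample lines are copied from the question.

```shell
# Hypothetical helper wrapping the second solution; the name is illustrative.
pick_by_suffix() {
    awk -F",| " '
    {
        suffix = tolower($NF)        # last field, lowercased: "AQM" -> "aqm"
        for (n = 1; n <= NF; n++)    # walk over every field on the line
            if ($n ~ suffix "$")     # does this field end with the suffix?
                print $n
    }' "$1"
}

# Sample input taken from the question, written to a temporary file.
sample=$(mktemp)
printf '%s\n' \
    'Example%%%--14986.mtf, Example%%%--14986.aqm      AQM' \
    'Example%%%--14986.mtf, Example%%%--14986.aqm      MTF' > "$sample"

pick_by_suffix "$sample"
```

With -F",| " every single comma or space counts as a separator, so the comma followed by a space and the run of spaces produce empty fields in between. Those empty fields never match the suffix and are simply skipped. For the two sample lines this prints the .aqm name first and the .mtf name second.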

If the field ending can be either lowercase or uppercase (instead of always lowercase), try this:

awk -F",| " '{for(n=1;n<=NF;n++) if($n~"."$NF"$"||$n~"."tolower($NF)"$") print $n}' inputfile > outputfile

Isn't this what you want? Please let me know if I misunderstood you.
 
DOCDGA (Author) commented:
Here is my data:

iCOLI100--20080.tfr, iCOLI100--20080.alo      TFR
iCOLI100--20175.tfr, iCOLI100--20175.alo      ALO
 
woolmilkporc commented:
OK, my very first solution cannot work with this data, because it contains neither "AQM" nor "MTF", but both follow-up solutions will work just fine, e.g. this one:

awk -F",| " '{for(n=1;n<=NF;n++) if($n~tolower($NF)"$") print $n}' inputfile > outputfile

If this doesn't produce the desired output, what is the desired output? Please post a sample!
 
DOCDGA (Author) commented:
I'm still getting a 0-byte output file with my data, but it works with the Example data.

iCOLI100--20080.tfr, iCOLI100--20080.alo      TFR
iCOLI100--20175.tfr, iCOLI100--20175.alo      ALO

Results should be
iCOLI100--20080.tfr
iCOLI100--20175.alo
 
woolmilkporc commented:
That's exactly what my solution provides:

Using this code:

awk -F",| " '{for(n=1;n<=NF;n++) if($n~tolower($NF)"$") print $n}' inputfile

with these data being in "inputfile":

iCOLI100--20080.tfr, iCOLI100--20080.alo      TFR
iCOLI100--20175.tfr, iCOLI100--20175.alo      ALO

I get this output:

iCOLI100--20080.tfr
iCOLI100--20175.alo

which seems to be what you're expecting.

So which code and which input data give you this "0 byte output file"?

With small input files you can omit "> outputfile" to see the results immediately on your terminal.
 
DOCDGA (Author) commented:
I just ran this:

[rwilliams@rwilliams ~]$ awk -F",| " '{for(n=1;n<=NF;n++) if($n~tolower($NF)"$") print $n}' prTest.txt
[rwilliams@rwilliams ~]$

I hope I explain this correctly. It seems my data has a tab where the data you're using has spaces.
In the data you're using, I can press the right arrow key 6 times between ".alo" and "TFR" (.alo - - - - - -TFR), but in my data
I press the right arrow once and I'm at the final string (.alo-TFR).

iCOLI100--20080.tfr, iCOLI100--20080.alo      TFR
iCOLI100--20175.tfr, iCOLI100--20175.alo      ALO

I manually edited the file, deleted the tab, and pressed the spacebar once. Now the command works, but is there a better way?
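
Before changing the command, it can help to confirm that the separator really is a tab. The following is a minimal sketch, assuming a POSIX shell with od and awk available; the function name count_tab_lines and the temporary file are hypothetical, and the sample line assumes a single tab before the last column, as described above.

```shell
# Hypothetical sample: one line from the thread with a tab before "TFR".
tabbed=$(mktemp)
printf 'iCOLI100--20080.tfr, iCOLI100--20080.alo\tTFR\n' > "$tabbed"

# od -c dumps the file character by character; a tab is shown as \t.
od -c "$tabbed"

# Count the lines that contain at least one tab character.
count_tab_lines() {
    awk '/\t/ { c++ } END { print c + 0 }' "$1"
}
count_tab_lines "$tabbed"

# With -F",| " the tab is not a separator, so the last field is the
# whole string "iCOLI100--20080.alo<TAB>TFR" rather than just "TFR".
awk -F",| " '{ print NF }' "$tabbed"
```

This also explains the empty output: the lowercased last field is then "icoli100--20080.alo<TAB>tfr", and no field on the line ends with that string, so nothing is printed.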
 
woolmilkporc commented:
That's easy:

awk -F",| |\t" '{for(n=1;n<=NF;n++) if($n~tolower($NF)"$") print $n}' prTest.txt
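
A possible variation, not taken from the thread: a bracket expression followed by + treats any run of commas, spaces, and tabs as a single separator, so each line splits into exactly three fields whether the last column is preceded by spaces or by a tab. The function name pick_any_whitespace and the temporary sample file are illustrative assumptions.

```shell
# Hypothetical helper; same matching logic as above, different separator.
pick_any_whitespace() {
    awk -F'[, \t]+' '
    {
        suffix = tolower($NF)        # last column, lowercased
        for (n = 1; n <= NF; n++)    # check every field on the line
            if ($n ~ suffix "$")     # field ends with the suffix?
                print $n
    }' "$1"
}

# Sample with one tab-separated and one space-separated last column.
mixed=$(mktemp)
{
    printf 'iCOLI100--20080.tfr, iCOLI100--20080.alo\tTFR\n'
    printf 'iCOLI100--20175.tfr, iCOLI100--20175.alo      ALO\n'
} > "$mixed"

pick_any_whitespace "$mixed"
```

For this sample the function prints iCOLI100--20080.tfr and iCOLI100--20175.alo. One caveat that applies to all of these commands: if a line ends in trailing spaces or tabs, the last field is empty and the pattern reduces to "$", which matches every field, so it may be worth stripping trailing whitespace first.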
 
DOCDGA (Author) commented:
Great job!!! Excellent communication, and you did a great job of explaining things. I learned something new.