Solved

Best way to split string

Posted on 2014-09-03
Last Modified: 2014-09-04
Hi

I am trying to split a string based on the content in the second column. Below is an example of the file.

Example%%%--14986.mtf, Example%%%--14986.aqm      AQM
Example%%%--14986.mtf, Example%%%--14986.aqm      MTF
Example%%%--20080.mtf, Example%%%--20080.aqm      AQM
Example%%%--20080.mtf, Example%%%--20080.aqm      MTF

I would like for my results to produce a file like this:
Example%%%--14986.aqm
Example%%%--14986.mtf
Example%%%--20080.aqm
Example%%%--20080.mtf

Is this possible?
Question by:DOCDGA
LVL 68

Expert Comment

by:woolmilkporc
ID: 40301493
awk -F",| " '{if($NF=="MTF") printf "%s", $1; if($NF=="AQM") printf "%s",$3; print ""}' inputfile > outputfile
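For readers following along, here is a minimal sketch of how awk numbers the fields when the separator is `",| "` (a comma or a single space). It assumes the columns are separated by plain spaces, as in the posted sample; `line` and `fields` are illustrative variable names only. Every single comma or space ends a field, so empty fields appear between consecutive separators, which is why the second file name is `$3` rather than `$2`.

```shell
# One sample line from the question (columns separated by spaces).
line='Example%%%--14986.mtf, Example%%%--14986.aqm      AQM'

# Print every field with its number so the empty fields become visible.
fields=$(printf '%s\n' "$line" |
  awk -F",| " '{for (n = 1; n <= NF; n++) printf "%d=[%s]\n", n, $n}')

printf '%s\n' "$fields"
```

The empty field 2 sits between the comma and the space that follows it; the run of spaces before the last column produces further empty fields, and `$NF` always refers to the last one.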
 
LVL 68

Expert Comment

by:woolmilkporc
ID: 40301544
More versatile (and shorter):

awk -F",| " '{for(n=1;n<=NF;n++) if($n~tolower($NF)"$") print $n}' inputfile > outputfile
 

Author Comment

by:DOCDGA
ID: 40301641
Thanks woolmilkporc,

I am new to Linux and its commands. Can you explain what the command is doing?

I just tried your solution and the outputfile was empty.
I appreciate your help.
 
LVL 68

Expert Comment

by:woolmilkporc
ID: 40301666
If there was no output then the input data you used were not identical to those you posted.
Please post the data you used.
 
LVL 68

Expert Comment

by:woolmilkporc
ID: 40301831
My second solution prints the field whose ending matches, in lowercase, the last field of the line.
So if a line has "AQM" in its last field, my code searches for the field of that line that ends in "aqm", e.g. "Example%%%--14986.aqm", and prints it if found.

While my first solution is limited to "AQM" and "MTF" and depends on exact field positions, my second solution takes whatever string appears in the last field of the input line and searches every field for a matching (lowercase) ending.

If this field ending can be lowercase or uppercase (instead of always lowercase) try this:

awk -F",| " '{for(n=1;n<=NF;n++) if($n~"."$NF"$"||$n~"."tolower($NF)"$") print $n}' inputfile > outputfile

Isn't this what you desire? Please let me know if I got you wrong.
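As a rough illustration of the explanation above, the lowercase-only one-liner can be spelled out with comments. This is only a sketch: the temporary file stands in for "inputfile", it assumes `mktemp` is available, and `input`, `result` and `suffix` are illustrative names.

```shell
# Hypothetical sample input: two lines copied from the question.
input=$(mktemp)
printf '%s\n' \
  'Example%%%--14986.mtf, Example%%%--14986.aqm      AQM' \
  'Example%%%--14986.mtf, Example%%%--14986.aqm      MTF' > "$input"

result=$(awk -F",| " '
{
    suffix = tolower($NF) "$"     # e.g. "aqm$": a regex anchored at the end of a field
    for (n = 1; n <= NF; n++)     # visit every field, including the empty ones
        if ($n ~ suffix)          # does this field end in the lowercase suffix?
            print $n
}' "$input")

printf '%s\n' "$result"
rm -f "$input"
```

The last field itself ("AQM") is not printed because the match is case-sensitive and the pattern is lowercase.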
 

Author Comment

by:DOCDGA
ID: 40301884
Here is my data

iCOLI100--20080.tfr, iCOLI100--20080.alo      TFR
iCOLI100--20175.tfr, iCOLI100--20175.alo      ALO
 
LVL 68

Expert Comment

by:woolmilkporc
ID: 40301932
OK, my very first solution cannot work with those data because they contain neither "AQM" nor "MTF", but both follow-up solutions will work just fine, e.g. this one:

awk -F",| " '{for(n=1;n<=NF;n++) if($n~tolower($NF)"$") print $n}' inputfile > outputfile

If this doesn't produce the desired output, what is the desired output then? Please post a sample!
 

Author Comment

by:DOCDGA
ID: 40302138
I'm still getting a 0-byte output file with my data, but it works with the Example data.

iCOLI100--20080.tfr, iCOLI100--20080.alo      TFR
iCOLI100--20175.tfr, iCOLI100--20175.alo      ALO

Results should be
iCOLI100--20080.tfr
iCOLI100--20175.alo
 
LVL 68

Expert Comment

by:woolmilkporc
ID: 40302188
That's exactly what my solution provides:

Using this code:

awk -F",| " '{for(n=1;n<=NF;n++) if($n~tolower($NF)"$") print $n}' inputfile

with these data being in "inputfile":

iCOLI100--20080.tfr, iCOLI100--20080.alo      TFR
iCOLI100--20175.tfr, iCOLI100--20175.alo      ALO

I get this output:

iCOLI100--20080.tfr
iCOLI100--20175.alo

which seems to be what you're expecting.

So by using which code and which input data do you get this "0 byte output file"?

With small input files you can omit "> outputfile" to see the results immediately on your terminal.
 

Author Comment

by:DOCDGA
ID: 40302240
I have just used this below:

[rwilliams@rwilliams ~]$ awk -F",| " '{for(n=1;n<=NF;n++) if($n~tolower($NF)"$") print $n}' prTest.txt
[rwilliams@rwilliams ~]$

I hope I explain this correctly. It seems my data has a tab where the data you're using has spaces.
In the data you're using I can press the right arrow key 6 times between .alo and TFR (.alo - - - - - -TFR), but in my data
I press the right arrow once and I'm at the final string (.alo-TFR).

iCOLI100--20080.tfr, iCOLI100--20080.alo      TFR
iCOLI100--20175.tfr, iCOLI100--20175.alo      ALO

I manually went into the file, deleted the tab, and pressed the spacebar once. Now the command works, but is there a better way?
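One way to confirm a suspicion like this without opening an editor is to dump the file character by character. The sketch below uses `od -c`, which shows a tab as `\t`; the temporary file is a stand-in for prTest.txt, `mktemp` is assumed to be available, and `sample`, `dump` and `tabs` are illustrative names.

```shell
# Hypothetical one-line file with a tab between the file names and the last column.
sample=$(mktemp)
printf 'iCOLI100--20080.tfr, iCOLI100--20080.alo\tTFR\n' > "$sample"

# od -c prints each byte; a tab appears as \t and a newline as \n.
dump=$(od -c "$sample")
printf '%s\n' "$dump"

# Count the lines that contain at least one tab character.
tabs=$(grep -c "$(printf '\t')" "$sample")
printf 'lines with tabs: %s\n' "$tabs"
rm -f "$sample"
```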
 
LVL 68

Accepted Solution

by:
woolmilkporc earned 500 total points
ID: 40302252
That's easy:

awk -F",| |\t" '{for(n=1;n<=NF;n++) if($n~tolower($NF)"$") print $n}' prTest.txt
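A possible variation, not part of the accepted answer: a bracket expression followed by `+` treats any run of commas, spaces and tabs as a single separator, so no empty fields are created and the same command handles both space-separated and tab-separated lines. This is a sketch under the assumption that the file names themselves contain no commas, spaces or tabs; the temporary file and the names `input` and `result` are illustrative.

```shell
# Hypothetical input: first line uses a tab, second line uses spaces.
input=$(mktemp)
printf 'iCOLI100--20080.tfr, iCOLI100--20080.alo\tTFR\n' > "$input"
printf 'iCOLI100--20175.tfr, iCOLI100--20175.alo      ALO\n' >> "$input"

# [, \t]+ : one or more commas, spaces or tabs form one separator.
# The loop stops before NF because the last field is the key, not a candidate.
result=$(awk -F'[, \t]+' '{
    for (n = 1; n < NF; n++)
        if ($n ~ (tolower($NF) "$")) print $n
}' "$input")

printf '%s\n' "$result"
rm -f "$input"
```

With this separator the two file names are always `$1` and `$2`, and the key is `$3`.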
 

Author Closing Comment

by:DOCDGA
ID: 40304095
Great job!!! Excellent communication and did a great job at explaining things. I learned something new.