G Ram

asked on

Separate out IP and text spanning different lines into a specified format using bash scripting

How can I reformat a text file that has the following layout and write the result to another text file?

10.10.10.06  | skjahdkjhhadjhahdahkahdhajkdhajkhjdkhakjhdjkahjdhajkhdjkahjkddddddddddddddddddhakkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkddshajhd
10.10.10.06  |dsjhdjhjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjj
 *ashadjahddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddda
10.10.10.06 | xcnbxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxzczc

I would like to have
 
10.10.10.06
-----------------
1) skjahdkjhhadjhahdahkahdhajkdhajkhjdkhakjhdjkahjdhajkhdjkahjkddddddddddddddddddhakkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkddshajhd
2) dsjhdjhjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjj
 *ashadjahddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddda

3) xcnbxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxzczc

Thanks,
GR
arnold

Use awk or cut to parse based on |

The first field will be the IP, the second is the data string

awk -F\| '{ printf "%s\n----------\n%s\n", $1, $2 }' Data_source_file
Don't forget to sort the file first if it is not already sorted
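
For single-line records, the grouped and numbered layout asked for in the question could be sketched in awk roughly as follows. This is only a sketch under stated assumptions: format_by_ip is a hypothetical helper name, every input line is assumed to hold one complete "IP | text" record, and continuation lines are not handled. Because it groups records in memory, the input does not have to be sorted first.

```shell
#!/bin/sh
# Sketch only: format_by_ip is a hypothetical helper, not part of the thread.
# Assumes every input line is a complete "IP | text" record.
format_by_ip() {
    awk -F'|' '
        {
            ip = $1
            gsub(/^[ \t]+|[ \t]+$/, "", ip)          # trim blanks around the IP
            text = substr($0, index($0, "|") + 1)    # everything after the first |
            sub(/^[ \t]+/, "", text)                 # drop leading blanks
            if (!(ip in count)) order[++nip] = ip    # remember first-seen order
            item[ip, ++count[ip]] = text
        }
        END {
            for (i = 1; i <= nip; i++) {
                ip = order[i]
                if (i > 1) print ""                  # blank line between IP groups
                print ip
                print "-----------------"
                for (j = 1; j <= count[ip]; j++)
                    printf "%d) %s\n", j, item[ip, j]
            }
        }' "$@"
}

# Example run on three made-up records
printf '%s\n' '10.10.10.06  | first message' \
              '10.10.10.07|other host' \
              '10.10.10.06 |second message' | format_by_ip
```

Called with a file name (format_by_ip Data_source_file > result.txt) it reads that file instead of standard input.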
G Ram

ASKER

Hello @arnold,
  Yes, this is OK. But how can I avoid the underlining when the second column spans multiple lines?

[Current output ]

10.10.10.06  
---------------
skjahdkjhhadjhahdahkahdhajkdhajkhjdkhakjhdjkahjdhajkhdjkahjkddddddddddddddddddhakkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkk
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
kkkkkkkkkkddshajhd
-------------------------------
10.10.10.06  
-----------------------

dsjhdjhjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjj
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
jjjjjjjjjjjjjjjjjjjjjjjjjjjjjj
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------



Desired output:

IP=10.10.10.06  

----------------

skjahdkjhhadjhahdahkahdhajkdhajkhjdkhakjhdjkahjdhajkhdjkahjkddddddddddddddddddhakkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkk
kkkkkkkkkkddshajhd

IP=10.10.10.06  
-----------------------

dsjhdjhjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjj
jjjjjjjjjjjjjjjjjjjjjjjjjjjjjj

Thanks,
GR
The difficulty is whether the data you gave includes patterns
|contin...

Try just the awk portion:
echo '1|2' | awk -F\| ' {printf "%s\n--------\n%s\n",$1,$2 } '
Do you get the following?
1
--------
2
G Ram

ASKER

Yes, I get that. The issue is 2nd column values spans multiple lines. As you can see, the 1st and 2nd column data are separated by |
"The issue is 2nd column values spans multiple lines."

Do you mean "it does and should not" or "it does not and should" for the problem you mention?

More precisely, does it span multiple lines in the initial data? If not, and you don't want that in the output, then simply concatenate. Depending on where you view the result, the viewer will either hide what falls outside the screen or wrap it onto additional lines.
Can you post a sample of the text file surrounding the portions that have this issue as an example?



If the second part spans multiple lines,

try the following, which adds a debugging marker to show whether each line of output comes from awk or from somewhere else.

awk -F\| '{ printf "%s\n----------\n%s\n++++++\n", $1, $2 }' Data_source_file

For the input 1|2 the effect is
1
----------
2
++++++

See what your output looks like.

Adding a condition outside the { }, such as (length($1) > 5), checks whether an IP is present in the first field:

awk -F\| '(length($1) > 5) { printf "%s\n----------\n%s\n", $1, $2 }' Data_source_file

in which case lines where the IP is not present are skipped (the threshold of five allows for any errant spaces, tab characters, ...). Note that a continuation line longer than five characters also passes this test, because without a | the whole line is $1.

See if that changes the display, though note that your data file might have
IP | comment
this is a new line continuing the comment.

The awk as posted only checks one line at a time.
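
One way to cope with continuation lines in awk is to treat only lines that begin with an IP-like pattern followed by | as the start of a record and to pass every other line through unchanged. The sketch below assumes exactly that convention; format_multiline is a hypothetical helper name, and the IP= prefix and rule width are copied from the desired output posted earlier.

```shell
#!/bin/sh
# Sketch only: format_multiline is a hypothetical helper, not part of the thread.
format_multiline() {
    awk '
        # A line that starts with something shaped like an IPv4 address
        # followed by | opens a new record.
        /^[ \t]*[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+[ \t]*[|]/ {
            bar = index($0, "|")
            ip = substr($0, 1, bar - 1)
            gsub(/[ \t]/, "", ip)               # strip blanks around the IP
            text = substr($0, bar + 1)
            sub(/^[ \t]+/, "", text)            # drop leading blanks of the text
            if (seen++) print ""                # blank line between records
            print "IP=" ip
            print "----------------"
            print text
            next
        }
        # Any other line continues the previous record and is printed as is.
        { print }
    ' "$@"
}

# Example run on a made-up record that spans three lines
printf '%s\n' '10.10.10.06 | first part of the text' \
              'second part of the text' \
              '10.10.10.06 | another entry' | format_multiline
```

Because the rule is printed only when a new record starts, continuation lines (including blank ones) are never underlined.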
G Ram

ASKER

Hello @Bernard,
  I mean that the data file I am parsing has this issue. The current solution by @arnold does separate out the columns, but since the 2nd column data spans multiple rows, the output is
10.10.10.06  
---------------
skjahdkjhhadjhahdahkahdhajkdhajkhjdkhakjhdjkahjdhajkhdjkahjkddddddddddddddddddhakkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkk
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
kkkkkkkkkkddshajhd
-------------------------------
What I need is just the IP in bold and, under it, each 2nd column value. It does not matter if the IP repeats, because the description would be different. I have already given my desired output in an earlier reply.

Thanks
Try the following script:
cat data | perl script.pl

Below is script.pl:
#!/usr/bin/perl
use strict;
use warnings;

# Print one record as: IP, a rule, then the (possibly multi-line) text.
sub emit {
    my ($ip, $text) = split /\|/, $_[0], 2;   # break the record into two parts only
    printf "%s\n-------\n%s\n", $ip, $text // '';
}

my $last = <STDIN>;            # first line of the first record
exit 0 unless defined $last;   # empty input: nothing to print
chomp $last;                   # remove the trailing newline
while (my $line = <STDIN>) {
    chomp $line;
    if ($line =~ /^\s*\d+\.\d+\.\d+\.\d+\s*\|/) {
        # the line starts with an IP, so the previous record is complete
        emit($last);
        $last = $line;
    }
    else {
        $last .= "\n$line";    # continuation: keep it on its own line
    }
}
emit($last);   # print the final record, or it would be omitted

G Ram

ASKER

Hello @arnold,
  I checked the Perl version; it is 5. I tried it out, and it gives better control over what we do. Sample input data is below.

10.10.10.06|An value exists.
It is sometimes opened by this/these Programs:
 notepad.exe
 notepad++.exe
 


Unless you know for sure what is behind it, you'd better
check your system
**** have been dynamically allocated to system
10.10.10.06|Certificate of this service will expire shortly

I do see a 3rd column of values. Is that why it is still outputting some lines separated by -----?

Thanks,
Are you on a Windows system?

perl script.pl <filename

I am unsure which file opens with notepad.
If you want to run script.pl directly, you would need to change its file association so that it opens with perl.

Not sure how to answer. The awk example splits each line on |,
numbering the resulting elements from 1.

I.e. 1|2|3|4|5|6|7
If passed to awk, only 1 and 2 will be output.
The issue begins if the record spans multiple lines:
IP | some text
Some additional text | some other info
With awk,
"IP" and "Some additional text" will be in column 1, while the .... in column 2

What is your environment made up of?

On Linux/UNIX, do you have an editor: vi, vim, emacs, nano, pico, etc.?
G Ram

ASKER

CentOS 7, vi editor. A bash script runs the SQL against an SQLite3 DB and puts the result in a text file, which I want to parse; that is when I hit the text wrap issues. perl -v gave me the version as 5. I am not familiar with Perl. So can I modify the script to handle more than 2 columns? The only issue with the Perl script given is that when column 2 has multiple lines with blank lines in between, it underlines some lines. I guess I could get around that by making column 1 (the IP) bold. Because this file will be mailed as an attachment, readability is important.
Yes,
split(delimiter, string, limit), where the limit on the number of elements is optional

In the perl script I was only interested in two fields: the IP and the second column.
Changing the 2 to 3 will split the string into three columns if there are two | ...

Just add another %s\n to the format, the first argument of printf. It works the same way as in C. And add ,$array[2] ....
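
For comparison, a similar "split into at most N parts" effect is available in the shell itself through read with IFS set to |. The snippet below is a minimal sketch; the sample line and the variable names ip, description and rest are made up for illustration.

```shell
#!/bin/sh
# Sketch only: the sample line and variable names are made up.
line='10.10.10.06|Certificate will expire|severity=high|extra'

# Three variable names give at most three parts; the last one keeps the rest,
# including any further | characters (similar to a limit of 3 in Perl split).
IFS='|' read -r ip description rest <<EOF
$line
EOF

printf '%s\n-------\n%s\n%s\n' "$ip" "$description" "$rest"
```

Here rest ends up holding severity=high|extra, so nothing after the second | is lost.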
G Ram

ASKER

Hello @Arnold,
That works. But how do I output the parsed file so that sendmail in bash can send it as an attachment?
When you say attachment, presumably you mean not inline.

One way is to write the output to a file:
file="filename.$$"
The $$ is the PID of the process.

You can output the results into $file

Sendmail will include it.

If you use perl and a Mail module, you can encode the file.


In Bash, you need to use an email client such as mail, mutt, etc.
With those you can include the file as an attachment.


At no point did the question mention emailing the results ....



mail  -s "subject" somerecipient@somedomain.com <$file
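
Redirecting the file into mail, as in the command above, places its contents in the message body. A sketch of writing the report to a temporary file and attaching it instead might look like this. Several things here are assumptions: save_report is a hypothetical helper, mktemp is used in place of the $$ naming, the address is the same placeholder as above, and the two mail commands are left commented out because the meaning of -a depends on which mail program is installed.

```shell
#!/bin/sh
# Sketch only: save_report is a hypothetical helper; the address is a placeholder.
save_report() {
    # Copy stdin into a fresh temporary file and print the file name.
    file=$(mktemp "${TMPDIR:-/tmp}/ip_report.XXXXXX") || return 1
    cat > "$file"
    printf '%s\n' "$file"
}

report=$(printf '%s\n' 'IP=10.10.10.06' '----------------' 'sample text' | save_report) || exit 1
echo "Report written to $report"

# Attach the file rather than feeding it in as the message body.
# Heirloom mailx (the mail command on CentOS 7) treats -a as "attach file";
# other mail programs give -a a different meaning, so check the man page first.
#   echo "Report attached." | mail -s "subject" -a "$report" somerecipient@somedomain.com
# With mutt, the attachment list ends with --:
#   echo "Report attached." | mutt -s "subject" -a "$report" -- somerecipient@somedomain.com
```

In a real script the printf line would be replaced by the parser, for example perl script.pl < data | save_report.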
G Ram

ASKER

Thanks @Arnold. I know that I have to use sendmail in bash. I just wanted to know how, after calling the perl script from bash, to return the result file to bash so that I can send it as an attachment using sendmail.
G Ram

ASKER

Using the $() operator?
If you use perl, you can send email from within it without the need to return the result to bash.

Example of sending email using perl:

https://learn.perl.org/examples/email.html

The example of sending with attachment ....


Can be seen...
See the use and example of attaching ......

https://metacpan.org/pod/Email::MIME
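
If the mail step stays in bash, the $() question above has two usual answers: capture the parser's output in a variable, or redirect it to a file and pass the file name along. The sketch below shows both. parse_report is a hypothetical stand-in for perl script.pl so that the snippet runs on its own, and the temporary file name pattern is an assumption.

```shell
#!/bin/sh
# Sketch only: parse_report is a hypothetical stand-in for: perl script.pl
parse_report() {
    awk -F'|' '{ printf "%s\n-------\n%s\n", $1, $2 }'
}

# Option 1: $() captures everything the parser prints into a shell variable.
result=$(printf '1|2\n' | parse_report)
printf '%s\n' "$result"

# Option 2: redirect the output to a file and pass the file name on.
outfile=$(mktemp "${TMPDIR:-/tmp}/parsed.XXXXXX") || exit 1
if printf '1|2\n' | parse_report > "$outfile"; then
    echo "parsed output saved in $outfile"
else
    echo "parser failed" >&2
fi
```

For an attachment the file form is usually the more convenient one, since mail clients expect a file name to attach.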