Want to protect your cyber security and still get fast solutions? Ask a secure question today.Go Premium

x
  • Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 273
  • Last Modified:

File Manipulation: Sed and unique value replacement

I recently have been looking for a way to help ease a process, I have been working on it via some bash scripting: but I believe I am failing to go in the right direction and need some advice in which direction I should be pointed!

Problem: CSV File modification
1. Needs additional column (Simple)
2. Host Column Field contains dynamic data located within the DESC field (Where HOST data is)
3. Any duplicate DESC fields need to be removed **worry about later
--------------------------------------------------------

Simple example of what the file presently looks like:
ID, IP, PORT, DESC

What the file needs to look like:
ID, IP, HOST, PORT, DESC

(simple sed s/IP/&\"\, \"$HOST_&/g) ...to keep $HOST and $IP unique

now I need to remove the "." with "_" to each $HOST  (simplest way to do this would be greatly appreciated)

I can also: sed -n 's/[0-9]{3} [parse ip]/&/p' | sed -n 's/DESC/&/p'  | sed -n 's/REMOVE garbage/\=/p'  .... until it leaves me with “UNIQUE_IP=CORRESPEONDING_UNIQUE_HOST”

eg: 10.10.10.1=John.thinker.net (sed again to get it to HOST_10_10_10_1=John.thinker.net)

Here is where I am having a problem trying to associate the UNIQUE_HOST with the new HOST field to the corresponding UNIQUE_IP. It's already configured; but what is the easier way to search/replace through the file until all unique hosts are replaced (besides manually)

Original:

“334”,”10.10.10.1”,””,”John.thinker.net”
“335”,”10.10.10.2”,”23”,”Telnet”

Currently:
“334”,”10.10.10.1”,”$HOST_10_10_10_1”,””,”John.thinker.net”
“335”,”10.10.10.2”,”$HOST_10_10_10_2”,”23”,”Telnet

Need:
“334”,”10.10.10.1”,”John.thinker.net”,””,”John.thinker.net”
“335”,”10.10.10.2”,”Mark.thinker.net”,”23”,”Telnet”




0
mattlal
Asked:
mattlal
  • 6
  • 3
  • 3
1 Solution
 
ahoffmannCommented:
> now I need to remove the "." with "_" to each $HOST
you mean replace . by _
  tr '.' '_' <file >newfile

hmm can't follow your other description ..
is this what you want:
  echo '"334","10.10.10.1","","John.thinker.net"' | awk -F, '{print $1","$2","$4","$3","$4}'
0
 
manav_mathurCommented:
How do you generate a list of
UNIQUE_IP=CORRESPEONDING_UNIQUE_HOST
for example, where do you resolve 10.10.10.2 to mark.thinker.net??

as this information is not in your file?? Or is it the description field that you want to be put into the host field??

0
 
mattlalAuthor Commented:
Sorry for the confusion:

the $HOST information can be found in the DESC field.


as for the $HOST(HOST_$IP[without "."])  [eg.$HOST10.10.1.1 cannot work must be $HOST_10_10_1_1; but how do i manipulate only one field without changing the other fields in the file?) [I was thinking use sed ...replace the 4-7th occurance only ..but i tried s/\./\{4,7\}/\_/g with no luck (I am more than likely just using a function wrong)]

as for generating unique_ip = unique_hostname that's easy: just use muliple sed and save to a serpate file; I can have it say anything "export HOST_10_10_1_1=john.thinker.net" (although here I would probably be best off with the s/\./\_/g on first 3! same problem as before; but still not the real issue; more concerend on $HOST_($IP) replacement with correct value according to the sed'd file (or however else it can be done :) )
0
VIDEO: THE CONCERTO CLOUD FOR HEALTHCARE

Modern healthcare requires a modern cloud. View this brief video to understand how the Concerto Cloud for Healthcare can help your organization.

 
ahoffmannCommented:
confusing, very confusing ...
could you please just post your input file (2 or 3 lines) and the output you expect for that, nothing else
0
 
manav_mathurCommented:
What we can do in Perl is
- make one pass thru the file, storing all ip->host mapping in memory (read)
- make another pass thru the file, reading the correct host entry corresponding to a record's IP from the memory, and insert the host value into a record....that will save us the intermediate process of creating HOST_10_10_2 type files.

Is this what you want??

Manav
0
 
mattlalAuthor Commented:
Manav: that could work as well,

I was trying for a generic script to work almost anywhere; but perl has it's place to in this: Originally I thought it would have done simply with bash/sed/awk/etc. (guess not!)

ahoffmann:
------------------------------------------
Original:
IP, PORT, DESC
"10.10.10.1”,””,”John.thinker.net”
”10.10.10.2”,”23”,”Telnet”

Need:
IP,HOST,PORT,DESC
”10.10.10.1”,”John.thinker.net”,””,”John.thinker.net”
”10.10.10.2”,”Mark.thinker.net”,”23”,”Telnet”


if this helps: it's basically a custom nmap output....the needs an extra field from the resolved hostname
0
 
manav_mathurCommented:
The only thing I dont uderstand is where do you get "mark.thinker.net" corresponding to 10.10.10.2.....does it occur somewhere down in the input file??

use strict ;
use warnings ;
my %host; #this will maintain our ip->host mappings
open(INFILE,"< input") or die("$! : cant open input") ;
while(<INFILE>) {
chomp ;
my ($id,$ip,$port,$desc) = split(/,/$_,4) ;
$hash{$ip}=$desc if !$hash{$ip}; ##map ip->host if entry for tthat ip doesn;t exist
}
seek(INFILE,0,0) ;
while(<INFILE>) {
my ($id,$ip,$port,$desc) = split(/,/$_,4) ;
print "$id,$ip,$host{$ip},$port,$desc" ; ##insert host for the corresonding ip
}
close(INFILE) ;
 
 

 
0
 
mattlalAuthor Commented:
oh sorry should have posted a few more lines:

"10.10.10.2”,”443”,”https"
"10.10.10.2”,”21”,”ftp”
"10.10.10.2”,””,”mark.thinker.net”

(and 10.10.10.3 may not have a hostname associated with it; but neeeds the "field" to maintain the CSV integrity).

The split should probably be done with ' "," '  [There might be ',' in the DESC field]

The DESC field contains many "banners"


0
 
manav_mathurCommented:
use strict ;
use warnings ;
my %host; #this will maintain our ip->host mappings
open(INFILE,"< input") or die("$! : cant open input") ;
while(<INFILE>) {
chomp ;
my ($id,$ip,$port,$desc) = split(/,/$_,4) ;
$hash{$ip}=$desc if($desc =~ /^(?:\d{1,3}\.){3}\d+$/) ; ##map ip->host if desc matched aa.bb.cc.dd
}
seek(INFILE,0,0) ;
while(<INFILE>) {
my ($id,$ip,$port,$desc) = split(/,/$_,4) ;
print "$id,$ip,$host{$ip},$port,$desc" ; ##insert host for the corresonding ip
}
close(INFILE) ;

as split limit has been set as 4, any commas in desc will not be considered for splitting as long as there is no comma in $id,$ip,$port
0
 
manav_mathurCommented:
Ok, is this what you want

SCRIPT
=====
use strict ;
use warnings ;
my %host; #this will maintain our ip->host mappings
open(INFILE,"< input_file") or die("$! : cant open input") ;
while(<INFILE>) {
chomp ;
my ($id,$ip,$port,$desc) = split(/,/,$_,4) ;
$host{$ip}=$desc if($desc =~ /^\"\w+\.\w+\.\w+\"$/) ;
}
seek(INFILE,0,0) ;
while(<INFILE>) {
my ($id,$ip,$port,$desc) = split(/,/,$_,4) ;
print "$id,$ip,$host{$ip},$port,$desc" ; ##insert host for the corresonding ip
}
close(INFILE) ;


INPUT
====
"334","10.10.10.1","","John.thinker.net"
"335","10.10.10.2","23","Telnet"
"336","10.10.10.2","23","Telnet"
"336","10.10.10.2","443","https"
"336","10.10.10.2","21","ftp"
"336","10.10.10.2","","mark.thinker.net"

OUTPUT
=====
"334","10.10.10.1","John.thinker.net","","John.thinker.net"
"335","10.10.10.2","mark.thinker.net","23","Telnet"
"336","10.10.10.2","mark.thinker.net","23","Telnet"
"336","10.10.10.2","mark.thinker.net","443","https"
"336","10.10.10.2","mark.thinker.net","21","ftp"
"336","10.10.10.2","mark.thinker.net","","mark.thinker.net"
0
 
manav_mathurCommented:
tested it with different inputs
===================

E:\>type file1
"334","10.10.10.1","","John.thinker.net"
"335","10.10.10.2","23","Telnet"
"336","10.10.10.2","23","Telnet"
"336","10.10.10.1","443","https" ##notice this changes ip address
"336","10.10.10.1","21","ftp"      ##and here again
"336","10.10.10.2","","mark.thinker.net"

E:\>tr.pl
"334","10.10.10.1","John.thinker.net","","John.thinker.net"
"335","10.10.10.2","mark.thinker.net","23","Telnet"
"336","10.10.10.2","mark.thinker.net","23","Telnet"
"336","10.10.10.1","John.thinker.net","443","https"
"336","10.10.10.1","John.thinker.net","21","ftp"
"336","10.10.10.2","mark.thinker.net","","mark.thinker.net"
0
 
ahoffmannCommented:
hmm, my suggestion was nearly perfect :-(
awk -F, '{if($3~/""/){print $1","$2","$4","$4}else{print $1","$2",\"mark.thinker.net\","$2","$3}}'

or do I miss something?
0

Featured Post

Free Tool: IP Lookup

Get more info about an IP address or domain name, such as organization, abuse contacts and geolocation.

One of a set of tools we are providing to everyone as a way of saying thank you for being a part of the community.

  • 6
  • 3
  • 3
Tackle projects and never again get stuck behind a technical roadblock.
Join Now