• Status: Solved

SED Download and Edit

I have a very large UNIX file (>4G) that I need to edit. Every occurrence of a tilde (~) not followed by a pipe (|) should be changed to a tilde followed by a space. I am told that SED is a UNIX stream editing program that would be ideal for this kind of edit, and that SED is freely available on the web.

First, where can I get an SED download?

Second, what would the replace command line parameter look like to accomplish the task?
Asked by: dlmedici
5 Solutions
 
rockiroadsCommented:
I take it you want sed for Windows, since it comes free with Unix anyway.
You can find a copy here:
http://gnuwin32.sourceforge.net/packages/sed.htm

Using sed is like substituting in vi.

e.g.

sed 's/abc/def/'

changes the first abc on each line to def (add a trailing g, as in s/abc/def/g, to change every occurrence)
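
For anyone who wants to try this before touching real data, here is a minimal sketch of the substitution with and without the g flag. The sample line and variable names are made up for illustration.

```shell
#!/bin/sh
# Made-up sample line with three matches.
line='abc abc abc'

# Without the g flag, only the first match on each line is replaced.
first_only=$(printf '%s\n' "$line" | sed 's/abc/def/')

# With the g flag, every match on the line is replaced.
all_matches=$(printf '%s\n' "$line" | sed 's/abc/def/g')

printf 'first only:  %s\n' "$first_only"
printf 'all matches: %s\n' "$all_matches"
```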

 
rockiroadsCommented:
On Unix, you display a file using cat <file>.
On Windows, it's type <file>.

I've not tried sed for Windows, but on Unix you change a word like this:

cat myfile.txt | sed s/OLDSTR/NEWSTR/

This writes the result to the screen, so to redirect it to a new file:

cat myfile.txt | sed s/OLDSTR/NEWSTR/ > newfile.txt

Can you use pipes in DOS?

 
sunnycoderCommented:
Hi dlmedici,

> First, where can I get an SED download?
You forgot to mention the platform. sed is available for most platforms, so a simple Google search should provide you with a link.
 
> Second, what would the replace command line parameter look like to
> accomplish the task?
sed 's:~|:~ :g' input_file_name > output_file_name

mv output_file_name input_file_name

The mv would replace your input file with the processed version. Note that your original file will be lost by this command, so avoid it if you are not sure.
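
One cautious way to do the replace step is sketched below: keep a copy of the original and only overwrite it if sed exits successfully. The file names, the sample data and the OLDSTR/NEWSTR placeholder expression are all assumptions for illustration. On a file of several gigabytes the extra copy needs the same amount of free disk space again.

```shell
#!/bin/sh
# Hypothetical file names and made-up sample data.
workdir=$(mktemp -d)
input_file="$workdir/input_file_name"
output_file="$workdir/output_file_name"
printf 'OLDSTR one\ntwo OLDSTR\n' > "$input_file"

# Keep a backup so the original is not lost.
cp "$input_file" "$input_file.orig"

# Only replace the input file if sed reported success.
if sed 's/OLDSTR/NEWSTR/g' "$input_file" > "$output_file"; then
    mv "$output_file" "$input_file"
else
    echo "sed failed; original left untouched" >&2
    rm -f "$output_file"
fi

cat "$input_file"
```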

>I have a very large UNIX file (>4G) that I need to edit.
sed might break when given input this large, but you can still give it a try. The following command might speed up the processing a bit for you, because it only attempts the substitution on lines that contain ~| ...

sed '/~|/ s/~|/~ /g' input_file_name > output_file_name

Cheers!
Sunnycoder
 
root_startCommented:
Hi dlmedici,

Sunnycoder, your command does not do what dlmedici needs, because it changes every "~|" to "~ ". What is needed here is: if "~" is NOT followed by a pipe, change it to "~ ".

You can use sed to do what you want. If you want to change the file in place without redirecting the output, you can use "sed -i" (supported by GNU sed). If you want to redirect the output to a new file instead, just use the sed command below:
==================================================================
sed 's/~/~ /g;s/~ |/~|/g' your_file > result.txt
==================================================================
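
A minimal sketch of the in-place variant, assuming a sed that accepts a backup suffix attached to -i (GNU sed and the BSD/macOS sed both do, but -i is not part of POSIX). The file name data.txt and its one-line content are made up for illustration.

```shell
#!/bin/sh
# Hypothetical file name and made-up sample data.
workdir=$(mktemp -d)
target="$workdir/data.txt"
printf '~|a~b~|c\n' > "$target"

# -i.bak edits data.txt in place and keeps the original as data.txt.bak.
sed -i.bak 's/~/~ /g;s/~ |/~|/g' "$target"

edited=$(cat "$target")
backup=$(cat "$target.bak")
printf 'edited: %s\n' "$edited"
printf 'backup: %s\n' "$backup"
```

Note that in-place editing normally still writes a complete new copy of the file behind the scenes, so it should not be expected to save disk space on a multi-gigabyte file.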

I created the following file to check what you need:
==================================================================
[utest@Test]:# cat txt
~|aaaaaaaaaaaaaaaaaaaaaaaaaaaa~aaaaaaaaaaaaaaaaaa~|aaaaaaaaaaaaaaaaaaaaaaa
==================================================================

==================================================================
[utest@Test]:# sed 's/~/~ /g;s/~ |/~|/g' txt
~|aaaaaaaaaaaaaaaaaaaaaaaaaaaa~ aaaaaaaaaaaaaaaaaa~|aaaaaaaaaaaaaaaaaaaaaaa
==================================================================

As you can see, it did what you were looking for.
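
A few edge cases are worth checking as well. The sketch below wraps the same two-step expression in a hypothetical helper, fix_tildes, and runs it on made-up inputs: consecutive tildes, a tilde at the end of a line, and a tilde directly before a tilde-pipe pair.

```shell
#!/bin/sh
# Hypothetical helper around the two-step expression shown above.
fix_tildes() {
    sed 's/~/~ /g;s/~ |/~|/g'
}

consecutive=$(printf '%s\n' 'a~~b' | fix_tildes)
end_of_line=$(printf '%s\n' 'a~' | fix_tildes)
before_pipe=$(printf '%s\n' 'a~~|b' | fix_tildes)

# Brackets make a trailing space visible.
printf '[%s]\n' "$consecutive" "$end_of_line" "$before_pipe"
```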

Hope it helps. =0)
 
root_startCommented:
Oops... I forgot to mention that you can download sed from the following links:
==================================================================
GNU(Unix): http://www.gnu.org/software/sed/sed.html
Windows: http://gnuwin32.sourceforge.net/packages/sed.htm
==================================================================

You can download the binaries, or compile it yourself if you can't find precompiled binaries for your platform.

Again, I hope it helps. =0)
 
TintinCommented:
rockiroads,

No need for the UUOC (useless use of cat).

sed s/OLDSTR/NEWSTR/ myfile.txt

is the same as

cat myfile.txt | sed s/OLDSTR/NEWSTR/

but doesn't need to fork an additional process.
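
A quick way to convince yourself the forms are equivalent is sketched below, using a made-up myfile.txt in a temporary directory. It also shows input redirection with <, which avoids the extra cat process as well.

```shell
#!/bin/sh
# Made-up sample file.
workdir=$(mktemp -d)
printf 'OLDSTR here\nnothing to change\n' > "$workdir/myfile.txt"

# Through cat and a pipe (two processes).
with_cat=$(cat "$workdir/myfile.txt" | sed 's/OLDSTR/NEWSTR/')

# File name passed straight to sed (one process).
as_argument=$(sed 's/OLDSTR/NEWSTR/' "$workdir/myfile.txt")

# File supplied on standard input by the shell (one process).
with_redirect=$(sed 's/OLDSTR/NEWSTR/' < "$workdir/myfile.txt")

[ "$with_cat" = "$as_argument" ] && [ "$as_argument" = "$with_redirect" ] \
    && echo "all three forms give the same output"
```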
 
sunnycoderCommented:
Hi root_start,

Thanks for bringing the "not" to my attention. Even so, using two sed expressions would still be unnecessary ... :)

Hi dlmedici,

sed 's:~[^|]:~ :g' infile > outfile

would be sufficient. Since the input data is voluminous, using a single expression could have a significant impact on processing time.

Cheers!
sunnycoder
 
sunnycoderCommented:
Hi dlmedici,

Another issue ...
>Every occurrence of a tilde (~) not followed by a pipe (|) should be changed to a tilde followed by a space.
Do you want the next non-pipe character to be replaced with the space, or would you like it to stay, with a space added between the ~ and this character?

If you wish to have it replaced, then the above script would work fine. If you wish to preserve it, then use

sed 's:~\([^|]\):~ \1:g' infile > outfile

This would still not be able to handle two or more consecutive tildes. Such input would necessitate the use of multiple sed expressions, and you might as well use root_start's solution in that case.
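
To make the differences concrete, here is a small sketch that runs the three expressions from this thread on one made-up line containing a tilde-pipe pair, a single tilde, two consecutive tildes and a tilde at the end of the line. The variable names are hypothetical.

```shell
#!/bin/sh
# Made-up sample: "~|" pair, single "~", "~~", and a trailing "~".
sample='~|a~b~~c~'

# Replaces the character after the tilde with a space (that character is lost).
replace_next=$(printf '%s\n' "$sample" | sed 's:~[^|]:~ :g')

# Keeps the next character, but skips the second of two tildes and a trailing tilde.
keep_next=$(printf '%s\n' "$sample" | sed 's:~\([^|]\):~ \1:g')

# root_start's two-step version: space after every tilde, then undo it before a pipe.
two_step=$(printf '%s\n' "$sample" | sed 's/~/~ /g;s/~ |/~|/g')

# Brackets make a trailing space visible.
printf 'replace_next: [%s]\n' "$replace_next"   # [~|a~ ~ c~]
printf 'keep_next:    [%s]\n' "$keep_next"      # [~|a~ b~ ~c~]
printf 'two_step:     [%s]\n' "$two_step"       # [~|a~ b~ ~ c~ ]
```

Only the two-step version puts a space after every tilde that is not followed by a pipe. The single-expression versions need a character after the tilde to match, so they also miss a tilde at the end of a line.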

Cheers!
sunnycoder
 
root_startCommented:
Hi dlmedici,

Since the input is so large, you can run sed with nohup and leave it executing in the background. =0)
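
A minimal sketch of what that could look like, using made-up file names and a tiny sample file in place of the real data. The wait at the end is only there so the sketch can show the result; for a long-running job you would normally log out and check the output and error files later.

```shell
#!/bin/sh
# Hypothetical file names and made-up sample data.
workdir=$(mktemp -d)
infile="$workdir/infile"
outfile="$workdir/outfile"
errfile="$workdir/sed.err"
printf '~|a~b\n' > "$infile"

# nohup keeps sed running after logout; & puts it in the background.
# Output and errors are redirected so nothing is written to nohup.out.
nohup sed 's/~/~ /g;s/~ |/~|/g' "$infile" > "$outfile" 2> "$errfile" &
sed_pid=$!

# Demonstration only: block until the background job finishes.
wait "$sed_pid"
sed_status=$?

printf 'exit status: %s\n' "$sed_status"
cat "$outfile"
```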
 
ahoffmannCommented:
Most sed implementations would fail on such huge files, so I'd use perl instead:

perl  -pe 's/~(?!\|)/~ /g' file
# or
perl -i.bak -pe 's/~(?!\|)/~ /g' file
 
dlmediciAuthor Commented:
Thank you, Everyone.  I'll forward these excellent suggestions to the fellow in charge of the file.
Question has a verified solution.
