gadh98

asked on

sed script needed

(It can be done with other Unix/Linux utilities such as awk, nawk, etc.)
I need a script that takes a cpp file (or more than one, such as *.cpp) from the command line and does the following:
1. add <endline> at the end of the file (if it does not already end with one)
2. run "dos2unix" (I need batch processing too, and dos2unix itself does not support more than one input file)
3. put a new "hard line break" in each line
4. if one of my includes is written as <dds/DDSArray.h>, replace it with the same path in "" instead of <>
(standard includes will remain in <>)


Gad
ahoffmann

1. Do you mean adding the literal string   <endline>  as the last line of the file?
3. What do you mean by "hard line breaks"? (Which hex code?)
4. Do your includes have a unique pattern?
moonbeam012200

I'm a bit confused about the "<endline>" as well, but the
following perl script should do the trick.

-->cat cvt
#!/usr/bin/perl -w -i-orig
#
# program to convert dos formatted cpp files and
# perform include file syntax replacements.
#
###
while (<>) {
    $_ =~ y;\r;;d;
    $_ =~ s/<(.*)>/"$1"/;
    print;
}

To run this script enter:

cvt *.cpp

The original cpp files will be retained with "-orig" appended to the filename.

Oops... I forgot to account for leaving "standard" includes alone. Here is a new version.

-->cat cvt
#!/usr/bin/perl -w -i-orig
#
# program to convert dos formatted cpp files and
# perform include file syntax replacements.
#
###
while (<>) {
    $_ =~ y;\r;;d;
    $_ =~ s/<(.*\/.*)>/"$1"/;
    print;
}
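
For readers without perl, roughly the same two edits can be sketched with tr and sed. This is only an illustration: the function name cvt_stream is made up here, and it assumes a sed that accepts -E. It is slightly stricter than the perl version in that it only touches #include lines, but like the script above it does not yet treat <sys/...> specially.

```shell
# Sketch only: cvt_stream is a hypothetical name, not part of the thread.
# Reads C++ source on stdin and writes the converted text to stdout.
cvt_stream() {
    # 1. tr deletes every carriage return (the DOS half of CRLF).
    # 2. sed rewrites  #include <dir/file>  as  #include "dir/file";
    #    includes without a slash, such as <string.h>, are left alone.
    tr -d '\r' |
        sed -E 's|^([[:space:]]*#[[:space:]]*include[[:space:]]*)<([^<>/]+/[^<>]+)>|\1"\2"|'
}
```

It could be used as: cvt_stream < foo.cpp > foo.cpp.new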

gadh98

ASKER

moonbeam: I'll check your comment right away.
ahoffmann and moonbeam:
When I say <endline> I mean erasing the current <LF> (if I remember right, this is the line ending in Unix/Linux files)
and writing it again. Why?
Because I saw that when I compile cpp code with the Solaris Forte compiler, if I had two lines written in Windows like this:

#include <dds/DDSValue.h>
#include <dds/DDSArray.h>

sometimes the compiler ignored the second line!
Only after I inserted a newline between the two did it recognize it.
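
One part of this that can be checked mechanically is whether the last line of a file is terminated by a newline at all, since Windows editors often leave it off. A minimal sketch, assuming a tail that supports -c (the function name ensure_final_newline is invented for illustration):

```shell
# Hypothetical helper: append a newline to the file only if it lacks one.
ensure_final_newline() {
    f=$1
    # An empty file needs nothing.
    [ -s "$f" ] || return 0
    # Command substitution strips a trailing newline, so the result is
    # empty when the last byte of the file already is a newline.
    if [ -n "$(tail -c 1 "$f")" ]; then
        printf '\n' >> "$f"
    fi
}
```

Running it twice on the same file is harmless; the second call finds the newline and does nothing.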


A line feed (LF) is not used as a Unix EOF. Generally, applications will use <EOD> (^D) as an "end of document".

Unix has no end-of-document (^D) character inside files.
C and C++ compilers are free-form: you can write all your code in one long line and the compiler will recognize it.
You can use the C shell:
$ foreach fn (*.cpp)
foreach> cp -f $fn tmpcat
foreach> dos2unix tmpcat >$fn
foreach> cp -f $fn tmpcat
foreach> tr -d "\000" <tmpcat >$fn
foreach> cp -f $fn tmpcat
foreach> sed 's/<\(pattern\)>/"\1"/g' tmpcat >$fn
foreach> end
This will:
search for all your cpp files
run dos2unix on them
clean out all the null bytes left at the ends of lines and at the end of the file by the conversion
replace <pattern.h> with "pattern.h"
Now you can write a file with all the patterns and substitute
sed 's/<\(pattern\)>/"\1"/g' tmpcat >$fn
with
sed -f sedfile tmpcat >$fn
and the sed file will contain all the patterns (without surrounding quotes):
s/<\(pattern\)>/"\1"/g
s/<\(pattern2\)>/"\1"/g
etc.
You can run these commands from the command line or put them in a script file.
You can also work from a compressed tar file, generating the file list from a command such as
foreach fn (`gzip -cd som.tar.gz | tar xvf -`)
which uncompresses the archive, untars it, and passes the extracted files to the loop, which cleans them up and writes them back.
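
For a Bourne-style shell such as bash, the cleanup part of that loop could be sketched as below. This is an assumption-laden sketch rather than a drop-in replacement: it uses tr instead of dos2unix so that it does not depend on which dos2unix variant is installed, and the name clean_file and the .tmp.$$ suffix are arbitrary choices.

```shell
# Hypothetical helper: strip DOS carriage returns and NUL bytes from one file.
clean_file() {
    f=$1
    tmp="$f.tmp.$$"
    # Write to a temporary file first so the original is never left
    # half-written; replace it only if tr succeeded.
    tr -d '\r\000' < "$f" > "$tmp" && mv "$tmp" "$f"
}
```

A batch run would then be: for f in *.cpp; do clean_file "$f"; done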
 
gadh98, would you like to give more information (see the very first comment)? Otherwise you'll get incomplete answers.
gadh98

ASKER

Yes, I'll clarify a bit.
First, thanks for all your efforts.

To ahoffmann: your script is not working. This is the error:

./cvt: --: command not found
./cvt: line 8: syntax error near unexpected token `(<>)'
./cvt: line 8: `while (<>) {'


To ahoffmann's first comment:

1. I mean like pressing <enter> manually.
2. I do not know the hex codes.
3. My include patterns start and end with angle brackets <>, and must have a slash (/) inside.

This is similar to some of the standard includes, like <sys/time.h> for example, so you have to recognize those.
So I suggest that if you see 'sys/', ignore it, while in all other cases in which you see <.../....> - change it to ""

(If you know other USEFUL include dirs besides sys/, let me know.)

garboao: it would be better if you edited your text so I can just run it...

Gad

Gad
gadh98

ASKER

And one more comment: I use the bash shell.
In your question you write:
  3. do a new "hard line breaks" in each line
In your last comment (assuming 2. equals 3.):
  2. i do not know the hex codes.

It's really very hard to imagine what you want to have if you
   a. cannot tell us what is
   b. cannot tell us what should be
Also: how do you identify the position where a <hard break> should occur, and how should a program identify it? If there is no special character there at the moment (like 0x12 or 0x15), it cannot be done.

Could you please give exact answers to questions 1. and 3. (numbering according to your question) in my very first comment?

About the includes, you wrote (in your previous comment):
   while in all other cases in which you see <.../....> - change it to ""

Does this mean that all includes of the form:
   #include <file.h>
are to be treated as system includes, meaning you don't have any includes in "." itself, just in subdirectories?
BTW, there are dozens of subdirectories in /usr/include other than sys, for example net and nfs.
   
gadh98

ASKER

About the "hard line break":
Sorry for the misunderstanding.
I'll start from the beginning. My problem is that I edit my files in Windows, so they are kept in PC format, while I compile and run them on Unix/Linux. The main problem is that they compile wrongly, because some include lines are not recognized by the compiler, and my executables cannot be executed either.
My solutions so far were dos2unix - which is not working, or at least does something that doesn't solve this problem - or opening the files in an advanced editor in Windows, like "TextPad", and saving them in Unix format.
The last solution is almost enough, but even after it I still get some compilation errors, so I have to insert a line break just before the "disappearing" line in the code in order for the compiler to recognize it.

What I'm looking for now is a batch utility/script that will do this automatically for me.
I still do not know exactly how to do this, which is the reason I'm telling you this whole story.
If you still do not know what to do, let's say that I will be satisfied if you convert the files to Unix format and also add a simple line break between each of the include lines. (I'm not sure this is the appropriate solution, but as I said before, I do not know the exact cause of this problem.)


About the include pattern:
<file.h> is to be treated as a system include.

About the BTW:
I saw 25 subdirectories in my Solaris /usr/include.
If you can add all of them to the pattern rules, I'll be glad. Please also make the pattern modular, so that one variable holds all the subdirectory names and I can change it to my needs.

If you can do also the same var. to MY dirs - it'll be perfect. For example:
<dds/file.h>
<poem/file.h>
these are my dirs
while:
<sys/file.h>
is a system dir

And for this I'll give more points, of course.

To all the guys working on a solution:
Please write your script with comments, so everyone can understand it, and make it ONE script that does all of this, or one script for each operation.
I will not accept just general instructions with 1-2 lines that I have to alter to my needs.

I need a whole solution...
gadh98

ASKER

One more thing: I mean one variable for my dirs and one variable for system dirs, of course.
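
The "one variable for my dirs" idea could be sketched roughly like this. Everything named here (MY_DIRS, fix_includes) is illustrative, the directory list is just the two examples from this thread, and a sed that accepts -E is assumed. Because only directories listed in MY_DIRS are rewritten, a separate variable for the system directories turns out not to be needed in this variant.

```shell
# Hypothetical variable: your own include directories, separated by "|".
MY_DIRS='dds|poem'

# Reads source on stdin, writes the result to stdout.
fix_includes() {
    # Rewrite  #include <dir/...>  to  #include "dir/..."  only when dir is
    # listed in MY_DIRS; <sys/time.h>, <string.h> etc. pass through untouched.
    sed -E "s,^([[:space:]]*#[[:space:]]*include[[:space:]]*)<(($MY_DIRS)/[^<>]+)>,\\1\"\\2\","
}
```

Adding another directory of your own then only means extending the variable, for example MY_DIRS='dds|poem|util'.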
gadh98

ASKER

I do not know why, but I cannot raise the points to 400...
gadh98

ASKER

Sorry about the dos2unix. Now I know how to work with it:
dos2unix <source> <destination>

You have to enter a destination, otherwise it will not work! (There is no default destination like I thought.)

(I'm on Solaris 7)
OK, now I'm starting to understand it too ;-)

> .. dos2unix - which is not working
This is exactly what you need, usually.
It does all the conversion of line feeds and carriage returns.

If it does not, could you please tell me which OS, which version of the OS, and which dos2unix you have?
I only know of a problem with the DOS EOF character, ^Z, which is sometimes not converted, but this should not cause your problems.
Also, if the compiler fails, could you please post the message? I cannot believe that this is caused by the DOS-versus-Unix format problem; a compiler usually reads a byte stream.

If you have perl (either on M$ or Unix), you may convert like this:
    perl -pe 'BEGIN{ $/="\015\012"; $\="\012"} chomp' dosfile >unixfile


> .. like "TextPad",

I highly recommend using such an editor; you'll avoid a lot of other problems. TextPad works for me.

> .. insert a line break just before the "disappearing" line in the code
OK, that's the disadvantage of a byte stream ;-)


About the includes and the (more or less) complete script, I'll work on it. Do you have perl on Solaris? (Solaris usually comes with perl 4.0.036 at least.)


Just a question about the current format, as the files are before anything is done to them on Unix: could you please do a:

   od -c file|head

You'll see a few lines of your file printed as characters, with the unprintable characters shown like \n or \r, etc.
I just need to know what the current line break is; probably it is:  \r\n
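
Besides looking at the od output by eye, the same question can be answered with a count. A small sketch (the function name count_crlf_lines is invented here) that reports how many lines of a file end in a carriage return:

```shell
# Hypothetical helper: print the number of lines ending in CR (DOS style).
count_crlf_lines() {
    # Build a literal carriage return portably.
    cr=$(printf '\r')
    # grep -c prints the count of matching lines; "|| true" keeps the exit
    # status at zero when the count is 0 (grep itself returns 1 then).
    grep -c "${cr}\$" "$1" || true
}
```

A result of 0 suggests the file already has plain Unix line endings; anything else means some lines still carry \r\n.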
gadh98

ASKER

OK, dos2unix is working, and I can also do a batch conversion using a for...do loop.

Yes, I have perl, but the initial script you gave still does not work. I suppose you have to fill the <> with a real example. Please do that, and I'll change the value to mine.


gadh98, do you mean me by "the initial script you gave do not work."?
I didn't give a script in this thread.
I posted the script. From the error output it looks like you included the "-->cat cvt" line in the script. The script should start with the line "#!/usr/bin/perl -w -i-orig". Depending on where you have perl installed, you might need to edit that line. Enter "type perl" to see where perl is installed on your system.
moonbeam, as you gave the basic perl script, I think it's up to you to add the loop around the files (probably command args) and a hash for the includes that must not be changed, and then collect the points ;-)
gadh98

ASKER

OK, I'll try moonbeam's comment.

BTW, how can I raise the points above 300?
The loop around the files is already implicit in the existing script. I am however concerned about the comment:

>If you can do also the same var. to MY dirs - it'll be
>perfect. for example:
>               <dds/file.h>
>               <poem/file.h>
>               these are my dirs
>               while:
>               <sys/file.h>
>               is a system dir

The current version will not change any include that is in the form <anydir/include.h>. It would not be hard to fix this, but I don't have time right now, and will post a new version later tonight. The approach I will take will be to key on any subdirectory name that is relative to the current directory. If, however, gadh98 has installed any "MY" dirs in /usr/include, there isn't any "general" solution.
oops...

I said: "current version will not change"

I should have said, "current version will only change"

Sorry
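
The approach mentioned above, keying on subdirectory names relative to the current directory, might look like this as a shell sketch. It is only an illustration of that idea, not the script posted next: fix_local_includes is a made-up name, a sed with -E is assumed, and directory names containing regex metacharacters (such as "c++") are not handled.

```shell
# Hypothetical helper: quote includes whose first path component is a
# subdirectory of the current directory. Reads stdin, writes stdout.
fix_local_includes() {
    # Collect the subdirectories of the current directory as "dir1|dir2|...".
    dirs=$(for d in */; do
               [ -d "$d" ] && printf '%s\n' "${d%/}"
           done | paste -sd '|' -)
    # No subdirectories: nothing can be local, so copy the input unchanged.
    if [ -z "$dirs" ]; then cat; return 0; fi
    sed -E "s,^([[:space:]]*#[[:space:]]*include[[:space:]]*)<(($dirs)/[^<>]+)>,\\1\"\\2\","
}
```

It has to be run from the directory that contains your own include subdirectories, for example: fix_local_includes < sysname.cpp > sysname.cpp.new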
Sorry for the delay, it's been a busy week. I think I have a complete solution for you. Here is the script.

#!/usr/bin/perl -w -i-orig
#
# program to convert dos formatted cpp files and
# perform include file syntax replacements.
#
###

#
# generate a set of "system" includes
#
$dirs = join( " ", map($_, </usr/include/*\/>));
$dirs =~  s/\+//g;
$dirs =~ s/\/usr\/include\///g;
@dirs = split /\s+/,$dirs;

#
# loop over every file and perform modifications
#
while (<>) {                            # loop over each line and each file
    $_ =~ y;\r;;d;                      # translate dos line end characters
    foreach $dir (@dirs) {              # loop over each system include
        if ( /$dir/) {                  # check the line for a system include
            print;                      # preserve the line
            goto LINE;                  # get the next line
        }
    }
    $_ =~ s/<(.*\/.*)>/"$1"/;           # modify local includes
    print;
    LINE:
}

In my test case, the original includes looked like this:

#include <iostream.h>
#include <sys/it.h>

#include <my/mitrace.h>
#include <string.h>

After running the utility, they look like this:


#include <iostream.h>
#include <sys/it.h>

#include "my/mitrace.h"
#include <string.h>

To verify that only the necessary changes were made (my test case was a complete C++ program, not just a fragment) I checked sysname.cpp against sysname.cpp-orig:
-->diff sysname.cpp-orig sysname.cpp
5c5
< #include <my/mitrace.h>
---
> #include "my/mitrace.h"

Looks good!
perl -e '( $ ,, $ ")=("a".."z")[0,-1]; print "sh", $ ","m\n";;";;"'

William

For those type "A" people out there, it is possible to shorten this script by one line.

#!/usr/bin/perl -w -i-orig
#
# program to convert dos formatted cpp files and
# perform include file syntax replacements.
#
###

#
# generate a set of "system" includes
#
$dirs = join( " ", map($_, </usr/include/*\/>));
$dirs =~  s/\+//g;
$dirs =~ s/\/usr\/include\///g;
@dirs = split /\s+/,$dirs;

#
# loop over every file and perform modifications
#
while (<>) {                            # loop over each line and each file
    $_ =~ y;\r;;d;                      # translate dos line end characters
    foreach $dir (@dirs) {              # loop over each system include
        if ( /$dir/) {                  # check the line for a system include
            goto LINE;                  # get the next line
        }
    }
    $_ =~ s/<(.*\/.*)>/"$1"/;           # modify local includes
    LINE:
    print;
}
ASKER CERTIFIED SOLUTION
kiffney

(This solution is only available to members of Experts Exchange.)

BTW, start with the *right version* of TextPad, or you will have even more problems than these (as usual on ...)
Or better yet, vim (vi improved) is available on Windows and Unix.

vi is my shepherd; i shall not font.
.. was there ever any better suggestion than vi[m]?
It even handles the ^M characters properly.
Not bad for something already 30 years old :-]]
gadh98

ASKER

But I cannot regularly use another text editor, because I'm porting code from Windows to Unix, and I compile at the same time for Windows using VC++ and on Unix with Solaris Forte 6.
gadh98

ASKER

To moonbeam: what are type "A" people, and what is the difference between the two scripts you gave?
(Only the first works for me.)
gadh98

ASKER

To moonbeam: I wanted to give you the points, but by accident I gave them to the wrong man.

What can I do to fix it?