batch file with sed - escape character, trailing newlines

I'm using ssed (super-sed) version 3.59, based on GNU sed version 3.02.80. sed newbie here, so be prepared for horrors below.

I need to create a batch file to run under Windows XP which will repair broken lines in a text data file. Valid lines in the data file end in one of these three strings:
""" (three double quotes)
... so I need a ssed command which will find any newline NOT preceded by one of the above strings, and remove the newline.

The naive best I've been able to do so far is:
ssed -R "(?<!\"\"\"/TEMPORARY.ID\"/NEXT\*\*\*\"\n/d" oldfile > newfile

Apart from the fact that there's got to be a way of escaping all characters in a group, there seem to be two problems here.
(1) The escape character isn't working ahead of " (it works ahead of *). The command seems to be being interpreted as finishing at the first ".
(2) It doesn't find the newline character. I found a reference to sed removing trailing newlines before doing pattern matching, but I can make no sense of the example given to demonstrate how one line should be joined to the next.

Monky42 commented:
Hello, I am no sed expert, but I have some experience with regular expressions. Here are some hints that might be worth trying:
- Your regular expression is delimited by "..."; try '...' (single quotes) instead. This might be the reason for your trouble with \"
- To match a newline it might be necessary to enable "multiline matching" for the regular expression. Some regexp engines only match single lines by default. Quote: "In Perl, you do this by adding an m after the regex code, like this: m/^regex$/m;."
- When you are working in a Windows environment, the lines might end with \r\n instead of the Unix-style plain \n. Add \r? (an optional carriage return) before the \n in your expression.
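The last hint can be illustrated with a minimal sketch, assuming a POSIX shell with awk available rather than the XP command prompt. The file name crlf_sample.txt and the variable names are made up for the demonstration, and only the """ ending from the question is tested.

```shell
# Build a two-line sample with Windows (CRLF) line endings.
printf 'first"""\r\nsecond"""\r\n' > crlf_sample.txt

# Anchoring directly at end of line misses: the CR sits between """ and $.
strict=$(awk '/"""$/ { n++ } END { print n + 0 }' crlf_sample.txt)

# Allowing an optional carriage return before $ matches both lines.
tolerant=$(awk '/"""\r?$/ { n++ } END { print n + 0 }' crlf_sample.txt)

echo "strict=$strict tolerant=$tolerant"
```

Some Windows builds of these tools open files in text mode and drop the CR before the pattern ever sees it, in which case adding \r? changes nothing.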

Good luck.
anseris (author) commented:
Hi Monky42,

- Using single quotes as delimiters gets the error message "The system cannot find the file specified", even with an expression which works using double quotes.

- \r\n made no difference.

Please stand by, I'm testing the multiline matching. It inspired a bit of research which has turned up some possibilities.

I guess it is possible to do it with (s)sed using its h, H, g, G, x and n commands, but I'd use (g)awk instead, as it is simpler to write and more readable for humans :)
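For comparison, the join can also be sketched in sed with N (append the next input line) and a branch loop, instead of the hold-space commands. This is only a sketch, assuming a POSIX shell and a sed that reads its script with -f; join.sed and broken_sample.txt are hypothetical names, and only the """ ending is handled, so the other valid endings would still need to be added.

```shell
# Keep the sed script in its own file so no shell quoting is involved.
cat > join.sed <<'EOF'
:again
/"""$/!{
  $!{
    N
    s/\n//
    b again
  }
}
EOF

# Sample input: the second record is broken across two physical lines.
printf 'good line"""\nbroken \nline"""\n' > broken_sample.txt

# Lines not ending in """ pull in the next line until they do.
joined=$(sed -f join.sed broken_sample.txt)
printf '%s\n' "$joined"
```

The $! guard stops the loop from calling N on the last line, so a broken final line is printed as it stands rather than lost.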
anseris (author) commented:
I got this far with awk:

>awk "/NEXT\"\"\"$|\"\"\"$/ { print $0 } " inputfile
awk: /NEXT"""$
awk:  ^ unterminated regexp
The network path was not found.

It looks as though I can't escape the " on the left side of |.

Still trying sed, will try the commands you mention.
ahoffmann commented:
> .. create a batch file to run under Windows XP ..
> .. data file end in one of these three strings:
> """ (three double quotes)

that's a challenge for crappy systems ;-)

You need to write your script (awk, sed, whatever) in a file and then use the tool (awk, sed, ...) with a proper option to read its command from that file. Something (awk) like:

{ print "bad line ["NR"]: "$0 }
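A minimal sketch of that script-file approach, assuming a POSIX shell for the demonstration and an awk that accepts -f. The names check.awk and sample.txt are hypothetical, and the pattern in front of the print action is an assumption: it flags lines that do not end in """, the only valid ending spelled out in the question.

```shell
# The awk program lives in a file, so the command line carries no quotes.
cat > check.awk <<'EOF'
# Report every line that does not end in three double quotes.
!/"""$/ { print "bad line [" NR "]: " $0 }
EOF

# Sample input: line 2 lacks the terminator.
printf 'rec one"""\nbroken\nrec two"""\n' > sample.txt

report=$(awk -f check.awk sample.txt)
printf '%s\n' "$report"
```

On Windows the script file would be created in an editor, and the batch file would then only need a line of the form awk -f check.awk oldfile > newfile.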
anseris (author) commented:
Apologies for the length of time taken to close this. Other more urgent tasks have prevented me from continuing to work on this problem. I tried splitting the points earlier, but somehow that feature seemed absent for a while.
Question has a verified solution.
