Solved

Bourne Shell Script (sed) - Messes up File formatting

Posted on 2005-04-20
Last Modified: 2013-12-26
I have a script that takes text like:

about His Excellency, and the best means of pleasing him, and so
on. I had the patience to sit like a fool beside these people for four hours at
a stretch, listening to them without knowing what to say to them or
venturing to say a word.  I became stupefied, several times I felt myself
perspiring, I was overcome by a sort of paralysis; but this was pleasant and
good for me.  On returning home I deferred for a time my desire to
embrace all mankind.

And converts it into this formatting (justified), using sed in a shell script:

Excellency, and the best means of pleasing him, and so on. I had the patience
to    sit   like   a   fool  beside  these   for  four  hours  at  a  stretch,  listening
to   them  without  knowing  what to say to them or venturing to say a word. I
become stupefied, several times I felt myself perspiring,  I was overcome by a

.. and so on. ..

echo Converting formatted text to Postscript and PDF, please wait...
# Convert formatted standard output to Postscript, then PDF.
# The here-document below puts the Adobe Postscript boilerplate at the top of stdout.
(cat <<!
%!PS-Adobe-3.0
30 dict
/PS { /level0 save def         } bind def
/PE {  level0 restore showpage } bind def
/C  { /Courier        findfont 10.0 scalefont setfont } def
begin
C
!
# nnr counts the lines written so far.
nnr=0
# Escape parentheses, then wrap each line as: (become stupefied, several times I) 20 + moveto show
# The justified formatting is still intact at this point.
cat textfileout.txt | sed 's/[()]/\\&/g' | sed 's/^.*$/(&) 20 + moveto show/' |
while read N
do
# Increment nnr and take its remainder modulo 66.
nnr=`expr $nnr + 1`
rem=`expr $nnr % 66`
# Remainder 1 (e.g. when nnr = 1) starts a page: put a PS into the document.
if [ $rem -eq 1 ]
then echo PS
fi
fp=`echo "$N" | cut -d"+" -f1`
sp=`echo "$N" | cut -d"+" -f2`
# Replace the + with a y coordinate: (become stupefied, several times I) 20 254 moveto show
# The coordinate starts at 760, drops by 11 per line, and restarts at 760 every 66 lines.
# In this loop the justified formatting reverts to unformatted. This is what I cannot figure out.
echo $fp `expr 760 - 11 \* \( \( $nnr - 1 \) % 66 \)` $sp
if [ "$rem" -eq 0 ]
then echo PE
fi
done
echo PE
# Put a PE and end at the end of the document, and send stdout to a file.
echo end) > project4.ps
/usr/local/bin/ps2pdf project4.ps > project4.pdf
echo DONE.

What I cannot figure out is why the justified formatting is lost. Is it something about the echo line that removes it? I expect output like this:

(Excellency, and the best means of pleasing him, and so on. I had the patience) 20 50 moveto show
(to    sit   like   a   fool  beside  these   for  four  hours  at  a  stretch,  listening) 20 760 moveto show
(to   them  without  knowing  what to say to them or venturing to say a word. I) 20 749 moveto show
(become stupefied, several times I felt myself perspiring,  I was overcome by a) 20 738 moveto show

and so on. Instead, the while loop converts it to something like:

(about His Excellency, and the best means of pleasing him, and so) 20 760 moveto show
(on. I had the patience to sit like a fool beside these people for four hours at) 20 749 moveto show
(a stretch, listening to them without knowing what to say to them or) 20 738 moveto show

Any ideas? This is messing up the resulting pdf file quite a bit.
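
For reference, the y coordinate computed by the expr call in the loop above can be checked in isolation. This is only a sketch: the helper name ypos is made up for illustration, and the line numbers are arbitrary samples. It assumes a POSIX shell with $( ) command substitution.

```shell
# y coordinate for line number $1, using the same expr call as the loop.
ypos() {
    expr 760 - 11 \* \( \( $1 - 1 \) % 66 \)
}

# First line, second line, last line of page 1, first line of page 2.
for n in 1 2 66 67
do
    echo "line $n -> y = $(ypos $n)"
done
```

Line 66 lands at y = 45 and line 67 starts again at 760, so the coordinate never goes below the bottom of the page.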
Question by:vupadhya
 
LVL 48

Expert Comment

by:Tintin
ID: 13830917
When spaces are important, you should always quote the variable.

In the line:

echo $fp `expr 760 - 11 \* \( \( $nnr - 1 \) % 66 \)` $sp

you should change it to:

echo "$fp" `expr 760 - 11 \* \( \( $nnr - 1 \) % 66 \)` "$sp"
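
The effect of the quotes can be seen on a single justified line. The sketch below assumes a POSIX shell; the sample text is taken from the question and the variable names are only illustrative. Without quotes the shell splits the value into words and echo joins them again with single spaces.

```shell
# A justified line with runs of spaces, like the ones in the question.
line='to    sit   like   a   fool'

# Unquoted: the shell splits $line into words, and echo rejoins them
# with single spaces, so the justification is lost.
unquoted=$(echo $line)

# Quoted: the value reaches echo as one argument, spaces intact.
quoted=$(echo "$line")

printf '%s\n' "$unquoted"
printf '%s\n' "$quoted"
```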
 

Author Comment

by:vupadhya
ID: 13839228
That worked - cool!

One more minor thing:

In this part:
cat textfileout.txt | sed 's/[()]/\\&/g' | sed 's/^.*$/(&) 20 + moveto show/' |

doesn't handle existing ( ) in the text. This confuses the PDF conversion.

Something like:

(my  liver is diseased. However, I know nothing  at all about my disease, and do ) 20  507  moveto show
(not  know for certain what ails me. I  don't consult a doctor for it, and never ) 20  496  moveto show
(have, though I have a respect for medicine and doctors. Besides, I am extremely ) 20  485  moveto show
(superstitious, sufficiently so to respect medicine, anyway (I am  well-educated ) 20  474  moveto show
(enough  not  to  be  superstitious,   but  I am superstitious). No, I refuse to ) 20  463  moveto show
(consult  a  doctor  from spite. That  you probably will not understand. Well, I ) 20  452  moveto show

throws the PDF off. Those round brackets need to be in there (as part of the text), but they are special characters in PS (they mark where each line's string starts and ends, as you can see).

The PDF shows something like:
superstitious, sufficiently so to respect medicine, anyway (I am  well-educated ) 20  474  moveto show
consult  a  doctor  from spite. That  you probably will not understand. Well, I  

When it should display something like:
superstitious, sufficiently so to respect medicine, anyway (I am  well-educated
enough  not  to  be  superstitious,   but  I am superstitious). No, I refuse to
consult  a  doctor  from spite. That  you probably will not understand. Well, I

I thought I had preserved the ( by putting a forward slash in front of them, in the seds:

cat textfileout.txt | sed 's/[()]/\\&/g' | sed 's/^.*$/(&) 20 + moveto show/' |  

And it does replace round brackets in the file with /(. By the time the Postscript is created (at the
end of the while loop), the round brackets are back to plain round brackets. For some reason, the PDF
still doesn't take them as regular characters (the filter I use is ps2pdf) and truncates lines as above.

I tried using a .ps file with a line such as /(something here goes /) ) 20 moveto 360.
It still messed it up and truncated the line.

Any ideas?
 

Author Comment

by:vupadhya
ID: 13839255
It might have to do with my ps2pdf filter. I'm trying ps2pdf13 instead, only because I view the PDF file with Acrobat 6.0.

 

Author Comment

by:vupadhya
ID: 13839264
Ah, it still messes up.
 
LVL 48

Expert Comment

by:Tintin
ID: 13839382
Change your sed command to

sed -e 's/^(//' -e 's/) \([0-9].*\)/\1/'
 

Author Comment

by:vupadhya
ID: 13839591
I implemented it like so:

cat Sample.txt | sed -e 's/^(//' -e 's/) \([0-9].*\)/\1/' | sed 's/^.*$/(&) 20 + moveto show/' |

while read N
.
.
.
There is a space in s/)[space]\([0-9], correct?


Hmm, when I run it, now I get:

tone  again.  I  suspected that he  had an aversion for me, but still I went on
going to see him, not being quite certain of it.

Converting formatted text to Postscript and PDF, please wait...
AFPL Ghostscript 8.13: Unrecoverable error, exit code 1
DONE.


AFPL Ghostscript 8.13: Unrecoverable error, exit code 1.. ?

I can't open the PDF file for some reason (I get the Page not found error in IE). Hmm, maybe a server problem (or the file is corrupted).

When I run ps2pdf externally from my script, I get more details:
$ ps2pdf project4.ps project4.pdf
Error: /undefined in ,
Operand stack:
   (and solely for that reason)
Execution stack:
   %interp_exit   .runexec2   --nostringval--   --nostringval--   --nostringval--   2   %stopped_push   --nostringval--   --nostringval--   --nostringval--   false   1   %stopped_push   1   3   %oparray_pop   1   3   %oparray_pop   1   3   %oparray_pop   1   3   %oparray_pop   .runexec2   --nostringval--   --nostringval--   --nostringval--   2   %stopped_push   --nostringval--
Dictionary stack:
   --dict:1110/1686(ro)(G)--   --dict:0/20(G)--   --dict:72/200(L)--   --dict:1/30(L)--
Current allocation mode is local
Current file position is 8755
AFPL Ghostscript 8.13: Unrecoverable error, exit code 1

That part of the text is:
(irritated)  you  think  fit  to   ask  me who I am -- then my answer is, I am a ) 20  452  moveto show
(collegiate  assessor.  I was in the service  that I might have something to eat ) 20  441  moveto show
(and solely for that reason), and when last year a distant relation left me six ) 20  430  moveto show
(thousand roubles in his will I immediately retired from the service and settled ) 20  419  moveto show
(down in my corner. I used to live in this corner before, but now I have settled ) 20  408  moveto show
(down in it. My room is a wretched, horrid  one in the outskirts of the town. My ) 20  397  moveto show

The line needs to be

((and solely for that reason), and when last year a distant relation left me six ) 20  430  moveto show

I think that is what is messing it up.

The original, unformatted text has it like:

I feel that you are irritated) you think fit to ask me who I am -- then my
answer is, I am a collegiate assessor.  I was in the service that I might have
something to eat (and solely for that reason), and when last year a distant
relation left me six thousand roubles in his will I immediately retired
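
The Ghostscript error above looks consistent with what the suggested sed expressions do to a text line that happens to begin with a parenthesis. The sketch below runs the same two expressions on a shortened copy of that line; the variable names are only illustrative, and a POSIX shell is assumed.

```shell
sample='(and solely for that reason), and when last year'

# First expression: remove a parenthesis at the start of the line.
# Second expression: remove ") " before a digit (no match in this line).
result=$(printf '%s\n' "$sample" | sed -e 's/^(//' -e 's/) \([0-9].*\)/\1/')

printf '%s\n' "$result"
```

The opening parenthesis that belongs to the text is removed. Once the second sed wraps the line in ( ... ) again, the Postscript string ends at "reason)" and the comma that follows is read as an undefined name, which matches "Error: /undefined in ," with (and solely for that reason) on the operand stack.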





 
LVL 48

Expert Comment

by:Tintin
ID: 13839839
Does this help?

$ cat a
I feel that you are irritated) you think fit to ask me who I am -- then my
answer is, I am a collegiate assessor.  I was in the service that I might have
(something to eat and solely for that reason), and when last year a distant
relation left me six thousand roubles in his will I immediately retired

$ cat a | sed -e 's/^/(/' -e 's/$/) 20 + moveto show/'
(I feel that you are irritated) you think fit to ask me who I am -- then my) 20 + moveto show
(answer is, I am a collegiate assessor.  I was in the service that I might have) 20 + moveto show
((something to eat and solely for that reason), and when last year a distant) 20 + moveto show
(relation left me six thousand roubles in his will I immediately retired) 20 + moveto show
 

Author Comment

by:vupadhya
ID: 13840141
Still gives

Converting formatted text to Postscript and PDF, please wait...
AFPL Ghostscript 8.13: Unrecoverable error, exit code 1
DONE.

using:
cat Project4OUT.txt | sed -e 's/^/(/' -e 's/$/) 20 + movetoshow/' |
while read N

When converting directly, it's blowing up on the first line:
$ ps2pdf project4.ps project41.pdf
Error: /undefined in movetoshow
Operand stack:
   (Notes from the Underground )   20   760
Execution stack:
   %interp_exit   .runexec2   --nostringval--   --nostringval--   --nostringval--   2   %stopped_push   --nostringval--   --nostringval--   --nostringval--   false   1   %stopped_push   1   3   %oparray_pop   1   3   %oparray_pop   1   3   %oparray_pop   1   3   %oparray_pop   .runexec2   --nostringval--   --nostringval--   --nostringval--   2   %stopped_push   --nostringval--
Dictionary stack:
   --dict:1110/1686(ro)(G)--   --dict:0/20(G)--   --dict:72/200(L)--   --dict:1/30(L)--
Current allocation mode is local
Current file position is 228
AFPL Ghostscript 8.13: Unrecoverable error, exit code 1

I'll have to look at it tomorrow, in detail. Any ideas in the meantime?
 
LVL 48

Expert Comment

by:Tintin
ID: 13840194
What's the output you get from the sed command?
 
LVL 22

Accepted Solution

by:
NovaDenizen earned 1500 total points
ID: 13845498
Postscript requires unbalanced parentheses within strings to be escaped with backslashes, not forward slashes. Escaping every parenthesis is the simplest way to guarantee that, like so:
(\(something to eat and solely for that reason\), and when last year a distant) 20 + moveto show

It looks like you properly escaped your parentheses using sed 's/[()]/\\&/g'.  The problem is that echo performs escape translation on whatever string it is given, which makes the backslashes disappear.

I know that the builtin echo in bash has an -E option which turns off escape translation (-e is the option that turns it on).  I don't know specifically about other implementations of sh or echo.  If you're using bash you can use echo -E.

Another way to get around echo escape translation is to use /bin/printf.
echo "$fp" `expr 760 - 11 \* \( \( $nnr - 1 \) % 66 \)` "$sp"
could be replaced by:
/bin/printf "%s %d %s\n" "$fp" `expr 760 - 11 \* \( \( $nnr - 1 \) % 66 \)` "$sp"
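
As a minimal sketch of the two steps described above (escape with sed, then print with printf), assuming a POSIX shell: the variable names and the fixed coordinates 20 and 760 are only for illustration. The %s conversion copies its argument as is, so backslashes inside it are not translated.

```shell
text='(something to eat and solely for that reason), and when'

# Put a backslash in front of every parenthesis in the text.
escaped=$(printf '%s\n' "$text" | sed 's/[()]/\\&/g')

# Wrap the escaped text as a Postscript string. %s copies the
# argument unchanged, so the backslashes survive.
ps_line=$(printf '(%s) %d %d moveto show\n' "$escaped" 20 760)

printf '%s\n' "$ps_line"
```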
 

Author Comment

by:vupadhya
ID: 13847252
If I use the printf, I get

printf:  moveto show expected numeric value
expr: syntax error

I guess it can't detect command substitution when it hits that line? I'm using:

sp=`echo "$N" | cut -d"+" -f2`

#echo "$fp" `expr 760 - 11 \* \( \( $nnr - 1 \) % 66 \)` "$sp"
/bin/printf "%s %d %s\n" "$fp" `expr 760 - 11 \* \( \( $nnr - 1 \) %66\)` "$sp"

if [ "$rem" -eq 0 ]

The -e does not work on sh.
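
The "expr: syntax error" above is consistent with how expr reads its arguments: every operand, operator and parenthesis has to be a separate word. In the printf line, %66\) reaches expr as the single word %66), while the echo version has % 66 \). With expr failing and printing nothing, "$sp" would end up as the argument for %d, which fits the "expected numeric value" message. A small sketch, assuming a POSIX shell and an arbitrary value for nnr:

```shell
nnr=5

# Each token is its own argument: expr can evaluate this.
good=$(expr 760 - 11 \* \( \( $nnr - 1 \) % 66 \))

# '%66)' is one word, so expr finds no operator and no closing parenthesis.
if bad=$(expr 760 - 11 \* \( \( $nnr - 1 \) %66\) 2>/dev/null)
then status=ok
else status=failed
fi

echo "good=$good status=$status"
```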
 

Author Comment

by:vupadhya
ID: 13847313
I also tried it like so:

fp=`echo "$N" | cut -d"+" -f1`
sp=`echo "$N" | cut -d"+" -f2`
#echo "$fp" `expr 760 - 11 \* \( \( $nnr - 1 \) % 66 \)` "$sp"
mp=`expr 760 - 11 \* \( \( $nnr - 1 \) % 66 \)`
/bin/printf "%s %d %s\n" "$fp" "$mp"  "$sp"

And the PDF file had the same problem: text with parentheses (sadjhdsa jadskd) corrupted the Acrobat file.

The two seds are:
cat Project4OUT.txt | sed 's/[()]/\\&/g' |
sed 's/^.*$/(&) 20 + moveto show/' |     <-- So until here, the backslashes are retained in the doc. (?)

You are correct about the backslashes. When I edited the .ps file directly to add them and converted it to PDF, it worked.

Now, if we could only prevent escape translation, or use something other than echo (or apparently printf) to retain the backslashes.
 

Author Comment

by:vupadhya
ID: 13847342
I tried using /usr/ucb/echo instead of echo, based on this from the man page for 'echo':
 The shells csh(1), ksh(1), and  sh(1),  each  have  an  echo
     built-in  command,  which, by default, will have precedence,
     and will be invoked if the user calls echo  without  a  full
     pathname.  /usr/ucb/echo and csh's echo() have an -n option,
     but do not understand back-slashed escape  characters.  sh's
     echo(),  ksh's echo(), and /usr/bin/echo, on the other

nnr=0
cat Project4OUT.txt | sed 's/[()]/\\&/g' |
sed 's/^.*$/(&) 20 + moveto show/' |

while read N
do
nnr=`expr $nnr + 1`
rem=`expr $nnr % 66`
if [ $rem -eq 1 ]
then echo PS
fi
fp=`echo "$N" | cut -d"+" -f1`
sp=`echo "$N" | cut -d"+" -f2`
/usr/ucb/echo "$fp" `expr 760 - 11 \* \( \( $nnr - 1 \) % 66 \)` "$sp"
if [ "$rem" -eq 0 ]
then echo PE

But it still got rid of the backslashes, and resulted in the corrupted pdf.

Where is this happening?
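
One place worth checking is the read N in the loop rather than echo. In POSIX shells, read without -r treats a backslash as an escape character and removes it, which would explain why swapping echo for /usr/ucb/echo or printf made no difference. A minimal sketch, assuming a shell whose read supports -r (older Bourne shells may not); the sample text and variable names are only illustrative.

```shell
line='anyway \(I am  well-educated'

# Plain read: the backslash escapes the next character and is dropped.
plain=$(printf '%s\n' "$line" | { read N; printf '%s\n' "$N"; })

# read -r: backslashes are kept as ordinary characters.
raw=$(printf '%s\n' "$line" | { read -r N; printf '%s\n' "$N"; })

printf 'read:    %s\n' "$plain"
printf 'read -r: %s\n' "$raw"
```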
 

Author Comment

by:vupadhya
ID: 13847627
$ echo "Bob, come here (and test this) now." | sed 's/[()]/\\&/g'
Bob, come here \(and test this\) now.

$ echo "Bob, come here (and test this) now." | sed 's/[()]/"\(""\)"&/g'
Bob, come here "("")"(and test this"("")") now.

$ echo "\("
\(

If I put "\(" in the document instead of \(, then when echo goes through it, it will decode it to \( and retain it in the Postscript.

Except now I can't figure out how to use sed to put "\(" and not \(.
 

Author Comment

by:vupadhya
ID: 13847664
Wooohoo.. i got it.

sed 's/[()]/\\&/g'

needed to change to:

sed 's/[()]/\\\\&/g'

That would change "test ( parenthesis ) test" to:

"test \\( parenthesis \\) test"

which, by the time the loop echoes it, turns into:

\( parenthesis \) .. in the postscript

and turns into

( parenthesis ) in the pdf.


Thanks guys for leading me to this!
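
The doubling can be traced step by step outside the script. The sketch below assumes a POSIX shell and assumes that the plain read in the loop is the step that consumes one backslash of each pair; the variable names are only illustrative.

```shell
text='test ( parenthesis ) test'

# Doubled backslashes: each parenthesis gets two backslashes in front.
doubled=$(printf '%s\n' "$text" | sed 's/[()]/\\\\&/g')

# A plain read (no -r) consumes one backslash of each pair.
after_read=$(printf '%s\n' "$doubled" | { read N; printf '%s\n' "$N"; })

printf '%s\n' "$doubled"
printf '%s\n' "$after_read"
```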
Question has a verified solution.
