Parsing printer ASCII data from serial on Python: need to separate prints

ltpitt
Hello everybody...

I am building a small virtual printer with Python.

I read all the data coming from a serial port in ASCII format, and I can save the incoming prints to a file.

I need a way to correctly separate one print from the next...

Any ideas?

Here's the code so far:

#!/usr/bin/python
# get lines of text from serial port, save them to a file

from __future__ import print_function
import io
import serial

addr = '/dev/ttyUSB0'   # serial port to read data from
baud = 115200           # baud rate for serial port
fname = 'serial.dat'    # log file to save data in
fmode = 'a'             # log file mode = append

with serial.Serial(addr, baud) as pt, open(fname, fmode) as outf:
    spb = io.TextIOWrapper(io.BufferedRWPair(pt, pt, 1),
                           encoding='ascii', errors='ignore',
                           newline='\r', line_buffering=True)
    spb.readline()  # throw away first line; likely to start mid-sentence (incomplete)
    while True:
        x = spb.readline()   # read one line of text from serial port
        print(x, end='')     # echo line of text on-screen
        outf.write(x)        # write line of text to file
        outf.flush()         # make sure it actually gets written out

You should probably detect control characters related to the Start of TeXt (STX) and End of TeXt (ETX) -- see https://en.wikipedia.org/wiki/Control_character , for example.

Author

Commented:
Very interesting read, and it sheds some light, pepr!

But is this row-related or whole-transmission-related?

I've read more following your suggestion, and here is a list of the characters I should look for:

https://en.wikipedia.org/wiki/ASCII#ASCII_control_characters

The problem is that I don't see an escape code for (for example) Start of Text: how can I detect it in my data?
The protocol (with control characters) was probably designed in the teletype era, for very mechanical devices, where transmission was defined character by character or, better said, in binary mode. Line feed, carriage return, and similar characters were interpreted directly by the machine. Only later was it extended for use with Unix-like (software) terminals. I believe it should always be interpreted in the spirit of Unix: as a stream of bytes.

I have no direct experience, but I believe STX and ETX are used to delimit a block of text, which can be a whole message or part of a message. EOT (End Of Transmission) may be a good candidate for separating the messages.
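
To make the idea concrete, here is a minimal sketch of splitting a byte stream into separate prints. It assumes, without any confirmation from the actual device, that each print ends with EOT (0x04) and may be wrapped in STX (0x02) / ETX (0x03). The class name PrintSplitter and its feed() method are made up for illustration. The control characters have no printable escape code; they are simply single bytes with those values, so they can be searched for directly.

```python
STX = b"\x02"  # Start of Text
ETX = b"\x03"  # End of Text
EOT = b"\x04"  # End of Transmission


class PrintSplitter:
    """Accumulate raw bytes and hand back complete prints.

    Assumption: the sender terminates every print with `delimiter`.
    """

    def __init__(self, delimiter=EOT):
        self.delimiter = delimiter
        self.buffer = bytearray()

    def feed(self, chunk):
        """Add a chunk of bytes; return a list of prints completed so far."""
        self.buffer.extend(chunk)
        jobs = []
        while True:
            end = self.buffer.find(self.delimiter)
            if end < 0:
                break  # no complete print in the buffer yet
            job = bytes(self.buffer[:end])
            # Drop the consumed print and its delimiter from the buffer.
            del self.buffer[:end + len(self.delimiter)]
            # Remove STX/ETX framing bytes, if the device sends them.
            job = job.replace(STX, b"").replace(ETX, b"")
            if job:
                jobs.append(job)
        return jobs
```

With pyserial this could be driven by something like `splitter.feed(port.read(256))` in a loop, writing each returned print to its own file. If the device turns out to use a different separator, only the `delimiter` argument needs to change.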

If you do not know exactly what is coming from the serial port, I suggest using binary mode, capturing everything to a file, and sending some test messages to see what the protocol looks like. Or you may find some documentation for the device that sends the data through the serial port.
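
A rough sketch of that capture-and-inspect step follows. The helper names capture_raw and control_summary are invented for this example, and the table of control characters is only a guess at what might be worth looking for (form feed, 0x0C, is included because many printers use it as a page break, which may or may not apply here).

```python
import io

# A few ASCII control characters that might act as separators.
CONTROL_NAMES = {
    0x02: "STX", 0x03: "ETX", 0x04: "EOT",
    0x0A: "LF", 0x0C: "FF", 0x0D: "CR", 0x1B: "ESC",
}


def capture_raw(source, sink, chunk_size=256):
    """Copy bytes unchanged from source to sink until read() returns b''.

    Returns the number of bytes copied.
    """
    total = 0
    while True:
        chunk = source.read(chunk_size)
        if not chunk:
            break
        sink.write(chunk)
        total += len(chunk)
    return total


def control_summary(data):
    """Return (offset, name) for every known control character in data."""
    return [(i, CONTROL_NAMES[b]) for i, b in enumerate(data)
            if b in CONTROL_NAMES]


# Hypothetical use with pyserial (port name and timeout are assumptions):
#
#   import serial
#   with serial.Serial('/dev/ttyUSB0', 115200, timeout=5) as port, \
#           open('capture.bin', 'wb') as out:
#       capture_raw(port, out)   # stops after 5 seconds of silence
#
#   with open('capture.bin', 'rb') as f:
#       print(control_summary(f.read()))
```

Sending two or three known test prints and comparing where the control characters fall should show which one, if any, marks the boundary between prints.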

Author

Commented:
Thanks for all this precious information!
You are welcome ;)
