Python Multithreading Details

Hi guys,

I'd like to know more about how Python does multithreading, if anyone knows. Does it rely on its own threading library with its own scheduler? I am running a very simple program that starts two threads that constantly and forever print a message ending in a newline (by default). Sometimes, I get two messages on the same line, but the actual messages themselves are never interleaved (except for that newline). Does anyone know why that would happen?

Thanks in advance!

Teddy
LVL 36
ZylochAsked:
Who is Participating?
 
ilalopoulosConnect With a Mentor Commented:
Python's handling mechanism is quite simple: it switches between threads every n (a small number of bytecodes i think the default is 10) or before intensive I/O operations and uses the GIL (GLobal Interpreter Lock) to synchronize access of the threads (which are real system threads) to the python interpreter.

Now as to why you see the breakage in the newlines only:

Python switches threads between bytecode instructions and not inside a bytecode so a bytecode is a guaranteed atomic operation that will be executed at whole in a thread before another get control.

So using the dis module (Disassembler for Python bytecode) we do the following little test (I have used python 2.7, different major versions can have different results):

>>> import dis
>>> def p():
      print "Hello"
      print "World"

      
>>> dis.dis(p)
  2           0 LOAD_CONST               1 ('Hello')
              3 PRINT_ITEM          
              4 PRINT_NEWLINE      

  3           5 LOAD_CONST               2 ('World')
              8 PRINT_ITEM          
              9 PRINT_NEWLINE      
             10 LOAD_CONST               0 (None)
             13 RETURN_VALUE  

As you can see there are two bytecodes in action the PRINT_ITEM and the PRINT_NEWLINE so a switch CAN happen between them and that is the reason for the behavior you have seen. However the PRINT_ITEM is a bytecode in its own so it executes at whole as an atomic operation.

Take in mind that bytecodes and thus atomic operations are not necessarily constant among python releases since it is stated that:

"CPython implementation detail:: Bytecode is an implementation detail of the CPython interpreter! Noguarantees are made that bytecode will not be added, removed, or changedbetween versions of Python. " (http://docs.python.org/library/dis.htm)

And so you cannot rely on the bytecode findings of one version to be the same like another one.

Finally although you do not ask such a question, if you want to synchronize the printing the simplest way is to use a lock. Before the print a thread should acquire the lock which will release it only after it finishes printing, all the other threads block at that point and wait to acquire the lock. A simple example would be:

import threading

lock = threading.Lock()

class MyThread(threading.Thread):
    def __init__(self, p):
        threading.Thread.__init__(self)
        self.p = p
    def run(self):
        while (1):
            lock.acquire()
            print self.p
            lock.release()

for i in range(0,10):
    myThread = MyThread("abc" + str(i))
    myThread.start()



Cheers
import threading

lock = threading.Lock()

class MyThread(threading.Thread):
    def __init__(self, p):
        threading.Thread.__init__(self)
        self.p = p
    def run(self):
        while (1):
            lock.acquire()
            print self.p
            lock.release()

for i in range(0,10):
    myThread = MyThread("abc" + str(i))
    myThread.start()

Open in new window

py-dis-hello-word.png
0
 
peprCommented:
If they use the same output stream, try to flush the buffer after the print or use an unbuffered output.  However, if you do not have the exclusive access to the stream... I am not sure how the threads are interupted and whether it cannot happen during the output.
0
 
peprCommented:
You may find more details in the documentation for the threading library (http://docs.python.org/library/threading.html#module-threading) or of the lower level thread module (http://docs.python.org/library/thread.html#module-thread).  I do not have first-hand experience with that.
0
Cloud Class® Course: Ruby Fundamentals

This course will introduce you to Ruby, as well as teach you about classes, methods, variables, data structures, loops, enumerable methods, and finishing touches.

 
HonorGodSoftware EngineerCommented:
@ilalopoulos - outstanding update / explanation.
0
 
-Richard-Commented:
Yes, ilalopoulos, very authoritative and I learned a few things myself.  The only thing I would add is that you can find the number of bytecodes executed per Python "timeslice" with the sys.getcheckinterval() method, which will tell you (at least for Python 2.6) it is 100.  Additionally, you can even set the check interval yourself with sys.setcheckinterval(int).  Setting it to a larger number is documented as giving you somewhat better performance although it would reduce the rate at which your threads interact with each other.  
0
 
James MurrellProduct SpecialistCommented:
This question has been classified as abandoned and is being closed as part of the Cleanup Program.  See my comment at the end of the question for more details.
0
Question has a verified solution.

Are you are experiencing a similar issue? Get a personalized answer when you ask a related question.

Have a better answer? Share it in a comment.

All Courses

From novice to tech pro — start learning today.