Solved

Python Multithreading Details

Posted on 2010-09-23
Last Modified: 2012-05-10
Hi guys,

I'd like to know more about how Python does multithreading, if anyone knows. Does it rely on its own threading library with its own scheduler? I am running a very simple program that starts two threads that constantly and forever print a message ending in a newline (by default). Sometimes, I get two messages on the same line, but the actual messages themselves are never interleaved (except for that newline). Does anyone know why that would happen?

Thanks in advance!

Teddy
Question by:Zyloch
6 Comments
 
LVL 29

Expert Comment

by:pepr
ID: 33741628
If they use the same output stream, try flushing the buffer after each print, or use unbuffered output. However, if a thread does not have exclusive access to the stream... I am not sure how the threads are interrupted, or whether an interruption can happen in the middle of the output.
 
LVL 29

Expert Comment

by:pepr
ID: 33741635
You may find more details in the documentation for the threading library (http://docs.python.org/library/threading.html#module-threading) or of the lower level thread module (http://docs.python.org/library/thread.html#module-thread).  I do not have first-hand experience with that.
 
LVL 3

Accepted Solution

by:
ilalopoulos earned 2000 total points
ID: 33743032
Python's thread-handling mechanism is quite simple: it switches between threads every n bytecodes (n is a small number; the default is 100 in Python 2.6 and 2.7) or around blocking I/O operations, and it uses the GIL (Global Interpreter Lock) to synchronize the threads' access to the Python interpreter. The threads themselves are real system threads.

Now as to why you see the breakage in the newlines only:

Python switches threads between bytecode instructions, not inside one, so a single bytecode effectively acts as an atomic operation: it executes in full in one thread before another thread gets control.

So, using the dis module (the disassembler for Python bytecode), we can run the following little test (I used Python 2.7; other versions can give different results):

>>> import dis
>>> def p():
      print "Hello"
      print "World"

      
>>> dis.dis(p)
  2           0 LOAD_CONST               1 ('Hello')
              3 PRINT_ITEM          
              4 PRINT_NEWLINE      

  3           5 LOAD_CONST               2 ('World')
              8 PRINT_ITEM          
              9 PRINT_NEWLINE      
             10 LOAD_CONST               0 (None)
             13 RETURN_VALUE  

As you can see, two bytecodes are in action here, PRINT_ITEM and PRINT_NEWLINE, so a switch CAN happen between them, and that is the reason for the behavior you have seen. PRINT_ITEM, however, is a single bytecode, so it executes as a whole, as one atomic operation.
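For readers on Python 3, where print is a function and the PRINT_ITEM / PRINT_NEWLINE bytecodes no longer exist, the same split between message and newline can still be observed. The following is only a sketch under that assumption; RecordingStream is a hypothetical helper written for this illustration, not part of any library. It records every write() call that print makes on the stream:

```python
class RecordingStream(object):
    """Hypothetical file-like object that records each write() call."""

    def __init__(self):
        self.calls = []

    def write(self, text):
        # Remember exactly what print() handed to the stream in this call.
        self.calls.append(text)
        return len(text)

    def flush(self):
        pass


recorder = RecordingStream()
print("Hello", file=recorder)

# The message and the trailing newline arrive as two separate writes,
# which leaves a gap where another thread can get in between them.
print(recorder.calls)
```

Since the text and the newline reach the stream separately, a second thread can write its own message between the two, which matches the "two messages on the same line" symptom from the question.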

Keep in mind that bytecodes, and thus atomic operations, are not necessarily the same across Python releases, since the documentation states:

"CPython implementation detail: Bytecode is an implementation detail of the CPython interpreter! No guarantees are made that bytecode will not be added, removed, or changed between versions of Python." (http://docs.python.org/library/dis.htm)

So you cannot rely on the bytecode findings from one version being the same in another.
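One way to check this on whatever interpreter you have is to list the opcode names yourself instead of trusting a listing from another version. This is a rough sketch, assuming Python 3.4 or later (where dis.get_instructions is available); the function name greet is just an illustrative stand-in for the p() function above:

```python
import dis
import sys


def greet():
    print("Hello")
    print("World")


# Collect the opcode names the running interpreter generated for greet().
opnames = [ins.opname for ins in dis.get_instructions(greet)]

# Print them next to the interpreter version; the list differs between releases.
print(sys.version_info[:2], opnames)
```

Running this under two different Python versions and comparing the output shows directly how much the generated bytecode changes between releases.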

Finally, although you did not ask about it: if you want to synchronize the printing, the simplest way is to use a lock. A thread acquires the lock before the print and releases it only after it has finished printing; all the other threads block at that point and wait to acquire the lock. A simple example would be:

import threading

# One lock shared by all threads; it guards the print call.
lock = threading.Lock()

class MyThread(threading.Thread):
    def __init__(self, p):
        threading.Thread.__init__(self)
        self.p = p  # the message this thread prints

    def run(self):
        while True:
            # "with" acquires the lock and releases it even if print raises.
            with lock:
                print(self.p)

for i in range(10):
    myThread = MyThread("abc" + str(i))
    myThread.start()
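Because the example above loops forever, it is hard to check its output. Here is a bounded variant, offered only as a sketch: it assumes Python 3, writes to an in-memory io.StringIO instead of the console, and uses a worker function and repeat count that are made up for the illustration. It shows that, with the lock held around each print, every line in the output is one complete message:

```python
import io
import threading

output = io.StringIO()      # shared stream standing in for the console
lock = threading.Lock()     # guards every print to the shared stream


def worker(message, repeats):
    for _ in range(repeats):
        # Message and newline are written while the lock is held,
        # so no other thread can slip in between them.
        with lock:
            print(message, file=output)


messages = ["abc" + str(i) for i in range(4)]
threads = [threading.Thread(target=worker, args=(m, 50)) for m in messages]
for t in threads:
    t.start()
for t in threads:
    t.join()

lines = output.getvalue().splitlines()
# Every line should be exactly one of the messages, never two glued together.
print(len(lines), all(line in messages for line in lines))
```

The order of the lines still depends on how the threads are scheduled; the lock only guarantees that each message stays together with its own newline.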



Cheers

 
LVL 41

Expert Comment

by:HonorGod
ID: 33743267
@ilalopoulos - outstanding update / explanation.
 
LVL 5

Expert Comment

by:-Richard-
ID: 33747213
Yes, ilalopoulos, very authoritative, and I learned a few things myself. The only thing I would add is that you can find the number of bytecodes executed per Python "timeslice" with the sys.getcheckinterval() function, which will tell you (at least for Python 2.6) that it is 100. You can also set the check interval yourself with sys.setcheckinterval(int). Setting it to a larger number is documented as giving somewhat better performance, although it reduces the rate at which your threads switch between each other.
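A small sketch of that check follows. One caveat that is not from this thread: Python 3.2 and later switch threads on a time interval rather than a bytecode count, exposed as sys.getswitchinterval() / sys.setswitchinterval(), and sys.getcheckinterval() was removed in Python 3.9. The helper name below is hypothetical and simply reports whichever setting the running interpreter offers:

```python
import sys


def get_thread_switch_setting():
    """Return (unit, value) for the interpreter's thread-switch setting."""
    if hasattr(sys, "getswitchinterval"):
        # Python 3.2+: a time-based interval, in seconds.
        return "seconds", sys.getswitchinterval()
    # Python 2.x: the number of bytecodes between switch checks.
    return "bytecodes", sys.getcheckinterval()


unit, value = get_thread_switch_setting()
print(unit, value)
```

On a Python 2.6 or 2.7 interpreter this takes the second branch and reports the bytecode count discussed above.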
 
LVL 31

Expert Comment

by:James Murrell
ID: 34424419
This question has been classified as abandoned and is being closed as part of the Cleanup Program.  See my comment at the end of the question for more details.
