Solved

Python Multithreading Details

Posted on 2010-09-23
Last Modified: 2012-05-10
Hi guys,

I'd like to know more about how Python does multithreading, if anyone knows. Does it rely on its own threading library with its own scheduler? I am running a very simple program that starts two threads that constantly and forever print a message ending in a newline (by default). Sometimes, I get two messages on the same line, but the actual messages themselves are never interleaved (except for that newline). Does anyone know why that would happen?

Thanks in advance!

Teddy
Question by:Zyloch
7 Comments
 
LVL 28

Expert Comment

by:pepr
ID: 33741628
If they use the same output stream, try flushing the buffer after each print, or use unbuffered output.  However, if a thread does not have exclusive access to the stream... I am not sure how threads are interrupted, or whether an interruption can happen in the middle of the output.
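A minimal sketch of the flush suggestion, assuming Python 3 and a hypothetical helper named emit (not part of any library). It builds the message and its newline into one string, hands it to the stream in a single write call, and flushes right away. This only narrows the window in which another thread's output can land between the text and the newline; it does not give a thread exclusive access to the stream.

```python
import io
import sys


def emit(message, stream=None):
    """Write message plus newline in one call, then flush (hypothetical helper)."""
    if stream is None:
        stream = sys.stdout
    # One write call for text and newline together, instead of two separate ones.
    stream.write(message + "\n")
    # Push the data out of the stream's buffer immediately.
    stream.flush()


# Demo against an in-memory stream so the result is easy to inspect.
buffer = io.StringIO()
emit("Hello", buffer)
emit("World", buffer)
```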
 
LVL 28

Expert Comment

by:pepr
ID: 33741635
You may find more details in the documentation for the threading library (http://docs.python.org/library/threading.html#module-threading) or of the lower level thread module (http://docs.python.org/library/thread.html#module-thread).  I do not have first-hand experience with that.
 
LVL 3

Accepted Solution

by:
ilalopoulos earned 500 total points
ID: 33743032
Python's thread-handling mechanism is quite simple: the interpreter switches between threads every n bytecodes (n is a small number; I think the default is 100 in Python 2) or when a thread blocks on I/O, and it uses the GIL (Global Interpreter Lock) to synchronize the threads' access to the Python interpreter. The threads themselves are real system threads.

Now as to why you see the breakage in the newlines only:

Python switches threads between bytecode instructions, not inside one, so a single bytecode executes as a whole in one thread before another thread gets control. In that sense a bytecode is an atomic operation.

So, using the dis module (the disassembler for Python bytecode), we can run the following little test (I used Python 2.7; other versions may give different results):

>>> import dis
>>> def p():
      print "Hello"
      print "World"

      
>>> dis.dis(p)
  2           0 LOAD_CONST               1 ('Hello')
              3 PRINT_ITEM          
              4 PRINT_NEWLINE      

  3           5 LOAD_CONST               2 ('World')
              8 PRINT_ITEM          
              9 PRINT_NEWLINE      
             10 LOAD_CONST               0 (None)
             13 RETURN_VALUE  

As you can see, two bytecodes are involved, PRINT_ITEM and PRINT_NEWLINE, so a switch CAN happen between them, and that is the reason for the behavior you have seen. PRINT_ITEM, however, is a single bytecode, so it executes as a whole, as an atomic operation.

Keep in mind that bytecodes, and thus atomic operations, are not necessarily constant across Python releases. The documentation states:

"CPython implementation detail: Bytecode is an implementation detail of the CPython interpreter! No guarantees are made that bytecode will not be added, removed, or changed between versions of Python." (http://docs.python.org/library/dis.htm)

So you cannot rely on the bytecode findings from one version holding in another.
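To see what your own interpreter does, you can list the bytecodes yourself. This is a small sketch, assuming Python 3.4 or later (where dis.get_instructions is available); the function increment and the variable counter are made up for illustration. The opcode names it prints differ between versions, so no particular output is shown here.

```python
import dis

counter = 0


def increment():
    """A one-line statement that still compiles to several bytecodes."""
    global counter
    counter += 1


# dis.get_instructions yields one entry per bytecode instruction.
opnames = [instruction.opname for instruction in dis.get_instructions(increment)]

for name in opnames:
    print(name)
```

Because the load, the addition, and the store are separate bytecodes, a thread switch can fall between them, which is the same effect as the split between PRINT_ITEM and PRINT_NEWLINE.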

Finally, although you did not ask, if you want to synchronize the printing, the simplest way is to use a lock. A thread acquires the lock before printing and releases it only after it has finished printing; all other threads block at that point and wait to acquire the lock. A simple example would be:

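import threading

# One lock shared by all threads guards access to stdout.
lock = threading.Lock()

class MyThread(threading.Thread):
    def __init__(self, p):
        threading.Thread.__init__(self)
        self.p = p  # the message this thread prints

    def run(self):
        while True:
            # Hold the lock for the whole print statement so the text and
            # its newline cannot be separated by another thread's output.
            with lock:
                print self.p

for i in range(10):
    myThread = MyThread("abc" + str(i))
    myThread.start()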
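For readers on Python 3, here is a rough equivalent of the same idea, as a sketch rather than a definitive version. It assumes a finite loop and an in-memory stream (the names output and worker are made up for illustration) so that the result can be checked after the threads finish. The text and the newline are written in two separate calls on purpose, to mimic the two-step print described above.

```python
import io
import threading

lock = threading.Lock()
output = io.StringIO()  # stand-in for sys.stdout so the result can be checked


def worker(message, repeats):
    for _ in range(repeats):
        # Text and newline are written separately on purpose; the lock keeps
        # another thread from writing between the two calls.
        with lock:
            output.write(message)
            output.write("\n")


threads = [threading.Thread(target=worker, args=("abc" + str(i), 100))
           for i in range(10)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Every line should be exactly one complete message.
lines = output.getvalue().splitlines()
```

Using "with lock:" instead of explicit acquire() and release() calls means the lock is released even if the write raises an exception.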



Cheers

 
LVL 41

Expert Comment

by:HonorGod
ID: 33743267
@ilalopoulos - outstanding update / explanation.
 
LVL 5

Expert Comment

by:-Richard-
ID: 33747213
Yes, ilalopoulos, very authoritative, and I learned a few things myself.  The only thing I would add is that you can find the number of bytecodes executed per Python "timeslice" with the sys.getcheckinterval() function, which reports 100 (at least for Python 2.6).  You can also set the check interval yourself with sys.setcheckinterval(int).  Setting it to a larger number is documented as giving somewhat better performance, although it reduces the rate at which the interpreter switches between your threads.
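Note for later versions: a sketch assuming Python 3.2 or newer, where the interpreter switches threads on a time interval rather than a bytecode count. There the corresponding functions are sys.getswitchinterval() and sys.setswitchinterval(), and the value is in seconds. The variable names below are just for illustration.

```python
import sys

# Python 3.2+ switches threads on a time interval (in seconds), not a bytecode count.
default_interval = sys.getswitchinterval()
print("current switch interval:", default_interval)

# Ask for a longer interval: fewer switches, less responsive threads.
sys.setswitchinterval(0.01)
changed_interval = sys.getswitchinterval()

# Restore the original value so the rest of the program is unaffected.
sys.setswitchinterval(default_interval)
```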
 
LVL 31

Expert Comment

by:James Murrell
ID: 34424419
This question has been classified as abandoned and is being closed as part of the Cleanup Program.  See my comment at the end of the question for more details.