The Beauty of Python [From Novice to Expert]: How threading Daemon Threads Work


It all started with a question I had while reading the code below: it clearly says while True, so why doesn't the program loop forever?

import os
import threading
import urllib2
import Queue

class D(threading.Thread):
    def __init__(self, queue):
        threading.Thread.__init__(self)
        self.queue = queue
    def run(self):
        while  True:
            url = self.queue.get()
            self.download_file(url)
            self.queue.task_done()

    def download_file(self, url):
        h = urllib2.urlopen(url)
        filename = os.path.basename(url) + '.html'
        # Copy the response to disk in 1 KB chunks.
        with open(filename, 'wb') as f:
            while True:
                c = h.read(1024)
                if not c:
                    break
                f.write(c)

if __name__ == "__main__":
    urls= ['http://www.baidu.com','http://www.sina.com']
    queue = Queue.Queue()
    for i in range(5):
        t = D(queue)
        t.setDaemon(True)
        t.start()

    for u in urls:
        queue.put(u)

    queue.join()

Until then I had simply assumed that setDaemon just marks a thread as a background thread, and I had never dug into what that really means.

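The key to the puzzle is setDaemon. In the low-level thread module, once the main thread ends, all other threads end with it. That makes sense: when the main thread ends, Python destroys the runtime environment, so the child threads are inevitably killed as well.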

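setDaemon in the threading module exists to address exactly this. With setDaemon(True), the behavior is the same as before: when the main thread ends, the child threads end with it. With setDaemon(False), which is the default for threads created from the main thread, the main thread waits for that thread to finish before the interpreter exits, as if you had called the thread's join method.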
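To make the difference concrete, here is a minimal sketch written for Python 3, where the daemon attribute replaces setDaemon. It runs a tiny script in a fresh interpreter and captures what the script prints. The names CHILD_TEMPLATE and run_child are hypothetical helpers introduced only for this illustration.

```python
import subprocess
import sys
import textwrap

# Script executed in a child interpreter; {daemon} is filled in below.
CHILD_TEMPLATE = textwrap.dedent("""
    import threading, time

    def worker():
        time.sleep(0.5)
        print("worker finished")

    t = threading.Thread(target=worker)
    t.daemon = {daemon}
    t.start()
    # The main thread ends here without calling t.join().
""")


def run_child(daemon):
    """Run the script in a new interpreter and return what it printed."""
    code = CHILD_TEMPLATE.format(daemon=daemon)
    result = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
        timeout=30,
    )
    return result.stdout.strip()


if __name__ == "__main__":
    print(repr(run_child(True)))   # daemon worker is abandoned at exit
    print(repr(run_child(False)))  # interpreter waits for the worker
```

With daemon set to True the child process should exit before the worker gets a chance to print. With daemon set to False the interpreter waits for the worker, so its message appears.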

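So if you comment out the setDaemon call above, or change its argument to False, the program never exits: queue.join() returns, but the workers stay blocked in queue.get() inside while True, and the main thread waits for them forever.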
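The same pattern can be reproduced without any network access. The following is a sketch in Python 3 (queue instead of Queue, the daemon keyword instead of setDaemon). The function name process_with_daemon_workers and the upper-casing "work" are made up for illustration. It shows that queue.join() only waits for the tasks to be marked done, not for the worker threads to finish.

```python
import queue
import threading


def process_with_daemon_workers(items, num_workers=3):
    """Process items with daemon workers that never leave their loop."""
    tasks = queue.Queue()
    done = []

    def worker():
        while True:                 # no exit condition, as in the article
            item = tasks.get()      # blocks while the queue is empty
            done.append(item.upper())
            tasks.task_done()       # lets tasks.join() make progress

    workers = [
        threading.Thread(target=worker, daemon=True)
        for _ in range(num_workers)
    ]
    for w in workers:
        w.start()
    for item in items:
        tasks.put(item)

    tasks.join()  # returns once every task has been marked done
    still_alive = sum(w.is_alive() for w in workers)
    return sorted(done), still_alive


if __name__ == "__main__":
    print(process_with_daemon_workers(["a", "b"]))
```

After tasks.join() returns, every worker is still alive and blocked in tasks.get(). Only the daemon flag lets the interpreter exit at that point.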

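We actually don't recommend the approach above. It has the flavor of a thread pool, but if you have read a few Python thread pool implementations, you will have noticed that the while True loop typically contains an explicit exit check, because in the Python world explicit is more Pythonic than implicit. Unfortunately, the code above comes from the book <<编写高质量代码:改善Python程序的91个建议>> (Writing High-Quality Code: 91 Suggestions for Improving Python Programs). I'm not bashing the book, but I do think this code example is open to question.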
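As a rough illustration of what an explicit exit check can look like, here is a sketch in Python 3 that uses a sentinel object to tell each worker to stop. The names _STOP, worker and run_pool are hypothetical, and doubling each item stands in for real work. This is one common convention, not the method used by the book or by any particular thread pool library.

```python
import queue
import threading

_STOP = object()  # hypothetical sentinel meaning "no more work"


def worker(tasks, results):
    while True:
        item = tasks.get()
        try:
            if item is _STOP:
                break               # explicit exit from the loop
            results.append(item * 2)
        finally:
            tasks.task_done()       # runs for the sentinel as well


def run_pool(items, num_workers=3):
    tasks = queue.Queue()
    results = []
    threads = [
        threading.Thread(target=worker, args=(tasks, results))
        for _ in range(num_workers)
    ]
    for t in threads:
        t.start()
    for item in items:
        tasks.put(item)
    for _ in threads:
        tasks.put(_STOP)            # one sentinel per worker
    for t in threads:
        t.join()                    # every worker exits on its own
    return sorted(results)


if __name__ == "__main__":
    print(run_pool([1, 2, 3, 4]))
```

Because every worker leaves its loop on its own, the threads can stay non-daemon and the program still terminates cleanly.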

You may be wondering how setDaemon(False) ends up being equivalent to joining the thread. Bear with me, and I'll walk you through it.

To solve this problem, the threading module introduces the _MainThread object:

# Special thread class to represent the main thread
# This is garbage collected through an exit handler

class _MainThread(Thread):

    def __init__(self):
        Thread.__init__(self, name="MainThread")
        self._Thread__started.set()
        self._set_ident()
        with _active_limbo_lock:
            _active[_get_ident()] = self

    def _set_daemon(self):
        return False

    def _exitfunc(self):
        self._Thread__stop()
        t = _pickSomeNonDaemonThread()
        if t:
            if __debug__:
                self._note("%s: waiting for other threads", self)
        while t:
            t.join()
            t = _pickSomeNonDaemonThread()
        if __debug__:
            self._note("%s: exiting", self)
        self._Thread__delete()

def _pickSomeNonDaemonThread():
    for t in enumerate():
        if not t.daemon and t.is_alive():
            return t
    return None

# Create the main thread object,
# and make it available for the interpreter
# (Py_Main) as threading._shutdown.

_shutdown = _MainThread()._exitfunc
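
_MainThread itself doesn't do much. Its only contribution is that an instance is created when the threading module is imported, and its _exitfunc method is assigned to _shutdown. _exitfunc repeatedly picks a thread that is non-daemon and still alive and calls its join method, until none are left. So it was _MainThread quietly working behind the scenes all along. The remaining question is: who calls _shutdown?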
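The waiting loop can be imitated with public APIs only. The sketch below is for Python 3 and is not the real threading._shutdown. The function names and the optional threads parameter are assumptions added so that the example can be run safely on a known set of threads.

```python
import threading
import time


def pick_some_non_daemon_thread(threads=None):
    """Return one live non-daemon thread other than the caller, or None."""
    if threads is None:
        threads = threading.enumerate()
    current = threading.current_thread()
    for t in threads:
        if t is not current and not t.daemon and t.is_alive():
            return t
    return None


def wait_for_non_daemon_threads(threads=None):
    """Join non-daemon threads one by one, mimicking _exitfunc."""
    joined = []
    t = pick_some_non_daemon_thread(threads)
    while t:
        t.join()
        joined.append(t.name)
        t = pick_some_non_daemon_thread(threads)
    return joined


if __name__ == "__main__":
    stop = threading.Event()
    d = threading.Thread(target=stop.wait, name="d", daemon=True)
    n = threading.Thread(target=time.sleep, args=(0.2,), name="n")
    d.start()
    n.start()
    print(wait_for_non_daemon_threads([d, n]))
    stop.set()  # release the daemon thread
```

Daemon threads are simply never picked, which is why the interpreter does not wait for them.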

It must be called just before Python destroys the runtime, so open pythonrun.c and you will find the following function:

/* Wait until threading._shutdown completes, provided
   the threading module was imported in the first place.
   The shutdown routine will wait until all non-daemon
   "threading" threads have completed. */
static void
wait_for_thread_shutdown(void)
{
#ifdef WITH_THREAD
    PyObject *result;
    PyThreadState *tstate = PyThreadState_GET();
    PyObject *threading = PyMapping_GetItemString(tstate->interp->modules,
                                                  "threading");
    if (threading == NULL) {
        /* threading not imported */
        PyErr_Clear();
        return;
    }
    result = PyObject_CallMethod(threading, "_shutdown", "");
    if (result == NULL)
        PyErr_WriteUnraisable(threading);
    else
        Py_DECREF(result);
    Py_DECREF(threading);
#endif
}
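
So this is the culprit. I learned something new: even C code sometimes needs to call into Python code. There's no way around it, since the threading module is written in pure Python!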


