From: James Westby
To: notmuch
Subject: python-notmuch crash with threads
User-Agent: Notmuch/0.6.1-1 (http://notmuchmail.org) Emacs/23.3.1 (x86_64-pc-linux-gnu)
Date: Thu, 01 Dec 2011 10:24:26 -0500
Message-ID: <87pqg8b9hx.fsf@jameswestby.net>
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="=-=-="
List-Id: "Use and development of the notmuch mail system."

--=-=-=

Hi,

I've been seeing a race with python-notmuch, where it will crash due to
pointers being invalidated when threads are used.

I've attached a script which shows the problem some of the time. It's
about the smallest script I can make, but it's hampered by the fact that
making it simpler seems to make the race less likely, so it's hard to
tell whether the race is actually gone or has just become rarer.

The typical backtrace is:

    Program terminated with signal 11, Segmentation fault.
    #0  0x00007f7b19c34b59 in talloc_named_const () from /usr/lib/x86_64-linux-gnu/libtalloc.so.2
    (gdb) up
    #1  0x00007f7b1a5f78dc in notmuch_query_search_threads (query=0x14001c70) at lib/query.cc:322
    322     lib/query.cc: No such file or directory.
            in lib/query.cc
    (gdb) p *query
    Cannot access memory at address 0x14001c70

Something is invalidating the pointer between its creation in
db.create_query() and its use in query.search_threads(). I've seen other
similar crashes when running other code in the thread.

http://talloc.samba.org/talloc/doc/html/index.html talks about the
thread-safety of talloc, and I don't think it's any of those issues
here.

Any suggestions for how to debug this further would be most welcome.

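For comparison, the same hand-off can be sketched without notmuch at all, which may help tell a pure threading problem apart from one in the bindings. This is only a minimal sketch under assumptions, not part of the attached script: `Worker`, `handle` and `run_one` are hypothetical names, and the job is handled by an arbitrary callable instead of a database query.

```python
import threading


class Worker(threading.Thread):
    """Hypothetical stand-in for NotmuchThread with no notmuch dependency."""

    def __init__(self, handle):
        super(Worker, self).__init__()
        self.job_waiting = threading.Condition()
        self.job_queue = []
        self.results = []
        self.handle = handle  # callable run on each job, in the worker thread

    def run(self):
        with self.job_waiting:
            # Re-check the queue after every wake-up: wait() can time out,
            # so the queue length is the real signal that a job arrived.
            while not self.job_queue:
                self.job_waiting.wait(1)
            job = self.job_queue.pop(0)
        # The lock is released before the job itself runs.
        self.results.append(self.handle(job))


def run_one(job, handle):
    """Start a worker, hand it one job, and return what it produced."""
    worker = Worker(handle)
    worker.start()
    with worker.job_waiting:
        worker.job_queue.append(job)
        worker.job_waiting.notify()
    worker.join()
    return worker.results
```

If a stand-in like this runs reliably while the notmuch version still crashes, that would suggest the hand-off itself is not the culprit, though it would not prove it.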
Thanks,

James

--=-=-=
Content-Type: text/x-python
Content-Disposition: inline; filename=test.py

import threading
import time


class NotmuchThread(threading.Thread):

    def __init__(self):
        super(NotmuchThread, self).__init__()
        self.job_waiting = threading.Condition()
        self.job_queue = []

    def run(self):
        # The database is opened inside the worker thread.
        from notmuch.database import Database
        self.db = Database()
        job = None
        print("Acquiring lock")
        with self.job_waiting:
            if len(self.job_queue) < 1:
                print("Job queue empty so waiting")
                while True:
                    # Wake up at least once a second and re-check the queue.
                    ready = self.job_waiting.wait(1)
                    if ready:
                        break
                    if len(self.job_queue) > 0:
                        break
            print("Got job, releasing lock")
        self.search_threads("tag:inbox")

    def search_threads(self, query_string):
        query = self.db.create_query(query_string)
        # Print the raw query pointer so it can be compared with the one
        # shown in the backtrace.
        print("%X" % query._query)
        threads = query.search_threads()
        return threads


test_thread = NotmuchThread()
test_thread.start()
with test_thread.job_waiting:
    # list.append() needs an argument. The worker only checks that the
    # queue is non-empty, so this value is just a placeholder.
    test_thread.job_queue.append("job")
    test_thread.job_waiting.notify()
time.sleep(1)
test_thread.join()
--=-=-=--