Date: Mon, 9 Jul 2012 12:30:00 -0400
From: Austin Clements
To: Sebastien Binet
Subject: Re: query on a subset of messages ?
Message-ID: <20120709163000.GG18195@mit.edu>
References: <871ukl5oj7.fsf@cern.ch>
In-Reply-To: <871ukl5oj7.fsf@cern.ch>
Cc: Notmuch developer list
List-Id: "Use and development of the notmuch mail system."

Quoth Sebastien Binet on Jul 09 at 10:25 am:
>
> hi there,
>
> I was trying to reduce the I/O stress during my usual email
> fetching+tagging by writing a little program using the go bindings to
> notmuch.
>
> ie:
>  db, status := notmuch.OpenDatabase(db_path,
>                                     notmuch.DATABASE_MODE_READ_WRITE)
>  query := db.CreateQuery("(tag:new AND tag:inbox)")
>  msgs := query.SearchMessages()
>  for _,msg := range msgs {
>      tag_msg(msg, tagqueries)
>  }
>
>
> where tagqueries is a subquery of the form:
> [
>  {
>   "Cmd": "+to-me",
>   "Query": "(to:sebastien.binet@cern.ch and not tag:to-me)"
>  },
>  {
>   "Cmd": "+sci-notmuch",
>   "Query": "from:notmuch@notmuchmail.org or to:notmuch@notmuchmail.org or subject:notmuch"
>  }
> ]
>
>
> the idea being that I only need to crawl through the db only once and
> then iteratively apply tags on those messages (instead of repeatedly
> running "notmuch tag ..." for each and every of those many
> 'tag-queries')
>
> I couldn't find any C-API to do such a thing using the notmuch library.
> did I overlook something ?
>
> Is it something useful to add ?
>
> -s

Have you tried a more direct translation of the multiple notmuch tag
commands into Go, where you don't worry about subsetting the queries?
Unless you're tagging a huge number of messages, the cost of notmuch
tag is almost certainly the fsync that it does when it closes the
database (which every call to notmuch tag must do).  However, in Go,
you can keep the database open across all of the tagging operations
and then close and fsync it just once.

Note that there is an important optimization in notmuch tag that you
might have to replicate.  It manipulates the original query to exclude
messages that already have the desired tags, so that they get skipped
very efficiently at the earliest stage possible.
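
A rough sketch of that query rewrite in Go, in the spirit of what
notmuch tag does rather than a copy of its code.  It only builds the
query string and does not touch the notmuch bindings.  The names
tagQuery and optimizeTagQuery are hypothetical; the Cmd/Query fields
follow the JSON quoted above.  It assumes one tag per command and tag
names that need no quoting or escaping.

```go
package main

import (
	"fmt"
	"strings"
)

// tagQuery mirrors one entry of the quoted JSON list:
// Cmd is "+tag" or "-tag", Query is a notmuch search expression.
type tagQuery struct {
	Cmd   string
	Query string
}

// optimizeTagQuery (hypothetical helper) narrows q.Query so that it only
// matches messages the tag change would actually modify:
//
//	"+tag" excludes messages that already have the tag
//	"-tag" excludes messages that do not have the tag
//
// Assumption: the tag name needs no quoting; anything else is rejected.
func optimizeTagQuery(q tagQuery) (string, error) {
	if len(q.Cmd) < 2 || (q.Cmd[0] != '+' && q.Cmd[0] != '-') {
		return "", fmt.Errorf("bad tag command %q", q.Cmd)
	}
	tag := q.Cmd[1:]
	if strings.ContainsAny(tag, " \t\"()") {
		return "", fmt.Errorf("tag %q needs quoting, which this sketch does not do", tag)
	}
	cond := "tag:" + tag
	if q.Cmd[0] == '+' {
		cond = "not " + cond
	}
	return fmt.Sprintf("( %s ) and ( %s )", q.Query, cond), nil
}

func main() {
	q := tagQuery{
		Cmd:   "+sci-notmuch",
		Query: "from:notmuch@notmuchmail.org or to:notmuch@notmuchmail.org or subject:notmuch",
	}
	s, err := optimizeTagQuery(q)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	// The rewritten string is what would be handed to db.CreateQuery.
	fmt.Println(s)
}
```

With this, each tag-query only visits messages that still need the
change, and the database can stay open across all of them and be
closed once at the end.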