From: David Bremner
To: Jameson Graef Rollins, Jani Nikula
Cc: Notmuch Mail
Subject: Re: [PATCH 6/8] cli: add support for batch tagging operations to "notmuch tag"
In-Reply-To: <87aa2s2aow.fsf@servo.finestructure.net>
Date: Wed, 04 Apr 2012 07:55:24 -0300
Message-ID: <878vib3gwz.fsf@zancas.localnet>

Jameson Graef Rollins writes:

> With that in mind, I think I stand by my suggestion that the form should
> match exactly the notmuch subcommand format. Even considering the
> technical issues that Jani brought up, I still think it makes the most
> sense to imagine generic batch processing handled by the top level
> binary. And in that case the most logical format for the input is
> probably just that of the CLI arguments.

One thing that worries me about this (and to be honest it worries me a
bit about the single character command tag) is the potential increase in
size of a dump file, if we use exactly a list of commands as a dump
format. The SQL/XML-like argument that it will all compress well is
true; nonetheless, for applications involving version control, it does
seem useful to have an uncompressed version around.

A very rough estimate suggests that for my roughly 250k messages,
prepending "tag " to each line bloats a dump file by about 5%. Maybe
that is not worth worrying about. I'd be curious to see how

    4 * #lines / (total dump size)

works out for other people.

I thought that the bloat from having + in front of every tag would be
larger, but it seems that my messages average something like one tag per
message (many messages have no tags). I'm not sure how universal that
is.

We could also give up on marking the command on each line, and insert
some kind of simple header at the top. This idea came up in the context
of restore formats before.
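
For anyone who wants to try the estimate above on their own dump, here
is a minimal sketch of 4 * #lines / (total dump size). It assumes the
dump is a plain text file with one message per line. The function name
prefix_overhead and the sample data are hypothetical and not part of
notmuch.

```python
def prefix_overhead(dump: bytes, prefix: bytes = b"tag ") -> float:
    """Estimate the relative size increase from prepending `prefix` to
    every line of `dump`: len(prefix) * #lines / (total dump size).

    Assumes one message per line; the result is a fraction of the
    current (unprefixed) dump size.
    """
    if not dump:
        return 0.0
    lines = dump.count(b"\n")
    # Count a final line that lacks a trailing newline.
    if not dump.endswith(b"\n"):
        lines += 1
    return len(prefix) * lines / len(dump)


# Made-up sample: a single 80-byte line (79 characters plus newline),
# so the 4-byte prefix adds 4 / 80 of the original size.
sample = b"x" * 79 + b"\n"
overhead = prefix_overhead(sample)
print(f"{overhead:.1%}")
```

To run it on a real dump, read the file in binary mode and pass its
contents to prefix_overhead(). Passing prefix=b"+" and counting tags
rather than lines would give the analogous estimate for the "+" marker,
but that requires parsing the tag lists and is not shown here.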

> Just out of curiosity and for the sake of argument, if we were going to
> design a server/batch processor from the ground up would it make sense
> to use a format like this, or would we better off opting for some other
> more established protocol?

I guess it depends on how much work it is to support the established
protocol, and how good the fit is with notmuch. Are there candidates
other than IMAP?

As far as implementation effort goes, as a totally unscientific
experiment, I grabbed Net::IMAP::Server from CPAN; it is almost 7000
lines of Perl. I'm not suggesting we use Perl ;), but I doubt C is
shorter. Hopefully we wouldn't write such a library from scratch. A
quick search did not lead me to "the canonical IMAP server library",
unless that is the UW one, of which I have bad, if non-specific,
memories.

I think we'd need to use a fair number of extensions to basic IMAP.
What might work well is the GMail extensions to IMAP. I have no idea
about the difficulty of implementing those; I suspect there are no
solid C libraries supporting them.

d