From: Daniel Kahn Gillmor
Date: Thu, 10 Dec 2015 03:39:37 +0000 (+1900)
Subject: allow indexing cleartext of encrypted messages
X-Git-Url: http://git.tremily.us/?a=commitdiff_plain;h=98d5f19be945b5b038b716dfcd04ccd4cf89c7d0;p=notmuch-archives.git

allow indexing cleartext of encrypted messages
---

diff --git a/d3/123cd91725bb2059bd79117f73c16b201ae426 b/d3/123cd91725bb2059bd79117f73c16b201ae426
new file mode 100644
index 000000000..f4a1b4148
--- /dev/null
+++ b/d3/123cd91725bb2059bd79117f73c16b201ae426
@@ -0,0 +1,140 @@
+Return-Path:
+X-Original-To: notmuch@notmuchmail.org
+Delivered-To: notmuch@notmuchmail.org
+Received: from localhost (localhost [127.0.0.1])
+ by arlo.cworth.org (Postfix) with ESMTP id B6F216DE1601
+ for ; Wed, 9 Dec 2015 19:40:13 -0800 (PST)
+X-Virus-Scanned: Debian amavisd-new at cworth.org
+X-Spam-Flag: NO
+X-Spam-Score: -0.033
+X-Spam-Level:
+X-Spam-Status: No, score=-0.033 tagged_above=-999 required=5
+ tests=[AWL=-0.033] autolearn=disabled
+Received: from arlo.cworth.org ([127.0.0.1])
+ by localhost (arlo.cworth.org [127.0.0.1]) (amavisd-new, port 10024)
+ with ESMTP id Gxu-qmnTqJ1x for ;
+ Wed, 9 Dec 2015 19:40:12 -0800 (PST)
+Received: from che.mayfirst.org (che.mayfirst.org [209.234.253.108])
+ by arlo.cworth.org (Postfix) with ESMTP id 094F26DE1830
+ for ; Wed, 9 Dec 2015 19:40:06 -0800 (PST)
+Received: from fifthhorseman.net (unknown [38.109.115.130])
+ by che.mayfirst.org (Postfix) with ESMTPSA id 21197F984
+ for ; Wed, 9 Dec 2015 22:40:03 -0500 (EST)
+Received: by fifthhorseman.net (Postfix, from userid 1000)
+ id 9D1AB20548; Wed, 9 Dec 2015 22:40:03 -0500 (EST)
+From: Daniel Kahn Gillmor
+To: Notmuch Mail
+Subject: allow indexing cleartext of encrypted messages
+Date: Wed, 9 Dec 2015 22:39:37 -0500
+Message-Id: <1449718786-28000-1-git-send-email-dkg@fifthhorseman.net>
+X-Mailer: git-send-email 2.6.2
+X-BeenThere: notmuch@notmuchmail.org
+X-Mailman-Version: 2.1.20
+Precedence: list
+List-Id: "Use and development of the notmuch mail system."
+
+List-Unsubscribe: ,
+
+List-Archive:
+List-Post:
+List-Help:
+List-Subscribe: ,
+
+X-List-Received-Date: Thu, 10 Dec 2015 03:40:14 -0000
+
+Notmuch currently doesn't index the cleartext of encrypted mail. This
+is the right choice by default, because the index is basically
+cleartext-equivalent, and we wouldn't want every indexed mailstore to
+leak the contents of its encrypted mails.
+
+However, if a notmuch user has their index in a protected location,
+they may prefer the convenience of being able to search the contents
+of (at least some of) their encrypted mail.
+
+This series of patches enables notmuch to index the cleartext of
+specific encrypted messages when they're being added via "notmuch new"
+or "notmuch insert", via a new --try-decrypt flag.
+
+If --try-decrypt is used, and decryption is successful for part of a
+message, the message gets an additional "index-decrypted" tag. If
+decryption of part of a message fails, the message gets an additional
+"index-decryption-failed" tag.
+
+This tagging approach should allow people to figure out which messages
+have been indexed in the clear (or not), and can be used to
+selectively reindex them in batch with something like:
+
+----------------
+#!/usr/bin/env python3
+
+'''notmuch-reindex.py -- a quick and dirty pythonic mechanism to
+re-index specific messages in a notmuch database. This should
+probably be properly implemented as a subcommand for /usr/bin/notmuch
+itself'''
+
+import notmuch
+import sys
+
+d = notmuch.Database(mode=notmuch.Database.MODE.READ_WRITE)
+
+query = sys.argv[1]
+
+q = d.create_query(query)
+
+for m in q.search_messages():
+    mainfilename = m.get_filename()
+    origtags = m.get_tags()
+    tags = []
+    for t in origtags:
+        if t not in ['index-decrypted', 'index-decryption-failed']:
+            tags += [t]
+    d.begin_atomic()
+    for f in m.get_filenames():
+        d.remove_message(f)
+    (newm,stat) = d.add_message(mainfilename, try_decrypt=True)
+    for tag in tags:
+        newm.add_tag(tag)
+    d.end_atomic()
+----------------
+
+A couple key points:
+
+ * There is some code duplication between crypto.c (for the
+   notmuch-client) and lib/database.cc and lib/index.cc (for the
+   library) because both parts of the codebase use gmime to handle the
+   decryption. I don't want to contaminate the libnotmuch API with
+   gmime implementation details, so i don't quite see how to reuse the
+   code cleanly. I'd love suggestions on how to reduce the
+   duplications.
+
+ * the libnotmuch API is extended with
+   notmuch_database_add_message_try_decrypt(). This should probably
+   ultimately be more general, because there are a few additional
+   knobs that i can imagine fiddling at indexing time. For example:
+
+    * verifying cryptographic signatures and storing something about
+      those verifications in the notmuch db
+
+    * extracting OpenPGP session key information for a given message
+      and storing it in a lookaside table in the notmuch db, so that
+      it's possible to securely destroy old encryption-capable keys
+      and still have local access to the cleartext of the remaining
+      messages.
+
+   Some of these additional features might be orthogonal to one
+   another as well. I welcome suggestions for how to improve the API
+   so that we don't end up with a combinatorial explosion of
+   n_d_add_message_foo() functions.
+
+ * To properly complete this patch series, i think i want to make
+   notmuch-reindex.c and add a reindex subcommand, also with a
+   --try-decrypt option. It's not clear to me if the right approach
+   for that is to have a C implementation of the python script above
+   without modifying libnotmuch, or if i should start by creating a
+   notmuch_message_reindex function in libnotmuch, with a try_decrypt
+   flag. Again, suggestions welcome.
+
+ * Is the tagging approach the right thing to do to record success or
+   failure of decryption at index time? Is there a better approach?
+
+
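The tagging rule described in the cover letter (one tag when some part decrypts, another when some part fails) can be sketched in isolation as below. This is only an illustration, not code from the patch series: the two tag names come from the message itself, but the function name, the list-of-booleans input, and the reading that a message with mixed results carries both tags are assumptions.

```python
# Tag names are taken from the cover letter. Everything else here (the
# function name, the boolean-per-part input) is hypothetical and exists
# only to illustrate the rule; it is not part of notmuch.
TAG_DECRYPTED = "index-decrypted"
TAG_FAILED = "index-decryption-failed"


def decryption_tags(part_results):
    """Return the extra tags for one message.

    part_results holds one boolean per encrypted part that indexing
    attempted to decrypt: True for success, False for failure.
    """
    tags = set()
    # At least one part was decrypted and indexed in the clear.
    if any(part_results):
        tags.add(TAG_DECRYPTED)
    # At least one part could not be decrypted (assumed to be
    # independent of the success tag, so both may be present).
    if not all(part_results):
        tags.add(TAG_FAILED)
    return tags
```

Under this reading, a message with no encrypted parts gets neither tag, and a query such as tag:index-decryption-failed would select the messages worth passing to the reindexing script once the missing key becomes available.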