From 70697720f6b5b5ec3e6019644c0b10a27bf7c2e4 Mon Sep 17 00:00:00 2001 From: David Bremner Date: Tue, 30 Apr 2013 22:12:16 +2100 Subject: [PATCH] Re: [PATCH] lib: Add a new prefix "list" to the search-terms syntax --- 98/751f4487ae63899aefa12f1229554871b0dcc2 | 118 ++++++++++++++++++++++ 1 file changed, 118 insertions(+) create mode 100644 98/751f4487ae63899aefa12f1229554871b0dcc2 diff --git a/98/751f4487ae63899aefa12f1229554871b0dcc2 b/98/751f4487ae63899aefa12f1229554871b0dcc2 new file mode 100644 index 000000000..a823d8e4d --- /dev/null +++ b/98/751f4487ae63899aefa12f1229554871b0dcc2 @@ -0,0 +1,118 @@ +Return-Path: +X-Original-To: notmuch@notmuchmail.org +Delivered-To: notmuch@notmuchmail.org +Received: from localhost (localhost [127.0.0.1]) + by olra.theworths.org (Postfix) with ESMTP id EB318431FB6 + for ; Mon, 29 Apr 2013 18:12:32 -0700 (PDT) +X-Virus-Scanned: Debian amavisd-new at olra.theworths.org +X-Spam-Flag: NO +X-Spam-Score: 0 +X-Spam-Level: +X-Spam-Status: No, score=0 tagged_above=-999 required=5 tests=[none] + autolearn=disabled +Received: from olra.theworths.org ([127.0.0.1]) + by localhost (olra.theworths.org [127.0.0.1]) (amavisd-new, port 10024) + with ESMTP id 5c9T9YLBPAOG for ; + Mon, 29 Apr 2013 18:12:28 -0700 (PDT) +Received: from tesseract.cs.unb.ca (tesseract.cs.unb.ca [131.202.240.238]) + (using TLSv1 with cipher DHE-RSA-AES128-SHA (128/128 bits)) + (No client certificate requested) + by olra.theworths.org (Postfix) with ESMTPS id 5C71D431FAF + for ; Mon, 29 Apr 2013 18:12:28 -0700 (PDT) +Received: from fctnnbsc30w-156034082078.dhcp-dynamic.fibreop.nb.bellaliant.net + ([156.34.82.78] helo=zancas.localnet) + by tesseract.cs.unb.ca with esmtpsa + (TLS1.2:DHE_RSA_AES_128_CBC_SHA1:128) (Exim 4.80) + (envelope-from ) + id 1UWz7G-0004a6-VW; Mon, 29 Apr 2013 22:12:25 -0300 +Received: from bremner by zancas.localnet with local (Exim 4.80) + (envelope-from ) + id 1UWz7A-0003yr-SL; Mon, 29 Apr 2013 22:12:16 -0300 +From: David Bremner +To: "Alexey I. Froloff" , notmuch@notmuchmail.org +Subject: Re: [PATCH] lib: Add a new prefix "list" to the search-terms syntax +In-Reply-To: <1365549369-12776-1-git-send-email-raorn@raorn.name> +References: <20130409083010.GA27675@raorn.name> + <1365549369-12776-1-git-send-email-raorn@raorn.name> +User-Agent: Notmuch/0.15.2+84~g12d5e4e (http://notmuchmail.org) Emacs/24.2.1 + (x86_64-pc-linux-gnu) +Date: Mon, 29 Apr 2013 22:12:16 -0300 +Message-ID: <87vc74vn2n.fsf@zancas.localnet> +MIME-Version: 1.0 +Content-Type: text/plain +X-Spam_bar: - +X-BeenThere: notmuch@notmuchmail.org +X-Mailman-Version: 2.1.13 +Precedence: list +List-Id: "Use and development of the notmuch mail system." + +List-Unsubscribe: , + +List-Archive: +List-Post: +List-Help: +List-Subscribe: , + +X-List-Received-Date: Tue, 30 Apr 2013 01:12:33 -0000 + + +Hi Alexey, + +Thanks for working on this. I think the boolean prefix version makes +more sense, and it seems to work OK. I have a few comments below + +"Alexey I. Froloff" writes: + +> + begin_list_id = strrchr (list_id_header, '<'); +> + if (!begin_list_id) { +> + fprintf (stderr, "Warning: Not indexing mailformed List-Id tag.\n"); +> + return; +> + } + +- I guess this should say "malformed". + +- I got about 1800 lines of such messages when indexing 280k + messages. That might strike some people as excessive. On the otherhand + I guess we need to re-think error reporting overall. + + What do you think about printing filename or message-id here its + easier to double check that it is not a bug? + +> + end_list_id = strrchr(begin_list_id, '>'); +> + if (!end_list_id || (end_list_id - begin_list_id < 2)) { +> + fprintf (stderr, "Warning: Not indexing mailformed List-Id tag.\n"); +> + return; +> + } +> + + +Same comments here. + +> + void *local = talloc_new (message); +> + +> + /* We extract the list id between the angle brackets */ +> + const char *list_id = talloc_strndup (local, begin_list_id + 1, +> + end_list_id - begin_list_id - 1); +> + + we should handle ENOMEM here, I think. + +> + /* _notmuch_message_add_term() may return +> + * NOTMUCH_PRIVATE_STATUS_TERM_TOO_LONG here. We can't fix it, but +> + * this is not a reason to exit with error... */ +> + if (_notmuch_message_add_term (message, "list", list_id)) +> + fprintf (stderr, "Warning: Not indexing List-Id: <%s>\n", list_id); + +This should say why the indexing failed. + +Other than that: + +- We need a couple tests for this code; tests/search should give some + hints how to proceed. + +- We need a patch for NEWS, explaining what people need to do take + advantage of the new functionality. I think that adding new prefixes + to an existing database is OK, but I'd welcome confirmation. + +BTW, my not too scientific tests show no detectable bloat in the +database, at least after running xapian-compact. I'd be curious what +other people report. + -- 2.26.2