From: Jani Nikula
To: notmuch@notmuchmail.org
Subject: [PATCH 0/6] lib: replace the message header parser with gmime
Date: Wed, 16 Oct 2013 22:00:07 +0300

Hi all, here's something to debate. ;)

We have a homebrew message header parser in the lib, and we also parse
messages, including headers, using gmime during indexing. This means that
for messages that get indexed, we parse the headers twice. (Duplicates and
non-emails only get parsed using our own parser.) The two parsers handle
some things differently, which may cause confusion (tab handling in header
folding, for example). In the interest of having less somewhat complicated
code to maintain, this series just nukes the homebrew parser in favor of
gmime.

I did not look into the history of why we have our own parser to begin
with; it was more fun to just do some coding. ;)

Patches 1-3 do prep work to fix some of the differences between the parsers
in advance. Arguably they are worth having regardless of the parser change.

Patches 4-5 actually make the change. Having two patches is a somewhat
artificial division, but it perhaps makes the change easier to review.

Patch 6 is just a hack to make the perf tests not ignore so many mails: by
the gmime parser's standards, we have quite a few non-emails in the corpus.
This also illustrates one of the differences between the parsers.

BR,
Jani.

Austin Clements (1):
  emacs: Sanitize authors and subjects in search and show

Jani Nikula (5):
  cli: sanitize tabs to spaces in notmuch search
  cli: make the hacky from guessing more liberal
  lib: replace the header parser with gmime
  lib: parse messages only once
  HACK: fix broken messages in the perf test corpus

 emacs/notmuch-lib.el              |   6 +
 emacs/notmuch-show.el             |   7 +-
 emacs/notmuch.el                  |   6 +-
 lib/database.cc                   |   6 +-
 lib/index.cc                      |  70 +-------
 lib/message-file.c                | 351 +++++++++++++-------------------------
 lib/message.cc                    |   6 +
 lib/notmuch-private.h             |  19 ++-
 notmuch-reply.c                   |   4 +-
 notmuch-search.c                  |   4 +-
 performance-test/perf-test-lib.sh |   4 +
 11 files changed, 172 insertions(+), 311 deletions(-)

-- 
1.8.4.rc3
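
For readers unfamiliar with the tab issue in header folding mentioned in the cover letter, here is a minimal sketch of why it can show up. It is not the notmuch or gmime code, and the function names are hypothetical. It assumes the unfolding rule from RFC 5322 section 2.2.3 (remove a line break that is immediately followed by a space or tab), which leaves the tab itself in the value, so whatever displays the header has to decide what to do with it.

```python
import re


def unfold_header(raw: str) -> str:
    """Unfold a folded header: drop each line break that is
    immediately followed by a space or tab (RFC 5322, 2.2.3).
    The whitespace character itself is kept."""
    return re.sub(r"\r?\n(?=[ \t])", "", raw)


def sanitize_whitespace(value: str) -> str:
    """Hypothetical display helper: turn tabs into plain spaces."""
    return value.replace("\t", " ")


folded = (
    "Subject: [PATCH 0/6] lib: replace the message header parser\r\n"
    "\twith gmime"
)

unfolded = unfold_header(folded)
print(repr(unfolded))                       # the tab survives unfolding
print(repr(sanitize_whitespace(unfolded)))  # tab replaced for display
```

A parser that only unfolds hands back a value with an embedded tab, while one that also normalizes whitespace hands back a space, so two parsers can disagree on the same message.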