--- /dev/null
+Return-Path: <cworth@cworth.org>\r
+X-Original-To: notmuch@notmuchmail.org\r
+Delivered-To: notmuch@notmuchmail.org\r
+Received: from localhost (localhost [127.0.0.1])\r
+ by olra.theworths.org (Postfix) with ESMTP id 5BB9D431FBF;\r
+ Sat, 21 Nov 2009 09:07:23 -0800 (PST)\r
+X-Virus-Scanned: Debian amavisd-new at olra.theworths.org\r
+Received: from olra.theworths.org ([127.0.0.1])\r
+ by localhost (olra.theworths.org [127.0.0.1]) (amavisd-new, port 10024)\r
+ with ESMTP id BRy2DldsBgM4; Sat, 21 Nov 2009 09:07:22 -0800 (PST)\r
+Received: from cworth.org (localhost [127.0.0.1])\r
+ by olra.theworths.org (Postfix) with ESMTP id D5801431FAE;\r
+ Sat, 21 Nov 2009 09:07:21 -0800 (PST)\r
+From: Carl Worth <cworth@cworth.org>\r
+To: Stefan Schmidt <stefan@datenfreihafen.org>, notmuch@notmuchmail.org\r
+In-Reply-To: <20091121145111.GB19397@excalibur.local>\r
+References: <20091121145111.GB19397@excalibur.local>\r
+Date: Sat, 21 Nov 2009 18:07:10 +0100\r
+Message-ID: <87fx874xj5.fsf@yoom.home.cworth.org>\r
+MIME-Version: 1.0\r
+Content-Type: text/plain; charset=us-ascii\r
+Subject: Re: [notmuch] 25 minutes load time with emacs -f notmuch\r
+X-BeenThere: notmuch@notmuchmail.org\r
+X-Mailman-Version: 2.1.12\r
+Precedence: list\r
+List-Id: "Use and development of the notmuch mail system."\r
+ <notmuch.notmuchmail.org>\r
+List-Unsubscribe: <http://notmuchmail.org/mailman/options/notmuch>,\r
+ <mailto:notmuch-request@notmuchmail.org?subject=unsubscribe>\r
+List-Archive: <http://notmuchmail.org/pipermail/notmuch>\r
+List-Post: <mailto:notmuch@notmuchmail.org>\r
+List-Help: <mailto:notmuch-request@notmuchmail.org?subject=help>\r
+List-Subscribe: <http://notmuchmail.org/mailman/listinfo/notmuch>,\r
+ <mailto:notmuch-request@notmuchmail.org?subject=subscribe>\r
+X-List-Received-Date: Sat, 21 Nov 2009 17:07:23 -0000\r
+\r
+On Sat, 21 Nov 2009 15:51:11 +0100, Stefan Schmidt <stefan@datenfreihafen.org> wrote:\r
+> Disclaimer: I'm using vim, in combination with mutt for email, for years, but\r
+> never dealt with emacs. Please have this in mind and spot any emacs user errors\r
+> in this report. :)\r
+\r
+Hi Stefan, welcome to Notmuch! And don't worry, we don't discriminate\r
+(too much) against non-emacs users around here.\r
+\r
+> I have first seen notmuch several weeks ago as it seems a silent project. Being\r
+> more than happy now that it evolves quickly and a real developer community\r
+> builds around it.\r
+\r
+Yes. Notmuch was a silent project since it was just something that I was\r
+doing for myself. I was always writing it as free software, and even had\r
+a public git repository available, but hadn't advertised it at all yet.\r
+\r
+And Keith did rather catch me off guard by announcing it. But I can't\r
+complain as we have gotten a nice community started already, and it's\r
+great to have other people writing the code that I intended to\r
+write. :-)\r
+\r
+But it's also true that some obvious problems just aren't taken care of\r
+yet.\r
+\r
+> But now to my problem. Getting my mail indexed was easy enough:\r
+> \r
+> stefan@excalibur:~$ du -chs not-much-mail/\r
+> 1.5G not-much-mail/\r
+> 1.5G total\r
+> stefan@excalibur:~$ time notmuch new\r
+> Found 103677 total files.\r
+> Processed 103677 total files in 42m 30s (40 files/sec.).\r
+> Added 100899 new messages to the database (not much, really).\r
+\r
+Good. I'm glad that went fairly smoothly for you.\r
+\r
+Though, frankly, I think we need to fix "notmuch new" to do much better\r
+than 40 files/sec. One plan I have for this is to not use the database\r
+to search for message IDs when adding many messages---but to instead\r
+just use a hash-table (seeded from any messages already in the\r
+database). This would allow us to do all thread resolution before\r
+indexing messages, without having to do the N different searches, and\r
+also means we'd avoid continually rewriting documents when merging\r
+thread IDs.\r
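A minimal sketch of the hash-table idea described above, not the actual notmuch code. It assumes each message is given as a pair of its message ID and the IDs it references, and that `known_threads` (a hypothetical name) maps message IDs already in the database to their thread IDs. Threads are merged in memory with a union-find structure, so no database search or document rewrite is needed until the final assignment is known.

```python
def resolve_threads(new_messages, known_threads=None):
    """Group messages into threads using only in-memory hash tables.

    new_messages: list of (message_id, [referenced_message_ids]).
    known_threads: optional dict of message_id -> thread_id for messages
    already stored in the database (used to seed the table).
    Returns a dict of message_id -> thread representative; messages that
    share a representative belong to the same thread.
    """
    parent = {}

    def find(node):
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]  # path halving
            node = parent[node]
        return node

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    # Seed from the database: every known message joins its thread's node.
    for message_id, thread_id in (known_threads or {}).items():
        union(("thread", thread_id), message_id)

    # Link each new message to everything it references.
    for message_id, references in new_messages:
        find(message_id)
        for ref in references:
            union(ref, message_id)

    return {message_id: find(message_id) for message_id, _ in new_messages}
```

With the representatives computed up front, each message could then be indexed once with its final thread ID.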
+\r
+> I put (require 'notmuch) in my ~/.emacs and start emacs with the -f notmuch\r
+> option to enter the notmuch mode.\r
+\r
+I'm glad you've figured that much out. I feel bad that that's not even\r
+in the documentation anywhere yet.\r
+\r
+> What happens then is that a notmuch process gets started and emacs\r
+> waits for the return.\r
+\r
+OK. This is a known shortcoming. As Bdale supposes, this problem is from\r
+notmuch trying to load and construct every thread in your\r
+database. There are actually several different bugs/missing features\r
+here that should be addressed:\r
+\r
+ * "notmuch new" should look at the S ("seen") flag in maildir files to\r
+ determine that they are read and do not need to be marked as "inbox"\r
+ and "unread"\r
+\r
+ * "notmuch setup" should prompt for some date range, ("last 2 months"\r
+ by default?) before which no messages will be considered unread.\r
+\r
+Either of those two fixes would have prevented your particular\r
+problem. But it's still easy to generate searches that return large\r
+numbers of results. So there's some more to do:\r
+\r
+ * The emacs code needs to call "notmuch search" with the --first and\r
+ --max-threads options to get a limited set of results, (one or two\r
+ screenfuls). You should be able to test this at the command line and\r
+ see that it returns results quickly. Then, of course, we'd like the\r
+ emacs code to fill in subsequent screenfuls as you page.\r
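A small sketch of the paging arithmetic behind that item. The `--first` and `--max-threads` option names come from the message above. The `--option=value` spelling, the function name, and the default page size are assumptions made for illustration.

```python
def search_command(query, page, threads_per_page=50):
    """Build the argument list for fetching one page of search results.

    page is zero-based; threads_per_page is an assumed page size.
    """
    first = page * threads_per_page
    return [
        "notmuch",
        "search",
        "--first=%d" % first,
        "--max-threads=%d" % threads_per_page,
        query,
    ]
```

A front end would run the command for page 0 immediately and request later pages only as the user scrolls.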
+\r
+But none of that helps you right now. What you need is to retroactively\r
+remove all of the "inbox" and "unread" tags from messages older than\r
+some time period. So then there's another missing feature:\r
+\r
+ * We need to support date-range-based searches. If we had that you\r
+ could just do:\r
+\r
+ notmuch tag -inbox -unread until:"2 months ago"\r
+\r
+ But we don't quite have this yet. Xapian does have support for a\r
+ slightly less convenient date range specification:\r
+\r
+ 1970-01-01..2009-09-21\r
+\r
+ but it turns out that we can't even use that just yet, since to make\r
+ that work we would have to have dates saved as YYYYMMDD strings for\r
+ each message, (where instead we have time_t values stored serialized\r
+ into a string that will sort correctly). So we need a new\r
+ ValueRangeProcessor class to map to timestamps, and then we'll need\r
+ some fancy parsing to do things like "2 months ago".\r
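A rough sketch of the mapping described in that item, under stated assumptions. It does not use the Xapian API, and the zero-padded encoding in `sortable` is only a stand-in for whatever serialization notmuch actually stores. All function names here are hypothetical. Dates are interpreted as UTC, and the end date is treated as inclusive.

```python
import calendar
from datetime import datetime, timedelta


def parse_date_range(spec):
    """Map 'YYYY-MM-DD..YYYY-MM-DD' to (start, end) Unix timestamps (UTC)."""
    start_text, end_text = spec.split("..")
    start = datetime.strptime(start_text, "%Y-%m-%d")
    # Cover the whole end day: go to the next midnight, then back one second.
    end = datetime.strptime(end_text, "%Y-%m-%d") + timedelta(days=1)
    return calendar.timegm(start.timetuple()), calendar.timegm(end.timetuple()) - 1


def months_ago(now, months):
    """Return `now` moved back by whole calendar months, clamping the day."""
    index = now.year * 12 + (now.month - 1) - months
    year, month0 = divmod(index, 12)
    day = min(now.day, calendar.monthrange(year, month0 + 1)[1])
    return now.replace(year=year, month=month0 + 1, day=day)


def sortable(timestamp):
    """Encode a non-negative timestamp so string order equals numeric order."""
    return "%020d" % timestamp
```

For example, `months_ago(datetime(2009, 11, 21), 2)` gives 21 September 2009, the end date used in the range above.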
+\r
+So, what's the best thing to do today if you want to start playing with\r
+notmuch? I think you could pick one of the above to work on, (a quick\r
+hack to "notmuch new" and a re-import might do the trick). Or you might\r
+just remove the inbox and unread tags from all messages and then just\r
+let messages that are actually *new* in the future get tagged into the\r
+inbox by "notmuch new". Oh, but then there's another missing feature:\r
+\r
+ * We need a syntax to specify a search string that should match all\r
+ messages. Then you could do:\r
+\r
+ notmuch tag -inbox -unread <whatever-magic-we-came-up-with>\r
+\r
+Yikes! So many bugs and missing features. How is anyone actually using\r
+this system? Well, Keith and I were able to get past all this by simply\r
+doing a "notmuch restore" based on tags we got from sup-dump. So here\r
+is another attempt:\r
+\r
+ 1. Run "notmuch dump <some-file>" to get the list of message IDs, (all\r
+ with their "inbox" and "unread" tags).\r
+\r
+ 2. Edit that file to remove the tags you want.\r
+\r
+ 3. Run "notmuch restore <some-file>" to cause the tags to be removed.\r
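Step 2 could be scripted along these lines. This is a sketch that assumes each line of the dump file has the form `message-id (tag1 tag2 ...)`; that format is an assumption here, so check it against a real dump first. The function name is hypothetical. It strips the given tags from every message, since the dump carries no dates to filter on.

```python
import re

# Assumed dump line format: "<message-id> (<tag> <tag> ...)"
DUMP_LINE = re.compile(r"^(\S+) \(([^)]*)\)$")


def strip_tags(lines, remove=("inbox", "unread")):
    """Return dump lines with the given tags removed from every message."""
    result = []
    for line in lines:
        text = line.rstrip("\n")
        match = DUMP_LINE.match(text)
        if match is None:
            result.append(text)  # leave unrecognized lines untouched
            continue
        message_id, tags = match.groups()
        kept = [tag for tag in tags.split() if tag not in remove]
        result.append("%s (%s)" % (message_id, " ".join(kept)))
    return result
```

The edited lines would be written to a new file, which is then passed to "notmuch restore" as in step 3.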
+\r
+But, (*sigh*), that's not good either, because "notmuch dump" is\r
+currently hard-coded to dump messages in message-ID order rather than\r
+date order, (so you can't easily do something like "just remove the tags\r
+from messages older than two months").\r
+\r
+So, there's sadly no easy way to get what you want with the tools in\r
+their current form. I guess that's the pain that you get for being an\r
+early adopter. :-}\r
+\r
+But if hacking a little C code doesn't scare you away, a lot of the\r
+things listed above are actually really easy to fix. (Like, fixing\r
+"notmuch dump" to just run in date order is a one-line change. Adding a\r
+--sort command-line option to it wouldn't be much harder, etc.)\r
+\r
+So hopefully the above serves as a nice TODO list.\r
+\r
+Thanks everyone for your interest in this software even in its current,\r
+can-be-painful-to-use state.\r
+\r
+-Carl\r
+\r
+PS. Expect the mass-re-tag operations to be about as slow as the\r
+original "notmuch new" import of the messages. That's a known bug in\r
+Xapian that's one of the highest priority things that I'd like to fix,\r
+(along with all of the above and all the other things I want to do...)\r
+\r
+At least we're not running out of things to work on here.\r