Carl Worth [Wed, 28 Oct 2009 17:42:07 +0000 (10:42 -0700)]
Add full-text indexing using the GMime library for parsing.
This is based on the old notmuch-index-message.cc from early in
the history of notmuch, but considerably cleaned up now that
we have some experience with Xapian and know just what we want
to index, (rather than just blindly trying to index exactly
what sup does).
This does slow down notmuch_database_add_message a *lot*, but I've
got some ideas for getting some time back.
Carl Worth [Wed, 28 Oct 2009 17:35:26 +0000 (10:35 -0700)]
notmuch search: Clarify documentation of implicit Boolean operators
The original documentation of implicit AND is what we want, but
Xapian doesn't actually let us get that today. So be honest about
what the user can actually expect. And let's hope the Xapian
wizards give us the feature we want soon:
http://trac.xapian.org/ticket/402
Carl Worth [Wed, 28 Oct 2009 14:28:01 +0000 (07:28 -0700)]
TODO: A couple new items.
It's time to put full-text indexing back, and we might want to
experiment with optimizing the original thread-stitching phase.
Carl Worth [Wed, 28 Oct 2009 08:46:24 +0000 (01:46 -0700)]
TODO: Remove a couple of since-completed items.
"notmuch tag" is implemented now and seems to work great (and fast).
As for the race condition, as noted in the description we're removing,
it's not exposed directly in the API, but only in a client that
allows for looping over search results and removing the inbox tag
from all of them. But then, that's exactly what the "notmuch tag"
command does. So, as discussed, we've now documented that command
to highlight the issue. Problem resolved, (as well as we can).
Carl Worth [Wed, 28 Oct 2009 08:37:57 +0000 (01:37 -0700)]
notmuch help: Review and augment all of the "notmuch help" documentation.
The big addition here is the first description of the syntax for
the query strings for "notmuch search", (and, by reference, for
"notmuch tag").
Carl Worth [Wed, 28 Oct 2009 07:58:26 +0000 (00:58 -0700)]
notmuch help: Be less verbose by default and support detailed help
Putting all of our documentation into a single help message was getting
a bit unwieldy. Now, the simple output of "notmuch help" is a reasonable
reminder and a quick reference. Then we now support a new syntax of:
"notmuch help <command>" for the more detailed help messages.
This gives us freedom to put more detailed caveats, etc. into some
sub-commands without worrying about the usage statement getting too
long.
Carl Worth [Wed, 28 Oct 2009 06:59:06 +0000 (23:59 -0700)]
notmuch tag: Fix crash when removing a tag that didn't exist
Xapian is trying to be useful by reporting that the specified term
didn't exist, but this is one case where we just don't care. :-)
Carl Worth [Wed, 28 Oct 2009 06:57:37 +0000 (23:57 -0700)]
Fix segfault in case of the database lock not being available.
We were nicely reporting the lock-acquisition failure, but then marching
along trying to use the database object and just crashing badly.
So don't do that.
Carl Worth [Wed, 28 Oct 2009 06:55:08 +0000 (23:55 -0700)]
Update prefix so that "thread:" can be used in search strings.
It's convenient to be able to do things like:
notmuch tag -inbox thread:<thread-id>
(even though this can run into a race condition as noted in TODO--the fix
for the race is simply to not run "notmuch new" between reading a thread
with the (not yet existent) "notmuch show" and removing its inbox tag
with a command like the above). So we now allow such a thing.
Carl Worth [Wed, 28 Oct 2009 00:07:14 +0000 (17:07 -0700)]
Add new "notmuch tag" command for adding/removing tags.
This uses the same search functionality as "notmuch search" so
it should be quite powerful. And this global search might be
quick enough to be used for "automatic" adding of tags to new
messages.
Of course, this will all be a lot more useful when we can search
for actual text of messages and not just tags.
Carl Worth [Tue, 27 Oct 2009 23:19:20 +0000 (16:19 -0700)]
notmuch_database_add_message: Do not return a message on failure.
The recent, disastrous failure of "notmuch new" would have been
avoided with this change. The new_command function was basically
assuming that it would only get a message object on success so
wasn't destroying the message in the other cases.
Carl Worth [Tue, 27 Oct 2009 23:17:22 +0000 (16:17 -0700)]
notmuch_database_close: Explicitly flush the Xapian database.
This would have helped with the recent bug causing "notmuch new"
to not record any results in the database. I'm not sure why
the explicit flush would be required, (shouldn't the destructor
always ensure that things flush?), but perhaps some outstanding
references from the leak prevented that.
In any case, an explicit flush on close() seems to make sense.
Carl Worth [Tue, 27 Oct 2009 23:12:04 +0000 (16:12 -0700)]
Merge branch to fix broken "notmuch setup" and "notmuch new"
I'm trying to stick to a habit of fixing previously-introduced bugs
on side branches off of the commit that introduced the bug. The
idea here is to make it easy to find the commits to cherry pick
if bisecting in the future lands on one of the broken commits.
Carl Worth [Tue, 27 Oct 2009 23:07:27 +0000 (16:07 -0700)]
Fix "notmuch new" (bad performance, and no committing of results).
We were incorrectly only destroying messages in the case of
successful addition to the database, and not in other cases,
(such as failure due to FILE_NOT_EMAIL).
I'm still not entirely sure why this was performing abysmally, (as in
making an operation that should take a small fraction of a second take
10 seconds), nor why it was causing the database to entirely fail to
get new results.
But fortunately, this all seems to work now.
Carl Worth [Tue, 27 Oct 2009 19:00:58 +0000 (12:00 -0700)]
Unbreak the "notmuch setup" command.
The recent addition of support for automatically adding tags to
new messages for "notmuch new" caused "notmuch setup" to segfault.
The fix is simple, (just need to move a destroy function to inside
a nearby if block).
Did I mention recently we need to add a test suite?
Carl Worth [Tue, 27 Oct 2009 18:35:30 +0000 (11:35 -0700)]
TODO: Several more ideas that have come to mind, that I don't want to forget.
Some of these are simple little code cleanups, but it's nice to write them
down rather than trying to remember them.
Carl Worth [Tue, 27 Oct 2009 17:19:46 +0000 (10:19 -0700)]
TODO: More notes on archive-thread and race conditions.
Interestingly, it's our simple "notmuch" client that's going to be the
most difficult to fix. There's just not as much information preserved
in the textual representation from "notmuch search" as there is in the
objects returned from notmuch_query_search_threads.
Carl Worth [Tue, 27 Oct 2009 17:04:48 +0000 (10:04 -0700)]
TODO: Add "notmuch tag" and thoughts on avoiding races in archiving threads.
The archive-thread race condition doesn't even exist now because there's
no command for modifying tags at the level of a thread (just individual
messages).
Carl Worth [Tue, 27 Oct 2009 05:25:45 +0000 (22:25 -0700)]
notmuch restore: Fix to remove all tags before adding tags.
This means that the restore operation will now properly pick up the
removal of tags indicated by the tag just not being present in the
dump file.
We added a few new public functions in order to support this:
notmuch_message_freeze
notmuch_message_remove_all_tags
notmuch_message_thaw
Carl Worth [Tue, 27 Oct 2009 05:19:08 +0000 (22:19 -0700)]
notmuch restore: Don't bother printing tag values.
The code was just a little messy here with three parallel conditions
testing for message == NULL.
Carl Worth [Tue, 27 Oct 2009 04:44:05 +0000 (21:44 -0700)]
add_message: Add an optional parameter for getting the just-added message.
We use this to implement the addition of "inbox" and "unread" tags
for all messages added by "notmuch new".
Carl Worth [Tue, 27 Oct 2009 03:11:58 +0000 (20:11 -0700)]
Fix incorrect name of _notmuch_thread_get_subject.
Somehow this naming with an underscore crept in, (but only in the
private header, so notmuch.c was compiling with no prototype). Fix
to be the notmuch_thread_get_subject originally intended.
Carl Worth [Tue, 27 Oct 2009 00:35:31 +0000 (17:35 -0700)]
Add public notmuch_thread_get_subject
And use this in "notmuch search" to display subject line as well as
thread ID.
Carl Worth [Mon, 26 Oct 2009 22:17:10 +0000 (15:17 -0700)]
Remove all calls to g_strdup_printf
Replacing them with calls to talloc_asprintf if possible, otherwise
to asprintf (with its painful error-handling leaving the pointer
undefined).
Carl Worth [Mon, 26 Oct 2009 21:46:14 +0000 (14:46 -0700)]
Add notmuch_thread_get_tags
And augment "notmuch search" to print tag values as well as thread ID
values. This tool is almost usable now.
Carl Worth [Mon, 26 Oct 2009 21:12:56 +0000 (14:12 -0700)]
tags: Replace sort() and reset() with prepare_iterator().
The previous functions were always called together, so we might as
well just have one function for this. Also, the reset() name was
poor, and prepare_iterator() is much more descriptive.
Carl Worth [Mon, 26 Oct 2009 21:02:58 +0000 (14:02 -0700)]
Fix memory leak in notmuch_thread_results_t
If we were using a talloc-based resizing array then this wouldn't
have happened. Of course, thanks to valgrind for catching this.
Carl Worth [Mon, 26 Oct 2009 16:13:19 +0000 (09:13 -0700)]
tags: Re-implement tags iterator to avoid having C++ in the interface
We want to be able to iterate over tags stored in various ways, so
the previous TermIterator-based tags object just wasn't general
enough. The new interface is nice and simple, and involves only
C datatypes.
Carl Worth [Mon, 26 Oct 2009 16:20:32 +0000 (09:20 -0700)]
notmuch restore: Fix leak of FILE* object.
Apparently, I didn't copy enough of the "notmuch dump" implementation
since it didn't have a similar leak.
Carl Worth [Mon, 26 Oct 2009 13:00:07 +0000 (06:00 -0700)]
Hide away the details of the implementation of notmuch_tags_t.
We will soon be wanting multiple different implementations of
notmuch_tags_t iterators, so we need to keep the actual structure
as an implementation detail inside of tags.cc.
Carl Worth [Mon, 26 Oct 2009 12:53:40 +0000 (05:53 -0700)]
Move terms and tags code to a new tags.cc file.
We want to start using this from both message.cc and thread.cc so we
need it in a place we can share the code. This also requires a new
notmuch-private-cxx.h header file for interfaces that include
C++-specific datatypes (such as Xapian::Document).
Carl Worth [Mon, 26 Oct 2009 12:14:51 +0000 (05:14 -0700)]
results_get: Fix to return NULL if past the end of the results
We had documented both notmuch_thread_results_get and
notmuch_message_results_get to return NULL if (! has_more)
but we hadn't actually implemented that. Fix.
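The documented contract can be illustrated with a toy iterator. This is only a sketch: results_t and the three functions below are hypothetical stand-ins over a plain array, not the actual notmuch implementation.

```c
#include <stddef.h>

/* Hypothetical stand-in for a results iterator over an array of items. */
typedef struct {
    const char **items;
    size_t count;
    size_t index;
} results_t;

static int
results_has_more (const results_t *results)
{
    return results->index < results->count;
}

/* Return the current item, or NULL when positioned past the end. */
static const char *
results_get (const results_t *results)
{
    if (! results_has_more (results))
        return NULL;

    return results->items[results->index];
}

/* Move to the next item; a no-op once the end has been reached. */
static void
results_advance (results_t *results)
{
    if (results_has_more (results))
        results->index++;
}
```

With the NULL return in place, a caller that forgets to check has_more gets a clear sentinel rather than a read past the end of the array.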
Carl Worth [Mon, 26 Oct 2009 06:18:05 +0000 (23:18 -0700)]
Add TODO file.
I've been maintaining this for a while now, so I might as well
start tracking it with revision control as well.
Carl Worth [Mon, 26 Oct 2009 06:12:20 +0000 (23:12 -0700)]
Add an initial implementation of a notmuch_thread_t object.
We've now got a new notmuch_query_search_threads and a
notmuch_threads_result_t iterator. The thread object itself
doesn't do much yet, (just allows one to get the thread_id),
but that's at least enough to see that "notmuch search" is
actually doing something now, (since it has been converted
to print thread IDs instead of message IDs).
And maybe that's all we need. Getting the messages belonging
to a thread is as simple as a notmuch_query_search_messages
with a string of "thread:<thread-id>".
Though it would be convenient to add notmuch_thread_get_messages
which could use the existing notmuch_message_results_t iterator.
Now we just need an implementation of "notmuch show" and we'll
have something somewhat usable.
Carl Worth [Mon, 26 Oct 2009 05:11:09 +0000 (22:11 -0700)]
Rename notmuch_query_search to notmuch_query_search_messages
Along with renaming notmuch_results_t to notmuch_message_results_t.
The new type is quite a mouthful, but I don't expect it to be
used much other than the for-loop idiom in the documentation,
(which does at least fit nicely within 80 columns).
This is all in preparation for the addition of a new
notmuch_query_search_threads of course.
Carl Worth [Sun, 25 Oct 2009 23:12:24 +0000 (16:12 -0700)]
Drop dead function add_term.
Even with the recent warnings work, gcc didn't tell me about a static
function that I'm not calling? Apparently I get "defined but not
used" in C files, but not C++ files. That's bogus, and yet one more
reason for me to push the C++ to a minimal lower layer.
Carl Worth [Sun, 25 Oct 2009 23:09:31 +0000 (16:09 -0700)]
Fix missing xapian-flags when generating dependencies.
I didn't notice this because `xapian-config -cxxflags` gives empty
output on my system. But for someone with the xapian library
installed in some non-standard location this would be important.
Carl Worth [Sun, 25 Oct 2009 23:07:46 +0000 (16:07 -0700)]
Drop unused variable.
I didn't end up adding any of the warnings options that aren't allowed
for C++, (such as -Wold-style-definition, -Wnested-externs,
-Werror-implicit-function-declaration, -Wstrict-prototypes,
-Wmissing-prototypes, or -Wbad-function-cast). So for now we can
drop the separate C and C++ variables for warnings.
Carl Worth [Sun, 25 Oct 2009 23:03:45 +0000 (16:03 -0700)]
Add -Wswitch-enum and fix warnings.
Having to enumerate all the enum values at every switch is annoying,
but this warning actually found a bug, (missing support for
NOTMUCH_STATUS_OUT_OF_MEMORY in notmuch_status_to_string).
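A minimal sketch of the kind of function this warning protects. The status_t enum and its strings are made up for illustration and are much smaller than the real notmuch_status_t.

```c
#include <string.h>

/* Hypothetical status enum; the real notmuch_status_t has more values. */
typedef enum {
    STATUS_SUCCESS,
    STATUS_OUT_OF_MEMORY,
    STATUS_FILE_ERROR
} status_t;

/* With -Wswitch-enum, gcc warns if any enumerator lacks a case label,
 * even when a default label is present. */
static const char *
status_to_string (status_t status)
{
    switch (status) {
    case STATUS_SUCCESS:
        return "No error occurred";
    case STATUS_OUT_OF_MEMORY:
        return "Out of memory";
    case STATUS_FILE_ERROR:
        return "Something went wrong trying to read or write a file";
    }

    /* Reached only for values outside the enumeration. */
    return "Unknown error status value";
}
```

Deleting the STATUS_OUT_OF_MEMORY case from this sketch should make gcc report the missing enumerator at compile time when -Wswitch-enum is enabled.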
Carl Worth [Sun, 25 Oct 2009 22:58:05 +0000 (15:58 -0700)]
Add -Wmissing-declarations and fix warnings.
Wow, lots of missing 'static' on internal functions.
Carl Worth [Sun, 25 Oct 2009 22:55:23 +0000 (15:55 -0700)]
Add -Wwrite-strings and fix warnings.
Need to be const-clean when handling string literals.
Carl Worth [Sun, 25 Oct 2009 22:53:27 +0000 (15:53 -0700)]
Re-enable the warning for unused parameters.
It's easy enough to squelch the warning with an __attribute__ ((unused))
and it might just catch something for us in the future.
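For illustration, squelching the warning on a single parameter might look like the following. The callback type and function names are hypothetical, and __attribute__ is a GCC/Clang extension rather than standard C.

```c
#include <stddef.h>

/* Hypothetical callback type: every callback receives a closure pointer,
 * whether or not it needs one. */
typedef int (*visit_func_t) (int value, void *closure);

/* This callback ignores its closure, so mark just that parameter to keep
 * -Wunused-parameter quiet without disabling the warning globally. */
static int
double_value (int value, void *closure __attribute__ ((unused)))
{
    return value * 2;
}

/* Invoke a callback with the given value and closure. */
static int
apply (visit_func_t visit, int value, void *closure)
{
    return visit (value, closure);
}
```

Because the annotation is per-parameter, a genuinely forgotten parameter elsewhere still triggers the warning.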
Carl Worth [Sun, 25 Oct 2009 22:39:53 +0000 (15:39 -0700)]
Add -Wextra and fix warnings.
When adding -Wextra we also add -Wno-unused-parameter since that
warning means well enough, but is really annoying in practice.
So the warnings we fix here are basically all comparisons between
signed and unsigned values.
Carl Worth [Sun, 25 Oct 2009 22:19:36 +0000 (15:19 -0700)]
Rework Makefile just a bit to enable adding flags for more compiler warnings
We have to carefully separate the C and C++ flags here since a
bunch of the warnings options for gcc are valid for compiling C,
but not for C++.
Carl Worth [Sun, 25 Oct 2009 22:01:20 +0000 (15:01 -0700)]
_notmuch_database_link_message: Fix error-status propagation.
The _notmuch_database_link_message_to_parents function was void
in an earlier draft. Now, ensure that we don't miss any error
return value from it.
Carl Worth [Sun, 25 Oct 2009 21:54:13 +0000 (14:54 -0700)]
Change database to store only a single thread ID per message.
Instead of supporting multiple thread IDs, we now merge together
thread IDs if one message is ever found to belong to more than one
thread. This allows for constructing complete threads when, for
example, a child message doesn't include a complete list of References
headers back to the beginning of the thread.
It also simplifies dealing with mapping a message ID to a thread ID
which is now a simple get_thread_id just like get_message_id, (and no
longer an iterator-based thing like get_tags).
Carl Worth [Sun, 25 Oct 2009 18:05:16 +0000 (11:05 -0700)]
link_message: Remove dead code.
We dropped the THREAD_ID value from the database a while back, but here
is code that's carefully computing that value and then never doing
anything with it. Delete, delete, delete.
Carl Worth [Sun, 25 Oct 2009 18:03:55 +0000 (11:03 -0700)]
add_message: Pull the thread-stitching portion out into new _notmuch_database_link_message
The function was getting too long-winded before. And since I'm about
to change how we handle the thread linking, it's convenient to have
it in an isolated function.
Carl Worth [Sun, 25 Oct 2009 17:22:41 +0000 (10:22 -0700)]
Add an INTERNAL_ERROR macro and use it for all internal errors.
We were previously just doing fprintf;exit at each point, but I
wanted to add file and line-number details to all messages, so it
makes sense to use a single macro for that.
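A macro of this general shape could look like the sketch below. It is an assumption about the approach, not the macro notmuch actually uses; checked_divide is a made-up caller, and this form needs C99 variadic macros with at least one argument after the format.

```c
#include <stdio.h>
#include <stdlib.h>

/* Sketch of an internal-error macro: print the message along with the
 * file and line of the call site, then exit. The format must be a
 * string literal so that it can be concatenated with the suffix. */
#define INTERNAL_ERROR(format, ...)                                   \
    do {                                                              \
        fprintf (stderr, "Internal error: " format " (%s:%d)\n",      \
                 __VA_ARGS__, __FILE__, __LINE__);                    \
        exit (1);                                                     \
    } while (0)

/* Hypothetical caller: a zero denominator is treated as a bug in the
 * calling code, not as a recoverable condition. */
static int
checked_divide (int numerator, int denominator)
{
    if (denominator == 0)
        INTERNAL_ERROR ("division of %d by zero", numerator);

    return numerator / denominator;
}
```

Since __FILE__ and __LINE__ expand at the call site, each message points at the place where the impossible condition was detected rather than at a shared helper.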
Carl Worth [Sun, 25 Oct 2009 16:47:21 +0000 (09:47 -0700)]
add_message: Propagate error status from notmuch_message_create_for_message_id
What a great feeling to remove an XXX comment.
Carl Worth [Sun, 25 Oct 2009 16:20:13 +0000 (09:20 -0700)]
notmuch dump: Eliminate extra space in error message.
Little details can make big impressions.
Carl Worth [Sun, 25 Oct 2009 16:14:16 +0000 (09:14 -0700)]
Move read-only-archive hint from "notmuch setup" to "notmuch new"
The "notmuch setup" output was getting overwhelmingly verbose.
Also, some people might not have a lot of mail, so might never need
this optimization. It's much better to move the hint to the time
when the user could actually benefit from it, (it's easy to detect
that "notmuch new" took more than 1 second, and we know if there
are any read-only directories there or not).
Carl Worth [Sun, 25 Oct 2009 15:57:09 +0000 (08:57 -0700)]
Add comment documenting our current database schema.
I've got schemes to change this schema somewhat dramatically, so I
want a place to be able to record and review those changes.
Carl Worth [Sun, 25 Oct 2009 07:25:59 +0000 (00:25 -0700)]
Drop the storage of thread ID(s) in a value.
Now that we are iterating over the thread terms instead, we can
drop this redundant storage (which should shrink our database a
tiny bit).
Carl Worth [Sun, 25 Oct 2009 07:04:33 +0000 (00:04 -0700)]
Convert notmuch_thread_ids_t to notmuch_terms_t
Aside from increased code sharing, the benefit here is that now
thread_ids iterates over the terms of a message rather than the
thread_id value. So we'll now be able to drop that value.
Carl Worth [Sun, 25 Oct 2009 06:58:06 +0000 (23:58 -0700)]
Implement notmuch_tags_t on top of new notmuch_terms_t
The generic notmuch_terms_t iterator should provide support for
notmuch_thread_ids_t when we switch as well, (And it would be
interesting to see if we could reasonably make this support a
PostingIterator too. Time will tell.)
Carl Worth [Sun, 25 Oct 2009 06:05:08 +0000 (23:05 -0700)]
Shuffle the value numbers around in the database.
First, it's nice that for now we don't have any users yet, so we
can make incompatible changes to the database layout like this
without causing trouble. ;-)
There are a few reasons for this change. First, we now use value 0
uniformly as a timestamp for both mail and timestamp documents, (which
lets us cleanup an ugly and fragile bare 0 in the add_value and
get_value calls in the timestamp code).
Second, I want to drop the thread value entirely, so putting it at the
end of the list means we can drop it as a compatible change in the
future. (I almost want to drop the message-ID value too, but it's nice
to be able to sort on it to get diff-able output from "notmuch dump".)
But the thread value we never use as a value, (we would never sort on
it, for example). And it's totally redundant with the thread terms we
store already. So expect it to disappear soon.
Carl Worth [Sun, 25 Oct 2009 05:49:35 +0000 (22:49 -0700)]
Invent our own prefix values.
We're now dropping all pretense of keeping the database directly
compatible with sup's current xapian backend. (But perhaps someone
might write a new notmuch backend for sup in the future.)
In coming up with the prefix values here, I tried to follow the
conventions of http://xapian.org/docs/omega/termprefixes.html as
closely as makes sense, (with some domain translation from "web"
to "email archive").
Carl Worth [Sun, 25 Oct 2009 05:38:43 +0000 (22:38 -0700)]
Split BOOLEAN_PREFIX into INTERNAL and EXTERNAL subsets.
The idea here is that only some of the prefix names (such as "id" and
"tag") actually make sense in external user-supplied query
strings. Other things like "type" are internal implementation details
of how we store things in the database. So internal machinery will add
those terms to the database and we don't need to support them in the
string itself.
With this, we can now simply loop over the external prefix values to
let the query parser know about them. So as we add prefixes in the
future, we'll only need to add them to this list.
Carl Worth [Sun, 25 Oct 2009 05:29:49 +0000 (22:29 -0700)]
Change all occurrences of "msgid" to "id".
What's good for the user is good for the internals.
Carl Worth [Sun, 25 Oct 2009 05:28:22 +0000 (22:28 -0700)]
Add bash-completion script for notmuch.
It's not much of a script, (we don't have that many commands after
all), but it's the kind of thing that's nice to have and gives the
tool a slightly more polished feel.
Carl Worth [Sun, 25 Oct 2009 05:23:58 +0000 (22:23 -0700)]
Add the magic to allow searches such as "tag:inbox".
The key for this is to call add_boolean_prefix on the QueryParser
object. That tells the query parser to take something like "tag:inbox"
and transform it into the "Linbox" term and do what it needs to do to
make this term a requirement of the search. We're starting to have a
real system here.
Also, I didn't want to expose the ugly name of "msgid" to the user, so
we add a prefix name of simply "id" instead.
Carl Worth [Sun, 25 Oct 2009 05:21:57 +0000 (22:21 -0700)]
Use _find_prefix instead of hard-coded term in notmuch_query_search
I'm planning to change prefix values soon, which would break code
like this. So eliminate the fragility by going through our existing
_find_prefix function.
Carl Worth [Sun, 25 Oct 2009 05:20:13 +0000 (22:20 -0700)]
Fix bit-twiddling brain damage in notmuch_query_search
Here's the big bug that was preventing any searches from working at
all like desired. I did the work to carefully pick out exactly the
flags that I wanted, and then I threw it away by trying to combine
them with & instead of | (so just passing 0 for flags instead).
Much better now.
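The mistake is easy to reproduce in isolation. The flag names below are invented for the sketch and are not the Xapian QueryParser flags.

```c
/* Hypothetical flag bits, for illustration only. */
enum {
    FLAG_BOOLEAN  = 1 << 0,
    FLAG_PHRASE   = 1 << 1,
    FLAG_WILDCARD = 1 << 2
};

/* Buggy: distinct single-bit flags share no bits, so AND yields 0. */
static unsigned int
combine_flags_wrong (void)
{
    return FLAG_BOOLEAN & FLAG_PHRASE & FLAG_WILDCARD;
}

/* Fixed: OR sets every requested bit. */
static unsigned int
combine_flags_right (void)
{
    return FLAG_BOOLEAN | FLAG_PHRASE | FLAG_WILDCARD;
}
```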
Carl Worth [Sun, 25 Oct 2009 05:18:20 +0000 (22:18 -0700)]
Add debugging code for examining query strings.
It's nice that Xapian provides a little function to print a textual
representation of the entire query tree. So now, if you compile
like so:
make CFLAGS=-DDEBUG_QUERY
then you get a nice output of the query string received by the query
module, and the final query actually being sent to Xapian.
Carl Worth [Sun, 25 Oct 2009 05:16:10 +0000 (22:16 -0700)]
Add a preliminary "notmuch search" command.
This isn't behaving at all like it's documented yet, (for example,
it's returning message IDs not thread IDs[*]). In fact, the output
code is just a copy of the body of "notmuch dump", so all you
get for now is message ID and tags.
But this should at least be enough to start exercising the query
functionality, (which is currently very buggy).
[*] I'll want to convert the database to store thread documents
before fixing that.
Carl Worth [Sun, 25 Oct 2009 05:14:31 +0000 (22:14 -0700)]
notmuch_database_create: Document idea to (optionally) return a status
The current problem is that when this function fails the caller
doesn't get any information about what the particular failure
was, (something in the filesystem? or in Xapian?). We should fix
that.
Carl Worth [Sun, 25 Oct 2009 05:11:38 +0000 (22:11 -0700)]
notmuch setup/new: Propagate failure from notmuch_database_set_timestamp
With some recent testing, the timestamp was failing, (overflowing
the term limit), and reporting an error, but the top-level notmuch
command was still returning a success return value.
I think it's high time to add a test suite, (and the code base is
small enough that if we add it now it shouldn't be *too* hard to
shoot for a very high coverage percentage).
Carl Worth [Sun, 25 Oct 2009 05:10:03 +0000 (22:10 -0700)]
Fix timestamp generation to avoid overflowing the term limit
The previous code was only correct as long as the timestamp prefix
was only a single character. But with the recent change to a
multi-character prefix, this broke. So fix it now.
Carl Worth [Sun, 25 Oct 2009 05:04:59 +0000 (22:04 -0700)]
Trim down prefix list to things we are actually using.
I've decided not to try for sup compatibility at the level of the
xapian database. There's just too much about sup's usage of the
database that I don't like, (beyond the embedded ruby data structures
there is redundant storage of message IDs, thread IDs, and dates (in
both terms and values)).
I'm going to fix that up in the database of notmuch, with some other
changes as well. (I plan to drop "reference" terms once linkage to a
thread ID through the reference is established. I also plan to add
actual documents to represent threads.)
So with all that incompatibility, I might as well make my own prefix
values. And while doing that, I should try to be as compatible as
possible with the conventions described here:
http://xapian.org/docs/omega/termprefixes.html
Carl Worth [Sun, 25 Oct 2009 04:52:48 +0000 (21:52 -0700)]
Move the prefix-string arrays back into database.cc from message.cc
Yes, I'm being wishy-washy here, moving code back and forth. But
this is where these really do belong.
Carl Worth [Sat, 24 Oct 2009 15:06:23 +0000 (08:06 -0700)]
Revert "Remove some unneeded initializers."
This reverts commit
fb1bae07002d45138832eacb280419dbd7a19774.
These initializers were totally necessary. I clearly wasn't
thinking straight when I removed them.
Carl Worth [Sat, 24 Oct 2009 00:25:23 +0000 (17:25 -0700)]
Cut the enthusiasm a bit.
It gets annoying pretty quick.
Carl Worth [Sat, 24 Oct 2009 00:20:43 +0000 (17:20 -0700)]
Make "notmuch new" ignore directories that are read-only.
With this, "notmuch new" is now plenty fast even with large archives
spanning many sub-directories. Document this both in "notmuch help"
and also in the output of notmuch setup.
Carl Worth [Fri, 23 Oct 2009 23:19:35 +0000 (16:19 -0700)]
add_files: Pull one stat out of the recursive function.
There's no need to stat each directory both before and after each
recursive call.
Carl Worth [Fri, 23 Oct 2009 23:00:24 +0000 (16:00 -0700)]
More fixing of plurals.
It definitely doesn't help that we have the same messages in both
"setup" and "new". Should combine those really.
Carl Worth [Fri, 23 Oct 2009 22:57:39 +0000 (15:57 -0700)]
More care in final status reporting.
Printing "Added 1 new messages" just looks like lack of attention
to detail, (but yes plurals can be annoying this way).
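One simple way to handle the singular case is sketched below. The function name format_added is hypothetical; only the message wording is taken from the commit.

```c
#include <stdio.h>
#include <string.h>

/* Format a count with a singular or plural noun into buf. */
static void
format_added (char *buf, size_t size, int count)
{
    snprintf (buf, size, "Added %d new %s", count,
              count == 1 ? "message" : "messages");
}
```

This covers English only; a translated program would normally need a proper plural-forms mechanism instead of a two-way choice.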
Carl Worth [Fri, 23 Oct 2009 22:50:48 +0000 (15:50 -0700)]
Print a better message than "0s" for zero seconds.
It's nice to have a tool that at least constructs actual sentences.
Carl Worth [Fri, 23 Oct 2009 22:48:05 +0000 (15:48 -0700)]
Add new "notmuch new" command.
Finally, I can get new messages into my notmuch database without
having to run a complete "notmuch setup" again. This takes
advantage of the recent timestamp capabilities in the database
to avoid looking into directories that haven't changed since the
last time "notmuch new" was run.
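The core of the idea can be sketched as an mtime comparison. This assumes the stored timestamp is compared directly against the directory's st_mtime, and directory_needs_scan is a hypothetical helper; the real code reads the stored value through the database timestamp functions.

```c
#include <sys/stat.h>
#include <time.h>

/* Return 1 if the directory's mtime is newer than the stored timestamp,
 * 0 if it is not, and -1 if the directory cannot be examined. */
static int
directory_needs_scan (const char *path, time_t stored_timestamp)
{
    struct stat st;

    if (stat (path, &st) != 0)
        return -1;

    return st.st_mtime > stored_timestamp ? 1 : 0;
}
```

A directory's mtime changes when entries are added to or removed from it, which is why a single stat can stand in for examining every file inside.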
Carl Worth [Fri, 23 Oct 2009 22:39:11 +0000 (15:39 -0700)]
add_files: Change to return a status value instead of void
Also change to use goto rather than early returns. And once again,
there were lots of bugs in the error cases previously.
Carl Worth [Fri, 23 Oct 2009 22:22:14 +0000 (15:22 -0700)]
notmuch setup: Clean up the progress printing a bit.
Get rid of a useless leading 0 on the seconds value, and make a
distinction between "files" and "messages", (we process many
files, but not all of them are recognized as messages). Finally,
add a summary line at the end saying how many unique messages
were added to the database. Since this comes right after the
total number of files, it gives the user at least a hint as
to how many messages were encountered with duplicate message IDs.
Carl Worth [Fri, 23 Oct 2009 22:17:16 +0000 (15:17 -0700)]
Re-order documentation a bit.
The notmuch_database_get_default_path function is unique in not
accepting a notmuch_database_t* (nor creating one). So list it
outside the other notmuch_database functions.
Carl Worth [Fri, 23 Oct 2009 22:12:03 +0000 (15:12 -0700)]
notmuch_message_get_filename: Improve documentation.
Fix a typo, and add clarifications about the lifetime and readonly
nature of the return value.
Carl Worth [Fri, 23 Oct 2009 21:55:50 +0000 (14:55 -0700)]
Remove some unneeded initializers.
Some people might argue for more initializers to be "safer",
but I actually prefer to leave things this way. It saves
typing, but the real benefit is that the things that do
require initialization stand out so we know to watch them
carefully. And with valgrind, we actually get to catch
errors earlier if we *don't* initialize them. So that can
be "safer" ironically enough.
Carl Worth [Fri, 23 Oct 2009 21:55:02 +0000 (14:55 -0700)]
notmuch setup: Fix a couple of error paths.
We had early returns instead of goto statements, and sure enough,
they were leaking. Much cleaner this way.
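The pattern in question, shown on an unrelated toy function. count_newlines is hypothetical; the point is the single exit path, where every resource is released no matter which step failed.

```c
#include <stdio.h>
#include <stdlib.h>

enum { BUFFER_SIZE = 4096 };

/* Count the newline characters in a file, releasing every resource on
 * all paths. Returns the count, or -1 on error. */
static long
count_newlines (const char *path)
{
    char *buffer = NULL;
    FILE *file = NULL;
    long count = -1;
    size_t n, i;

    buffer = malloc (BUFFER_SIZE);
    if (buffer == NULL)
        goto DONE;

    file = fopen (path, "r");
    if (file == NULL)
        goto DONE;      /* An early return here would leak buffer. */

    count = 0;
    while ((n = fread (buffer, 1, BUFFER_SIZE, file)) > 0) {
        for (i = 0; i < n; i++) {
            if (buffer[i] == '\n')
                count++;
        }
    }

  DONE:
    if (file)
        fclose (file);
    free (buffer);      /* free (NULL) is a harmless no-op. */

    return count;
}
```

Note that this style depends on the pointers being initialized to NULL, so that the cleanup code can tell which resources were actually acquired.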
Carl Worth [Fri, 23 Oct 2009 21:45:33 +0000 (14:45 -0700)]
_find_prefix: Exit when given an invalid prefix name.
This will be a nice safety check for internal sanity.
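A lookup with this kind of safety check might be sketched as follows. The table contents are assumptions: only the "tag" to "L" mapping is taken from a commit message in this log, and the other entries are placeholders.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

typedef struct {
    const char *name;
    const char *prefix;
} prefix_t;

/* Illustrative table; apart from "tag", the values are placeholders. */
static const prefix_t PREFIXES[] = {
    { "tag", "L" },
    { "id", "Q" },
    { "thread", "H" }
};

/* Return the term prefix for a prefix name. An unknown name indicates
 * a bug in the caller, so report it and exit rather than return NULL. */
static const char *
find_prefix (const char *name)
{
    size_t i;

    for (i = 0; i < sizeof (PREFIXES) / sizeof (PREFIXES[0]); i++) {
        if (strcmp (name, PREFIXES[i].name) == 0)
            return PREFIXES[i].prefix;
    }

    fprintf (stderr, "Internal error: no prefix exists for '%s'\n", name);
    exit (1);
}
```

Exiting on an unknown name means callers never need to check the result, at the cost of making a misspelled prefix name fatal at run time.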
Carl Worth [Fri, 23 Oct 2009 21:40:33 +0000 (14:40 -0700)]
Add NOTMUCH_STATUS_DUPLICATE_MESSAGE_ID
And document that notmuch_database_add_message can return this
value. This pushes the hard decision of what to do with duplicate
messages out to the user, but that's OK. (We weren't really doing
anything with these ourselves, and this way the user is at least
informed of the issue, rather than it just getting papered over
internally.)
Carl Worth [Fri, 23 Oct 2009 21:37:09 +0000 (14:37 -0700)]
Clean up comments to not include spaces before tabs.
These were just unclean, (an invisible sort of uncleanliness, but still
they are liable to make for ugly diffs). Oh, wait, like this one!
But at least it's not sprinkled among code changes.
Carl Worth [Fri, 23 Oct 2009 21:34:21 +0000 (14:34 -0700)]
Clarify documentation and error string for NOTMUCH_STATUS_TAG_TOO_LONG
It's helpful to point out NOTMUCH_STATUS_TAG_MAX for users.
Carl Worth [Fri, 23 Oct 2009 21:31:01 +0000 (14:31 -0700)]
Add notmuch_database_set_timestamp and notmuch_database_get_timestamp
These will be very helpful to implement an efficient "notmuch new"
command which imports new mail messages that have appeared.
Carl Worth [Fri, 23 Oct 2009 21:24:07 +0000 (14:24 -0700)]
database: Add private find_unique_doc_id and find_unique_document functions
These are a generalization of the unique-ness testing of
notmuch_database_find_message. More preparation for
directory timestamps.
Carl Worth [Fri, 23 Oct 2009 21:12:06 +0000 (14:12 -0700)]
database: Similarly rename find_message_by_docid to find_document_for_doc_id
Again preferring notmuch_database_t* over Xapian::Database*.
Also, we're standardizing on "doc_id" rather than "docid" locally, (as
an analogue to "message_id"), in spite of the "Xapian::docid" name,
(which, fortunately, we can ignore and just use "unsigned int" instead).
Carl Worth [Fri, 23 Oct 2009 21:06:24 +0000 (14:06 -0700)]
database: Rename internal find_messages_by_term to find_doc_ids
This name is a more accurate description of what it does, and
the more general naming will make sense as we start storing
non-message documents in the database (such as directory
timestamps).
Also, don't pass around a Xapian::Database where it's more our
style to pass a notmuch_database_t*.
Carl Worth [Fri, 23 Oct 2009 20:54:53 +0000 (13:54 -0700)]
sha1: Add new notmuch_sha1_of_string function
We'll be using this for storing really long terms in the database
and when we just need to look them up, (and never read back the
original data directly from the database). For example, storing
arbitrarily long directory paths in the database along with
mtime timestamps.
Note that if we did want to store arbitrarily long terms and also
be able to read them back, the Xapian folks recommend splitting
the term off with multiple prefixes. See the note near the end
of this page:
http://trac.xapian.org/wiki/FAQ/UniqueIds
Carl Worth [Fri, 23 Oct 2009 13:08:22 +0000 (06:08 -0700)]
notmuch restore: Print names of tags that cannot be applied
This helps the user gauge the severity of the error.
For example, when restoring my sup tags I see a bunch of tags missing
for message IDs of the form "sup-faked-...". That's not surprising
since I know that sup generates these with the md5sum of the message
header while notmuch uses the sha-1 of the entire message. But how
much will this hurt?
Well, now that I can see that most of the missing tags are just
"attachment", then I'm not concerned, (I'll be automatically creating
that tag in the future based on the message contents). But if a
missing tag is "inbox" then that's more concerning because that's data
that I can't easily regenerate outside of sup.
Carl Worth [Fri, 23 Oct 2009 13:06:20 +0000 (06:06 -0700)]
notmuch_tags_has_more: Fix to use string.empty rather than string.size
I'm really interested in the length of the data here, not the size
of the storage.
Carl Worth [Fri, 23 Oct 2009 13:04:57 +0000 (06:04 -0700)]
Fix notmuch_message_get_message_id to never return NULL.
With the recent improvements to the handling of message IDs we
"know" that a NULL message ID is impossible, (so we simply
abort if the impossible happens).
Carl Worth [Fri, 23 Oct 2009 13:00:10 +0000 (06:00 -0700)]
add_message: Fix to not add multiple documents with the same message ID
Here's the second big fix to message-ID handling, (the first was to
generate message IDs when an email contained none). Now, with no
document missing a message ID, and no two documents having the same
message ID, we have a nice consistent database where the message ID
can be used as a unique key.
Carl Worth [Fri, 23 Oct 2009 12:53:52 +0000 (05:53 -0700)]
Add _notmuch_message_create_for_message_id
This is the last piece needed for add_message to be able to properly
support a message with a duplicate message ID. This function creates
a new notmuch_message_t object but one that may reference an existing
document in the database.
Carl Worth [Fri, 23 Oct 2009 12:45:29 +0000 (05:45 -0700)]
Fix _notmuch_message_create to catch Xapian DocNotFoundError.
This function is only supposed to be called with a doc_id that
was queried from the database already. So there's an internal
error if no document with that doc_id can be found in the database.
In that case, return NULL.