Here's another change which I'm making for sup compatibility against
my better judgment. It seems that sup never indexes content from
MIME parts with a Content-Disposition of attachment. But these
attachments are often very indexable (for example, the first one
I encountered was a small shell script).
So I'll have to think a bit about whether or not I want to revert
this commit. To do this properly we would really want to distinguish
between attachments that are indexable, (such as text), and those
that aren't (such as binaries). I know the MIME type alone isn't
always sufficient here, as even this little plaintext shell script
was attached as octet-stream.
And if we wanted to get really fancy we could run things like antiword
to generate text from non-text attachments and index their output.
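
One possible way to tell indexable attachments from binaries without
trusting the declared MIME type is to sniff the decoded bytes. The
following is only a rough sketch under stated assumptions:
looks_like_text is a hypothetical helper (it exists in neither notmuch
nor GMime), and the 5% threshold is an arbitrary guess, not a tuned value.

```c
#include <stddef.h>

/* Hypothetical helper, not part of notmuch or GMime: guess whether a
 * decoded attachment body is text by inspecting its bytes instead of
 * relying on the declared MIME type. Returns 1 for "probably text",
 * 0 otherwise. */
static int
looks_like_text (const unsigned char *buf, size_t len)
{
    size_t i, suspicious = 0;

    /* Nothing to index in an empty part. */
    if (len == 0)
        return 0;

    for (i = 0; i < len; i++) {
        unsigned char c = buf[i];

        /* A NUL byte almost never appears in plain text. */
        if (c == '\0')
            return 0;

        /* Count control characters other than common whitespace. */
        if (c < 0x20 && c != '\n' && c != '\r' && c != '\t' && c != '\f')
            suspicious++;
    }

    /* Assumed threshold: accept when under 5% of bytes are suspicious. */
    return suspicious * 20 < len;
}
```

With a check like this, a shell script attached as octet-stream would
still be classified as text, while an ELF binary would be rejected at
its first NUL byte. Bytes of 0x80 and above are deliberately not counted
as suspicious so that UTF-8 text passes.
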
strcmp (disposition->disposition, GMIME_DISPOSITION_ATTACHMENT) == 0)
{
add_term (term_gen.get_document (), "label", "attachment");
+ return;
}
byte_array = g_byte_array_new ();