Return-Path:
X-Original-To: notmuch@notmuchmail.org
Delivered-To: notmuch@notmuchmail.org
Received: from localhost (localhost [127.0.0.1])
    by olra.theworths.org (Postfix) with ESMTP id 438A2431FBF;
    Mon, 23 Nov 2009 18:57:49 -0800 (PST)
X-Virus-Scanned: Debian amavisd-new at olra.theworths.org
Received: from olra.theworths.org ([127.0.0.1])
    by localhost (olra.theworths.org [127.0.0.1]) (amavisd-new, port 10024)
    with ESMTP id iA4VAJlEEGUF; Mon, 23 Nov 2009 18:57:48 -0800 (PST)
Received: from cworth.org (localhost [127.0.0.1])
    by olra.theworths.org (Postfix) with ESMTP id 4B572431FAE;
    Mon, 23 Nov 2009 18:57:46 -0800 (PST)
From: Carl Worth
To: djcb@djcbsoftware.nl
In-Reply-To: <87pr79yaz1.wl%djcb@djcbsoftware.nl>
References: <87aayggsjp.wl%djcb@djcbsoftware.nl>
    <87iqd43wot.fsf@yoom.home.cworth.org>
    <87skc6n3yp.wl%djcb@djcbsoftware.nl>
    <877htifa0e.fsf@yoom.home.cworth.org>
    <87pr79yaz1.wl%djcb@djcbsoftware.nl>
Date: Mon, 23 Nov 2009 18:57:32 -0800
Message-ID: <87k4xg634z.fsf@yoom.home.cworth.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Cc: "notmuch@notmuchmail.org"
Subject: Re: [notmuch] interesting project!
X-BeenThere: notmuch@notmuchmail.org
X-Mailman-Version: 2.1.12
Precedence: list
List-Id: "Use and development of the notmuch mail system."
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Tue, 24 Nov 2009 02:57:49 -0000

On Mon, 23 Nov 2009 09:08:34 +0200, Dirk-Jan C. Binnema wrote:
> Well, the counter point to the OOM-problems is that in many programs,
> the 'malloc returns NULL'-case is often not very well tested (because it's
> rather hard to test), and that at least on Linux, it's unlikely that malloc
> ever does return NULL. Lennart Poettering wrote this up in some more
> detail[1]. Of course, the requirements for notmuch may be a bit different and
> I definitely don't want to suggest any radical change here after only finding
> out about notmuch a few days ago :)

No problem.
I'm glad to discuss things. That's how I learn and find out whether
my decisions are sound or not. :-)

I agree that trying to support OOM doesn't make sense without
testing. But that's why I want to test notmuch with memory-fault
injection. We've been doing this with the cairo library with good
success for a while.

As for "unlikely that malloc ever returns NULL", that's simply a
system-configuration change away (just turn off overcommit). And I can
imagine notmuch being used in lots of places (netbooks, web servers,
etc.), so I do want to make it as robust as possible.

> (BTW, there is a hashtable implementation in libc, (hcreate(3) etc.). Is that
> one not sufficiently 'talloc-friendly'? It's not very user-friendly, but
> that's another matter)

Thanks for mentioning the hash table. The hash table is one of the few
things that I *am* using from glib right now in notmuch. It's got a
couple of bizarre things about it:

  1. The simpler-appearing g_hash_table_new function is useless for
     common cases like hashing strings. It will just leak memory. So
     g_hash_table_new_full is the only one worth using.

  2. There are two lookup functions, g_hash_table_lookup and
     g_hash_table_lookup_extended.

And a program like notmuch really does use the hash table in two
ways. In the simpler case, we're using the hash to simply implement a
set (such as avoiding duplicates in a set of tags). In the more
complex case, we're associating actual objects with the keys (such as
when linking messages together into a tree for the thread).

So, it might make sense if a hash-table interface supported these two
modes well. What's bizarre about GHashTable though, is that in the
"just a set" case, we only use NULL as the value when inserting. And
distinguishing "previously inserted with NULL" from "never inserted" is
the one thing that g_hash_table_lookup can't do. So I've found that the
only lookup function I can ever use is g_hash_table_lookup_extended
(passing a pair of NULLs for the return arguments I don't need).

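To make that ambiguity concrete, here is a minimal sketch in plain C. It
is not GLib and not Eric's code: table_t, table_insert, table_lookup and
table_lookup_extended are hypothetical names for a tiny fixed-size,
linear-probing table that only imitates the shape of the two GLib lookup
calls. Keys are borrowed rather than copied, and there is no deletion or
resizing.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define TABLE_SLOTS 16  /* hypothetical fixed capacity; a real table would grow */

typedef struct {
    const char *key;    /* NULL marks an empty slot */
    void *value;        /* legitimately NULL in the "just a set" case */
} slot_t;

typedef struct {
    slot_t slots[TABLE_SLOTS];
} table_t;

static unsigned hash_string(const char *s)
{
    unsigned h = 5381;
    while (*s)
        h = h * 33 + (unsigned char)*s++;
    return h;
}

/* Linear probing: return the slot that holds key, or the first empty
 * slot on its probe path, or NULL if the table is full without key. */
static slot_t *find_slot(table_t *t, const char *key)
{
    assert(key != NULL);
    unsigned start = hash_string(key) % TABLE_SLOTS;
    for (unsigned i = 0; i < TABLE_SLOTS; i++) {
        slot_t *s = &t->slots[(start + i) % TABLE_SLOTS];
        if (s->key == NULL || strcmp(s->key, key) == 0)
            return s;
    }
    return NULL;
}

/* Returns false instead of aborting when the table is full. */
static bool table_insert(table_t *t, const char *key, void *value)
{
    slot_t *s = find_slot(t, key);
    if (s == NULL)
        return false;
    s->key = key;       /* borrowed pointer; caller keeps it alive */
    s->value = value;
    return true;
}

/* Plain lookup shape: only the value comes back, so a key stored with
 * a NULL value looks exactly like a key that was never inserted. */
static void *table_lookup(table_t *t, const char *key)
{
    slot_t *s = find_slot(t, key);
    return (s != NULL && s->key != NULL) ? s->value : NULL;
}

/* Extended lookup shape: presence is the return value, and the stored
 * value is optional output (pass NULL when only membership matters). */
static bool table_lookup_extended(table_t *t, const char *key, void **value_out)
{
    slot_t *s = find_slot(t, key);
    if (s == NULL || s->key == NULL)
        return false;
    if (value_out != NULL)
        *value_out = s->value;
    return true;
}
```

With a zero-initialized table_t used as a tag set, inserting "inbox" with
a NULL value makes table_lookup return NULL both for "inbox" and for a
tag that was never added, while table_lookup_extended reports true for
the first and false for the second.
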
Fortunately, Eric Anholt spent *his* flight home coding up a nice
implementation of an open-addressed hash designed specifically to be a
tiny little implementation suitable for copying directly into a
project. He's testing it with Mesa now, and I might pull it into
notmuch later.

> I could imagine the string functions could replace the ones in talloc. There
> are many more string functions, e.g., for handling file names / paths, which
> are quite useful. Then there are wrappers for gcc'isms (G_UNLIKELY etc.) that
> would make the ones in notmuch unneeded, and a lot of compatibility things
> like G_DIR_SEPARATOR. And the datastructures (GSlice/GList/GHashtable) are
> nice. The UTF8 functionality might come in handy.

Yes. The portability stuff I think is actually interesting. I've
thought it really might make sense to have something that gave you
*just* that (without a main loop, an object system, several memory
allocators or pieces for making your own memory allocators, etc.). I
haven't had a chance to look into gnulib yet, but I'd like to.

As for a list, I almost always find it cleaner to be able to just have
my own list data structures (to avoid casts, etc.). And for a hash
table, I'm interested in what Eric's doing.

I'm really not prejudiced against using code that's already been
written (in spite of how it might appear, I don't feel the need to
re-solve every problem that's already been solved). But I have long
thought that we could have better support for a "C programmer's
toolkit" of commonly needed things than we have had so far.

I definitely like the idea of having tiny, focused libraries that do
one thing and do it well (and maybe even some things so tiny that they
are actually designed to be copied into the application---like with
gnulib and with Eric's new hash table).

> Anyway, I was just curious, people have survived without GLib before, and if
> you dislike the OOM-strategy, it's a bit of a no-no of course.

Thanks for understanding.
:-)

And I enjoy the conversation,

-Carl