From: James Vasile
To: notmuch@notmuchmail.org
Date: Thu, 25 Feb 2010 16:25:16 -0500
Message-ID: <87hbp5j9dv.fsf@hackervisions.org>
Subject: [notmuch] Initial tagging

I'm slowly groping my way to using the notmuch emacs client as my
routine MUA. As I coerce it into tagging and displaying the way I
want, the next big question was how to automatically tag things and
get them into notmuch. I'm curious as to what people are doing in
this regard.

My solution involves cron running a sync_email script. Sync_email does
the correct dance to make sure it only ever runs one instance at a time.
It also logs to syslog. The script runs offlineimap, then a mail_filter
script that sorts mail in maildirs (for wanderlust, the MUA I'm hoping
to leave behind), and finally a shell script that does notmuch new and
the initial tagging.

The tagging script uses the inbox tag to identify new mail, tags it
according to criteria, then removes the inbox tag from anything it
found a match for. Uncategorized mail keeps the inbox tag so I can
inspect it later and make rules for it (or tag it manually). Also,
prepending "tag:inbox and" to the search criteria restricts the tagging
to a small subset of the db, which makes the tagging script run fairly
quickly.

My unexpurgated tagging script has almost 100 rules for tagging, and I
expect it to grow over time.

################## notmuch-tag.sh ################
#!/bin/bash

bin=/usr/local/bin/notmuch

# Wrapper around the real notmuch binary that retries while the
# Xapian database is locked by another process.
function notmuch {
    echo "$1"
    while true; do
        # $1 is deliberately unquoted so it splits into separate arguments
        result=$($bin $1 2>&1)
        regex="already locked"
        if [[ $result =~ $regex ]]; then
            echo "Xapian DB busy. Retrying in 2 seconds"
        else
            if [ -n "$result" ]; then
                echo "$result"
            fi
            return
        fi
        sleep 2
    done
}

# Apply tags ($1) to new mail (tag:inbox) matching the search terms ($2)
function tag_new { notmuch "tag $1 tag:inbox and ($2)"; }

# Quote "$1" so the whole search expression reaches tag_new as one argument
function blacklist { tag_new "-inbox -unread +delete" "$1"; }

notmuch new

blacklist "from:xxx@example.com or from:yyy@example.com"

# voicemail
tag_new "-inbox +voicemail" "from:ast@example.com"

# friends
tag_new "+friend +mathieu" "mathieu or ejm2106 or emily@example.com"
tag_new "+friend +balktick" "balktick"

# open community services
tag_new "+ocs" "open community services or opencommunityservices"

# okos
tag_new "+okos" "jim and glaser and not LinkedIn"
tag_new "+okos" "joshlevy.ny@example.com"
tag_new "+okos" "enright@example.com"

# book liberator
tag_new "+bklib" "wnf@example.com or bkrpr"

# joomla
tag_new "+osm" "from:waring or to:waring"
tag_new "+osm" "from:dave.huelsmann@example.com or to:dave.huelsmann@example.com"
tag_new "+osm" "james.vasile@example.com"
tag_new "+osm" "joomla"

# lists
tag_new "+list +notmuch" "to:notmuchmail.org or notmuch"
tag_new "+list +stumpwm" "to:stumpwm-devel@nongnu.org or stumpwm"
tag_new "+list +bklib" "to:bklib@googlegroups.com"

## Catchalls for sflc, hv, etc.
tag_new "+sflc" "not tag:list and not tag:friend and softwarefreedom.org and not tag:osm"
tag_new "+sflc" "to:firm@example.com"
tag_new "+hv" "hackervisions.org and not tag:list and not tag:friend"
tag_new "+gmail" "(to:jvasile@example.com or from:jvasile@example.com) and not tag:list and not tag:friend"

## Mark my own mail as read
tag_new "-unread" "from:james@example.com"
tag_new "-unread" "from:vasile@example.com"
tag_new "-unread" "from:james.vasile@example.com"

## Remove inbox tag
tag_new "-inbox" "tag:sflc or tag:hv or tag:list or tag:osm or tag:okos or tag:friend or tag:bklib"

############# sync_email #########################
#!/bin/sh

## Sync email unless we're already in the process of syncing.

SCRIPTNAME=`basename $0`
PIDDIR=/home/vasile/var/run/${SCRIPTNAME}
PIDFILE=${PIDDIR}/${SCRIPTNAME}.pid

## Do the double-lock with a dir and a pid file
if ! mkdir ${PIDDIR} 2>/dev/null; then
    sleep 3 # give the other process time to write its pid
    if [ -f ${PIDFILE} ]; then
        # verify that the process is actually still running under this pid
        OLDPID=`cat ${PIDFILE}`
        RESULT=`ps -ef | grep ${OLDPID} | grep ${SCRIPTNAME}`
        if [ -n "${RESULT}" ]; then
            logger -s ${SCRIPTNAME} already running! Exiting
            exit 255
        fi
    fi
fi

## Update pid file
# $$ is this shell's own pid; picking the first match out of ps output
# could return the pid of a different instance.
PID=$$
echo ${PID} > ${PIDFILE}

logger -s filter done, starting offlineimap
offlineimap -l /home/vasile/.offlineimap/log

logger -s offlineimap done, starting mail filter
mairix --unlock
/home/vasile/bin/mail_filter.py

logger -s mail filter done, starting notmuch tagger
/home/vasile/bin/notmuch-tag.sh > /home/vasile/var/log/notmuch

logger -s notmuch tagger done sync_email finished

## clean up pid file and dir
if [ -f ${PIDFILE} ]; then
    rm -rf ${PIDDIR}
fi
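
For anyone adapting the locking part of sync_email, the dir-plus-pid-file
idea can also be packaged as two small functions. What follows is only a
minimal sketch under some assumptions, not the script above: the names
acquire_lock, release_lock and LOCKDIR are hypothetical, the lock lives
under /tmp rather than /home/vasile/var/run, and a stale lock is detected
with kill -0 instead of grepping ps output. A lock directory with no pid
file in it is treated as held, on the guess that the other instance is
just starting up.

```shell
#!/bin/sh
# Hypothetical lock location (an assumption, not the path used above).
LOCKDIR="${LOCKDIR:-${TMPDIR:-/tmp}/sync_email.lock}"

# Try to take the lock; returns 0 on success, 1 if it is held.
# mkdir either creates the directory or fails, so only one caller wins.
acquire_lock() {
    if mkdir "$LOCKDIR" 2>/dev/null; then
        echo $$ > "$LOCKDIR/pid"
        return 0
    fi
    oldpid=$(cat "$LOCKDIR/pid" 2>/dev/null)
    # No pid recorded yet: assume the other instance is still starting.
    [ -n "$oldpid" ] || return 1
    # kill -0 sends no signal; it only checks that the process exists.
    if kill -0 "$oldpid" 2>/dev/null; then
        return 1
    fi
    # The recorded process is gone, so reclaim the stale lock.
    rm -rf "$LOCKDIR"
    mkdir "$LOCKDIR" 2>/dev/null || return 1
    echo $$ > "$LOCKDIR/pid"
}

# Drop the lock directory and the pid file inside it.
release_lock() {
    rm -rf "$LOCKDIR"
}

# Intended use in a cron job (kept as a comment so that sourcing this
# file has no side effects):
#
#   if acquire_lock; then
#       trap release_lock EXIT
#       ... run offlineimap, the mail filter and the tagging script ...
#   else
#       logger -s "sync_email already running! Exiting"
#       exit 255
#   fi
```

The trap in the usage comment is there so the lock is removed even if
one of the steps fails partway through. Reclaiming a stale lock is still
not fully race-free if two new instances notice the dead pid at the same
moment, which is acceptable for a cron job that runs every few minutes
but worth knowing about.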