18. Learning the notmuch email system
18.1. Motivation, prerequisites and plan
Motivation
A personal note:
I have used emacs with gnus since the 1990s. Before that I used emacs with vm, before that emacs with pcmail (which seems to be a completely forgotten system, but I remember it being powerful), and before that I used the command line Berkeley email client.
Prerequisites
Your incoming mail has to be delivered in Maildir format. I use postfix and configure it to deliver into ~/Maildir
Then install some packages. On ubuntu:
sudo apt install elpa-notmuch elpa-elfeed alot python3-notmuch
sudo apt install neomutt notmuch afew muchsync gmailieer
Plan
I will start from simple tutorials and get notmuch for emacs set up and record the steps here.
18.2. Tutorial from official notmuch website
18.2.1. Basic notmuch
https://notmuchmail.org/getting-started/
I stopped fetchmail and did a full backup of ~/Maildir/ and then ran:
notmuch new
This came up with several cases of dangling symlinks, which I removed. Then it complained about all the files in ~/Maildir/.zz_mairix-*. This is a virtual newsgroup which GNUS can create sometimes for a certain type of search. I haven’t used it in years and I removed those directories.
Tip
notmuch dump --output=/tmp/notmuch-dump.txt
grep -v '+inbox -- id:' /tmp/notmuch-dump.txt | grep -v '+inbox +unread -- id:' | less
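The same dump file can also be summarized instead of paged through. What follows is only a sketch: summarize_dump is a made-up helper name, and it assumes the default batch-tag dump format, where each message line looks like "+tag1 +tag2 -- id:MESSAGE-ID" and header lines start with "#".

```shell
#!/bin/sh
# Sketch: count how often each tag combination occurs in a notmuch dump.
# summarize_dump is a hypothetical helper, not part of notmuch.
# Assumes batch-tag format: "+tag1 +tag2 -- id:<message-id>" per message.
summarize_dump() {
    # Drop "#" header lines, cut off the " -- id:..." part of each line,
    # then count identical tag sets, most frequent first.
    grep -v '^#' "$1" | sed 's/ *-- id:.*$//' | sort | uniq -c | sort -rn
}

# Possible usage (not run here):
#   notmuch dump --output=/tmp/notmuch-dump.txt
#   summarize_dump /tmp/notmuch-dump.txt | less
```

Unusual tag combinations then show up at the bottom of the list with small counts.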
18.3. Trying the tutorial that uses gmailieer
http://www.johnborwick.com/2019/02/09/notmuch-gmailieer.html
I used it and it is good, but it does not really do much more than the basic tutorial. In particular
18.4. Other tutorials
https://bostonenginerd.com/posts/notmuch-of-a-mail-setup-part-2-notmuch-and-emacs/
18.5. notmuch-emacs
18.6. some command line things I used
To remove all new tags from the misc_ folders:
notmuch tag -new folder:/.misc_.*/
To re-tag my old Maildir/.people_ed-fenimore/ folder and my book club folder:
notmuch tag -new +people_ed-fenimore folder:/.people_ed-fenimore/
notmuch tag -new +misc_books_book-club folder:/.misc_books_book-club/
To apply that trick to all my people_ folders:
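# Assumes the current directory is the one holding the .people_* folders.
# This only prints a script; nothing is tagged until it is piped to a shell.
for dirname in .people_*
do
    echo "## $dirname"
    # Strip the leading dot to get the tag name.
    FNAME=$(echo "${dirname}" | sed 's/\.//')
    echo "## FNAME $FNAME"
    cmd="notmuch tag +${FNAME} folder:/${dirname}/"
    echo "echo $cmd"
    echo "$cmd"
    echo "sleep 10"
done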
In a single line:
(for dirname in .people_*; do FNAME=`echo ${dirname} | sed 's/\.//'`; cmd="notmuch tag +${FNAME} folder:/${dirname}/"; echo "echo $cmd"; echo $cmd; echo "sleep 10"; done) | /bin/sh
While that is running I need to keep running gmi sync, so that the changes go up in small pieces instead of as a single huge body of stuff to be synced.
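One way to avoid running the sync by hand might be to put a "gmi sync" line into the generated script after every few commands. This is a sketch only: interleave_sync is a made-up helper, the batch size of 5 is arbitrary, and it assumes the result is executed from the directory where gmi sync normally runs.

```shell
#!/bin/sh
# Sketch: read one shell command per line on stdin and print the commands
# back with a "gmi sync" line after every N of them (N must be positive,
# default 5). interleave_sync is a hypothetical helper; it only prints,
# nothing runs until the output is piped to a shell.
interleave_sync() {
    awk -v n="${1:-5}" '
        { print }
        NR % n == 0 { print "gmi sync" }
        # Sync once more if the last batch was not a full one.
        END { if (NR % n != 0) print "gmi sync" }
    '
}

# Possible usage (not run here):
#   (for dirname in .people_*; do ...; done) | interleave_sync 5 | /bin/sh
```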
To list all tags:
notmuch search --output=tags '*'
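To see how many messages carry each tag, the tag list can be fed into notmuch count. A minimal sketch, with tag_counts being a made-up helper name; it assumes tag names contain no double quotes.

```shell
#!/bin/sh
# Sketch: print "<count> <tag>" for every tag in the database.
# tag_counts is a hypothetical helper; it relies only on
# "notmuch search --output=tags" and "notmuch count".
tag_counts() {
    notmuch search --output=tags '*' | while IFS= read -r tag
    do
        # Quote the tag inside the query so characters like "/" or
        # spaces are taken literally (breaks on tags containing '"').
        printf '%s %s\n' "$(notmuch count "tag:\"$tag\"")" "$tag"
    done
}

# Possible usage (not run here):
#   tag_counts | sort -rn | less
```

This runs one notmuch count per tag, so it can be slow when there are hundreds of tags.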
Clever technique for dump/restore:
https://notmuchmail.org/performance/
On that page it is used to prepare a database version migration, but something analogous could be used for backups.
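A backup along those lines might look like the sketch below. The helper names backup_tags and restore_tags and the default directory ~/notmuch-backups are my own inventions; the only notmuch pieces used are dump with --gzip and --output, and restore with --input.

```shell
#!/bin/sh
# Sketch: timestamped tag backups via notmuch dump / restore.
# backup_tags and restore_tags are hypothetical helper names.
backup_tags() {
    # $1: backup directory (assumed default: ~/notmuch-backups)
    dir="${1:-$HOME/notmuch-backups}"
    mkdir -p "$dir" || return 1
    file="$dir/notmuch-tags-$(date +%Y%m%d-%H%M%S).gz"
    # Print the file name only if the dump succeeded.
    notmuch dump --gzip --output="$file" && printf '%s\n' "$file"
}

restore_tags() {
    # $1: a dump file written by backup_tags.
    # notmuch restore detects gzip-compressed input by itself.
    notmuch restore --input="$1"
}

# Possible usage (not run here):
#   f=$(backup_tags)      # before a risky mass re-tagging
#   restore_tags "$f"     # if it went wrong
```

Note that a plain restore sets each message's tags back to what the dump says, so tags added after the backup are lost for the messages it covers.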
Another blog post:
18.7. A bit on filtering
Some discussion and an important tip for afew is at:
18.8. The problem of sending messages with long lines
The whole “flowing” thing did not work: if you forget to hit a hard newline, you get chaos.
Here’s a thought:
18.9. A rabbit hole that came from using afew to tag lists
afew will automatically create a mailing-list-specific tag for messages it identifies as coming from mailing list software. Unfortunately many mailing list managers do something bad and create one-time mailing list ids, so things go badly wrong. I got this:
1 lists/100021545
1 lists/100026759
1 lists/10064619_21185094
1 lists/10064619_21185175
[...]
1 lists/10420780_83247882
1 lists/10420780_83248202
5 lists/1050979
1 lists/1058860_1042149
1 lists/1058860_1061179
1 lists/1058860_1076993
1 lists/1058860_788804
[...]
1 lists/1058860_994013
31 lists/1064447
1 lists/1064447_3017961
1 lists/1064447_3046870
[...]
13 lists/15a7d2c76830bddc0e3a71c19
2 lists/177351
1 lists/177351_21151920
1 lists/177351_21203161
2 lists/195066e322642c622c0ecdde3
1 lists/1b46f808f42895a1ee80b2c21
7 lists/1e05a646494f869efd43620ea
9 lists/1e87dc
6 lists/1eff2b60d311e246430e6616e
1 lists/1f73a9bbafeb51d30af1157c3dc1499f225e216f
1 lists/247771a698970a278d6bd2dd0
21 lists/2628b46ec044dd10df8388b38
1 lists/28c333051d0ba08b9b938ec0a
4 lists/2b47d251a90a0e64db38daf091d0d4efa596014c
[...]
1 lists/48972_76983102
1 lists/48972_76983428
1 lists/48972_76991775
[...]
lists/97f063777adc6ab53ff77270b7d6ad32e1e9821e
lists/99391e2d9ccd853f1235e5ec5
[...]
lists/a191a0fd5209e6ccd7f546396
lists/a47a307e45501b97d495a84573449668820327ca
lists/a56b31dc46a8b614799ba2120
lists/aavso-hen
lists/activities-announce
[...]
lists/d07f6cc121599f73e7b16646b
lists/d12e91d94403394c3b5d532a7
lists/d333d45ea4faf01c346b4fde5
lists/d63a14e3e1bf973d446a59f79
lists/data_scraping_top500
lists/datefinder
lists/ddd549eebdf9ef7c37a86e515d7ac3a3a66208b6
and so much more. That was ghastly and frustrating, so I came up with combinations of scripting queries and regular expressions to remove those tags from all messages.
First of all, the culprit. This line in ~/.config/afew/config:
[ListMailsFilter]
is what causes the problem, so I took it out. The authors of afew might want to drop that one.
But I still had all these hundreds of frustrating tags polluting my stuff.
So here’s a series of steps I followed to remove all those tags.
To list all tags that contain lists/:
notmuch search --output=tags '*' | grep lists/
notmuch search --output=tags '*' | grep lists/ > taglist
Now you have the file taglist, which holds all the tags that start with lists/, but notice that many of those are legitimate. At first glance the bad ones all start with digits (for example lists/48972_76983428). If you can confirm that none of your legitimate lists start with lists/0 or lists/1 or … lists/9, then you can start by removing from all messages every tag that starts with lists/[0-9]:
egrep 'lists/[0-9].*' taglist
In my case this gave 386 matches. I surveyed them visually to make sure that there were no legitimate tags, and then I started scripting the deletion.
Note that I did not want to do all the deletions in a single pass. This is because it makes for a huge sync up to my mail server, so I broke it up.
egrep 'lists/[0-9].*' taglist > taglist-digit
for tag in `cat taglist-digit`
do
cmd="notmuch tag -${tag} -- tag:${tag}"
echo "# cmd: $cmd"
echo $cmd
done
That will print commands that remove tags. Note that they will only cover the tags that start with lists/DIGIT. To spot the others, look for other patterns: many look like lists/cf8406ac27f85deded2833350, which has a run of hex digits (25 of them or more). The regular expression 'lists/[0-9a-f]{25}' matches that, so we do:
egrep 'lists/[0-9a-f]{25}' taglist > taglist-h25
for tag in `cat taglist-h25`
do
cmd="notmuch tag -${tag} -- tag:${tag}"
echo "# cmd: $cmd"
echo $cmd
done
That seems to take care of almost everything.
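The two passes could also be folded into one pattern, which makes it easy to look at what would be kept before deleting anything. This is a sketch, not what I actually ran: the helper names are made up, and unlike the commands above the pattern is anchored at the start of the tag.

```shell
#!/bin/sh
# Sketch: split a tag list into auto-generated-looking tags and the rest.
# SUSPECT_RE combines the two patterns from the text (a leading digit, or
# 25+ hex digits) and, as an added assumption, anchors them with "^".
# suspect_tags, kept_tags and removal_cmds are hypothetical helpers.
SUSPECT_RE='^lists/([0-9]|[0-9a-f]{25})'

suspect_tags() { grep -E "$SUSPECT_RE" "$1"; }
kept_tags()    { grep -E -v "$SUSPECT_RE" "$1"; }

# Print (but do not run) one removal command per suspect tag.
removal_cmds() {
    suspect_tags "$1" | while IFS= read -r tag
    do
        printf 'notmuch tag -%s -- tag:%s\n' "$tag" "$tag"
    done
}

# Possible usage (not run here):
#   notmuch search --output=tags '*' > taglist
#   kept_tags taglist | less      # survey what survives
#   removal_cmds taglist | less   # survey what would be removed
```

Looking over the kept_tags output first is the same visual check as before, just done on the tags that will survive rather than on the ones to be deleted.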