rss2email.git
tag: v3.2
W. Trevor King [Wed, 13 Mar 2013 13:50:37 +0000 (09:50 -0400)]
Bump to version 3.2

Signed-off-by: W. Trevor King <wking@tremily.us>
W. Trevor King [Wed, 13 Mar 2013 13:46:38 +0000 (09:46 -0400)]
CHANGELOG: Update with summaries of recent changes

This documents the following changes:
f01eac2 (email: Make path to sendmail configurable, 2013-02-15)
1c97270 (email: Attempt .as_string() if BytesGenerator.flatten()
  fails, 2013-02-16)
e08e198 (email: Decode headers when checking .as_string() flatten
  fallback, 2013-02-17)
3adef87 (config: Use extended interpolation in Config, 2013-02-21)

Signed-off-by: W. Trevor King <wking@tremily.us>
W. Trevor King [Sat, 2 Mar 2013 19:40:25 +0000 (14:40 -0500)]
README: Link to the new openSUSE package

On Sat, Mar 02, 2013 at 08:54:11AM -0800, Arun Persaud wrote:
> I managed to get rss2email 3.1 packaged for opensuse. It's available at
>
> http://download.opensuse.org/repositories/server:/mail/
>
> for 12.2, 12.3 and Factory in case you want to mention it on the project
> page. It doesn't build on older distros, mostly due to missing
> dependencies, but on 12.1 the build fails with
>
> ...
>
> The package is developed at:
>
> https://build.opensuse.org/package/show?package=rss2email&project=server%3Amail

Signed-off-by: W. Trevor King <wking@tremily.us>
W. Trevor King [Thu, 21 Feb 2013 11:11:19 +0000 (06:11 -0500)]
config: Use extended interpolation in Config

This avoids triggering accidental interpolation errors when your URL
contains percent signs (e.g. %2F).  Curly braces, on the other hand,
will never appear in an encoded URL.  From RFC 1738:

  Unsafe:
  ... Other characters are unsafe because gateways and other transport
  agents are known to sometimes modify such characters. These
  characters are "{", "}", "|", "\", "^", "~", "[", "]", and "`".

  All unsafe characters must always be encoded within a URL.
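
A minimal sketch of the difference, using Python's standard
configparser.  The section name, option names, and URL are hypothetical
and only illustrate the interpolation behaviour; they are not taken
from rss2email's Config class.

```python
import configparser

# Hypothetical section and option names, for illustration only.
SAMPLE = """
[DEFAULT]
base = https://example.com

[feed.example]
url = ${base}/search?q=a%2Fb
"""

# Default (basic) interpolation treats "%" as special.
basic = configparser.ConfigParser()
basic.read_string(SAMPLE)
basic_error = None
try:
    basic.get('feed.example', 'url')
except configparser.InterpolationSyntaxError as error:
    basic_error = error  # "%2F" is not a valid %(name)s reference

# Extended interpolation uses ${name}, so "%" is an ordinary character.
extended = configparser.ConfigParser(
    interpolation=configparser.ExtendedInterpolation())
extended.read_string(SAMPLE)
url = extended.get('feed.example', 'url')
print(url)
```

With extended interpolation the percent-encoded URL is read back
unchanged, while `${base}` is still expanded.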

Signed-off-by: W. Trevor King <wking@tremily.us>
W. Trevor King [Sun, 17 Feb 2013 15:49:41 +0000 (10:49 -0500)]
email: Decode headers when checking .as_string() flatten fallback

A naive dict-comparison fails if any header fields are encoded
following RFC 2047.  For example,

  '=?iso-8859-1?q?this=20is=20some=20text?='

will not compare equal to an email.header.Header instance.  By
decoding everything to Unicode strings before comparing the header
fields, we can see if the underlying data matches.
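
A rough sketch of the comparison, using only email.header from the
standard library.  The helper name `as_unicode` is hypothetical and is
not the function used in rss2email.

```python
from email.header import Header, decode_header, make_header


def as_unicode(value):
    """Return a header value as a plain (decoded) Unicode string."""
    if isinstance(value, Header):
        return str(value)
    # Decode any RFC 2047 encoded words found in a plain string.
    return str(make_header(decode_header(value)))


encoded = '=?iso-8859-1?q?this=20is=20some=20text?='
header = Header('this is some text', 'iso-8859-1')

naive_match = (encoded == header)
decoded_match = (as_unicode(encoded) == as_unicode(header))
print(naive_match, decoded_match)
```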

Signed-off-by: W. Trevor King <wking@tremily.us>
W. Trevor King [Sat, 16 Feb 2013 14:06:38 +0000 (09:06 -0500)]
email: Attempt .as_string() if BytesGenerator.flatten() fails

Before converting to BytesGenerator in 8a907f9 (email: Fix _flatten()
implementation for non-ASCII bodies, 2013-01-23), we used to flatten
emails with message.as_string().  BytesGenerator should be the more
robust approach, but it is, unfortunately, broken with respect to
Unicode payloads [1,2,3,4].  This makes the use-8bit setting pretty
useless.

Until we find a clean fix for BytesGenerator, fall back on the earlier
.as_string() approach where possible.  We check the feasibility of the
fallback by performing a quasi-round-trip and comparing a message
recovered from the byte-encoded form with the original message.  If
the recovered version does not match the original message, we reraise
the BytesGenerator.flatten() error.  This fallback should work for
any charset whose mapping for ASCII characters is a no-op.
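
The fallback logic might look roughly like the following sketch.  The
function names and the injectable `primary` argument are hypothetical
(added so the failure path can be exercised), and the header comparison
here is a plain dict comparison, not necessarily what rss2email does.

```python
import io
from email import message_from_bytes
from email.generator import BytesGenerator
from email.mime.text import MIMEText


def _bytes_generator_flatten(message):
    buffer = io.BytesIO()
    BytesGenerator(buffer).flatten(message)
    return buffer.getvalue()


def flatten(message, primary=_bytes_generator_flatten):
    """Flatten with `primary`, falling back on .as_string() if it fails."""
    try:
        return primary(message)
    except UnicodeEncodeError:
        charset = message.get_content_charset() or 'us-ascii'
        data = message.as_string().encode(charset)
        # Quasi-round-trip: parse the bytes and compare with the original.
        recovered = message_from_bytes(data)
        same_headers = dict(recovered.items()) == dict(message.items())
        same_body = recovered.get_payload() == message.get_payload()
        if same_headers and same_body:
            return data
        raise  # the fallback is not trustworthy; reraise the original error


def broken(message):
    """Simulate a BytesGenerator failure (for demonstration only)."""
    raise UnicodeEncodeError('ascii', 'x', 0, 1, 'simulated failure')


message = MIMEText('hello', 'plain', 'us-ascii')
message['Subject'] = 'test'
fallback_bytes = flatten(message, primary=broken)
```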

One benefit of this altered approach is that we no longer need to
encode the payload when we set it up in get_message().  This "Unicode
inside--encode on output" approach doesn't smell as much as the old
approach ;).

The new fallback will probably die screaming if you try to flatten a
multipart message, but we don't do that in rss2email.  Hopefully, the
upstream issues with the email library will be sorted out in the near
future...

[1]: http://thread.gmane.org/gmane.comp.python.general/725425
[2]: http://bugs.python.org/issue16324
[3]: http://bugs.python.org/issue12553
[4]: http://bugs.python.org/issue12552#msg140294

Signed-off-by: W. Trevor King <wking@tremily.us>
W. Trevor King [Sat, 16 Feb 2013 13:22:10 +0000 (08:22 -0500)]
email: Fix typo '\\n' -> '\n' in _flatten docstring.

I seem to have forgotten that the docstring is raw (`r"""`) when I
wrote the original tests.

Signed-off-by: W. Trevor King <wking@tremily.us>
W. Trevor King [Fri, 15 Feb 2013 14:16:39 +0000 (09:16 -0500)]
email: Make path to sendmail configurable

For example, Azer Koçulu has an rss2email fork that uses msmtp [1].

[1]: https://github.com/azer/rss2email

Signed-off-by: W. Trevor King <wking@tremily.us>
W. Trevor King [Fri, 15 Feb 2013 13:13:34 +0000 (08:13 -0500)]
CHANGELOG: Update after the release of 3.0 and 3.1

Signed-off-by: W. Trevor King <wking@tremily.us>
tag: v3.1
W. Trevor King [Thu, 14 Feb 2013 13:34:03 +0000 (08:34 -0500)]
Bump to version 3.1

Changes since 3.0:
* Import __url__, __author__, and __email__ in rss2email.error, which
  fixes bugs formatting a number of errors.

Signed-off-by: W. Trevor King <wking@tremily.us>
W. Trevor King [Thu, 14 Feb 2013 13:28:35 +0000 (08:28 -0500)]
error: Import __*__ metadata (URL, author, email)

These are used to format some of the log messages.

Signed-off-by: W. Trevor King <wking@tremily.us>
W. Trevor King [Thu, 14 Feb 2013 13:24:08 +0000 (08:24 -0500)]
setup.py: Use __url__ and __author__ from rss2email/__init__.py

Instead of duplicating those values locally.  This gives us one less
place to forget to update the next time we change the website or
maintainer.

Also, update __url__ to point to my GitHub repository and shift the
email address to rss2email.__email__.

Signed-off-by: W. Trevor King <wking@tremily.us>
tag: v3.0
W. Trevor King [Wed, 13 Feb 2013 14:32:40 +0000 (09:32 -0500)]
Bump to version 3.0

Changes since 2.71:
* State storage split into a static configuration file (usually
  `~/.config/rss2email.cfg`) and a dynamic JSON data file (usually
  `~/.local/share/rss2email.json`).
* The static configuration file is parsed with Python's ConfigParser
  class, which allows for default settings that can be overridden on a
  global or per-feed basis.  You'll have to translate your old config
  to the new format by hand when you upgrade.
* Emailed messages now have Message-IDs.
* Feeds can be referenced by name as well as by index (e.g.
  `r2e run my-feed`).
* Restructured as a package with submodules instead of a single
  module.  This makes dependencies between various portions of
  rss2email more explicit.
* Converted to Python >=3.2, for more consistent Unicode handling,
  exception chaining, and argparse (although argparse is also in 2.7).
* Packaged with setup.py and distutils, in case you want to install
  rss2email instead of running it from a Git checkout or unpacked
  tarball.
* Added a test suite (run with `./test/test.py`).
* Added a man page, based on the version in the Debian package.
* Require Signed-off-by lines in new commit messages, following the
  Linux and Git projects.
* Assorted cleanups and bug fixes.
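
As an illustration of the default/per-feed override behaviour described
above, here is a small sketch with Python's configparser.  The section
names, the `to` option, and the addresses are hypothetical; only the
DEFAULT-section fallback mechanism is the point.

```python
import configparser

# Hypothetical configuration, for illustration only.
SAMPLE = """
[DEFAULT]
to = a@example.invalid
use-8bit = False

[feed.my-feed]
url = http://example.invalid/feed.atom
use-8bit = True

[feed.other]
url = http://example.invalid/other.rss
"""

config = configparser.ConfigParser()
config.read_string(SAMPLE)
my_feed = config['feed.my-feed']
other = config['feed.other']

# Per-feed value wins; anything missing falls back on [DEFAULT].
print(my_feed.getboolean('use-8bit'), other.getboolean('use-8bit'))
print(other['to'])
```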

Signed-off-by: W. Trevor King <wking@tremily.us>
W. Trevor King [Wed, 13 Feb 2013 14:12:05 +0000 (09:12 -0500)]
command: In run(), save feeds even after errors

It's annoying to have a few feeds processed successfully and then have
one feed with a configuration error take down the process without
saving.  With this commit, we always save the feeds, regardless of any
error.  We also catch and log any RSS2EmailError, not just the
NoToEmailAddress and ProcessingErrors we caught earlier.
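
The control flow can be sketched as below.  This is not the actual
run() implementation; the `feeds`, `save`, and `log` parameters and the
local RSS2EmailError stand-in are hypothetical.

```python
class RSS2EmailError(Exception):
    """Stand-in for rss2email's base error class."""


def run(feeds, save, log):
    """Process each (name, callable) feed; always save afterwards."""
    try:
        for name, process in feeds:
            try:
                process()
            except RSS2EmailError as error:
                # Log and move on to the next feed.
                log('{}: {}'.format(name, error))
    finally:
        # Runs even if an unexpected exception escapes the loop.
        save()
```

Known errors are logged per feed, and the `finally` clause saves state
whether or not something unexpected propagates.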

Signed-off-by: W. Trevor King <wking@tremily.us>
W. Trevor King [Thu, 24 Jan 2013 22:51:13 +0000 (17:51 -0500)]
config: Fix 'significan' -> 'significant' typo in a comment

Signed-off-by: W. Trevor King <wking@tremily.us>
W. Trevor King [Thu, 24 Jan 2013 05:30:04 +0000 (00:30 -0500)]
email: Encode the body when we might use 8bit encoding

R. David Murray writes [1]:
> In 2.x that will work, and will give you the 8bit CTE at need, as
> long as you pass encoded text to MIMEText (as opposed to
> unicode...which it doesn't really handle correctly if I recall
> right).

See also, [2].

[1]: http://bugs.python.org/issue12552
     email.MIMEText overide BASE64 for utf8 charset
[2]: http://bugs.python.org/issue12553
     Add support for using a default CTE of '8bit' to MIMEText

Signed-off-by: W. Trevor King <wking@tremily.us>
W. Trevor King [Thu, 24 Jan 2013 04:41:50 +0000 (23:41 -0500)]
email: Fix _flatten() implementation for non-ASCII bodies

The email header should be flattened to ASCII with funky encoded
headers [1], but the body may be encoded in a non-ASCII-compatible
charset (e.g. UTF-16-LE).  The old _flatten() implementation used the
body charset to encode the entire message, which could garble the
header.  This patch uses BytesGenerator, which takes advantage of
email.charset.Charset's separate fields for the header encoding and
body encoding.

[1]: http://docs.python.org/3/library/email.header.html

Signed-off-by: W. Trevor King <wking@tremily.us>
W. Trevor King [Thu, 24 Jan 2013 04:12:01 +0000 (23:12 -0500)]
email: Add a failing UTF-16 _flatten example

Signed-off-by: W. Trevor King <wking@tremily.us>
W. Trevor King [Thu, 24 Jan 2013 03:59:55 +0000 (22:59 -0500)]
email: Factor message-to-bytes formatting out into _flatten()

Signed-off-by: W. Trevor King <wking@tremily.us>
W. Trevor King [Thu, 24 Jan 2013 04:39:44 +0000 (23:39 -0500)]
email: Use Charsets to set the Content-Transfer-Encoding

This ensures that payload encoding/decoding happens appropriately, and
allows 7-bit-clean data to be sent with a 7bit CTE, even when the
use-8bit setting is on.

Signed-off-by: W. Trevor King <wking@tremily.us>
W. Trevor King [Thu, 24 Jan 2013 04:02:41 +0000 (23:02 -0500)]
email: Alphabetize imports (swap email.mime and email.header)

Signed-off-by: W. Trevor King <wking@tremily.us>
W. Trevor King [Thu, 24 Jan 2013 01:17:00 +0000 (20:17 -0500)]
main: Show traceback when we're extra verbose

Adding --verbose (or -V) flags moves the logger from ERROR to WARNING
(-V), INFO (-VV), and DEBUG (-VVV).  Additional increments were
ignored, but I don't like always masking tracebacks.  This patch sets
an additional verbosity level (-VVVV) which logs at DEBUG and
additionally prints exception tracebacks instead of hiding them.
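
One way to express this mapping is sketched below.  The helper names
are hypothetical; the levels follow the description above.

```python
import logging


def log_level(verbose):
    """Map a -V count to a logging level, bottoming out at DEBUG."""
    return max(logging.DEBUG, logging.ERROR - 10 * verbose)


def show_traceback(verbose):
    """Tracebacks are only shown at the fourth level (-VVVV) and up."""
    return verbose >= 4


for count in range(5):
    print(count, logging.getLevelName(log_level(count)),
          show_traceback(count))
```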

Signed-off-by: W. Trevor King <wking@tremily.us>
W. Trevor King [Thu, 24 Jan 2013 01:03:51 +0000 (20:03 -0500)]
feed: Convert missing/extra key errors to InvalidFeedConfig

This way we get the message and not a full traceback, to avoid scaring
users who aren't familiar with Python tracebacks.  There's not much
information to go on in the new message, but if you crank up the
verbosity, you get:

  $ PYTHONPATH=. ./r2e -c conf -d data -VVV list
  load feed configuration from ['conf']
  loaded configuration from ['conf']
  load feed data from data
  extra configuration key: use_8bit

which seems good enough for me.

Reported-by: Dmitry Bogatov <KAction@gnu.org>
Signed-off-by: W. Trevor King <wking@tremily.us>
W. Trevor King [Wed, 23 Jan 2013 23:00:57 +0000 (18:00 -0500)]
email: When setting an 8bit CTE, remove the old header

From the docs [1]:

  Note that this does not overwrite or delete any existing header with
  the same name. If you want to ensure that the new header is the only
  one present in the message with field name name, delete the field
  first, e.g.:

    del msg['subject']
    msg['subject'] = 'Python roolz!'

[1]: http://docs.python.org/3/library/email.message.html#email.message.Message.__setitem__
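
Applied to the Content-Transfer-Encoding header, the behaviour looks
like this with a plain email.message.Message (a small illustration,
not the code from this commit):

```python
from email.message import Message

message = Message()
message['Content-Transfer-Encoding'] = '7bit'
message['Content-Transfer-Encoding'] = '8bit'   # appends a second header
duplicates = message.get_all('Content-Transfer-Encoding')

del message['Content-Transfer-Encoding']        # removes every occurrence
message['Content-Transfer-Encoding'] = '8bit'
single = message.get_all('Content-Transfer-Encoding')

print(duplicates, single)
```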

Signed-off-by: W. Trevor King <wking@tremily.us>
W. Trevor King [Wed, 23 Jan 2013 23:00:02 +0000 (18:00 -0500)]
test/bbc-chinese: Add tests for use-8bit

The BBC's Chinese feed should have a few non-ASCII characters in it
;).

Signed-off-by: W. Trevor King <wking@tremily.us>
W. Trevor King [Wed, 23 Jan 2013 22:54:34 +0000 (17:54 -0500)]
config: Rename use_8bit to use-8bit for uniformity

We use hyphens in all the other config settings.

Signed-off-by: W. Trevor King <wking@tremily.us>
W. Trevor King [Wed, 23 Jan 2013 22:51:50 +0000 (17:51 -0500)]
feed: Pass config and section arguments to get_message()

Otherwise it will always use the default config.

Also add section fallback code to get_message in case the
feed-specific section is not in the config file.  This is useful for
testing, although in production every feed should have a section to
hold its URL.

Signed-off-by: W. Trevor King <wking@tremily.us>
Dmitry Bogatov [Wed, 23 Jan 2013 20:36:02 +0000 (00:36 +0400)]
Add use of `sender' parameter in `sendmail_send'.

Signed-off-by: Dmitry Bogatov <KAction@gnu.org>
Dmitry Bogatov [Wed, 23 Jan 2013 20:35:04 +0000 (00:35 +0400)]
Add 8bit Content-Transfer-Encoding support.

Signed-off-by: Dmitry Bogatov <KAction@gnu.org>
W. Trevor King [Wed, 23 Jan 2013 14:13:43 +0000 (09:13 -0500)]
feeds: Raise an RSS2EmailError on invalid Feeds.index() arguments

Don't confuse non-Python folks by giving a traceback for this usage
error.

Reported-by: Dmitry Bogatov <KAction@gnu.org>
Signed-off-by: W. Trevor King <wking@tremily.us>
W. Trevor King [Sat, 19 Jan 2013 18:22:49 +0000 (13:22 -0500)]
email: Don't assume `extra_headers` has content in get_message()

The default is None, so we should at least handle that case
gracefully.
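
A common idiom for this, sketched with a hypothetical helper (the real
get_message() takes more arguments and does more work):

```python
from email.message import Message


def add_extra_headers(message, extra_headers=None):
    """Add optional headers, treating None like an empty mapping."""
    for key, value in (extra_headers or {}).items():
        message[key] = value
    return message


plain = add_extra_headers(Message())
tagged = add_extra_headers(Message(), {'X-Example': '1'})
```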

Signed-off-by: W. Trevor King <wking@tremily.us>
W. Trevor King [Sat, 19 Jan 2013 00:13:15 +0000 (19:13 -0500)]
feed: Remove the for loop variable `e` for a clean namespace

Otherwise:

  >>> import rss2email.feed
  >>> print([x for x in dir(rss2email.feed) if not x.startswith('_')])
  ['Feed', 'e']

which might confuse people ;).
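
The leak is easy to reproduce outside rss2email.  This sketch runs two
hypothetical module bodies with exec() and lists the public names each
leaves behind:

```python
LEAKY = "for e in ('a', 'b'):\n    pass\n"
CLEAN = LEAKY + "del e\n"


def public_names(source):
    """Execute module-level source and return its non-underscore names."""
    namespace = {}
    exec(source, namespace)
    return sorted(name for name in namespace if not name.startswith('_'))


print(public_names(LEAKY))
print(public_names(CLEAN))
```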

Signed-off-by: W. Trevor King <wking@tremily.us>
W. Trevor King [Fri, 18 Jan 2013 21:46:23 +0000 (16:46 -0500)]
main: Catch command-less case for Python 3.3

In Python 3.2, the argument parser raises an error if no subcommand is
listed on the command line.  This does not seem to be the case with
Python 3.3.0, and the changed behavior seems to have been a side
effect of this:

  http://hg.python.org/cpython/rev/cab204a79e09
  changeset:   70741:cab204a79e09
  user:        R David Murray <rdmurray@bitdance.com>
  date:        Thu Jun 09 12:34:07 2011 -0400
  summary:     #10424: argument names are now included in the missing argument mes

Anyhow, it's easy enough to catch the new behaviour in rss2email and
print the appropriate error.
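
A minimal sketch of such a check, assuming the subcommand is stored in
a `command` attribute (the parser layout and error text here are
hypothetical, not copied from rss2email):

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser(prog='r2e')
    subparsers = parser.add_subparsers(dest='command')
    subparsers.add_parser('run')
    subparsers.add_parser('list')
    return parser


def parse(argv):
    parser = build_parser()
    args = parser.parse_args(argv)
    if args.command is None:
        # No subcommand given: report a usage error ourselves.
        parser.error('a command is required')
    return args
```

Later Python versions (3.7 and up) also accept `required=True` in
add_subparsers(), which makes the explicit check unnecessary there.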

Reported-by: Dmitry Bogatov <KAction@gnu.org>
Signed-off-by: W. Trevor King <wking@tremily.us>
W. Trevor King [Fri, 18 Jan 2013 03:07:45 +0000 (22:07 -0500)]
rss2email: Sort __contributors__ by first name and add missing folks

This brings __contributors__ in line with the auto-generated AUTHORS.
I'm not convinced that listing __contributors__ in a Python-parsable
manner is worth the trouble, but I'll leave it in for now.

Signed-off-by: W. Trevor King <wking@tremily.us>
W. Trevor King [Thu, 10 Jan 2013 16:05:48 +0000 (11:05 -0500)]
feed|feeds: Update datafile format to version 2

We may want to store additional data about previously seen entries
besides our possibly auto-generated ID.  Convert the `seen` mapping
from:

  entry_id -> our_id

to:

  entry_id -> {'id': our_id}
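
In Python terms, the conversion amounts to something like this (the
function name is hypothetical):

```python
def upgrade_seen(seen):
    """Convert a version-1 `seen` mapping to the version-2 layout."""
    return {entry_id: {'id': our_id} for entry_id, our_id in seen.items()}


old = {'http://example.invalid/post/1': 'abc123'}
new = upgrade_seen(old)
print(new)
```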

Signed-off-by: W. Trevor King <wking@tremily.us>
W. Trevor King [Thu, 10 Jan 2013 13:57:17 +0000 (08:57 -0500)]
r2e.1: Update ConfigParser URL after PEP 430

Signed-off-by: W. Trevor King <wking@tremily.us>
W. Trevor King [Thu, 10 Jan 2013 13:39:52 +0000 (08:39 -0500)]
r2e.1: Update __contributors__ location and direct bugs to the mailing list

Signed-off-by: W. Trevor King <wking@tremily.us>
W. Trevor King [Thu, 10 Jan 2013 13:35:16 +0000 (08:35 -0500)]
r2e.1: Update maintainer from Lindsey to me.

This should have happened in:

  commit 6460e8738b5e7c66df6c2143e8a29c048cf308bd
  Author: W. Trevor King <wking@tremily.us>
  Date:   Fri Nov 9 07:46:17 2012 -0500

    Change maintainer from Lindsey to Trevor.

Signed-off-by: W. Trevor King <wking@tremily.us>
W. Trevor King [Thu, 10 Jan 2013 15:09:26 +0000 (10:09 -0500)]
feeds: Follow the XDG Base Directory Specification

This splits config files and data files into different directories (by
default), so we no longer need an rss2email subdirectory (we only have
one config file and one data file).  The default config file is now
~/.config/rss2email.cfg and the default data file is now
~/.local/share/rss2email.json.
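
A simplified sketch of how such defaults can be computed.  It only
honours XDG_CONFIG_HOME and XDG_DATA_HOME (not the *_DIRS search
paths), and the helper name is hypothetical.

```python
import os


def default_paths(environ=None):
    """Return (config file, data file) following the XDG defaults."""
    environ = os.environ if environ is None else environ
    home = os.path.expanduser('~')
    # Unset or empty variables fall back on the spec's defaults.
    config_home = (environ.get('XDG_CONFIG_HOME')
                   or os.path.join(home, '.config'))
    data_home = (environ.get('XDG_DATA_HOME')
                 or os.path.join(home, '.local', 'share'))
    return (os.path.join(config_home, 'rss2email.cfg'),
            os.path.join(data_home, 'rss2email.json'))


print(default_paths())
```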

Signed-off-by: W. Trevor King <wking@tremily.us>
W. Trevor King [Thu, 10 Jan 2013 15:03:41 +0000 (10:03 -0500)]
feeds: Use JSON instead of Pickle for storing dynamic feed state

It's safer, more portable, and possibly faster.  I've also added
version information to the data file for easier future upgrades.
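
A sketch of a versioned JSON data file.  The `version` and `feeds` keys
and the VERSION value are assumptions made for illustration; the actual
on-disk layout is defined in rss2email's feeds module.

```python
import json

VERSION = 1  # hypothetical value, for illustration only


def dump_state(feeds):
    """Serialize feed state together with a format version."""
    return json.dumps({'version': VERSION, 'feeds': feeds},
                      indent=2, sort_keys=True)


def load_state(text):
    """Load feed state, refusing data written in an unknown format."""
    data = json.loads(text)
    if data.get('version') != VERSION:
        raise ValueError(
            'unsupported data file version: {!r}'.format(data.get('version')))
    return data['feeds']


state = [{'name': 'my-feed', 'seen': {}}]
restored = load_state(dump_state(state))
```

Unlike a pickle, the file can be inspected by hand and loading it
cannot execute arbitrary code.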

Signed-off-by: W. Trevor King <wking@tremily.us>
W. Trevor King [Thu, 10 Jan 2013 11:43:05 +0000 (06:43 -0500)]
feed: Split entry link extraction out into Feed._get_entry_link

Now other methods can all access the same link, without having to
extract it in _process_entry and pass the extracted link around
explicitly.  Currently, nothing special happens during link
extraction, but the new method helps future proof us.

Signed-off-by: W. Trevor King <wking@tremily.us>
W. Trevor King [Wed, 9 Jan 2013 16:25:35 +0000 (11:25 -0500)]
Ran update-copyright.py

Signed-off-by: W. Trevor King <wking@tremily.us>
W. Trevor King [Wed, 9 Jan 2013 16:22:48 +0000 (11:22 -0500)]
main: Add missing `# Copyright` tag for update-copyright.py

Signed-off-by: W. Trevor King <wking@tremily.us>
W. Trevor King [Wed, 9 Jan 2013 16:20:30 +0000 (11:20 -0500)]
version: Add get_versions and teach rss2email --full-version

This makes it easier for users to submit useful bug reports.

Signed-off-by: W. Trevor King <wking@tremily.us>
W. Trevor King [Wed, 9 Jan 2013 16:01:17 +0000 (11:01 -0500)]
main: Add an explicit --version argument.

Use an explicit `version` action instead of the undocumented (and
deprecated) `version` keyword to ArgumentParser [1,2].

[1]: http://docs.python.org/3.3/library/argparse.html#action
[2]: http://docs.python.org/3.3/library/argparse.html#upgrading-optparse-code
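
For reference, the documented `version` action is used like this (the
program name and version string are placeholders):

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser(prog='r2e')
    # Prints the version string and exits when --version is given.
    parser.add_argument(
        '--version', action='version', version='%(prog)s 0.0')
    return parser
```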

Signed-off-by: W. Trevor King <wking@tremily.us>
W. Trevor King [Wed, 9 Jan 2013 15:43:24 +0000 (10:43 -0500)]
error: Strip trailing whitespace in module docstring

Signed-off-by: W. Trevor King <wking@tremily.us>
W. Trevor King [Wed, 9 Jan 2013 15:27:22 +0000 (10:27 -0500)]
feed: Raise the new InvalidFeedConfig on missing feed.url

You can't fetch a feed without a URL.  This new error message makes
the cause explicit, compared to the somewhat ambiguous former error
messages:

  fetch $NAME (None -> $TO)
  process $NAME (None -> $TO)
  HTTP status 200
  could not get HTTP headers: $NAME (None -> $TO)
  unrecognized version: $NAME (None -> $TO)
  sax parsing error: <unknown>:2:0: no element found: $NAME (None -> $TO)

I was getting URL-less feeds when I clobbered my
~/.config/rss2email/config [1], removing some newer entries.  However,
because I never deleted the feeds explicitly, they were repopulated
(without their URL) from ~/.config/rss2email/feeds.dat, and subsequent
runs generated the above error.

[1]: The clobbering was related to my dotfile management, and not due
to an rss2email issue.

Signed-off-by: W. Trevor King <wking@tremily.us>
W. Trevor King [Tue, 11 Dec 2012 01:11:38 +0000 (20:11 -0500)]
test:disqus: add a Disqus feed for testing

This raised a few issues with the handling of missing IDs, which I've
just fixed.  The new test will make sure we keep exercising these code
paths.

Signed-off-by: W. Trevor King <wking@tremily.us>
W. Trevor King [Tue, 11 Dec 2012 01:07:35 +0000 (20:07 -0500)]
feed: fix id fallback in Feed._process_entry

The old `entry['id'] or _id` raised a KeyError when the entry lacked
an ID.  The new `entry.get('id', _id)` falls back appropriately.
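
The difference between the two expressions, shown on a plain dict with
hypothetical values:

```python
fallback = 'generated-id'
entry = {'title': 'an entry without an id'}

old_failed = False
try:
    entry['id'] or fallback      # subscript raises before `or` is reached
except KeyError:
    old_failed = True

new_id = entry.get('id', fallback)
print(old_failed, new_id)
```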

Signed-off-by: W. Trevor King <wking@tremily.us>
W. Trevor King [Tue, 11 Dec 2012 01:00:35 +0000 (20:00 -0500)]
feed: use hashlib.sha1 for fallback IDs in Feed._get_entry_id

This fixes what I broke in

  commit 0fc2ff4465d741823b3dceebcfcf3a98a0081522
  Author: W. Trevor King <wking@tremily.us>
  Date:   Thu Oct 4 08:33:32 2012 -0400

    Cleanup global module configuration.

  diff --git a/rss2email.py b/rss2email.py
  index 216c13d..3919bd7 100755
  --- a/rss2email.py
  +++ b/rss2email.py
  ...
  -hash = hashlib.md5
  ...

I've come back with sha1 instead of md5, mostly because that's what
Git uses ;).  We don't migrate recorded IDs from the old configuration
method, so the change from md5 to sha1 shouldn't affect anyone.
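
A fallback ID of this kind can be computed as sketched below.  Which
entry fields are hashed, and the UTF-8 encoding, are assumptions made
for illustration.

```python
import hashlib


def fallback_id(text):
    """Return a stable hex digest to use when an entry has no ID."""
    return hashlib.sha1(text.encode('utf-8')).hexdigest()


print(fallback_id('Some entry title or content'))
```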

Signed-off-by: W. Trevor King <wking@tremily.us>
W. Trevor King [Mon, 10 Dec 2012 23:23:51 +0000 (18:23 -0500)]
feed: fix Feed._get_entry_content unpacking in Feed._get_entry_title

_get_entry_content returns a single dict, but _get_entry_title had been
unpacking it as if it were an object with a `content` attribute.

Also convert HTML to text before extracting the title, to avoid things
like `<p>In the beginning...` in the title.

Signed-off-by: W. Trevor King <wking@tremily.us>
W. Trevor King [Mon, 10 Dec 2012 23:21:34 +0000 (18:21 -0500)]
feed: fix Feed._get_entry_content unpacking in Feed._get_entry_id

_get_entry_content returns a single dict, but _get_entry_id had been
unpacking it as if it was a length-two tuple.

Signed-off-by: W. Trevor King <wking@tremily.us>
W. Trevor King [Mon, 10 Dec 2012 23:13:40 +0000 (18:13 -0500)]
feed: fix the `type` key returned by Feed._get_entry_content

The previous version used the Python object `type` where it should
have used the string 'type'.  I hadn't caught the bug before because
none of my example feeds fell through that far.
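
The bug is easy to miss because both spellings are valid Python.  A
small illustration with hypothetical values:

```python
wrong = {type: 'text/plain', 'value': 'hello'}    # key is the builtin
right = {'type': 'text/plain', 'value': 'hello'}  # key is a string

print('type' in wrong, 'type' in right)
```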

Signed-off-by: W. Trevor King <wking@tremily.us>
W. Trevor King [Mon, 10 Dec 2012 23:06:45 +0000 (18:06 -0500)]
test:gmane:README: fix "feed.atom" -> "feed.rss" typo

Probably a copy-paste error from seeding with
test/allthingsrss/README.

Signed-off-by: W. Trevor King <wking@tremily.us>
Etienne Millon [Mon, 10 Dec 2012 14:36:36 +0000 (15:36 +0100)]
Link to Debian & Ubuntu package pages

(instead of search)

Signed-off-by: Etienne Millon <me@emillon.org>
Etienne Millon [Mon, 10 Dec 2012 14:02:03 +0000 (15:02 +0100)]
Don't put UTF8 last in the list of encodings

Sometimes, BIG5 will be selected for English text if quotes make it
not representable in ASCII. See [1] for the original bug report.

This default list is arguably European-centric but at least it documents a good
amount of the alternative encodings.
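
The effect of list order can be seen in a first-match selection loop
like this sketch (the function name and the short encoding lists are
hypothetical, not rss2email's defaults):

```python
def pick_encoding(text, encodings):
    """Return the first encoding in `encodings` that can represent text."""
    for encoding in encodings:
        try:
            text.encode(encoding)
        except UnicodeEncodeError:
            continue
        return encoding
    raise ValueError('no suitable encoding for {!r}'.format(text))


order = ('us-ascii', 'iso-8859-1', 'utf-8', 'big5')
print(pick_encoding('plain text', order))
print(pick_encoding('It\u2019s quoted', order))
```

With UTF-8 ahead of the more exotic charsets, text containing a curly
quote is picked up by UTF-8 before any later entry is tried.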

[1]: http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=659920

Reported-by: James Cloos <cloos@jhcloos.com>
Signed-off-by: Etienne Millon <me@emillon.org>
Etienne Millon [Mon, 10 Dec 2012 13:50:21 +0000 (14:50 +0100)]
Change default email address

As discussed in [1], user@rss2email.invalid is less verbose and simpler to
understand.

[1]: http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=464077

Reported-by: Noah Slater <nslater@bytesexual.org>
Signed-off-by: Etienne Millon <me@emillon.org>
W. Trevor King [Sat, 8 Dec 2012 16:13:24 +0000 (11:13 -0500)]
COPYING: whitespace changes to match the new gnu.org download

No sense in preserving differences between this branch and the version
in the `license` branch.

Signed-off-by: W. Trevor King <wking@tremily.us>
W. Trevor King [Sat, 8 Dec 2012 16:10:50 +0000 (11:10 -0500)]
CONTRIBUTING.md: describe rss2email's GPLv{2,3}

We want people using s-o-b to be very clear about which licenses they
should use for rss2email submissions.

Signed-off-by: W. Trevor King <wking@tremily.us>
W. Trevor King [Sat, 8 Dec 2012 15:52:37 +0000 (10:52 -0500)]
Merge remote-tracking branch 's-o-b/contributing-github'

W. Trevor King [Sat, 8 Dec 2012 15:32:48 +0000 (10:32 -0500)]
CONTRIBUTING.md: point to external docs on GitHub

This translates CONTRIBUTING (from the `contributing` branch) into
Markdown using a GitHub URL for the link.  Merging this into your
project will set it up to use GitHub's CONTRIBUTING infrastructure,
so that GitHub points users to the file when they create issues and
pull requests [1].

GitHub's blob URL syntax is [2]:

  https://github.com/<user>/<project>/blob/<commit-SHA-1>/Path/To/File

Like CONTRIBUTING, CONTRIBUTING.md is also released under the CC0
Universal license (see the `license` branch for full text).

[1]: https://github.com/blog/1184-contributing-guidelines
[2]: https://help.github.com/articles/how-do-i-get-a-permanent-link-from-file-view-to-permanent-blob-url

Signed-off-by: W. Trevor King <wking@tremily.us>
W. Trevor King [Mon, 26 Nov 2012 14:31:02 +0000 (09:31 -0500)]
CONTRIBUTING: point to external docs (for non GPLv2= projects)

If a project wants to use the DCO/s-o-b workflow, but can't because of
license incompatibility, they can include this blurb pointing towards
the external documentation.  To avoid licensing issues with this
CONTRIBUTING file itself, I'm releasing it under the Creative Commons
CC0 1.0 Universal license (see the `license` branch for full text).

If you use this in your project, you'll probably want to adjust the
URL to point to somewhere more dependable.  It's annoyingly long, but
you should include the blob hash (or a higher level hash like the
commit hash) in your URL to make it absolutely clear which version of
the documentation you were using in your project at a particular time.

Signed-off-by: W. Trevor King <wking@tremily.us>
W. Trevor King [Sun, 18 Nov 2012 17:19:15 +0000 (12:19 -0500)]
rss2email: raise error on import with Python < 3.2.

rss2email won't work with older Pythons.  Avoid user confusion due to
API-breakage error messages [1] by bailing explicitly up front.

[1]: http://forums.macrumors.com/showthread.php?t=1216694
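
An up-front check of this kind might look like the sketch below.  The
helper name, the ImportError type, and the message text are
assumptions; the commit only says that an error is raised on import.

```python
import sys


def require_python(minimum=(3, 2), version_info=None):
    """Raise ImportError if the running Python is older than `minimum`."""
    version_info = sys.version_info if version_info is None else version_info
    if tuple(version_info[:2]) < minimum:
        raise ImportError(
            'rss2email requires Python >= {}.{}, but you are using '
            '{}.{}'.format(minimum[0], minimum[1],
                           version_info[0], version_info[1]))


require_python()  # a no-op on any supported interpreter
```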

W. Trevor King [Sun, 18 Nov 2012 15:17:44 +0000 (10:17 -0500)]
CHANGELOG: summarize changes since v2.71

W. Trevor King [Sun, 18 Nov 2012 14:12:34 +0000 (09:12 -0500)]
test: record sender as SENT BY (not SENT TO) in *.expected

W. Trevor King [Sun, 18 Nov 2012 14:09:32 +0000 (09:09 -0500)]
test:gmane: add Gmane feed for RSS testing

The weird indentation is because Gmane wraps the descriptions in <pre>
tags (to preserve formatting in the initial email).  html2text is
converting the description to Markdown, so it inserts a leading 4
spaces for preformatted blocks.  The unindented initial line and
following blank are due to a bug in html2text, for which I've
submitted

  https://github.com/aaronsw/html2text/pull/63

W. Trevor King [Sun, 18 Nov 2012 13:42:53 +0000 (08:42 -0500)]
CHANGELOG: strip trailing whitespace

W. Trevor King [Sun, 18 Nov 2012 12:22:38 +0000 (07:22 -0500)]
setup.py: list dependencies in setup(requires=...)

W. Trevor King [Sun, 18 Nov 2012 12:20:04 +0000 (07:20 -0500)]
setup.py: use mailing list address for maintenance

W. Trevor King [Sun, 18 Nov 2012 12:10:48 +0000 (07:10 -0500)]
README: point to rss2email/config.py, not a line number in rss2email.py

W. Trevor King [Sun, 18 Nov 2012 12:04:33 +0000 (07:04 -0500)]
README: use GitHub URL for git clone

W. Trevor King [Sun, 18 Nov 2012 12:03:45 +0000 (07:03 -0500)]
README: use Gmane for example feed

W. Trevor King [Sun, 18 Nov 2012 11:38:16 +0000 (06:38 -0500)]
README: add mailing list information (and link to Gmane)

W. Trevor King [Tue, 13 Nov 2012 17:34:55 +0000 (12:34 -0500)]
main: fix logging imports

This should have happened in:

  commit 066602efa088b4a89d67e23011613b4459db3c92
  Author: W. Trevor King <wking@tremily.us>
  Date:   Tue Nov 13 09:09:00 2012 -0500

    rss2email: split massive package into modules

W. Trevor King [Tue, 13 Nov 2012 15:15:47 +0000 (10:15 -0500)]
Run update-copyright.py

11 years ago.update-copyright.conf: add copyright configuration.
W. Trevor King [Tue, 13 Nov 2012 15:14:44 +0000 (10:14 -0500)]
.update-copyright.conf: add copyright configuration.

Use my external update-copyright package to maintain copyright blurbs.

http://pypi.python.org/pypi/update-copyright/

11 years ago.mailmap: map Lindsey's usernames to a canonical form
W. Trevor King [Tue, 13 Nov 2012 15:14:04 +0000 (10:14 -0500)]
.mailmap: map Lindsey's usernames to a canonical form

11 years agoCOPYING: add GPLv2 text
W. Trevor King [Tue, 13 Nov 2012 14:44:43 +0000 (09:44 -0500)]
COPYING: add GPLv2 text

11 years agosetup.py: add a comma to the end of the url kwarg
W. Trevor King [Tue, 13 Nov 2012 14:12:27 +0000 (09:12 -0500)]
setup.py: add a comma to the end of the url kwarg

This should have happened in:

  commit cd5b8f30942c72a0fd1b82f4763f21cfaad3864b
  Author: W. Trevor King <wking@tremily.us>
  Date:   Mon Nov 12 15:59:09 2012 -0500

    Convert homepage/downloads from allthingsrss.com to GitHub

11 years agorss2email: split massive package into modules
W. Trevor King [Tue, 13 Nov 2012 14:09:00 +0000 (09:09 -0500)]
rss2email: split massive package into modules

11 years agotest: fix tests to handle Message-IDs
W. Trevor King [Tue, 13 Nov 2012 14:01:40 +0000 (09:01 -0500)]
test: fix tests to handle Message-IDs

This should have happened in:

  commit 29f8b8813e1f464b7171ea1830b6ed40ec24dac2
  Author: W. Trevor King <wking@tremily.us>
  Date:   Sat Oct 27 11:09:45 2012 -0400

    rss2email: add Message-IDs so I can link messages in Mutt

11 years agorss2email: convert infogami URL to GitHub URL
W. Trevor King [Tue, 13 Nov 2012 12:52:50 +0000 (07:52 -0500)]
rss2email: convert infogami URL to GitHub URL

11 years agoREADME: GitHub's markup renderer chokes on Unicode ellipses in code blocks
W. Trevor King [Tue, 13 Nov 2012 12:48:58 +0000 (07:48 -0500)]
README: GitHub's markup renderer chokes on Unicode ellipses in code blocks

11 years agoREADME.rst: add symlink for GitHub rendering
W. Trevor King [Tue, 13 Nov 2012 12:23:47 +0000 (07:23 -0500)]
README.rst: add symlink for GitHub rendering

11 years agoConvert homepage/downloads from allthingsrss.com to GitHub
W. Trevor King [Mon, 12 Nov 2012 20:59:09 +0000 (15:59 -0500)]
Convert homepage/downloads from allthingsrss.com to GitHub

11 years agorss2email: reorder NoValidEncodingError superclasses
W. Trevor King [Mon, 12 Nov 2012 20:54:19 +0000 (15:54 -0500)]
rss2email: reorder NoValidEncodingError superclasses

This avoids:

  TypeError: NoValidEncodingError does not take keyword arguments
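
A minimal sketch of why superclass order matters here.  The class
names and the `message` keyword are hypothetical (the actual class
definitions are not shown in this log); the point is only that the
first superclass in the method resolution order receives the
super().__init__() call, and built-in exceptions reject keyword
arguments:

```python
class AppError(Exception):
    """Hypothetical project base error that accepts a `message` keyword."""
    def __init__(self, message):
        super().__init__(message)


class BadOrder(ValueError, AppError):
    """ValueError comes first in the MRO, so it receives the keyword."""
    def __init__(self, string):
        super().__init__(message='no valid encoding for {!r}'.format(string))


class GoodOrder(AppError, ValueError):
    """AppError comes first and passes the message on positionally."""
    def __init__(self, string):
        super().__init__(message='no valid encoding for {!r}'.format(string))


try:
    BadOrder('x')
    bad_order_failed = False
except TypeError:
    # ValueError's initializer does not accept keyword arguments.
    bad_order_failed = True

error = GoodOrder('x')  # still a ValueError, but initializes cleanly
```

With the project-specific base class listed first, the instance is
still caught by `except ValueError`.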

11 years agoMerge Lindsey's recent advances with my master
W. Trevor King [Mon, 12 Nov 2012 20:39:54 +0000 (15:39 -0500)]
Merge Lindsey's recent advances with my master

Conflicts:
.gitignore    (use my master, dumping Lindsey's changes)
readme.html   (remove in favor of my README)
rss2email.py  (use my master, bringing in the following changes from
               Lindsey's branch)

  commit 42fa878f929c6ee39c8068d186de1ae7f4630638
  Author: Lindsey Smith <lindsey.smith@gmail.com>
  Date:   Mon Mar 14 09:18:42 2011 -0700

    Improved basic email validation

Incorporated into Feed._validate_email.

  commit 38b33a71e049f29743ce1805b82d4f048e466ce6
  Author: Lindsey Smith <lindsey.smith@gmail.com>
  Date:   Mon Mar 14 09:19:12 2011 -0700

    Initial revision

Incorporated into the Feed._validate_email doctest.

  commit a55809611f9706e7004791c060bd7e5a90a2dcb9
  Author: U-SEVEN\lindsey <lindsey.smith@gmail.com>
  Date:   Fri Jun 24 11:07:22 2011 -0700

    Better attribute handling. Factored out tag handling into getTags()

This wasn't an atomic commit.  I dropped the BeautifulSoup HTML
cleanup to avoid an additional dependency.  I skipped the r → fullfeed
change in getName because I'd already rewritten that function as
Feed._get_entry_name.  I tweaked Feed._get_entry_tags to also handle
termless-tags.

  commit db2ac2f7a76cf93022363ac103f17d7ec71927f9
  Author: U-SEVEN\lindsey <lindsey.smith@gmail.com>
  Date:   Fri Jun 24 11:08:19 2011 -0700

    Added tests for getTags()

Merged getName tests into the Feed._get_entry_name doctest and merged
tag tests into the Feed._get_entry_tags doctest.

The other commits in Lindsey's branch didn't have anything that is
still useful after my refactoring.

11 years agoMerge Lindsey's tip (development since v2.71)
W. Trevor King [Mon, 12 Nov 2012 19:53:58 +0000 (14:53 -0500)]
Merge Lindsey's tip (development since v2.71)

11 years agoMerge Lindsey's v2.71 into my v2.71
W. Trevor King [Mon, 12 Nov 2012 19:47:06 +0000 (14:47 -0500)]
Merge Lindsey's v2.71 into my v2.71

I would just use Lindsey's repository directly, except my repository
contains more old releases (back to v2.65).

The conflicts were due to a few new lines in Lindsey's CHANGELOG, and
a number of whitespace differences due to line endings.

Conflicts:
CHANGELOG
config.py.example
r2e
r2e.bat
readme.html
rss2email.py

11 years agoChange maintainer from Lindsey to Trevor.
W. Trevor King [Fri, 9 Nov 2012 12:46:17 +0000 (07:46 -0500)]
Change maintainer from Lindsey to Trevor.

I emailed Lindsey on October 4th with some changes, and then again on
October 18th with a PyPI push suggestion.  I haven't heard back on
either count, so I'm assuming maintainership and pushing this to PyPI
myself.

Lindsey, I'm not attempting a hostile takeover ;).  If you get back
from a month-long vacation and want to resume maintainership, let me
know.

11 years agorss2email: allow config feed ordering to override datafile ordering
W. Trevor King [Mon, 29 Oct 2012 19:02:44 +0000 (15:02 -0400)]
rss2email: allow config feed ordering to override datafile ordering

11 years agorss2email: add Message-IDs so I can link messages in Mutt
W. Trevor King [Sat, 27 Oct 2012 15:09:45 +0000 (11:09 -0400)]
rss2email: add Message-IDs so I can link messages in Mutt

11 years agorss2email: fix Feed._validate_email signature to include self.
W. Trevor King [Sat, 20 Oct 2012 14:24:55 +0000 (10:24 -0400)]
rss2email: fix Feed._validate_email signature to include self.

11 years agorss2email: fix e.__cause__ -> self.__cause__ typos in exception logging.
W. Trevor King [Thu, 18 Oct 2012 17:38:19 +0000 (13:38 -0400)]
rss2email: fix e.__cause__ -> self.__cause__ typos in exception logging.

11 years agorss2email.py: don't create config-only feeds if they were in the datafile.
W. Trevor King [Thu, 18 Oct 2012 17:31:07 +0000 (13:31 -0400)]
rss2email.py: don't create config-only feeds if they were in the datafile.

11 years agorss2email: add ability to load feeds from the config file only.
W. Trevor King [Thu, 18 Oct 2012 16:56:58 +0000 (12:56 -0400)]
rss2email: add ability to load feeds from the config file only.

If a feed exists in the config file, but not in the data file, the
previous implementation would not load it.  This patch initializes
such feeds using only the information from the config file
(i.e. without dynamic data from the data file).  The patch also
creates missing data files on demand.  As an example use case, if you
keep a backup of your config file, but lose the data file, you can
restore the config file and an `r2e run` will create and repopulate
your data file.
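
A rough sketch of the idea, not the actual rss2email code.  The
`feed.NAME` section naming, the `url` key, and the `stored` mapping
that stands in for the data file are all assumptions made for
illustration:

```python
import configparser

CONFIG_TEXT = """
[feed.example]
url = http://example.com/feed.atom
"""


def load_feeds(config, stored):
    """Build feeds from the config, adding data-file state when present.

    `stored` maps feed names to dynamic state (hypothetical layout).
    """
    feeds = {}
    for section in config.sections():
        if not section.startswith('feed.'):
            continue
        name = section[len('feed.'):]
        # Start from stored state if there is any, else from nothing.
        feed = dict(stored.get(name, {}))
        feed['url'] = config[section]['url']
        feeds[name] = feed
    return feeds


config = configparser.ConfigParser()
config.read_string(CONFIG_TEXT)
# An empty `stored` mapping models a lost data file.
feeds = load_feeds(config, stored={})
```

The feed is created even though `stored` has no entry for it, which
is the behaviour described above.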

11 years agorss2email: fix index->send argument typo in feed.run() call.
W. Trevor King [Thu, 18 Oct 2012 16:22:28 +0000 (12:22 -0400)]
rss2email: fix index->send argument typo in feed.run() call.

11 years agorss2email: add debug logging for smtp_send() and sendmail_send().
W. Trevor King [Thu, 18 Oct 2012 16:22:05 +0000 (12:22 -0400)]
rss2email: add debug logging for smtp_send() and sendmail_send().

11 years ago.gitignore: add `build`, a setup.py byproduct.
W. Trevor King [Thu, 18 Oct 2012 16:09:48 +0000 (12:09 -0400)]
.gitignore: add `build`, a setup.py byproduct.

11 years agorss2email: work around pickle.load() messing with LOG.
W. Trevor King [Thu, 18 Oct 2012 16:06:30 +0000 (12:06 -0400)]
rss2email: work around pickle.load() messing with LOG.

I'm not sure why this is happening yet, but _pickle.load() is
duplicating the StreamHandlers in LOG and resetting the log level to
ERROR.  Work around that by saving the original level/handlers and
restoring them after the load() call.
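
A minimal sketch of the save-and-restore workaround, assuming a
module-level logger.  The logger name and the helper function are
hypothetical, and the sketch does not reproduce the underlying
problem, only the pattern used to guard against it:

```python
import io
import logging
import pickle

LOG = logging.getLogger('rss2email-sketch')  # hypothetical logger name


def load_preserving_log(stream, log=LOG):
    """Unpickle from `stream`, then restore the logger's level and handlers."""
    level = log.level
    handlers = list(log.handlers)  # copy, so later mutation cannot affect it
    try:
        return pickle.load(stream)
    finally:
        # Undo whatever the load did to the logger.
        log.setLevel(level)
        log.handlers[:] = handlers


LOG.setLevel(logging.DEBUG)
LOG.addHandler(logging.StreamHandler())
data = load_preserving_log(io.BytesIO(pickle.dumps({'feeds': []})))
```

Restoring in a `finally` clause keeps the logger intact even if the
load itself raises.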