W. Trevor King [Sat, 22 Feb 2014 03:00:47 +0000 (19:00 -0800)]
package_cache: Bump to version 0.2
Changes since v0.1:
* Added a fallback MIME type to fix server errors on unknown
extensions.
* Documented a transparent proxy iptables setup.
W. Trevor King [Fri, 21 Feb 2014 20:16:06 +0000 (12:16 -0800)]
server: Add a fallback MIME type (application/octet-stream)
Avoid:
Traceback (most recent call last):
  File "/.../wsgiref/handlers.py", line 137, in run
    self.result = application(self.environ, self.start_response)
  File "/.../site-packages/package_cache/server.py", line 50, in __call__
    environ=environ, start_response=start_response)
  File "/.../site-packages/package_cache/server.py", line 69, in _serve_request
    path=cache_path, environ=environ, start_response=start_response)
  File "/.../site-packages/package_cache/server.py", line 124, in _serve_file
    start_response('200 OK', list(headers.items()))
  File "/.../wsgiref/handlers.py", line 226, in start_response
    self.headers = self.headers_class(headers)
  File "/.../wsgiref/headers.py", line 39, in __init__
    self._convert_string_type(v)
  File "/.../wsgiref/headers.py", line 46, in _convert_string_type
    " of type str (got {0})".format(repr(value)))
AssertionError: Header names/values must be of type str (got None)
for portage-20140220.tar.xz.md5sum.
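A minimal sketch of the fallback idea, assuming the content type comes from the standard-library mimetypes module (the log does not show the actual server.py code). The names guess_content_type and FALLBACK_TYPE are hypothetical.

```python
import mimetypes

# Hypothetical constant; the commit subject names this value as the fallback.
FALLBACK_TYPE = 'application/octet-stream'


def guess_content_type(path):
    """Return a str Content-Type for path, never None.

    mimetypes.guess_type returns (None, None) for extensions it does
    not know, and wsgiref rejects None header values.
    """
    content_type, _encoding = mimetypes.guess_type(path)
    return content_type or FALLBACK_TYPE
```

With this in place the header value is always a str, so the wsgiref assertion above cannot fire for the Content-Type header.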
W. Trevor King [Fri, 21 Feb 2014 19:08:59 +0000 (11:08 -0800)]
README: Document transparent proxy setup
W. Trevor King [Fri, 21 Feb 2014 06:07:55 +0000 (22:07 -0800)]
package_cache: Bump to version 0.1
W. Trevor King [Fri, 21 Feb 2014 05:41:42 +0000 (21:41 -0800)]
README: Document Gentoo / OpenRC usage for distfiles caching
This is what I'm using this project for ;).
W. Trevor King [Fri, 21 Feb 2014 05:36:38 +0000 (21:36 -0800)]
contrib/openrc/init.d/package-cache: Don't include 'distfiles' in source
Portage will request ${MIRROR}/distfiles/${FILENAME}, so we don't want
the 'distfiles' part in the source directory. With the shift, we can
also use a single cache for both source (distfiles) and binary
packages (PKGDIR, which defaults to /usr/portage/packages) if there
are any binary packages on the upstream mirror.
W. Trevor King [Fri, 21 Feb 2014 02:35:08 +0000 (18:35 -0800)]
main: Add the logger name and process ID to the syslog formatter
When logging to stderr, there's no need to differentiate the logging
process. When everything's landing in the same system log, there is.
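A sketch of what such a formatter could look like, using the standard logging attributes %(name)s and %(process)d. The exact format string and the make_syslog_handler helper are assumptions, not the project's code.

```python
import logging
import logging.handlers

# Hypothetical format: logger name and PID first, as syslog tags usually read.
SYSLOG_FORMAT = '%(name)s[%(process)d]: %(levelname)s: %(message)s'


def make_syslog_handler(address='/dev/log'):
    """Build a syslog handler whose messages identify the logging process.

    The '/dev/log' default is the usual Linux syslog socket; adjust it
    for other systems.
    """
    handler = logging.handlers.SysLogHandler(address=address)
    handler.setFormatter(logging.Formatter(SYSLOG_FORMAT))
    return handler
```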
W. Trevor King [Fri, 21 Feb 2014 02:29:58 +0000 (18:29 -0800)]
contrib/openrc/init.d/package-cache: Add PC_OPTS
You can use this to tweak the logging:
$ cat /etc/conf.d/package-cache
PC_OPTS="-vvv"
W. Trevor King [Fri, 21 Feb 2014 01:04:47 +0000 (17:04 -0800)]
contrib/openrc/init.d/package-cache: Nest more deeply
Gentoo's doinitd can't handle renaming the files it installs, so
create a deeper tree where the init script can be called
'package-cache'. This layout leaves room for a future
contrib/openrc/conf.d/package-cache if we want to supply one.
W. Trevor King [Thu, 20 Feb 2014 23:44:46 +0000 (15:44 -0800)]
package-cache: Remove .py extension
Users shouldn't care what the implementation language is.
W. Trevor King [Thu, 20 Feb 2014 23:20:50 +0000 (15:20 -0800)]
README.rst: Add symlink for GitHub rendering
W. Trevor King [Thu, 20 Feb 2014 23:20:11 +0000 (15:20 -0800)]
README: Convert from Markdown to reStructuredText for PyPI
W. Trevor King [Thu, 20 Feb 2014 23:06:13 +0000 (15:06 -0800)]
contrib/openrc-init: Add an OpenRC init script
References:
http://www.gentoo.org/doc/en/handbook/handbook-x86.xml?part=2&chap=4#doc_chap4
https://wiki.gentoo.org/wiki/OpenRC
W. Trevor King [Thu, 20 Feb 2014 22:31:53 +0000 (14:31 -0800)]
main: Teach package-cache the --syslog option
W. Trevor King [Thu, 20 Feb 2014 22:20:42 +0000 (14:20 -0800)]
main: Teach package-cache the --verbose option
For adjusting the verbosity of the package-level logger.
Also add a simple LoggingRequestHandler class so WSGI-side logging is
routed through our loggers instead of being written directly to
stderr.
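One way a repeatable --verbose flag can drive the package-level logger, as a sketch. The mapping from flag count to level (start at ERROR, drop one level per -v) and the helper names are assumptions; only the option name and the 'package_cache' package come from the log.

```python
import argparse
import logging

LOG = logging.getLogger('package_cache')


def level_from_verbosity(verbose, base=logging.ERROR):
    """Lower the threshold by one logging level per -v, stopping at DEBUG."""
    return max(logging.DEBUG, base - 10 * verbose)


def parse_args(argv):
    parser = argparse.ArgumentParser(prog='package-cache')
    parser.add_argument(
        '-v', '--verbose', action='count', default=0,
        help='increase log verbosity (repeatable)')
    return parser.parse_args(argv)


def configure_logging(argv):
    """Parse argv and apply the resulting level to the package logger."""
    args = parse_args(argv)
    LOG.setLevel(level_from_verbosity(args.verbose))
    return args
```

Under this mapping, -vvv lowers the threshold from ERROR to DEBUG.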
W. Trevor King [Thu, 20 Feb 2014 22:19:50 +0000 (14:19 -0800)]
package_cache: Add a package-level logger
This gives us a single location for configuring verbosity, handlers,
etc., for submodule loggers.
W. Trevor King [Thu, 20 Feb 2014 22:19:22 +0000 (14:19 -0800)]
server: Log source-requests and errors
W. Trevor King [Thu, 20 Feb 2014 21:59:45 +0000 (13:59 -0800)]
README.md: Explain what this is all about
W. Trevor King [Thu, 20 Feb 2014 21:57:54 +0000 (13:57 -0800)]
Run update-copyright.py
W. Trevor King [Thu, 20 Feb 2014 21:56:54 +0000 (13:56 -0800)]
package-cache.py: Add a '# Copyright' stub for update-copyright.py
W. Trevor King [Thu, 20 Feb 2014 21:55:28 +0000 (13:55 -0800)]
.update-copyright.conf: Add copyright configuration
Use my external update-copyright package to maintain copyright blurbs.
http://pypi.python.org/pypi/update-copyright/
W. Trevor King [Thu, 20 Feb 2014 21:54:56 +0000 (13:54 -0800)]
.gitignore: Ignore Python-3 side effects
W. Trevor King [Thu, 20 Feb 2014 21:54:15 +0000 (13:54 -0800)]
setup.py: Package package-cache with distutils
The AUTHORS file doesn't exist yet, but we'll have it soon.
W. Trevor King [Thu, 20 Feb 2014 21:42:19 +0000 (13:42 -0800)]
server: Use the Last-Modified header to set last-modified time (mtime)
This also sets the access time to the same value, but we're only
calling _get_file if we're about to serve the file to a client, which
will clobber any value of atime set here.
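A sketch of the mtime handling described above, assuming the header is parsed with email.utils and applied with os.utime. The function name set_mtime_from_header is hypothetical.

```python
import email.utils
import os


def set_mtime_from_header(path, last_modified):
    """Set path's atime and mtime from an HTTP Last-Modified value.

    Returns the timestamp used, or None if the header cannot be parsed.
    """
    parsed = email.utils.parsedate_tz(last_modified)
    if parsed is None:
        return None
    mtime = email.utils.mktime_tz(parsed)  # seconds since the epoch, UTC
    # os.utime takes (atime, mtime); both get the same value here.
    os.utime(path, (mtime, mtime))
    return mtime
```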
W. Trevor King [Thu, 20 Feb 2014 21:10:50 +0000 (13:10 -0800)]
server: Check for relative paths to invalid directories
Avoid leaking information to requests like:
http://localhost:4000/../../etc/passwd
PEP 333 isn't clear on what values are allowed for PATH_INFO, but it
does mention them as "CGI-style" [1]. RFC 3875, defining CGI 1.1,
says about PATH_INFO [2]:
The server MAY impose restrictions and limitations on what values it
permits for PATH_INFO, and MAY reject the request with an error if
it encounters any values considered objectionable.
I can't actually exploit this with Python's reference WSGI
implementation. When I tried to fetch /../../etc/passwd with Wget, I
got '/etc/passwd' as PATH_INFO, but this seems like an
important-enough risk that a little extra checking would not be wrong
;).
Also drop the urlparse call, because PATH_INFO is already the parsed
path portion of the URL.
[1]: http://legacy.python.org/dev/peps/pep-0333/#specification-details
[2]: http://tools.ietf.org/search/rfc3875#section-4.1.5
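A sketch of one way to do this kind of check: resolve the requested path against the cache root and refuse anything that lands outside it. The function name and the use of os.path.commonpath (Python 3.5+) are assumptions; the commit's actual check is not shown in this log.

```python
import os.path


def resolve_cache_path(cache_root, path_info):
    """Map PATH_INFO to a path under cache_root, or raise ValueError."""
    root = os.path.abspath(cache_root)
    # Strip leading slashes so join() cannot discard the root.
    candidate = os.path.abspath(os.path.join(root, path_info.lstrip('/')))
    if os.path.commonpath([root, candidate]) != root:
        raise ValueError('path escapes the cache: {!r}'.format(path_info))
    return candidate
```

A server could turn the ValueError into a 4xx response instead of touching the filesystem.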
W. Trevor King [Thu, 20 Feb 2014 20:48:22 +0000 (12:48 -0800)]
server: Create file paths as needed
Add support for non-flat source file layouts (e.g. relative paths that
contain directory parts).
Instead of creating the cache directory and possible per-file
subdirectories separately, just create per-file directories on the
fly. This simplifies the code, but means that you won't die until the
first request if your server doesn't have permission to create these
directories.
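The on-the-fly approach can be sketched as follows. open_cache_file is a hypothetical helper, and exist_ok=True is one way to tolerate directories that already exist.

```python
import os


def open_cache_file(cache_path):
    """Open cache_path for writing, creating parent directories as needed.

    A permissions problem shows up here, on the first request that
    needs the directory, rather than at server start-up.
    """
    directory = os.path.dirname(cache_path)
    if directory:
        # Positional argument: see the makedirs keyword fix further down the log.
        os.makedirs(directory, exist_ok=True)
    return open(cache_path, 'wb')
```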
W. Trevor King [Thu, 20 Feb 2014 20:16:47 +0000 (12:16 -0800)]
server: Implement Server._get_file
It would be nice to use sendfile to copy between the HTTPResponse
object [1] and the cache file. Linux supports arbitrary files (not
just sockets) for out_fd since 2.6.33, so the "to the cache file" side
works. However, from sendfile(2) [2]:
The in_fd argument must correspond to a file which supports
mmap(2)-like operations (i.e., it cannot be a socket).
So reading from the HTTPResponse is not going to happen (yet). Once
Linux gains support for socket in_fd, we could use something like:
  _os.sendfile(
      f.fileno(), response.fileno(), offset=None, count=content_length)
[1]: http://docs.python.org/3/library/http.client.html#httpresponse-objects
[2]: http://man7.org/linux/man-pages/man2/sendfile.2.html
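Until sendfile can read from a socket, a userspace copy loop is the usual substitute. This is a sketch using shutil.copyfileobj, not necessarily what _get_file does; copy_response_to_cache and the chunk size are hypothetical.

```python
import io
import shutil


def copy_response_to_cache(response, cache_file, chunk_size=64 * 1024):
    """Copy a readable response object into cache_file in fixed-size chunks."""
    shutil.copyfileobj(response, cache_file, chunk_size)


# Usage with in-memory stand-ins for the HTTPResponse and the cache file.
source = io.BytesIO(b'x' * 200000)
cache = io.BytesIO()
copy_response_to_cache(source, cache)
```

Any object with a read(size) method works as the source, which includes http.client.HTTPResponse.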
W. Trevor King [Thu, 20 Feb 2014 19:19:22 +0000 (11:19 -0800)]
server: Don't use a keyword for the response_headers argument to start_response
Despite being documented as response_headers [1], using a keyword
argument raises a TypeError:
TypeError: start_response() got an unexpected keyword argument 'response_headers'
[1]: http://legacy.python.org/dev/peps/pep-0333/#the-start-response-callable
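A minimal WSGI application showing the positional call. The wsgiref traceback in the MIME-type commit above shows that its start_response names the second parameter headers, which would explain why the response_headers keyword fails there. The app below is illustrative only.

```python
def app(environ, start_response):
    """Tiny WSGI app that passes status and headers positionally."""
    body = b'hello\n'
    headers = [
        ('Content-Type', 'text/plain'),
        ('Content-Length', str(len(body))),
    ]
    # Positional arguments work whatever the server names its parameters.
    start_response('200 OK', headers)
    return [body]
```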
W. Trevor King [Thu, 20 Feb 2014 19:16:21 +0000 (11:16 -0800)]
server: Don't use a keyword for the path argument to getmtime
Despite being documented as path [1], using a keyword argument
raises a TypeError:
TypeError: getmtime() got an unexpected keyword argument 'path'
[1]: http://docs.python.org/3/library/os.path.html#os.path.getmtime
W. Trevor King [Thu, 20 Feb 2014 19:14:54 +0000 (11:14 -0800)]
server: Don't use a keyword for the path argument to getsize
Despite being documented as path [1], using a keyword argument raises
a TypeError:
TypeError: getsize() got an unexpected keyword argument 'path'
[1]: http://docs.python.org/3/library/os.path.html#os.path.getsize
W. Trevor King [Thu, 20 Feb 2014 19:11:24 +0000 (11:11 -0800)]
server: Don't use a keyword for the urlstring argument to urlparse
Despite being documented as urlstring [1], using a keyword argument
raises a TypeError:
TypeError: urlparse() got an unexpected keyword argument 'urlstring'
[1]: http://docs.python.org/3/library/urllib.parse.html#urllib.parse.urlparse
W. Trevor King [Thu, 20 Feb 2014 19:05:34 +0000 (11:05 -0800)]
server: Don't use a keyword for the path argument to makedirs
Despite being documented as path [1], using a keyword argument raises
a TypeError:
TypeError: makedirs() got an unexpected keyword argument 'path'
[1]: http://docs.python.org/3/library/os.html#os.makedirs
W. Trevor King [Thu, 20 Feb 2014 19:02:29 +0000 (11:02 -0800)]
server: Create the cache directory if it doesn't already exist
W. Trevor King [Thu, 20 Feb 2014 19:00:16 +0000 (11:00 -0800)]
main: Add an argparse-based command line interface
And a package-cache.py wrapper script to call it.
W. Trevor King [Thu, 20 Feb 2014 18:50:41 +0000 (10:50 -0800)]
server: Stub out a WSGI server
This still needs source-fetching and Content-Range support, but it
should handle serving from the cache well enough.
W. Trevor King [Thu, 20 Feb 2014 18:50:17 +0000 (10:50 -0800)]
package_cache: Create a Python package with a version
W. Trevor King [Thu, 20 Feb 2014 17:21:07 +0000 (09:21 -0800)]
COPYING: Use the GPLv3
Fresh download from http://www.gnu.org/licenses/gpl-3.0.txt.