= Security analysis of irker =

This is an analysis of security and DoS vulnerabilities associated
with irker, exploring and explaining certain design choices.  Much of
it derives from a code audit and report by Daniel Franke.

== Assumptions and Goals ==

We begin by stating some assumptions about how irker will be deployed,
and articulating a set of security goals.

Communication flow in an irker deployment looks like this:

             Committers
                 |
                 |
        Version-control repositories
                 |
                 |
            irkerhook.py
                 |
                 |
               irkerd
                 |
                 |
             IRC servers

Here are our assumptions:

1. The repositories are hosted on public forge sites such as
SourceForge, GitHub, Gitorious, Savannah, or Gna and must be
accessible to untrusted users.

2. Repository project owners can set properties on their repositories
(including but not limited to irker.*), and may be able to set custom
post-commit hooks which can execute arbitrary code on the repository
server. In particular, these people may be able to modify the local
copy of irkerhook.py.

3. The machine which hosts irkerd has the same owner as the machine which
hosts the repo; these machines are possibly but not necessarily
one and the same.

4. The network is protected by a perimeter firewall, and only a
trusted group is able to emit arbitrary packets from inside the
perimeter; committers are not necessarily part of this group.

5. irkerd communicates with IRC servers over the open internet,
and an IRC server's administrator is assumed to hold no position of
trust with any other party.

We can, accordingly, identify the following groups of security
principals:

A. irker administrators.
B. Project committers.
C. Project owners.
D. IRC server administrators.
E. Other people on irker's internal network.
F. irkerd-IRC men-in-the-middle (i.e. people who control the network path
   between irkerd and the IRC server).
G. Random people on the internet.

Our security goals for irker can be enumerated as follows:

* Control: We don't want anyone outside group A gaining control of
  the machines which host irkerd or the git repos.

* Availability: Only group A should be able to deny or degrade
  irkerd's ability to receive commit messages and relay them to the
  IRC server. We recognize and accept as inevitable that MITMs (groups
  E and F) can do this too (by ARP spoofing, cable-cutting, etc.).
  But, in particular, we would like irker-mediated services to be
  resilient against DoS (denial of service) attacks.

* Authentication/integrity: Notifications should be truthful, i.e.,
  commit messages sent to IRC channels should actually reflect that a
  corresponding commit has taken place. We accept that groups A, C,
  D, and E can violate this property.

* Secrecy: irker shouldn't aid spammers (group G) in harvesting
  committers' email addresses.

* Auditability: If people abuse irkerd, we want to be able to identify
  the abusive account or IP address.

== Control Issues ==

We have audited the irker and irkerhook.py code for exploitable
vulnerabilities.  We have not found any in the code itself, but the
fact that irkerhook.py relies on external binaries to mine data out
of its repository opens up a well-known set of vulnerabilities if a
malicious user is able to insert binaries in a carelessly-set
execution path.  Normal precautions against this should be taken.
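As an illustration of the usual precautions, a hook can resolve each
external tool against a fixed list of trusted directories and run it
with a minimal environment, so that a directory planted earlier in
PATH is never consulted.  What follows is a minimal sketch, not code
taken from irkerhook.py; the helper name run_tool and the SAFE_PATH
value are assumptions chosen for the example.

```python
import shutil
import subprocess

# Assumption: directories trusted on this host; adjust as needed.
SAFE_PATH = "/usr/bin:/bin"

def run_tool(name, *args):
    """Run an external tool resolved only from SAFE_PATH, without a shell."""
    exe = shutil.which(name, path=SAFE_PATH)
    if exe is None:
        raise FileNotFoundError("%s not found in %s" % (name, SAFE_PATH))
    # Pass an explicit environment so the caller's PATH is not inherited.
    env = {"PATH": SAFE_PATH, "LC_ALL": "C"}
    result = subprocess.run([exe] + list(args), env=env, check=True,
                            stdout=subprocess.PIPE, universal_newlines=True)
    return result.stdout
```

With this helper, a call such as run_tool("git", "log", "-1") would
only ever execute a git binary found in one of the listed directories.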

== Availability ==

=== Solved problems ===

When the original implementation of irkerd saw a nick collision it
generated new nicks in a predictable sequence. A malicious IRC user
could have continuously changed their own nick to the next one that
irkerd was going to try. Some randomness has been added to nick
generation to prevent this.
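The idea can be sketched as follows.  This is an illustration rather
than irkerd's actual code; the function name fallback_nick, the
four-digit suffix, and the 16-character length cap are all
assumptions.

```python
import random

# Use the operating system's entropy source so the sequence of
# fallback nicks cannot be predicted from earlier ones.
_rng = random.SystemRandom()

def fallback_nick(base, maxlen=16):
    """Return base plus a random four-digit suffix, at most maxlen long."""
    suffix = str(_rng.randrange(1000, 10000))
    return base[:maxlen - len(suffix)] + suffix
```

An attacker who sees one fallback nick learns nothing about the next
one, so squatting on it ahead of time requires guessing among
thousands of candidates per attempt.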

=== Unsolved problems ===

DoS attacks on any networked application can never be completely
prevented, only mitigated by forcing attackers to invest more
resources.  Here we consider the easiest attack paths against irker,
and possible countermeasures.

irker handles each connection to a particular IRC server in a separate
thread - actually, due to server limits on open channels per
connection, there may be multiple sessions per server. This may not
scale well, especially on 32-bit architectures.

Thread instance overhead, combined with the lack of any restriction on
how many URLs can appear in the 'to' list, is a DoS vulnerability. If
a repository's properties specify that notifications should go to more
than about 500 unique hostnames, then on 32-bit architectures we'll
hit the 4GB cap on virtual memory (even while the resident set size
remains small).

Another ceiling to watch out for is the ulimit on file descriptors,
which defaults to 1024 on many Linux systems but can safely be set
much larger. Each connection instance costs a file descriptor.
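The two ceilings can be put side by side with a little arithmetic.
The sketch below assumes an 8 MiB default thread stack (a common Linux
default, not a figure stated in the audit) and reserves a handful of
descriptors for everything other than IRC sessions; the function name
session_ceiling and the reserved_fds parameter are illustrative.

```python
import resource

MIB = 1024 * 1024

def session_ceiling(address_space, stack_size, fd_limit, reserved_fds=16):
    """Estimate how many one-thread-per-connection sessions can exist.

    Each session is assumed to cost one thread stack of virtual memory
    and one file descriptor; the smaller of the two budgets wins.
    """
    by_memory = address_space // stack_size
    by_fds = max(fd_limit - reserved_fds, 0)
    return min(by_memory, by_fds)

# 32-bit case: 4 GiB of address space, 8 MiB stacks, 1024 descriptors.
print(session_ceiling(4096 * MIB, 8 * MIB, fd_limit=1024))  # 512

# The descriptor limit of the current process, for comparison.
soft_fds, _hard_fds = resource.getrlimit(resource.RLIMIT_NOFILE)
print("soft descriptor limit:", soft_fds)
```

Under these assumptions virtual memory runs out first on 32-bit
systems, at roughly the 500 sessions mentioned above, while on 64-bit
systems the descriptor limit becomes the binding constraint.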

We consider some possible ways of addressing the problem:

1. Limit the number of URLs in a request.  Pretty painless - it will
be very rare that anyone wants to specify a larger set than a project
channel plus freenode #commits - but also ineffective.  A malicious
hook could achieve DoS simply by spamming lots of requests.

2. Limit the total number of requests that can be queued. Completely
ineffective - just sets a target for the DoS attack.

3. Limit the number of requests that can be queued by source IP address.
This might be worth doing; it would stymie a single-source DoS attack through
a publicly-exposed irkerd, though not a DDoS by a botnet.  But there isn't
a lot of win here for a properly installed irker (e.g. behind a firewall),
which is typically going to get all its requests from a single repo host
anyway.

4. Rate-limit requests by source IP address - that is, after any request
discard additional ones during some timeout period.  Again, good for
stopping a single-source DoS against an exposed irker, won't stop a
DDoS.  The real problem, though, is that any such rate limit might interfere
with legitimate high-volume use by a very active repo site.

After this we appear to have run out of easy options, as source IP address
is the only thing irkerd can see that an attacker can't spoof.
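For concreteness, options 3 and 4 could be combined along the
following lines.  This is a sketch only: irkerd does not implement it,
and the class name SourceLimiter and its default thresholds are
assumptions picked for the example.

```python
import time
from collections import defaultdict, deque

class SourceLimiter:
    """Per-source cap on queued requests plus a minimum request interval."""

    def __init__(self, max_queued=100, min_interval=1.0, clock=time.monotonic):
        self.max_queued = max_queued       # option 3 threshold
        self.min_interval = min_interval   # option 4 timeout, in seconds
        self.clock = clock
        self.queues = defaultdict(deque)   # source IP -> pending requests
        self.last_accept = {}              # source IP -> time of last accept

    def submit(self, source_ip, request):
        """Queue the request and return True, or return False if dropped."""
        now = self.clock()
        last = self.last_accept.get(source_ip)
        if last is not None and now - last < self.min_interval:
            return False                   # still inside the timeout period
        queue = self.queues[source_ip]
        if len(queue) >= self.max_queued:
            return False                   # this source's queue is full
        queue.append(request)
        self.last_accept[source_ip] = now
        return True
```

The drawback noted under option 4 is visible here: a busy forge that
legitimately sends bursts from one address would see requests dropped
unless min_interval is set very low.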

=== Future directions ===

One way we could mitigate some availability risks is by reaping old
sessions when we're near resource limits.  An ordinary DoS attack
would then be prevented from completely blocking all message traffic;
the cost would be a whole lot of join/leave spam due to connection
churn.
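One possible shape for such reaping is a least-recently-used pool.
The sketch below is hypothetical and not part of irkerd; the class
name SessionPool and the opener/closer callables stand in for whatever
actually opens and closes an IRC session.

```python
from collections import OrderedDict

class SessionPool:
    """Keep at most `limit` sessions, reaping the least recently used."""

    def __init__(self, limit, opener, closer):
        if limit < 1:
            raise ValueError("limit must be at least 1")
        self.limit = limit
        self.opener = opener      # callable: server -> session
        self.closer = closer      # callable: session -> None
        self.sessions = OrderedDict()

    def get(self, server):
        """Return a session for server, opening one if necessary."""
        if server in self.sessions:
            self.sessions.move_to_end(server)   # mark as recently used
            return self.sessions[server]
        while len(self.sessions) >= self.limit:
            # Evict the oldest entry; each eviction is a visible "leave".
            _server, session = self.sessions.popitem(last=False)
            self.closer(session)
        session = self.opener(server)
        self.sessions[server] = session
        return session
```

Every eviction followed by a later reconnect shows up on the channel
as a leave and a join, which is the churn cost described above.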

== Authentication/Integrity ==

One way to help prevent DoS attacks would be in-band authentication -
requiring irkerd submitters to present a credential along with each
message submission.  In principle this, if it existed, could also be used
to verify that a submitter is authorized to issue notifications with
respect to a given project.

We rejected this approach. The design goal for irker was to make
submissions fast, cheap, and stateless; baking an authentication
system directly into the irkerd codebase would have conflicted with
these objectives, not to mention probably becoming the camel's nose
for a godawful amount of code bloat.

The deployment advice in the installation instructions assumes that
irkerd submitters are "authenticated" by being inside a firewall - that is,
messages are issued from an intranet and it can be trusted that anyone
issuing messages from within a given intranet is authorized to do so.
This fits the assumption that irker instances will run on forge sites
receiving requests from instances of irkerhook.py.

If this is *not* the case (e.g. the network between a hook and irkerd
has to be considered hostile) we could hide irkerd behind an instance
of spiped <http://www.tarsnap.com/spiped.html> or an instance of
stunnel <http://www.stunnel.org>. These would be far superior to
in-band authentication in that they would leave the job to specialist
code not in any way coupled to irkerd's internals, minimizing
global complexity and failure modes.
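From the submitter's side, such a tunnel just looks like a local TCP
endpoint.  The sketch below shows a client handing a request to that
endpoint; it is not code from irkerhook.py.  The address and port, the
function name submit, and the "privmsg" field name (taken from irker's
JSON request format as we understand it) should all be treated as
assumptions.

```python
import json
import socket

# Assumption: a spiped or stunnel client listens here and forwards
# the traffic to the real irkerd over an authenticated channel.
TUNNEL_HOST = "127.0.0.1"
TUNNEL_PORT = 6659

def submit(to, text, host=TUNNEL_HOST, port=TUNNEL_PORT):
    """Send one JSON request to the local tunnel endpoint over TCP."""
    request = {"to": to, "privmsg": text}
    payload = json.dumps(request).encode("utf-8") + b"\n"
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(payload)
```

Nothing in the request itself changes; authentication and encryption
are handled entirely by the tunnel, which is the separation of
concerns argued for above.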

=== Future directions ===

There is presently no direct support for spiped or stunnel in
irkerhook.py.  We'd take patches for this.

== Secrecy ==

irkerd has no inherent secrecy risks.

The distributed version of irkerhook.py removes the host part of
author addresses specifically in order to prevent address harvesting
from the notifications.
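The transformation amounts to keeping only the local part of the
address.  A minimal sketch, with the function name strip_host chosen
for illustration rather than taken from irkerhook.py:

```python
def strip_host(address):
    """Return the local part of an email address, dropping '@host'."""
    return address.split("@", 1)[0]
```

For example, strip_host("jrh@example.com") yields "jrh", which still
identifies the committer to channel readers but is useless to a
harvester.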

== Auditability ==

We previously noted that source IP address is the only thing irker can
see that an attacker can't spoof.  This makes auditability difficult
unless we impose conventions on the notifications passing through it.

The irkerhook.py that we ship inherits an auditability property from
the CIA service it was designed to replace: the first field of every
notification (terminated by a colon) is the name of the issuing
project.  The only other competitor to replace CIA known to us
(kgb_bot) shares this property.

In the general case we cannot guarantee this property against
groups A and F.
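Where the convention holds, the issuing project can be recovered
mechanically from a logged notification.  The helper below is a
hypothetical sketch (the name issuing_project is ours), and the sample
notification in the note that follows is invented for illustration.

```python
def issuing_project(notification):
    """Return the text before the first colon, or None if there is none."""
    project, sep, _rest = notification.partition(":")
    return project.strip() if sep else None
```

Given "irker: esr master * security.txt: fix typos" this returns
"irker"; a message with no colon at all returns None and can be
flagged as not following the convention.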

== Risks relative to centralized services ==

irker and irkerhook.py were written as a replacement for the
now-defunct CIA notification service.  The author has written
a critique of that service: "CIA and the perils of overengineering"
at <http://esr.ibiblio.org/?p=4540>.  It is thus worth considering how
a risk assessment of CIA compares to this one.

The principal advantages of CIA from a security point of view were (a)
it provided a single point at which spam filtering and source blocking
could be done with benefit to all projects using the service, and (b)
since it had to have a database anyway for routing messages to project
channels, the incremental overhead for an authentication feature would
have been relatively low.

As a matter of fact rather than theory, CIA never fully exploited
either possibility.  Anyone could create a CIA project entry with
fanout to any desired set of IRC channels.  Notifications were not
authenticated, so anyone could masquerade as a member of any project.
The only check on abuse was human intervention to source-block
spammers, and this was by no means completely effective - spam shipped
via CIA was occasionally seen on the freenode #commits channel.

The principal security disadvantage of CIA was that it meant the
entire notification system was subject to single-point failure due
to software or hosting failures on cia.vc, or to DoS attacks
against the server.  While there is no evidence that the site
was ever deliberately DoSed, failures were sufficiently common
that a half-hearted DoS attack might not even have been noticed.

Despite the absence of authentication, irker instances on
properly firewalled intranets do not obviously pose additional
spamming risks beyond those incurred by the CIA service.  The
overall robustness of the notification system as a whole should
be greatly improved.

== Conclusions ==

The security and DoS issues irker has are not readily addressable by
changing the irker codebase itself, short of a complete (much more
complex and heavyweight) redesign.  They are largely implicit risks of
its operating environment and must be managed by properly controlling
access to irker instances.