Since the beginning, writing IAM policies with the minimum necessary permissions has been hard. Some services don’t have resource-level permissions (you have to grant to *), but then later they do. When a service has resource-level permissions, it may only be for some of its actions (the rest still need *). Some services have their own condition keys (separate from the global ones) that may or may not help you tighten control. Et cetera. The details are documented differently for each service, and putting together a tight policy takes a lot of hunting and testing.
Amazon made it easier! There’s new magic in the IAM UI to help you create policies. It has some limitations, but it’s a big improvement. Here are some of the things it can do that I used to have to do myself:
Knows which S3 permissions require the resource list to include a bucket name and which require the bucket name and an object path.
Tries to group permissions and resources into statements when it results in equivalent access (but sometimes ends up granting extra access, see below).
Knows when a service doesn’t support resource-level permissions.
Knows about the condition keys specific to each service (not just the global ones).
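Here’s a rough sketch of the bucket-versus-object distinction I used to work out by hand. The function name, the bucket name, and the two action sets are my own illustration (and only cover a handful of well-known S3 actions), not anything the visual editor exposes:

```python
import json

# Illustrative split only: a few well-known S3 actions, not a complete table.
BUCKET_LEVEL = {"s3:ListBucket", "s3:GetBucketLocation"}
OBJECT_LEVEL = {"s3:GetObject", "s3:PutObject", "s3:DeleteObject"}


def s3_statements(bucket, prefix, actions):
    """Build one Allow statement per resource type for the requested actions."""
    unknown = sorted(set(actions) - BUCKET_LEVEL - OBJECT_LEVEL)
    if unknown:
        raise ValueError(f"not classified in this sketch: {unknown}")
    bucket_actions = sorted(a for a in actions if a in BUCKET_LEVEL)
    object_actions = sorted(a for a in actions if a in OBJECT_LEVEL)
    statements = []
    if bucket_actions:
        # Bucket-level actions take the bucket ARN with no object path.
        statements.append({
            "Effect": "Allow",
            "Action": bucket_actions,
            "Resource": f"arn:aws:s3:::{bucket}",
        })
    if object_actions:
        # Object-level actions take the bucket ARN plus an object path.
        statements.append({
            "Effect": "Allow",
            "Action": object_actions,
            "Resource": f"arn:aws:s3:::{bucket}/{prefix}*",
        })
    return statements


policy = {
    "Version": "2012-10-17",
    "Statement": s3_statements(
        "example-bucket", "reports/", ["s3:ListBucket", "s3:GetObject"]
    ),
}
print(json.dumps(policy, indent=2))
```

The point is only that s3:ListBucket has to name the bucket while s3:GetObject has to name objects inside it. Mixing the two up produces a policy that looks right but doesn’t grant what you expect.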
There are some limitations:
Doesn’t deduplicate. If you add permissions, it doesn’t go back and put them into existing statements; it just adds new statements that may duplicate parts of old ones.
Only generates JSON, so if you’re writing a YAML CloudFormation template you’ll need to translate the output.
Seems to have limited form validation on conditions. You can enter values that will never match because the API calls for that service can’t contain what you entered (making the statement a no-op).
Can end up grouping permissions in a way that makes some resource restrictions meaningless and grants more access than might be expected.
Sometimes it messes up the syntax. Seems to happen if you don’t put exactly what it expects into the forms.
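To make the grouping problem concrete, here’s a minimal sketch of the kind of check I’d run when touching up a generated policy. It’s a simplification I’m assuming for illustration: it compares action and resource strings literally, so it doesn’t evaluate wildcards, Deny statements, or conditions, and the bucket names are made up.

```python
from itertools import product


def as_list(value):
    """IAM accepts either a single string or a list for Action and Resource."""
    return [value] if isinstance(value, str) else list(value)


def grants(statements):
    """Expand Allow statements into the set of (action, resource) pairs granted.

    Simplification: strings are compared literally; wildcards, Deny
    statements, and conditions are not evaluated.
    """
    pairs = set()
    for stmt in statements:
        if stmt.get("Effect") != "Allow":
            continue
        pairs.update(product(as_list(stmt["Action"]), as_list(stmt["Resource"])))
    return pairs


# Two tight statements: read from one bucket, write to another (made-up names).
separate = [
    {"Effect": "Allow", "Action": "s3:GetObject",
     "Resource": "arn:aws:s3:::bucket-a/*"},
    {"Effect": "Allow", "Action": "s3:PutObject",
     "Resource": "arn:aws:s3:::bucket-b/*"},
]
# The same actions and resources grouped into a single statement.
grouped = [
    {"Effect": "Allow",
     "Action": ["s3:GetObject", "s3:PutObject"],
     "Resource": ["arn:aws:s3:::bucket-a/*", "arn:aws:s3:::bucket-b/*"]},
]

extra = grants(grouped) - grants(separate)
for action, resource in sorted(extra):
    print(f"extra grant: {action} on {resource}")
```

Grouping turns two grants into four: the single statement also allows reading from bucket-b and writing to bucket-a. The same expansion shows when a newly added statement only repeats pairs that an older one already covers.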
So there are a few problems, but this is still way better than it was before! My plan is to use the visual editor to write policies, then go through and touch them up afterward. Based on what I’ve seen so far, this cuts the time it takes me to develop policies by about 30%.
EC2 Security Groups are stateful: they track connections rather than judging each packet on its own. This statefulness is why you can let host A SSH to host B just by allowing outgoing SSH on A’s SG and incoming SSH on B’s SG. B doesn’t need to allow outgoing SSH because it knows the return traffic is part of a connection that was already allowed. Similarly, A doesn’t need to allow incoming SSH.
Here’s the detail of today’s post: if the Security Group sees traffic as part of an established connection, it’ll allow it even if its rules say not to. OK, now let’s break a Security Group.
Two hosts, testa and testb. One SG for each, both allowing all outgoing traffic. Testb’s SG allows incoming TCP on port 4321 (an arbitrary port I’m using for this test).
To test traffic flow, I’m going to use nc. It’s a common Linux utility that sends and receives TCP traffic:
Listen: nc -l [port]
Send: nc [host] [port]
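If nc isn’t installed, the same two roles can be approximated with Python’s standard socket module. This is a minimal sketch, not a replacement for nc; the function names are mine, and "testb" in the usage comments assumes that hostname resolves from testa.

```python
import socket


def listen(port, host="0.0.0.0"):
    """Roughly `nc -l [port]`: bind and return a listening TCP socket."""
    return socket.create_server((host, port))


def receive(server):
    """Accept one connection and return everything sent over it as text."""
    conn, _addr = server.accept()
    chunks = []
    with conn:
        while True:
            data = conn.recv(4096)
            if not data:  # the sender closed the connection
                break
            chunks.append(data)
    return b"".join(chunks).decode(errors="replace")


def send_lines(host, port, lines):
    """Roughly `nc [host] [port]`: open one connection and send each line."""
    with socket.create_connection((host, port)) as conn:
        for line in lines:
            conn.sendall((line + "\n").encode())


# On testb:  print(receive(listen(4321)))
# On testa:  send_lines("testb", 4321, ["hello"])
```

Unlike interactive nc, send_lines closes the connection as soon as it has sent its lines, so for the test below (where the connection has to stay open while the rule is removed) nc is still the easier tool.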
(screenshots of shell output below)
Listen on port 4321 on testb.
Start a connection from testa to port 4321 on testb.
Send a message. It’s delivered, as expected.
Remove testb’s SG rule allowing port 4321.
Send another message through the connection. It will get through! There’s no rule to allow it, but it still gets through.
To show nothing else was going on, let’s redo the test with the security group as it is now (no rule allowing 4321).
Quit nc on testa to close the connection. You’ll see it also close on testb.
Listen on port 4321 on testb.
Start a connection from testa to port 4321 on testb.
Send a message. Not delivered. This time there was no established connection, so the traffic was compared to the SG’s rules. There was no rule to allow it, so it was denied.
(where we listened)
Only two messages got through.
(where we sent)
We sent three messages. The last two were sent while the SG had the same rules, but the first of those was allowed and the second was denied.
The rules in EC2 Security Groups don’t apply to open (established) TCP connections. If you need to ensure traffic isn’t flowing between two instances you can’t just remove rules from your SGs. You have to close all open connections.
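The behaviour from the test can be summed up in a toy model. To be clear, this is my own simplification of what I observed, not how AWS implements Security Groups; the class and method names are invented for illustration.

```python
class ToySecurityGroup:
    """Toy model of the behaviour observed above, not AWS's implementation."""

    def __init__(self):
        self.allowed_ports = set()  # inbound TCP ports that have an allow rule
        self.established = set()    # tracked (source, port) connections

    def connect(self, source, port):
        """A new connection is checked against the rules."""
        if port not in self.allowed_ports:
            return False
        self.established.add((source, port))
        return True

    def send(self, source, port):
        """Traffic on a tracked connection is allowed without a rule check."""
        return (source, port) in self.established

    def close(self, source, port):
        """Closing the connection removes it from tracking."""
        self.established.discard((source, port))


sg = ToySecurityGroup()
sg.allowed_ports.add(4321)
print(sg.connect("testa", 4321))  # True: the rule allows the new connection
print(sg.send("testa", 4321))     # True: first message delivered
sg.allowed_ports.discard(4321)    # remove the rule
print(sg.send("testa", 4321))     # True: still delivered on the open connection
sg.close("testa", 4321)
print(sg.connect("testa", 4321))  # False: the new connection is checked and denied
```

The rules are only consulted in connect. Once a connection is in the tracked set, removing the rule changes nothing for it until the connection is closed.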