Securely issuing HTTP requests from the cloud

Being able to issue HTTP requests from our cloud providers has opened a whole new realm of possibilities for automating our workflows and services. Now, with systems like headless chrome, the possibilities seem nearly endless. Yet time and time again I see companies falling victim to cloud metadata attacks or other forms of Server Side Request Forgery (SSRF). The underlying vulnerability is this: you are taking untrusted user input, in the form of a URL, and issuing a request from your cloud environment. Since the request originates inside your environment, it can potentially reach any of your internal systems. Cloud metadata services contain a wealth of information about the environment the system is running in, and in some cases accessing them can lead to a complete compromise.

If you are looking for an excellent description of these types of attacks, check out MWR Labs' write-up. Simply blocking URLs is a losing game, as will be discussed below. While this post will primarily focus on AWS, most of these techniques apply across all cloud providers. Since linkai.io's hakken system uses web browsers extensively, a lot of thought and care was put into this issue.

Architecture First

Let's discuss some possible architectural solutions to these problems before we get into the more technical ones.

Don’t run it from your internal environment

Sounds easy, right? Remove the risk of internal systems being exposed by… not running the requests from internal systems. One solution is to use Lambda or Cloud Functions. We simply call a Lambda function via the API from our internal service, which can be protected behind a private VPC. Pass in any context we need to issue the requests, then get the response directly from Lambda, or store the results in an S3 bucket. Even if a malicious user were able to issue arbitrary requests, they would not have access to our internal VPC. As an additional bonus, there is no EC2 metadata API accessible from Lambda: everything comes from either environment variables or the context object passed in when calling the function.
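
As a rough sketch, invoking such a function from an internal service with the AWS SDK for Go might look like the following. The function name fetch-url and its payload format are hypothetical:

package main

import (
	"encoding/json"
	"log"

	"github.com/aws/aws-sdk-go/aws"
	"github.com/aws/aws-sdk-go/aws/session"
	"github.com/aws/aws-sdk-go/service/lambda"
)

// fetchRequest is the context passed to the Lambda function that
// actually issues the HTTP request outside our VPC.
type fetchRequest struct {
	URL string `json:"url"`
}

func main() {
	svc := lambda.New(session.Must(session.NewSession()))

	payload, err := json.Marshal(fetchRequest{URL: "https://example.com/"})
	if err != nil {
		log.Fatalf("marshal: %v", err)
	}

	// The request is issued from Lambda's environment, not ours, so a
	// malicious URL never gets a connection into our VPC.
	out, err := svc.Invoke(&lambda.InvokeInput{
		FunctionName: aws.String("fetch-url"), // hypothetical function name
		Payload:      payload,
	})
	if err != nil {
		log.Fatalf("invoke failed: %v", err)
	}
	log.Printf("response payload: %s", out.Payload)
}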

This design choice isn't limited to Lambda; it can be applied to any instance type by creating a separate VPC that is used only for accessing the Internet. The metadata API will still be accessible from those instances, so ensure your IAM policies are strictly defined. If you have a wildcard (*) in any of your IAM policies, you are leaving your system too open. Message passing to issue requests and return responses can be done in multiple ways: SQS, gRPC, or whatever messaging system you use. However, if you are using service discovery that exposes REST endpoints for microservices, you'll quickly notice a problem.
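
For example, the request-issuing instance's role might be scoped down to a single queue and bucket rather than a wildcard. This is only an illustration; the ARNs and actions are hypothetical:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["sqs:ReceiveMessage", "sqs:DeleteMessage"],
      "Resource": "arn:aws:sqs:us-east-1:123456789012:request-queue"
    },
    {
      "Effect": "Allow",
      "Action": "s3:PutObject",
      "Resource": "arn:aws:s3:::response-bucket/*"
    }
  ]
}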

Run it from an internal environment, but safely

Let's say you have some services which issue web requests, all of them running on systems in a public or private VPC. However, you still need to ensure arbitrary internal services can't be accessed. The first step is to set security group policies that limit ports and services to only those that are necessary. With that out of the way, you come to the realization that your service discovery system exposes an HTTP REST endpoint. This endpoint absolutely must be accessible to your service; otherwise it won't be able to register itself, receive health check updates, or access any of the other functions required for it to run. Configuring TLS and authentication is the next logical step, but what if there are other services that can't be easily protected?

The naive approach is to simply try to block certain hosts or URLs via an ignore/allow list. Everything starts off fine: you block http://169.254.169.254 so URLs to the EC2 metadata API are disallowed, and you follow suit with your other internal IP addresses and services. But there's a problem. If you accept arbitrary URLs, you are still at the mercy of how those hostnames are resolved. Anyone can very easily register a domain like ec2.attackerdomain.com and have it resolve to 169.254.169.254. So you try to resolve the hostname first, then check the IP. This sounds good, but if you are familiar with DNS rebinding attacks, you should spot that this is a potential Time Of Check, Time Of Use (TOCTOU) issue, sketched below. If you are issuing a single request (not from a browser, but from some language-specific HTTP library) this may be sufficient, but if we are using a real browser such as headless chrome, this method is not enough.

Regardless, we can do better here. What the hakken service does is create a separate Linux user for web requests. By having a separate user, we can use an excellent feature of IPTables: the ability to create firewall rules that are applied per UID/GID.
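
To make that TOCTOU window concrete, here is a minimal sketch of the naive resolve-then-check pattern in Go; the helper name and the exact set of blocked ranges are illustrative:

package main

import (
	"fmt"
	"net"
	"net/http"
)

// naiveFetch resolves the host, checks every address against a deny
// list, and only then issues the request. The flaw: http.Get resolves
// the hostname a second time, and a rebinding attacker's DNS server
// can return a different (internal) address on that second lookup.
func naiveFetch(host string) (*http.Response, error) {
	ips, err := net.LookupIP(host)
	if err != nil {
		return nil, err
	}
	for _, ip := range ips {
		// 127.0.0.0/8, 169.254.0.0/16, and the RFC 1918 ranges
		if ip.IsLoopback() || ip.IsLinkLocalUnicast() || ip.IsPrivate() {
			return nil, fmt.Errorf("%s resolves to a blocked address", host)
		}
	}
	// Time of use: the lookup above is not the one actually used here.
	return http.Get("http://" + host + "/")
}

func main() {
	if _, err := naiveFetch("example.com"); err != nil {
		fmt.Println(err)
	}
}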

Separate user for browsers

Hakken uses headless chrome extensively for monitoring customers' new websites and changes to known assets. We can't use a simple ignore/allow list here: customers can submit any domain name they want, they have full control over how a URL is loaded, and over what third-party resources that URL in turn loads. The solution was to create a separate Linux user that manages headless browsers, and to apply IPTables rules specific to that user. This browser service is called the browser leaser, and it has only two jobs: listen on a unix domain socket for requests to acquire a browser and provide one, and listen for the browser being returned (to be destroyed along with its randomly created profile directory). The service that acquires browsers uses the debug protocol to instrument the loaded websites and capture the necessary information.
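
A minimal sketch of what such a leaser loop might look like; the socket path, wire format, and debug address are all hypothetical:

package main

import (
	"fmt"
	"log"
	"net"
	"os"
)

const sockPath = "/var/run/browserleaser.sock" // hypothetical path

func main() {
	os.Remove(sockPath)
	l, err := net.Listen("unix", sockPath)
	if err != nil {
		log.Fatalf("listen: %v", err)
	}
	defer l.Close()

	for {
		conn, err := l.Accept()
		if err != nil {
			log.Printf("accept: %v", err)
			continue
		}
		go handle(conn)
	}
}

// handle services a single acquire/return exchange.
func handle(conn net.Conn) {
	defer conn.Close()
	buf := make([]byte, 64)
	n, err := conn.Read(buf)
	if err != nil {
		return
	}
	switch string(buf[:n]) {
	case "acquire":
		// Here we would launch chrome as the browserleaser user with a
		// fresh, randomly named profile directory, then hand back its
		// debug protocol address for the caller to instrument.
		fmt.Fprint(conn, "127.0.0.1:9222")
	case "return":
		// Here we would kill the browser process and delete its
		// profile directory.
	}
}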

If we attempted to run browsers inside the same process as our service, we would be unable to block internal services as well as localhost. Once the EC2 system starts, it needs the metadata API to get various information about its environment, and the service that manages analyzing web requests needs to access service discovery REST endpoints. If we blocked the web analysis service from accessing localhost, the service would not be able to register itself. By using two separate users and two separate processes, we get a separation of concerns: the service can do its internal communications while the browser process is blocked from accessing risky IP addresses. So, what do these rules look like?

iptables -A OUTPUT -p tcp -d 169.254.0.0/16 -m owner --uid-owner browserleaser -j REJECT
iptables -A OUTPUT -p tcp -d 10.0.0.0/8 -m owner --uid-owner browserleaser -j REJECT
...
iptables -A OUTPUT -p tcp -d 127.0.0.0/8 --destination-port 8500 -m owner --uid-owner browserleaser -j REJECT

Using CloudFormation UserData, we create a new web user, install the leaser process as a system service, and apply the above rules. We can now be confident our browser processes can't access any of our internal resources. A quick note: if you have an IPv6 interface, you will need to apply equivalent ip6tables rules to it as well.
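
A rough sketch of the relevant UserData steps; the unit name and rule file paths are illustrative only:

#!/bin/bash
# Create a dedicated, login-less user to own the browser processes.
useradd --system --shell /usr/sbin/nologin browserleaser

# Install and start the leaser as a system service (unit file omitted).
systemctl enable --now browserleaser.service

# Apply the egress rules shown above, plus their IPv6 equivalents.
iptables-restore < /etc/browserleaser/rules.v4
ip6tables-restore < /etc/browserleaser/rules.v6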

Making requests without a browser

If a service isn't making use of browsers, it's far easier to block requests, and there is less worry about TOCTOU issues provided the service only issues a single request. For Go, it is possible to hook the http.Transport's DialContext and DialTLS functions.

This allows a service to initiate a connection and inspect the remote address against a deny list before issuing the request. Other libraries may offer similar hooks that can be used for blocking, but remember: hostnames must be resolved first, and only the resolved IP address should be used for a given request. This includes redirects. If a website attempts to redirect your client, the hook must be able to intercept that new connection and check the resulting IP address after resolution as well.

Here is a working example in Go of how one would achieve such a result:

package main

import (
	"context"
	"crypto/tls"
	"errors"
	"log"
	"net"
	"net/http"
	"net/http/httputil"
	"time"
)

// IsBannedIP checks a resolved address against a deny list. A real
// implementation would use net.ParseIP and match against CIDR ranges
// (10.0.0.0/8, 169.254.0.0/16, 127.0.0.0/8, and so on).
func IsBannedIP(ip string) bool {
	// example.com's address at the time of writing, banned to demo the hook
	if ip == "93.184.216.34" {
		return true
	}
	return false
}

func main() {
	timeout := 10 * time.Second
	tr := &http.Transport{
		// DialContext handles plain HTTP connections. Inspect the remote
		// address of every connection (including redirects) and refuse
		// banned destinations before the request is written.
		DialContext: func(ctx context.Context, network, addr string) (net.Conn, error) {
			c, err := (&net.Dialer{Timeout: timeout}).DialContext(ctx, network, addr)
			if err != nil {
				return nil, err
			}
			ip, _, _ := net.SplitHostPort(c.RemoteAddr().String())
			if IsBannedIP(ip) {
				c.Close()
				log.Printf("BANNED IP")
				return nil, errors.New("ip address is banned")
			}
			return c, nil
		},
		// DialTLS handles HTTPS connections. Dial TCP first so the remote
		// address can be checked before any TLS bytes are sent, then
		// perform the handshake ourselves.
		DialTLS: func(network, addr string) (net.Conn, error) {
			raw, err := net.DialTimeout(network, addr, timeout)
			if err != nil {
				return nil, err
			}
			ip, _, _ := net.SplitHostPort(raw.RemoteAddr().String())
			if IsBannedIP(ip) {
				raw.Close()
				log.Printf("TLS BANNED IP")
				return nil, errors.New("ip address is banned")
			}
			host, _, err := net.SplitHostPort(addr)
			if err != nil {
				raw.Close()
				return nil, err
			}
			c := tls.Client(raw, &tls.Config{ServerName: host})
			if err := c.Handshake(); err != nil {
				c.Close()
				return nil, err
			}
			return c, nil
		},
		TLSHandshakeTimeout: 5 * time.Second,
	}

	c := &http.Client{
		Transport: tr,
		Timeout:   timeout,
	}

	resp, err := c.Get("https://example.com:443/")
	if err != nil {
		log.Printf("nope: %v", err)
		return
	}
	defer resp.Body.Close()
	// we shouldn't get here
	data, err := httputil.DumpResponse(resp, true)
	if err != nil {
		log.Printf("error dumping response: %v\n", err)
	}
	log.Printf("response: %s\n", string(data))
}

Hardening

If you are using headless chrome with user input, you must be extra careful. Keep Chrome up to date, and I'd strongly urge everyone not to use forks or other less-vetted browsers. You may also want to apply additional hardening such as firejail with the chromium profile. Firejail is excellent because we can give the browser a read-only file system, which protects against possible drive-by downloads. A fresh user profile should be generated for each browser process that loads a URL, and it is strongly recommended to destroy the browser and the profile directory as soon as it has completed its task.
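
As a rough illustration of the launch step (the paths, port, and throwaway profile directory here are examples, not a vetted configuration):

# Launch a headless browser inside firejail using the stock chromium
# profile; each launch gets its own randomly named profile directory.
firejail --profile=/etc/firejail/chromium.profile \
    chromium --headless --remote-debugging-port=9222 \
    --user-data-dir=/tmp/profile-c3f9a1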

Conclusion

While most cases of SSRF are accidental vulnerabilities in backend systems, there are also cases where an organization wants to issue such requests deliberately while still protecting internal resources. Hopefully the above guide will help some engineers make better-informed design decisions about where and how to allow these requests.

If you are interested in the Hakken service to continuously discover new web assets on your network, please contact us as we are actively looking for beta testers.