May 25, 2014

Alice – Painless Middleware Chaining for Go

According to a recent thread on Reddit, many people like it simple when doing web development in Go and use net/http with useful addons (like the Gorilla toolkit) instead of a full-fledged framework.

Such applications often make use of middleware, but wrapping lots of layers of handlers can become messy in the long run:

final := gzipHandler(rateLimitHandler(securityHandler(authHandler(myApp))))

Sure, that works. But then you want to remove a handler, add one or reorder them. Suddenly, you're drowning in parentheses.

Alice (available on GitHub) was created as a solution to simplify chaining while remaining flexible and playing nice with the existing net/http middleware.

It's not a framework, a mux or a toolkit. Its sole functionality is to let you create the same middleware chain like this:

final := alice.New(gzipHandler, ratelimitHandler, 
    securityHandler, authHandler).Then(myApp)

Flaws of other approaches

Many might point out that there are existing solutions for chaining middleware. That's true, but any of the solutions I had looked into had at least one thing that I thought should be done in a differently.

The recent Negroni package allows handlers to be added as middleware like this:

n := negroni.Classic()
// func (n *Negroni) UseHandler(handler http.Handler)
n.UseHandler(myMiddleware)

See the fault here? Traditional net/http middleware wrap the next handler so the wrapper can have a total control. Here, there's simply no handler to pass in. You can't pass in the Negroni instance as it would result in infinite recursion (Negroni calls middleware calls Negroni).

Negroni has its own mechanism for control flow (next() to call the following handlers), but you have to modify your middleware to fully utilize it, which is not ideal.

go-httppipe has the same problem: it's suited for successive handlers, but not wrapper-type middleware.

go-stackbuilder forcibly uses http.ServeMux as the main handler, instead of http.Handler.

func New(mux *http.ServeMux) Builder {
    return Builder{mux}
}

func Build(hs ...interface{}) http.Handler {
    return New(http.DefaultServeMux).Build(hs...)
}

That is not ideal as one might like to bypass the default mux completely, especially when using an alternative one.

Muxchain, like Negroni, doesn't provide a reference to the next handler.

Matt Silverlock's use.go snippet came closest to what I wanted. My only complaint is that the ordering of handlers here is counter-intuitive. Reading the chaining code makes it obvious that

use(myApp, csrf, logging, recovery)

is equivalent to this code:

recovery(logging(csrf(myApp)))

and this request cycle:

recovery -> logging -> csrf -> myApp

So, a reversed order from what you've written in your code.

How Alice is better

Alice adopts Matt's model and fixes the small imperfections. It still requires middleware constructors of form func (http.Handler) http.Handler, but requests now flow exactly the way you order your handlers.

This code:

alice.Chain(recovery, logging, csrf).Then(myApp)

will result in this request cycle:

recovery -> logging -> csrf -> myApp

Here, recovery receives the request and has a reference to logging. It may or may not call logging: it's completely up to recovery, just as without Alice.

One more thing

Deciding on a unified constructor for middleware is the main reason why creating such a convenient API for chaining is even possible.

However, limiting ourselves to one function signature has a drawback. Many middleware have settings one might want (or have) to set. Not only does that not fit into our chosen signature, there is no standard way adopted by developers.

Middleware found in the standard library take the options as additional arguments to the same function:

handler = http.StripPrefix("/old/", handler)

Throttled constructs a rate limiting strategy which has a method to wrap our handler:

th := throttled.RateLimit(throttled.PerMin(30), 
    &throttled.VaryBy{RemoteAddr: true}, 
    store.NewMemStore(1000))
handler := th.Throttle(handler)

And nosurf has additional methods on the handler:

handler := nosurf.New(handler)
handler.ExemptPath("/")
handler.SetFailureHandler(http.NotFoundHandler())

It's nearly impossible for Alice to solve this without resorting to ugly reflection tricks, so it doesn't try to. Instead, when in need of customization, one should create their own constructor.

func myStripPrefix(h http.Handler) http.Handler {
    return http.StripPrefix("/old/", h)
}

This now complies to the constructor interface, so we can plug it into Alice.

alice.New(myStripPrefix).Then(myApp)

Prove me wrong, though

Alice was born in a matter of minutes and based on my own perception of what's right and what's not. Just like I've found other solutions non-ideal, some might find inherent flaws in how Alice works. Although it's unlikely that the very essence of how Alice functions will change, any feedback is welcome.

Check out Alice on GitHub.

Apr 26, 2014

Best Practices for Errors in Go

Error handling seems to be one of the more controversial areas of Go. Some are pleased with it, while others hate it with passion. Nevertheless, there is a handful of best practices that will make dealing with errors less painful if you're a sceptic and even better if you like it as it is.

Know when to panic

The idiomatic way of reporting errors in Go is having the error as the last return value of a funtion. However, Go also offers an alternative error mechanism called panic that is similar to what is known as exceptions in other programming languages.

Despite not being suitable everywhere, panics are useful in several specific scenarios.

When the user explicitly asks: Okay

In certain cases, the user might want to panic and thus abort the whole application if an important part of it can not be initialized. Go's standard library is ripe with examples of variants of functions that panic instead of returning an error, e.g. regexp.MustCompile. As long as there remains a function that does this in an idiomatic way (regexp.Compile), providing the user with a nifty shortcut is okay.

When setting up: Maybe

A scenario that, in a way, intersects with the previous one, is the setting up phase of an application. In most cases, if preparations fail, there is no point in continuing. One case of this might be a web application failing to bind the required port or being unable to connect to the database server. In that case, there's not much left to do, so panicking is acceptable, even if not explicitly requested by the user.

This behavior can also be observed in the standard library: the net/http muxer will panic if a pattern is invalid in any way.

func (mux *ServeMux) Handle(pattern string, handler Handler) {
    mux.mu.Lock()
    defer mux.mu.Unlock()

    if pattern == "" {
        panic("http: invalid pattern " + pattern)
    }
    if handler == nil {
        panic("http: nil handler")
    }
    if mux.m[pattern].explicit {
        panic("http: multiple registrations for " + pattern)
    }

    // ...

Otherwise: Not really

Although there might be other cases where panic is useful, returned errors remain preferable in most scenarios.

Predefine errors

It's not unusual to see code returning errors like this

func doStuff() error {
    if someCondition {
        return errors.New("no space left in the device")
    } else {
        return errors.New("permission denied")
    }
}

The wrongdoing here is that it's not convenient to check which error has been returned. Comparing strings is error prone: misspelling a string in comparison will result in an error that cannot be caught at compile time, while a cosmetic change of the returned error will break the checking just as much.

Given a small set of errors, the best way to handle this is to predefine each error publicly at the package level.

var ErrNoSpaceLeft = errors.New("no space left in the device")
var ErrPermissionDenied = errors.New("permission denied")

func doStuff() error {
    if someCondition {
        return ErrNoSpaceLeft
    } else {
        return ErrPermissionDenied 
    }
}

Now the previous problems aren't anymore.

if err == ErrNoSpaceLeft {
    // handle this particular error
}

Provide information

Sometimes, an error can happen because of a whole lot of different reasons. Wikipedia lists 41 different HTTP client errors. Let's say we want to treat them as errors in Go (net/http does not). What's more, we'd like to be able to look into the specifics of the error we received and find out whether the error was 404, 410 or 418.

In the last paragraph we developed a pattern for discerning errors from one another, but here it gets a bit messy. Predefining 41 separate errors like this:

var ErrBadRequest = errors.New("HTTP 400: Bad Request")
var ErrUnauthorized = errors.New("HTTP 401: Unauthorized")
// ...

will make our code and documentation messy and our fingers sore.

A custom error type is the best solution to this problem. Go's implicit interfaces make creating one easy: to conform to the error interface, we only need to have an Error() method that returns a string.

type HttpError struct {
    Code        int
    Description string
}

func (h HttpError) Error() string {
    return fmt.Sprintf("HTTP %d: %s", h.Code, h.Description)
}

Not only does this act like a regular error, it also contains the status code as an integer. We can then use it as follows:

func request() error {
    return HttpError{404, "Not Found"}
}

func main() {
    err := request()

    if err != nil {
        // an error occured
        if err.(HttpError).Code == 404 {
            // handle a "not found" error
        } else {
            // handle a different error
        }
    }

}

The only minor annoyance left here is the need to do a type assertion to a concrete error type. But it's a small price to pay compared to the fragility of parsing the error code from a string.

Provide stack traces

Errors as they exist in Go remain inferior to panics in one important way: they do not provide important information about where in the call stack the error happened.

A solution to this problem has recently been created by the Juju team at Canonical. Their package errgo provides the functionality of wrapping an error into another one that records where the error happened.

Building up on the HTTP error handling example, we'll now put this to use.

package main

import (
    "fmt"

    "github.com/juju/errgo"
)

type HttpError struct {
    Code        int
    Description string
}

func (h HttpError) Error() string {
    return fmt.Sprintf("HTTP %d: %s", h.Code, h.Description)
}

func request() error {
    return errgo.Mask(HttpError{404, "Not Found"})
}

func main() {
    err := request()

    fmt.Println(err.(errgo.Locationer).Location())

    realErr := err.(errgo.Wrapper).Underlying()

    if realErr.(HttpError).Code == 404 {
        // handle a "not found" error
    } else {
        // handle a different error
    }
}

Our code has changed in several ways. Firstly, we wrap an error into an errgo-provided type. This code now prints the information on where the error happened. On my machine it outputs

/private/tmp/example.go:19

referencing a line in request().

However, our code has become somewhat messier. To get to the real HttpError we need to do more unwrapping. Sadly, I'm not aware of a real way to make this nicer, so it's all about tradeoffs. If your codebase is small and you can always tell where the error came from, you might not need to use errgo at all.

What about concrete types?

Some might point out that a portion of the type assertions could have been avoided if we returned a concrete type from a function instead of an error interface.

However, doing that in conjunction with the := operator will bring trouble. As a result, the error variable will not be reusable.

func f1() HttpError { ... }
func f2() OsError { ... }

func main() {
    // err automatically declared as HttpError
    err := f1()

    // OsError is a completely different type
    // The compiler does not allow this
    err = f2()
}

To avoid this, errors are best returned using the error type. Standard library avoids returning concrete types as well, e.g. os package states:

Often, more information is available within the error. For example, if a call that takes a file name fails, such as Open or Stat, the error will include the failing file name when printed and will be of type *PathError, which may be unpacked for more information.

This concludes my list of best practices for errors in Go. As in other areas, a different mindset has to be adopted when coming into Go from elsewhere. "Different" does not imply "worse" though and deciding on a set of conventions is vital to making Go development even better.

Oct 19, 2013

Writing HTTP Middleware in Go

In the context of web development, "middleware" usually stands for "a part of an application that wraps the original application, adding additional functionality". It's a concept that usually seems to be somewhat underappreciated, but I think middleware is great.

For one, a good middleware has a single responsibility, is pluggable and self-contained. That means you can plug it in your app at the interface level and have it just work. It doesn't affect your coding style, it isn't a framework, but merely another layer in your request handling cycle. There's no need to rewrite your code: if you decide that you want the middleware, you add it into the equation, if you change your mind, you remove it. That's it.

Looking at Go, HTTP middleware is quite prevalent, even in the standard library. Although it might not be obvious at first, functions in the net/http package, like StripPrefix or TimeoutHandler are exactly what we defined middleware to be: they wrap your handler and take additional steps when dealing with requests or responses.

My recent Go package nosurf is middleware too. I intentionally designed it as one from the very beginning. In most cases you don't need to be aware of things happening at the application layer to do a CSRF check: nosurf, like any proper middleware, stands completely on its own and works with any tools that use the standard net/http interface.

You can also use middleware for:

  • Mitigating BREACH attack by length hiding
  • Rate-limiting
  • Blocking evil bots
  • Providing debugging information
  • Adding HSTS, X-Frame-Options headers
  • Recovering gracefully from panics
  • ...and probably many others

Writing a simple middleware

For the first example, we'll write middleware that only allows users visit our website through a single domain (specified by HTTP in the Host header). A middleware like that could serve to protect the web application from host spoofing attacks.

Constructing the type

For starters, let's define a type for the middleware. We'll call it SingleHost.

type SingleHost struct {
    handler     http.Handler
    allowedHost string
}

It consists of only two fields:

  • the wrapped Handler we'll call if the request comes with a valid Host.
  • the allowed host value itself.

As we made the field names lowercase, making them private to our package, we should also make a constructor for our type.

func NewSingleHost(handler http.Handler, allowedHost string) *SingleHost {
    return &SingleHost{handler: handler, allowedHost: allowedHost}
}

Request handling

Now, for the actual logic. To implement http.Handler, our type only needs to have one method:

type Handler interface {
        ServeHTTP(ResponseWriter, *Request)
}

And here it is:

func (s *SingleHost) ServeHTTP(w http.ResponseWriter, r *http.Request) {
    host := r.Host
    if host == s.allowedHost {
        s.handler.ServeHTTP(w, r)
    } else {
        w.WriteHeader(403)
    }
}

ServeHTTP method simply checks the Host header on the request:

  • if it matches the allowedHost set by the constructor, it calls the wrapped handler's ServeHTTP method, thus passing the responsibility for handling the request.
  • if it doesn't match, it returns the 403 (Forbidden) status code and the request is dealt with.

The original handler's ServeHTTP is never called in the latter case, so not only does it not get a say in this, it won't even know such a request arrived at all.

Now that we're done coding our middleware, we just need to plug it in. Instead of passing our original Handler directly into the net/http server, we wrap it in the middleware.

singleHosted = NewSingleHost(myHandler, "example.com")
http.ListenAndServe(":8080", singleHosted)

An alternative approach

The middleware we just wrote is really simple: it literally consists of 15 lines of code. For writing such middleware, there exists a method with less boilerplate. Thanks to Go's support of first class functions and closures, and having the neat http.HandlerFunc wrapper, we'll be able to implement this as a simple function, rather than a separate struct type. Here is the function-based version of our middleware in its entirety.

func SingleHost(handler http.Handler, allowedHost string) http.Handler {
    ourFunc := func(w http.ResponseWriter, r *http.Request) {
        host := r.Host
        if host == allowedHost {
            handler.ServeHTTP(w, r)
        } else {
            w.WriteHeader(403)
        }
    }
    return http.HandlerFunc(ourFunc)
}

Here we declare a simple function called SingleHost that takes in a Handler to wrap and the allowed hostname. Inside it, we construct a function analogous to ServeHTTP from the previous version of our middleware. Our inner function is actually a closure, so it can access the variables from the outer function. Finally, HandlerFunc lets us use this function as a http.Handler.

Deciding whether to use a HandlerFunc or to roll out your own http.Handler type is ultimately up to you. While for basic cases a function might be enough, if you find your middleware growing, you might want to consider making your own struct type and separate the logic into several methods.

Meanwhile, the standard library actually uses both ways of building middleware. StripPrefix is a function that returns a HandlerFunc, while TimeoutHandler, although a function too, returns a custom struct type that handles the requests.

A more complex case

Our SingleHost middleware was trivial: we checked one attribute of the request and either passed the request to the original handler, not caring about it anymore, or returned a response ourselves and didn't let the original handler touch it at all. Nevertheless, there are cases where, rather than acting based on what the request is, our middleware has to post-process the response after the original handler has written it, modifying it in some way.

Appending data is easy

If we just want to append some data after the body written by the wrapped handler, all we have to do is call Write() after it finishes:

type AppendMiddleware struct {
    handler http.Handler
}

func (a *AppendMiddleware) ServeHTTP(w http.ResponseWriter, r *http.Request) {
    a.handler.ServeHTTP(w, r)
    w.Write([]byte("Middleware says hello."))
}

The response body will now consist of whatever the original handler outputted, followed by Middleware says hello..

The problem

Doing other types of response manipulations is a bit harder though. Say, we'd like to prepend data to the response instead of appending it. If we call Write() before the original handler does, it will lose control over the status code and headers, since the first Write() writes them out immediately.

Modifying the original output in any other way (say, replacing strings in it), changing certain response headers or setting a different status code won't work because of a similar reason: when the wrapped handler returns, those will have already be sent to the client.

To counter this we need a particular kind of ResponseWriter that would work as a buffer, gathering the response and storing it for later use (and modifications). We would then pass this buffering ResponseWriter to the original handler instead of giving it the real RW, thus preventing it from actually sending the response to the user just yet.

Luckily, there's a tool just like that in the Go standard library. ResponseRecorder in the net/http/httptest package does all we need: it saves the response status code, a map of response headers and accumulates the body into a buffer of bytes. Although (like the package name implies) it's intended to be used in tests, it fits our use case as well.

Let's look at an example of middleware that uses ResponseRecorder and modifies everything in a response, just for the sake of completeness.

type ModifierMiddleware struct {
    handler http.Handler
}

func (m *ModifierMiddleware) ServeHTTP(w http.ResponseWriter, r *http.Request) {
    rec := httptest.NewRecorder()
    // passing a ResponseRecorder instead of the original RW
    m.handler.ServeHTTP(rec, r)
    // after this finishes, we have the response recorded
    // and can modify it before copying it to the original RW

    // we copy the original headers first
    for k, v := range rec.Header() {
        w.Header()[k] = v
    }
    // and set an additional one
    w.Header().Set("X-We-Modified-This", "Yup")
    // only then the status code, as this call writes out the headers 
    w.WriteHeader(418)
    // the body hasn't been written (to the real RW) yet,
    // so we can prepend some data.
    w.Write([]byte("Middleware says hello again. "))
    // then write out the original body
    w.Write(rec.Body.Bytes())
}

And here's the response we get by wrapping a handler that would otherwise simply return "Success!" with our middleware.

HTTP/1.1 418 I'm a teapot
X-We-Modified-This: Yup
Content-Type: text/plain; charset=utf-8
Content-Length: 37
Date: Tue, 03 Sep 2013 18:41:39 GMT

Middleware says hello again. Success!

This opens up a whole lot of new possibilities. The wrapped handler is now completely in or control: even after it handling the request, we can manipulate the response in any way we want.

Sharing data with other handlers

In various cases, your middleware might need to expose certain information to other middleware or your app itself. For example, nosurf needs to give the user a way to access the CSRF token and the reason of failure (if any).

A nice pattern for this is to use a map, usually an unexported one, that maps http.Request pointers to pieces of the data needed, and then expose package (or handler) level functions to access the data.

I used this pattern for nosurf too. Here, I created a global context map. Note that a mutex is also needed, since Go's maps aren't safe for concurrent access by default.

type csrfContext struct {
    token string
    reason error
}

var (
    contextMap = make(map[*http.Request]*csrfContext)
    cmMutex    = new(sync.RWMutex)
)

The data is set by the handler and exposed via exported functions like Token().

func Token(req *http.Request) string {
    cmMutex.RLock()
    defer cmMutex.RUnlock()

    ctx, ok := contextMap[req]
    if !ok {
            return ""
    }

    return ctx.token
}

You can find the whole implementation in the context.go file in nosurf's repository.

While I chose to implement this on my own for nosurf, there exists a handy gorilla/context package that implements a generic map for saving request information. In most cases, it should suffice and protect you from pitfalls of implementing a shared storage on your own. It even has a middleware of its own that clears the request data after it's been served.

All in all

The intention of this article was both to draw fellow gophers' attention to middleware as a concept and to demonstrate some of the basic building blocks for writing middleware in Go. Despite being a relatively young language, Go has an amazing standard HTTP interface. It's one of the factors that make coding middleware for Go a painless and even fun process.

Nevertheless, there is still a lack of quality HTTP tools for Go. Most, if not all, of the middleware ideas for Go I mentioned earlier are yet to come to life. Now that you know how to build middleware for Go, why not do it yourself? ;)

P.S. You can find samples for all the middleware written for this post in a GitHub gist.

Oct 14, 2013

Embrace Go's HTTP Tools

Some time ago I released nosurf, a Go middleware for mitigating Cross-Site Request Forgery attacks. Writing a seemingly simple and small package was enough to fall in love with how Go handles HTTP. Yet, it's up to us to either embrace the standard HTTP facilities or fragmentate, sacrificing composability and modularity.

http.Handler is THE interface

Unified HTTP interfaces for web apps written in certain programming languages, like WSGI for Python and Rack for Ruby are a great idea, but they weren't always there. For instance, Rack only emerged in 2007, when Rails had already been going strong for a while.

Meanwhile in Go, the only interface needed has been in development since 2009, and although it's been through some serious changes since that, by the end of 2011, months before Go 1.0 was released, it had already stabilized.

Of course, I'm talking about the mighty http.Handler.

type Handler interface {
    ServeHTTP(ResponseWriter, *Request)
}

To be able to handle HTTP requests, your type only needs to implement this one method. The method reads the request info from the given *Request and writes a response into the given ResponseWriter. Seems simple enough, right?

Complement, don't replace

Yet, when building abstractions on top of that, some get it wrong. Take for example Mango, described by its author as "a modular web-application framework for Go, inspired by Rack and PEP333".

This is what a Mango application looks like:

func Hello(env mango.Env) (mango.Status, mango.Headers, mango.Body) {
  return 200, mango.Headers{}, mango.Body("Hello World!")
}

Looks simple, concise and very similar to WSGI or Rack, right? Except for one thing. While with dynamic/duck typing, you could have any iterable for a body, here mango.Body is simply a string. Essentially, that takes away the ability to do any sort of streaming responses with Mango. Even if it were to expose a ResponseWriter, anything written to it would clash with the returned values, since they're only returned at the end of the function, after the calls to ResponseWriter have already been made.

That's bad. Whether you need another interface on top of existing net/http is a matter of taste, but even if you do, it should not take functionality away. An interface that is nicer to code with, but takes away important functions is clearly inferior.

The right way

A popular "micro" web framework web.go deals with this in a simple, yet much better way. Its handlers take a pointer to web.Context as an optional first argument.

type Context struct {
    Request *http.Request
    Params  map[string]string
    Server  *Server
    http.ResponseWriter
}
// ...
func hello(ctx *web.Context, val string) string {
    return "hello " + val 
}

web.Context does not take the standard HTTP handler structures away. Instead, the *Request argument is available as a struct member and Context implements the required ResponseWriter methods itself embeds the original ResponseWriter. The string you return from the function (if any) is simply appended to the response.

That is a good design choice and I think it goes well with Go's philosophy. Even though you get a nice higher-level API, you don't have to sacrifice the low-level control over the request handling.

Start now

Go's HTTP library infrastructure, despite growing rapidly, still has some gaps left to fill. But the last thing we need is fragmentation and annoying incompatibilities due to poor design and abstractions that actually take important functionality away. Embracing and supporting the standard Go HTTP facilities is, in my humble opinion, the straightest way to having functional and modular 3rd-party HTTP tools.