Writing Versioned Service APIs With Curacao: Part 1


Wed 22 Oct 2014 18:11:19 -0800

Curacao is a beautifully simple toolkit for the JVM that lets you write highly concurrent services on the common and trusted J2EE stack. While you can use it to build a full web application, Curacao is fundamentally designed to support highly asynchronous REST/HTTP-based integration layers on top of asynchronous Servlets that are easy to understand, maintain, and debug. At its core, Curacao completely avoids the mental overhead of passing messages between actors or within awful event loops. Yet, given its simplicity, it performs very well in even the most demanding applications.

Quite often, one of the most difficult problems to solve when designing an API is resource versioning.

As I see it, there are several aspects to the API versioning problem:

  1. how do clients specify the version of the resource they’re asking for?
  2. how does the API route version-specific requests?
  3. how does the API manage and respond with versioned responses?
  4. how does the API gracefully sunset deprecated resource versions?

Solutions to these questions have been debated ad nauseam, and everyone seems to have a conflicting opinion.

Opinions aside, Part 1 of this series highlights a few examples that illustrate how you might implement solutions to these problems with Curacao.

Part 1: Versioned Requests

How do clients specify the version of the resource they’re asking for?

Query Parameter

One approach is to specify the desired version of a resource using an optional query parameter. For example, consider the following requests:

GET /user/89171245.json?version=1
GET /user/89171245.json?version=2


While technically asking for the same resource /user/89171245.json, the client is using the version query parameter to specify the version of the API it intends to use. The server side can interpret the value of the version query parameter, and respond with an entirely unique response object depending on the requested version. In this case, version=1 may result in an entirely different response JSON object compared to that of version=2. In the event that the version query parameter is omitted, the API will default to the most recent version.

Implementing this versioning mechanism with Curacao is trivial.

The trick is to implement a custom ControllerMethodArgumentMapper that looks for the version query parameter, sanitizes it, and passes the requested version to your controller methods as a typed argument.

First, let’s define an enumeration that cleanly represents all possible supported API versions.

public enum MyApiVersion {

  /* API version 1 */
  VERSION_1("1"),
  /* API version 2 */
  VERSION_2("2");

  private final String version_;

  private MyApiVersion(final String version) {
    version_ = version;
  }

  /**
   * Given a string, from a query parameter, convert it into one of the
   * supported API versions.  If the param is null, or doesn't match any
   * known version, this method returns the latest version.
   */
  public static final MyApiVersion versionFromParam(final String param) {
    MyApiVersion result = MyApiVersion.VERSION_2; // Default
    if (param != null) {
      // Iterate over each possible version in the enumeration,
      // looking for a match.
      for (final MyApiVersion version : MyApiVersion.values()) {
        if (version.version_.equals(param)) {
          result = version;
          break;
        }
      }
    }
    return result;
  }

}

Now, let’s implement a custom ControllerMethodArgumentMapper that converts the version query parameter on the request, if any, into a MyApiVersion.

import com.kolich.curacao.handlers.requests.mappers.ControllerMethodArgumentMapper;

import java.lang.annotation.Annotation;
import javax.annotation.Nullable;
import javax.servlet.http.HttpServletRequest;

@ControllerArgumentTypeMapper(MyApiVersion.class)
public final class MyApiVersionArgumentMapper extends ControllerMethodArgumentMapper<MyApiVersion> {

  private static final String VERSION_QUERY_PARAM = "version";

  @Nullable @Override
  public final MyApiVersion resolve(@Nullable final Annotation annotation,
                                    final CuracaoRequestContext context) throws Exception {
    final HttpServletRequest request = context.request_;
    // A missing parameter yields null, which maps to the latest version.
    final String versionParam = request.getParameter(VERSION_QUERY_PARAM);
    return MyApiVersion.versionFromParam(versionParam);
  }

}


And finally, we can now write our controller methods to take an argument of type MyApiVersion. At runtime, Curacao will see this MyApiVersion argument on your controller methods, and invoke our custom MyApiVersionArgumentMapper to extract the desired version from the request.

import static com.kolich.curacao.annotations.methods.RequestMapping.RequestMethod.*;

@Controller
public final class VersionedController {

  private final DataSource ds_;

  @Injectable
  public VersionedController(final DataSource ds) {
    ds_ = ds;
  }

  @RequestMapping(value="^\\/user\\/(?<userId>\\d+)\\.json$", methods={GET})
  public final String getUser(@Path("userId") final String userId,
                              final MyApiVersion version) {
    final User user = ds_.getUserById(userId);
    if (MyApiVersion.VERSION_1.equals(version)) {
      // Construct and return a "version 1" User object.
    } else {
      // Construct and return a "version 2" User object.
    }
  }

}

Note that the MyApiVersion argument in the controller method above is automatically discovered and injected when invoked by Curacao.


Path

A more common approach to request versioning is to put a version identifier in the path itself. For example, consider the following requests:

GET /v1/user/89171245.json
GET /v2/user/89171245.json


Note the v1 and v2 version identifiers in the path.

Again, unsurprisingly, implementing this versioning mechanism with Curacao is trivial. Current best practices dictate the usage of multiple controllers — one that handles v1 requests and another that handles v2.

And so, one controller for v1:

package com.foo.api.controllers.v1;

@Controller
public final class ControllerV1 {

  @RequestMapping(value="^\\/v1\\/user\\/(?<userId>\\d+)\\.json$", methods={GET})
  public final String fooV1(@Path("userId") final String userId) {
    return "v1: " + userId;
  }

}


And another for v2:

package com.foo.api.controllers.v2;

@Controller
public final class ControllerV2 {

  @RequestMapping(value="^\\/v2\\/user\\/(?<userId>\\d+)\\.json$", methods={GET})
  public final String fooV2(@Path("userId") final String userId) {
    return "v2: " + userId;
  }

}


Note the clean separation achieved by giving each controller its own package.

Accept Header

Another, slightly more RESTful approach is to use the Accept HTTP request header to identify the desired version of a resource. This is somewhat analogous to client/server “content negotiation”.

In the interest of brevity, I won’t write a complete implementation here. However, a key takeaway is that you can use Curacao’s @Header annotation to extract the value of any request header. From there, the business logic in your controller can examine the header value to decide which API version to invoke.

@Controller
public final class HeaderController {

  @RequestMapping(value="^\\/foo", methods={GET})
  public final String headerDemo(@Header("Accept") final String accept) {
    // The Accept header may be absent, so guard against null.
    if (accept != null && accept.contains("v2")) {
      // V2
    } else {
      // V1
    }
  }

}
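A slightly more defensive variant of the check above is sketched below. It assumes a made-up vendor media type of the form application/vnd.example.v2+json, and AcceptVersionParser is a hypothetical helper rather than anything Curacao ships. It is plain Java with no Curacao dependency, and it falls back to the latest version when the header is missing or unrecognized, mirroring the default used in the query parameter approach.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Curacao.
final class AcceptVersionParser {

  // Matches a version token inside an assumed vendor media type,
  // e.g. "application/vnd.example.v2+json".
  private static final Pattern VERSION =
      Pattern.compile("application/vnd\\.example\\.v(\\d{1,3})\\+json");

  static final int LATEST = 2;

  // Returns the requested version, or LATEST when the header is absent,
  // does not match, or names an unsupported version.
  static int versionFromAccept(final String accept) {
    if (accept == null) {
      return LATEST;
    }
    final Matcher m = VERSION.matcher(accept);
    if (m.find()) {
      final int requested = Integer.parseInt(m.group(1));
      if (requested >= 1 && requested <= LATEST) {
        return requested;
      }
    }
    return LATEST;
  }

  public static void main(final String[] args) {
    System.out.println(versionFromAccept("application/vnd.example.v1+json")); // 1
    System.out.println(versionFromAccept("text/html"));                       // 2
    System.out.println(versionFromAccept(null));                              // 2
  }
}
```

Anchoring on the full media type, rather than calling contains("v2"), avoids accidental matches on unrelated header values that happen to contain those characters.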


In addition to @Header, there are a number of “convenience” request header annotations you can use to decorate your controller method arguments:

  • @Accept — convenience for the Accept request header
  • @UserAgent — convenience for the User-Agent request header
  • @ContentType — convenience for the Content-Type request header
  • @Authorization — convenience for the Authorization request header
  • … and of course, many more in the com.kolich.curacao.annotations.parameters.convenience package.

Part 2

In the upcoming Part 2 of this series, I’ll cover the routing of version-specific requests with Curacao. This includes building a service that routes based on a custom header and, of course, the path.

Till next time, cheers!

Introducing Curacao


Sun 31 Aug 2014 14:57:31 -0800

Tired of Spring, Jersey, raw Servlets, and other REST toolkits, I cautiously approached the thought of building my own JVM web-layer from scratch. In retrospect, I probably didn’t need to spend time on yet another toolkit to help shield engineers from the boilerplate and complexity of web-applications on the JVM. However, I found most existing libraries (and frameworks) to be overly bloated, complex and just generally awful.

I wanted something “better” — of course, better purely by my own personal definition.

I’ve written enough Java and Scala to recognize what’s most relevant when choosing a highly asynchronous and flexible web-layer upon which to build a scalable web-service or application. With a foot in multiple camps, and having previously used most widely available frameworks and tools, I’d like to think I have a unique perspective on this problem. From what I can tell, there are generally two sides:

  1. The asynchronous overkill approach — Akka, Spray and Play
  2. The boil-the-ocean, thread-based approach — Spring and Apache Struts

Each of these tools has its own merits, but I conjecture that a large majority of the time, they are either misused or chosen for the wrong reasons. Quite often, especially in software engineering, developers get lost in a haze of premature optimization or analysis paralysis. I wish I had a dollar for every time I heard a Product or Engineering Manager say something like, “We need to support 1,000,000 concurrent users! Web-scale!” Hold on. Let’s make a pragmatic upfront technical decision, and build a beautiful product first. If and when the opportunity to go “web-scale” presents itself, we can address those tough scalability questions then.

However, in the meantime, there must exist some web-layer that:

  • doesn’t attempt to boil-the-ocean
  • can be used with a Java or Scala stack
  • is easy to understand and debug
  • avoids confusing and generally awful DSLs
  • is fully asynchronous
  • doesn’t require tens-of-megabytes of dependency hell
  • doesn’t “baby” engineers with fancy shells and command line tools
  • has a reasonable set of complete documentation and examples
  • is “fast”

And so, I sat down one evening many moons ago, and began to write my own web-layer from scratch — one that attempts to address many, if not all, of the shortcomings I perceive in existing toolkits.

I named the project Curacao, because I like fancy blue drinks with tiny umbrellas.

10,000 foot view

At a high level, here are some things you should know about Curacao:

  • it’s written in Java 7, but plays nicely with Scala
  • it’s thread based, built on top of asynchronous Servlets as part of the J2EE Servlet 3.0 spec
  • takes a “return or throw anything, from anywhere” approach to response handling
  • implements a clean, and very fast, dependency injection model
  • controllers, components, and routes are defined using simple annotations
  • BYO (bring-your-own) ORM library
  • no XML, anywhere — is configurable using HOCON and the Typesafe Config configuration library
  • for JSON, supports GSON and Jackson out-of-the-box
  • compiled, Curacao ships in a single JAR that’s only 150KB in size
  • deployable with any Servlet 3.0 compatible web-application
  • it’s free and open source on GitHub


Still here?

Let’s bootstrap a Curacao application in 3 steps.

  1. First, configure your project to pull in the necessary dependencies. As of this writing, the latest stable version is 2.6.3; however, you should check the Curacao Releases page for the latest version.

    If using Maven:

      <name>Kolich repo</name>

    If using SBT:

    resolvers += "Kolich repo" at "http://markkolich.github.io/repo"
    val curacao = "com.kolich.curacao" % "curacao" % "2.6.3" % "compile"

  2. Second, inject the required listener and dispatcher into your application's web.xml.


    The CuracaoContextListener listens for ServletContext lifecycle events, and initializes and destroys application components accordingly. And, like you might expect, the CuracaoDispatcherServlet is responsible for receiving and dispatching incoming requests from the Servlet container.

  3. Lastly, create a HOCON configuration file named application.conf and put it in a place that's accessible on your classpath — typically somewhere like src/main/resources. This file defines your Curacao application configuration, and is loaded from the classpath at runtime.

    curacao {

      ## Your boot package is the package in which all of your components and
      ## controllers reside.  At boot time, Curacao uses reflection and scans
      ## this package, and all of its children, looking for annotated classes
      ## to dynamically instantiate.
      boot-package = "com.foobar"

      ## The asynchronous timeout for any response.  If your application fails to
      ## respond to any request within this timeout, Curacao will kick in and
      ## throw an exception, which allows you to abort+handle the response
      ## gracefully.  Set to 0 (zero) for an infinite timeout.
      async-context-timeout = 30s

      ## The maximum number of threads that will be used to handle incoming
      ## requests.  The number of concurrent request worker threads will never
      ## exceed this size.  Set to 0 (zero) for an unbounded thread pool.
      pools.request {
        size = 4
      }

      ## The maximum number of threads that will be used to process outgoing
      ## responses.  The number of concurrent response worker threads will never
      ## exceed this size.  Set to 0 (zero) for an unbounded thread pool.
      pools.response {
        size = 4
      }

    }

    Take a look at Curacao's global reference.conf for the complete list of application configuration options. This reference.conf file defines the Curacao default set of configuration options, which are completely overridable in your own application.conf.

That’s it! You’ve bootstrapped your first Curacao enabled application.


Controllers

At their core, Curacao controllers are immutable singletons that are automatically instantiated at application startup. They’re classes that contain methods which Curacao will invoke via reflection when dispatching a request. On launch, Curacao recursively scans your defined boot-package looking for any classes annotated with the @Controller annotation. As requests are received and dispatched from the Servlet container, Curacao very efficiently interrogates each known controller instance looking for a method worthy of handling the request.

For maximum efficiency at runtime, regular expressions and request routing tables are compiled and cached once at startup.
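To make the idea concrete, here is a rough sketch of a routing table whose patterns are compiled once, at registration time, and then reused for every request. This is not Curacao’s router: RoutingTable and the handler names are hypothetical, and “first match wins” is an assumption made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical sketch, not Curacao's actual router.
final class RoutingTable {

  private static final class Route {
    final Pattern pattern_;
    final String handler_;
    Route(final Pattern pattern, final String handler) {
      pattern_ = pattern;
      handler_ = handler;
    }
  }

  private final List<Route> routes_ = new ArrayList<Route>();

  // Compile the expression exactly once, when the route is registered.
  void register(final String regex, final String handler) {
    routes_.add(new Route(Pattern.compile(regex), handler));
  }

  // Returns the handler of the first route whose pattern matches the
  // whole path, or null when nothing matches.
  String dispatch(final String path) {
    for (final Route route : routes_) {
      if (route.pattern_.matcher(path).matches()) {
        return route.handler_;
      }
    }
    return null;
  }

  public static void main(final String[] args) {
    final RoutingTable table = new RoutingTable();
    table.register("^\\/v1\\/user\\/\\d+\\.json$", "ControllerV1.fooV1");
    table.register("^\\/v2\\/user\\/\\d+\\.json$", "ControllerV2.fooV2");
    System.out.println(table.dispatch("/v2/user/42.json")); // ControllerV2.fooV2
    System.out.println(table.dispatch("/v3/user/42.json")); // null
  }
}
```

The per-request cost is then a scan over precompiled patterns; no regular expression is compiled while a request is being handled.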

Here’s a sample controller implementation that demonstrates several key features:

@Controller
public final class UserController {

  @RequestMapping(value="^\\/users\\/(?<userId>[a-zA-Z_0-9\\-]+)$", methods=GET)
  public String getUser(@Path("userId") final String userId) {
    return "Load user: " + userId;
  }

  @RequestMapping(value="^\\/users$", methods=POST)
  public String createUser(@RequestBody final String body) {
    // Lazily convert 'body' to a user object
    // Insert user into data store
    return "Successfully created user.";
  }

  @RequestMapping(value="^\\/users\\/(?<userId>[a-zA-Z_0-9\\-]+)$", methods=PUT)
  public void updateUser(@Path("userId") final String userId,
                         final HttpServletResponse response,
                         final AsyncContext context) {
    try {
      // Do work, update user with id 'userId'
      response.setStatus(201); // 201 Created
    } finally {
      // Complete context manually due to 'void' return type
      context.complete();
    }
  }

  @RequestMapping(value="^\\/users\\/(?<userId>[a-zA-Z_0-9\\-]+)$", methods=DELETE)
  public void deleteUser(@Path("userId") final String userId,
                         final HttpServletResponse response,
                         final AsyncContext context) {
    try {
      // Delete user specified by 'userId'
      response.setStatus(204); // 204 No Content
    } finally {
      // Complete context manually due to 'void' return type
      context.complete();
    }
  }

  // Illustrative mapping (an assumption): GET /users?name=...
  @RequestMapping(value="^\\/users$", methods=GET)
  public Future<List<String>> queryUsers(@Query("name") final String name) {
    // Query data store for a list of users matching the provided 'name'.
    // Return a Future<List<String>> which represents an async operation that
    // fetches a list of user ID's.
    return someFuture;
  }

}


Like other popular toolkits, request routing is handled using a familiar @RequestMapping method annotation. The @RequestMapping annotation allows you to specify, among other things, the HTTP request method and URI path to match. The default behavior of @RequestMapping uses Java regular expressions and Java 7’s named capture groups to extract path components from the incoming request URI.
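For readers who haven’t used named capture groups before, here is a minimal plain-Java sketch of the extraction step, using only java.util.regex and the same expression as the PUT and DELETE routes above. NamedGroupDemo is a hypothetical class for illustration; how Curacao wires the captured value to a @Path argument is not shown.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; uses only the JDK regex API (Java 7+).
final class NamedGroupDemo {

  private static final Pattern USER_PATH =
      Pattern.compile("^\\/users\\/(?<userId>[a-zA-Z_0-9\\-]+)$");

  // Returns the value captured by the 'userId' group, or null when the
  // path does not match the expression at all.
  static String extractUserId(final String path) {
    final Matcher m = USER_PATH.matcher(path);
    return m.matches() ? m.group("userId") : null;
  }

  public static void main(final String[] args) {
    System.out.println(extractUserId("/users/abc-123")); // abc-123
    System.out.println(extractUserId("/users/"));        // null
  }
}
```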

When you need the entire request body as a UTF-8 encoded String, simply add a String method argument and annotate it with the @RequestBody annotation. Further, query parameters can be easily extracted using the @Query method argument annotation. For more complex scenarios, when you need direct access to the underlying HttpServletResponse or Servlet 3.0 AsyncContext object, just add them as arguments and Curacao will pass them to your method when invoked. Last but not least, your controller methods may return a Future<?> anytime you need to render the result of an asynchronous operation, that may or may not complete successfully at some point in the future.
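As a rough illustration of where a returned Future might come from, the sketch below submits a lookup to a plain ExecutorService and hands back the resulting Future immediately. Everything here is an assumption made for the example: the UserQueries class, the in-memory user list, the prefix match, and the pool size. In a real controller, the toolkit rather than your own code would wait on the result.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch; the user list stands in for a real data store.
final class UserQueries {

  private static final List<String> USERS = Arrays.asList("alice", "alina", "bob");

  private final ExecutorService pool_;

  UserQueries(final ExecutorService pool) {
    pool_ = pool;
  }

  // Submits the lookup to the pool and returns without blocking.
  Future<List<String>> queryUsers(final String name) {
    return pool_.submit(new Callable<List<String>>() {
      @Override
      public List<String> call() {
        final List<String> matches = new ArrayList<String>();
        for (final String user : USERS) {
          if (user.startsWith(name)) {
            matches.add(user);
          }
        }
        return matches;
      }
    });
  }

  // Demo convenience only: run one query and block until it completes.
  static List<String> queryBlocking(final String name) {
    final ExecutorService pool = Executors.newFixedThreadPool(2);
    try {
      return new UserQueries(pool).queryUsers(name).get();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    } catch (ExecutionException e) {
      throw new IllegalStateException(e.getCause());
    } finally {
      pool.shutdown();
    }
  }

  public static void main(final String[] args) {
    System.out.println(queryBlocking("al")); // [alice, alina]
  }
}
```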

In the unlikely event you need to route requests by something other than the URI/path, you can implement your own CuracaoPathMatcher and pass it to Curacao using the matcher attribute of the @RequestMapping annotation.


Components

Like Curacao controllers, components are immutable singletons instantiated at application startup. They’re classes that represent pieces of shared logic or configuration, much like Java “beans”. Unsurprisingly, component classes are annotated with the @Component annotation.

Component singletons can be passed to other components, controllers, request filters, request mappers and response handlers. We’ll cover the latter three later in this post. In the spirit of immutability, Curacao components can only be passed to other Curacao instantiated classes via their constructors; there are no “getters” and “setters”. Current best practices dictate the usage of final instance variables in your Curacao instantiated classes, ensuring immutability.

Consider the two components below, Foo and Bar — based on their constructor declarations, Bar depends on Foo. In other words, Bar cannot be instantiated unless it is passed an instance of Foo via its constructor.

@Component
public final class Foo {

  public Foo() {
    // Stuph.
  }

}

@Component
public final class Bar {

  private final Foo foo_;

  @Injectable
  public Bar(@Nonnull final Foo foo) {
    foo_ = foo;
  }

}


You may have noticed the @Injectable constructor annotation. The @Injectable annotation is used to declare dependencies on other components. In the example above, because class Bar has an @Injectable annotated constructor with an argument of type Foo, Curacao interprets this relationship as “class Bar depends on Foo”. Therefore, Foo will be instantiated first, and then passed to Bar’s constructor.

Curacao automagically identifies such dependencies, and instantiates component singletons in dependency order. Like other dependency-injection (DI) models, Curacao scans your declared boot-package and intelligently builds an object graph by analyzing dependencies derived from your implementation. However, note there are no “component factories” in Curacao.
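If you’re curious how constructor-driven instantiation order can fall out of a few lines of reflection, here is a toy sketch. To be clear, this is not Curacao’s implementation: TinyInjector is hypothetical, it assumes exactly one constructor per class and no circular dependencies, and it skips annotations and package scanning entirely.

```java
import java.lang.reflect.Constructor;
import java.util.HashMap;
import java.util.Map;

// Toy illustration only; assumes one constructor per class and no cycles.
final class TinyInjector {

  static final class Foo {
    Foo() {}
  }

  static final class Bar {
    final Foo foo_;
    Bar(final Foo foo) {
      foo_ = foo;
    }
  }

  private final Map<Class<?>, Object> singletons_ = new HashMap<Class<?>, Object>();

  // Returns the singleton for 'type', first resolving every constructor
  // argument recursively so dependencies are instantiated before dependents.
  <T> T instance(final Class<T> type) {
    final Object existing = singletons_.get(type);
    if (existing != null) {
      return type.cast(existing);
    }
    final Constructor<?> ctor = type.getDeclaredConstructors()[0];
    final Class<?>[] paramTypes = ctor.getParameterTypes();
    final Object[] args = new Object[paramTypes.length];
    for (int i = 0; i < paramTypes.length; i++) {
      args[i] = instance(paramTypes[i]);
    }
    try {
      ctor.setAccessible(true);
      final T created = type.cast(ctor.newInstance(args));
      singletons_.put(type, created);
      return created;
    } catch (ReflectiveOperationException e) {
      throw new IllegalStateException("Could not instantiate " + type, e);
    }
  }

  public static void main(final String[] args) {
    final TinyInjector injector = new TinyInjector();
    final Bar bar = injector.instance(Bar.class);
    // Foo was created first, and the same Foo singleton is reused.
    System.out.println(bar.foo_ == injector.instance(Foo.class)); // true
  }
}
```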

Your object graphs can be as simple, or as complex as you’d like.

Injecting components into your controllers is easy too. In your controller, simply add an @Injectable annotated constructor. Component singletons, once instantiated, will be passed to your controller as constructor arguments.

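@Controller
public final class SampleController {

  private final Bar bar_;

  @Injectable
  public SampleController(final Bar bar) {
    bar_ = bar;
  }

  // Illustrative route; the path is an assumption.
  @RequestMapping("^\\/bar$")
  public String bar() {
    return bar_.toString();
  }

}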


Lastly, components that need to be aware of application container lifecycle events such as startup and shutdown can implement the ComponentInitializable and/or ComponentDestroyable interfaces.

@Component
public final class WebServiceClient implements ComponentDestroyable {

  private final AsyncHttpClient httpClient_;

  public WebServiceClient() {
    httpClient_ = new AsyncHttpClient();
  }

  /**
   * Called once during application shutdown to stop this component.
   * Useful for cleaning up or closing open sockets and other resources.
   */
  @Override
  public void destroy() throws Exception {
    httpClient_.close();
  }

}

@Component
public final class DataStore implements ComponentInitializable {

  private final MongoDbClient mongo_;

  public DataStore() {
    mongo_ = new MongoDbClient();
  }

  /**
   * Called once during application startup to initialize this component.
   * Useful for further initializing a component beyond its constructor.
   */
  @Override
  public void initialize() throws Exception {
    mongo_.setCredentials("foo", "bar");
  }

}



Request Filters

Request filters are singletons that implement the CuracaoRequestFilter interface and are invoked as “pre-processing” tasks before an underlying controller method is invoked. Filters can accept the request, and attach context attributes for consumption by a controller. Or, they can reject the request by throwing an Exception.

This makes request filters a suitable place for handling request authentication or authorization.

Unlike vanilla Servlet filters, Curacao request filters are handled asynchronously outside of a blocking Servlet container thread. In other words, Curacao calls request.startAsync() on the incoming ServletRequest before it invokes your request filter. This means that Curacao request filters are asynchronously handled in the context of a normal request.

Just like other components or controllers, filters are injectable — decorate a filter’s constructor with @Injectable to inject component singletons into the filter.

public final class SessionFilter implements CuracaoRequestFilter {

  private final DataStore ds_;

  @Injectable
  public SessionFilter(final DataStore ds) {
    ds_ = ds;
  }

  @Override
  public void filter(final CuracaoRequestContext context) throws Exception {
    final HttpServletRequest request = context.request_;
    final String auth = request.getHeader("Authorization");
    // Authenticate the request against the data store, throw Exception if needed
    final String userId = ds_.authorizeUser(auth);
    // If we got here, we must have successfully authenticated the user.
    // Attach the user's ID to the context to be picked up by a controller later.
    context.setProperty("user-id", userId);
  }

}


The CuracaoRequestContext is an object that represents a mutable “request context” which spans across the life of the request. A filter can use the internal mutable property map in this class to pass data objects from itself to another filter, controller, or argument mapper (covered later).
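The accept-or-reject flow, together with the shared property map, can be sketched in plain Java as follows. The types below (RequestContext, RequestFilter, and so on) are stand-ins invented for this example; they only imitate the shape of the Curacao classes described above and are not the real API.

```java
import java.util.HashMap;
import java.util.Map;

// Stand-in types for illustration; not Curacao's classes.
final class FilterSketch {

  // A mutable, per-request context carrying a property map.
  static final class RequestContext {
    private final Map<String, Object> properties_ = new HashMap<String, Object>();
    final String authHeader_;
    RequestContext(final String authHeader) {
      authHeader_ = authHeader;
    }
    void setProperty(final String key, final Object value) {
      properties_.put(key, value);
    }
    Object getProperty(final String key) {
      return properties_.get(key);
    }
  }

  interface RequestFilter {
    void filter(RequestContext context);
  }

  // Rejects requests without credentials; otherwise attaches a user ID.
  static final class SessionFilter implements RequestFilter {
    @Override
    public void filter(final RequestContext context) {
      if (context.authHeader_ == null) {
        throw new SecurityException("Missing credentials");
      }
      context.setProperty("user-id", "user-for-" + context.authHeader_);
    }
  }

  // Runs each filter in order, then a pretend controller that reads the
  // property a filter attached earlier.
  static String handle(final RequestContext context, final RequestFilter... filters) {
    for (final RequestFilter f : filters) {
      f.filter(context);
    }
    return "Hello, " + context.getProperty("user-id");
  }

  public static void main(final String[] args) {
    System.out.println(handle(new RequestContext("token"), new SessionFilter()));
    // prints "Hello, user-for-token"
  }
}
```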

Attach one or more filters to your controller methods using the filters attribute of the @RequestMapping annotation.

@Controller
public final class SecureController {

  @RequestMapping(value="^\\/secure$", filters={SessionFilter.class})
  public String secureArea() {
    // Secure.
  }

}


Request Mappers

Request mappers are immutable singletons that translate the request body, or some other piece of the request, into something directly usable by a controller method. For example, reading and translating an incoming form POST body into a Multimap<String,String>. Or, reading and translating an incoming PUT request body into a custom object — e.g., unmarshalling a JSON string into an application entity.

For convenience, Curacao ships with several default request mappers. For instance, in your controller, if you’d like to convert the incoming request body to a Multimap<String,String>, simply add the right argument and annotate it with the @RequestBody annotation. Curacao uses Google’s Guava Multimap implementation exclusively.

@Controller
public final class RequestBodyDemoController {

  /**
   * Buffer the request body, and decode the URL encoded key-value parameters
   * therein into a Multimap<String,String>.
   */
  @RequestMapping(value="^\\/post", methods=POST)
  public String post(@RequestBody final Multimap<String,String> body) {
    // Assume POST body was 'foo=bar&dog=cat', body.get("foo") returns ["bar"]
    // Multimap.get returns a Collection, not a List.
    final Collection<String> foo = body.get("foo");
    return foo.toString();
  }

  /**
   * Get a single parameter from the POST body, 'foo'.
   */
  @RequestMapping(value="^\\/post\\/foo", methods=POST)
  public String postFoo(@RequestBody("foo") final String foo) {
    return foo;
  }

  /**
   * Buffer the entire request body into an NIO ByteBuffer.
   */
  @RequestMapping(value="^\\/put\\/buffer", methods=PUT)
  public String postBuffer(@RequestBody final ByteBuffer body) {
    return "Byte buffer capacity: " + body.capacity();
  }

}
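For a sense of what decoding a URL-encoded body into a multimap involves, here is a small JDK-only sketch. It uses a Map<String, List<String>> in place of Guava’s Multimap so it has no dependencies, and FormBodyDecoder is a hypothetical name; Curacao’s own mapper may well handle edge cases differently.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper; a JDK-only stand-in for a Multimap-based mapper.
final class FormBodyDecoder {

  // Splits 'a=1&a=2&b=3' into {a=[1, 2], b=[3]}, URL-decoding keys and values.
  static Map<String, List<String>> decode(final String body) {
    final Map<String, List<String>> params = new LinkedHashMap<String, List<String>>();
    if (body == null || body.isEmpty()) {
      return params;
    }
    for (final String pair : body.split("&")) {
      if (pair.isEmpty()) {
        continue;
      }
      final int eq = pair.indexOf('=');
      final String key = urlDecode(eq < 0 ? pair : pair.substring(0, eq));
      final String value = urlDecode(eq < 0 ? "" : pair.substring(eq + 1));
      List<String> values = params.get(key);
      if (values == null) {
        values = new ArrayList<String>();
        params.put(key, values);
      }
      values.add(value);
    }
    return params;
  }

  private static String urlDecode(final String s) {
    try {
      return URLDecoder.decode(s, "UTF-8");
    } catch (UnsupportedEncodingException e) {
      throw new IllegalStateException(e); // UTF-8 is always supported
    }
  }

  public static void main(final String[] args) {
    System.out.println(decode("foo=bar&dog=cat").get("foo")); // [bar]
    System.out.println(decode("a=1&a=2").get("a"));           // [1, 2]
  }
}
```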

Implementing your own request mapper is easy too. For instance, if you need to unmarshall a JSON POST body into an object, simply write a class to extend InputStreamReaderRequestMapper<T> and annotate it with the @ControllerArgumentTypeMapper annotation.

@ControllerArgumentTypeMapper(MyObject.class)
public final class MyObjectMapper extends InputStreamReaderRequestMapper<MyObject> {

  private final DataStore ds_;

  /**
   * Yes, argument mappers are component injectable too!
   */
  @Injectable
  public MyObjectMapper(final DataStore ds) {
    ds_ = ds;
  }

  @Override
  public MyObject resolveWithReader(final InputStreamReader reader) throws Exception {
    // Use provided 'InputStreamReader' and unmarshall string to a MyObject instance
    return myObject;
  }

}


Now that you’ve registered a request mapper for type MyObject, you can simply add a MyObject argument to any controller method. Curacao will automagically invoke your request mapper to convert the body to a MyObject, before calling your controller method.

@Controller
public final class MyObjectController {

  @RequestMapping(value="^\\/myobject", methods=POST)
  public String myObject(final MyObject mine) {
    // Do something with MyObject
    return "Worked!";
  }

}


You can find the default set of Curacao request mappers here.

Response Handlers

Curacao takes a “return or throw anything, from anywhere” approach to response handling.

Like you might expect, response handlers are designed to convert controller returned objects into a response, or convert thrown exceptions into a response. Fortunately, Curacao handles AsyncContext completion for you, so in most cases there’s no need to write verbose code that forcibly calls context.complete() in your controllers.

For convenience, Curacao ships with several default response handlers. For instance, when your controller method returns a String, Curacao automatically interprets this return type as a text/plain; charset=UTF-8 encoded response body and sets the right response headers accordingly. Similarly, if your controller method returns a java.io.File object, Curacao interprets this as a static resource response — images, CSS, JavaScript, etc. As such, Curacao will set the right Content-Type response header based on the file’s extension, and will automatically stream the File contents back to the client.

Thrown exceptions are handled in the same way. For example, Curacao’s default response handling behavior for any thrown java.lang.Exception is to return a vanilla 500 Internal Server Error with an empty response body.
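One way to picture “return or throw anything” is a registry keyed by type, where the handler lookup walks up the class hierarchy until it finds a match. The sketch below is only a guess at the general idea, written in plain Java with made-up names (ResponseHandlers, Handler); it is not how Curacao is actually implemented.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of a type-keyed handler registry.
final class ResponseHandlers {

  interface Handler {
    String render(Object value);
  }

  private final Map<Class<?>, Handler> handlers_ = new HashMap<Class<?>, Handler>();

  void register(final Class<?> type, final Handler handler) {
    handlers_.put(type, handler);
  }

  // Walks up the superclass chain of the returned (or thrown) object until
  // a registered handler is found.
  String render(final Object value) {
    for (Class<?> c = value.getClass(); c != null; c = c.getSuperclass()) {
      final Handler h = handlers_.get(c);
      if (h != null) {
        return h.render(value);
      }
    }
    throw new IllegalStateException("No handler for " + value.getClass());
  }

  // Two defaults that imitate the behaviors described in the text.
  static ResponseHandlers withDefaults() {
    final ResponseHandlers r = new ResponseHandlers();
    r.register(String.class, new Handler() {
      @Override
      public String render(final Object v) {
        return "200 text/plain: " + v;
      }
    });
    r.register(Exception.class, new Handler() {
      @Override
      public String render(final Object v) {
        return "500 Internal Server Error";
      }
    });
    return r;
  }

  public static void main(final String[] args) {
    final ResponseHandlers r = withDefaults();
    System.out.println(r.render("hi"));                              // 200 text/plain: hi
    System.out.println(r.render(new IllegalArgumentException("x"))); // 500 Internal Server Error
  }
}
```

Registering a handler for a more specific exception type would take precedence over the generic Exception handler, since the lookup starts at the object’s own class.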

These default behaviors make writing controllers surprisingly pleasant and simple. However, you can, of course, override any of these default behaviors by implementing your own RenderingResponseTypeMapper.

public final class MyObjectResponseHandler extends RenderingResponseTypeMapper<MyObject> {

  private final DataStore ds_;

  /**
   * Yes, response handlers are component injectable too!
   */
  @Injectable
  public MyObjectResponseHandler(final DataStore ds) {
    ds_ = ds;
  }

  @Override
  public void render(final AsyncContext context,
                     final HttpServletResponse response,
                     @Nonnull final MyObject obj) throws Exception {
    response.setContentType("application/json; charset=UTF-8");
    try(final Writer w = response.getWriter()) {
      // Convert 'MyObject' to JSON using the library of your choice.
    }
  }

}

Now that a response handler has been defined for type MyObject, anytime a controller method returns an object of type MyObject, the MyObjectResponseHandler above will be called by Curacao to convert it into JSON automatically.

Thrown exceptions are handled in the same way.

public final class AuthenticationExceptionResponseHandler
  extends RenderingResponseTypeMapper<AuthenticationException> {

  @Override
  public void render(final AsyncContext context,
                     final HttpServletResponse response,
                     @Nonnull final AuthenticationException ex) throws Exception {
    // Redirect the user to the login page.
  }

}

Here’s an example controller that makes use of these response handlers.

@Controller
public final class ResponseHandlerDemoController {

  /**
   * This method returns a 'MyObject' instance, which will trigger Curacao
   * to invoke the MyObjectResponseHandler above to render it as JSON.
   */
  @RequestMapping("^\\/myobject$") // Illustrative path (an assumption)
  public MyObject getMyObject() {
    return new MyObject();
  }

  /**
   * When a controller throws an 'AuthenticationException', Curacao catches this
   * and invokes the 'AuthenticationExceptionResponseHandler' which redirects
   * the user to the login page.
   */
  @RequestMapping("^\\/home$") // Illustrative path (an assumption)
  public String home() {
    boolean isLoggedIn = false;
    // Validate that user is authenticated and request contains a valid session.
    if (!isLoggedIn) {
      throw new AuthenticationException();
    }
    return "Hello, world!";
  }

}


You can find the default set of Curacao response handlers here.


Curacao has been proudly submitted to TechEmpower’s Framework Benchmark test suite.

I’m anxiously waiting on results from Round 10 of their tests, which should include Curacao. When the test results are available, I intend to publish them here.

Further Examples

In the spirit of “eating my own dog food”, this very blog is built on Curacao and is fully open source on GitHub. If you’re looking for more complex component definitions, and realistic request mapping and response handling examples, the application source of this blog will be a great start.

Additionally, further examples that demonstrate the flexibility of Curacao can be found in the curacao-examples project on GitHub.

Open Source

Curacao is free on GitHub and licensed under the popular MIT License.

Issues and pull requests welcome.


SBT: Recursive sbt.IO.listFiles


Thu 26 Jun 2014 11:31:04 -0800

Annoyingly, SBT’s very own sbt.IO util Object doesn’t provide a mechanism to recursively list files in a directory.

As of SBT 0.13.5, the three listFiles functions it does implement are only somewhat useful for complex builds.

  • def listFiles(dir: File): Array[File]
  • def listFiles(dir: File, filter: java.io.FileFilter): Array[File]
  • def listFiles(filter: java.io.FileFilter)(dir: File): Array[File]


Perhaps more frustrating is that sbt.IO is an object (a singleton), which by its very nature in Scala means it cannot be extended. So, even if I wanted to extend sbt.IO and override listFiles to make it recursive, I can’t.

So, here’s how one can recursively list files in a directory leveraging SBT’s sbt.IO.listFiles:

import sbt._

trait IOHelpers {
  def listFilesRecursively(dir: File): Seq[File] = {
    val list = IO.listFiles(dir)
    list.filter(_.isFile) ++ list.filter(_.isDirectory).flatMap(listFilesRecursively)
  }
}

Functional programming for-the-win!


Manually Throttle the Bandwidth of a Linux Network Interface


Sun 08 Jun 2014 11:52:32 -0800

In complex service-oriented application stacks, some bugs only manifest themselves on congested or slow networking interfaces. Consider a web-service running on a generic Linux box with a single networking interface, eth0. If eth0 is busy enough to completely saturate its networking link, a web-service running on the host behind that link may experience odd behavior when things slow down.

For instance, established client connections time out but the service fails to gracefully clean up after itself, leaving these connections open. This is a classic source of connection leaks, which on the JVM usually results in the dreaded IOException: Too many open files problem.

So, in development, if one wants to see how a service behaves behind a slow networking interface with extra latency, there are two options:

  • Download large files in a loop to artificially saturate your networking link
  • Or, more appropriately, figure out how to shape networking traffic on an interface of your choice

A quick search for “how to artificially slow down a Linux networking interface” produced a number of interesting results. Folks mostly discussed 3rd party tools like Wondershaper and Dummynet. Other suggestions involved proxying all HTTP/HTTPS traffic through Apache’s mod_bw — yuck!

Fortunately, most Linux distros ship with the tc command which is used to configure Traffic Control in the Linux kernel.

On my Ubuntu 12.04 box I’ve got a single gigabit networking interface, eth0.

Let’s slow ’er down!

Add latency, slowing ping times

Without throttling, ping times to another local node on my home network are less than 0.2ms on average.

[mark@ubuntu]~$ ping regatta
PING regatta.kolich.local ( 56(84) bytes of data.
64 bytes from regatta.kolich.local ( icmp_req=1 ttl=64 time=0.118 ms
64 bytes from regatta.kolich.local ( icmp_req=2 ttl=64 time=0.193 ms
64 bytes from regatta.kolich.local ( icmp_req=3 ttl=64 time=0.181 ms

So, let’s use tc to add 500ms of latency to all network traffic.

[mark@ubuntu]~$ sudo tc qdisc add dev eth0 root netem delay 500ms

Now, trying ping again, note time=500 ms as desired.

[mark@ubuntu]~$ ping regatta
PING regatta.kolich.local ( 56(84) bytes of data.
64 bytes from regatta.kolich.local ( icmp_req=1 ttl=64 time=500 ms
64 bytes from regatta.kolich.local ( icmp_req=2 ttl=64 time=500 ms
64 bytes from regatta.kolich.local ( icmp_req=3 ttl=64 time=500 ms

Using tc we’ve added a delay of 500ms to all traffic. This will slow short connections, but once a connection gets past the TCP Slow-start window we’re back to full speed. That is, the connection may start slow — as shaped by our tc delay tweak — but once things are started TCP will ramp up and eventually hit full speed again.

Throttling a sustained maximum rate

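So, let’s configure a sustained maximum rate using tc. In other words, let’s configure Linux to never allow eth0 to use more than 1kbps regardless of port or application. Note that tc reads the kbps suffix as kilobytes per second (kilobits per second would be kbit). Also, if the netem qdisc from the previous section is still attached to eth0, delete it first with sudo tc qdisc del dev eth0 root; otherwise adding a second root qdisc will fail.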

[mark@ubuntu]~$ sudo tc qdisc add dev eth0 handle 1: root htb default 11
[mark@ubuntu]~$ sudo tc class add dev eth0 parent 1: classid 1:1 htb rate 1kbps
[mark@ubuntu]~$ sudo tc class add dev eth0 parent 1:1 classid 1:11 htb rate 1kbps

Looks good. Now let’s download a large .iso file using wget to prove to ourselves that our sustained maximum rate throttling is actually working.

[mark@ubuntu]~$ wget http://mirrors.kernel.org/.../CentOS-6.5-x86_64-bin-DVD1.iso -O /dev/null
HTTP request sent, awaiting response... 200 OK
Length: 4467982336 (4.2G) [application/octet-stream]
Saving to: `/dev/null'
 13% [==>                                   ] 580,837,703     10.5K/s

Note the download isn’t going to hover exactly at 1.0K/sec, because the actual download speed as reported by wget is an average over time. In short, you’ll see numbers closer to an even 1.0K/sec the longer the transfer runs. In this example, I didn’t wait to download an entire 4.2GB file, so the 10.5K/s you see above is just wget averaging the transfer speed over the short time I left wget running.

Clearing all tc rules

Now that we’re done, simply delete all traffic control throttling rules to return to normal.

[mark@ubuntu]~$ sudo tc qdisc del dev eth0 root