Introducing Curacao

314de9da9114a65a6fa4367302cbb63ce28568fc

Sun 31 Aug 2014 14:57:31 -0800

Tired of Spring, Jersey, raw Servlets, and other REST toolkits, I cautiously approached the thought of building my own JVM web-layer from scratch. In retrospect, I probably didn’t need to spend time on yet another toolkit to help shield engineers from the boilerplate and complexity of web-applications on the JVM. However, I found most existing libraries (and frameworks) to be overly bloated, complex and just generally awful.

I wanted something “better” — of course, better purely by my own personal definition.

I’ve written enough Java and Scala to recognize what’s most relevant when choosing a highly asynchronous and flexible web-layer upon which to build a scalable web-service or application. With a foot in multiple camps, and having previously used most widely available frameworks and tools, I’d like to think I have a unique perspective on this problem. From what I can tell, there are generally two sides:

  1. The asynchronous overkill approach — Akka, Spray and Play
  2. The boil-the-ocean, thread-based approach — Spring and Apache Struts

Each of these tools has its own merits, but I conjecture that a large majority of the time, they are either misused or chosen for the wrong reasons. Quite often, especially in software engineering, developers get lost in a haze of early over-optimization or analysis paralysis. I wish I had a dollar for every time I heard a Product or Engineering Manager say something like, “We need to support 1,000,000 concurrent users! Web-scale!” Hold on. Let’s make a pragmatic upfront technical decision, and build a beautiful product first. If and when the opportunity to go “web-scale” presents itself, we can address those tough scalability questions later.

However, in the meantime, there must exist some web-layer that:

  • doesn’t attempt to boil-the-ocean
  • can be used with a Java or Scala stack
  • is easy to understand and debug
  • avoids confusing and generally awful DSLs
  • is fully asynchronous
  • doesn’t require tens-of-megabytes of dependency hell
  • doesn’t “baby” engineers with fancy shells and command line tools
  • has a reasonable set of complete documentation and examples
  • is “fast”

And so, I sat down one evening many moons ago, and began to write my own web-layer from scratch — one that attempts to address many, if not all, of the shortcomings I perceive in existing toolkits.

I named the project Curacao, because I like fancy blue drinks with tiny umbrellas.

10,000 foot view

At a high level, here are some things you should know about Curacao:

  • it’s written in Java 7, but plays nicely with Scala
  • it’s thread based, built on top of asynchronous Servlets as defined by the Servlet 3.0 spec (part of Java EE 6)
  • takes a “return or throw anything, from anywhere” approach to response handling
  • implements a clean, and very fast, dependency injection model
  • controllers, components, and routes are defined using simple annotations
  • BYO (bring-your-own) ORM library
  • no XML, anywhere — is configurable using HOCON and the Typesafe Config configuration library
  • for JSON, supports GSON and Jackson out-of-the-box
  • compiled, Curacao ships in a single JAR that’s only 150KB in size
  • deployable in any Servlet 3.0 compatible web-application container
  • it’s free and open source on GitHub

Bootstrap

Still here?

Let’s bootstrap a Curacao application in three steps.

  1. First, configure your project to pull in the necessary dependencies. As of this writing, the latest stable version is 2.6.3; however, you should check the Curacao Releases page for the latest version.

    If using Maven:

    <repository>
      <id>Kolichrepo</id>
      <name>Kolich repo</name>
      <url>http://markkolich.github.io/repo/</url>
      <layout>default</layout>
    </repository>
    
    <dependency>
      <groupId>com.kolich.curacao</groupId>
      <artifactId>curacao</artifactId>
      <version>2.6.3</version>
      <scope>compile</scope>
    </dependency>
    

    If using SBT:

    resolvers += "Kolich repo" at "http://markkolich.github.io/repo"
    val curacao = "com.kolich.curacao" % "curacao" % "2.6.3" % "compile"
    

  2. Second, inject the required listener and dispatcher into your application's web.xml.

    <web-app>
                     
      <listener>
        <listener-class>com.kolich.curacao.CuracaoContextListener</listener-class>
      </listener>
    
      <servlet>
        <servlet-name>CuracaoDispatcherServlet</servlet-name>
        <servlet-class>com.kolich.curacao.CuracaoDispatcherServlet</servlet-class>
        <load-on-startup>1</load-on-startup>
        <async-supported>true</async-supported>
      </servlet>
      <servlet-mapping>
        <servlet-name>CuracaoDispatcherServlet</servlet-name>
        <url-pattern>/*</url-pattern>
      </servlet-mapping>
    
    </web-app>
    

    The CuracaoContextListener listens for ServletContext lifecycle events, and initializes and destroys application components accordingly. And, like you might expect, the CuracaoDispatcherServlet is responsible for receiving and dispatching incoming requests from the Servlet container.

  3. Lastly, create a HOCON configuration file named application.conf and put it in a place that's accessible on your classpath — typically somewhere like src/main/resources. This file defines your Curacao application configuration, and is loaded from the classpath at runtime.

    curacao {
                        
      ## Your boot package is the package in which all of your components and
      ## controllers reside.  At boot time, Curacao uses reflection and scans
      ## this package, and all of its children, looking for annotated classes
      ## to dynamically instantiate.
      boot-package = "com.foobar"
      
      ## The asynchronous timeout for any response.  If your application fails to
      ## respond to any request within this timeout, Curacao will kick in and
      ## throw an exception, which allows you to abort+handle the response
      ## gracefully.  Set to 0 (zero) for an infinite timeout.
      async-context-timeout = 30s
      
      ## The maximum number of threads that will be used to handle incoming
      ## requests.  The number of concurrent request worker threads will never
      ## exceed this size.  Set to 0 (zero) for an unbounded thread pool.
      pools.request {
        size = 4
      }
      
      ## The maximum number of threads that will be used to process outgoing
      ## responses.  The number of concurrent response worker threads will never
      ## exceed this size.  Set to 0 (zero) for an unbounded thread pool. 
      pools.response {
        size = 4
      }
      
    }
    

    Take a look at Curacao's global reference.conf for the complete list of application configuration options. This reference.conf file defines the Curacao default set of configuration options, which are completely overridable in your own application.conf.

That’s it! You’ve bootstrapped your first Curacao enabled application.

Controllers

At their core, Curacao controllers are immutable singletons that are automatically instantiated at application startup — they’re classes that contain methods which Curacao will invoke via reflection when dispatching a request. On launch, Curacao recursively scans your defined boot-package looking for any classes annotated with the @Controller annotation. As requests are received and dispatched from the Servlet container, Curacao very efficiently interrogates each known controller instance looking for a method worthy of handling the request.

For maximum efficiency at runtime, regular expressions and request routing tables are compiled and cached once at startup.

Here’s a sample controller implementation that demonstrates several key features:

@Controller
public final class UserController {

  @RequestMapping("^\\/users\\/(?<userId>[a-zA-Z_0-9\\-]+)$")
  public String getUser(@Path("userId") final String userId) {
    return "Load user: " + userId;
  }
  
  @RequestMapping(value="^\\/users$", methods=POST)
  public String createUser(@RequestBody final String body) {
    // Lazily convert 'body' to a user object
    // Insert user into data store
    return "Successfully created user.";
  }
  
  @RequestMapping(value="^\\/users\\/(?<userId>[a-zA-Z_0-9\\-]+)$", methods=PUT)
  public void updateUser(@Path("userId") final String userId,
                         final HttpServletResponse response,
                         final AsyncContext context) {
    try {
      // Do work, update user with id 'userId'
      response.setStatus(201); // 201 Created
    } finally {
      // Complete context manually due to 'void' return type
      context.complete();
    }    
  }
  
  @RequestMapping(value="^\\/users\\/(?<userId>[a-zA-Z_0-9\\-]+)$", methods=DELETE)
  public void deleteUser(@Path("userId") final String userId,
                         final HttpServletResponse response,
                         final AsyncContext context) {
    try {
      // Delete user specified by 'userId'
      response.setStatus(204); // 204 No Content
    } finally {
      // Complete context manually due to 'void' return type
      context.complete();
    }    
  }
  
  @RequestMapping("^\\/users$")
  public Future<List<String>> queryUsers(@Query("name") final String name) {
    // Query data store for a list of users matching the provided 'name'.
    // Return a Future<List<String>> which represents an async operation that
    // fetches a list of user ID's.
    return someFuture;
  }

}

Like other popular toolkits, request routing is handled using a familiar @RequestMapping method annotation. The @RequestMapping annotation allows you to specify, among other things, the HTTP request method and URI path to match. The default behavior of @RequestMapping uses Java regular expressions and Java 7’s named capture groups to extract path components from the incoming request URI.
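To make the named capture group mechanics concrete, here is a minimal sketch that uses only java.util.regex and no Curacao classes. The class name RouteSketch and the method extractUserId are hypothetical names chosen for illustration; the pattern has the same shape as the /users/{userId} route in the controller above. Note that inside a Java string literal the hyphen escape has to be written as \\-.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only; this is not Curacao's routing code.
final class RouteSketch {

  // Compiled once and reused for every lookup, in the spirit of
  // "compile and cache at startup".
  private static final Pattern USER_ROUTE =
      Pattern.compile("^\\/users\\/(?<userId>[a-zA-Z_0-9\\-]+)$");

  /** Returns the captured user ID, or null when the path does not match. */
  static String extractUserId(final String path) {
    final Matcher m = USER_ROUTE.matcher(path);
    return m.matches() ? m.group("userId") : null;
  }

  public static void main(final String[] args) {
    System.out.println(extractUserId("/users/mark-k_01"));
    System.out.println(extractUserId("/users/"));
  }

}
```

A path that does not match the whole expression yields no capture at all, which is why the sketch returns null rather than an empty string.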

When you need the entire request body as a UTF-8 encoded String, simply add a String method argument and annotate it with the @RequestBody annotation. Further, query parameters can be easily extracted using the @Query method argument annotation. For more complex scenarios, when you need direct access to the underlying HttpServletResponse or Servlet 3.0 AsyncContext object, just add them as arguments and Curacao will pass them to your method when invoked. Last but not least, your controller methods may return a Future<?> anytime you need to render the result of an asynchronous operation that may or may not complete successfully at some point in the future.

In the unlikely event you need to route requests by something other than the URI/path, you can implement your own CuracaoPathMatcher and pass it to Curacao using the matcher attribute of the @RequestMapping annotation.

Components

Like Curacao controllers, components are immutable singletons instantiated at application startup — they’re classes that represent pieces of shared logic or configuration, much like Java “beans”. Unsurprisingly, component classes are annotated with the @Component annotation.

Component singletons can be passed to other components, controllers, request filters, request mappers and response handlers; we’ll cover the latter three later in this post. In the spirit of immutability, Curacao components can only be passed to other Curacao instantiated classes via their constructors: there are no “getters” and “setters”. Current best practice is to declare the instance variables of your Curacao instantiated classes final, ensuring immutability.

Consider the two components below, Foo and Bar — based on their constructor declarations, Bar depends on Foo. In other words, Bar cannot be instantiated unless it is passed an instance of Foo via its constructor.

@Component
public final class Foo {

  public Foo() {
    // Stuph.
  }

}

@Component
public final class Bar {

  private final Foo foo_;

  @Injectable
  public Bar(@Nonnull final Foo foo) {
    foo_ = foo;
  }

}

You may have noticed the @Injectable constructor annotation. The @Injectable annotation is used to declare dependencies on other components. In the example above, because class Bar has an @Injectable annotated constructor with an argument of type Foo, Curacao interprets this relationship as “class Bar depends on Foo”. Therefore, Foo will be instantiated first, and then passed to Bar’s constructor.

Curacao automagically identifies such dependencies, and instantiates component singletons in dependency order. Like other dependency-injection (DI) models, Curacao scans your declared boot-package and intelligently builds an object graph by analyzing dependencies derived from your implementation. However, note there are no “component factories” in Curacao.

Your object graphs can be as simple, or as complex as you’d like.
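The idea of instantiating singletons in dependency order can be sketched with plain reflection. What follows is a toy illustration under stated assumptions, not Curacao's actual implementation: InjectionSketch, resolve and creationOrder are hypothetical names, every class is assumed to declare exactly one constructor, and there is no annotation scanning or cycle detection.

```java
import java.lang.reflect.Constructor;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; this is not Curacao's injection code.
final class InjectionSketch {

  static final class Foo {
    Foo() {}
  }

  static final class Bar {
    final Foo foo;
    Bar(final Foo foo) {
      this.foo = foo;
    }
  }

  // One singleton per type, remembered in the order of creation.
  private final Map<Class<?>, Object> singletons = new LinkedHashMap<>();

  /** Instantiates 'type' after first resolving each of its constructor arguments. */
  Object resolve(final Class<?> type) {
    final Object existing = singletons.get(type);
    if (existing != null) {
      return existing;
    }
    // Assumption: exactly one declared constructor per class.
    final Constructor<?> ctor = type.getDeclaredConstructors()[0];
    final Class<?>[] paramTypes = ctor.getParameterTypes();
    final Object[] args = new Object[paramTypes.length];
    for (int i = 0; i < paramTypes.length; i++) {
      args[i] = resolve(paramTypes[i]); // dependencies are created first
    }
    try {
      ctor.setAccessible(true);
      final Object instance = ctor.newInstance(args);
      singletons.put(type, instance);
      return instance;
    } catch (ReflectiveOperationException e) {
      throw new IllegalStateException("Cannot instantiate " + type, e);
    }
  }

  /** The types in the order in which their singletons were created. */
  List<Class<?>> creationOrder() {
    return new ArrayList<>(singletons.keySet());
  }

}
```

Asking the sketch for a Bar creates Foo first and then hands that same Foo instance to Bar's constructor, which is the relationship described above.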

Injecting components into your controllers is easy too. In your controller, simply add an @Injectable annotated constructor. Component singletons, once instantiated, will be passed to your controller as constructor arguments.

@Controller
public final class SampleController {

  private final Bar bar_;

  @Injectable
  public SampleController(final Bar bar) {
    bar_ = bar;
  }
  
  @RequestMapping("^\\/bar$")
  public String bar() {
    return bar_.toString();
  }

}

Lastly, components that need to be aware of application container lifecycle events such as startup and shutdown can implement the ComponentInitializable and/or ComponentDestroyable interfaces.

@Component
public final class WebServiceClient implements ComponentDestroyable {

  private final AsyncHttpClient httpClient_;

  public WebServiceClient() {
    httpClient_ = new AsyncHttpClient();
  }
  
  /**
   * Called once during application shutdown to stop this component.
   * Is useful to cleanup or close open sockets and other resources.
   */
  @Override
  public void destroy() throws Exception {
    httpClient_.close();
  }

}

@Component
public final class DataStore implements ComponentInitializable {

  private final MongoDbClient mongo_;

  public DataStore() {
    mongo_ = new MongoDbClient();
  }
  
  /**
   * Called once during application startup to initialize this component.
   * Is useful to further initialize a component beyond its constructor.
   */
  @Override
  public void initialize() throws Exception {
    mongo_.setCredentials("foo", "bar");
    mongo_.setMaxConnections(100);
  }

}

Filters

Request filters are singletons that implement the CuracaoRequestFilter interface and are invoked as “pre-processing” tasks before an underlying controller method is invoked. Filters can accept the request, and attach context attributes for consumption by a controller. Or, they can reject the request by throwing an Exception.

This makes request filters a suitable place for handling request authentication or authorization.

Unlike vanilla Servlet filters, Curacao request filters are handled asynchronously outside of a blocking Servlet container thread. In other words, Curacao calls request.startAsync() on the incoming ServletRequest before it invokes your request filter. This means that Curacao request filters are asynchronously handled in the context of a normal request.

Just like other components or controllers, filters are injectable — decorate a filter’s constructor with @Injectable to inject component singletons into the filter.

public final class SessionFilter implements CuracaoRequestFilter {

  private final DataStore ds_;

  @Injectable
  public SessionFilter(final DataStore ds) {
    ds_ = ds;
  }
  
  @Override
  public void filter(final CuracaoRequestContext context) throws Exception {
    final HttpServletRequest request = context.request_;
    final String auth = request.getHeader("Authentication");
    // Authenticate the request against the data store, throw Exception if needed 
    final String userId = ds_.authorizeUser(auth);
    // If we got here, we must have successfully authenticated the user.
    // Attach the user's ID to the context to be picked up by a controller later.
    context.setProperty("user-id", userId);
  }

}

The CuracaoRequestContext is an object that represents a mutable “request context” which spans across the life of the request. A filter can use the internal mutable property map in this class to pass data objects from itself to another filter, controller, or argument mapper (covered later).

Attach one or more filters to your controller methods using the filters attribute of the @RequestMapping annotation.

@Controller
public final class SecureController {

  @RequestMapping(value="^\\/secure$", filters={SessionFilter.class})
  public String secureArea() {
    // Only reached if SessionFilter accepted the request.
    return "Secure.";
  }

}
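The accept-or-reject contract of a filter can be illustrated without any Servlet or Curacao types. In the sketch below, Context, Filter, Handler and dispatch are all hypothetical stand-ins invented for illustration: a filter accepts by returning normally (optionally attaching properties to the context) and rejects by throwing, in which case the handler never runs.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; these are not Curacao's types.
final class FilterChainSketch {

  /** Stand-in for a mutable, per-request context. */
  static final class Context {
    private final Map<String, Object> properties = new HashMap<>();

    void setProperty(final String key, final Object value) {
      properties.put(key, value);
    }

    Object getProperty(final String key) {
      return properties.get(key);
    }
  }

  /** Stand-in for a request filter: return to accept, throw to reject. */
  interface Filter {
    void filter(Context context);
  }

  /** Stand-in for a controller method. */
  interface Handler {
    String handle(Context context);
  }

  /** Runs every filter in order, then the handler; any exception aborts the dispatch. */
  static String dispatch(final List<Filter> filters,
                         final Handler handler,
                         final Context context) {
    for (final Filter f : filters) {
      f.filter(context);
    }
    return handler.handle(context);
  }

}
```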

Request Mappers

Request mappers are immutable singletons that translate the request body, or some other piece of the request, into something directly usable by a controller method. For example, reading and translating an incoming form POST body into a Multimap<String,String>. Or, reading and translating an incoming PUT request body into a custom object — e.g., unmarshalling a JSON string into an application entity.

For convenience, Curacao ships with several default request mappers. For instance, in your controller, if you’d like to convert the incoming request body to a Multimap<String,String>, simply add the right argument and annotate it with the @RequestBody annotation. Curacao uses Google’s Guava Multimap implementation exclusively.

@Controller
public final class RequestBodyDemoController {

  /**
   * Buffer the request body, and decode the URL encoded key-value parameters
   * therein into a Multimap<String,String>.
   */
  @RequestMapping(value="^\\/post", methods=POST)
  public String post(@RequestBody final Multimap<String,String> body) {
    // Assume POST body was 'foo=bar&dog=cat', body.get("foo") returns ["bar"]
    Collection<String> foo = body.get("foo");
    return foo.toString();
  }
  
  /**
   * Get a single parameter from the POST body, 'foo'.
   */
  @RequestMapping(value="^\\/post\\/foo", methods=POST)
  public String postFoo(@RequestBody("foo") final String foo) {
    return foo;
  }
  
  /**
   * Buffer the entire request body into an NIO ByteBuffer.
   */
  @RequestMapping(value="^\\/put\\/buffer", methods=PUT)
  public String postBuffer(@RequestBody final ByteBuffer body) {
    return "Byte buffer capacity: " + body.capacity();
  }
  
}
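For readers curious what decoding a URL encoded form body involves, here is a minimal JDK-only sketch. It is not Curacao's mapper: FormBodySketch and decode are hypothetical names, and the result is a plain Map<String, List<String>> standing in for the Guava Multimap used above.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; this is not Curacao's request mapper.
final class FormBodySketch {

  /** Decodes 'k1=v1&k2=v2' style bodies, keeping repeated keys as multiple values. */
  static Map<String, List<String>> decode(final String body) {
    final Map<String, List<String>> result = new LinkedHashMap<>();
    for (final String pair : body.split("&")) {
      if (pair.isEmpty()) {
        continue; // tolerate empty bodies and stray '&' separators
      }
      final int eq = pair.indexOf('=');
      final String key = urlDecode(eq < 0 ? pair : pair.substring(0, eq));
      final String value = urlDecode(eq < 0 ? "" : pair.substring(eq + 1));
      List<String> values = result.get(key);
      if (values == null) {
        values = new ArrayList<>();
        result.put(key, values);
      }
      values.add(value);
    }
    return result;
  }

  private static String urlDecode(final String s) {
    try {
      return URLDecoder.decode(s, "UTF-8");
    } catch (UnsupportedEncodingException e) {
      throw new IllegalStateException(e); // UTF-8 is always supported
    }
  }

}
```

A key that appears more than once keeps all of its values, which is the reason a multimap-like structure fits form data better than a plain Map<String,String>.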

Implementing your own request mapper is easy too. For instance, if you need to unmarshall a JSON POST body into an object, simply write a class to extend InputStreamReaderRequestMapper<T> and annotate it with the @ControllerArgumentTypeMapper annotation.

@ControllerArgumentTypeMapper(MyObject.class)
public final class MyObjectMapper extends InputStreamReaderRequestMapper<MyObject> {

  private final DataStore ds_;

  /**
   * Yes, argument mappers are component injectable too!
   */
  @Injectable
  public MyObjectMapper(final DataStore ds) {
    ds_ = ds;
  }

  @Override
  public MyObject resolveWithReader(final InputStreamReader reader) throws Exception {
    // Use provided 'InputStreamReader' and unmarshall string to a MyObject instance
    return myObject;
  }

}

Now that you’ve registered a request mapper for type MyObject, you can simply add a MyObject argument to any controller method. Curacao will automagically invoke your request mapper to convert the body to a MyObject, before calling your controller method.

@Controller
public final class MyObjectController {

  @RequestMapping(value="^\\/myobject", methods=POST)
  public String myObject(final MyObject mine) {
    // Do something with MyObject
    return "Worked!";
  }

}

You can find the default set of Curacao request mappers here.

Response Handlers

Curacao takes a “return or throw anything, from anywhere” approach to response handling.

Like you might expect, response handlers are designed to convert controller returned objects into a response, or convert thrown exceptions into a response. Fortunately, Curacao handles AsyncContext completion for you, so in most cases there’s no need to write verbose code that forcibly calls context.complete() in your controllers.

For convenience, Curacao ships with several default response handlers. For instance, when your controller method returns a String, Curacao automatically interprets this return type as a text/plain; charset=UTF-8 encoded response body and sets the right response headers accordingly. Similarly, if your controller method returns a java.io.File object, Curacao interprets this as a static resource response — images, CSS, JavaScript, etc. As such, Curacao will set the right Content-Type response header based on the file’s extension, and will automatically stream the File contents back to the client.

Thrown exceptions are handled in the same way. For example, Curacao’s default response handling behavior for any thrown java.lang.Exception is to return a vanilla 500 Internal Server Error with an empty response body.

These default behaviors make writing controllers surprisingly pleasant and simple. However, you can, of course, override any of these default behaviors by implementing your own RenderingResponseTypeMapper.

@ControllerReturnTypeMapper(MyObject.class)
public final class MyObjectResponseHandler extends RenderingResponseTypeMapper<MyObject> {

  private final DataStore ds_;
  
  /**
   * Yes, response handlers are component injectable too!
   */
  @Injectable
  public MyObjectResponseHandler(final DataStore ds) {
    ds_ = ds;
  }
		
  @Override
  public void render(final AsyncContext context,
                     final HttpServletResponse response,
                     @Nonnull final MyObject obj) throws Exception {
    response.setStatus(200);
    response.setContentType("application/json; charset=UTF-8");
    try(final Writer w = response.getWriter()) {
      // Convert 'MyObject' to JSON using the library of your choice.
      w.write(obj.toJson());
    }
  }
	
}

Now that a response handler has been defined for type MyObject, anytime a controller method returns an object of type MyObject, the MyObjectResponseHandler above will be called by Curacao to convert it into JSON automatically.

Thrown exceptions are handled in the same way.

@ControllerReturnTypeMapper(AuthenticationException.class)
public final class AuthenticationExceptionResponseHandler
  extends RenderingResponseTypeMapper<AuthenticationException> {
		
  @Override
  public void render(final AsyncContext context,
                     final HttpServletResponse response,
                     @Nonnull final AuthenticationException ex) throws Exception {
    // Redirect the user to the login page.
    response.sendRedirect("/login");
  }
	
}

Here’s an example controller that makes use of these response handlers.

@Controller
public final class ResponseHandlerDemoController {

  /**
   * This method returns a 'MyObject' instance, which will trigger Curacao
   * to invoke the MyObjectResponseHandler above to render it as JSON.
   */
  @RequestMapping("^\\/myobject")
  public MyObject getMyObject() {
    return new MyObject();
  }

  /**
   * When a controller throws an 'AuthenticationException', Curacao catches this
   * and invokes the 'AuthenticationExceptionResponseHandler' which redirects
   * the user to the login page.
   */
  @RequestMapping("^\\/home")
  public String home() {
    boolean isLoggedIn = false;
    // Validate that user is authenticated and request contains a valid session.
    if (!isLoggedIn) {
      throw new AuthenticationException();
    }
    return "Hello, world!";
  }

}
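One way to picture “return or throw anything” is a lookup table keyed by type, where a returned value and a thrown exception travel down the same path. The sketch below is an assumption about the general technique, not a description of Curacao's internals; ResponseDispatchSketch, Renderer, register and dispatch are hypothetical names, and the renderers simply produce strings instead of writing to a response. It assumes the controller never returns null.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.Callable;

// Hypothetical illustration only; this is not Curacao's response handling code.
final class ResponseDispatchSketch {

  /** Stand-in for a response handler bound to one type. */
  interface Renderer {
    String render(Object value);
  }

  private final Map<Class<?>, Renderer> renderers = new HashMap<>();

  void register(final Class<?> type, final Renderer renderer) {
    renderers.put(type, renderer);
  }

  /** Picks the renderer registered for the value's class or its nearest superclass. */
  String render(final Object value) {
    for (Class<?> c = value.getClass(); c != null; c = c.getSuperclass()) {
      final Renderer r = renderers.get(c);
      if (r != null) {
        return r.render(value);
      }
    }
    throw new IllegalStateException("No renderer for " + value.getClass());
  }

  /** Invokes the controller; whatever it returns or throws is rendered the same way. */
  String dispatch(final Callable<Object> controller) {
    Object outcome;
    try {
      outcome = controller.call();
    } catch (Exception e) {
      outcome = e;
    }
    return render(outcome);
  }

}
```

Because the lookup walks up the class hierarchy, a renderer registered for Exception acts as the catch-all, while a more specific exception type can be given its own behavior such as a redirect.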

You can find the default set of Curacao response handlers here.

Performance

Curacao has been proudly submitted to TechEmpower’s Framework Benchmark test suite.

I’m anxiously waiting on results from Round 10 of their tests, which should include Curacao. When the test results are available, I intend to publish them here.

Further Examples

In the spirit of “eating my own dog food”, this very blog is built on Curacao and is fully open source on GitHub. If you’re looking for more complex component definitions, and realistic request mapping and response handling examples, the application source of this blog will be a great start.

Additionally, further examples that demonstrate the flexibility of Curacao can be found in the curacao-examples project on GitHub.

Open Source

Curacao is free on GitHub and licensed under the popular MIT License.

Issues and pull requests welcome.

Cheers!

SBT: Recursive sbt.IO.listFiles

96b0fbe1fdf322ad86c2349d52e94d20329f15d6

Thu 26 Jun 2014 11:31:04 -0800

Annoyingly, SBT’s very own sbt.IO util Object doesn’t provide a mechanism to recursively list files in a directory.

As of SBT 0.13.5, the three listFiles functions it does implement are only somewhat useful for complex builds.

  • def listFiles(dir: File): Array[File]
  • def listFiles(dir: File, filter: java.io.FileFilter): Array[File]
  • def listFiles(filter: java.io.FileFilter)(dir: File): Array[File]

Meh.

Perhaps more frustrating is that sbt.IO is an Object (a singleton) which by its very nature in Scala means it cannot be extended. So, even if I wanted to extend sbt.IO and override to make it recursive, I can’t.

So, here’s how one can recursively list files in a directory leveraging SBT’s sbt.IO.listFiles:

import java.io.File
import sbt.IO

trait IOHelpers {
  def listFilesRecursively(dir: File): Seq[File] = {
    val list = IO.listFiles(dir)
    list.filter(_.isFile) ++ list.filter(_.isDirectory).flatMap(listFilesRecursively)
  }
}

Functional programming for-the-win!

Cheers.

Manually Throttle the Bandwidth of a Linux Network Interface

ccf6dea1113d4479376e643cc0ddffac20715c35

Sun 08 Jun 2014 11:52:32 -0800

In complex service oriented application stacks, some bugs only manifest themselves on congested or slow networking interfaces. Consider a web-service running on a generic Linux box with a single networking interface, eth0. If eth0 is busy enough to completely saturate its networking link, a web-service running on the host behind that link may experience odd behavior when things “slow down”.

For instance, established client connections time out, but the service fails to gracefully clean up after itself and leaves these connections open. This is a classic source of connection leaks, which on the JVM usually results in the dreaded IOException: Too many open files problem.
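As a point of reference for what graceful cleanup can look like on the JVM, here is a minimal sketch using only java.net. TimeoutCleanupSketch and readOneByte are hypothetical names, and the code is an illustration of the general try-with-resources technique rather than a fix for any particular service: the socket is closed whether the read succeeds, times out, or fails.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.Socket;
import java.net.SocketTimeoutException;

// Hypothetical illustration only.
final class TimeoutCleanupSketch {

  /**
   * Connects, waits up to 'timeoutMillis' for a single byte, and returns
   * whether one arrived. The socket is closed on every path.
   */
  static boolean readOneByte(final String host, final int port, final int timeoutMillis) {
    try (Socket socket = new Socket(host, port)) {
      socket.setSoTimeout(timeoutMillis);
      return socket.getInputStream().read() >= 0;
    } catch (SocketTimeoutException e) {
      // By the time this block runs, try-with-resources has closed the socket.
      return false;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

}
```

The leak described above is what tends to happen when the close is only reached on the success path, so a timeout skips it and the file descriptor stays open.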

So, in development, if one wants to see how a service behaves behind a slow networking interface with extra latency, there are two options:

  • Download large files in a loop to artificially saturate your networking link
  • Or, more appropriately, figure out how to shape networking traffic on an interface of your choice

A quick search for “how to artificially slow down a Linux networking interface” produced a number of interesting results. Folks mostly discussed 3rd party tools like Wondershaper and Dummynet. Other suggestions involved proxying all HTTP/HTTPS traffic through Apache’s mod_bw — yuck!

Fortunately, most Linux distros ship with the tc command which is used to configure Traffic Control in the Linux kernel.

On my Ubuntu 12.04 box I’ve got a single gigabit networking interface, eth0.

Let’s slow ’er down!

Add latency, slowing ping times

Without throttling, ping times to another local node on my home network are less than 0.2ms on average.

[mark@ubuntu]~$ ping regatta
PING regatta.kolich.local (1.0.0.2) 56(84) bytes of data.
64 bytes from regatta.kolich.local (1.0.0.2): icmp_req=1 ttl=64 time=0.118 ms
64 bytes from regatta.kolich.local (1.0.0.2): icmp_req=2 ttl=64 time=0.193 ms
64 bytes from regatta.kolich.local (1.0.0.2): icmp_req=3 ttl=64 time=0.181 ms

So, let’s use tc to add 500ms of latency to all network traffic.

[mark@ubuntu]~$ sudo tc qdisc add dev eth0 root netem delay 500ms

Now, trying ping again, note time=500 ms as desired.

[mark@ubuntu]~$ ping regatta
PING regatta.kolich.local (1.0.0.2) 56(84) bytes of data.
64 bytes from regatta.kolich.local (1.0.0.2): icmp_req=1 ttl=64 time=500 ms
64 bytes from regatta.kolich.local (1.0.0.2): icmp_req=2 ttl=64 time=500 ms
64 bytes from regatta.kolich.local (1.0.0.2): icmp_req=3 ttl=64 time=500 ms

Using tc we’ve added a delay of 500ms to all traffic. This will slow short connections, but once a connection gets past the TCP Slow-start window we’re back to full speed. That is, the connection may start slow — as shaped by our tc delay tweak — but once things are started TCP will ramp up and eventually hit full speed again.

Throttling a sustained maximum rate

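So, let’s configure a sustained maximum rate using tc. In other words, let’s configure Linux to never allow eth0 to use more than 1kbps regardless of port or application. Note that in tc’s units, kbps means kilobytes per second (kilobits per second is written kbit). Also, an interface has only one root qdisc, so if the netem delay rule from the previous section is still in place, delete it first (see “Clearing all tc rules” below).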

[mark@ubuntu]~$ sudo tc qdisc add dev eth0 handle 1: root htb default 11
[mark@ubuntu]~$ sudo tc class add dev eth0 parent 1: classid 1:1 htb rate 1kbps
[mark@ubuntu]~$ sudo tc class add dev eth0 parent 1:1 classid 1:11 htb rate 1kbps

Looks good. Now let’s download a large .iso file using wget to prove to ourselves that our sustained maximum rate throttling is actually working.

[mark@ubuntu]~$ wget http://mirrors.kernel.org/.../CentOS-6.5-x86_64-bin-DVD1.iso -O /dev/null
HTTP request sent, awaiting response... 200 OK
Length: 4467982336 (4.2G) [application/octet-stream]
Saving to: `/dev/null'
 13% [==>                                   ] 580,837,703     10.5K/s

Note the download isn’t going to hover exactly at 1.0K/sec — the actual download speed as reported by wget is an average over time. In short, you’ll see numbers closer to an even 1.0K/sec the longer the transfer. In this example, I didn’t wait to download an entire 4.2GB file, so the 10.5K/s you see above is just wget averaging the transfer speed over the short time I left wget running.

Clearing all tc rules

Now that we’re done, simply delete all traffic control throttling rules to return to normal.

[mark@ubuntu]~$ sudo tc qdisc del dev eth0 root

Cheers!

Bolt: A Wrapper around Java's ReentrantReadWriteLock

09df17fb2d779355c5a8121409741543bd982ef4

Thu 27 Feb 2014 20:49:40 -0800

Concurrency is difficult, and generally tough to get right. Fortunately, there are tools that can somewhat ease this pain. For instance, take Java’s ReentrantReadWriteLock — a useful and foundational class that helps any highly concurrent Java application manage a set of readers and writers that need to access a critical block of code. When using a ReentrantReadWriteLock you can have any number of simultaneous readers, but the write lock is exclusive. In other words:

  • If any thread holds the write lock, all readers are forced to wait (or fail hard) until the thread that holds the write lock releases the lock.
  • If the write lock is not held, any number of readers are allowed to access the protected critical block concurrently — and any incoming writers are forced to wait (or fail hard) until all readers are done.

In short, this is the classic ReadWriteLock paradigm.
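These two rules can be observed directly with the JDK's ReentrantReadWriteLock and the non-blocking tryLock call. The sketch below is a small illustration; ReadWriteSemanticsSketch and otherThreadCanAcquire are hypothetical names. The helper asks a second thread whether it could take a given lock right now, without waiting.

```java
import java.util.concurrent.locks.Lock;

// Hypothetical illustration only.
final class ReadWriteSemanticsSketch {

  /** Returns true if a different thread could acquire 'lock' immediately. */
  static boolean otherThreadCanAcquire(final Lock lock) {
    final boolean[] acquired = new boolean[1];
    final Thread t = new Thread(() -> {
      acquired[0] = lock.tryLock(); // never blocks
      if (acquired[0]) {
        lock.unlock();
      }
    });
    t.start();
    try {
      t.join(); // join() also makes the other thread's write to 'acquired' visible
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    }
    return acquired[0];
  }

}
```

While the calling thread holds the write lock, the helper reports false for both the read and the write lock. While the calling thread holds only a read lock, it reports true for the read lock and false for the write lock.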

This is great, except that a vanilla ReentrantReadWriteLock is missing a few key features:

  1. Conditionally wait, or fail immediately, if the desired lock is not available. In other words, let me define upfront what I want to do if the lock I want to “grab” is not available — fail now, or wait indefinitely?
  2. And, execute a callback function only upon successful execution of a transaction. Here, we define a transaction to mean successfully acquiring the lock, doing work (without failure), and releasing the lock.

I wanted these features, so I implemented Bolt — a very tiny wrapper around Java’s ReentrantReadWriteLock with better wait, cleaner fail, and transactional callback support.

LockableEntity

Using Bolt, any entity or object you want to protect should implement the LockableEntity interface.

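import com.kolich.bolt.LockableEntity;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

public final class Foobar implements LockableEntity {

  private final ReadWriteLock lock_;

  public Foobar() {
    // ReadWriteLock is an interface, so instantiate the reentrant implementation.
    lock_ = new ReentrantReadWriteLock();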
  }

  @Override
  public ReadWriteLock getLock() {
    return lock_;
  }

}

Now, let’s create an instance of this example entity which we will use to protect a critical section of code within a transaction.

public static final Foobar foo = new Foobar();

This instance, foo, is used below throughout my examples.

Read Lock, Fail Immediately

First, let’s grab a shared read lock on foo, but fail immediately with a LockConflictException if the write lock is already acquired by another thread.

new ReentrantReadWriteEntityLock<T>(foo) {
  @Override
  public T transaction() throws Exception {
    // ... do read only work.
    return baz;
  }
}.read(false); // Fail immediately if read lock is not available

Note that read asks for a shared reader lock — the lock will be granted if and only if there are no threads holding a write lock on foo. There very well may be other reader threads.

Read Lock, Block/Wait Forever

Next, let’s grab a shared read lock on foo, but block/wait forever for the read lock to become available. Execute the success callback if and only if the transaction method finished cleanly without exception.

Note the implementation of the success method is completely optional.

new ReentrantReadWriteEntityLock<T>(foo) {
  @Override
  public T transaction() throws Exception {
    // ... do read only work.
    return baz;
  }
  @Override
  public T success(final T t) throws Exception {
    // Only called if transaction() finished cleanly without exception
    return t;
  }
}.read(); // Wait forever

It is very important to note that the underlying lock is held while the success method is called. That is, the acquired lock isn’t released until the transaction and success method are finished.

Write Lock, Fail Immediately

Grab an exclusive write lock on foo, or fail immediately with a LockConflictException if a write or read lock is already acquired by another thread. Further, execute the success callback method if and only if the transaction method finished cleanly without exception.

new ReentrantReadWriteEntityLock<T>(foo) {
  @Override
  public T transaction() throws Exception {
    // ... do read or write work, safely.
    return baz;
  }
  @Override
  public T success(final T t) throws Exception {
    // Only called if transaction() finished cleanly without exception
    return t;
  }
}.write(); // Fail immediately if write lock not available

Write Lock, Block/Wait Forever

Grab an exclusive write lock on foo, or block/wait forever for all readers to finish.

new ReentrantReadWriteEntityLock<T>(foo) {
  @Override
  public T transaction() throws Exception {
    // ... do read or write work, safely.
    return baz;
  }
}.write(true); // Wait forever
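Taken together, the four variants above follow one pattern: acquire the lock (waiting or failing fast), run the transaction, run the success callback while the lock is still held, and always release. The sketch below shows that pattern using only the JDK. It is not Bolt's source code: TransactionSketch is a hypothetical class, IllegalStateException stands in for a lock conflict exception, and for brevity the methods do not declare checked exceptions.

```java
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReadWriteLock;

// Hypothetical illustration only; this is not Bolt's implementation.
abstract class TransactionSketch<T> {

  private final ReadWriteLock lock;

  TransactionSketch(final ReadWriteLock lock) {
    this.lock = lock;
  }

  /** The protected work. */
  public abstract T transaction();

  /** Optional callback; runs only if transaction() returned normally. */
  public T success(final T result) {
    return result;
  }

  public final T read(final boolean wait) {
    return run(lock.readLock(), wait);
  }

  public final T write(final boolean wait) {
    return run(lock.writeLock(), wait);
  }

  private T run(final Lock l, final boolean wait) {
    if (wait) {
      l.lock(); // block until the lock is available
    } else if (!l.tryLock()) {
      throw new IllegalStateException("Lock not available"); // fail immediately
    }
    try {
      final T result = transaction();
      return success(result); // the lock is still held here
    } finally {
      l.unlock(); // released on success and on failure
    }
  }

}
```

If transaction() throws, success() is skipped and the finally block still releases the lock, which matches the transactional behavior described earlier.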

An Example

The Havalo project makes extensive real-world use of this locking mechanism, as a way to manage shared entities that may be concurrently accessed by any number of threads. Havalo is a lightweight key-value store written in Java. Internally, it maintains a collection of repositories and objects, and uses Bolt to conditionally gate access to these objects in local memory.

GitHub

Bolt is free, and open source, on GitHub:

https://github.com/markkolich/kolich-bolt

Pull requests welcome.