EventCast Whiteboard Pattern (Listeners Considered Harmful)

The first time I encountered the whiteboard (or blackboard) pattern was when reading The Pragmatic Programmer. Since then I have seen it a few times. The major mainstream use of the whiteboard I have seen is in Dependency Injection frameworks like Google Guice and Spring.

In my opinion these DI frameworks might better be described as implementations of the whiteboard pattern for service (dependency) location and contribution, which also happen to do dependency injection. They can provide you with objects from the whiteboard, and DI is only one of the ways they do that: the Guice Injector can be used as a factory directly (using the getInstance method), or it can provide factory implementations (Providers, or my favourite, AssistedInject) that draw objects from the whiteboard.

In particular, this is very helpful when an application is structured for testability in a ports and adapters architecture. I am aware that Nat Pryce has an issue with DI frameworks not distinguishing between internals and peers, but I don’t think I have really run into that problem in practice. I think he is probably also making the point that DI libraries can hide important information about how your objects are put together. Both issues probably depend on how you use the DI library and which objects you choose to put into its whiteboard – in general, avoid hiding the rules about relationships between your domain objects in your DI usage. BTW, I don’t think EventCast hides relationships in general (actually the opposite), but I can see how it could be used to do that.

So, onto the main point of this post.

The OSGi framework uses the whiteboard pattern to manage dependencies between bundles, in particular when one bundle is driven by events from another (supporting IoC). There is a provocatively named paper called Listeners Considered Harmful: The “Whiteboard” pattern on how they used the whiteboard pattern to replace the listener pattern.

I have recently been working on a library called EventCast, based on my experiences using Guava EventBus. It uses Guice to implement the whiteboard pattern for OSGi-style inter-object communication, with interfaces defining the listeners (a very light, stripped-down version of some of what you get with OSGi).

EventCast tries to be as unintrusive to the application code as it can. A single instance of the listener interface is injected into (hopefully the constructor(!) of) each object that wants to produce those events. Registering and unregistering of listeners (the listener lifecycle), the threading model of the listeners, and any listener error handling are managed separately from the producer. So your consumer just implements the interface it wants messages for, and your producer just calls methods on the injected instance of the listener interface. EventCast handles distributing the message to all of the currently registered listeners.
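
As a rough sketch of the shape this takes (the listener interface and class names here are made up for illustration, not part of the EventCast API):

// a listener interface defines the events
interface FileEvents {
   void fileCopied(String path);
}

// the producer is injected with a single instance of the listener interface
class Copier {
   private final FileEvents events;

   @Inject Copier(final FileEvents events) {
      this.events = events;
   }

   void copy(final String path) {
      // ... do the copy ...
      events.fileCopied(path); // fanned out to all currently registered listeners
   }
}

// a consumer just implements the interface it wants messages for
class CopyLogger implements FileEvents {
   @Override public void fileCopied(final String path) {
      System.out.println("copied " + path);
   }
}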

It is tightly integrated with Guice. It uses the Guice Whiteboard to store the EventCast listener implementation and make it available to the rest of the application. It detects additions of new listeners to the Guice Whiteboard and registers them for any produced events.

Guava EventBus Experiences

Edit 2012/12/13: I have created a library that corrects some of the issues with EventBus that I observed below.

I have been using EventBus from the Google Guava library to create a customized installation system for a bespoke software product consisting of a large number of components. Here are some of my experiences using the EventBus.

Using EventBus with Guice

I use Guice heavily in almost all of the new code I write. Guice is the new `new`. Using EventBus with Guice is straightforward. Just bind a type listener (inside your module’s configure method) that will register every object with the EventBus:

final EventBus eventBus = new EventBus("my event bus");
bind(EventBus.class).toInstance(eventBus);

// register every object Guice constructs with the event bus, as soon as it has been injected
bindListener(Matchers.any(), new TypeListener() {
   @Override
   public <I> void hear(@SuppressWarnings("unused") final TypeLiteral<I> typeLiteral, final TypeEncounter<I> typeEncounter) {
       typeEncounter.register(new InjectionListener<I>() {
           @Override public void afterInjection(final I instance) {
               eventBus.register(instance);
           }
       });
   }
});

You get an instance of the event bus simply by using any of Guice’s normal dependency injection mechanisms:

class MyClass
{
   private final EventBus eventBus;
 
   @Inject
   public MyClass(final EventBus eventBus)
   {
      this.eventBus = eventBus;
   }
[...]

Listening for Events

To listen for an event, simply add the `@Subscribe` annotation to a method of your class, and get an instance of that class from Guice.

@Subscribe
public void myEventHappened(final MyEvent event)
{
   // do work
}

Easy to miss off annotations in Guava EventBus

// oops – no @Subscribe annotation, so this method is silently never called
public void myEventHappened(final MyEvent event)
{
   // do work
}

If I miss an annotation off a subscriber method then nothing tells me something has gone wrong. The compiler won’t tell me, Guice won’t tell me, and EventBus will happily register an object that has no subscriber methods at all. I have to write module-level tests that check that all of my subscriber methods actually get called by the EventBus. This is fine, and I would probably do that anyway. But it’s easy to make a mistake, and I’d like either something more robust or a “Guice-style” fail-fast.
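
The sort of module-level test I mean looks something like this (a minimal sketch in plain JUnit; MyEvent is a stand-in for one of your own event classes):

public class SubscriberRegistrationTest {
   public static class RecordingSubscriber {
      final AtomicBoolean called = new AtomicBoolean();

      @Subscribe public void myEventHappened(final MyEvent event) {
         called.set(true);
      }
   }

   @Test public void subscriberActuallyReceivesEvents() {
      final EventBus eventBus = new EventBus("test");
      final RecordingSubscriber subscriber = new RecordingSubscriber();
      eventBus.register(subscriber);

      eventBus.post(new MyEvent());

      assertTrue("subscriber was not called – is @Subscribe missing?", subscriber.called.get());
   }
}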

Loose coupling between sender and listener

EventBus creates very loose coupling between the Event sending class and the Event listening class. The only thing that is shared between the two classes is knowledge of the Type of the Event that is being communicated. EventBus isolates the sending class from failures in the listening class. EventBus can also isolate the sending class from the performance of the listening class by using an AsyncEventBus.
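
For example, a sketch of isolating the sender with an asynchronous bus (the choice of executor is up to you):

// listeners run on a background thread, so a slow listener cannot block the sender
final EventBus eventBus = new AsyncEventBus("my event bus", Executors.newSingleThreadExecutor());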

Eventbus doesn’t work well with my normal IDE code navigation tools

Because of the very loose coupling between Sender and Listener, it can be hard to find all the methods that are listening for a given event (say) E.class. You have to do a search like “find all occurrences of the class E” and then sort out which ones are your Subscribe methods. The same is true for finding all the senders of a particular event – you need to say “find all the calls to `post` that are given an argument with the runtime type E”. I believe both of these navigations are hard to do in an IDE.

Inversion of Control

EventBus allows the wiring of the Sender->Listener relationship to come up to the top level of the application. Although this isn’t inherent in EventBus, the way it integrates with Guice provides a natural, decoupled way to wire up your application. Exactly which EventBus instance each of your objects gets registered with and sends to can be configured using all the normal Guice features.
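
For example, nothing stops you running separate buses for separate concerns, wired up with ordinary binding annotations (a sketch; the bus names and MyClass are made up):

bind(EventBus.class).annotatedWith(Names.named("install")).toInstance(new EventBus("install"));
bind(EventBus.class).annotatedWith(Names.named("progress")).toInstance(new EventBus("progress"));

// a class then asks for the particular bus it wants
@Inject
public MyClass(@Named("progress") final EventBus eventBus)
{
   this.eventBus = eventBus;
}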

Eventbus gets coupled into your codebase

If a class needs to send a message it has to be injected with an instance of the Eventbus. If a class needs to subscribe to a message it has to have some of its methods annotated with the Subscribe annotation. I can wrap up the EventBus in my own custom wrapper, but that doesn’t fix the Subscribe annotation issue. I’d like something that has less coupling throughout my code.

Eventbus is hard to mock

I tend to use a lot of London-School (JMock) style unit tests. Eventbus is a little hard to use in this style. Two issues in particular are awkward: 1) Eventbus is a class with no interface, so mocking it requires a certain amount of technology and is generally not as clean as I would like. 2) The post method takes a single event object. I end up having to write custom matchers for each part of my event object I might be interested in, or implementing `equals`, `hashCode` and `toString` for all of my Event classes. With a normal method call, I can choose to be interested only that a particular named method gets called, or interrogate each of the arguments using any of the collection of Matchers that I have built up.
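
Compare mocking a plain listener interface, where JMock lets me say exactly how much I care about – here, only that the named method is called at all (a sketch, with a made-up listener interface and a JMock Mockery called context):

final FileEvents events = context.mock(FileEvents.class);

context.checking(new Expectations() {{
   // interested only that fileCopied is called, with any path
   oneOf(events).fileCopied(with(any(String.class)));
}});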

Eventbus causes the definition of lots of `event` classes

EventBus has the effect of forcing the developer to define all the events used by their application as Classes in the system (you can send any object as a message). This can be helpful in that it forces a clear definition of the message for each event. But it feels a little un-Java-like, in the sense that a message in Java is usually thought of as a combination of the method name and the method parameters. In EventBus the method name in the listener is irrelevant to the meaning of the message, and only one parameter is allowed in a `Subscribe` method: the Event instance. This overloads the parameter to carry both the data required by the `Subscribe` method and the meaning of the message. It also produces packing and unpacking boilerplate: each send has to package the message up into the Event object, and each `Subscribe` method has to unpackage it.
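
For instance, something that would naturally be the method call fileCopied(source, destination) turns into an event class plus packing and unpacking (names made up):

// the event class exists only to carry the parameters of the "message"
class FileCopiedEvent {
   private final String source;
   private final String destination;

   FileCopiedEvent(final String source, final String destination) {
      this.source = source;
      this.destination = destination;
   }

   String source() { return source; }
   String destination() { return destination; }
}

// the sender packs...
eventBus.post(new FileCopiedEvent(source, destination));

// ...and every subscriber unpacks
@Subscribe public void fileCopied(final FileCopiedEvent event) {
   record(event.source(), event.destination());
}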

Dynamic dispatch

EventBus events are delivered according to the runtime type of the event (similar to the way exceptions are caught). This allows `Subscribe` methods to be very specific about responding only to the exact event they are interested in, but it can make it a little difficult to reason about whether a particular call to post will cause a particular `Subscribe` method to be invoked.
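
For example, a subscriber to a supertype receives every event whose runtime type is a subtype – so a subscriber to Object receives everything:

@Subscribe public void anythingHappened(final Object event) {
   // invoked for every post, whatever the static type at the call site
}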

Contest – concurrent testing using JUnit

Inspired by imunit, I have put up a very early version of a concurrent testing library for JUnit which allows you to test your thread-safe objects with multiple execution schedules.

You can specify constraints on each schedule, which contest will enforce – if the test progresses, the execution schedule will meet the constraint (at the moment contest doesn’t detect impossible schedules or runtime deadlocks).

Tests look like this:

@Test @Schedules({
        @Schedule(
                when = AddAddRemove.class,
                then = SizeIsOne.class),
        @Schedule(
                when = AddRemoveAdd.class,
                then = SizeIsOne.class),
        @Schedule(
                when = RemoveAddAdd.class,
                then = SizeIsTwo.class)
}) public void twoAddsOneRemove()
{
    context.checking(new TestRun() {
        {
            inThread(Producer).action(FirstAdd).is(multiset).add(42);
            inThread(Producer).action(SecondAdd).is(multiset).add(42);
            inThread(Consumer).action(Remove).is(multiset).remove(42);
        }
    });
}

The following is a schedule constraint, read as “Action FirstAdd happens before action Remove, and action SecondAdd also happens before action Remove”.

class AddAddRemove extends BaseSchedule
{
    {
        action(FirstAdd).isBefore(Remove);
        action(SecondAdd).isBefore(Remove);
    }
}

Theories are defined like this (using hamcrest matchers):

class SizeIsOne extends BaseTheory
{
     {
         asserting(that(multiset).size(), equalTo(1));
     }
}

Get it from maven central:

<dependency>
    <groupId>com.lexicalscope.contest</groupId>
    <artifactId>contest</artifactId>
    <version>0.0.1</version>
</dependency>

Alpha: Lightweight Dynamic Proxy API

I have added an alpha version of a dynamic proxy API to fluent reflection.

It allows you to define Java dynamic proxies using a lightweight fluent syntax. It uses the same reusable hamcrest matchers that are used throughout the rest of the fluent reflection API.

Overview

For example, to proxy a simple interface like this:

interface TwoQueryMethod {
   int methodA();
 
   int methodB();
}

You define a new class which extends com.lexicalscope.fluentreflection.dynamicproxy.Implementing. You can define as many methods as you like inside this class. Each method should start by asserting what it is proxying by declaring a hamcrest matcher to match against the method currently being proxied. The first matching method found will be executed.

In this simple example each method is matched by exact name, but the matchers can be as complex as you need them to be:

final TwoQueryMethod dynamicProxy = dynamicProxy(new Implementing<TwoQueryMethod>() {
   public void bodyA() {
      whenProxying(callableHasName("methodA"));
      returnValue(42);
   }
 
   public void bodyB() {
      whenProxying(callableHasName("methodB"));
      returnValue(24);
   }
});
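
Calling the proxy then dispatches to the first body whose matcher matches the invoked method, so:

assertThat(dynamicProxy.methodA(), equalTo(42));
assertThat(dynamicProxy.methodB(), equalTo(24));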

Method Arguments

The arguments of the original method are available from the context provided by the Implementing base class:

interface MethodWithArgument {
   int method(int argument);
   String method(String argument);
}
 
final MethodWithArgument dynamicProxy = dynamicProxy(new Implementing<MethodWithArgument>() {
   public void body() {
      returnValue(args()[0]);
   }
});

An alternative approach can be used if you know in advance what the arguments of the method will be. In this example a matcher which matches against the arguments of each body is implied. The body will only be called if the proxied method call has arguments that can be used to satisfy the requirements of the body:

final MethodWithArgument dynamicProxy = dynamicProxy(new Implementing<MethodWithArgument>() {
   public int body(final int argument)
   {
      return 42;
   }
 
   public String body(final String argument)
   {
      return "42";
   }
});
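
So a call with an int argument selects the int body, and a call with a String argument selects the String body:

assertThat(dynamicProxy.method(7), equalTo(42));
assertThat(dynamicProxy.method("seven"), equalTo("42"));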

100% branch and statement coverage does not mean the code is fully covered

Standard Disclaimer: This discussion is about high reliability software for situations where software failures can have significant impacts. The ideas discussed are not necessarily appropriate or cost effective for most software products.

When refactoring high reliability software it is often important not to introduce any observable behavioural changes – no bugs. But how do you go about it?

One answer is to have very high test coverage. However, ensuring high line and branch coverage is not enough to make your refactoring safe, because even 100% branch coverage might exercise only a very small subset of the possible executions of a program. A simple example:

int myFunction(int x)
{
   int result = 0;
   if(x % 2 == 0)
   {
       // A (even)
       result++;
   }
   else
   {
       // B (odd)
       result--;
   }
 
   if(x % 3 == 0)
   {
       // C (divisible by 3)
       result++;
   }
   else
   {
       // D (indivisible by 3)
       result--;
   }
 
   return result;
}

So, we can exercise all the branches with the test values 2, 3, and 7:

  • 2 – branches A,D
  • 3 – branches B,C
  • 7 – branches B,D

So, even though these tests give 100% branch and statement coverage of this simple code, they do not cover the path “A,C”. An additional test case is required; for example, the value x == 6 would cover “A,C”.
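
A path-complete test for this function might look like the following sketch (plain JUnit, assuming myFunction is visible to the test):

@Test public void coversAllFourPaths() {
   assertEquals(0, myFunction(2));   // A,D: +1 then -1
   assertEquals(0, myFunction(3));   // B,C: -1 then +1
   assertEquals(-2, myFunction(7));  // B,D: -1 then -1
   assertEquals(2, myFunction(6));   // A,C: +1 then +1 – missed by branch coverage alone
}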

In practice, there can be a very large number of paths through a program, so exhaustively testing them can be either very expensive or completely impractical.

Recompilation Time and Layering

When following an iterative development methodology, compilation time is a concern. In order to safely make a rapid series of small transformations to a program you need to regularly recompile and test the changes you are making. An IDE which can do incremental compilation is very helpful, but a fast, fully automated build and test cycle is still required for more complex programs with code and resource generation phases, for languages without that kind of compilation support, and for verifying that the CI build will work.

In a previous article I discussed a design principle that can be used to try to minimise the time required to do a full compile of all the code in an application.

It is also possible to select design principles which will reduce the amount of recompilation required by any given code change. Conveniently, it turns out that the same abstract layering approach can be used here too.

We return to the simple example application:

[Diagram: application → library 1 → library 2 → library 3 → library 4]

How long does a recompile take when a change is made in each component, if each component takes time m to compile?

Component     Compilation Order                                      Compilation Time
library4      library4, library3, library2, library1, application    5m
library3      library3, library2, library1, application              4m
library2      library2, library1, application                        3m
library1      library1, application                                  2m
application   application                                            1m

If we calculate the instability of each module, taking transitive dependencies into account and abusing the definitions slightly, we get:

instability (I) = Ce / (Ca + Ce)

where Ce counts outgoing (efferent) dependencies and Ca counts incoming (afferent) dependencies.

Component     Ce   Ca   I
library4      0    4    0
library3      1    3    0.25
library2      2    2    0.5
library1      3    1    0.75
application   4    0    1

Comparing the instability with the compilation time is interesting. Less stable modules should have lower compilation times, which is what we see here:

Component     Time   I
library4      5m     0
library3      4m     0.25
library2      3m     0.5
library1      2m     0.75
application   1m     1

Reducing Recompilation Time

We can restructure our application using the dependency inversion principle. We split each layer into an abstract (interface) layer and a concrete implementation layer.

The application module becomes responsible for selecting the concrete implementation of each layer and configuring the layers of the application. This design pattern is known as Inversion of Control.

[Diagram: the application depends on the concrete libraries; each library depends only on the abstract interface layer (<<library>>) below it]

Taking into account the ability of a multi-core machine to compile modules in parallel (modules listed together in braces below can be compiled in parallel), we get the following minimum compilation times:

Component Changed   Compilation Order                                  Compilation Time
library4            library4, application                              2m
<<library4>>        <<library4>>, {library4, library3}, application    3m
library3            library3, application                              2m
<<library3>>        <<library3>>, {library3, library2}, application    3m
library2            library2, application                              2m
<<library2>>        <<library2>>, {library2, library1}, application    3m
library1            library1, application                              2m
application         application                                        1m

The compilation time for the abstract modules is uniformly 3m, and the compilation time for the concrete modules is uniformly 2m. The application itself is always the last to be compiled so is 1m as before.

How has the design change affected the stability of the modules?

Component      Ce   Ca   I
library4       1    1    0.5
<<library4>>   0    3    0
library3       2    1    0.66
<<library3>>   0    3    0
library2       2    1    0.66
<<library2>>   0    3    0
library1       1    1    0.5
application    4    0    1

Again we see that the less stable a module is the lower its compilation time:

Component      Time   I
library4       2m     0.5
<<library4>>   3m     0
library3       2m     0.66
<<library3>>   3m     0
library2       2m     0.66
<<library2>>   3m     0
library1       2m     0.5
application    1m     1

This example is quite simple, so the instability metric of the modules is dominated by the application. It does, however, illustrate how we can use dependency inversion to help establish an upper bound on the recompilation time for any given change in a concrete module. I will examine a more realistic example in a future article.

Compilation Time and Layering

On a modern computer with multiple cores, the total compile time of an application is related to the longest path through the dependency graph of the application.

In a statically type checked language that supports separate compilation of application components (into, for example jars or assemblies), it is generally necessary to compile the dependencies of an application or library before compiling the application or library itself.

A typical layered application might look like this:

[Diagram: application → library 1 → library 2 → library 3 → library 4]

The compile order for such an application would be library4 then library3 then library2 then library1 then application. The longest path through the dependency graph is 5.

Assuming, for simplicity, that the compile time for each module is approximately the same (m) the total compile time for the application (t) is 5m.

t = 5m

Taking Advantage of Parallel Compilation

We can restructure our application using the dependency inversion principle. We split each layer into an abstract (interface) layer and a concrete implementation layer.

The application module becomes responsible for selecting the concrete implementation of each layer and configuring the layers of the application. This design pattern is known as Inversion of Control.

[Diagram: the application depends on the concrete libraries; each library depends only on the abstract interface layer below it]

The compile order for such an application is much more flexible than that of the earlier design allowing some modules to be compiled in parallel.

The longest path through the dependency graph is 3.

Assuming again, that the compile time for each module is approximately the same (m) the minimum compile time for the application (t) is 3m.

t = 3m

CloudBees Open Source Software infrastructure

CloudBees offers free Jenkins hosting in the cloud to FOSS projects.

I have been using CloudBees for the continuous integration build of fluent-reflection for about a month. It has been pretty predictable and reliable so far, and offers enough of the Jenkins features to be useful. Jenkins itself is very easy to configure with a maven project.

It was quite easy to use it in conjunction with GitHub.

[Screenshot: the fluent-reflection Jenkins build on CloudBees]

Dependency Inversion

What is a Dependency

For the purposes of this discussion a class A has a dependency on another class B, iff you cannot compile class A without class B.

Example

class A {B b;}
class B { }

[Diagram: A → B]

We have to compile class B before (or at the same time as) class A.

Other kinds of Dependency

Anywhere the name “B” appears in class A, a dependency is created. Some other examples of dependencies are:

class A extends B { }
class A implements B { }
class A { void method(B b) { } }

Transitive Dependencies

If a class A depends on another class B which itself has dependencies, then the dependencies of class B are effectively dependencies of class A.

Example:

class A { B b; }
class B { C c; } 
class C { }

[Diagram: A → B → C, with a dotted (transitive) arrow A → C]

We have to compile class C before (or at the same time as) class B and class A.

The Problem

When class C changes, we have to recompile and retest both class A and class B. In a large system this can take a very long time. It also means that you have to know about class C in advance; you cannot decide on class C after deciding on class A.

If class C is a more concrete class then it might change more frequently than class A. This will cause class A to be recompiled/tested much more frequently than it otherwise would need to be.

The Solution

Inverting Dependencies

Have class A and class B both depend on an abstraction I. This inverts the direction of the dependency arrow on class B.

interface I { }
class A { I i; }
class B implements I { }

[Diagram: A → I ← B]

Breaks Transitive Dependency

The really helpful effect of this inversion is that it also breaks the transitive dependency from class A onto the dependencies of class B (class C here):

interface I { }
class A { I i; }
class B implements I { C c; }
class C { }

[Diagram: A → I ← B → C; the dotted A → C arrow is gone]

Dependency Inversion Principle

This is an application of the Dependency Inversion Principle:

  • High level modules should not depend on low level modules. Both should depend on abstractions.
  • Abstractions should not depend upon details. Details should depend upon abstractions.

Compositional Patterns for Test Driven Development

This article is going to look at how to implement a parameterizable algorithm so that it both conforms generally to the open/closed principle and can also most effectively be tested.

A common pattern for code reuse, implementation selection, or extension, is to use class inheritance and the template method pattern. This is an example of an implementation of the template method pattern with two different variations B and C:

abstract class A {
    public void doSomething() {
        doP();
        doQ();
        doR();
    }
 
    protected abstract void doP();
    protected abstract void doQ();
    protected abstract void doR();
}
 
class B extends A {
    @Override protected void doP() { /* do P the B way */}
    @Override protected void doQ() { /* do Q the B way */}
    @Override protected void doR() { /* do R the B way */}
}
 
class C extends A {
    @Override protected void doP() { /* do P the C way */}
    @Override protected void doQ() { /* do Q the C way */}
    @Override protected void doR() { /* do R the C way */}
}

Refactoring template method pattern to Strategy Pattern

We can always convert this template method pattern to a compositional pattern by performing a refactoring in the following steps (if you have member variable access there are a couple more steps, but I’ll cover them in a follow-up article):

Step 1; encapsulate the construction of the B and C strategies:

abstract class A {
    public void doSomething() {
        doP();
        doQ();
        doR();
    }
 
    protected abstract void doP();
    protected abstract void doQ();
    protected abstract void doR();
}
 
class B extends A {
    public static A createB() {
        return new B();
    }
 
    @Override protected void doP() { /* do P the B way */}
    @Override protected void doQ() { /* do Q the B way */}
    @Override protected void doR() { /* do R the B way */}
}
 
class C extends A {
    public static A createC() {
        return new C();
    }
 
    @Override protected void doP() { /* do P the C way */}
    @Override protected void doQ() { /* do Q the C way */}
    @Override protected void doR() { /* do R the C way */}
}

Step 2; extract an interface for the strategy methods:

interface S {
    void doP();
    void doQ();
    void doR();
}
 
abstract class A implements S {
    private final S s = this;
 
    public void doSomething() {
        s.doP();
        s.doQ();
        s.doR();
    }
}
 
class B extends A {
    public static A createB() {
        return new B();
    }
 
    @Override public void doP() { /* do P the B way */}
    @Override public void doQ() { /* do Q the B way */}
    @Override public void doR() { /* do R the B way */}
}
 
class C extends A {
    public static A createC() {
        return new C();
    }
 
    @Override public void doP() { /* do P the C way */}
    @Override public void doQ() { /* do Q the C way */}
    @Override public void doR() { /* do R the C way */}
}

Step 3; pass the strategies into the superclass instead of using this:

interface S {
    void doP();
    void doQ();
    void doR();
}
 
final class A {
    private final S s;
 
    public A(final S s) {
        this.s = s;
    }
 
    public void doSomething() {
        s.doP();
        s.doQ();
        s.doR();
    }
}
 
class B implements S {
    public static A createB() {
        return new A(new B());
    }
 
    public void doP() { /* do P the B way */}
    public void doQ() { /* do Q the B way */}
    public void doR() { /* do R the B way */}
}
 
class C implements S {
    public static A createC() {
        return new A(new C());
    }
 
    public void doP() { /* do P the C way */}
    public void doQ() { /* do Q the C way */}
    public void doR() { /* do R the C way */}
}

Advantage of the compositional style

Less fragile

Changes to A, such as new methods, are much less likely to break the strategies in the compositional style.

Easier to Test

In the compositional style, class A can be tested by itself using mocks, as can class B and class C. In the inheritance style, class B and class C cannot be tested without also testing class A. This leads to duplication in the tests, as features of class A are re-tested for every subclass.
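
For example, class A from step 3 can be tested on its own with a mock S (a JMock sketch):

@Test public void doSomethingCallsEachStrategyMethodInOrder() {
    final Mockery context = new Mockery();
    final S s = context.mock(S.class);
    final Sequence order = context.sequence("order");

    context.checking(new Expectations() {{
        oneOf(s).doP(); inSequence(order);
        oneOf(s).doQ(); inSequence(order);
        oneOf(s).doR(); inSequence(order);
    }});

    new A(s).doSomething();
    context.assertIsSatisfied();
}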

Easier to Reuse

In the compositional style class B and class C can be reused in other contexts where A is not relevant. In the inheritance style this type of reuse is not possible.

Emergent Model/Domain concepts

If class B or class C is reused in a different context, it may turn out that during subsequent refactoring it develops a new role within the system. We may even discover an important new domain concept.