Using A Core.Async Routine As A State Machine

Clojure has a library called core.async that allows you to model your system as a set of independent processes that pass messages to each other through channels. Each process is responsible for running itself, managing its own state, and so on. But how do you manage state in a language where everything is immutable? There are several ways to do that using Atoms, Refs, and Agents, but those constructs are mostly about sharing mutable state between processes, and in many cases they end up being more complex than you need to implement a state machine that entirely encapsulates its own state transitions. Fortunately there’s another way.
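For contrast, here’s a minimal sketch (mine, not from the library docs) of the shared-state approach with an Atom. It works, but the state lives outside any one process and anything can reach in and change it:

(def game-state (atom {:state :starting}))

;; any thread can reach in and change the state
(swap! game-state assoc :state :playing)
(:state @game-state) ;; => :playing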

The go block is the basis of each of the independent processes. These blocks define a unit of asynchronous work. A go block is generally implemented in a loop/recur fashion: a message is received from a channel and processed, and then recur is called to loop back and wait for the next message. This looping gives the go block an opportunity to transform itself into a new state after each message is processed.

A Simple Example

A simple state machine might only have a few states, transforming from one to the next.

[diagram: simple state machine]

Below you can see a go block that implements those three states: starting, playing, and done. It starts in the :starting state and transforms to the :playing state when it recurs in the loop. The :playing state runs over and over until some-condition is met, which sends it to :done.

(require '[clojure.core.async :refer [go-loop chan <! close!]])
(def ch (chan))

(go-loop
   [state :starting]
   (let [new-state
        (case state
              :starting (let [init-value (<! ch)]
                         (some-work init-value)
                         :playing)
              :playing (let [round (<! ch)]
                         (if some-condition
                           :done
                           (do
                             (some-other-work round)
                             state)))
              :done (do
                      (maybe-do-some-cleanup)
                      (close! ch)
                      :done))]
       (if-not (= :done new-state)
          (recur new-state))))

Using Data to Determine State Transitions

A slightly more complex state machine might have multiple states to step through to process a series of messages. Think of an ATM as an example (a really simple ATM that is).

[diagram: ATM state machine]

The ATM has multiple user interactions and must communicate with other services to actually authorize the transaction. When the ATM is done with a given user, it goes back into a waiting state until someone swipes their card again.

Instead of passing an explicit state through the loop, our implementation passes data around. This accumulator pattern allows the state machine to collect data at each step and pass it along to later states. The data itself is used to determine what state the machine is in: if there’s card information, the process expects a PIN next; if there’s a PIN, it expects an amount; and so on. Once it has accumulated the data it can process the transaction. It then transitions back to the waiting state by resetting the transaction data to empty, so that the next iteration puts it back in the waiting state. Each of those states is a fairly simple process, but you can easily layer on more rules to implement more complex state transitions.

(require '[clojure.core.async :refer [chan go-loop <! >! >!!]])

(def host (chan))
(def from-host (chan))
(def atm (chan))

(defn run
  []

  ;; The State machine for the ATM
  (go-loop
    [data {}]
    (let [new-txn-data
      (cond 
        (:error data)
          (do
            (println "Got an error. No monies for you.")
            {})
        (:dispense data) 
          (do
            (println "You've withdrawn: " (:amount data))
            {})
        (:host-response data)
          (let [resp (<! from-host)
                data (dissoc data :host-response)]
            (case resp
              :success (assoc data :dispense true)
              (assoc data :error true)))
        (:pin data)
          (let [amount (<! atm)
                txn (assoc data :amount amount)]
            (>! host txn)
            (assoc txn :host-response true))
        (:card data)
          (let [pin (<! atm)]
            (assoc data :pin pin))
        (empty? data)
          (let [card-info (<! atm)]
            (assoc data :card card-info)))]
      (recur new-txn-data)))

  ;; Simulator for the Host that the ATM talks to
  (go-loop
    []
    (let [txn (<! host)]
      (if (= 1111 (:pin txn))
        (>! from-host :success)
        (>! from-host :error))
      (recur)))

  ;; Simulated hardware events from the ATM machine
  (>!! atm 411111111111)
  (>!! atm 1111)
  (>!! atm 200)

  (>!! atm 411111111111)
  (>!! atm 112)
  (>!! atm 200)
)

Sets of independent processes managing their own state, doing their own work, communicating by sending messages to other independent processes. Welcome to the world of simple asynchronous programs!

High Performance and Parallelism With the Feel of a Scripting Language

Sometimes you need a quick-and-dirty tool to get something done. You’re not looking for a long-term solution; you just have a simple job to do. As a programmer, when you have those thoughts you naturally migrate toward scripting languages. But sometimes that throwaway tool also needs to do highly parallel operations with good performance characteristics. Until recently it seemed like you had to choose one or the other. And then came Go.

A Use Case

We’ve been working on an application that provides APIs for other apps. Those APIs are required to be fast and to scale up to many concurrent users. We needed a way to push a lot of traffic to this API while ensuring that the API would access a wide swath of the data in the database. We didn’t want the same request being made over and over, letting the database end up in an unrealistic scenario where it had all the data cached. There are a number of existing tools for this kind of performance testing, but seeing some of the tests run didn’t give us much confidence that they were really running these requests in parallel like we needed. We also wanted to be able to easily run these tests from many different client computers at once so that we could ensure that the client machines and their internet connections were not the bottleneck.

How Does Go Fit That Use Case?

Write-Once (Compile a Few Times) and Run Anywhere

One of the advantages of Go is that it is easy to cross-compile for other architectures and operating systems. This property made it easy to write a little application that we could run at the same time on Mac OS and Linux. Just like a scripting language, it was write-once and run anywhere. Of course we had to compile it for each of the different operating systems, but that is incredibly easy with Go. Unlike most scripting languages, once a Go binary is compiled for an OS, nothing else needs to be installed to run it. There’s no management of different versions or libraries. A Go binary is entirely self-contained: no Go runtime needs to be installed for the application to run, and all of the dependencies are statically linked in. Simply copy the binary to the appropriate machine and execute it. You can’t get much simpler than that.

$ brew install go --cross-compile-common
$ GOOS=linux go build myapp.go

Libraries for All The Things

Go has a large number of good libraries that come standard. These include support for HTTP clients and servers, for accessing databases (although the drivers themselves are not included), for parsing command line arguments, for encoding and decoding JSON, for cryptography, and for regular expressions. Basically, it includes a lot of the libraries you need for creating applications, whether it’s something you want to maintain forever or a throwaway app.

flag.BoolVar(&help, "h", false, "help")
resp, err := http.Get("http://example.com/")

var exampleResp MyJsonResponse
decoder := json.NewDecoder(resp.Body)
err = decoder.Decode(&exampleResp)

Concurrent Design and Parallel Execution

Goroutines allow a program to execute a function concurrently with other running code. Channels allow for different goroutines to communicate by passing messages to each other. Those two things together allow for a simple means of structuring code with a concurrent design.

ch := make(chan int)
go func() {
  for {
    val := <-ch
    fmt.Printf("Got an int: %v", val)
  }
}()
ch <- 1
ch <- 2

In addition to having easy mechanisms to implement a concurrent design, your program also needs to be able to do actual work in parallel. Go can run many different goroutines in parallel and gives you control over how many run at the same time with a simple function call.

runtime.GOMAXPROCS(25)

Put The Pieces Together

Bringing together those libraries and a concurrent design allows us to easily create a program that meets our needs for testing these APIs.

This is a simple application that does GET requests to a specific URL. The program allows you to specify the URL, the number of requests to make, and the number to run concurrently. It uses many of the libraries I mentioned above for handling HTTP, for parsing command line arguments, for calculating the duration of requests, etc. It also uses goroutines to allow multiple simultaneous requests to be made, while using a channel to communicate the results back to the main program.

package main

import (
  "flag"
  "fmt"
  "io/ioutil"
  "net/http"
  "runtime"
  "sync"
  "time"
)

var help bool
var count int
var concurrent int
var url string

var client *http.Client

func init() {
  client = &http.Client{}

  flag.BoolVar(&help, "h", false, "help")
  flag.IntVar(&count, "n", 1000, "number of requests")
  flag.IntVar(&concurrent, "c", runtime.NumCPU() + 1, "number of concurrent requests")
  flag.StringVar(&url, "u", "http://127.0.0.1:5000/", "url")
  flag.Parse()
}

func main() {
  if help {
    flag.Usage()
    return
  }

  fmt.Printf("Concurrent: %v\n", concurrent)
  runtime.GOMAXPROCS(concurrent + 2)

  runChan := make(chan int, concurrent)
  resultChan := make(chan Result)

  var wg sync.WaitGroup

  success_cnt := 0
  failure_cnt := 0
  var durations []time.Duration
  var min_dur time.Duration
  var max_dur time.Duration

  // Run the stuff
  dur := duration(func() {

    // setup to handle responses
    go func() {
      for {
        r := <-resultChan
        durations = append(durations, r.Duration)
        min_dur = min(min_dur, r.Duration)
        max_dur = max(max_dur, r.Duration)

        // 200s and 300s are success in HTTP
        if r.StatusCode < 400 {
          success_cnt += 1
        } else {
          fmt.Printf("Error: %v; %v\n", r.StatusCode, r.ErrOrBody())
          failure_cnt += 1
        }
        wg.Done()
      }
    }()

    // setup to handle running requests
    wg.Add(count)
    go func() {
      for i:=0; i < count; i++ {
        <-runChan
        fmt.Printf(".")
        go func() {
          resultChan <- Execute()
          runChan<- 1
        }()
      }
    }()

    // tell N number of requests to run, but this limits the concurrency
    for i := 0; i < concurrent; i ++ {
      runChan<- 1
    }

    wg.Wait()
  })

  fmt.Printf("\n")
  fmt.Printf("Success: %v\nFailure: %v\n", success_cnt, failure_cnt)
  fmt.Printf("Min: %v\nMax: %v\n", min_dur, max_dur)
  fmt.Printf("Mean: %v\n", avg(durations))
  fmt.Printf("Elapsed time: %v\n", dur.Seconds())

}

func avg(durs []time.Duration) time.Duration {
  total := float64(0)
  for _, d := range durs {
    total += d.Seconds()
  }
  return time.Duration((total / float64(len(durs))) * float64(time.Second))
}

func min(a time.Duration, b time.Duration) time.Duration {
  if a != 0 && a < b {
    return a
  }
  return b
}

func max(a time.Duration, b time.Duration) time.Duration {
  if a > b {
    return a
  }
  return b
}

func Execute() Result {
  var resp *http.Response
  var err error
  dur := duration(func() {
    resp, err = http.Get(url)
  })

  if err != nil {
    return Result{dur, -1, err, ""}
  }
  defer resp.Body.Close()
  var body string
  if b, err := ioutil.ReadAll(resp.Body); err == nil {
    body = string(b)
  } else {
    body = ""
  }

  return Result{dur, resp.StatusCode, nil, body}
}

type Result struct {
  Duration time.Duration
  StatusCode int
  Err error
  Body string
}
func (r *Result) ErrOrBody() string {
  if nil != r.Err {
    return r.Err.Error()
  } else {
    return r.Body
  }
}

func duration(f func()) time.Duration {
  start := time.Now()
  f()
  return time.Now().Sub(start)
}

The app we wrote started out a lot like this: easy and straightforward. As we needed to add more tests we started refactoring out types, which let us separate the core of the load testing and timing calculations from the actual requests being run. Go provides named function types, higher-order functions, and a lot of other abstractions which make those refactorings quite elegant. But that’s for a different post…
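As a rough illustration (the names here are invented for this sketch, not from our actual app), a named function type lets the load-testing core stay ignorant of what a request actually does:

package main

import "fmt"

// Result stands in for the Result struct from the tester above.
type Result struct {
  StatusCode int
}

// RequestFunc names the signature of anything that can execute one request.
type RequestFunc func() Result

// runN is the shape of the refactored core: it knows how to run and count
// requests, but not what a request actually does.
func runN(n int, fn RequestFunc) int {
  successes := 0
  for i := 0; i < n; i++ {
    if fn().StatusCode < 400 {
      successes++
    }
  }
  return successes
}

func main() {
  ok := runN(3, func() Result { return Result{StatusCode: 200} })
  fmt.Println("successes:", ok)
}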

3.5x Increase In Performance with a One-Line Change

Gather around my friends, I’d like to tell you about a cure for what ails you, be it sniffles, scurvy, stomach ailments, eye sight, nervousness, gout, pneumonia, cancer, heart ailments, tiredness or plum just sick of life… Yes, sir, a bottle of my elixir will fix whatever ails you!

You might see the title above and think I’m trying to sell you some snake oil. The truth is, I probably am. As with most performance claims, your mileage may vary and the devil will always be in the details.

Let’s Start With a Bit of Background

I recently began working on a client’s Ruby on Rails application that needed to provision data into another system at runtime. The provisioning was done through synchronous HTTP REST calls performed during the most performance-critical request flow in the application, the flow that made up 95% of the overall traffic the application handled. The provisioning consisted of between 8 and 15 HTTP requests to an external application.


Yes, you read that correctly. For one HTTP request to this application, in the flow that made up 95% of its traffic, the app made up to 15 HTTP requests to a second system. This is not an ideal design from a performance standpoint, of course. The ultimate goal would be to eliminate or substantially reduce the number of calls through a coarse-grained interface. But that requires changes in two applications, coordinated across multiple teams, which will take a while. We needed something we could do in the short term to help with the performance issues and give us the breathing room to make more extensive changes.

The Good News

Luckily the HTTP requests were already being made using the Faraday library. Faraday is an HTTP client library which provides a consistent interface over different HTTP implementations. By default it uses the standard Ruby Net::HTTP library. Faraday is configured like this:


conn = Faraday.new(:url => 'http://example.com') do |faraday|
  faraday.request  :url_encoded             # form-encode POST params
  faraday.response :logger                  # log requests to STDOUT
  faraday.adapter  Faraday.default_adapter  # make requests with Net::HTTP
end

Net::HTTP in Faraday will create a new HTTP connection to the server for each request that is made. If you’re only making one request or you’re making requests to different hosts, this is perfectly fine. In our case, these were HTTPS connections and all were being made to the same host. So for each of those 15 requests Net::HTTP was opening a new socket, doing the TCP handshake, and negotiating an SSL connection. So how does Faraday help in this case?

One of the adapters that Faraday supports is net-http-persistent, a Ruby library that supports persistent connections and HTTP Keep-Alive across multiple requests. HTTP Keep-Alive allows an HTTP connection to be reused for multiple requests, avoiding the TCP and SSL negotiation overhead. To use the net-http-persistent implementation, all you have to do is change your Faraday configuration to look like:


conn = Faraday.new(:url => 'http://example.com') do |faraday|
  faraday.request  :url_encoded          # form-encode POST params
  faraday.response :logger               # log requests to STDOUT
  faraday.adapter  :net_http_persistent
end

This simple change swaps out the HTTP implementation that is used to make the requests. In our case it reduced the average time to process a complete request (including the ~15 requests made using Faraday) under load from 8 seconds down to 2.3 seconds.


OK, so technically you need to add a new Gem reference to your Gemfile to use net-http-persistent. So it’s not REALLY a One-Line Fix. I also hope you never have an interface so chatty that your application needs to make 15 calls to the same remote server to process one request. But if you do! Let me tell you my friend! Just a little drop of net-http-persistent is all you need to cure what ails you.

P.S.

Faraday has some other benefits, including a Middleware concept for processing requests and responses that allows code to be shared easily across different HTTP requests. You can have common support for handling JSON, for error handling, or for logging, for example. It’s a nice architecture that allows you to easily process request data. So even if you don’t need its ability to switch out HTTP implementations, it’s still a nice library to use.
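A quick sketch of the middleware idea (this assumes the faraday_middleware gem, which at the time provided the :json response parser):

require 'faraday'
require 'faraday_middleware'

conn = Faraday.new(:url => 'http://example.com') do |faraday|
  faraday.response :json     # parse JSON response bodies into Hashes
  faraday.response :logger   # log requests to STDOUT
  faraday.adapter  :net_http_persistent
end

resp = conn.get('/users/1')
resp.body['name']            # the middleware has already parsed the JSON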

Mapping SQL Joins using Anorm

Anorm is a Scala framework that is a fairly thin veneer over JDBC, allowing you to write SQL queries and map the results into Scala objects. The examples easily found on the web tend to be fairly simple. One of the first problems I ran into was mapping a parent-child hierarchy where the parent has a collection of values from a different table.

For this post, I’m using a simple, contrived schema that looks like the following:

CREATE TABLE user (
  id SERIAL,
  user_name varchar(100),
  CONSTRAINT pk_user PRIMARY KEY (id)
);

CREATE TABLE email (
  user_id BIGINT,
  email varchar(100),
  CONSTRAINT pk_email PRIMARY KEY (user_id, email)
);
ALTER TABLE email ADD CONSTRAINT fk_email_user FOREIGN KEY (user_id) REFERENCES user (id);

CREATE TABLE phone (
  user_id BIGINT,
  phone varchar(11),
  CONSTRAINT pk_phone PRIMARY KEY (user_id, phone)
);
ALTER TABLE phone ADD CONSTRAINT fk_phone_user FOREIGN KEY (user_id) REFERENCES user (id);

The Simple Case

In the simplest case, Anorm allows you to map the results of a query to a Scala case class like the following:

import anorm._
import anorm.SqlParser._

case class User(id: Long, name: String)
object User {
  def rowMapper = {
    long("id") ~
    str("user_name") map {
      case id ~ name => User(id, name)
    }
  }
}

def getUser(id: String): Option[User] = {
  DB.withConnection {
    implicit conn =>
      SQL("SELECT id, user_name FROM user WHERE user.id = {id}")
        .on("id" -> id)
        .as(User.rowMapper singleOpt)
  }
}

The query is executed and the results of the query are mapped to the User using a RowMapper which converts the result columns into Scala types and ultimately to a Scala User object that you’ve defined.

Joins

But what if you want a more complex object, such as adding phone numbers and email addresses to your user object? Let’s say you want something more like the following:

case class User(id: Long, name: String, emails: List[String], phones: List[String])
object User {
  def rowMapper = {
    long("id") ~
    str("user_name") ~
    (str("email") ?) ~
    (str("number") ?) map {
      case id ~ name ~ email ~ number => ((id, name), email, number)
    }
  }
}

This row mapper doesn’t return a User object directly, but rather the columns grouped into a Triple (with id and name as the first part of the Triple).

Anorm doesn’t have a lot of support for that out of the box, but Scala’s built-in functions for dealing with Lists and Maps have the tools that you need. Take a look at the following example. If you’re new to Scala, good luck wrapping your brain around it.

def getUser(id: String): Option[User] = {
  DB.withConnection {
    implicit conn =>
      SQL(
        """SELECT user.id, user_name, email.email, phone.number
          |FROM user
          |LEFT JOIN email ON email.user_id = user.id
          |LEFT JOIN phone ON phone.user_id = user.id
          |WHERE user.id = {id}""".stripMargin)
        .on("id" -> id)
        .as(User.rowMapper *)
        .groupBy(_._1)
        .map {
          case ((dbId, name), rest) =>
            User(dbId, name, rest.unzip3._2.map(_.orNull), rest.unzip3._3.map(_.orNull))
        }.headOption
  }
}

But we can break those steps down a bit, including the type declarations of what happens at each step, to make it clearer what’s being done. Using those type declarations you end up with something like the following.

def getUser(id: String): Option[User] = {
  DB.withConnection {
    implicit conn =>
      val result: List[((Long, String), Option[String], Option[String])] = SQL(
        """SELECT user.id, user_name, email.email, phone.number
          |FROM user
          |LEFT JOIN email ON email.user_id = user.id
          |LEFT JOIN phone ON phone.user_id = user.id
          |WHERE user.id = {id}""".stripMargin)
        .on("id" -> id)
        .as(User.rowMapper *)

      val queryGroupedByUser: Map[(Long, String), List[((Long, String), Option[String], Option[String])]] =
        result.groupBy(_._1)

      val listOfUser: Iterable[User] = queryGroupedByUser.map {
        case ((dbId, name), rest) => {
          val emails: List[String] = rest.unzip3._2.map(_.orNull) // unwrap each Option[String] to its value (or null)
          val phones: List[String] = rest.unzip3._3.map(_.orNull) // unwrap each Option[String] to its value (or null)
          User(dbId, name, emails, phones)
        }
      }

      listOfUser.headOption
  }
}

Let’s break that down a little more:

val result: List[((Long, String), Option[String], Option[String])] = SQL(
  """SELECT user.id, user_name, email.email, phone.number
    |FROM user
    |LEFT JOIN email ON email.user_id = user.id
    |LEFT JOIN phone ON phone.user_id = user.id
    |WHERE user.id = {id}""".stripMargin)
  .on("id" -> id)
  .as(User.rowMapper *)

This code creates a List, as you can see from the type declaration. The List contains an entry for each row returned in the result set. Because we used JOIN clauses, we might have gotten back many rows. For example, if a user had two emails the results might look like:

id, name, email, number
1, Geoff, geoff@example.com, 15135551212
1, Geoff, geoff2@example.com, 15135551212

The resulting Scala List directly contains the data from that result set, but we take the extra step of grouping the basic User data (the parent) into its own Tuple, which we’ll use later to identify the unique Users. The Scala List for the above result set would contain:

List(((1, "Geoff"), Some("geoff@example.com"), Some("15135551212")),
     ((1, "Geoff"), Some("geoff2@example.com"), Some("15135551212")))

Next we create a map of the results where the keys of the map are the unique users:

val queryGroupedByUser: Map[(Long, String), List[((Long, String), Option[String], Option[String])]] =
  result.groupBy(_._1)

From that list, we create a map, where the keys of the map are the unique parent objects. This turns the list shown above into a map like:

Map((1, "Geoff") ->
    List(((1, "Geoff"), Some("geoff@example.com"), Some("15135551212")),
         ((1, "Geoff"), Some("geoff2@example.com"), Some("15135551212"))))

This mapping will work if there are many keys returned as well (assuming you were querying by something non-unique). In that case your map will contain one entry for each of the unique parents.

Finally, we need to take apart the Map and turn it into our domain object:

val listOfUser: Iterable[User] = queryGroupedByUser.map {
  case ((dbId, name), rest) => {
    val emails: List[String] = rest.unzip3._2.map(_.orNull) // unwrap each Option[String] to its value (or null)
    val phones: List[String] = rest.unzip3._3.map(_.orNull) // unwrap each Option[String] to its value (or null)
    User(dbId, name, emails, phones)
  }
}

The case statement destructures the Map entry back into the key containing the basic user information and the list of all the other data associated with that user. rest.unzip3 turns the List of Triples into a Triple of Lists: List[(A, B, C)] becomes (List[A], List[B], List[C]). ._2 takes the second element out of that Triple, in this case the List[Option[String]] containing the emails. We then map over it with _.orNull to unwrap each Option[String] into its value (or null). The same process is done for the phone numbers. Those values, along with the key from the map, are used to create the instances of our Users. In this case, since we only expect one user for a given id, we use listOfUser.headOption to get the first element of the list (or None if the list is empty).
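If unzip3 is new to you, here’s a tiny standalone example (values made up) showing the two steps:

val rest: List[((Long, String), Option[String], Option[String])] =
  List(((1L, "Geoff"), Some("geoff@example.com"), Some("15135551212")),
       ((1L, "Geoff"), Some("geoff2@example.com"), Some("15135551212")))

rest.unzip3._2               // List(Some(geoff@example.com), Some(geoff2@example.com))
rest.unzip3._2.map(_.orNull) // List(geoff@example.com, geoff2@example.com)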

Hopefully breaking down the Scala into smaller chunks will help some people understand how this stuff works.

Java 7 Code Coverage with Gradle and Jacoco

Thanks

Steven Dicks’ post on Jacoco and Gradle is a great start to integrating Jacoco and Gradle; this is a small iteration on top of that work.

Java 7 Code Coverage

The state of code coverage took a serious turn for the worse when Java 7 came out. The byte-code changes in Java 7 effectively made Emma and Cobertura defunct; they will not work with Java 7 constructs. Fortunately there is a new player in town called JaCoCo (for Java Code Coverage). JaCoCo is the successor to Emma, built on the knowledge the Eclipse and Emma teams have gained over the years about how best to do code coverage. And it works with Java 7 out of the box.

The advantage of using established tools is that they are generally well supported across your toolchain. JaCoCo is fairly new, so support in Gradle isn’t so smooth. Fortunately Steven’s post got me started down the right path. The one thing I wanted to improve right away was to use transitive dependency declarations rather than checking local jar files into my source repository. JaCoCo is now available in the Maven repos, so we can do that. One thing to note is that the default artifacts in the Maven repo are Eclipse plugins, so we need to reference the “runtime” classifier in our dependency declaration.

The Gradle Script


configurations {
    codeCoverage
    codeCoverageAnt
}
dependencies {
    codeCoverage 'org.jacoco:org.jacoco.agent:0.5.10.201208310627:runtime@jar'
    codeCoverageAnt 'org.jacoco:org.jacoco.ant:0.5.10.201208310627'
}
test {
    systemProperties = System.properties
    jvmArgs "-javaagent:${configurations.codeCoverage.asPath}=destfile=${project.buildDir.path}/coverage-results/jacoco.exec,sessionid=HSServ,append=false",
            '-Djacoco=true',
            '-Xms128m',
            '-Xmx512m',
            '-XX:MaxPermSize=128m'
}
task generateCoverageReport << {
    ant {
        taskdef(name: 'jacocoreport', classname: 'org.jacoco.ant.ReportTask', classpath: configurations.codeCoverageAnt.asPath)

        mkdir dir: "build/reports/coverage"

        jacocoreport {
            executiondata {
                fileset(dir: "build/coverage-results") {
                    include name: 'jacoco.exec'
                }
            }
            structure(name: project.name) {
                classfiles {
                    fileset(dir: "build/classes/main") {
                        exclude name: 'org/ifxforum/**/*'
                        exclude name: 'org/gca/euronet/generated**/*'
                    }
                }
                sourcefiles(encoding: 'CP1252') {
                    fileset dir: "src/main/java"
                }
            }

            xml destfile: "build/reports/coverage/jacoco.xml"
            html destdir: "build/reports/coverage"
        }
    }
}

A Few Details

The magic is in the jvmArgs of the test block. JaCoCo is run as a Java agent, which uses the runtime instrumentation feature added in Java 6 to inspect the running code. Extra arguments can be passed to JaCoCo there, including things like excludes to exclude specific classes from coverage. The available parameters are the same as the Maven JaCoCo parameters.
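For example, excluding generated packages at the agent level might look like this (package names borrowed from the report excludes above; the agent options are the ones JaCoCo documents for its Maven plugin):

test {
    jvmArgs "-javaagent:${configurations.codeCoverage.asPath}=destfile=${project.buildDir.path}/coverage-results/jacoco.exec,excludes=org.ifxforum.*:org.gca.*,append=false"
}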

The generateCoverageReport task converts the jacoco.exec binary into HTML files for human consumption. If you're just integrating with a CI tool like Jenkins, then you probably don't need this, but it's handy for local use and for digging into the details of what's covered.
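Assuming the task definitions above, a local run looks something like:

$ gradle test generateCoverageReport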

Loose Ends

One problem that I ran into was referencing project paths, like project.buildDir, from within an Ant task. Hopefully someone will come along and let me know how that's done.

Groovy Mocking a Method That Exists on Object

In Groovy, the find method exists on Object. find is also an instance method on EntityManager, which is commonly used in JPA to get an instance based on a database id. I was trying to create a Mock of EntityManager like:

import groovy.mock.interceptor.MockFor
import javax.persistence.EntityManager

def emMock = new MockFor(EntityManager)
emMock.demand.find { Class clazz, Object rquid -> return new Request(rqUID: rquid) }
def em = emMock.proxyDelegateInstance()

This gave me an error though:

groovy.lang.MissingMethodException: No signature of method: com.onelinefix.services.RequestRepositoryImplTest$_2_get_closure1.doCall() is applicable for argument types: (groovy.mock.interceptor.Demand) values: [groovy.mock.interceptor.Demand@a27ebd9]
Possible solutions: doCall(java.lang.Class, java.lang.Object), findAll(), findAll(), isCase(java.lang.Object), isCase(java.lang.Object)
at org.codehaus.groovy.runtime.metaclass.ClosureMetaClass.invokeMethod(ClosureMetaClass.java:264) ~[groovy-all-1.8.6.jar:1.8.6]
at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:877) ~[groovy-all-1.8.6.jar:1.8.6]
at groovy.lang.Closure.call(Closure.java:412) ~[groovy-all-1.8.6.jar:1.8.6]
at groovy.lang.Closure.call(Closure.java:425) ~[groovy-all-1.8.6.jar:1.8.6]

It turns out that find is a DefaultGroovyMethods method that’s added to all Objects, and this was causing the mock’s demand to get confused because it thought I was trying to call Object.find.

Luckily there is a workaround, and that’s to use the ordinal argument for find.


emMock.demand.find(1) { Class clazz, Object rquid -> return new Request(rqUID: rquid) }

That’s enough of a change to invoke the mock version instead of the DefaultGroovyMethods version.
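For completeness, here’s a sketch of how the proxy might then be exercised (Request comes from my project; the values are made up):

def em = emMock.proxyDelegateInstance()
def request = em.find(Request, "some-rquid")
assert request.rqUID == "some-rquid"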
Now if only I knew why…

Testing and Internal Implementation in .NET

Switching back and forth between Java and .NET lets you see some of the differences between the two platforms more easily. This happened to me the other day when I switched from Java to .NET and was writing unit tests. In Java, the access levels are public, private, protected, and default. In C# they are public, private, protected, and internal. The public and private access modifiers are very similar across the two. Protected is slightly different: Java allows both derived classes and classes in the same package to access protected elements, where C# only allows derived classes to access them. Where things diverge more is between default and internal. Default in Java restricts access to the same package, while internal in C# restricts access to the same Assembly (generally a single DLL).

What does this have to do with testing you might ask?

It’s a good OO design principle to expose only those things that are part of the contract of a class or package and to leave the implementation hidden as much as possible. This is called encapsulation. You can make methods private or default/internal. You can make entire classes default/internal and publicly expose only an interface that clients need to use.

A common practice in the Java world is to mimic the package layout of your main source code in your test code. When you mimic that layout, your test classes and implementation classes end up in the same packages, so your test classes can access all those default members to test them. In C#, because access is based on the Assembly rather than the namespace, this doesn’t work.

Luckily there’s an easy workaround.

In the AssemblyInfo.cs of your main project add:

[assembly: InternalsVisibleTo("SomeOther.AssemblyName.Test")]

Where SomeOther.AssemblyName.Test is the name of the Assembly that contains your tests for the target assembly. The test code can then access the internal details of that assembly, and you can easily test things that other calling code might not have access to.
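As a small illustration (all names here are hypothetical, and NUnit is assumed for the test attributes):

// In MyApp.Core, which declares [assembly: InternalsVisibleTo("MyApp.Core.Test")]
internal class PriceCalculator
{
    internal decimal ApplyDiscount(decimal amount)
    {
        return amount * 0.9m;
    }
}

// In MyApp.Core.Test:
using NUnit.Framework;

[TestFixture]
public class PriceCalculatorTest
{
    [Test]
    public void AppliesTenPercentDiscount()
    {
        var calc = new PriceCalculator();
        Assert.AreEqual(90m, calc.ApplyDiscount(100m));
    }
}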

Using Groovy AST to Add Common Properties to Grails Domain Classes

Groovy offers a lot of runtime meta-programming capabilities that allow you to add reusable functionality in a shared fashion. Grails plugins make use of this ability to enhance your project. One thing you can’t do with runtime meta-programming in Grails, though, is add persistent Hibernate properties to your domain classes. If you want to add a persistent property in a plugin (or otherwise using meta-programming) in your Grails project, you have to use “compile-time” meta-programming. In Groovy this is done with AST Transformations.

(If you are unfamiliar with the concept of the Abstract Syntax Tree, see the Wikipedia article on Abstract Syntax Tree.)

AST Transformations are made up of two parts: (1) an annotation and (2) an ASTTransformation implementation. During compilation the Groovy compiler finds all of the annotations and calls the ASTTransformation implementation for each, passing in information about the annotated code.

To create your own transformation you start by creating an annotation. The key to making the annotation work is that it must itself be annotated with @GroovyASTTransformationClass. The value passed to GroovyASTTransformationClass defines the transformation that will be called on the annotated classes, methods, or other code prior to compilation.

Example Annotation


package net.zorched.grails.effectivity;

import org.codehaus.groovy.transform.GroovyASTTransformationClass;
import java.lang.annotation.*;

@Target({ElementType.TYPE})
@Retention(RetentionPolicy.RUNTIME)
@GroovyASTTransformationClass({"net.zorched.grails.effectivity.EffectivizeASTTransformation"})
public @interface Effectivize {
}

Notice the reference to net.zorched.grails.effectivity.EffectivizeASTTransformation. That’s the important part because it defines the class that will be used to perform the transformation.

Example Transformation


package net.zorched.grails.effectivity;

import org.codehaus.groovy.ast.*;
import org.codehaus.groovy.ast.builder.AstBuilder;
import org.codehaus.groovy.ast.expr.*;
import org.codehaus.groovy.ast.stmt.*;
import org.codehaus.groovy.control.*;
import org.codehaus.groovy.transform.*;
import java.util.*;
import static org.springframework.asm.Opcodes.*;

@GroovyASTTransformation(phase = CompilePhase.CANONICALIZATION)
public class EffectivizeASTTransformation implements ASTTransformation {

    // This is the main method to implement from ASTTransformation that is called by the compiler
    public void visit(ASTNode[] nodes, SourceUnit sourceUnit) {
        if (null == nodes) return;
        if (null == nodes[0]) return;
        if (null == nodes[1]) return;
        if (!(nodes[0] instanceof AnnotationNode)) return;

        ClassNode cNode = (ClassNode) nodes[1];
        addProperty(cNode, "effectiveStart", Date.class, createGenerateStartMethodCall());
        addProperty(cNode, "effectiveEnd", Date.class, createGenerateEndMethodCall());
    }

    // This method returns an expression that is used to initialize the newly created property
    private Expression createGenerateStartMethodCall() {
        return new ConstructorCallExpression(new ClassNode(Date.class), ArgumentListExpression.EMPTY_ARGUMENTS);
    }

    private Expression createGenerateEndMethodCall() {
        return new MethodCallExpression(
                new ConstructorCallExpression(new ClassNode(Date.class), ArgumentListExpression.EMPTY_ARGUMENTS),
                "parse",
                new ArgumentListExpression(new ConstantExpression("yyyy/MM/dd"), new ConstantExpression("2099/12/31")));
    }

    // This method adds a new property to the class. Groovy automatically handles adding the getters and setters so you
    // don't have to create special methods for those
    private void addProperty(ClassNode cNode, String propertyName, Class propertyType, Expression initialValue) {
        FieldNode field = new FieldNode(
                propertyName,
                ACC_PRIVATE,
                new ClassNode(propertyType),
                new ClassNode(cNode.getClass()),
                initialValue
        );

        cNode.addProperty(new PropertyNode(field, ACC_PUBLIC, null, null));
    }
}

This example code gets called for each annotated class and adds two new Date properties, effectiveStart and effectiveEnd, to it. Those properties are seen by Grails and Hibernate, become persistent, and behave the same as if you had typed them directly into your domain class.
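Using it then looks like any other annotation. A hypothetical domain class:

import net.zorched.grails.effectivity.Effectivize

@Effectivize
class Subscription {
    String name
}

// After compilation the generated properties behave like hand-written ones:
// new Subscription(name: "demo").effectiveStart instanceof Date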

It’s a lot of work to add a simple property to a class, but if you’re looking to consistently add properties and constraints across many Grails Domain classes, this is the way to do it.

Check Multiple Mercurial Repositories for Incoming Changes

Currently I have a whole bunch of Mercurial repositories in a directory, all cloned from a central repository that the team pushes their changes to. I like to keep my local repositories up-to-date so that I can review changes. Manually running hg incoming -R some_directory on 20 different projects is a lot of work, so I automated it with a simple shell script.

This script will run incoming (or outgoing) on all of the local repositories and print the results to the console. Then I can manually sync the ones that have changed if I want.

I called this file hgcheckall.sh and run it like: ./hgcheckall.sh incoming

#!/bin/bash

# Find all the directories that are mercurial repos
dirs=(`find . -name ".hg"`)
# Remove the /.hg from the path and that's the base repo dir
merc_dirs=( "${dirs[@]//\/.hg/}" )

case $1 in
    incoming)
        for indir in "${merc_dirs[@]}"; do
            echo "Checking: ${indir}"
            hg -R "$indir" incoming
        done
        ;;
    outgoing)
        for outdir in "${merc_dirs[@]}"; do
            echo "Checking: ${outdir}"
            hg -R "$outdir" outgoing
        done
        ;;
    *)
        echo "Usage: hgcheckall.sh [incoming|outgoing]"
        ;;
esac

I guess the next major improvement would be to capture the output and then automatically sync the ones that have changed, but I haven’t gotten around to that yet.
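If you wanted to try that, a minimal sketch could lean on the fact that hg incoming exits 0 only when there are incoming changes:

# sketch: pull (but don't update) any repo that has incoming changes
for indir in "${merc_dirs[@]}"; do
    if hg -R "$indir" incoming -q > /dev/null 2>&1; then
        echo "Syncing: ${indir}"
        hg -R "$indir" pull
    fi
done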

Java Return from Finally

try…catch…finally is the common idiom in Java for exception handling and cleanup. What people may not know is that returning from within a finally block has the unintended consequence of stopping an exception from propagating up the call stack. It “overrides” the throwing of the exception so that the caller never gets to handle it.


public class Main {
    public static void main(String[] args) throws Throwable {
        System.out.println("Starting");
        method();
        System.out.println("No way to know that an exception was thrown");
    }

    public static void method() throws Throwable {
        try {
            System.out.println("In method about to throw an exception.");
            throw new RuntimeException();
        } catch (Throwable ex) {
            System.out.println("Caught exception, maybe log it, and then rethrow it.");
            throw ex;
        } finally {
            System.out.println("return in finally prevents an exception from being passed up the call stack.");
            return; // remove the return to see the real behavior
        }
    }
}

I recently came across code like this. This is Real Bad. Returning from within finally prevents the propagation of checked exceptions, which is bad enough, but worse, it prevents the propagation of runtime exceptions, which are generally programmer errors. This one small mistake can hide programmer errors so that you’ll never see them and never know why things aren’t working as expected. Interestingly, the Java compiler understands this as well: if you return from within a finally block where an exception would otherwise be thrown, the compiler does not force you to declare that exception in the method’s throws clause.
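Here’s a minimal example of that compiler behavior (my own illustration): the checked IOException can never escape read(), so javac requires neither a throws clause nor a catch block.

import java.io.IOException;

public class SwallowedChecked {
    static String read() {
        try {
            throw new IOException("boom");
        } finally {
            return "no exception escapes"; // compiles without 'throws IOException'
        }
    }

    public static void main(String[] args) {
        System.out.println(read()); // prints: no exception escapes
    }
}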

Long story short: don’t return from within finally. Ever.