Monday, May 30, 2011

Some DSL fun with Groovy 1.8

One of my colleagues at work was recently tasked with creating a query language for the new integrated infrastructure we're building out. I think he is ultimately going to go with NodeJS, but after reading Groovy's recent 1.8.0 release notes, I thought I'd try my hand at doing it with Groovy.

I typically like to work towards a concrete goal, so I started with a few directional ideas for what I wanted the language to look like. First, the new language is going to be internally (and perhaps externally) referred to as CQL (pronounced cee-que-el), not to be mistaken for SQL (ceequel), so I wanted the language itself to look different from SQL. That rules out stuff like, "select foo from bar where = 'foo1'".

Second, I've long had a preference for "finder" methods over "getter" methods for data-access-level classes. This also happens to go along with Grails' dynamic finders. So I wanted to stay away from a language like "get user where firstname = 'foo' and lastname = 'bar'". Personally, I find the "find by" language much more user-friendly anyway.

With these ideas in mind, and not having a ton of experience with the new command chains in Groovy, my rough goal was to create something like this:
find entity by someField: "value" or by someOtherField: "value"

Here is what I actually ended up with:
find user("email", "firstName") by email: "" or email: ""

Pretty darn close, perhaps even better actually. There's a nice way to specify which fields from User I really want, and I think it reads really well.

With parentheses and dots, this translates to:
find(user("email", "firstName")).by(email: "").or(email: "")

Here's how I did it.
First, let's take care of the find and user methods:
def find(it) { it }
def user(String[] fields = [] ) {
   new UserFinder(fields: fields as Set)
}
Basically, the 'find' method is just there for the sake of syntax; it's really not needed and simply returns whatever object is passed into it. In this case, that's going to be a UserFinder, which is created by the 'user' method.

This gives us a nice encapsulation of concerns: the UserFinder is responsible for finding users.
class UserFinder {
   Set fields
   Map byFields
   Map orFields

   UserFinder by(Map byFields) {
      this.byFields = byFields
      return this
   }

   UserFinder or(Map orFields) {
      this.orFields = orFields
      return this
   }

   def find() {
      // ... do find work ...
   }
}

That's really it. Note that since both the 'by' and the 'or' methods take maps, you could provide multiple fields there, e.g.:
find user("firstName", "lastName") by email: "", id: 1 or email: "", id: 2
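Putting the pieces together, here is a minimal runnable sketch of the script so far. It assumes a few things that aren't in the snippets above: the 'find' work itself is left out, 'user' is declared with varargs, the maps default to empty, and the email addresses are made-up placeholder values. It checks that the command chain and the dotted form end up with the same state.

```groovy
// Finder holding the requested fields and the 'by' / 'or' criteria.
// The actual lookup is omitted; only the DSL plumbing is shown.
class UserFinder {
   Set fields
   Map byFields = [:]
   Map orFields = [:]

   UserFinder by(Map byFields) {
      this.byFields = byFields
      return this
   }

   UserFinder or(Map orFields) {
      this.orFields = orFields
      return this
   }
}

// 'find' only exists to make the query read nicely.
def find(it) { it }

// Varargs variant of the 'user' method (an assumption of this sketch).
def user(String... fields) { new UserFinder(fields: fields as Set) }

// Command chain form; the values are placeholders.
def chained = find user("firstName", "lastName") by email: "a@example.com", id: 1 or email: "b@example.com", id: 2

// The same query written with parentheses and dots.
def dotted = find(user("firstName", "lastName")).by(email: "a@example.com", id: 1).or(email: "b@example.com", id: 2)

// Both forms populate the finder identically.
assert chained.fields == dotted.fields
assert chained.byFields == dotted.byFields
assert chained.orFields == dotted.orFields
```

Each named-argument list becomes a single Map, which is why several fields can be passed to 'by' or 'or' in one go.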

I see a few benefits with this approach. It's easy to separate out the concerns: the "Finder" classes are responsible for doing the actual finding. A Groovy script template can be created and set up so that the 'user' function and the finder classes are imported, which makes it easy to figure out what functionality is available. Obviously some sanitisation of user queries needs to be done, but beyond that, execution of user-supplied queries should be reasonably safe as long as the only methods that can be called are those provided within the script template. Lastly, groovydoc could be used to produce some documentation without a lot of extra work.
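One way the script template idea could be sketched is with a custom script base class. This is only one possible setup, not necessarily what a final implementation would use: CqlScript is a hypothetical name, the finder is trimmed down to the 'by' method, and the email value is a placeholder. CompilerConfiguration.scriptBaseClass and GroovyShell.evaluate are standard Groovy APIs.

```groovy
import org.codehaus.groovy.control.CompilerConfiguration

// Trimmed-down finder, just enough to show the wiring.
class UserFinder {
   Set fields
   Map byFields = [:]

   UserFinder by(Map byFields) {
      this.byFields = byFields
      return this
   }
}

// Hypothetical base class: every query is compiled as a subclass of it,
// so 'find' and 'user' are available without imports in the query text.
abstract class CqlScript extends Script {
   def find(it) { it }
   def user(String... fields) { new UserFinder(fields: fields as Set) }
}

def config = new CompilerConfiguration()
config.scriptBaseClass = CqlScript.name

// Reuse this script's class loader so the shell can resolve CqlScript.
def shell = new GroovyShell(this.class.classLoader, new Binding(), config)

// The query text as it might arrive from a client (placeholder value).
def query = shell.evaluate('find user("email", "firstName") by email: "a@example.com"')

assert query instanceof UserFinder
assert query.byFields == [email: "a@example.com"]
```

A base class on its own does not stop a script from reaching other JVM classes, so the sanitisation step would still be needed. The SecureASTCustomizer that ships with Groovy 1.8 is one option worth looking at for that.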

Another interesting idea might be to provide a client library, which would mean the end user could build these queries in code, and the code would look exactly the same as the query that gets sent over.



  1. One of the aspects of SQL is that the engine can make optimizations. It seems to me that the query language you invented limits the optimization by forcing the execution order - so it will only be suitable for small databases.
    I really liked the theme of the blog (wallpaper, colors etc).

  2. :) Thanks Shlomy!

    The DSL isn't really going to be running directly against any data source. It's more of an abstraction on top of our new integrated service layer. Also, the execution order isn't forced at all: in my example the 'by' and 'or' fields are simply stored in maps, and the 'find' method can use them as it sees fit. This would be the place for any optimisations. E.g. the "user" object might actually get its data from multiple data sources or back-end services. The find method could make two parallel service calls, one to find the user by id and another to find it by email address. The idea of the UserFinder class is to encapsulate that logic behind the 'find' method. This way the query can be handled in the best way. :)
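As a rough illustration of that last point, here is a sketch of a 'find' method that runs one lookup per criteria map in parallel. Everything about the back end is an assumption: 'lookup' is a hypothetical stub that simply echoes its criteria, standing in for a real service call, and the thread pool size of 2 is arbitrary.

```groovy
import java.util.concurrent.Callable
import java.util.concurrent.Executors

class UserFinder {
   Map byFields = [:]
   Map orFields = [:]

   // Hypothetical back-end call, stubbed to return its criteria in a list.
   Closure lookup = { Map criteria -> [criteria] }

   List find() {
      def pool = Executors.newFixedThreadPool(2)
      try {
         // One task per non-empty criteria map.
         def tasks = [byFields, orFields].findAll { it }.collect { Map criteria ->
            return { -> lookup.call(criteria) } as Callable
         }
         // invokeAll blocks until every task has completed.
         def results = []
         pool.invokeAll(tasks).each { future -> results.addAll(future.get()) }
         return results
      } finally {
         pool.shutdown()
      }
   }
}

def finder = new UserFinder(byFields: [id: 1], orFields: [email: "a@example.com"])
def found = finder.find()
assert found.size() == 2
```

The caller still only sees 'find'; whether the lookups run in sequence, in parallel, or get merged into one call stays an implementation detail of the finder.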