On Tuesday last week I attended my first ever meetup in New York as a member of the Open Statistical Programming Meetup. This particular meetup was focusing on Julia, a new programming language I’d been interested in hearing more about since I read a post by John Myles White on the R-bloggers metablog titled, “Julia, I love you”.
Since then I’d read bits and pieces about the language, but generally more conceptual stuff, rather than syntax or applications. Given that Stefan Karpinski - one of the language’s creators - was speaking, this seemed an ideal opportunity to get more information from the horse’s mouth, as well as see some demos of active projects (and see why, exactly, John loves Julia so much).
Julia - overview and goals
In a nutshell, Julia is a dynamic language which uses an aggressive LLVM-based JIT compiler for incredibly fast and flexible scientific computing. It allows high and low level programming in a single language, with the ability to easily call C, Python and Fortran libraries. While it doesn’t yet have a threading model, it allows parallel computing over a distributed system, meaning you can transparently send and receive information from distributed computers, making it ideal for big data, the buzzword of the moment. The initial, pre-release version was pushed out in February this year, and since then there has been a huge amount of work updating and building out the necessary support. Its development has been based on the following aims:
“We want a language that’s open source, with a liberal license. We want the speed of C with the dynamism of Ruby. We want a language that’s homoiconic, with true macros like Lisp, but with obvious, familiar mathematical notation like Matlab. We want something as usable for general programming as Python, as easy for statistics as R, as natural for string processing as Perl, as powerful for linear algebra as Matlab, as good at gluing programs together as the shell. Something that is dirt simple to learn, yet keeps the most serious hackers happy.”
Those are pretty lofty goals, but after seeing what Julia can do, I’d say Stefan and co are well on their way to meeting them. There were a few things during the talks which I felt really stood out.
One was optional typing. This means you can define a variable as a specific type (in which case you impose the various [runtime] typing restrictions you expect) but you don’t have to. This option is awesome. It means you can write typesafe code where appropriate, or hack something together in a matter of minutes (as you might in Python or Perl). The best thing, though, is that using or not using typing has no impact on performance, meaning it’s purely a design choice, not an optimization one. It also allows for a totally natural way to make calls to C functions, which is typically less syntactically clear in a totally untyped language. What typing does allow, however, is multiple dispatch, which allows for massive generalization of functions and code. Type checking occurs at runtime only, with no compile-time type checking, which allows for a dependent type system that would otherwise be impossible (because a runtime type is only actually generated at runtime, so it can use real values, not just types). This adds a lot of flexibility to how you develop your systems.
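To make the optional typing and multiple dispatch points a bit more concrete, here is a minimal sketch of my own (not code shown at the meetup). It assumes the syntax of current Julia 1.x releases, which may differ from the pre-release version discussed here, and the function names `double`, `describe` and `combine` are made up for illustration.

```julia
# Untyped: no annotation, so this accepts anything that supports `*`.
double(x) = 2 * x

# Typed: each annotation restricts which arguments a method accepts.
describe(x::Integer) = "integer $x"
describe(x::AbstractFloat) = "float $x"

# Multiple dispatch: the method is picked from the runtime types
# of all the arguments, not just the first one.
combine(a::Number, b::Number) = a + b
combine(a::AbstractString, b::AbstractString) = string(a, b)
combine(a::AbstractString, b::Number) = string(a, b)

println(double(21))           # 42
println(describe(3))          # integer 3
println(describe(3.0))        # float 3.0
println(combine(1, 2.5))      # 3.5
println(combine("ab", "cd"))  # abcd
println(combine("n = ", 4))   # n = 4
```

Calling `combine(4, "n")` would raise a `MethodError`, since no method was defined for that combination of types; that is the runtime type restriction in action.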
A lot of Julia’s speed comes from a type inference algorithm which allows the specialization of code. This means that rather than generating generic methods which then type check, type-specific code is compiled, making the language incredibly fast. I guess it’s kind of obvious, but both the benchmarks Stefan showed and the feedback from the developers who presented were pretty impressive. Wes McKinney blogged about some pretty trivial array operations benchmarking more slowly than you’d expect given the benchmarks shown. Stefan’s response to this was good, and I’d hope that any of the hiccups or inconsistencies seen this early on are simply due to a mixture of immature libraries and people not doing things in the most efficient way. Given the language’s age this is pretty understandable, and fingers crossed compiler optimization can correct for the latter.
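As a rough illustration of specialization (again my own sketch, assuming current Julia 1.x syntax, with `total` as a made-up name), a single untyped definition can be called with different element types, and the compiler generates a separate type-specific version for each.

```julia
# One generic definition with no type annotations.
function total(xs)
    acc = zero(eltype(xs))   # start from a zero of the element type
    for x in xs
        acc += x
    end
    return acc
end

# The first call with a new argument type compiles a version
# specialized to that type; later calls with that type reuse it.
println(total([1, 2, 3]))        # 6
println(total([0.5, 1.5, 2.0]))  # 4.0
```

In an interactive session, `@code_llvm total([1, 2, 3])` prints the code generated for the integer case, which is a handy way to see that no generic type checks are left in the loop.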
An honorable mention should probably go to the JS V8 engine, which is pretty speedy, although not very well suited for scientific computing.
Syntax and flexibility
Syntax is perhaps not the most crucial thing for seasoned programmers, but for new programmers having a syntax which basically looks like Matlab + Python is only a good thing – code looks like pseudocode at times (example below).
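To give a flavour of that pseudocode feel, here is a small sketch of my own (not code from the talks), assuming current Julia 1.x syntax. The function name `mean_and_variance` is invented for the example.

```julia
# Sample mean and variance, written much as you would on paper.
function mean_and_variance(xs)
    n = length(xs)
    m = sum(xs) / n
    v = sum((xs .- m) .^ 2) / (n - 1)   # dotted operators work elementwise
    return m, v
end

m, v = mean_and_variance([1.0, 2.0, 3.0])
println("mean = $m, variance = $v")   # mean = 2.0, variance = 1.0
```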
Another thing I liked was the code’s flexibility, where even the most unassuming expressions can trigger massive operations. As Stefan put it,
“(a+b) can do a single machine instruction or start up a cluster”
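The reason this works is that `+` is an ordinary function you can add methods to. A minimal sketch, assuming current Julia 1.x syntax: the `RemoteVector` type below is hypothetical and only stands in for data living on other machines, so nothing here actually talks to a cluster.

```julia
# Hypothetical wrapper: each inner vector plays the role of a chunk
# held by a different (imaginary) worker.
struct RemoteVector
    parts::Vector{Vector{Float64}}
end

# A new method for `+`. A real distributed type would ship each chunk
# to a worker; this sketch simply adds the chunks locally.
function Base.:+(a::RemoteVector, b::RemoteVector)
    return RemoteVector([x .+ y for (x, y) in zip(a.parts, b.parts)])
end

println(1 + 2)   # plain integers: 3

a = RemoteVector([[1.0, 2.0], [3.0]])
b = RemoteVector([[10.0, 20.0], [30.0]])
c = a + b        # same syntax, completely different work behind it
println(c.parts)
```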
Despite all the awesome language features, one of the things I was most impressed with was the actual presentation. Trying to push a new language and make it appealing is not an easy task. Stefan’s presentation was awesome – in my opinion it was pitched at precisely the right level, the mix of overview and code was great, and he dealt with questions exceptionally well. The presentations from Shane Conway and John only backed up what Stefan had said, and indeed further showcased the language’s power and simplicity, but in a totally non-fanboyish way.
Julia’s future looks pretty exciting too. Stefan said that there will hopefully be a package manager and a module system out in the next month or so. I think when this happens, we’ll start to see some of the general scientific packages ported over (something I’d be interested in doing). Additionally, we can expect improved performance and stability ahead of the 1.0 release.
For me, this is very much one to watch, and perhaps get involved in…
Postscript, March 2015: It's interesting that although I haven't gotten involved with Julia (yet...), the language has flourished. At the time of this meetup there was a discussion about what makes a language 'real'. The general benchmark was circa 100 or more people using and working on the language. In the fall of 2013 I was talking with a friend who, it turns out, is using Julia in production in his thesis work. It seems crazy that I was there with the creator a few months after the language's alpha release, and now it's being used at scale to solve real (hard!) problems.