Haxe is a most unusual language. So far, nobody I’ve enthused about it to has heard of it, which is a shame. I’m loving it. But before jumping into it, I want to give you some setup.

We had a problem. The company I’m with wants to flush data from hundreds of different kinds of IoT devices to the AWS Cloud. There are also Linux-powered gateways, a ton of code on the Cloud side, and Web browser applications. Among them, they use Python, C/C++, Java, JS, and PHP, and run on Linux, Mongoose, Microsoft, OSX, Android and even bare metal (the embedded controller-based devices, e.g. Arduino and ESP32).

Despite all these exotica, our problem is humble. The messages the components send, at some point, are almost all represented in JSON, so we need some way to define that JSON centrally to ensure that all participants conform to the same schema and to make it testable. The best way to do this is to provide developers with standard objects—beans, in the Java world—that emit and accept the JSON. But we don’t want to write and maintain the bean code in five languages as things evolve. How do we get around that?

IDL In The Middle?

There are many frameworks that use some kind of Interface Definition Language (IDL) to define data objects generically so that they can be converted to and from a generic wire format in a language-neutral way. These frameworks use the IDL document to generate equivalent beans in multiple languages. The beans know how to emit and reconstitute the data at either end, and in between the data is usually serialized to some efficient binary wire format to conserve bandwidth. CORBA, Protocol Buffers, Avro, and Thrift all do something like this.

IDL seems like the right general idea, but those frameworks don’t quite fit our needs because they aren’t JSON oriented and wire formats and communication aren’t really the problems. For us, it’s just a question of keeping the JSON consistent. Writing such beans isn’t a big deal—we just don’t want to write and maintain everything in five languages for the rest of eternity.

Haxe

Which brings me to Haxe—the coolest language you never heard of.

Forget IDL. Haxe is a feature-rich, high-level Turing-complete programming language. It’s very generic, somewhat Java-like but it also feels somewhat Pythonish at times. It’s got all the basic stuff with plenty of modern whistles and bells like closures and generics. Nothing too exciting there, but the point is, it’s not a specialized framework. It’s a real programming language suitable for complex projects.

The unique thing about Haxe is that there is no Haxe compiler that turns out an executable, and no virtual machine, either. Huh?! Instead, you run your code through the Haxe cross-compiler (called haxe) with a flag naming a target language, and it rewrites your Haxe program in your language of choice and even compiles it for you. I’m not sure if it compiles every compiled language, but it does so for Java. The Python just comes out Python. If you name a “main” on the command line, the result is executable.
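
For the curious, the target flag can live in a small build file rather than on the command line. This is only a sketch: the file name build.hxml, the src directory, the Main class, and the output paths are all assumptions made for illustration, and the single-dash flag spellings are the older ones (as far as I know, newer releases also accept double-dash forms).

```hxml
# build.hxml (hypothetical): one Haxe code base, two targets
-cp src
-main Main
--each

# Emit Python source
-python out/main.py

--next

# Emit Java source and compile it
-java out/java
```

Running `haxe build.hxml` would then produce both outputs in one pass.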

In theory you could modify the code haxe emits, but that would be perverse. Doing that would be a one-way street, because there’s no way to suck it back into Haxe. The normal practice would be to always modify the Haxe code, and only the Haxe code, and re-emit the code you’re going to actually run fresh. AFAIK there’s no reason to ever open the emitted files. You can, and people often do, write entire programs in Haxe, but what we’re doing is just creating libraries for manipulating the JSON.

This solves our problem perfectly. I’ve written the JSON-oriented code in Haxe. There are beans with getters and setters for all the fields, plus methods to write the JSON and constructors to go in the reverse direction. There’s also a convenience class to run some standard known data through each object and convert it to JSON so we can verify reference output with automated tests.
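
To give a flavor of what such a bean might look like, here is a minimal sketch. It is not the actual project code: the class name TemperatureReading, its fields, and the toJson/fromJson method names are all made up for illustration, and plain public fields stand in for the getters and setters. It assumes the file is saved as Main.hx.

```haxe
import haxe.Json;

// Hypothetical message bean; the class and field names are illustrative only.
class TemperatureReading {
    public var deviceId:String;
    public var celsius:Float;

    public function new(deviceId:String, celsius:Float) {
        this.deviceId = deviceId;
        this.celsius = celsius;
    }

    // Emit the JSON representation of this bean.
    public function toJson():String {
        return Json.stringify({deviceId: deviceId, celsius: celsius});
    }

    // Go in the reverse direction: rebuild a bean from JSON text.
    public static function fromJson(text:String):TemperatureReading {
        var raw:Dynamic = Json.parse(text);
        return new TemperatureReading(raw.deviceId, raw.celsius);
    }
}

class Main {
    static function main() {
        // Round-trip one known value through JSON and back.
        var reading = new TemperatureReading("sensor-1", 21.5);
        var copy = TemperatureReading.fromJson(reading.toJson());
        trace('${copy.deviceId} ${copy.celsius}');
    }
}
```

The same source can be emitted for each target language, so every platform would get a bean that reads and writes the same JSON fields.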

Now, the developers can be confident that they are writing to the same message model and they don’t have to code the JSON at all—that’s done just the one time in Haxe and updates will go to all the libraries automatically when it’s compiled.

It took a couple of hours to get Haxe set up, figure out how it worked, and establish the mechanics, but after that it was easy, and it looks like it will reduce the time spent on maintaining this aspect by 80%, forever. It’s so successful and easy that we’re looking to fold other boilerplate functionality into the Haxe build. It’s pretty amazing.

Some More Details

Once more, I’m no expert (yet) but there are some other points worth keeping in mind so I’ll just dump some things I’ve come across and you can investigate yourself.

Limitations of space and my own inexpertise will only allow me to touch on the high points, but HaXe’s excellent Website is very complete, and there is a book, HaXe 2 Beginner’s Guide by Benjamin Dasnois, available online.

Type Checking and Binding

Haxe has an odd model for type checking.

Languages vary in two major ways w.r.t. type checking: how strict they are, and when a given variable is bound to a type. For instance, in a C++ program, you always have to state the type of a variable or function when you declare it and thereafter the type definition cannot change. This is called compile-time binding or early-binding. On the other hand, you do not declare types in Python and similar languages because the runtime figures out what type a variable is on the fly. This is known as late-binding. In fact, in Python, a given variable can hold different types at various points during execution, which strikes many C++ and Java people as flat-out depraved. (Not all late-bound languages allow this.)

Superficially, Haxe type rules sometimes feel like Python in that you can simply ignore types much of the time, but there are times when Haxe insists that you give a variable or function return value a type. A language expert may correct me on this, but despite the fact that you don’t always have to declare the type, the underlying model of Haxe seems to be strictly compile-time bound in that by the time the code hits the compiler, every type must be either explicitly stated or deterministically inferable. It would make sense because if the language you compile to doesn’t need the information, you can throw it away, but if it weren’t there, you could not compile to early-bound languages.
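
A tiny sketch of that behavior, with variable names invented for illustration. The types are never written out for the first two variables, yet the compiler pins them down from the initializers, and the commented-out line is the kind of thing it would reject.

```haxe
class Main {
    static function main() {
        var count = 3;          // inferred as Int from the initializer
        var label = "devices";  // inferred as String

        // count = "three";     // would not compile: count is already an Int

        // An explicit annotation is always allowed; in Haxe, Int / Int yields a Float.
        var ratio:Float = count / 2;

        trace('$count $label, ratio $ratio');
    }
}
```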

How Hard Is It?

If you are a user of any major programming language other than Lisp, much of Haxe will feel familiar. You can just start using it and figure out the fine points as you muddle through your first project. I’m a complete newbie myself, having done exactly one non-trivial project in Haxe, but I got several hundred lines of working code without too much trouble.

How Complete is It?

Most of the features of modern languages are included in Haxe. It’s kind of weirdly generic that way. If you’re used to Java and Python, you’ll barely notice that it’s not whatever language you’re used to. The big exception is that Haxe does not expose memory directly with pointers the way C and C++ do, so that style of programming won’t be directly available to you. Just as cross-compiling from an early-bound model to a late-bound model is logically straightforward but going from a late-bound model to an early-bound model would not be, translating from a language without pointers to a language with pointers is relatively easy, but going from a language with pointers to one without would be a very good trick.

Compiling

It feels like magic—you write your code one time in this generic programming language and in seconds you can have your code in Java, Python, C/C++/CPPIA, Lua, Neko, PHP, JavaScript, C#, or some specialized languages such as Flash.

But of course, it can’t really be that simple. Languages have libraries that aren’t strictly part of the language proper but are very much a part of the language culture. Also, a few obvious things are mysteriously left out. For instance, when the target is C/C++, C#, Java, Neko, or PHP, you have the Haxe Sys library, which deals with command-line arguments and several other important things, but it’s not available for Python. I had to work around a few things, but very few so far.

It goes the other way, too. A lot of the general purpose stuff is built into the Haxe libraries but it can’t be expected to include the union of all languages’ library functionality, if only because the models are often different. So for each language, there is an additional set of idiosyncratic libraries. The PHP library has things like cookies and HTML, and the Flash libraries obviously have Flash, etc.

To cover the inevitable complexities there are some compiling features such as conditional compilation that let you continue to maintain one code base. The ins and outs of compilation are not trivial but they don’t seem like a big deal compared to C++.
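
Conditional compilation looks roughly like the sketch below. The messages are invented for illustration; the `sys`, `python`, and `java` names are the compiler-defined flags I would expect for those targets, so treat the exact set as an assumption to check against the manual.

```haxe
class Main {
    static function main() {
        #if sys
        // Compiled only for targets that provide the Sys API.
        var args = Sys.args();
        trace('Got ${args.length} argument(s): $args');
        #else
        // Compiled for targets without system access, e.g. JavaScript in a browser.
        trace("No Sys API on this target");
        #end

        // Branch on the specific target language being emitted.
        #if python
        trace("Built for Python");
        #elseif java
        trace("Built for Java");
        #else
        trace("Built for some other target");
        #end
    }
}
```

Only the branch that matches the current target ends up in the emitted code, which is what lets one code base absorb the per-language differences.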

The Black Arts

If you are truly hardcore (and my core is pretty much that of a Hostess Twinkie) you can start messing with macros. This feature lets you jump into the middle of the translation/code generation path and insert your own custom functionality. This means that in theory you can soup it up in arbitrarily complex ways, but given the comprehensive set of syntactic constructs and libraries that are already there, you’d have to have some pretty abstruse needs to justify getting into that for anything more than the sheer deviltry of it.

What Is The Correct Pronunciation?

It doesn’t matter how you pronounce it because nobody has ever heard of it before. As the unquestioned authority, your pronunciation will be established locally as correct.

If you find yourself talking to someone who actually knows the correct pronunciation, rely on chutzpah. Eccentrics are still proudly mispronouncing vi to rhyme with bye after more than 40 years and others snobbishly stick to saying Line-ux instead of Lin-ux because supposedly Linus is pronounced Line-us. Be your own person, goddammit.

Me, I’m going with hax. Like axe.

Bottom Line

I could not be more tickled so far. It’s not perfect, because languages have differences that are more than cosmetic (non-removable, as they say in calculus). But for the kind of situation we’re in, it couldn’t be better. Word on the street (I can’t verify it) is that it’s popular with people who write multi-platform games, Web applications, and desktop applications. I can see why.

Addenda

Haxe experts Jeff Ward and Mike Knol were kind enough to send me some comments and clarifications. Rather than either relegating this material to the comments or attempting to correct and adjust my own text, I’m going to inline comments below (in italics.) There is less chance of them being mangled that way, and comparing their comments to what I wrote gives useful contrast. Some of these hint at some tantalizing conceptual depths.

Jeff Ward Says

Cross-Compilation

As you said: strictly speaking, the Haxe compiler itself simply cross-compiles, but for some targets, a VM implementation is required (and provided). e.g. the hxcpp library (which contains a VM implementation) is required to target CPP. Same for the hl (aka hashlink) target. This VM includes a garbage collector and support for the high-level features of Haxe, e.g. function closures.

This is an interesting wrinkle. The Java and Python targets work pretty much as you’d guess, but I’ll have to give CPP a try and get back to this issue. That sounds like another blog piece!

IDL

Since you mentioned IDL’s – my company is experimenting with describing some data types in graphql (which among other things includes a simple IDL). I’ve written a library for compiling graphql IDL to Haxe typedefs. That may be overkill for your purposes, but generally I agree – Haxe is a great language to define JSON-like structures.

GraphQL is a query language designed at Facebook with querying across the Internet in mind. Queries are written in GraphQL’s own syntax, results come back as JSON, and a typical use case might have different languages at either end of the query. It’s quite similar to the problem we have, but in a much more complex environment. I’m using Haxe for something much humbler: mere beans.

Macros: A More Powerful Approach

And macros, as you say, generate code at compile-time. So, say you have a typedef which defines your valid JSON structure. With a macro, e.g. it could use this definition to auto-generate the Haxe code necessary to validate these structures at runtime. If that’s something you care to do.

You might also glance at the tink_json library. It’s a macro library for working with JSON in a type safe (and performant) way. But again, perhaps overkill for your application?

This is quite a bit more sophisticated than what I’ve cobbled together, but it speaks to what I meant about Haxe being a serious programming language, not just a tool for getting around some limited problem. It’s an intriguing idea. Definitely appealing for version 2.0.
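
For readers who haven’t seen one, a typedef describing a JSON structure might look something like this sketch. The Heartbeat name and its fields are invented for illustration. Note that nothing here validates the data at runtime, which is exactly the gap a macro-generated validator or a library like tink_json would aim to close.

```haxe
import haxe.Json;

// Hypothetical message shape; the field names are illustrative only.
typedef Heartbeat = {
    var deviceId:String;
    var uptimeSeconds:Int;
    @:optional var note:String;
}

class Main {
    static function main() {
        // Json.parse returns Dynamic, so this assignment is accepted by the
        // compiler but the contents are not checked against the typedef at runtime.
        var beat:Heartbeat = Json.parse('{"deviceId":"gw-7","uptimeSeconds":120}');

        trace(beat.deviceId);

        // beat.uptime; // would not compile: Heartbeat has no field named uptime
    }
}
```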

Types

WRT inference… Yes, Haxe’s handling of type inference is fantastic. Basically, if you don’t type a variable, it’s a monomorph (of unknown type) until some later expression causes it to be inferred. Once inferred, it’s strictly typed. But this can trip you up occasionally. e.g. if you don’t type function arguments:

    function do_something(a) {
        return Math.round(a);
    }

And then you tuck a trace statement in there:

    function do_something(a) {
        trace('Debug: a is: $a');
        return Math.round(a);
    }

Suddenly there’s an error, because while a had been inferred as a Float, suddenly you inserted a line that inferred it as a String. For this reason, and for code readability, I (almost) always type function arguments. Just be aware of these things, and try to think like a compiler. 🙂

Very much what I was trying to get at. Even if the type of a variable is unambiguous when you first write something, later code can make it ambiguous. That’s one reason to default to expressing variable types. Another is that there’s useful information there for you, the programmer.
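
As a sketch of the remedy, here is the same function with the argument and return types written out. Wrapping it in a Main class with a sample call is my addition so that the snippet stands alone.

```haxe
class Main {
    // Annotating the argument pins its type, so the trace line cannot change it.
    static function do_something(a:Float):Int {
        trace('Debug: a is: $a');
        return Math.round(a);
    }

    static function main() {
        trace(do_something(2.6));
    }
}
```

With the annotation in place, the debug line simply formats the Float into the string instead of steering the inference.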

Mike Knol Says

Sys Library

Python target does have sys access.

I’ll have to check this out. I wasn’t able to get command line arguments and some other things in Sys working in Python, which wasn’t a problem because I’m only using command line arguments for testing purposes. Everything gets called as a library routine in real life.

Resources

You have links to haXe 2 books. haXe 2 is quite old. I mean so old that it was written like haXe; “we” just write “Haxe” nowadays.

We have more resources like the Haxe manual and the [Haxe code cookbook](https://code.haxe.org), which you could link if you want.

The Haxe manual and Cookbook are terrific resources. Haxe doesn’t have the vast user community of languages like Java and Python, so it’s not quite as easy to program-by-Google in Haxe, but it’s all out there and thoughtfully presented.

Pronunciation

Btw, I once made this: http://haxe.stroep.nl/how-to-say-haxe/ That is how you could pronounce Haxe (I think it works only in recent browsers).

Looks like there’s still room for independence. “Hax” seems to be dominating in the US, but numerous variations on “hax-uh” are used elsewhere. I might switch—now that I’ve heard all these other people say it, the American pronunciation sounds sort of like a duck.

2 thoughts on “Haxe”

nvcken says:

Hi, very cool. How about another aspect, like parsing speed, comparing for example Protocol Buffers vs JSON?

July 5, 2018 at 3:49 pm
- Peter Coates says:
  
  I really don’t know the answer, but Protocol Buffers, Avro, and similar frameworks deal with a broader set of issues than JSON, emphasizing the full end-to-end data path, while JSON is a narrow convention for representing data. But if you used some library to convert objects to JSON, sent the JSON via HTTP, and then unpacked it, I think you’d find PB would be faster in most cases. I say this because PB and the like pack the data into a dense binary wire format, whereas JSON is just text, and fluffy text at that. So PB’s advantage would increase with object size and also with longer transmissions or limited channels. But for me, the critical thing is that PB handles only the end-to-end bit, whereas if you use Haxe, much more of your code only has to be written once. Factory methods, convenience methods, verification of the data, diagnostics, etc. can all be centralized in the Haxe and not duplicated for each language. For example, in our case, you might have a general pattern that you want to send data updates only if certain thresholds are reached, otherwise you’d just send a minimal heartbeat every Nth skipped message. With Haxe, that kind of code can be common across the various languages. That’s huge for us because otherwise, for the rest of time, all the usual maintenance would have to be done in several languages.
  
  July 5, 2018 at 8:10 pm

hadoopoopadoop

Big Data with Hortonworks Hadoop