Go Go Go

I’m new to Go—just a few months in. I’ve spent a lot more time with Java, C++, even Python, but the Gopher is an interesting critter so far.  It’s not just a better version of <your favorite language here>.

Every language is a commitment to a particular way of looking at programming, but rarely more so than with Go, which is often politely described as “opinionated.”

Some Informal History

Go’s ancestor, the C language, which was invented in 1972, spelled the end of the age of assembler for systems programming. In the words of its author, Dennis Ritchie, C is portable assembler. Until C came along, you had to hand-port operating systems and other systems code to each platform you wanted to run on. There weren’t any IDEs; it was just you and your editor back then.

A vast superstructure of software development tooling has evolved since C was young but less of it than one might imagine is about telling machines what to do.

There used to be a thing called “the software crisis” back in the 80’s and 90’s. Younger programmers may never have heard of it, but back then the majority of large projects were said to fail as the size of the problems we faced began to outstrip our ability to write commensurately large software.

Yet today, only the occasional overblown or ill-conceived project fails and success is the expectation. What happened?

It’s not the languages that have changed—I was a CS student in the late eighties and I have yet to encounter a significant language feature that did not already exist when I was an undergraduate.

It is tools and management techniques to facilitate people working together that beat down the software crisis, not high-powered syntax. Widespread adoption of Object Orientation allowed data models of unprecedented size to be developed and managed by large teams as they evolved over time. Open-source created a universe of high-quality computational Lego that let individuals or small teams produce incredibly powerful systems with little more than glue code. Agile and other management practices, powerful source control, documentation, tools like Maven, Jenkins, continuous integration systems, Jira, and of course, Unix for everyone.  These things are all about people, not machines. If you left it up to the machines, they’d write everything in C for portability and they’d skip the tabs or newlines.

Go’s design began in 2007 and the language was publicly released in 2009, into a world that would have been unimaginable when C was invented. When C was new there were no Internet or Web servers, no REST, no data centers that respond to millions of queries per second. Networks barely existed and certainly nobody carried networked supercomputers around in their pocket.

Against this historical background, Go is strange indeed. It is one of the newest major languages (even Scala is a few years older) and it was designed to deal with some peculiarly modern problems, yet it sidesteps almost all of the last forty years of the software development revolution, in fact going out of its way to make many modern software development practices difficult.

Go is extremely simple syntactically, almost purely imperative and procedural with ne’er a nod towards functional programming. It eschews almost all of object orientation, and has strikingly little support for even simple software development concepts like encapsulation. Go is garbage-collected but other than that, the language is stripped down even compared to C, lacking even simple preprocessor directives like #include and #define. The usual management tools, like Git and agile, apply but you don’t really need any kind of build system with Go. It’s remarkably simple.

What Go Is And Isn’t

In principle, you can write almost anything in almost any language, but languages tend to be targeted at certain problems. C is great for coding operating systems. Java, Scala, even C++, are better choices for sprawling business applications. If an application has demanding latency requirements, the choice pretty much narrows down to C++ for big shops. Python’s execution speed is by far the slowest of the major languages but in development time and flexibility it crushes all of the above. In the age of many-cored multi-GHz computers, speed does not always matter much.

So what is Go for?

A Systems Language?

Aficionados are prone to getting upset if you suggest that Go is a special-purpose language, as if that is something shameful. One even hears claims that it’s a systems-programming language, but take it with a grain of salt. Saying a language can be used for “systems programming” is a figure of speech, a technologist’s literary trope meant to convey nothing more precise than “really hard-core.”

Utilities, yes, but Go would be a strange choice indeed for building an operating system. For one thing, it’s garbage-collected and your programs must be linked to a substantial run-time. For another thing, it doesn’t really have pointers in the classic sense. It has variables that point to objects in memory, but a variable is a pointer by courtesy at best if you can’t do arithmetic on it.

The Go runtime is not a separate interpreter like the JVM. Instead, Go compiles to a native binary that is linked directly to the runtime libraries.  In addition to garbage collection, the runtime also does scheduling and other tasks for Go’s concurrency mechanisms.

Garbage collection is an ancient idea in computing, but Go’s garbage collection is worth a mention. As far as I know, there is no other language that beats Go when it comes to minimizing stop-the-world pauses. STW pauses are a huge issue in garbage-collected languages. There are special real-time JVMs that provide guarantees relating to maximum STW time, but these guarantees are not much better than Go’s ordinary performance, and real-time JVMs typically have poor average speed, while Go is rather fast (whatever you take that vague term to mean).

Object Orientation

One of the biggest surprises for people starting with Go is Go’s approach to Object Orientation.  There isn’t any. Zip. Nada. You can almost feel a generation of simmering resentment against the often petty and ideologically motivated excesses of the OO movement. As a wise person once put it “Dude, not everything is an object.”

I sympathize. It seems like every good social program, not just in software but in the world at large, is quickly taken over by ideologues who lose sight of the original purpose and carry it too far. Java’s OO is to programming what political correctness is to being a civilized reasonable person. Java insists that code can only exist within the context of an object, forcing a proliferation of unneeded classes that exist only to give code a place to live.

Go throws out the entire idea of classes and along with it implementation inheritance and all but traces of interface inheritance.  It’s a return to what programming was like back in the old days with C and the immediate effect is like the feeling of a brisk wind in your face the first time you’re on a sailboat.

Not to strain the metaphor, but a few trips in, you may begin to realize why everyone doesn’t travel around in sailboats anymore. Clearly, the designers realized it too, because Go is littered with work-arounds to get back to some of what was good about OO before it fossilized into dogma.

For example, programmers naturally want to group procedures together around the data structures they work on. The usual mechanism for this is classes, but Go pointedly declines to provide them.  It’s very C-like in that respect—in Go, functions sit directly on the ground, not wrapped in a class.

What Go offers instead is function “receivers,” which do something similar but weaker. You can define a function as being received by an instance of a data structure, which the function can then operate on. The result is the Go idiom of declaring something like “type EmptyPlaceHolder struct {}” and with it a set of methods of the form “func (eph EmptyPlaceHolder) PseudoMethod(…)”. You can do it with non-empty structs too, of course. It’s a poor man’s instance methods, but the “methods” aren’t really bound to the receiver.
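As a rough illustration of receivers, here is a minimal sketch. The Counter type and its methods are made up for this example and do not come from any real package. It shows a pointer receiver, a value receiver, and a method expression, which is one sense in which a Go “method” is just a function whose first argument happens to be the receiver.

```go
package main

import "fmt"

// Counter is a hypothetical example type used only for illustration.
type Counter struct {
	n int
}

// Increment has a pointer receiver, so it can modify the Counter it is
// called on.
func (c *Counter) Increment() {
	c.n++
}

// Value has a value receiver; it operates on a copy of the Counter.
func (c Counter) Value() int {
	return c.n
}

func main() {
	c := Counter{}
	c.Increment() // Go takes the address of c automatically here.
	c.Increment()
	fmt.Println(c.Value())

	// A method expression yields an ordinary function whose first
	// parameter is the receiver.
	valueOf := Counter.Value
	fmt.Println(valueOf(c))
}
```

Both calls print the same count, because the method expression is simply another way of invoking the same function.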

The absence of classes takes a little getting used to, but for the kind of programs one writes with Go, it’s not too bad once you get on board.

That said, I’d be fascinated to see how one would go about using Go in an environment like investment bank X where I used to consult. Company X counts the number of business classes in use at any moment in six digits and often had a dozen or more versions of a single class in production at the same time, sometimes in hundreds of critical systems that run 24×365 handling many billions of dollars a day.

The Go answer is, yeah, well, obviously you shouldn’t be programming that way. But these situations don’t arise from bad programming management; it is all but impossible to coordinate a global cut-over to new versions of classes in that kind of environment. Unimaginable sums of money are in play and the pressure to make things run now is irresistible. Saying you shouldn’t program like that is like saying organizations shouldn’t be in businesses that handle trillions of dollars. I don’t know how you’d even begin to approach X’s situation without classes and inheritance.

What Go Is Really For

In principle you could build pretty much anything you’ve a mind to using Go (or any language) but there are two territories where Go shines:

Environments like heavy-duty Web-services or Ad-tech

These kinds of programs are characterized by blocks of relatively simple code that answer remotely called procedures or queries, often by making one or more remote queries of their own. This kind of application tends to consist of numerous relatively simple and independent pieces that run asynchronously with respect to the caller, and spend most of their time waiting for a REST call or SQL query to return.

Such applications are often highly concurrent, serving blizzards of queries simultaneously. This is one kind of problem that Node.js was designed for, but Go is purpose-built for the problem all the way from the syntax to the compiler and it eschews all the asynchronous callbacks used in JS.

In broad strokes, these kinds of applications often have an architecture that is much like a fistful of dry spaghetti out of the box: there is a big wad of code that holds it all together (your fist) but the meat of the application is many relatively independent units that run end-to-end with little connection among them.

“Systems” Software That Isn’t Operating Systems

Another thing Go is great for is complex high-performance software built by highly skilled teams of limited size. Check out this list of Go projects. None are trivial in scale, but they are almost the opposite of “enterprise” data projects. Even when large, they tend to be conceptually well-defined from the get-go. Such a project can rely on exquisitely developed programmer etiquette to discipline development in a way that a sprawling business environment comprising thousands of applications and staffed by legions of more junior programmers cannot.

This is something that applies to many of the best languages: Go is designed to make good programmers better, but one suspects that it probably makes programmers in percentiles one through 95  worse.

With that language bent in mind, let’s look at some of the things that are different about Go.

A Catalog of Unusual Stuff

Think of Go as what could happen if C had one too many drinks and spent the night with Python. There are a lot of little things about Go that are odd to old-time C guys.

  • There is no ternary operator.  It was always syntactic sugar, but it sure was handy.
  • The increment operator ‘++’ isn’t an expression in Go; you can’t treat it like a number.
  • You can’t declare a variable that is visible only within a file. The only restriction is privacy within the package. Moreover, this is done by a naming convention, not a keyword, a very un-C-like idea.
  • In almost all other C-descendants, semicolons are terminators. In Go they are separators.
  • Related to this, the programmer rarely needs to use a semicolon. That’s not actually true under the covers—semicolons are required by the compiler but most of the time Go will infer where they are needed and slyly slip them into what the parser sees.
  • The above is part of a bigger surprise. Go can pretend that semicolons are optional because it’s line-oriented, like Python. In Go, newline characters are a part of the grammar, blurring the line between syntax and pretty-printing. This has some odd effects: if you use plus-signs to concatenate a string out of several sub-strings and decide you want to put it on two lines, putting the ‘+’ at the beginning of the second line is different from putting it at the end of the first line. Likewise, it is a syntax error for ‘else’ to follow a newline because the newline terminates the statement, leaving the else hanging.
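A few of the quirks in the list above can be seen in one small sketch. The function names sign and greeting are invented for this example; the point is only where the line breaks are allowed to fall.

```go
package main

import "fmt"

// sign stands in for what would be a ternary expression in C.
// Go requires a full if/else statement instead.
func sign(n int) string {
	var s string
	if n < 0 {
		s = "negative"
	} else { // "else" must sit on the same line as the closing brace
		s = "non-negative"
	}
	return s
}

// greeting splits a concatenation across two lines. The "+" has to end
// the first line; a line that ends in an operand gets a semicolon
// inserted after it, which would end the statement early.
func greeting(name string) string {
	return "Hello, " +
		name + "!"
}

func main() {
	i := 0
	i++ // a statement, not an expression: "j := i++" does not compile
	fmt.Println(i, sign(-3), greeting("gopher"))
}
```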

Go designers made some very unusual choices in terms of how it supports programming:

  • Structs but no classes, as mentioned above.
  • Go’s notion of an interface is oddly lax. It’s often called “duck typing,” as in, if it looks like a duck and quacks like a duck and is often seen in the company of ducks, it’s a duck.
  • You can’t overload a function with different argument types. In most modern languages the effective name of a function is the entire function signature, including the name and the argument types and often the package and the class to which a method is attached. In Go you can’t declare both foo(v int) and foo(f float64) in the same package, which leads to packages with many functions that do the same thing but carry the types they operate on in their names.
  • The flip side of the foregoing is that while Go is strongly typed, you don’t usually have to spell the type out when you declare variables. In many contexts Go can figure out types for you and does not require that you declare them.
  • Go won’t let you compile with an unused variable or import statement. Why, one wonders? It’s another odd blurring of concerns. These would traditionally be issues for lint,  or the IDE, rather than the compiler. The compiler would simply compile it out.
  • There are no exceptions in Go, at least not in the try/catch sense; panic and recover exist, but they are not meant for routine error handling. There is a case to be made that exceptions are inherently sloppy, but there are a lot of good reasons why languages support them anyway. One of the biggest is that it’s difficult to check errors when you have only one return value. For the same reason, exceptions encourage functions that aren’t functions, i.e., that rely on side-effects. Yet another reason is that excessive error-checking obscures the readability of code. Exceptions are highly abuse-able so Go summarily banishes them.
  • To fill the gap left by the absence of exceptions, Go supports multiple return values (a great idea), one of which is often an error value that is non-nil if an error occurred. The downside of relying on return codes is that Go code tends to have a somewhat retro look, with every function call followed by “if err != nil” and a block of code to deal with the error. You also see lots of lines of the form “v, _ = foo()” where the ‘_’ character is a way of swallowing the error value without declaring a variable, since declaring a variable and not using it is a compile error.
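To make the last few items concrete, here is a small sketch. Quacker, Duck, Robot, and ParsePositive are all names invented for this example; only strconv.Atoi and errors.New come from the standard library. It shows an interface satisfied without any declaration, inferred variable types, a function that returns a value plus an error, and the underscore used to discard an error.

```go
package main

import (
	"errors"
	"fmt"
	"strconv"
)

// Quacker is satisfied by any type that has a Quack method.
// No "implements" declaration is needed.
type Quacker interface {
	Quack() string
}

type Duck struct{}

func (Duck) Quack() string { return "Quack" }

type Robot struct{}

func (Robot) Quack() string { return "Beep (quack)" }

// ParsePositive returns both a result and an error value.
func ParsePositive(s string) (int, error) {
	n, err := strconv.Atoi(s)
	if err != nil {
		return 0, err
	}
	if n <= 0 {
		return 0, errors.New("value must be positive")
	}
	return n, nil
}

func main() {
	for _, q := range []Quacker{Duck{}, Robot{}} {
		fmt.Println(q.Quack())
	}

	n, err := ParsePositive("42") // types of n and err are inferred
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(n)

	v, _ := ParsePositive("oops") // the error is deliberately discarded
	fmt.Println(v)
}
```

Discarding the error with the underscore compiles cleanly, which is exactly why the idiom is easy to abuse.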

Run-time Model

Java, Scala, and similar virtual-machine languages compile down to what is called “byte code” which is the assembler language of the VM interpreter. This model has some interesting advantages. For instance, it makes it easy to do various optimization tricks such as optimize byte-code on the fly based on the run-time specific circumstances. It also means that a compiled binary is universally portable.

C and C++ don’t work that way. These languages compile all the way down to machine-language that runs directly on the host platform.

Go is in the middle.  Go code compiles to native binaries but Go has a run-time that manages things like garbage collection and concurrent execution of code. Unlike the JVM, it is not an interpreter.

The difference between the Go and the C, C++ model is a little subtle.  Most programs use auxiliary functionality from libraries—nothing odd there—but in ordinary compiled languages, what they use in those libraries is directly or indirectly called by the user code.  Go programs do that too, of course, but the Go runtime also has a parallel life of its own, actively providing services like scheduling and garbage collection on its own terms without your program invoking them.

Concurrency

Go’s concurrency model is unusual and central to the design of the language. It deeply influences the programming style appropriate to the language.

Practically all modern languages provide for concurrent programming, i.e., allow a program to proceed down more than one path of execution at a time.  The big surprise in Go is that it does not provide this capability by means of the kernel threads so familiar to programmers raised on Java, C++, C#, and many other languages.  Instead, Go uses light-weight “goroutines,” or Go routines as I’ll write it here.

We can’t talk about what’s different and interesting about concurrency using Go routines without saying a few things about how concurrency is normally implemented.

Threads and Processes

Most modern languages use a model of concurrency based on kernel-level threads that are very much like operating system processes. In these languages, switching from thread to thread is handled by the operating system kernel in the same loop that handles processes. From the kernel scheduler’s point of view there isn’t much difference between the two.

Within the kernel-level threads model, there are a couple of major design approaches.

  • In some operating systems, threads are exactly identical with processes.

  • In other OS’s, threads are a minor variant of a process in that they share certain resources with the process that spawned them, notably the virtual memory page-table.

This latter is a limited exception to the isolation from all other processes that is a hallmark of virtual memory operating systems, but either way, all true threads are managed by the operating system in the same general way as full-fledged processes.

Go’s Secret Sauce

Threads are a powerful construct, but they’re heavyweight entities comparable to the program that spawns them.

What Go does differently isn’t entirely unique to Go, but it’s not a feature of Java, C++, C#, Scala, etc., so many programmers won’t realize that it’s possible. Go routines aren’t threads and are not managed by the kernel. Instead, they are a part of the program that spawns them much as a function call is.

What Is a Thread of Execution?

However you implement it, be it kernel threads or Go routines, a thread of execution is just a path through a program’s code together with a dynamically changing data structure that represents the program’s state at a given point.

If programs didn’t have function calls, the state of a program would consist only of the program counter (i.e., the current instruction being executed) plus a set of variables some of which might point to data out on the heap. In the earliest days, that’s what programs actually looked like.

Modern languages organize code into functions that call functions that call functions. In many languages there is nothing else—just one first function (e.g. main) that calls another, which might call another, and another, resulting in a dynamically growing and shrinking chain of called functions with the data for all calls that have not yet returned managed by a dynamic data structure called the “program stack.”

The thing to notice about that generic scheme is that the chain of functions all the way from main to the current point of execution is always one-dimensional. It’s not an arbitrary graph, or a DAG or anything fancy. It’s always a one-dimensional chain, i.e. single-threaded. A thread of execution is only in one place at a time: the tip of the stack.

What Go Routines Are

Go takes the position that a kernel-managed thread is a lot of hammer to hit a pretty small nail. You can represent everything essential to a thread of execution with just a stack.

Accordingly, a Go routine is just a mini-stack that starts with your Go routine instead of all the way down in main. Instead of spawning one thread for each Go routine, Go maintains a small set of kernel threads on which it runs these mini-stacks, scheduling and running them much as the kernel schedules and runs processes and threads.

When your code fires off a Go routine, Go allocates a small block of memory (8 KB in some earlier releases; 2 KB since Go 1.4) and sets up your routine’s stack frame at the lowest spot. Your frame will take up only a little of the block, and the rest just sits there as head-room for stack growth as your routine calls other routines, and they call routines, etc. Pushing and popping frames on this kind of stack is just moving a pointer, so you need contiguous space. If a Go routine exhausts the block because it is recursive or for some other reason has a deep call stack, Go will simply allocate a larger block and copy the existing stack into it.

What Go Gets For Its Trouble

Running a Go routine is an extremely light weight operation—not much more expensive than the overhead of a function call—which means that a Go program can support ungodly numbers of concurrent Go routines.
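To get a feel for how cheap this is, here is a minimal sketch that starts one Go routine per input value. The sumSquares function is made up for this example, and the count of 100,000 is an arbitrary choice, not a measured limit.

```go
package main

import (
	"fmt"
	"sync"
)

// sumSquares starts one goroutine per value in 0..n-1, has each one
// compute a square, and adds up the results once all have finished.
func sumSquares(n int) int {
	results := make([]int, n)
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			results[i] = i * i // each goroutine writes only its own slot
		}(i)
	}
	wg.Wait() // block until every goroutine has called Done

	total := 0
	for _, r := range results {
		total += r
	}
	return total
}

func main() {
	// 100,000 concurrent goroutines; the same number of kernel threads
	// would be a very different proposition.
	fmt.Println(sumSquares(100000))
}
```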

Among other benefits, this makes Go routines perfect for running a kind of code that is common in industrial scale Web services with huge numbers of requests arriving asynchronously. Typically the code serving the requests follows the general pattern of executing a few microseconds of logic then making a remote call for a database query, S3, or REST operation, followed by a few more microseconds to format the data and send it back to the caller.

It’s not always that simple, but it often is, and consider the time scales involved. A typical server CPU might execute 3000 instructions per µ-sec, so the logic to assemble the data for a remote call might take just one or a few µ-sec. Contrast this with, say, an AWS S3 lookup, which might easily take 100,000 µ-sec or more, often a lot more. In this case, 0.99999 of the elapsed time for the routine is spent with the routine idle.

Now, consider that a server doing context switches as fast as it can among processes that do nothing can probably do no more than about 20,000/second. That is about 50 µ-sec per switch, which is time for about 150,000 instructions. At that rate, unless you were doing at least 300,000 instructions worth of processing in each call, you’d be spending more cycles on context switching than on processing.
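The back-of-the-envelope numbers above can be checked with a few lines of arithmetic. The constants are the figures quoted in the text. The factor of two in breakEvenInstructions is my assumption about where the 300,000 comes from (one switch away from the routine and one switch back), not something measured.

```go
package main

import "fmt"

// Figures quoted in the text above.
const (
	instructionsPerMicrosecond = 3000
	logicMicroseconds          = 1
	remoteCallMicroseconds     = 100000
	contextSwitchesPerSecond   = 20000
)

// idleFraction is the share of elapsed time spent waiting on the remote call.
func idleFraction() float64 {
	return float64(remoteCallMicroseconds) /
		float64(remoteCallMicroseconds+logicMicroseconds)
}

// instructionsPerSwitch is how many instructions fit in the time taken
// by one context switch.
func instructionsPerSwitch() int {
	microsecondsPerSwitch := 1000000 / contextSwitchesPerSecond
	return microsecondsPerSwitch * instructionsPerMicrosecond
}

// breakEvenInstructions assumes two switches per blocking call: one away
// from the routine and one back. That assumption is mine.
func breakEvenInstructions() int {
	return 2 * instructionsPerSwitch()
}

func main() {
	fmt.Printf("idle fraction: %.5f\n", idleFraction())
	fmt.Println("instructions per switch:", instructionsPerSwitch())
	fmt.Println("break-even instructions per call:", breakEvenInstructions())
}
```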

That there are long periods where functions are blocked doing IO presents an opportunity to run other concurrent routines during the dead time but that fact doesn’t automatically make it happen. A language with concurrency built on kernel threads is at a disadvantage here. When the scheduling of routines is handled in-process the compiler can set things up to make it easy for the language to transparently swap out any routine that is blocked more nimbly and at lower cost than the kernel can.

Communication and Coordination with Go Routines

For non-Go programmers, the most familiar way for multi-threaded programs to communicate with the caller is through shared data structures protected by locks. Go has this too, but it favors a different model called “channels,” which are simple producer/consumer connections that lend themselves to access in a disciplined way. The Go mantra is “Do not communicate by sharing memory; instead, share memory by communicating.”
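A minimal producer/consumer sketch shows the shape of the channel model. The produce and consume functions are invented for this example; the channel operations themselves are plain Go.

```go
package main

import "fmt"

// produce starts a goroutine that sends 1..n on a channel and then
// closes it to signal that no more values are coming.
func produce(n int) <-chan int {
	ch := make(chan int)
	go func() {
		defer close(ch)
		for i := 1; i <= n; i++ {
			ch <- i // blocks until the consumer is ready to receive
		}
	}()
	return ch
}

// consume receives values until the channel is closed and returns their sum.
func consume(ch <-chan int) int {
	total := 0
	for v := range ch {
		total += v
	}
	return total
}

func main() {
	fmt.Println(consume(produce(100)))
}
```

Note that neither function touches a lock, and the only shared thing is the channel itself.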

Of course, under the covers a channel is implemented as a shared memory location protected by a lock on the code that accesses it, so you aren’t really avoiding synchronizing on a data structure, but what you do avoid is a host of mistakes that are easy to make when managing locks yourself.

I don’t know if there is a synchronization problem that can’t be addressed in terms of channels, but there are definitely plenty of places where using them would be cumbersome. Therefore, Go also has explicit locking that demarcates critical code sections with Lock() and Unlock() calls on a mutex (sync.Mutex in the standard library).
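For comparison, here is what explicit locking looks like in a minimal sketch. SafeCounter and countConcurrently are names made up for this example; sync.Mutex and sync.WaitGroup are from the standard library.

```go
package main

import (
	"fmt"
	"sync"
)

// SafeCounter guards an int with a mutex.
type SafeCounter struct {
	mu sync.Mutex
	n  int
}

// Add is a critical section demarcated by Lock and Unlock.
func (c *SafeCounter) Add(delta int) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.n += delta
}

// Value reads the count under the same lock.
func (c *SafeCounter) Value() int {
	c.mu.Lock()
	defer c.mu.Unlock()
	return c.n
}

// countConcurrently has several goroutines increment one shared counter.
func countConcurrently(workers, perWorker int) int {
	var c SafeCounter
	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for i := 0; i < perWorker; i++ {
				c.Add(1)
			}
		}()
	}
	wg.Wait()
	return c.Value()
}

func main() {
	fmt.Println(countConcurrently(50, 1000))
}
```

A shared counter like this is one of the cases where a lock reads more naturally than a channel.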

A Web search will immediately reveal furious disagreement about the wisdom of using locks, with many people opining that they should never be used. Why this arouses so much passion is hard to say, because in addition to common sense recommending them, the official Go documentation itself says they are often preferable. Moreover, channels in no way immunize you against a self-inflicted gunshot wound to the foot, and the workarounds to implement synchronization with channels can be baroque, but there you have it.

Non-reentrant Locking

The big surprise about the aforementioned locks is that they are not reentrant. Many programmers will have never encountered locking that wasn’t reentrant and won’t know what to make of Go’s behavior when they first bump into it.

I know that most people don’t know this because I used to regularly ask Java interviewees to explain some basic things about synchronization. For example, if thread X is in a synchronized block, what happens when Y tries to enter the block under this or that situation? People often answered correctly, but when I’d ask what happens when a function calls itself recursively from inside of its own synchronized block, most people were flummoxed. Does it work? Do you get deadlock? Does the program blow up?

The answer in Java is that it’s fine because Java locks are reentrant, which means that a thread of control that acquires a lock can acquire it again and again. Recursive calls are one obvious place where this happens, but it also happens, for instance, when your function gets a lock and then calls a related function that uses the same lock. When you have reentrant locks, you barely notice it, but in Go, a Go routine that tries to re-acquire a lock it already holds simply blocks, typically forever, because the only code that could release the lock is the code that is now waiting for it.
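The non-reentrant behavior can be shown without hanging the program by using TryLock, which sync.Mutex has had since Go 1.18. In this sketch, canReacquire and the Inventory type are invented for illustration. The second half shows one common way to live without reentrancy, assuming you control the code: do the locking in the public methods and put the real work in a helper that expects the lock to be held already.

```go
package main

import (
	"fmt"
	"sync"
)

// canReacquire reports whether a mutex that is already held can be
// acquired again. TryLock returns immediately instead of blocking.
func canReacquire() bool {
	var mu sync.Mutex
	mu.Lock()
	again := mu.TryLock() // a second mu.Lock() here would block forever
	mu.Unlock()
	return again
}

// Inventory is a hypothetical type used to show the workaround.
type Inventory struct {
	mu    sync.Mutex
	count int
}

// Add takes the lock and delegates to the helper.
func (inv *Inventory) Add(n int) {
	inv.mu.Lock()
	defer inv.mu.Unlock()
	inv.addLocked(n)
}

// AddTwice calls the helper twice under one lock. Calling Add from here
// would try to re-acquire inv.mu and deadlock.
func (inv *Inventory) AddTwice(n int) {
	inv.mu.Lock()
	defer inv.mu.Unlock()
	inv.addLocked(n)
	inv.addLocked(n)
}

// addLocked assumes the caller already holds inv.mu.
func (inv *Inventory) addLocked(n int) {
	inv.count += n
}

// Count reads the total under the lock.
func (inv *Inventory) Count() int {
	inv.mu.Lock()
	defer inv.mu.Unlock()
	return inv.count
}

func main() {
	fmt.Println(canReacquire())

	inv := &Inventory{}
	inv.Add(1)
	inv.AddTwice(2)
	fmt.Println(inv.Count())
}
```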

Making a Virtue of Necessity

The absence of reentrant locking is doubtless one of the reasons why Go often favors channels over explicit locking. Channels make a virtue of necessity because you can implement them easily with simple locking, and a channel-write leaves you no place to insert any code that might inadvertently try to acquire the lock again. So no reentrancy problems.

Of course, that raises the question of why Go does not support reentrant locking when most languages have for decades. You read a lot online about how channels are inherently better, explicit synchronization is evil, etc., but I’m not buying it as the underlying reason. Those are non-sequiturs in any event, and why would the designers outlaw something as useful as reentrant locks on moral grounds? No, the real reason is probably deeper.

For one thing, it is hard to implement complex locking behavior efficiently without kernel support, and Go-routines are a language level, not a kernel level construct. This undoubtedly complicates things. But another thing about reentrant locks is that they are more useful in proportion to the number of calls between the first and the last lock.  If the distance is zero, there’s no difference.  In a language that is designed from the bottom up to support insane levels of concurrency you want locks held for the bare minimum time, which argues for both channels and for non-reentrant locks.

Garbage Collection

One of the most common complaints about Java is garbage collection pauses. Java and other JVM languages “stop the world” for significant periods. Under light load, Java stops the world for milliseconds, but under heavy load Java can go comatose for up to many seconds at a time.  Countless systems have been built in C++ solely because of the JVM’s erratic GC behavior.

Go’s STW behavior is fast and tight, taking at most a fraction of a millisecond with little variance.
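If you want to see what pauses look like on your own machine rather than take my word for it, the runtime exposes them. The sketch below uses runtime.GC and runtime.ReadMemStats from the standard library; the lastGCPause function, the sink variable, and the allocation sizes are arbitrary choices for this example. A forced collection in a toy program says little about behavior under real load.

```go
package main

import (
	"fmt"
	"runtime"
	"time"
)

// sink keeps the compiler from optimizing the allocations away.
var sink []byte

// lastGCPause allocates some garbage, forces a collection, and returns
// the most recent stop-the-world pause along with the number of
// collections completed so far.
func lastGCPause() (time.Duration, uint32) {
	for i := 0; i < 1000; i++ {
		sink = make([]byte, 64*1024) // each allocation orphans the previous one
	}
	runtime.GC()

	var stats runtime.MemStats
	runtime.ReadMemStats(&stats)
	// PauseNs is a circular buffer; this index holds the latest pause.
	pause := stats.PauseNs[(stats.NumGC+255)%256]
	return time.Duration(pause), stats.NumGC
}

func main() {
	pause, cycles := lastGCPause()
	fmt.Println("collections so far:", cycles)
	fmt.Println("most recent pause:", pause)
}
```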

The Bottom Line

Go is an awesomely capable language for the right applications, and it’s fun, like Python except fast. It’s simple and clear—a small, well organized toolbox that generates fast code with almost undetectably fast garbage collection pauses.

Go is famously opinionated but it’s more than that: it’s outright grumpy. Yeah, I get it: Java takes all the fun out of programming and C++ syntax is insane, but categorically banning every language feature associated with object orientation is excessive. Not supporting classes and inheritance could arguably just be a design choice, but gratuitously banning polymorphism begins to seem spiteful.

It’s more than the language design. I don’t think I’m imagining that there is a unique vehemence in the partisanship that surrounds Go. Language cultures have flavors. Arguments about what’s “Pythonic” tend to have the patronizing quality you’d hear in an academic argument about cultural appropriation or the correct use of pronouns.  C++ arguments tend to be legalistic and emphatic, but without the moral overtones of Python; a C++ programmer may conclude that you’re an idiot but they won’t necessarily think you’re a bad person. Java arguments have the mind-numbing blandness of testimony at a zoning commission hearing. Lisp people don’t properly argue at all, but speak in koans that invite no reply. The flavor of Go arguments is complex, with the moral overtones of Python, for sure, but instead of the C++ fury, Go arguments tend towards an eye-rolling cold exasperation. Go culture is like an elderly middle-school teacher who has broken one too many red pencil points and is relieved to have reached retirement without having strangled any of the little monsters. She will be moving to a seniors-only walled community.

Still it’s a great language for good programmers.  If you’re building Google’s or Amazon’s infrastructure, it might even be indispensable.

What Go is definitely not is a language for enterprise business programming. If your needs are like Google’s, Go for it. If you’ve got a crack team building high-performance software, Go might be just the ticket. But Go will never be an “enterprise” language. If you foresee a large code base with gazillions of business objects evolving over years, think long and hard before committing to Go.
