Autotools Considered Harmful

Programmers quickly learn the importance of good tools. The right programming environment, libraries, editors, and so forth can mean the difference between a two-day project and a two-month project.

For the most part, Linux has the best developer tools around. However, in one important respect, Linux suffers, and that is in its usage of autotools.

In this essay, I'm going to argue that autotools is not a blessing, but a curse. It is difficult for developers to understand, not portable, slow to execute, piggish with disk space, and generally a square wheel. Finally, I'll argue that autotools could and should be replaced with CMake with relatively little effort.

About Autotools

Autotools, also known as the GNU build system, is a suite of programming tools designed to build software. It is similar in spirit to the Make program. Autotools was designed to be portable to many different operating systems.

So what does autotools provide that the humble make program does not?

The Overall Design

autotools was created over a period of a few years during the early 1990s. The first part that was created was a piece called autoconf. Basically, autoconf takes a few files as input and generates a script called the configure script.

The configure script is a large (about 700 kilobytes) shell script. When run, it scans the system, checking for required tools and libraries. It then generates another shell script named config.status. This script takes Makefile.in files and generates Makefiles suitable for use with the standard UNIX make program.
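To make that last step concrete, here's a minimal sketch of the kind of substitution that turns a Makefile.in into a Makefile. It is not the real generated script, which is far larger; the substitute function and the sample values (gcc, /usr/local) are made up for illustration. The @CC@ and @prefix@ placeholders are the style of marker that autoconf templates use.

```shell
#!/bin/sh
# Simplified sketch: fill in @NAME@ placeholders in a template file.
# The function name and the values passed to it are hypothetical.

# substitute TEMPLATE CC_VALUE PREFIX_VALUE
substitute() {
    sed -e "s|@CC@|$2|g" -e "s|@prefix@|$3|g" "$1"
}

workdir=$(mktemp -d)

# A tiny stand-in for a Makefile.in template.
cat > "$workdir/Makefile.in" <<'EOF'
CC = @CC@
prefix = @prefix@
EOF

# Pretend the system scan settled on these values.
substitute "$workdir/Makefile.in" gcc /usr/local > "$workdir/Makefile"
cat "$workdir/Makefile"
```

The real thing handles hundreds of variables, conditionals, and header generation, which is part of why the generated scripts grow so large.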

autoconf is based around the m4 programming language. This programming language has an extremely low level of abstraction. In fact, it was originally developed to make it easier to write assembly language programs. It's very similar to the C preprocessor-- you know, those #define, #include, and #ifdef directives you use when you're writing C. It's not exactly the kind of thing that lends itself to clear and readable code, as many people have observed over the years.

Around 1994, automake was developed to address this lack of abstraction in autoconf. The basic idea is that automake takes Makefile.am files as input, and generates the Makefile.in files that the configure script later turns into Makefiles. automake is written in Perl.

In 1996, development started on libtool, which is a separate project from autoconf and automake. libtool is designed to deal with shared libraries, and it tends to be used in conjunction with automake.

Complexity Overload

This brings me to my first criticism of autotools: it is too complex. To understand and debug a project that uses autotools, you must understand Bourne shell scripting, the m4 programming language, the pseudo-language which Makefile.am files are written in, the pkg-config tool, libtool, and the large library of built-in m4 macros which autoconf relies on. This is all in addition to understanding the C compiler and linker-- you know, the things that are actually doing the useful work here. In a world where programmer time is money, that's a lot of overhead. In contrast, CMake requires you to learn exactly one syntax-- the syntax of CMakeLists.txt files.

Just for fun, let's count the number of executables here. I count autoconf, automake, autoheader, aclocal, autoscan, libtool, autom4te, autoreconf, libtoolize, m4, and autoupdate. My /bin directory is filled to the brim with incomprehensible garbage. In contrast, cmake installs one executable, the eponymous cmake binary.

The documentation of autotools talks a lot about hiding the details of the implementation from the programmer. But in reality, you're going to need to know those details in order to work with autotools. For example, in HDFS-3396, we discovered that the linker was being run with incorrect arguments, causing build bugs on some (but not all) platforms. The error message we got was from the linker, not from automake, which we were using at the time.

One of the great advantages of autotools is supposed to be the fact that "everyone is using it." Sure, the learning curve may be steep, but once you understand it, you should be able to hop into any new project and immediately understand what is going on-- right? Unfortunately, no.

If you hop right into a project that uses automake and start editing the Makefile.in files, you'll find that you've been editing a generated file. In contrast, if the project is using only autoconf but not automake, those are exactly the files you need to be editing. Some projects are using libtool, and some aren't. Just to muddy the waters even more, libtool likes to create files named like shared libraries which are actually shell scripts. Some project leaders prefer to check in various generated files. Others, on the other hand, prefer not to do this. It's a mess, and no matter how well you think you know autotools, you'll find that each project does things differently.

Bad Error Messages

Autotools is mostly written at a very low level of abstraction-- just barely above assembly language. Remember that the m4 programming language doesn't have concepts like datatypes, structures, for loops, lists, local references, closures, namespaces, modules, or functions. In a very real sense, C is a higher level programming language than m4. The consequence of all of this is that the error messages generated by autotools are very, very poor. Here are a few examples:

Portability

As I discussed earlier, autotools relies heavily on m4 scripts and shell scripts. Getting the right version of m4 is usually not a big problem-- after all, nobody is dumb enough to actually use m4 besides Linux users running automake. So there's almost no chance that you'll end up running under an m4 version that doesn't match your autotools version. However, the different shells that exist in the wild create a serious portability problem.

With autotools, you are responsible for writing portable shell scripts that can run on ash, bash, ksh, pdksh, zsh, and whatever shells may have shipped on AIX, ancient versions of Solaris, Mac OS X, and so forth. But don't worry, the autotools manual will provide you with a "Zoology of shells" so that you can get started on your never-ending task of supporting all past and future shells.
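To give a flavor of the care this takes, here's a small sketch of two well-known pitfalls and their portable spellings. The function names are my own, chosen just for this illustration, and this is nowhere near a complete list.

```shell
#!/bin/sh
# Two small habits that sidestep well-known shell differences.
# Function names are hypothetical, for illustration only.

# Print text without a trailing newline. "echo -n" is handled
# differently across shells, while printf behaves consistently.
print_no_newline() {
    printf '%s' "$1"
}

# Compare strings with a single "=". The "==" spelling is an
# extension that some shells reject inside [ ].
same_string() {
    [ "$1" = "$2" ]
}

print_no_newline "checking for cc... "
if same_string "gcc" "gcc"; then
    echo "yes"
fi
```

Multiply this by every construct in every script you write, and you get a sense of the maintenance burden.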

Here's a handy tip: sed '10q' is a great alternative to head on systems that don't have the latter command installed. Also, aspirin is cheaper when you buy it in bulk!
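In case you want to see the tip in action, here's a quick sketch. The first_lines helper is a name I made up; the only real trick is sed's q command, which quits after printing the given line number.

```shell
#!/bin/sh
# first_lines N: print the first N lines of standard input using only sed.
# The helper name is hypothetical; "Nq" tells sed to quit after line N.
first_lines() {
    sed "${1}q"
}

# Print the first 3 of 5 lines.
printf '%s\n' one two three four five | first_lines 3
```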

autotools essentially does not support Windows. The only way to get autotools working on Windows is to install cygwin, or a similar UNIX emulation layer. This kind of dependency is unacceptable.

At the end of the day, autotools makes your code less portable, not more. If you use CMake, you never have to worry about shell incompatibilities. CMake works on every major platform, including Windows.

Moreover, as we discussed earlier, autotools often exposes details of the underlying platform. It's a real challenge to use autotools without introducing Linux dependencies in your code.

In contrast, CMake doesn't suffer from any of these problems. CMake does not use shell scripts, so the whole shell portability nightmare fades away. CMakeLists.txt files have the same format on every platform. CMake supports Windows natively. And because CMakeLists.txt files use a real programming language rather than an ancient macro processor, the programmer is seldom exposed to platform details except when it's necessary.

Terrible Performance

Autotools has terrible performance, both in terms of execution time and disk space consumption.

Just to put some concrete numbers behind this, I have created a simple project to test autotools. There are about 34 total lines of C code in this project, but once I initialize autotools, the directory contains 4.1 megabytes. Running configure on this empty project takes 6.5 seconds.

From personal experience, I always have to wait several seconds after typing make foo before even seeing the compiler run. It's a waste of both CPU power and my time as a developer.

By using automake in a non-recursive way, you can achieve vaguely acceptable performance. But that means that you have to combine all of your build system directives into one giant file, giving up modularity and clean design. As programmers, would we accept having to put all of our source into one giant file, because the compiler was too dumb to efficiently handle multiple files? No. And we shouldn't expect the same shabby treatment from our build systems.

The build-time performance of libtool is just inexcusable. Josh Triplett estimates that "for many... libraries and other packages, more than half of the build time goes into running the libtool shell script." Libtool, as he goes on to explain, is "an 8500 line, 250 kilobyte shell script, which runs every single time you compile a source file."

Not content with causing poor build time performance, libtool also causes unnecessarily long startup times, by replacing your binary with a shell script that loads the "real" binary and libraries.
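If you haven't run into one of these wrappers, here's a heavily simplified sketch of the idea. Everything in it is an assumption made for the demo: the program name myprog, the temporary directory, and the use of LD_LIBRARY_PATH (the variable that matters varies by platform). A real libtool wrapper is much longer, but the shape is similar: set up the library search path, then exec the real program tucked away in a .libs directory.

```shell
#!/bin/sh
# Simplified, hypothetical illustration of a libtool-style wrapper.
# "myprog" and the directory layout are made up for this demo.
demo=$(mktemp -d)
mkdir "$demo/.libs"

# Stand-in for the real compiled binary kept in .libs/.
cat > "$demo/.libs/myprog" <<'EOF'
#!/bin/sh
echo "real binary ran with: $*"
EOF
chmod +x "$demo/.libs/myprog"

# The wrapper that sits where you expect the binary to be.
cat > "$demo/myprog" <<'EOF'
#!/bin/sh
here=$(dirname "$0")
# Point the dynamic loader at the uninstalled libraries, then hand off.
LD_LIBRARY_PATH="$here/.libs${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
export LD_LIBRARY_PATH
exec "$here/.libs/myprog" "$@"
EOF
chmod +x "$demo/myprog"

# Running the "binary" actually starts a shell first.
"$demo/myprog" --version
```

Every launch pays for an extra shell startup before the program you actually wanted gets to run.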

Lack of Mindshare

Most successful open source tools get used by many different organizations. The Linux kernel has made its way into everything from home routers, to mobile phones, to supercomputers. Companies, universities, and other organizations recognized the value of the technology and put it to good use. The git source control system was picked up by a lot of companies, especially web developers.

Autotools, on the other hand, has barely made a ripple in the corporate world. Neither Google, nor Facebook, nor Amazon, nor any other major company has publicly committed to using autotools internally. Google even went so far as to roll their own build system for Android. No doubt, some people will argue that they did this to avoid the GPL. However, the build system that they ended up coming up with depends on Makefile extensions that exist only in GNU Make, a GPLed program.

Most developers on open source projects have a casual familiarity with autotools. But the number of people who would claim to be good with autotools is vanishingly small.

When speaking about the KDE experience with autotools, Alexander Neundorf wrote:
In KDE3 development, we ended up with an autotools-based build system that most of our developers did not fully understand. I estimate that no more than 10 people were able to fix any major problem occurring with it, first and foremost our "buildsystem-guru" Stephan Kulow. Over the years, these guys must have been contacted by hundreds of co-developers with requests for help to fix build problems in specific applications and modules.

Conclusion

One comment I hear frequently about autotools is that they may be bad, but they are the only viable option. But the reality is that CMake is faster, easier to understand, more portable, and altogether more usable than autotools.

Most Linux developers don't accept the argument that they should use Windows because "it's the standard" on the desktop. So why should we use a build system that we know is bad? And who exactly decided that autotools was "the standard"? It certainly wasn't Linus Torvalds, who designed his own build system for the Linux kernel. It wasn't the KDE project, which migrated from autotools to CMake. It wasn't MySQL, which also is in the process of migrating from autotools to CMake. It's not any of the big software companies-- they don't use autotools.

Autotools is the standard only in the minds of those who believe it is the standard. It's sad to see developers wasting their time learning an obscure macro language, or debugging problems related to libtool, when they could be improving open source software. It's sad to see projects that claim to be portable, but in fact are not, because autoconf required the developers to write code that was portable to 10 different shells, and they only managed 9. It's sad to see potential contributors get discouraged by the complexity of dealing with a Makefile.am file.

autotools had its moment in the sun, and now it's time for it to ride off into the sunset, along with MC Hammer, Kriss Kross, and the first dotcom bubble. The open source community deserves something better than an incomprehensible, unportable, buggy pile of archaic scripts. Give CMake a chance-- you won't be disappointed.