ICT: Diary


Based on notaweblog.php by joshua stein

Wednesday, 31 July 2013

Compilers in OpenBSD

This is a tale about compilers by Miod sent to misc@:

A recent discussion (``Default software in the base'') suggests using Clang/LLVM as the system compiler in OpenBSD in the short-term future. This discussion hasn't really gone anywhere, yet I thought I could waste bandwidth with my thoughts as the current de-facto compiler maintainer in OpenBSD.

Mind you, I did not ask to end up maintaining the system compiler in OpenBSD. I have earned this position because I have had to fix or work around too many bugs in gcc, as a port maintainer. And I wish I hadn't needed to do this.

A long time ago, in the first few years of the *BSD projects, the only free software compiler spanning the various platforms BSD systems were targeting was gcc. pcc was orphaned, TenDRA still used a cumbersome build system and did not support enough platforms, and that was about everything in the free software land.

Also, gcc 2.5 (at the time) had a few bugs, but not many. You could trust it to produce working code at any optimization level, and forget about it. In other words: there was no need to put any effort into maintaining the compiler, because it was (almost) bug-free.

This state of mind was valid up to the 2.7 days. The accepted wisdom was that -O2 was supposed to be followed by -fno-strength-reduce, because 2.7 had bugs in the strength reduction code, which mostly affected i386 code. And then you could trust the compiler.

And then C++98 came out, as well as C99, and it was time for serious work in gcc, if only to attempt to support the new features of these standards. One may remember the schism between gcc 2.8, conservative but trying to catch up on C++98, and the ``Pentium gcc'' group, attempting to produce faster code by stretching the optimizer code beyond its limits. These projects eventually merged as gcc 2.95.

From then on, a few things changed forever:
- many more people were working on specific optimizations
- these optimizations, unlike the 2.5/2.7 optimizer, were no longer ``almost platform-independent'', but would benefit from particularities of the target platform, leading to more code attempting to decide whether a given optimization recipe was worth applying or not.

As an unavoidable consequence of this, something very important in the world order changed: gcc had bugs, and you were expected to accept that and cope with them.

When I write `gcc', you can read `the compiler'. As Arthur C. Clarke would have said, ``any sufficiently optimizing compiler is indistinguishable from magic.''

So what does this tale teach us?

First, compilers are fragile. While one would like to expect a minimum level of correctness and trustworthiness from a modern compiler, we can't, regardless of the compiler we use.

Second, compilers are a moving target. Architectures without enough testers and developers start misbehaving (because they are the only ones to subtly break assumptions of the newly added optimization passes, yet 95% of the time end up producing working code, after all), and eventually get dropped. The prime example of this had been m88k, which got broken in gcc 2.95 because a target-specific macro suddenly needed its arguments to be brace-protected, and no one had fixed the m88k backend because no one had tested/cared.

This is the reason why OpenBSD ships with different compilers, depending upon the platform you are running OpenBSD on: a given release of gcc might not be suitable on a given, less popular, platform (which is not surprising for gcc since, due to benchmark^Wcompetition with other compilers, from gcc 3 onwards, the gcc developers have been eager to release ``bug free'' new versions by enforcing a policy that only ``regressions'' would get fixed, and spending more time changing their definition of ``regression'' or trying to explain why regressions weren't, so as not to need to fix them in any stable release).

And it is very unfortunate that gcc 2.95 does not completely implement C99, for we would have happily kept it for the older platforms, those which are not supported, or fubar (does it make any difference?) with later versions.

Switching from gcc to clang is worth considering, and truth is that some developers have been tinkering with that idea. This is something that may (and probably will) happen on some platforms (since llvm does not support as many platforms as OpenBSD does); but switching a subset of OpenBSD's supported platforms is not a trivial task, and a lot of work needs to happen first (such as replacing libgcc with compiler-rt, and porting it to the missing platforms).

And if/when such a switch happens, bugs will trigger and problems will need fixing; and we cannot risk being naive enough to expect llvm developers to handle bug reports and bugfix releases any better than the gcc developers do (although we hope they will).

Assuming the upstream developers fail to deliver, it's up to us to fix or work around compiler problems as we encounter them; sometimes it's as easy as finding out which patch has been committed upstream, but not backported to the version we use; and sometimes it's a genuine issue which may or may not have been reported in the latest compiler version, and we are on our own.

When this happens, we can only rely upon our developer skills and intimacy with the compiler. A few of our developers have, over the years, become unafraid of gcc, and able to investigate issues, backport fixes, and fix or work around bugs: I'll only mention niklas@, espie@, etoh@ and otto@, and hope the few others will forgive me for not listing their names. This has not been an easy road, to say the least.

Now, another few of our developers are working on building a similar knowledge of llvm. I wish them a lot of luck, and I will try to join them in the near future. In the meantime I am not sure they feel confident enough to support switching the most popular OpenBSD platforms from gcc to llvm. In a few months or years from now, things will be different...

...but there is something I wish would happen first.

An LTS release of an open source compiler.

Because all compilers nowadays are full of subtle bugs, so many of them that you can't avoid them as soon as you compile any nontrivial piece of code, and because we can't afford to go back to assembly, we need a compiler we can trust.

GCC, as well as LLVM, have Fortune 500 companies backing them, paying smart developers to work fulltime on these projects. Yet none of them dares to provide a long-term support version. Bugs in version N are fixed in version N+1, but new bugs are introduced. And no one cares about trying to settle things down and produce a compiler one can trust (because version N+1 runs 3.14% faster in the loonystones benchmark which doesn't match any real life use case).

Who cares? Tomorrow's compiler will generate code which will complete an infinite loop in less than 5 seconds; stay tuned for more accomplishments!

The free software world needs an LTS compiler. The last de-facto LTS compiler we have had was gcc 2.7.2.1, and it is too old to compile modern C and C++ code.

Should a free software LTS compiler appear (be it a gcc fork, or an llvm fork, or something else), then OpenBSD would consider using it, very seriously. And we probably wouldn't be the only free software project doing so.

Miod (a.k.a. ``Don Quixote de La Compiladora'')


$Id: dates.htm,v 1.1342 2020/03/10 11:14:03 fred Exp $

$Id: diary,v 1.27 2017/09/01 17:12:44 fred Exp $