A first computer is like a first love — the wrong one will really screw you up, especially if you’re convinced it was the right one.
I’m an old-school programmer. I’ve been doing this for 30 years. I worked for over a decade in C and C++, often on operating systems and device drivers.
If you ask, “how do I get better at programming,” I’m supposed to tell you:
- you should learn C (or C++ or assembly) — a language that works like the computer hardware does
- you should write code that directly messes with the hardware, preferably an operating system
- you need to learn the highest-performance languages because otherwise what will you do if your code is too slow?
All of this is terrible advice.
And they’re all variations on a different, wrong idea: that you should learn about computers and software past all the abstractions, “all the way to the bottom.”
That’s also terrible advice. Let me tell you why.
Why You Shouldn’t Learn C Early
First of all, C is a language that works like old computers used to, not a language that works like modern computers do. You can get the long version in the (excellent) ACM article “C is Not a Low-Level Language”, but processors no longer look like C. The C language is older than threads. It’s so old that it had no recognition of concurrency at all until the C11 standard added threads and atomics as a late retrofit. In practice, it’s still hostile to most concurrency and optimises it very poorly. That’s a problem. Around 2010 we suddenly stopped being able to ignore multiple processors… But C was close to forty years old at that point, and wasn’t going to make a drastic change at that age.
It can be hard keeping a whole stack of pointers in the air in C…
There’s another reason they tell you to learn C: so that you can use a language which, being similar to your processor, is easy to optimise to be extremely fast. That stops being true as soon as you look at processor features like concurrency and vectorisation, or SIMD (doing the same operation on big lists of numbers all at once). Those two things are, by the way, the essence of optimising for a modern processor. C also makes it hard to deal with cache line alignment, and because any two pointers might alias the same memory, the compiler often can’t assume a thing it wrote to memory is still there… Basically, C was an excellent education in how the PDP-11 computer worked in 1972, but a lot has changed since then.
(C culture, by the way, is highly concerned with optimisation. Talking to C programmers can be a great way to learn optimisation. And they are mostly working around a language that no longer supports them well in this activity. They’re frequently stuck with assembly language, external libraries, and carefully tickling one specific compiler in the right way to emit the right code. It is the language that is teaching the wrong lessons — the developers are mostly conscientious and excellent, and it’s a shame they have no better tool for modern processors.)
I love C. It was the first language I really fell in love with, and I look for chances to use it when it’s appropriate. Those chances are now rare. How often are we writing device drivers or operating systems these days? Not very often. More on that in a moment.
C will teach you a number of potentially-interesting things about low-level programming, even if some of them are now misleading. And if you don’t have a specific use for them, you can skip them.
You Shouldn’t Write Hardware-Level Code, Especially an Operating System
I cut my teeth on Apple IIe code. Back then there was a lot more (what we’d now call) systems-level code. I didn’t write an OS for the Apple, but I wrote or worked on OSes for DECstations, Intel PCs, Motorola Dragonball and embedded ARM processors — the last two for money, the first two for education. I technically also wrote some Linux-based OS code on a few other processors while writing device drivers, but who’s counting?
So if I wrote that much OS code, why don’t I tell you to?
I tell you to pick something different because I wrote that much OS code.
If you specifically want to know OS coding for some reason, sure, go do it. Have fun with it as an eccentric side project.
But most people tell you to write an OS so that you “understand how your computer really works.” Balderdash, friends. Balderdash.
If you must spend your time studying, study the most valuable thing you can. You can never get your time back.
This is like recommending that you learn C. It will teach you how a computer used to work. You will pick up a few genuinely useful ideas, like virtual memory. Indeed, they are so useful that I recommend skimming the Wikipedia article and perhaps even reading a bit about them in a book of your choice. I do not recommend the multi-month undertaking of writing an OS to learn about virtual memory. It is a waste of your time, and it will only teach you how old, primitive memory systems used to work: you will not be implementing modern virtual memory, so you won’t really understand it. I bring up virtual memory because nearly everything else you will learn from writing an OS is a less valuable use of your time.
“But you’ll learn how a scheduler works.” Nope. You’ll learn how an old-style really simplified scheduler used to work. You will only learn about serious scheduler problems if you implement a variety of schedulers under a variety of workloads, using a variety of applications you will have to custom-develop for your operating system that nobody has ever used, or will ever use. It is not a good use of your time.
I don’t mean learning about schedulers is a bad use of your time — that can be quite valuable. But you know what will teach you much faster than writing an OS? Writing (just) a scheduler in a modern fast-to-write language and watching scheduler failure patterns (like starvation) operate using more interesting data, where you don’t have to debug seg faults constantly.
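To make that concrete, here is a minimal sketch of a toy scheduler in Python. Everything in it is an assumption made up for illustration: the task names, the priorities, the one-task-per-tick arrival pattern and the `aging` knob. It is not how any real OS scheduler works. It just shows how quickly you can watch starvation happen, and then watch one classic fix (aging) make it go away.

```python
from dataclasses import dataclass


@dataclass(eq=False)  # compare tasks by identity, not by field values
class Task:
    name: str
    priority: int      # lower number runs first
    remaining: int     # ticks of work left
    waited: int = 0    # ticks spent ready but not running


def simulate(ticks, aging=0):
    """Run a toy priority scheduler; return how many ticks each kind of task ran.

    With aging=0 this is strict priority scheduling. With aging > 0, a task's
    effective priority improves by `aging` for every tick it has waited.
    """
    ready = [Task("batch", priority=5, remaining=3)]
    ran = {"batch": 0, "interactive": 0}
    for _ in range(ticks):
        # Hypothetical workload: one short, urgent task arrives every tick.
        ready.append(Task("interactive", priority=0, remaining=1))
        # Pick the task with the best (lowest) effective priority.
        task = min(ready, key=lambda t: t.priority - aging * t.waited)
        task.remaining -= 1
        ran[task.name] += 1
        if task.remaining == 0:
            ready.remove(task)
        # Everything that did not run this tick has waited one more tick.
        for other in ready:
            if other is not task:
                other.waited += 1
    return ran


print(simulate(20))            # {'batch': 0, 'interactive': 20}
print(simulate(20, aging=1))   # {'batch': 3, 'interactive': 17}
```

Under strict priority the batch job never gets a single tick, no matter how long you run it. With aging it finishes its three ticks of work. Changing the workload and seeing what breaks takes seconds, not a reboot.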
In other words, writing your own operating system is no longer even the best way to learn about your operating system, let alone about the rest of your computer.
I’m pretty obviously a fan of understanding a system by rebuilding that system. I wrote a book about it. Writing a real, honest-to-Dijkstra OS is not the way to learn quickly. My book is written in a nice high-level language (Ruby) where you can make mistakes rapidly and therefore learn rapidly. C is not that sort of language. I’d listen to arguments that Rust might be, but I have doubts.
“But how will you ever learn about things an OS would teach you? How will you even know about schedulers and virtual memory?” The folks telling you to write an OS are convinced these lessons apply elsewhere, and that OS-like situations will just happen organically. In cases where they’re right, you’ll learn from those organic situations. Where they were wrong, you’ll skip learning something archaic and useless. There are many better uses of your time.
Don’t Learn High-Performance Languages Without a Good Reason
The conventional wisdom is that you should know a very high-performance language, and learn it early, so that you have a fallback if you need your code to be faster.
The same people telling you that will then tell you that you need to use it for nearly everything “in case your program turns out to be too slow later.”
There are many, many things wrong with this argument.
This is not the most efficient way to listen to a radio program. You only do it if you care about the project.
Let’s start with my favourite: a CPU-efficient language being “faster” starts by assuming that your problem is CPU time. But your actual problem is often memory, or network round-trips, or something else that has nothing to do with your CPU. I frequently spend my time on web programming, where network round trips trump CPU time every day of the week. A Ruby program (though not necessarily Rails) is normally much more memory-efficient than Java for the same purpose, and frequently more memory-efficient than the C you’d write in practice. In general, an interpreted language can be surprisingly memory-efficient even while burning more CPU cycles. Which one is more valuable depends on your constraints.
So having to write all your code in C or Java to save CPU is, as a rule, extremely foolish.
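A rough back-of-the-envelope sketch of why, in Python. The numbers are invented for illustration (a web request that spends 5% of its time on CPU and 95% waiting on the network), and the formula is the standard Amdahl’s-law estimate, not a measurement of any real system.

```python
def overall_speedup(fraction, factor):
    """Amdahl's-law estimate of total speedup when one component, which
    accounts for `fraction` of the total time, becomes `factor` times faster."""
    return 1.0 / ((1.0 - fraction) + fraction / factor)


# Hypothetical request: 5% CPU time, 95% network waiting.
print(f"{overall_speedup(0.05, 10):.3f}")   # CPU part 10x faster   -> 1.047
print(f"{overall_speedup(0.05, 100):.3f}")  # CPU part 100x faster  -> 1.052
print(f"{overall_speedup(0.95, 2):.3f}")    # network wait halved   -> 1.905
```

Under those assumed proportions, even a hundredfold CPU improvement buys about 5%, while halving the network wait nearly doubles your speed.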
But the problem goes deeper than that.
Higher-level languages (I’m thinking of something like Python or Ruby here) will generally have better tools for finding the slow spots. Profiling tools tend to be more effective in higher-level languages, just in general. That means CPU-efficient low-level languages are often hard to “see into” and wind up being penny-wise and pound-foolish. C has less CPU overhead for a given algorithm, but it’s harder to write in, and so you tend to wind up with worse algorithms. In C especially, the right algorithm is often ugly and laborious to write, so there’s a strong tendency to use something that can be written quickly instead, which can eliminate the whole benefit of C pretty rapidly. The argument against Java is the same, though less extreme, because Java is a higher-level language than C: it’s less CPU-efficient and easier to write in, and that’s the right trade in most cases. It’s just that you can do even better than Java, and usually you should.
The above argument also assumes that your code will wind up needing to be “too fast for language X.” That is far rarer than it sounds. It also assumes you can’t convert just the most important sections to another language via foreign function interface or a similar process, which you nearly always can.
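As a small sketch of what a foreign function interface looks like, here is Python calling a C function through the standard `ctypes` module. The C standard library’s `strlen` stands in for a hot function you would write yourself, and the `c_strlen` wrapper is a hypothetical name. This assumes a Unix-like system (Linux or macOS) where the C library can be loaded this way.

```python
import ctypes
import ctypes.util

# Load the C standard library. Assumes a Unix-like system: if find_library
# returns None, CDLL(None) falls back to symbols already loaded in the process.
libc = ctypes.CDLL(ctypes.util.find_library("c"))

# Declare the C signature: size_t strlen(const char *s);
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t


def c_strlen(text):
    """Length in bytes of `text` (UTF-8 encoded), computed by C's strlen."""
    return libc.strlen(text.encode("utf-8"))


print(c_strlen("hello"))  # 5
```

The rest of the program stays in the high-level language. Only the part you have measured to be slow needs to cross the boundary.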
At heart, using a lower-level language “just in case” is a premature optimisation, which has famously been called the root of all evil. In general, you want to profile (measure speed) before you optimise (improve speed) because otherwise you’re blindly doing extra work that is likely to be wasted.
Using a more laborious language “just in case” is nearly the definition of premature optimisation.
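Here is a minimal sketch of “profile first” using Python’s built-in `cProfile` and `pstats` modules. The two de-duplication functions and the input data are made up for illustration. The point is only that a profiler shows where the time goes before you decide what to rewrite.

```python
import cProfile
import io
import pstats


def slow_unique(items):
    """De-duplicate with a list: each membership check scans the whole list."""
    seen = []
    for item in items:
        if item not in seen:
            seen.append(item)
    return seen


def fast_unique(items):
    """Same result, same order, using a dict for constant-time lookups."""
    return list(dict.fromkeys(items))


def report(items):
    """Hypothetical program that happens to call both versions."""
    slow_unique(items)
    fast_unique(items)


def profile(func, *args):
    """Run func under cProfile and return the top entries by cumulative time."""
    profiler = cProfile.Profile()
    profiler.runcall(func, *args)
    out = io.StringIO()
    pstats.Stats(profiler, stream=out).sort_stats("cumulative").print_stats(5)
    return out.getvalue()


items = list(range(2000)) * 2
print(profile(report, items))
```

In a run like this I’d expect `slow_unique` to dominate the report, and the fix is a better algorithm in the same language, not a different language.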
All the Way to the Bottom
All three of these pieces of advice presume that there’s value in learning things “all the way down.” Yes, yes, writing your web applications in C would be foolish in time terms, but surely there is virtue in doing something, doing anything, in a way where you fully understand everything your code is doing?
And here, finally, is the piece of buried bullshit that is so far outside most modern programmers’ view that they don’t understand how bad it is. Here is the problem underneath our problem.
Sure, let’s talk about that.
I used the word “virtue” above very purposefully. Programmers like to think we’re driven by logic and efficiency, but mostly we have all the same repressed issues as everybody else. How could it be otherwise? We’re people.
Want to be a hero? Great. Focus on what you actually
When most people say “all the way to the bottom,” they mean that you should learn down to the level they got down to. If they learned C early on, they mean C. If they learned some assembly to do old pre-ubiquitous-graphics-hardware games, then they mean assembly. If a hardware designer is talking, they mean hardware. Note that it’s a bit of a “natural” level thing — a systems programmer who did a little processor design probably still means assembly, even though they know perfectly well that there are more levels underneath and that they matter.
Each of those is wrong. Let’s talk about why.
C doesn’t work like modern hardware — no concurrency, no vectorisation, but also many other things simply aren’t represented, such as the behavior of cache lines, which is also critical to modern optimisation. That ACM article is, again, a good summary and I won’t repeat it here. So if learning C would teach you a solid understanding of the hardware it represents… then you shouldn’t learn C, because what it teaches you is no longer accurate.
Assembly language is, by definition, close to the hardware — it’s the second-lowest-level language of a given processor after actual machine code, which is barely a language at all. However, assembly language on modern processors is barely optimisable by humans. The issues involved are very difficult, and processor companies long since switched to designing their machine code to be written by compilers, not humans. So if you’re learning assembly, it shouldn’t be to learn about performance, because assembly mostly won’t teach you that. You could learn about compilers — they would teach you more about performance. Compilers aren’t written in assembly, though. Essentially every compiler-design class uses very high-level languages, and has at least since I was in university in the mid-1990s.
You could instead learn assembly because you want to understand “everything” about how your program works. I’ve worked at Palm and at NVIDIA on experimental hardware with hardware bugs… So I’ll tell you that assembly is not everything about how your program works. But it’s worse than that. The Pentium FDIV bug was a reminder that, yes, hardware bugs happen on ‘normal’ commodity hardware too…
But it’s not just hardware bugs. Some of a computer’s behaviour is basically inaccessible to software. Whether you’re looking at memory refresh on computers with DRAM (yeah, I’m old) or hardware interrupts, at some level the computer is doing something you can’t directly perceive or control with your software.
And it’s even worse than that…
Down under the level of assembly language is the level of hardware design diagrams. And under those are transistors. And down under that is the level of electrical currents and fluctuations. And if you go down much further we don’t have language for the levels and the problems any more (my understanding bottoms out even before “quantum physics”), but even lower levels still exist.
You cannot learn computers “down to the bottom” because there is no bottom. And you can’t learn them “just far enough that you fully understand them” because there is not some fully understandable, predictable layer between deploying your Node.js app to Heroku and “the bottom,” which does not exist.
You might think Joel Spolsky’s Law of Leaky Abstractions means that you should go underneath all the leaky abstractions, and learn all the lowest levels that leak through. But unfortunately it’s leaky abstractions all the way down to the lowest levels of human understanding.
Where, then, do you expect to find this magical place with no further leaky abstractions? In practice, no such place exists.
But What Do We Do Instead?
I hang out with my bear. And my kids. It’s not everybody’s thing, but it works for me.
It’s easy for me to be a curmudgeonly grognard, even if I’m not quite the usual flavour. But tearing down is always easier than building up — what do I think programmers should be doing instead?
Basically, don’t bias your study toward lower levels of abstraction. Instead, learn new disciplines as they apply to your problem. That doesn’t mean “never learn C” nor even “never learn operating systems.” But it does mean that you should de-emphasise them — you’ve almost certainly been told too often to learn about them. Learn them when and if you need them for some specific reason.
I don’t care if you’re learning a high-level language like Ruby or one like Haskell or OCaml or Scala, but learn a high-level language and work in it primarily. You’ll learn far faster. The main reason to learn languages that (claim to) care excessively about your computer’s physical architecture is to learn computer archaeology. Instead you should learn about problems that you’re interested in, and then learn the relevant tools. Don’t feel like you need to know how computers used to work unless that helps you somehow.
I’m also not saying you should avoid anything hard — if you’re going to learn Ruby on Rails, that’s likely to point you in directions like SQL and high-performance database work and HTTP and TCP packet protocols and more. There are many fine and difficult topics there, even in a low-abstraction direction. Web programming will also, if you let it, point you in directions like queueing theory and Little’s Law and many others that are quite high-abstraction indeed. Very low-level programmers often tend to ignore higher-level concerns, just like the other way around.
Don’t think of yourself as a turnip, starting at the surface and slowly descending downward below each abstraction as your roots grow. You’re a star, exploding outward in every direction, including high- and low-abstraction. You don’t need to make “downward” your particular preference unless there’s something there you want — and then you should know what it is.
How do you pick a direction right now? I’m a big fan of conscious coding practice as my personal compass. You no doubt have your own.
But Then Why Do People Say…?
Remember, it’s not bad advice if you never look directly at it…
It would be reasonable to ask, “why do people suggest that?” It’s common advice, after all.
Lots of reasons.
Here’s one: people give the advice that was up-to-date when they learned. Folks with more experience have been known to give outdated advice.
Here’s another: nobody really wants to think that skills they spent years on aren’t as valuable as they were. I’ve faced the fact that DECstation assembly language isn’t going to help me out ever again, but… Well, wouldn’t it be nice if it could? Maybe I should suggest to the young folks that there’s something important to be learned there.
(No, not really. But you see the temptation, right? I certainly do.)
And so that bad advice is often given by the people who have put the most effort into it. Somebody with a very expensive university education will stick up hard for the value of it. And somebody with fifteen years of low-level C experience will decide that every programmer needs the same.
If you finished reading through this (thank you!), you probably did it because you’re trying to get better at the craft of software. I certainly am, and I’ve been at it for decades!
If so, I hope you’ll look over my material about practicing software and getting better, and perhaps get on my email list. I work hard at this and I’m looking for other people who do too. I think that improving at actually writing software is something our industry neglects too much, and I want there to be more places to start on that, and more ways to improve. Join me?