do you know c?

November 13th, 2014

In discussions on programming languages I often see C being designated as a neat, successful language that makes the right tradeoffs. People will go so far as to say that it's a "small language", it "fits in your head" and so on.

I can only imagine that people saying these things have forgotten how much effort it was to really learn C.

I've seen newbies ask things like "I'm a Java coder, what book should I use to learn C?" And a lot of people will answer K&R. Which is a strange answer, because K&R is a small book (which further perpetuates the idea that it's a small language), is not exactly pedagogical, and still left me totally confused about C syntax.

In practice, learning C takes so much more than that. If all you know is C the language, then you really don't know anything yet.

Because soon enough you discover that you also need to know the preprocessor and macros, gcc, the linker, the loader, make and autoconf, libc (at least what is available and what is where - because it's not organized terribly well), shared libraries and stuff like that. Fair enough, you don't need it for Hello World, but if you're going to do systems programming then it will come up.
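To give one small taste of why the preprocessor is its own thing to learn: macros are textual substitution, not function calls. Here is a minimal sketch of the classic precedence trap. The macro and function names are made up for this illustration.

```c
#include <assert.h>

/* Hypothetical macros, named only for this example. */
#define SQUARE_BAD(x) x * x          /* no parentheses: pure text pasting */
#define SQUARE(x) ((x) * (x))        /* parenthesized argument and result */

/* Expands to 1 + 2 * 1 + 2, which is 5 rather than 9. */
int square_bad_of_sum(void) { return SQUARE_BAD(1 + 2); }

/* Expands to ((1 + 2) * (1 + 2)), which is 9. */
int square_of_sum(void) { return SQUARE(1 + 2); }

/* Asserts the two results described above. */
void check_macros(void)
{
    assert(square_bad_of_sum() == 5);
    assert(square_of_sum() == 9);
}
```

Even the parenthesized version still evaluates its argument twice, so something like SQUARE(i++) is a problem of its own.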

For troubleshooting you also need gdb and a fundamental knowledge of your machine architecture and its assembly language. You need to know about memory segments, the memory layout and alignment of your data structures, and how compiler optimizations affect that. You will often use strace to discover how the program actually behaves (and so you have to know system calls too).
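As a sketch of what "layout and alignment" means in practice: the same three members can take up a different amount of memory depending on the order you declare them in, because the compiler inserts padding. The struct and function names below are invented for this example, and the exact sizes depend on your platform's ABI, so the code only reports them rather than assuming numbers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical structs: same members, different declaration order. */
struct padded    { char tag; int value; char flag; };
struct reordered { int value; char tag; char flag; };

/* Bytes in struct padded that hold no member data (padding only). */
size_t padding_in_padded(void)
{
    return sizeof(struct padded) - (sizeof(char) + sizeof(int) + sizeof(char));
}

/* Bytes saved, if any, by putting the int first. */
size_t bytes_saved_by_reordering(void)
{
    return sizeof(struct padded) - sizeof(struct reordered);
}

/* The first member always sits at offset 0; later offsets are up to the ABI. */
void check_layout(void)
{
    assert(offsetof(struct padded, tag) == 0);
    assert(offsetof(struct padded, value) >= sizeof(char));
    assert(sizeof(struct padded) >= sizeof(struct reordered));
}
```

On a typical 64-bit Linux machine I would expect the first struct to be larger than the second, but that is a property of the ABI, not something the language promises.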

Much later, once you've mastered all that, you might chance upon a slide deck like Deep C, whose message basically is that you don't understand anything yet. What's more terrifying is the implication behind it: don't trust the abstractions in the language, because when things break you will need to know how it all works under the hood.

In a high-level language, given effort, it's possible to design an API that is easy to use and hard to misuse, and where doing it wrong stands out. Not so in C, where any code is always one innocuous-looking edit away from a segfault or a catastrophic security hole.

So to know C you need all of that. But that's mostly the happy path. Now it's time to learn about everything that results in undefined behavior. Which is the 90% of the iceberg below the surface. Whenever I read articles about undefined behavior I'm waiting for someone to pinch me and say the language doesn't actually allow that code. Why would "a = a++;" not be a syntax error? Why would "a[i]" and "i[a]" be treated as the same when syntactically they so clearly aren't?
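For the record, the "a[i]" and "i[a]" thing has an explanation: the standard defines the subscript expression E1[E2] as (*((E1)+(E2))), and addition doesn't care which side the pointer is on. A minimal sketch, with function names made up for the illustration:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helpers, named only for this example. */

/* The usual spelling: a[i] means *(a + i). */
int subscript_normal(const int *a, size_t i) { return a[i]; }

/* The odd spelling: i[a] means *(i + a), which is the same element. */
int subscript_swapped(const int *a, size_t i) { return i[a]; }

/* Asserts that both spellings read the same element at every index. */
void check_subscripts(void)
{
    int a[4] = {10, 20, 30, 40};
    for (size_t i = 0; i < 4; i++)
        assert(subscript_normal(a, i) == subscript_swapped(a, i));
}
```

I've deliberately left "a = a++;" out of the code. It compiles because the grammar has no objection to it, but it modifies a twice with no sequencing between the two, which is undefined behavior, so there is no result worth printing.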

Small language? Fits in your head? I don't think so.

Oh, and once you know C and you want to be a systems programmer you also need to know POSIX. POSIX threads, signals, pipes, shared memory, sync/async I/O, ... well you get the idea.
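Just to show how much ceremony even the simplest item on that list involves, here is a rough sketch of pushing a short message through a pipe and reading it back in the same process. The helper name is invented, and it assumes a POSIX system and a message small enough to fit in the pipe buffer.

```c
#include <assert.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: writes msg into a pipe and reads it back into out.
   Returns the number of bytes read, or -1 on any error. */
ssize_t pipe_round_trip(const char *msg, char *out, size_t out_size)
{
    int fds[2];                       /* fds[0] = read end, fds[1] = write end */
    if (out_size == 0 || pipe(fds) == -1)
        return -1;

    ssize_t written = write(fds[1], msg, strlen(msg));
    close(fds[1]);                    /* closing the write end lets read() see EOF */
    if (written < 0) {
        close(fds[0]);
        return -1;
    }

    ssize_t got = read(fds[0], out, out_size - 1);
    close(fds[0]);
    if (got < 0)
        return -1;

    out[got] = '\0';                  /* read() does not terminate the string */
    return got;
}
```

Three system calls, each with its own failure mode, and that's before a second process, a signal or a partial read enters the picture.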
