Archive for 2008

why write? because it keeps you honest

January 26th, 2008

I have to admit that this writing business is a little eccentric. I mean who writes? Isn't that for authors? Most people don't even do much reading, let alone writing. And those who're not interested in reading more are definitely not inclined to start writing. What for?

I had never written anything before I went to junior high. Before that, learning a language was all about, you know, learning the language. Then they recast it as, well, using the language. They wanted to have discussions in class. And they gave us essay questions. It is a strange thing to do. You pick a topic off a list and you have to write an essay about it. About something you have no opinion on. I mean to what end? But once you learn the game, it's not hard. You just make up a standpoint and argue it. It doesn't matter what you decide, just as long as you can write a coherent argument. Of course, one other thing you learn is to lie. If the essay question is asking whether schools should assign more homework or less, you obviously argue for more. You give the people what they want, and they give you the higher grade. Telling people what they want to hear is a useful skill.

So that's what they teach you in school, to argue for standpoints. And to be... flexible in your stand. Frankly it makes no difference: what you write in an essay is not going to affect anything, so you can fake it without repercussions.

I guess I would have been surprised to hear this back then, but this essay writing turns out to be a useful exercise. It's practical to be able to write arguments. It's not precisely the skill you need to get your job done, but it tends to come up now and then.

So writing is a bit eccentric. And you may wonder why you should. Certainly, there are plenty of people writing about why you should be writing. Some will respond that they don't have anything to say, and therefore nothing to write. Others might say that they find reading more useful, because it gives them a chance to read material at a level they themselves could not produce.

Here is the thing. Writing is not some sort of special activity that comes with a license. If you have thoughts, you can write. Because that's all it is: expressing thought. It is a very different form of "thinking" than thought itself. It is impossible to just "write down" your thoughts, because they are all over the place. Your text, on the other hand, has to be something that holds together. So the process of writing is applying structure and plot (a progression) to a thought.

You may wonder why this is useful. Then think about how we communicate. We do not plug our brains together and synchronize our thoughts. We have to go through this process of taking a thought, verbalizing it in a specific, structured manner, and sending it across to the other person. This is the only way we know to interact. And however different this is from thinking itself, we are stuck doing this, it's the only way we can exchange ideas.

So where does honesty come into it? Let me motivate that. Verbal communication is a chaotic and erratic activity. It is also not accountable. If I tell you at the beginning of a conversation that I believe students should be assigned more homework, and if I later realize that this is an inconvenient standpoint in light of where the conversation has gone, I can change it. I can tell you that I never said I thought more homework was a good idea. And you can't prove that I didn't say it. I'm the authority on what I said, because I know my motives. And there is no record, all you can say is that you remember I said something different from what I'm claiming now. But I'm refuting that accusation. There's nothing you can do to make me accountable.

Notice that this is only the simplest example. I can twist words, I can take things out of context, I can be very unfair to your statements. And because it's a progressing, verbal exchange, it's quite difficult for you to pinpoint where I'm cheating and call me out. Do you think I could get away with the same thing if I had to put it in writing? I couldn't. Because you could point out any two spots in my text that are mutually contradictory and prove that my statement isn't even coherent. Or you could point to a place where I'm twisting words or taking things out of context and say that I'm cheating. It's easy to do that when there is evidence.

Given a choice, which would you rather have? Would you rather convince me with an argument that isn't coherent, but where the flaws are so well hidden that I can't figure out why your argument seems to be right even though something doesn't quite fit? Or would you rather have an actual coherent and convincing argument to present? Heck, even if you could convince me with completely flawed reasoning and I wouldn't know a thing, where only you would know the deception, wouldn't you rather have an honest argument?

In a sense, this is what writing is all about. It is taking a train of thought that seems to be a convincing argument and fleshing it out so that when you articulate it as a statement, it is coherent and convincing. And therefore it is about applying a certain rigor to a train of thought, checking whether something that seems convincing really is.

What might surprise you is that practicing writing not only improves writing ability, it also makes you more convincing in conversation. It happens because you begin to apply the same rigor when composing a statement in conversation that you do in writing. This counters the intuition that some people have about the internet. I'm talking about the school of thought that says talking to people on the internet isn't real, why don't you get a life. As if what we do "in here" (a strange name for a globally interconnected network) is completely disconnected from what happens "out there" (in your local neighborhood). But the mind is not so disjoint: when you practice writing arguments, it affects how you construct arguments in speech as well. And it helps keep you honest, because even though cheating is much easier in speech, you think to yourself: I could not get away with this in writing.

Of course, there are many other reasons why writing is useful, and as I alluded to, there are plenty of bloggers out there who encourage blogging for various reasons. I don't find all of them convincing, but the one thing you can say for certain is that it's a different way of interacting with your thoughts. And it's a way to be precise about thoughts and have that train of thought on the books for something you can return to.

could we replace doctors with a search engine?

January 24th, 2008

Think about the last time you went to see a doctor. What happened? You spoke, right? Explained your symptoms? You were examined? And then? Got a prescription, did you? Or just advice on how to better your situation? Well, this is what most visits are like, most of mine anyway have been like this. In most cases it was only a conversation and in many cases there wasn't even an examination involved.

Doctors have an odd profession. In the course of their education they have to learn an extraordinary number of facts about diseases and ailments. Alright, so everyone has to learn a lot of stuff. But when they get out of school, they're not using their knowledge to build bridges, or write tv commercials, or fix vacuum cleaners. They just sort of give out parts of that knowledge bit by bit to patients. You come in, you explain your problem, the doctor digs deep into his vast database of knowledge and pulls out the bit of information that is relevant to your situation.

- Doc, I'm having trouble with my foot.
- You might have a swollen ankle. Do this, don't do that. Eat this, don't eat that. Good luck!

Their whole business is based on disseminating personalized information in little portions, at the appropriate time. It's like a temporary storage for information. Hold on to this and remind me at the appropriate time. They are walking encyclopedias, essentially. And, in fact, seeing a doctor is often called "a consultation". So they are consultants. You come to them if you want to know something about their field of expertise.

Can you think of anything else that works this way? Yes, a search engine. The symptoms are the index (think of the alphabetical index in the library); they tell you where the information is. You follow that and you get to the information you're looking for.

Of course, it isn't only what you say. A doctor will use all information about you to make a diagnosis, not just what you tell him in words. He'll also determine whether you look well, whether your pupils are dilated, whether your voice is changed, whether your breathing is normal etc. And he has your medical history available to him, as well as basic facts like age and weight, all of which goes into a deliberation of the most likely thing to be wrong with you.

But if you could actually articulate all of this information, then the role of a doctor could just as well be succeeded by a search engine that would give you the same information. In fact, it could actually be a much better doctor too. Just think of all the knowledge that a doctor has, based on all the experience and all the patients he's seen. When you come to him, that is everything he has available to him to understand your situation. In fact, I have the impression that a doctor who is just out of school isn't very good, because he has no experience. After working a couple of years, having seen hundreds of patients, he's building a database of knowledge that is invaluable to judge what the most likely problem is, given how many possible ailments exhibit similar symptoms. So it's a statistics game, the more experience you have, the more skilled you are.

But just imagine pooling *all* the knowledge that *all* doctors have all in one place and making it searchable. That would be the most knowledgeable doctor of all time. You could present any set of symptoms and if there's one doctor on the planet who has seen this before, you would get a good answer. (Think Google.)

It would also decrease the demand for doctors considerably. If there is now one doctor per 100 people, that is 10 doctors per 1000, so there could be one per 1000 if the other 9 were replaced. Of course, sometimes you need instruments to do an examination, so you couldn't do this at home, you'd still have to go to the clinic, but it would be a more do-it-yourself kind of place.

One upshot of not having a human doctor would actually be that you wouldn't have to lie about embarrassing problems. This is a real issue for doctors today. Patients don't want to tell the truth when it's humiliating, so they lie and the doctor has to see through this (and of course you *do* eventually want your doctor to know the truth to get the right diagnosis, you just don't want to have to say it) and use his judgment and a little diplomacy to help you despite yourself.

So what about those 9 doctors that aren't needed? Well, they could be working to enrich the database of information. In our current model, educating doctors means giving the same information to everyone (i.e. massive duplication) and sending every guy to a different place to see patients. And there is an obvious limit to how much a person can learn in 4-5 years, so every doctor has the same limitation. Instead, what we could do is put those 9 people to work on different things (experimental drugs, studies of diseases and so on) and the huge database of information would be ever richer and more useful.

a critical look at paludis

January 23rd, 2008

I've been meaning to scribble something about paludis for a while now. I was tempted to do that right after I started using it, but then I thought it would be better to get some perspective and that would also cover issues that may come up some months into actual use.

So the moment has come. I installed paludis sometime in mid-November 2007. Long before that it was announced portage compatible, so it should be safe enough. Migration is neither all that long nor that complicated, it's mostly a matter of getting used to paludis's different philosophy. One of the odd things is setting up repositories (overlays) in /etc/paludis/repositories, but it's easy enough.

From a portage user's perspective configuring paludis is not the most pleasant experience. The documentation is quite complete, but it really demands that you know exactly what you want. There aren't any texts to read, it's generally just a FAQ. As far as user guides go it's not the most friendly one:

Non-Problem: There's no PORTAGE_NICENESS equivalent.

Rationale: Learn how to use nice. There's no GCC_NICENESS or VIM_NICENESS either.

To me personally (although I'm sure I'm not alone in this), it is portage's "rounded corners" that made it such a great package manager to use from the beginning. It had all this built-in convenience, like PORTAGE_NICENESS, like color output, like output that is verbose enough to be informative, but not overly verbose, like make.conf where you could set a range of optional, useful settings, like having emerge --ask which I would use all the time etc. Contrast that with something like apt-get and there's absolutely no doubt what the nicer tool is. Perhaps this bling also undermines portage's conceptual integrity, eventually turning it into an unmanageable codebase. But it's also what made me choose gentoo: the fact that it had, as it aspires to, the best tool.
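For what it's worth, the FAQ's "learn how to use nice" advice comes down to a couple of lines of shell. This is only a sketch of what you end up maintaining yourself: the names run_niced, npal and NICENESS are made up for illustration and are not part of paludis or portage.

```shell
# Run any command at reduced CPU priority.
# NICENESS is a hypothetical variable standing in for PORTAGE_NICENESS.
run_niced() {
    nice -n "${NICENESS:-15}" "$@"
}

# Hypothetical wrapper: every paludis invocation goes through nice.
npal() {
    run_niced paludis "$@"
}
```

It works, but it now lives in your shell startup files rather than in the package manager's configuration.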

Now paludis is more puritanical about this. What that means in practice is that it pushes that burden onto you, the user. We don't want it in paludis, so it's now your problem. As evidenced by advice like this:

Non-Problem: Paludis doesn't restore the xterm title on exit.

Rationale: Neither does anything else. Some programs do set it to a guessed value based upon a default prompt for certain distributions, but they don't restore it. You should be using PROMPT_COMMAND to do that yourself -- see the bash documentation.

So since paludis won't do this for me, it's now my problem to set in place the proper infrastructure for this, and to maintain it. It ceases to be a configuration option, it becomes a user environment issue. And I have to maintain this environment across machines, because it's no longer part of the application. Paludis is a tool that is technically superior, but inferior on user friendliness.
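To make that concrete, here is a minimal sketch of the kind of thing the FAQ is pointing at, assuming bash and an xterm-compatible terminal. The function name set_xterm_title and the title format are my own choices, not anything paludis prescribes.

```shell
# Emit the xterm "set window title" escape sequence:
# ESC ] 0 ; <title> BEL
set_xterm_title() {
    printf '\033]0;%s\007' "$1"
}

# bash evaluates PROMPT_COMMAND before printing each prompt, so the
# title is put back after any program (paludis included) has changed it.
PROMPT_COMMAND='set_xterm_title "${USER}@${HOSTNAME}: ${PWD}"'
```

This goes in ~/.bashrc, on every machine where you want the behavior.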

Not having FEATURES also means that I have to set all these things on the command line:

$ type ipal
ipal is aliased to `paludis -i --dl-reinstall if-use-changed --debug-build none --log-level warning --continue-on-failure if-independent'

And I'm still not sure if I'm setting all the optimal options, because there's tons of them. (Yes, there is PALUDIS_OPTIONS, but I wonder if it's useful to have different options on install, query etc.)
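One way to get per-action options is a pair of small wrapper functions instead of a single alias. The sketch below reuses the exact options from my alias; the function names, the PAL_* variables and the dry-run switch are assumptions for illustration, and the split between "common" and "install" options is my guess at what is sensible, not a recommendation from the paludis documentation.

```shell
# Options copied from the alias above. Splitting them by action is an assumption.
PAL_COMMON_OPTS="--log-level warning"
PAL_INSTALL_OPTS="--dl-reinstall if-use-changed --debug-build none --continue-on-failure if-independent"

# Print the command line instead of running it when PAL_DRY_RUN is set
# (hypothetical switch, handy for checking what would be executed).
pal_run() {
    if [ -n "${PAL_DRY_RUN:-}" ]; then
        echo paludis "$@"
    else
        paludis "$@"
    fi
}

# The option variables are deliberately unquoted so the shell splits
# them into separate words (works in sh and bash).
pal_install() { pal_run -i $PAL_COMMON_OPTS $PAL_INSTALL_OPTS "$@"; }
pal_query()   { pal_run --query $PAL_COMMON_OPTS "$@"; }
```

With PAL_DRY_RUN=1 set, pal_install app-editors/vim just prints the full command, which makes it easy to see which options each action is getting.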

One serious usability problem is that paludis is ridiculously verbose. I wonder what kind of giant monitors the paludis developers own, but for my part paludis = lots of scrolling. Even running paludis --help has to be piped to less to refresh my memory on the most useful switches. What's more, the most important output is always at the top, so I always have to scroll the longest distance. If you want to flood the screen, put the crucial bits at the bottom, that's common sense. Case in point, if I'm installing packages and I do a paludis --pretend, I have to scroll up through all the use flags, then I come to the list of packages, but each package entry is several lines long, so by the time I get to the top I'm quite annoyed. The verbosity of paludis has to be easily 3-4 times that of portage.

Furthermore, default settings matter a great deal, and as a developer I think you are entirely culpable for setting poor defaults. On my first day with paludis I reached a show-stopping bug when openssl refused to compile. As it turns out, it was the test suite that was broken, and since paludis by default runs the tests (or did, anyway), it just wouldn't install. SKIP_FUNCTIONS="test" fixes this, but that was clearly a misguided default setting. And there are many examples of this. --debug-build is enabled by default, which I think is wrong because most gentoo users don't actively debug *every* package they install. Meanwhile, these files take up quite a bit of space. Furthermore, coming back to the verbosity problem, --log-level is set to qa. This means that I, as a user, have seen this message every time I invoked paludis for the last two months:

paludis@1200787359: [QA] In program inquisitio -s perl:
... When performing query action from command line:
... When handling query 'php':
... When fetching versions of 'dev-lang/php' in gentoo:
... When loading versions for 'dev-lang/php' in gentoo:
... When extracting version from '/usr/portage/dev-lang/php/php-5.2.4_pre200708051230-r2.ebuild':
... When parsing version spec '5.2.4_pre200708051230-r2':
... Number part '200708051230' exceeds 8 digit limit permitted by the Package Manager Specification (Paludis supports arbitrary lengths, but other package managers do not)

Not only is that a trifle of a bug, how exactly am I, the user, served by seeing this warning? I'm not a developer, so I couldn't fix it if I wanted to. Furthermore, the irony of it all is that apparently paludis, which gracefully handles this problem, is the one emitting the error, whereas portage, which may actually be affected by it, doesn't. A pure exercise in futility.

Installing packages also outputs chunks of lines that never seem to change with any package, output I have no need to see:

>>> Running ebuild phase prepare as root:root...
>>> Starting builtin_prepare
>>> Done builtin_prepare
>>> Completed ebuild phase prepare
>>> Running ebuild phases init saveenv as root:root...
>>> Starting builtin_init
>>> Done builtin_init
>>> Starting builtin_saveenv
>>> Done builtin_saveenv
>>> Completed ebuild phases init saveenv
>>> Running ebuild phases loadenv setup saveenv as root:root...
>>> Starting builtin_loadenv
>>> Done builtin_loadenv

Is this supposed to be useful information for the user? I don't even know what it means.

Clearly, there are a few glaring problems. And maybe that's not so shocking from a development team that seems dead focused on technical issues. No surprise then, perhaps, that from a technical standpoint paludis fires on all cylinders. It took me a few days to get used to paludis, but since then it hasn't done anything weird or unexpected, it hasn't crashed, it has been rock solid. And those "issues" that may come up in the fullness of time? They never came up.

And this you already knew: paludis is fast.

user settings migration

January 19th, 2008

The nice thing about being a gentoo user (as all gentoo users know), is not having to wait for your distribution to ship packages for a new release. You just decide for yourself how soon you want to jump ahead and start using either unstable code or just-released goodness. So while Ubuntu is shipping KDE 4.0 in 8.04, and thus my laptop is stuck waiting for it, on my gentoo box I can use it as soon as the ebuilds hit the tree (and even before that, with layman).

So when I launch into the nicely pre-configured KDE 4.0 desktop the first thing I notice is that my configuration settings from KDE 3 no longer apply. What's happened is that the ~/.kde symlink has been pointed from ~/.kde3.5 to ~/.kde4.0 and so every remaining KDE 3 application (of which I have many), is now trying to locate its settings under ~/.kde4.0, where it has no settings. In other words, every application I've configured from akregator to yakuake now has to be reconfigured (even though it hasn't changed!) because of KDE 4. That stinks, I don't want to waste time trying to reproduce the settings of some 20 applications to match exactly what they used to be.

What are my options? I can go into ~/.kde3.5/share/apps and copy every directory I care about over to ~/.kde4.0/share/apps. Then I have to do something similar for ~/.kde3.5/share/config versus ~/.kde4.0/share/config. But if that is all it takes why didn't KDE 4 do that on the first run? There are a lot of configuration files in there, and I've never looked at them nor should I have any reason to, they've all been written by the application they belong to. Furthermore, some applications are upgraded with KDE 4.0.0, so I don't know if it's safe to copy their config files across. For instance, kwin-4.0.0 is one of the new packages I installed. Now, I like my existing kwin settings, and as far as they still apply I want to use them in kwin4, but I don't know how kwin4 deals with old configuration files. Applications know this, users don't.

What KDE 4 could have done is to duplicate ~/.kde3.5 into ~/.kde4.0 (although that could potentially grab a lot of disk space) and then selectively migrate the configuration files on a per-application basis. So kwin4 could figure out there are some things that no longer apply, discard those, and accept the rest. It would only have to do that once. And all the KDE 3 settings would still be preserved in ~/.kde3.5.
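The first, blunt half of that idea (duplicate, but never clobber what KDE 4 has already written) can be sketched in shell. The paths are the ones from this post; the function name and the never-overwrite policy are my own assumptions. It cannot do the second half: it has no idea which old settings kwin4 or any other upgraded application will still accept, because only the applications know that.

```shell
# Copy KDE 3 settings into the KDE 4 directory without overwriting
# any file that already exists there. Hypothetical helper, not part of KDE.
migrate_kde_settings() {
    src=$1
    dst=$2
    for sub in share/apps share/config; do
        [ -d "$src/$sub" ] || continue
        # List regular files relative to the source subdirectory.
        # (File names containing newlines are not handled.)
        (cd "$src/$sub" && find . -type f) | while IFS= read -r f; do
            f=${f#./}
            if [ -e "$dst/$sub/$f" ]; then
                continue    # KDE 4 already has this file, leave it alone
            fi
            mkdir -p "$dst/$sub/$(dirname "$f")"
            cp -p "$src/$sub/$f" "$dst/$sub/$f"
        done
    done
}

# Usage (commented out so that sourcing this file changes nothing):
# migrate_kde_settings "$HOME/.kde3.5" "$HOME/.kde4.0"
```

Since ~/.kde3.5 is only read from, the KDE 3 settings stay intact whatever happens on the KDE 4 side.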

The thing to remember is that configuration settings are still user data. Losing a user's settings is not as egregious as losing his emails, but it's still data loss. It's valuable information. And the more configurable your application is, the more you should care about keeping the user's settings safe, because a complicated configuration is a lot harder to remember and reproduce than a simple one. Preserving old settings is fine, but it isn't very useful when you don't also migrate them to a new version of your application.

what is it about coding?

January 12th, 2008

This week marks the release of KDE 4 with a lot of noise. But what strikes me more than the actual code being released is all that I've read about kde4 ever since I started reading Planet KDE quite regularly last year. I've read the words of many people who are above all very excited about whatever it may be they are currently working on, big or small. There is a palpable widespread enthusiasm in that community (at least among the people who like to blog).

The question is why. What are the kde people so excited about? A qt widget that auto adjusts on resize? Uh-huh. Why is Brian Carper so determined to learn Lisp when it's not his job and no one is pushing him to it? Jarosław Rzeszótko is a Polish kid who tried to map out the entire realm of programming so that he can spend the next however many years learning... everything. Why does he care so much? The programming subreddit is consistently dominated by esoteric programming languages and wild ideas, which suggests that there's a lot of hobbyism and experimentation going on, not just plain "working for the man". Why bother with all this tinkering?

When we see a thrilling musical performance, there comes that realization of just how much work and how much mental energy it must have taken to produce it. If you start today and you're very talented, you might be able to give a thrilling performance in 10 years. People devote their lives to achieve this. But hey, it's music, it's art, it's incredible. Listening to music performed this way makes you feel something you can't reproduce any other way. It activates feelings deep inside you and brings them to the surface, making you experience hurt, relief, harmony, ... just by listening to sound.

So what about coding then? Why do people care about coding? It's not art. It doesn't make the user have all these wonderful experiences they get from music, it's really no more exciting for them than... filling in forms. There aren't any art exhibitions for software. And if ever using software channels deep feelings, it's usually not in a good way.

But there is something about programming that very deeply appeals to us programmers. It is difficult to explain, and I have never found it explained by anyone... fully. How do you explain something you don't completely understand?

Well, until today. I just received my order from Amazon, "The Mythical Man Month", Fred Brooks's set of essays on software engineering. In the first essay, The Tar Pit, Brooks writes:

The Joys of the Craft

Why is programming fun? What delights may its practitioner expect as his reward?

First is the sheer joy of making things. As the child delights in his mud pie, so the adult enjoys building things, especially things of his own design. I think this delight must be an image of God's delight in making things, a delight shown in the distinctness and newness of each leaf and each snowflake.

Second is the pleasure of making things that are useful to other people. Deep within, we want others to use our work and to find it helpful. In this respect the programming system is not essentially different from the child's first clay pencil holder "for Daddy's office".

Third is the fascination of fashioning complex puzzle-like objects of interlocking moving parts and watching them work in subtle cycles, playing out the consequences of principles built in from the beginning. The programmed computer has all the fascination of the pinball machine or the jukebox mechanism, carried to the ultimate.

Fourth is the joy of always learning, which springs from the nonrepeating nature of the task. In one way or another the problem is ever new, and its solver learns something: sometimes practical, sometimes theoretical, and sometimes both.

Finally, there is the delight of working in such a tractable medium. The programmer, like the poet, works only slightly removed from pure thought-stuff. He builds his castles in the air, from air, creating by exertion of the imagination. Few media of creation are so flexible, so easy to polish and rework, so readily capable of realizing grand conceptual structures. (As we shall see later, this very tractability has its own problems.)

Yet the program construct, unlike the poet's words, is real in the sense that it moves and works, producing visible outputs separate from the construct itself. It prints results, draws pictures, produces sounds, moves arms. The magic of myth and legend has come true in our time. One types the correct incantation on a keyboard, and a display screen comes to life, showing things that never were nor could be.

Programming then is fun because it gratifies creative longings built deep within us and delights sensibilities we have in common with all men.

In trying to formulate an answer to the question myself, I have only ever been able to clearly state the first of the mentioned facets: building things. The fluffiest and most intangible quality I see in the emphasized paragraph. In this respect I get people who don't understand programming. It is such a strange thing in many ways.

But it is also this that makes the endeavor forever interesting. Imagine a computer game that has such incredible depth that you can spend your whole life playing it and whichever dimension you pick you can never see the end of it. No matter how much you zoom in the picture, you can never see the pixels. No matter how refined your battle strategy is, you can never figure out the computer opponent, because as your strategy gains an ever greater granularity, so does his. And no matter how many levels you complete, there are always more left.

From a child psychology point of view, I may have been a surefire pick for programming (if someone were confident enough to foresee the PC revolution). I had building blocks from the beginning, and I loved Lego blocks more than anything else. I didn't really care about the structures they were for, I just built stuff. The only limitation to Lego blocks is actually the blocks themselves. There are only so many blocks, and there are only so many types of blocks. With rectangular blocks, you can't build something round. And with plastic blocks you can't build ships, cause the material is too heavy. And because the blocks are a certain size you can't build something that is both small and complex. And because the blocks are all the same material you can't do anything magical like make some parts lighter or run a current through a wire, because you need different materials for that.

I think of programming languages as blocks. They are what makes our programs "slightly removed from pure thought-stuff". They are what makes our abstract craft into something real. But they also define and limit how we can build our castles in the air. Every programming language is a different straitjacket. A different set of blocks.

It is an interesting dilemma. We need languages to be able to express anything. A language, to be something, simply must be concrete. And with a concrete language we can build concrete things, but it is also the very same thing that limits what we can express. Without our language, our castles in the air have no restrictions. They are pure imagination. But they can also never be.