Archive for September, 2007

the unhappy reality of upgrades

September 25th, 2007

It struck me today that as coders we do what we can to wrap our nasty, complicated code in a neat package that the user will love. They don't realize, and we don't want them to know, just how convoluted and messy the stuff is on the inside. And this holds up for long periods of time. But there comes a time when our neat little illusion cracks up and the ugliness comes into view. Bugs expose it sometimes, but upgrades do this with immaculate regularity.

Why today? Because Wordpress 2.3 was released today. The Wordpress people decided to toss out categories in favor of a wonderfully engineered (isn't that what we always believe?) taxonomy system. With the immediate consequence that any code that has anything to do with categories would break. That's two of my plugins. Clearly these guys are not Windows users. Microsoft's Patch and Play strategy with Windows has kept *a lot* of companies happy, as they continually strive to emulate their old bugs to accommodate programs that were written to cope with them. This has seriously handicapped Windows from making progress, because they keep dragging along that huge sack of legacy code going back to probably Win3.1 (with Workgroups, yay!).

Posts used to be related to Categories with an in-between table, the classic N:M relational idiom. Now there are 4 tables, all related to each other in interesting ways. It took me quite a while to crack this code. This was introduced to add tagging support, which is quite the annoyance, because I have no interest in tagging. I find it a useless errand. And, of course, for those not tagging from the beginning you always come back to having to post-tag 600 old posts. Forget it.

Tools always help a lot, but it's very difficult to capture all the nuances, and in many cases human review is necessary anyway (particularly when themes change). And this is the sad reality of it. While minor upgrades are now handled routinely, bigger changes will always cause problems.

The End of Faith

September 24th, 2007

I find that Sam Harris is a more articulate critic of religion than Richard Dawkins, who is clearly more hostile. His two books The End of Faith and Letter to a Christian Nation raise a lot of interesting issues. The latter is not particularly interesting, but the former presents much insight.

The text is actually quite comprehensive at times, and piecemeal reading doesn't do it justice. I think he makes excellent points about the suffocating role of religion as opposed to progress and the evolution of knowledge. And his criticism of religion as a driving force in politics and policy making leaves little to dispute. Furthermore, the distinction of religions in terms of their values is a very relevant point. As is the condemnation of the taboo against criticizing irrational beliefs.

But what ultimately drives his "war on religion" is the premise that unless we do something right now we're going to destroy ourselves. It would be perfectly fine to make all the arguments he does as a crusade for intellectual honesty. And there is certainly enough social and political justification for it. But his bottom line is a doomsday scenario which I find rather extreme.

For more on this (and yes, he's an excellent speaker), youtube it:

The talks are rather focused on the struggle between religion and rationality; they don't go into his ideas about world destruction that much.

wordpress update script

September 22nd, 2007

Aah, the free world. It's beautiful, you have frequent releases, the code is there for you, everything's wonderful. But for web apps like Wordpress the maintenance cycle is less convenient than with desktop applications. There's no package manager to handle updates for you. Yes, that's the downside.

I've upgraded Wordpress now 3-4 times and I'm already sick of it. It's so mechanical. I've also rehearsed the cycle a bunch of times with vBulletin. Well, compared to the tidy and elegant Wordpress, vBulletin is a monster. But the upgrade issues are the same, albeit less painful now. I should have done this years ago, but now at least I have an organized way of handling these upgrades.

Here's the rationale. You download some Wordpress version from and install it. This we call the reference version. Then you hack on it a bit. You install plugins, maybe you hack the source a little. You change the theme a bit. And in the course of using Wordpress you also upload files with posts sometimes, for instance to include pictures with your posts.


So now the state of your Wordpress tree has changed a bit, you've added some files, maybe you've changed some files. Basically, it's different from the vanilla version. This we call mine. And now you've decided that the next Wordpress version has some nice features and bug fixes you want. This version we call latest.

You want to upgrade, but there is no upgrade path from mine to latest, because the Wordpress people can't know what you did with your local version. Upgrading from mine to latest may not be safe, it hasn't been tried.
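To get a feel for how far mine has drifted from ref, a recursive diff of the two trees is enough. This is only a small sketch on throwaway directories; the file names and contents are made up for the demo and have nothing to do with a real Wordpress tree.

```shell
# Two throwaway trees standing in for ref and mine (hypothetical contents).
work=$(mktemp -d)
mkdir -p "$work/ref/wp-admin" "$work/mine/wp-admin" "$work/mine/uploads"
echo "same" > "$work/ref/index.php"
echo "same" > "$work/mine/index.php"
echo "original" > "$work/ref/wp-admin/page.php"
echo "hacked" > "$work/mine/wp-admin/page.php"
echo "photo" > "$work/mine/uploads/pic.txt"

# -r walks both trees, -q reports which files differ without showing how.
# LC_ALL=C keeps the messages in English; diff exits 1 because the trees differ.
summary=$(cd "$work" && LC_ALL=C diff -rq ref mine)
echo "$summary"
rm -rf "$work"
```

The untouched index.php stays out of the report, the hacked page.php shows up as differing, and the uploads directory is listed as existing only in mine.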

Of course, this sort of problem is nothing new. Coders have faced it forever. And that's why we have things like diff and patch, standard Unix tools. So here's how to upgrade safely.

  • First roll back the local changes so that we return to the reference version.
  • Save the local modifications.
  • Do a standard Wordpress upgrade going from ref to latest.
  • Re-apply, if possible, the local modifications.

And this replicates exactly what you would do manually if you wanted to be sure that the upgrade doesn't break anything. Just that it's a lot of hassle to do by hand. The upgrade is done offsite, so your blog continues to run in the meantime. And once you've upgraded, you can just move it into the right location.

In the event that merging diff and latest does not succeed, you have a list of the patches and files so that you know exactly which ones didn't succeed.

So far I've used the script below to do two updates, 2.2.1->2.2.2 and 2.2.2->2.2.3, without any hiccups.
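The cycle itself can be tried in miniature before trusting it with a real blog. The sketch below is not the script, just the same idea on three throwaway trees: the directory names, the file and its contents are all invented for the demo, and it assumes diff and patch are installed.

```shell
# Throwaway trees standing in for ref, mine and latest (contents are made up).
work=$(mktemp -d)
mkdir "$work/ref" "$work/mine" "$work/latest"
seq 1 20 | sed 's/^/line /' > "$work/ref/index.php"
sed '2s/$/ (my hack)/' "$work/ref/index.php" > "$work/mine/index.php"
sed '18s/$/ (upstream fix)/' "$work/ref/index.php" > "$work/latest/index.php"

# Save the local modifications: everything that separates mine from ref.
# diff exits with 1 when it finds differences, which is the expected case here.
( cd "$work" && diff -ruN ref mine > local.patch )

# Re-apply them to latest; -p1 strips the leading "ref/" or "mine/" from the paths.
if ( cd "$work/latest" && patch -p1 -s < ../local.patch ); then
	status="applied"
else
	status="failed"
fi

# Keep the two interesting lines for inspection, then clean up.
hack_line=$(sed -n 2p "$work/latest/index.php")
fix_line=$(sed -n 18p "$work/latest/index.php")
echo "$status: $hack_line / $fix_line"
rm -rf "$work"
```

The local hack and the upstream fix touch lines far enough apart that the patch applies cleanly. When both sides change the same lines, patch reports a failed hunk instead, and that is the case where a human has to step in.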


#!/bin/bash
#
# >> 0.3
# added file/dir permission tracking
# added hint for failed file merges
# added hint for failed patches

# Print the banner. The heredoc is quoted so nothing in it gets expanded.
cat <<'header'
################################################################################
#                                                                              #
#                      Wordpress Updater / version 0.3                         #
#                   Martin Matusiak ~                      #
#                                                                              #
#  Description: A script to automate [part of] the Wordpress update cycle, by  #
#  finding my modifications to the codebase (mine), diffing them against the   #
#  official codebase (ref), and migrating files and patches to the latest      #
#  version (latest).                                                           #
#                                                                              #
#  Warning: Upgrading to a new version will probably not always work           #
#  seamlessly, depending on what changes have occurred. Do not use this as a   #
#  substitute for following the official upgrade instructions. Furthermore, if #
#  you don't understand what this script does, you probably shouldn't use it.  #
#  Also, it's always a good idea to back up your files before you begin.       #
#                                                                              #
#  Licensed under the GNU General Public License, version 3.                   #
#                                                                              #
################################################################################
header

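### <Configuration>

# Set these before running:
#   wpmine             path to the live Wordpress tree
#   version_file       the file in that tree that sets $wp_version
#   wordpress_baseurl  where the release tarballs are fetched from
#   temp_path          scratch directory for downloads, diffs and the new tree

### </Configuration>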

echo -e "Pausing 10 seconds... (Ctrl+C to abort)\007"
sleep 10

# Print a status line, padded with spaces to the width of the banner.
msg() {
	fill=$(for i in $(seq 1 $((76 - ${#1}))); do echo -n " "; done)
	echo "+  ${1}${fill}+"
}

if ! mkdir -p $temp_path; then
	echo "$temp_path not created"; exit 1
fi

msg "Checking installed version... "
if [ -f $version_file ]; then
	# keep only the version number from the line that sets $wp_version
	ref=$(cat $version_file | grep "\$wp_version" | tr -d " _$'';=[:alpha:]")
	echo $ref
else
	echo "$version_file not found"; exit 1
fi

msg "Fetching version $ref... "
[ -f $temp_path/wordpress-$ref.tar.gz ] && rm $temp_path/wordpress-$ref.tar.gz
if wget -q -P $temp_path $wordpress_baseurl/wordpress-$ref.tar.gz; then
	echo "downloaded to $temp_path"
else
	echo "could not fetch $wordpress_baseurl/wordpress-$ref.tar.gz"; exit 1
fi

msg "Unpacking reference version $ref... "
[ -d $temp_path/wordpress-$ref ] && rm -rf $temp_path/wordpress-$ref
if (cd $temp_path && tar zxf wordpress-$ref.tar.gz && mv wordpress wordpress-$ref); then
	echo "unpacked to $temp_path/wordpress-$ref"
else
	echo "failed"; exit 1
fi
# the pristine tree that mine gets compared against
wpref=$temp_path/wordpress-$ref

msg "Diffing codebase... "
( cd $wpref && find . -type f | sed "s|\./||g" | sort > $temp_path/files-$ref-ref ) &&
( cd $wpmine && find . -type f | sed "s|\./||g" | sort > $temp_path/files-$ref-mine ) &&
diff $temp_path/files-$ref-ref $temp_path/files-$ref-mine > $temp_path/diff
# diff exits 0 (same) or 1 (different); 2 or more means something went wrong
if [[ $? -lt 2 ]]; then
	echo "diff written to $temp_path/diff"
else
	echo "failed"; exit 1
fi

msg "Recording my file/dir permissions..."
cd $wpmine && \
find . -exec ls -ld --time-style=+%s {} \; | sed "s|\./||g" | sort -k 7 \
> $temp_path/files-$ref-mine.perms
if [[ $? == 0 ]]; then
	echo "written to $temp_path/files-$ref-mine.perms"
else
	echo "failed"; exit 1
fi

msg "Listing files added/removed... "
( cat $temp_path/diff | grep "^>" | awk '{ print $2 }' > $temp_path/only_mine ) &&
( cat $temp_path/diff | grep "^<" | awk '{ print $2 }' > $temp_path/only_ref ) &&
( cat $temp_path/only_mine > $temp_path/not_common ) &&
( cat $temp_path/only_ref >> $temp_path/not_common )
if [[ $? == 0 ]]; then
	echo "mine only files written to $temp_path/only_mine"
	echo "ref only files written to $temp_path/only_ref"
else
	echo "failed"; exit 1
fi

msg "Listing files changed... "
# start from an empty list, whether or not an old one exists
: > $temp_path/changed
for i in $(cat $temp_path/files-$ref-ref); do
	# only compare files that exist in both trees
	if ! grep -Fx $i $temp_path/not_common >/dev/null; then
		if ! diff -q $temp_path/wordpress-$ref/$i $wpmine/$i >/dev/null; then
			echo $i >> $temp_path/changed
		fi
	fi
done
if [[ $(wc -l < $temp_path/changed) == "0" ]]; then
	echo "No changes detected"
else
	echo "Files changed written to $temp_path/changed"
fi

msg "Writing individual diffs... "
[ -d $temp_path/diffs ] && rm -rf $temp_path/diffs
mkdir -p $temp_path/diffs
for i in $(cat $temp_path/changed); do
	# one diff file per changed file, named after its path with / turned into .
	e=$( echo $i | sed "s|\./||g" | tr "/" "." )
	diff -u $temp_path/wordpress-$ref/$i $wpmine/$i > $temp_path/diffs/$e
done
ds=$(ls $temp_path/diffs | wc -l)
echo "$ds diffs in $temp_path/diffs"

msg "Fetching latest version... "
[ -f $temp_path/latest.tar.gz ] && rm $temp_path/latest.tar.gz
if wget -q -P $temp_path $wordpress_baseurl/latest.tar.gz; then
	echo "downloaded to $temp_path"
else
	echo "could not fetch $wordpress_baseurl/latest.tar.gz"; exit 1
fi

msg "Unpacking latest version... "
[ -d $temp_path/wordpress-latest ] && rm -rf $temp_path/wordpress-latest
if (cd $temp_path && tar zxf latest.tar.gz && mv wordpress wordpress-latest); then
	echo "unpacked to $temp_path/wordpress-latest"
else
	echo "failed"; exit 1
fi
# the new tree that my changes get migrated into
wplatest=$temp_path/wordpress-latest

msg "Trying to patch diffs... "
# patch level = number of slashes in $wpmine
post=$(echo $wpmine | tr -d "/")
patch_level=$(( ${#wpmine} - ${#post} ))
[ -f $temp_path/patches.failed ] && rm $temp_path/patches.failed
for i in $(ls $temp_path/diffs); do
	cd $wplatest && patch -p$patch_level < $temp_path/diffs/$i
	if [[ $? != 0 ]]; then
		echo $temp_path/diffs/$i >> $temp_path/patches.failed
	fi
done

msg "Merging in my files... "
[ -f $temp_path/file-merge.failed ] && rm $temp_path/file-merge.failed
for i in $(cat $temp_path/only_mine); do
	d=$(dirname $wplatest/$i)
	mkdir -p $d
	# never overwrite a file that the new version ships itself
	if [ -e $wplatest/$i ]; then
		( echo "file already exists: $i";
		echo $i >> $temp_path/file-merge.failed )
	else
		( echo "merging: $i" && cp -a $wpmine/$i $wplatest/$i )
	fi
done

msg "Merging file/dir permissions..."
while read line; do
	f=$(echo $line | awk '{ print $7 }')
	# drop the file type character, then split rwxrwxrwx into user/group/other
	p=$(echo $line | awk '{ print $1 }'); p=${p:1:9}
	u=$(echo ${p:0:3} | tr -d '-')
	g=$(echo ${p:3:3} | tr -d '-')
	o=$(echo ${p:6:3} | tr -d '-')
	if [ -e $wplatest/$f ]; then
		echo "setting: $p $f"
		chmod u=$u $wplatest/$f
		chmod g=$g $wplatest/$f
		chmod o=$o $wplatest/$f
	fi
done < $temp_path/files-$ref-mine.perms

msg "Removing files I deleted... "
for i in $(cat $temp_path/only_ref); do
	[ -f $wplatest/$i ] && (echo "removing: $i" && rm $wplatest/$i)
done

msg "Complete"

[ -f $temp_path/patches.failed ] &&
echo "Some of my patches failed to apply, listed in $temp_path/patches.failed"

[ -f $temp_path/file-merge.failed ] &&
echo "Some of my files failed to merge, listed in $temp_path/file-merge.failed"

cat <<close

The upgraded version is in $wplatest

To install the new version you'll want to do something like this:
  mv $wpmine ${wpmine}.old
  mv $wplatest $wpmine

Afterwards you can remove the temporary dir $temp_path

If the new version provides any php upgrade scripts (to upgrade the
database), now would be a good time to run them
close

Another option would be to use Subversion and just update between stable tags, but then again I don't have that on the server and most hosts probably don't install it. But the Subversion method and this one are functionally equivalent, with the small exception that this upgrade is done offsite while the Subversion way would typically (but not necessarily) be a live upgrade.

the ultimate distro shootout

September 19th, 2007

If you've ever asked yourself which distro is right for me? then you might be interested in trying them all. :D

But I have some further suggestions:

  • {Free,Open,Net,PC-}BSD
  • Solaris, OpenSolaris, Nexenta
  • ReactOS
  • Plan9
  • Minix
  • SkyOS
  • GNU Hurd (heck why not :D )

I once tried quadruple booting Gentoo, WinXP, Solaris and OS X, but the latter two conflicted with each other, so I had to pick one of them at a time.

which programming language? all of them

September 16th, 2007

Which programming language should I learn?

It's a question that comes up a lot, especially from novices who are trying to get into programming. It's a good question, but I'm not sure if it's the *right* question. It is a bit like walking into a tool shed and asking which tool should I use? To which the answer is... all the ones you need.

Peter Norvig wrote an opinion piece about novice programming a few years ago, called Teach Yourself Programming in Ten Years. He argued that a lot of people underestimate what it takes to learn programming, judging by how many books there are on the subject of learning it quickly. And he said that people should realize it takes a long time and not be in such a hurry.

I'm past the 10 year mark myself, since I started some time in junior high. Or maybe even elementary school if you count modifying integer variables in Basic to make the gorilla throw a banana and cause a bigger explosion (any MS-DOS hackers here?), but without any interest in learning to code at the time.

At this point you might be tempted to ask... so did you? Well, I can't say that I know how, but I know enough to get by. I know enough to code the kind of things I have use for. And I never set out to accomplish more than that.

But returning to the question, whatever language you choose, it will be your first language. But it won't be the only language. There are people who have spent 10 years writing nothing but Excel macros, and never touched another programming language. But if you're asking this question, presumably you're not willing to limit yourself to Excel macros. And unless you actually spend decades writing the same kind of applications you will definitely touch multiple languages. So whichever does or doesn't come first isn't really that important.

In 10+ years I've touched quite a few languages. Aside from Basic, my very first attempts were with Pascal, and a few years later it was Delphi, writing silly desktop applications, in particular encryption software. (Along with a friend we made up a mock company called MicroProgz, and inexplicably, the website has somehow survived to this date. :proud: ) In college, they were big on Java, which I didn't particularly love from the start. But since that was the language du jour, I saw quite a bit of it; gui, databases, threading, networking. The gui was the worst by far. We also built mock applications for mock businesses for assignments, so that's the first time I encountered SQL. Around this time I was into web programming, so I started playing with PHP on my own. I also wanted to get into C++, which was sort of the "real" language you were supposed to know, or that was my impression at least. I took a course in it (through which I also had a modest introduction to C) and ever since... I've never really needed it. My college thesis introduced yet another language: Python. I took to it immediately, it was incredibly... readable. Since then I've also been introduced to Haskell at university and played a bit with Ruby on the side. And inevitably, as a linux user I've also had the opportunity to use Bash for all kinds of small things. I even took a stab at Perl a few weeks ago, and ran for the hills over the gruesome syntax.

So it's not so much which language to learn? as it is what do I want to do right now? Eventually you will probably see them all. Or many of them. And what's more, the tools of today are not the tools of tomorrow. Fortran, Cobol, Smalltalk, Algol.. these were the languages preceding my time. Perhaps I'll find some historical interest in some of them one day, but for the moment I have no use for them. In 1995 Java was a new thing, today it's perhaps the most used language out there. C# arrived in 2001, I haven't really had the opportunity to use it yet, but it's definitely making inroads. The thing is that whichever language you choose it becomes a means to accomplish something today, yes, but in the long run it becomes a stepping stone to another language you'll need in the future.

Of course, jack of all trades, master of none holds true just the same. Like a craftsman needs years to master his tools, achieving excellence takes a long time. Although I've encountered many languages (and I expect to see more in the future), I don't know any of them inside out. But... is that really a problem? There isn't a perfect language, each one teaches you to think about code in a different way. And that's really more important than any single language. Unavoidably, there are regrets that I can do X in language A, why can't I also do Y as in language B argh. But you gain something from all these different styles and methods. Your thinking becomes clearer, your code gets better.