Archive for 2008

of codecs and containers

September 8th, 2008

I have been very skeptical about adding options for other codecs in undvd, purely because of the test burden. With a single combination of container and pair of audio/video codecs I can be reasonably confident that I've done enough manual testing (and judging video quality doesn't trivially lend itself to automated testing, sadly) to account for most potential problems.

But at the end of the day it's a question of priorities, and having scratched all the important technical itches by now, if anything this is the right time for it. I got some user feedback recently that set me onto this path. The user was having trouble playing the files encoded in the classical avi+h264+mp3 format on other platforms, and that's when I asked myself how important is it really to have a single format? As long as the default still works well, what's the harm in offering a little customization?

Testing is a huge problem, which is why this new feature is considered experimental. The most common problem seems to be bad a/v sync. There is just no way to account for all the possible combinations of codecs and containers, or to maintain an up-to-date document for this as things evolve. So the burden of testing is squarely on the user here (which is quite unfortunate).
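To put a rough number on the combination problem, here is a minimal Python sketch. The lists are illustrative assumptions built from the formats mentioned in this post, not the full set undvd supports, and none of this is undvd's own code.

```python
from itertools import product

# Illustrative lists only (assumption): a handful of the formats
# mentioned in this post, not everything undvd can produce.
containers = ["avi", "mp4", "flv"]
video_codecs = ["h264", "copy"]
audio_codecs = ["mp3", "aac", "copy"]

# Every (container, video codec, audio codec) triple is a separate
# case that would need a manual playback and a/v sync check.
combinations = list(product(containers, video_codecs, audio_codecs))

for container, video, audio in combinations:
    print(f"{container}+{video}+{audio}")

print(len(combinations), "combinations to test by hand")
```

Even with lists this short the count multiplies quickly, and each added container or codec multiplies it again.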

The new functionality is available in undvd 0.5 and up. Here's a shot of the new goodness. All these files were encoded from the same 22 minute dvd title, ripped with different containers (reflected in the filenames). The audio codec is mostly the same in all cases (mad = mp3), except for 1.mp4 (faad = aac). The video codec is also mostly the same (h264 = avc1), except for 1.flv. The only setting varied here is the container; all the other settings are defaults. You can also see that some containers are more wasteful than others (given the same a/v streams), but not by a huge amount. (The audio bitrates shown are actually misleading; mplayer seems to report the lowest bitrate in a vbr stream.)

This demo is by no means exhaustive of the full collection of codecs that can be used; for that, see the user guide. There is also an option to use the copy codec, which just copies the audio/video stream as is.

coast to coast

September 7th, 2008

So you wanna drive coast to coast huh? That's what people say anyway, "oh how romantic, all those small towns, the landscapes".

Some people have a dream of driving in the US. I guess they relish an exciting drive across the Bible Belt and then desert country.

According to Google, you can do New York-LA in a day and 17 hours, provided you pick the shortest (and fastest?) roads and drive non-stop.

What about coast to coast somewhere closer to home? Every summer a bunch of tourists pack into their cars and drive up to the north of Norway for a relaxing road trip on our narrow and windy roads.

The scale here is 1:2.


I had to put in two extra pins on the map, or Google would send me into Sweden. But there's your Lindesnes-Nordkapp connection.

It turns out the East-West coast span of the US isn't even twice as wide as our North-South run, and you can do it in roughly the same amount of time.

So basically if you've done North-South in Norway you've done more than half the distance across the US. Doesn't sound that impressive at all anymore, does it? :D

P.S. In the Netherlands you can do Maastricht-Groningen in 3 hours, it's like a paper route. :howler:

a coder's bookshelf

August 30th, 2008

What is this obsession people have with books? They put them in their houses - like they're trophies. What do you need it for after you read it?
- Jerry Seinfeld

I think it's because reading a book takes a lot of effort, and we want to get credit for it. Reading a big book takes considerably longer than anything else you might do for "fun". And then you can point to it and say, "look, this is what I know".

I have a bunch of computer books, a lot of them from college, that I'll probably never toss out even though I'm unlikely to ever re-read them. Meanwhile, I can do what a lot of people are doing and put them on display. "Look, I must be really clever, I have all these books!"

Frankly that's all they're good for after I'm done with them.

tahple or twople?

August 21st, 2008

The word tuple is used quite a lot in computing. That's what database people call a row in a table. It's also what several programming languages call a structure where the fields are ordered but not named.
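For the programming language sense, a minimal Python sketch shows what "ordered but not named" means in practice. The values in the row are just an illustration, not data from any real table.

```python
# A tuple is like a database row: fields have a fixed order but no names.
# Illustrative values only.
row = ("undvd", "0.5", 2008)

# Fields are reached by position...
print(row[0])

# ...or unpacked in order into variables that we name ourselves.
name, version, year = row
print(name, version, year)
```

The names `name`, `version` and `year` exist only on the left-hand side of the unpacking; the tuple itself knows nothing about them.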

It seems to be one of those words that is hard to translate, so other languages often use the English word. And yet there is some confusion about pronunciation. Some say tahple, some say twople. As far as I know there is no dispute about the spelling, it's tuple. So where do you get twople from that?

I think having a lot of exceptions to the obvious pronunciation is bad for a language. There are words that are fancy or interesting enough to perhaps deserve it, but tuple isn't one of them. So I'm going to keep saying tahple.

Beautiful code

August 16th, 2008

I don't remember who mentioned this book or where they did it. I seem to remember it being mentioned by several people. But for one reason or another I decided to order it and I've eventually made my way to it.

"Beautiful code" is a compilation of 30-something case studies, each chapter written by a different contributing author, describing code or systems they found beautiful. I suppose it is subjective how wide your definition of "beautiful code" is, but some authors describe architectures rather than code, which isn't quite what I'd expect. To me "code" is generally something that happens at the statement/function level, otherwise you call it "design" or "architecture".

The case studies are extremely diverse, you have everything from kernel code to high level systems. As I'm not a kernel hacker I have to say I didn't understand much of the chapter on Linux drivers, but then I get the feeling I'll never grok c types without a mentor or something, the Hungarian notation style variable naming tells me little about their meaning. There's a FreeBSD chapter on filesystem layering, and that's fairly straightforward, then there's a Solaris chapter on thread handling which is interesting, but the code unfortunately is less instructive to me than is the prose (the author's fascination with sewage is also mildly disturbing).

You'll find the code examples in a variety of languages, some familiar (c++, haskell, java, python, ruby), some not directly familiar but partly or mostly understandable (c, c#, javascript, perl, scheme), and some foreign (elisp, fortran, matlab, visual basic). There are two chapters showing implementations of python datastructures (in c) that I found quite interesting, one from the standard library (dictionaries), the other from NumPy (n-dimensional arrays).

It turns out this book is more interesting than I expected. Some of the chapters I'm just not in a position to understand, but many of them are well written and interesting to delve into. I successfully killed 4-5 hours of time in flight and at the airport with it, which is better mileage than I get out of books on tape. What I really like about it is that it's a book for hackers in the trade -- it's a book that shows you stuff, not one that tries to teach you. Which means you get right to the point without the obligation to introduce and prepare you for what you're about to read. It's a lot more like reading a blog.

So then there's the question, is the code that these supposed masters of the trade write more beautiful than yours and mine? Well, not necessarily. In some of the examples presented it's the design that's supposed to make it beautiful, not the code itself. And try as you might to imagine how an expert will apply untold levels of voodoo to problems you and I would love to solve better, most of the time they don't. I guess there isn't all that much hidden magic out there.