Archive for the ‘reviews’ Category

linux audio confusing as ever

December 19th, 2008

Audio in linux, how to put it into words? How about: oss, alsa, pulseaudio, esound, arts, portaudio, jack, gstreamer, phonon. :googly: Did I miss any? Embarrassment of riches? Or just embarrassment?

I will not rehash history any more than to say that between buggy/incomplete drivers for sound cards and the wonderful world of alsa I've never been able to understand how the hell audio works beyond getting output and, sporadically, input. I am the quintessential dumb user of linux audio, even though I have tried to figure it out.

But let the past be the past. Ubuntu 8.10 Intrepid. Pulseaudio ready and everything, since 8.04. Why Ubuntu decided to plug in pulseaudio without setting up any gui controls for it is beyond me. All I know is that occasionally sound output will stop working (I get 2 seconds of output and then it stops) and then "pkill pulse" cures it.

Let's run through the list.

alsamixer

I've learned to use alsamixer to fix low level audio problems. It is reliable in that it gives me all the channels on my sound card, so I always use it to mute/unmute my microphone and to untangle the master/pcm/headphones/front speakers settings caused by mixers that only allow me to control the volume of one channel and apps that output on god knows what channel.

That's alsamixer on a Ubuntu 8.10 desktop with all updates installed as per today. Here's the same thing on my laptop.

Same distro, same version, all updates installed. And I have made no conscious choice for this to come about; all I did was install the pulseaudio gui tools. All of a sudden, alsamixer is useless, as it only has access to a single "master" channel. It would appear that pulseaudio is now sitting between the sound card and alsa, but does that sound right?
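Incidentally, you can poke at the very same channel list from code, which makes the desktop/laptop difference easy to verify. A minimal sketch, assuming the third-party pyalsaaudio package (the control names depend entirely on your card's driver):

# A sketch of inspecting and driving ALSA mixer channels from Python,
# assuming the third-party pyalsaaudio package. Control names vary by driver.
import alsaaudio

print(alsaaudio.mixers())       # e.g. ['Master', 'PCM', 'Headphone', 'Capture']

master = alsaaudio.Mixer('Master')
master.setvolume(75)            # volume as a percentage

mic = alsaaudio.Mixer('Capture')
mic.setrec(0)                   # flip the microphone capture switch off

On a box where pulseaudio has wedged itself in front of alsa, that first list is liable to collapse to a single entry, just like alsamixer shows.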

pulseaudio gui

pulseaudio is here, might as well use it, right? Ubuntu ships a bunch of pa* packages that are gui tools for pulseaudio (not installed by default). This one is called pavucontrol. It lets you set volume per audio stream. I love this, I've wanted to have it since forever.

But then it comes to output and what do I see?

A single output device. Where is my master, my pcm, all my channels? What's more bizarre is that pulseaudio says it's connecting to alsa while alsamixer says it's connecting to pulseaudio. Surely that way madness lies?

What's worse, pavucontrol is a gui app without any kind of systray integration, so much as I would like to use it as an easy-to-reach mixer, I can't.

kmix

I run kmix in my tray, it lets me set the volume by scrolling the mousewheel over the icon, which is exactly what I want. It's not a terribly impressive mixer, as it only lets me select one channel to control, whereas what I really would like is to be able to lock several output channels into one, otherwise I'm never really in control. So the best I can do is set the volume to max on master, and tell kmix to show me pcm. That seems to work well enough.

It appears I have two identical channels, but the leftmost is misnamed, that's actually Master.

Now, let's see. alsamixer doesn't show any of these channels, because it delegates to pulseaudio. So kmix connects to... god knows what, but at least I can access these channels.

gnome-volume-control

Gnome's volume manager is an interesting one. It has a combo box for devices. I'm showing here all the devices related to output and how channels are connected.

The good news is that the channels in the default mixer correspond to those in kmix (sweet sanity!). But then there is that bottom mixer with a channel of the same name as the single output channel in pavucontrol. And yes, they are the same (and the same as the single output in alsamixer).

But this channel appears to be some kind of composite output channel that has no bearing on any of the output channels in the top mixer (or in kmix). (I.e. while it might appear to be a composite pipe that aggregates the flow of all these standard channels, it's actually a pipe that sits after all of them. If pcm volume is 0, this channel won't receive any input, and will be mute no matter its volume setting.)

So I can neither affect the settings of master/pcm/etc with this channel, nor vice versa. That means if I want to use this single channel as my master volume control, I have to make sure that absolutely no application can mess with the volume for any of the standard master/pcm/etc channels. Good luck with that.

A sane spec

I know this sounds crazy, but what's it gonna take to get a single server+mixer applet that

  • captures every possible audio stream, no matter the audio server/api it's on,
  • prevents applications from messing with master volume controls (e.g. mplayer),
  • has a single virtual channel controlling all the underlying output channels, so that I can have a single master volume slider,
  • has per stream/app volume controls pulseaudio style,
  • integrates well into the systray,
  • integrates with laptop hotkeys

?

UPDATE: Ubuntu users can now vote on a proposal to unify sound systems on ubuntu brainstorm. Hopefully this idea and the related ones can gain some traction with Canonical to attack the problem.

OpenID deserves to die

May 27th, 2008

Here's my perspective on it. We all have ideas, some good and some bad. Now it's understandable that people who have invested themselves in a bad idea, especially if they thought it was good, are reluctant to walk away from it. It's painful to have to realize that. But that's a bit like saying we have to maintain the myth of Santa Claus because, well, so many kids believe in him that we can't let them down. Bad ideas deserve to die for the good of everyone.

The first thing a good idea must have is a real problem to solve. OpenID does very well here. The point of OpenID is to solve our common problem of the internet age: many websites, many accounts, many usernames and passwords. This is probably why OpenID still appears to some people as being a good idea.

Here's how they do it. Instead of keeping track of your accounts on all the sites you're a member of, just let one site keep all your account records (sound ominous yet? it did to me). Now, whenever you want to log in to one of your sites, instead of using your username/password for that site, you use your OpenID login, which looks like this: http://username.myid.net. This url is effectively your OpenID provider, i.e. the site you use to keep track of all your accounts. So now the site you're logging into sends you to your provider, where you log in with a username/password belonging to the account on the provider site, and that logs you into the site you were visiting. So in other words, your account on the provider is the gatekeeper to all your accounts. Sounds simple, right?
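To make the moving parts concrete, here is a rough sketch of the redirect dance from the consuming site's side. This is illustrative only, not any real OpenID library's API; endpoint discovery, nonces and signature checking are all glossed over, and the URLs are made up.

# A rough sketch of the OpenID 1.x login dance, seen from the consuming site.
# Illustrative only: not a real library API, and the URLs are invented.
from urllib.parse import urlencode

CONSUMER_RETURN = "http://site-you-visit.example/openid/finish"

def begin_login(claimed_id, provider_endpoint):
    """Step 1: redirect the user to their provider with a login request."""
    query = urlencode({
        "openid.mode": "checkid_setup",   # interactive authentication request
        "openid.identity": claimed_id,    # e.g. http://username.myid.net
        "openid.return_to": CONSUMER_RETURN,
    })
    return provider_endpoint + "?" + query

# Step 2: the provider authenticates the user with the username/password it
# holds, then redirects back to return_to with a signed assertion
# (openid.mode=id_res). Step 3: the consuming site verifies the signature
# with the provider and, if it checks out, considers the user logged in.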

I remember when I first heard about this idea years ago. The first concern I had was that in order for this to work, I need a provider to keep track of all my accounts. So I asked myself the question: whom do I trust to do this for me? The answer came back: myself. I don't know about you, but the idea of some third party storing all my logins doesn't make me feel warm and cuddly. As it happens, the open in OpenID means you can choose any provider you want, including yourself. You just set up some php scripts and voila, you can use http://mysite.com as your provider. So basically, instead of storing your accounts in some "account manager" program on your computer, you do the same thing on your server. This is where the concept of OpenID died for me. I don't want to depend on my own OpenID provider in order to use other sites; I don't want my ability to log in to some other site to be contingent on my own site being available and working properly at all times (which it isn't, I have a little downtime like everyone else).

If you don't want the hassle of being your own provider, you can pick a provider from a list. This is not an attractive fallback option, because now your account on the provider is your key to all your other accounts. If I have an account on some site and I forget my credentials, big deal, I only lose that one account. But if I lose my credentials on the provider, I lose everything.

In theory, OpenID tries to improve your overall security. The hassle of keeping track of accounts is known to us all, and we get around the problem by reusing the same (or similar) credentials on a lot of sites. This is obviously bad for security, because if someone gets your password to one site, they can access all your other accounts that use this password. So security people will always recommend that you use distinct credentials for every account. Suppose you do this, and you use OpenID to alleviate the record keeping. Now, OpenID actually works against you. Your account on the OpenID provider is the key to everything. With a different password on every site, you're that much less likely to remember what it was; therefore your account on the provider is proportionally more valuable.

There is a strange irony at play here. Supposedly, the more accounts you manage with OpenID the more useful it is. But on the other hand, the more accounts you manage with it, the more you depend on it, and the more you make it the one gateway to all your online identities for a potential attacker or for abuse by a dishonest or incompetent provider.

Most importantly, however, OpenID's solution to the login problem isn't a very clever solution at all. Typing http://username.myid.net is not a big improvement over a username/password form. My browser already gives me the option to log in without typing anything.

Those are my reasons why OpenID is a bad idea and should have died years ago. If you want more, Stefan Brands has an exhaustive laundry list of problems with OpenID.

clocking jruby1.1

April 21st, 2008

Did you hear the exciting news? JRuby 1.1 is out! For real, you can call your grandma with the great news. :party: Wow, that was quick.

Okay, so the big new thing in JRuby is a bytecode compiler. As you may know, up to 1.0 it was just a Ruby interpreter in Java. Now you can actually compile Ruby modules to Java classes and no one will know the difference, very devious. :cool: Sounds like Robin Hood in a way, doesn't it?

The JRuby guys are claiming that this makes JRuby on par with "regular Ruby" on performance, if not better. Hmm. Just to be on the safe side, what size shoes do you wear? Oh ouch, those are going to be tricky to fit in your mouth. :/ And Freud will say you're stuck in the oral stage. Too much? Okay.

So here is my completely unvetted, dirty, real world test. No laboratory conditions here, you're in the ghetto. First we need something *to* test. I don't have a great deal of Ruby code at my disposal, but this should do the trick. How does scanning the raw filesystem for urls sound? The old harvest script actually does a half decent job of turning up a bunch of findings.
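harvest.rb itself isn't reproduced here, but the gist is easy to sketch. A hypothetical Python rendering of the same trick (the regex is invented, and matches that straddle a chunk boundary are silently lost):

# Hypothetical stand-in for what a harvest-style script does: read a raw
# block device from stdin and print out anything URL-shaped.
import re
import sys

URL = re.compile(rb'https?://[0-9A-Za-z.-]+(?:/[^\s"\'<>]*)?')

for chunk in iter(lambda: sys.stdin.buffer.read(1 << 20), b''):
    for m in URL.finditer(chunk):
        sys.stdout.buffer.write(m.group() + b'\n')

You'd feed it the same way as the timed runs below, e.g. sudo cat /dev/sda5 | python3 sketch.py > /tmp/fsurls.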

Now introducing the contenders. First up, his name is JRuby, you know him from occasional mentions on obscure blogs and the programming reddit past the top 500 entries. He promises to free all Java slaves by giving away free Rubies to everyone!

Aaand the incumbent, the famous... Ruby! You know him, your parents know him, every family would adopt him as their own child if they could. He's the destroyer of kingdoms and the creator of empires, he's bigger than Moses himself!

Our two drivers will be racing across a hostile territory. Your track is a 25gb ext3 live file system. During this time, I can promise you that only Firefox is likely to be writing new urls to disk, but I could be lying eheheh. Due to the unpredictable nature of this rally track, regulations allow only one racer at a time, but you will be clocked.

First up is the new kid on the block Jay....Ruby. The Ruby code will not be compiled before execution; we'll let the just-in-time compiler do its thing.

$ time ( sudo cat /dev/sda5 | bin/jruby harvest.rb --url > /tmp/fsurls.jruby )
real 39m26.547s
user 37m19.072s
sys 1m28.406s

Not too shabby for a first run, but since this is a brand new venue, we have no frame of reference yet. Let's see how Ruby will do here.

$ time ( sudo cat /dev/sda5 | harvest.rb --url > /tmp/fsurls.ruby )
real 78m42.186s
user 62m12.537s
sys 2m18.721s

Well, look at that! The new kid is pretty slick, isn't he? Sure is giving the old man a run for his money. Let's see how they answered the questions.

$ lh
-rw-r--r-- 1 alex alex 86M 2008-04-21 18:29 fsurls.jruby
-rw-r--r-- 1 alex alex 8.6G 2008-04-21 20:58 fsurls.ruby

Yowza! No less than a hundred times more matches with Ruby. What is going on here? Did Jay just race to the finish line, dropping the vast majority of his parcels? Or did father Ruby see double and triple and quadruple, ending up with lots and lots of duplicates? Well, we don't really *know* how many urls exist in those 25gb of data, but it seems a little bit suspect that there would be in excess of 8gb of them.

One way or the other, it's pretty clear that the regular expression semantics are not entirely identical. In fact, you might be sweating a little right now if your code uses them heavily.

UPDATE: Squashing duplicates in both files actually produces two files of very similar size (13mb), in which the disparity of unique entries is only a very reasonable 4% (considering the file system was being written to in the process). The question still remains: how did Ruby produce 8gb of output?
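The squashing itself can be as simple as something in this spirit (a hypothetical sketch; the set fits in memory for the 86M file, while the 8.6G one is better streamed through sort -u):

# Hypothetical duplicate-squashing: count the unique lines in a dump.
# Fine for the 86M file; the 8.6G one would not fit in memory this way.
def unique_lines(path):
    seen = set()
    with open(path, 'rb') as f:
        seen.update(f)    # iterating a file yields one line at a time
    return seen

print(len(unique_lines('/tmp/fsurls.jruby')))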

evolving towards git

December 4th, 2007

I remember the first time I heard about version management. It was a bit like discovering that there is water. The idea that something so basic could actually not exist seemed altogether astounding afterwards. This was at a time when I hadn't done a lot of coding yet, so I hadn't suffered the pains of sharing code *that* much, but it was already relatable enough to understand the implications of version management. Emailing those zip files of code had been plenty annoying.

This happened just before we embarked upon a 6-month journey in college to write an inventory management application in java. There were four of us. We had done java before, but this was the first "big" application that also employed our newly acquired knowledge about databases. God only knows why, but they actually ran (and probably still do) an Oracle server at school for these projects. I don't know how much an academic license runs on that, but it's in the thousands of dollars no doubt. As the initiative taker of my group, I persuaded (not that it took much effort) the guys to forget about the Oracle server (which didn't come with any developer tools, just the awful command line interface) and use PostgreSQL (we rejected MySQL just to be on the safe side, as it lacked support for a bunch of things like views and foreign keys). I also introduced another deviation from the standard practice: CVS. I would be running both off of my tried and tested Pentium home server. :proud: We were the only group that used version management, which baffled me (why was it not offered as a service at the department?). I heard that the year after that they started offering CVS for projects. Incidentally, when you deviate from the path, you do expect to get some academic credit for the extra effort, obviously. ;)

So that was 6 months with CVS, and it worked quite well. We didn't do anything complicated with it, didn't know about tags or branches, just ran a linear development on it and that was it. And it was helpful (if not efficient) for tracking binary files (Word documents, bleh). But it had some annoying quirks. Not that we were all that concerned with tracking history back then, it was just about getting it finished and then obviously it would never be used, as school projects go. But the lack of support for renaming and moving files in CVS was silly.

It's a bit funny, actually. CVS has been a standard for the longest time, and people have put up with its problems, until recently when a version management boom began. Why is that? I have a hunch that the relative calm in version management was kept intact by a predominantly corporate culture (and the corporates are obviously super conservative). But once enough people had gotten involved in open source projects, the need for better tools set things in motion. One of the first "new generation" systems was Subversion, the replacement for CVS. Subversion was adopted slowly, but quite steadily. Currently it's probably the "standard" for version management. I registered galleryforge with Berlios precisely because they offered svn services. Sourceforge also has it now, and people around the net have started talking in terms of svn a lot. My current school also uses it.

But with the momentum of version management systems in play, a lot of other ideas have surfaced besides those captured in Subversion. Arch and Darcs are two that employ the distributed paradigm. I don't think they have had much success though, and perhaps looking back they will have played the role of stepping stones rather than end products. A new (a third, if you will) generation of systems has appeared quite recently. Monotone is a newer system with its own network synchronization protocol. Mercurial seems to have enough support to replace Subversion/CVS down the line (not that it couldn't happen today, but people hate giving up something they're used to, so these things drag out). And there is Bazaar, which seems to discard the goal of being the best system and just aim to be easy and pleasant for small projects.

And somewhere in that mix we find Git, except that it has a much higher potential for success than other new systems, because it was launched into the biggest (or noisiest, at least) open source project and made to carry the heaviest loads right from the beginning, which is a good way to convince people that your product is robust. As such, it seems to me that git has had the quickest adoption of any system. It was launched in 2005 and it has already become one of the household names around the interwebs. So I thought git was worth looking at. At this rate I may very well be using it on a project someday.

I was also encouraged by regular people saying it worked for them. A version management system for kernel developers is nice, but that doesn't necessarily make it right for everyone else. But here were people with one-man projects liking it, good sign. In fact, one of the things I used to read about git was "I like the simplicity".

What I discovered was that my dictionary was out of date, because it didn't carry that particular meaning of the word "simple". Git is not simple, it takes getting used to. It's one of those things you try and you don't understand, then you come back and try again and you make a little more progress and so on. I decided to open a git repo for undvd, both because I need to track the changes and so I can use git and get a feel for it.

The special thing about git is that it has well developed capabilities for interoperating with other systems. In other words, git doesn't assume it will wipe out everything else, it accepts the reality that some people will continue using whatever it is they're using (even CVS) and git can deal with that. Obviously, predicting the future is tricky, but I get the feeling that Subversion has hit that comfort spot where most people are satisfied with what it's doing and they really would need a lot of persuasion to learn something new. In other words, I expect Subversion to be around for quite a while. And therefore, git's svn tools may come in very handy. I haven't really looked at that part yet, but it's on my todo list.

So what is the future? I would say Subversion, Git, and maybe Mercurial. Version management systems are like programming languages - the more of them you use the easier it is to learn a new one. But switching is hard. When I'm thinking of checking something in version management I still begin typing svn st.. before I realize I'm using git.

what can you do with django?

December 2nd, 2007

So I'm reading about Django, which is a really groovy web framework for Python. As the Django people themselves say, there have been *a lot* of web frameworks for Python alone, and it's clear that a lot of lessons have been learned along the way. Personally I've only looked briefly at Zope before, but I was a bit intimidated by how complicated it seemed to be. I also once inherited a project that used Webware for Python, which was pretty nice to work with, but radically different from what I'd known because pages are constructed using a sort of abstract html notation (basically every html tag is a function etc).

There is a lot to like about Django. The MVC pattern is executed quite elegantly (or as they call it in Djangoland, the MTV (model - template - view) pattern). Templates are written in what resembles a server side language, but it's actually a specialized templating language: Python-like, but both restricted and enhanced. Templates can inherit from each other to fill in placeholders in a cascading manner, a very interesting design choice.
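Roughly, the inheritance looks like this (sketched as Python strings purely for illustration; in a real project these live in separate .html template files, and all the names are made up):

# A sketch of Django template inheritance. In a real project these are
# .html files on disk; Python strings here are just for illustration.
BASE = """
<html><body>
  <h1>{% block title %}my site{% endblock %}</h1>
  {% block content %}{% endblock %}
</body></html>
"""

# The child template names its parent and fills in the placeholders.
CHILD = """
{% extends "base.html" %}
{% block title %}reviews{% endblock %}
{% block content %}<p>the actual page goes here</p>{% endblock %}
"""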

Then there is a nice separation between objects and persistence. Data is declared in terms of Python classes (called "the model") which are then translated somehow on-the-fly to sql-du-jour depending on your database backend. You can even run a Python shell to interact with your model to de facto edit your database.
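For instance, something in this spirit (the names are invented, and the exact field arguments vary between Django versions; this is only a sketch):

# A sketch of declaring data as a Django model. Names are invented and
# field arguments vary a bit between Django versions.
from django.db import models

class Bookmark(models.Model):
    url = models.URLField()
    title = models.CharField(max_length=200)
    added = models.DateTimeField(auto_now_add=True)

# The shell-based "de facto database editing" then looks roughly like:
#   Bookmark.objects.create(url="http://example.com", title="example")
#   Bookmark.objects.filter(title__contains="exam")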

The model and the template mechanism are both tied together with "views", which is the application code you write for your webapp. A cool thing here is that urls are decoupled from execution code, and the two things are mapped via regular expressions, so you have something like a mod_rewrite declaration (but nicer) that dispatches to functions.
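Spelled out, that's just a list of (regex, view) mappings, something like this (module paths made up; this is the patterns() style of the pre-1.0 era under review):

# A sketch of Django's regex-based url dispatch, pre-1.0 style.
# Module paths are made up.
from django.conf.urls.defaults import patterns

urlpatterns = patterns('',
    (r'^bookmarks/$',       'myapp.views.bookmark_list'),
    (r'^bookmarks/(\d+)/$', 'myapp.views.bookmark_detail'),  # id captured by the regex
)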

All the same, if I were to work with Django I have a couple of concerns up front:

  1. Templates are written in html with inline Python-like expressions. This makes them both simple and transparent, because the html code that will be served is written by hand. However, considering the present situation with html and browser support, it would be nice if Django could enforce/validate html. But this would probably imply an abstraction over html (to prevent raw html being written) and a layer of code generation for html (which may or may not be easy/possible to cache efficiently).
  2. And that brings me to the second point. I have seen no mention of CSS anywhere. It appears that Django leaves the fairly heavy burden of supplying styles to the designer, without any intelligent mechanisms to help.
  3. Data models are represented in Python code, which is very convenient and provides a much sought after quick access to the contents of your database. Django will instantiate tables for you based on a model, but only for creation. It doesn't seem to enforce or update the correspondence between the Python model and the database schema. The Django book has a mention of strategies for keeping things in check, but this particular section is missing.

There is no doubt Django embodies a lot of what we've learnt about building web applications, but currently I would say it seems a bit incomplete (also pre v1.0 still), if terribly elegant in some parts.

Another question concerns performance. Ruby on Rails has been known to suffer when pitted against the effective but ugly php. Does Django fall in the same bracket or does it have a lower overhead? In general Python is an older and more refined environment, so one might expect Django to scale better.