numerodix blog


clocking jruby1.1

April 21st, 2008

Did you hear the exciting news? JRuby 1.1 is out! For real, you can call your grandma with the great news. :party: Wow, that was quick.

Okay, so the big new thing in JRuby is a bytecode compiler. As you may know, up to 1.0 it was just a Ruby interpreter in Java. Now you can actually compile Ruby modules to Java classes and no one will know the difference, very devious. :cool: Sounds like Robin Hood in a way, doesn't it?

The JRuby guys are claiming that this makes JRuby on par with "regular Ruby" on performance, if not better. Hmm. Just to be on the safe side, what size shoes do you wear? Oh ouch, those are going to be tricky to fit in your mouth. :/ And Freud will say you're stuck in the oral stage. Too much? Okay.

So here is my completely unvetted, dirty, real world test. No laboratory conditions here, you're in the ghetto. First we need something *to* test. I don't have a great deal of Ruby code at my disposal, but this should do the trick. How does scanning the raw filesystem for urls sound? The old harvest script actually does a half decent job of turning up a bunch of findings.

Now introducing the contenders. First up, his name is JRuby, you know him from occasional mentions on obscure blogs and the programming reddit past the top 500 entries. He promises to free all Java slaves by giving away free Rubies to everyone!

Aaand the incumbent, the famous... Ruby! You know him, your parents know him, every family would adopt him as their own child if they could. He's the destroyer of kingdoms and the creator of empires, he's bigger than Moses himself!

Our two drivers will be racing across a hostile territory. Your track is a 25gb ext3 live file system. During this time, I can promise you that only Firefox is likely to be writing new urls to disk, but I could be lying eheheh. Due to the unpredictable nature of this rally track, regulations allow only one racer at a time, but you will be clocked.

First up is the new kid on the block Jay....Ruby. The Ruby code will not be compiled before execution, we'll let the just-in-time compiler do its thing.

$ time ( sudo cat /dev/sda5 | bin/jruby harvest.rb --url > /tmp/fsurls.jruby )
real 39m26.547s
user 37m19.072s
sys 1m28.406s

Not too shabby for a first run, but since this is a brand new venue, we have no frame of reference yet. Let's see how Ruby will do here.

$ time ( sudo cat /dev/sda5 | harvest.rb --url > /tmp/fsurls.ruby )
real 78m42.186s
user 62m12.537s
sys 2m18.721s

Well, look at that! The new kid is pretty slick, isn't he? Sure is giving the old man a run for his money. Let's see how they answered the questions.

$ lh
-rw-r--r-- 1 alex alex 86M 2008-04-21 18:29 fsurls.jruby
-rw-r--r-- 1 alex alex 8.6G 2008-04-21 20:58 fsurls.ruby

Yowza! No less than a hundred times more matches with Ruby. What is going on here? Did Jay just race to the finish line, dropping the vast majority of his parcels? Or did father Ruby see double and triple and quadruple, ending up with lots and lots of duplicates? Well, we don't really *know* how many urls exist in those 25gb of data, but it seems a little bit suspect that there would be in excess of 8gb of them.

One way or the other, it's pretty clear that the regular expression semantics are not entirely identical. In fact, you might be sweating a little right now if your code uses them heavily.

UPDATE: Squashing duplicates in both files actually produces two files of very similar size (13mb), in which the disparity of unique entries is only a very reasonable 4% (considering the file system was being written to in the process). The question still remains how Ruby produced 8gb of output.
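The duplicate check in the update can be sketched roughly like this in Ruby. This is not a record of the commands actually used; the helper names unique_urls and disparity are made up for illustration, and the sketch assumes the output files hold one url per line. Disparity is measured here as the share of the combined unique set that shows up in only one of the two outputs, which is just one of several reasonable definitions.

```ruby
require 'set'

# Collect the distinct, non-empty lines from any enumerable of lines
# (an open File, $stdin, or a plain Array of strings).
def unique_urls(lines)
  lines.each_with_object(Set.new) do |line, seen|
    url = line.strip
    seen << url unless url.empty?
  end
end

# Fraction of the combined unique set that appears in only one of the two sets.
def disparity(a, b)
  union = a | b
  return 0.0 if union.empty?
  (a ^ b).size.to_f / union.size
end

# Hypothetical usage with the two output files from the runs above:
#   jruby_urls = File.open('/tmp/fsurls.jruby', 'rb') { |f| unique_urls(f) }
#   ruby_urls  = File.open('/tmp/fsurls.ruby', 'rb')  { |f| unique_urls(f) }
#   puts disparity(jruby_urls, ruby_urls)
```

Holding every unique url in memory is fine when the deduplicated set is small (13mb in this case), even if the raw file is 8gb, since the file is read line by line.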

Posted in en, reviews | 5 Comments »

what the heck is a closure?

April 20th, 2008

That's a question that's been bugging me for months now. It's so vexing to try to find something out and not get it. All the more so when you look it up in a couple of different places and the answers don't seem to have much to do with each other. Obviously, once you have the big picture, all those answers intersect in a meaningful place, but while you're still hunting for it, that's not helpful at all.

I put this question to a wizard and the answer was (not an exact quote):

A function whose free variables have been bound.

Don't you love to get a definition in terms of other terms you're not particularly comfortable with? Just like a math textbook. This answer confused me, because I couldn't think of a function I had seen where that wasn't true, so I thought I must be missing something. The Python answer is very simple:

A nested function.

It's sad, but one good answer is enough. When you can't get that, sometimes you end up stacking up several unclear answers and hoping you can piece it all together. And that can very well fail.

I read a definition today that finally made it clear to me. It's not the simplest and far from the most intuitive description. In fact, it too reads like a math textbook. But it's simply what I needed to hear in words that would speak to me.

A lexical closure, often referred to just as a closure, is a function that can refer to and alter the values of bindings established by binding forms that textually include the function definition.

I read it about 3 times, forwards and backwards, carefully making sure that as I was lining up all the pieces in my mind, they were all in agreement with each other. And once I verified that, and double checked it, I felt so relieved. Finally!

I can't follow the Common Lisp example that follows on that page, but scroll down and you find a piece of code that is much simpler.

(define (foo x)
	(define (bar y)
		(+ x y))
	bar)

((foo 1) 5) ; => 6
((foo 2) 5) ; => 7

What's going on here? First there is a function being defined. Its name is foo and it takes a parameter x. Now, once we enter the body of this function foo, straight away we have another function definition - a nested function. This inner function is called bar and takes a parameter y. Then comes the body of the function bar, which says "add variables x and y". And then? Follow the indentation (or the parentheses). We have now exited the function definition of bar and we're back in the body of foo, which says "the value bar", so that's the return value of foo: the function bar.

In this example, bar is the closure. Just for a second, look back at how bar is defined in isolation, don't look at the other code. It adds two variables: y, which is the formal parameter to bar, and x. How does x receive its value? It doesn't. Not inside of bar! But if you look at foo in its entirety, you see that x is the formal parameter to foo. Aha! So the value of x, which is set inside of foo, carries through to the inner function bar.

Can we square this code with the answers quoted earlier? Let's try.

A function whose free variables have been bound. - A function, in this case bar. Free variables, in this case x. Bound, in this case defined as the formal parameter x to the function foo.

A nested function. - The function bar.

A lexical closure, often referred to just as a closure, is a function that can refer to and alter the values of bindings established by binding forms that textually include the function definition. - A function, in this case bar. That can refer to and alter, in this case bar refers to the variable x (it never alters it). Values of bindings, in this case the value of the bound variable x. Established by binding forms, in this case the definition of foo, which binds x as its parameter. That textually include the function definition, in this case foo includes the function definition of bar.

So yes, they all make sense. If you understand what it's all about. :/

Let's return to the code example. We now call the function foo with argument 1. As we enter foo, x is bound to 1. We now define the function bar and return it, because that is the return value of foo. So now we have the function bar, which takes one argument. We give it the argument 5. As we enter bar, y is bound to 5. And x? Is it an undefined variable, since it's not defined inside bar? No, it's bound *from before*, from when foo was called. So now we add x and y, and get 6.

In the second call, we call foo with a different argument, thus x inside of bar receives a different value, and once the call to bar is made, this is reflected in the return value.
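For comparison, here is roughly the same thing in Ruby. Take it as a sketch rather than an exact translation: a nested def in Ruby does not capture the outer method's local variables, so a lambda stands in for the inner function bar.

```ruby
# foo takes x and returns a function of y; the lambda plays the role of bar.
def foo(x)
  # x is not a parameter of the lambda; it is captured from foo's scope.
  ->(y) { x + y }
end

add_one = foo(1)   # a version of bar with x bound to 1
add_two = foo(2)   # a separate version of bar with x bound to 2

puts add_one.call(5)  # prints 6
puts add_two.call(5)  # prints 7
puts foo(1).(5)       # call it right away, like ((foo 1) 5); prints 6
```

The two lambdas are independent: calling foo a second time creates a fresh binding for x and does not disturb the first one.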

Well, that was easy. And to think I had to wait so long to clarify such a simple idiom. So what is all the noise about anyway? Think of it as a way to split up the assignment of variables. Suppose you don't want to assign x and y at the same time, because y is a "more dynamic" variable whose value will be determined later. Meanwhile, x is a variable you can assign early, because you know it's not going to need to be changed.

So each time you call foo, you get a version of bar that has a value of x already set. In fact, from this point on, for as long as you use this version of bar, you can think of x as a constant that has the value that it was assigned when foo was called. You can now give this version of bar to someone and they can use it by passing in any value for y that they want. But x is already determined and can't be changed.
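In the foo example nothing ever reassigns x, which is why it behaves like a constant. The definition quoted earlier also says a closure can alter the bindings it sees, though. Here is a minimal Ruby sketch of that half, with make_counter as a made-up name for illustration:

```ruby
# Returns a lambda that both reads and updates the captured variable count.
def make_counter(start)
  count = start
  -> { count += 1 }   # each call changes the binding the lambda closed over
end

a = make_counter(0)
b = make_counter(10)

a.call        # count inside a is now 1
a.call        # count inside a is now 2
puts a.call   # prints 3
puts b.call   # prints 11; b has its own count, untouched by the calls to a
```

Whoever holds the lambda still can't reach count directly. The only way to change it is through the closure itself.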

Posted in dysfunctional, en | 2 Comments »

when faced with ethical ickiness

April 16th, 2008

And by ickiness I mean a question that you don't have the answer to, but you nevertheless have a gut feeling one way or the other. For instance: should gay couples be allowed to adopt? Another example would be: should it be permitted to clone humans? Or how about the old favorite: should sex play in kindergarten be encouraged (which I have absolutely no answer to)?

These are questions which have no prior answer, because we've only just been faced with them for the first time (or for that matter, only now been willing to consider them). There are many questions like this which have no answer (yet), but which nevertheless raise a certain instinctive feeling in us that makes us prone to lean to one side. This icky feeling is a fear within us that "something bad will happen" if this new thing is allowed to happen, without knowing what we really are scared of.

Many such questions have received answers in the past. For example the question of whether a brother and sister should be allowed to marry has been settled on the basis that children of such parents are born with serious deformities. Therefore we have a rational answer, not merely a fear.

What not to do: alternative A

Do not take your unarticulated fear to draw the conclusion that your instinct must be correct, and therefore suggest banning or condemning the practice. This is a purely emotional response with no rational justification.

Do not further aim to strengthen your argument by associating yourself with a large group of people who share your unarticulated fear and have decided to "do something about it". The ignorance of a thousand does not add up to wisdom, any more than popular opinion discovered that the sun is the center of our solar system.

Those who would rather pretend that certain new possibilities were never discovered will desire to ban these, so that we can go back to believing these things are not possible. And if it is banned, no one will be doing it, so we can live in this illusion we've created for ourselves.

What to do: alternative B

Resign yourself to the fact that certain questions have no answer at the moment, and that at any given time there will always be such questions. Your pretty little head will resist this, because this makes certain things undecidable. But it is nevertheless the quickest path to happiness, as you will soon see.

What to do: alternative C

Pursue the answer intellectually, and aggressively. Read up on the science that is happening in this field and the discourse that is taking place between interested parties. Once you go in depth you will begin to understand not just the issue, but also your own fear and what it really is you're worried about. This will then prevent you from choosing the emotional answer of alternative A, because you will no longer be able to convince yourself that a rational answer is optional.

The final, undisputed answer to certain questions may not come for a long time, not even in the span of your lifetime. But with every step that you veer closer to the truth you will have a better idea of what it's likely to be. Until the truth is actually discovered, you will regularly find yourself faced with alternative B.

Posted in en, issues | No Comments »

book signings - are they utterly stupid?

April 13th, 2008

So it turns out that authors have "book tours" (yeah, it sounds crazy, doesn't it?). You would think that everything they had to say was already in the book, but they do this to sell more books. They go around to various cities and they talk about their book and sometimes participate in panel discussions with other authors.

An integral part of this is the book signing. Now suppose you read a book that was very good and you really appreciate the ideas of this person and their ability to express them the way they do. What benefit do you possibly see in having it signed by the author? First of all, their name is already on the book (the cover, in fact), so it's redundant. So what do you benefit from knowing that this person wrote their name on this paper? What difference does it make?

It's stupid celebrity worship every day of the week. I can sort of understand people asking sportsmen for autographs a bit more, because when you meet an athlete you don't really have anything "of theirs" to keep. So even an autograph (which again is meaningless, who cares about the calligraphic skills of a sportsman? that's not what you admire them for) is something. With an author this is turned on its head, because the item being signed is the very work that you appreciate, so you already _have_ their best output in your hand.

Posted in en, observations | 5 Comments »

the original bloggers

April 1st, 2008

There is no real definition of blogging. When no one is telling you how many inches you have to fill and by what time, then you can use as much or as little time as you like, to write as long or as short as you like. Blogging is essentially the ultimate freedom of expression, since there are no constraints. It is therefore not possible to define what a blog is from some manual. The only way is to observe how blogs are written and see whether there is a certain style that is more common than others. And there is one. The most common style of blogging is reminiscent of one principle from the open source method: release early, release often. Most blogs are relatively short and don't try to take on many arguments. Which is why they are also not too hard to write. You can limit your scope to a small and comfortable size.

Of course, blogs are a relatively new thing. A lot of people now have a voice who prior to blogging didn't have a suitable channel. But there is one group of people whose blogging predates the internet era. They have been doing it for centuries, albeit not always with the same level of freedom that we do, and sometimes under great pressure. What they have on us, however, is an audience.

If you're not familiar with how Catholic Mass works, it's a bit like a tv show. The introductory clip and the credits are always the same, the commercials always come at the same time and that defines the structure. You say the same words, you sing the same songs. For some parts you sit, for some you stand, for some you kneel. Then there is the content portion which changes depending on the Catholic calendar. But this too is completely scripted and if you come back on the same day next year, you'll hear the same thing. The highlight of the Mass is a two piece segment. First comes the Gospel, and for this you have to stand, unfortunately. This is a reading from the New Testament (boy they should give them more books to choose from) selected by the priest. Then comes the blog, sometimes called "the sermon".

The sermon is a relief, as you can finally sit down. It's also a long segment, which means you can doze off a bit. Once when I was a kid I was so bored during the sermon that I actually fell asleep and hit my head on the bench in front of me. Whether you're a sinner or not, Sunday Mass is like your weekly purgatory. Getting out of going used to be my highest priority goal. One loophole is to attend Mass on a weekday, because the sermon is only given on Sundays, which makes Mass half an hour instead of an hour. You can then do away with your weekly guilt trip and on paper you're clean.

Anyway, now that I look back on it, the blog is actually the only part of the Mass that I would keep today. Certainly the only part that might be interesting to non-Catholics. It's basically a blog being read to you. The extent to which this is interesting depends on how smart your priest is. And a lot of priests are smart. I don't know if that's a qualification, but if you had a very dim individual, the people in the audience (especially the smarter ones) wouldn't want to listen to his drivel and would go to another church.

The blog is unique in the Mass in that it's a complete freestyle event. And priests are typically so bored with all the rituals that they embrace this opportunity to talk about something of their own choosing. This is the only time you'll hear the priest talk to you in his own words. In the church that I used to go to, the blogs were exclusively uplifting messages and had no religious content in them. They would generally be stories and anecdotes about people that would make you think, and whose message was to be a nice person and treat people nicely. It's really quite a nice thing to do, fill people every week with a good spirit, and a positive outlook. As I got older, and not long before I decided that I had done enough church going for one life, I started to appreciate the blogs a lot. I felt they had a positive influence on me, just as I gradually felt less and less attached to the church. In fact, some people consider this Sunday blogging on equal terms with reading a good book or watching a good movie. It gives you something to think about. In fact, a few have taken it so far as to occasionally wander into a Protestant church (naughty!) thinking the blog there might be more interesting.

So these guys (Catholic church is very conservative, no women priests. Protestants have them, though) have a real tradition of blogging that goes back a long time. Priests have been up there every Sunday (good thing Mass on weekdays doesn't have it, or they'd have to write a new blog every day, although many current blogging 'experts' recommend this) carrying that torch. As a matter of fact, since they are just blogs, they could just as well be posted online. I don't know if anyone is doing this, but it would be nice to share that creativity with the rest of the world. And it would allow the audience to post comments, something that is frowned upon in church (what did you say about the man who had an accident? that didn't really make sense).

What's interesting is that we have now started doing the same thing that they have been doing all these years. And I don't think we really intended to imitate, did we? Think about how cool it would be to do a guest blog in church. :cool:

Posted in en, observations | No Comments »
