Archive for May, 2009

ruby compiler series: annotated git history

May 13th, 2009

I've been reading along with Vidar Hokstad's rather excellent Writing a compiler in Ruby bottom up. It's a 20-part (so far) series documenting his effort to hack together a Ruby-hosted compiler that in the end will compile a language similar to Ruby into x86 assembly.

Compilers are complicated beasts that take a lot of planning to build. Now I'm not saying Vidar didn't do all the planning, but what makes this series especially palatable is the fact that he's writing it literally bottom up, through what you might call evidence-based hacking. That is, compile the very simplest thing you can (starting with an empty ELF binary), and then see what gcc produces. From there, add print "Hello World", see how the generated code changes, and keep going one construct at a time. This means you can read along even if you don't know any assembly (like yours truly) and take it in small steps without first having to absorb the complexity of a whole compiler.

It's a great learning opportunity, seeing as how each step is a working compiler one iteration up. You can read Vidar's blog with the git diff side by side and see how the assembly is changing. To make this a bit clearer I've forked his repo, annotated the early commits with tags (where they were missing) and made sure the customary make run / make clean work as expected. I've also added commit messages that tell you exactly what each iteration achieves, so you can browse the history and figure out, say, "where do I look to see how to do a while construct".

I've annotated the first 15 steps (the rest were already tagged):

compiler-in-ruby

Given how git works, where every commit's hash covers its own contents plus the hash of its parent (and so, indirectly, all the history before it), the only way I could do this was by rewriting the git history, which is not ideal. So Vidar's commit hashes won't match mine, but all I've done is cherry-pick his commits off his branch and add some annotations. It's all still there. :)
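To see why annotating an early commit changes every later hash, here is a minimal Python sketch of how git derives a commit id: the SHA-1 of a short header plus the commit body, where the body names the parent's id. The identity line, timestamp and messages are made up for illustration, and every commit points at git's empty tree so that no real files are needed.

```python
import hashlib

# Well-known id of git's empty tree, used so the sketch needs no real files.
EMPTY_TREE = "4b825dc642cb6eb9a060e54bf8d69288fbee4904"
# Hypothetical identity and timestamp, for illustration only.
WHO = "A U Thor <author@example.com> 1242172800 +0000"


def commit_id(tree, parent, message, who=WHO):
    """Return the SHA-1 git would assign to a commit with this content."""
    lines = ["tree %s" % tree]
    if parent:
        lines.append("parent %s" % parent)
    lines.append("author %s" % who)
    lines.append("committer %s" % who)
    body = ("\n".join(lines) + "\n\n" + message + "\n").encode("utf-8")
    header = b"commit %d\x00" % len(body)
    return hashlib.sha1(header + body).hexdigest()


# Original history: two commits with terse messages.
first = commit_id(EMPTY_TREE, None, "step 1")
second = commit_id(EMPTY_TREE, first, "step 2")

# Annotated history: only the first message changes...
first_annotated = commit_id(EMPTY_TREE, None, "step 1: emit an empty program")
# ...but the second commit now names a different parent, so its id changes too.
second_rebuilt = commit_id(EMPTY_TREE, first_annotated, "step 2")

print(first != first_annotated, second != second_rebuilt)
```

The second commit's tree and message are untouched, yet its id differs because its parent line differs. That is why the annotated fork cannot share commit objects with the original repo.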

repeatable one off builds with 'emerge'

May 4th, 2009

I have to say I do actually use quite a lot of software. Not all the time, but I like to try things. I'm not one of those "email and web" people that you hear about. And because I like to be able to run some application that's not widespread without it causing me any extra hassle, I'm really only interested in the distros that have the largest repos, such as Gentoo and Ubuntu.

Even so, there are those package requirements that fall through the cracks. It's hard to get a hold of bleeding edge builds sometimes. Even if there are some dedicated packagers for a particular project working for a particular distro, these ad-hoc efforts don't cover everything. Not to mention the fact that unofficial builds on Ubuntu are not necessarily easily reversible. I spent some time recently working on a project in Mono, and there's just no real option for me to get close-to-svn builds. Even Gentoo is quite far behind and I never have any luck with the absolute latest ebuilds.

Hence the svn builds.

In any case, there will always be times when custom builds are necessary. After all that's what developers do: take on the scary world of unstable so that in the end we can make sure we ship something users can run both on the current version of their distro, and (hopefully) for some time to come.

To that end I wrote a messy perl script to do mono svn builds. That was over a month ago. I was looking to add a feature just the other day and it hit me just how awful perl is. Every time I come back to it after a period of absence I feel like a stranger in the valley of death, with the auto expanding arrays and other land mines. It's not the first time I've written one of these build scripts; I need them from time to time while I'm on some project. And I often end up with a couple of bash scripts so I can reproduce these installs later on.

But do I really want to be writing these ad-hoc installers every time? That's the thing about one off builds: they're not really one off. Often you want to be able to reproduce them.

Nothing whatsoever to do with Gentoo portage!

I know I should have picked a different name, but the parallels are so clear that I just couldn't resist. emerge is a simple package installer. Emphasis on installer: it does not manage packages, and it does not even keep track of what has been installed. No notion of versions, no safety checks, just the user-supplied shell commands wrapped in a nicer package.

  • Phases: fetch, configure, build, install
  • Other actions: list, search, pretend
  • Switches: nodeps, revision
  • Fetch support: git, svn, archive (tar,gzip,bzip2,zip)

Here's the idea:

  1. Write a build file containing a number of packages.
  2. Record dependencies between them.
  3. Use emerge to run any combination of actions on them.

It's really that simple. To use the running mono example, here's a taste:

emerge_search

The build file argument is optional if there's only one build file in the current directory. Most of the command line options are identical to portage:

emerge_opts

And if you want to just see the depgraph that works too. Just don't put cycles in it or weird things will happen.

emerge_pretend
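As a rough illustration of what walking such a depgraph involves, here is a minimal Python sketch that orders packages so dependencies come first. The function name install_order is hypothetical and this is not emerge's actual code. Unlike the behaviour described above, this sketch raises an error on a cycle instead of doing weird things. The packages dict mirrors the "deps" entries from the mono build file.

```python
def install_order(packages, target):
    """Return target and its dependencies, dependencies first (depth-first)."""
    order = []
    state = {}  # package name -> "visiting" or "done"

    def visit(name, trail):
        if state.get(name) == "done":
            return
        if state.get(name) == "visiting":
            raise ValueError("dependency cycle: " + " -> ".join(trail + [name]))
        state[name] = "visiting"
        for dep in packages[name].get("deps", []):
            visit(dep, trail + [name])
        state[name] = "done"
        order.append(name)

    visit(target, [])
    return order


# Dependency data taken from the mono.py excerpt; other keys left out.
packages = {
    "libgdiplus": {},
    "mcs": {"deps": ["libgdiplus"]},
    "olive": {"deps": ["libgdiplus"]},
    "mono": {"deps": ["libgdiplus", "mcs", "olive"]},
}

print(install_order(packages, "mono"))
# → ['libgdiplus', 'mcs', 'olive', 'mono']
```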

And if you need to set certain environment variables during building/running (so that the package doesn't meddle with your system) it supports that too.
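One plausible way to do this, sketched in Python under my own assumptions rather than copied from emerge: overlay the project's variables on the current environment and hand the result to each phase's shell command. The names build_env and run_phases are hypothetical, and the fetch phase is left out for brevity.

```python
import os
import subprocess

# Phases that are plain shell commands in a build file (fetch is omitted here).
PHASES = ("configure", "build", "install")


def build_env(extra, base=None):
    """Return a copy of the base environment with the project's variables on top."""
    env = dict(os.environ if base is None else base)
    env.update(extra)
    return env


def run_phases(package, environment, cwd=None):
    """Run each phase the package defines, in order, and return the phases run."""
    env = build_env(environment)
    done = []
    for phase in PHASES:
        command = package.get(phase)
        if command is None:
            continue  # this package has nothing to do in this phase
        # check=True stops at the first failing command.
        subprocess.run(command, shell=True, check=True, env=env, cwd=cwd)
        done.append(phase)
    return done
```

Because the variables are passed only to the child processes, the calling shell and the rest of the system are left alone.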

Build files

Build files are basically just a listing of the shell commands necessary to build the package. I workshopped the idea of writing them in YAML, since it's a nice and compact format. But I decided against it, because it's not really buying me anything, and since YAML's grammar is rather quirky the parse errors are harder to understand than Python's. So the build files are just regular Python modules. Here's an excerpt from mono.py:

import os

src_path = "/ex/mono-sources"
ins_path = "/ex/mono"
svnbase = "svn://anonsvn.mono-project.com/source/trunk"

conf = "./autogen.sh --prefix=%s" % ins_path
build = "make"
install = "make install"

env = os.environ.get

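# Bind each package name to a module-level variable (dashes become
# underscores) so the names can be used as dict keys and in "deps" below.
for v in ("libgdiplus", "mcs", "olive", "mono", "debugger", "mono-addins",
          "mono-tools", "gtk-sharp", "gnome-sharp", "monodoc-widgets",
          "monodevelop", "paint-net"):
    globals()[v.replace("-", "_")] = v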

project = {
    "src_path": src_path,
    "ins_path": ins_path,
    
    "environment": {
        "DYLD_LIBRARY_PATH": "%s/lib:%s" % (ins_path, env("DYLD_LIBRARY_PATH","")),
        "LD_LIBRARY_PATH": "%s/lib:%s" % (ins_path, env("LD_LIBRARY_PATH","")),
        "C_INCLUDE_PATH": "%s/include:%s" % (ins_path, env("C_INCLUDE_PATH","")),
        "ACLOCAL_PATH": "%s/share/aclocal" % ins_path,
        "PKG_CONFIG_PATH": "%s/lib/pkgconfig" % ins_path,
        "XDG_DATA_HOME": "%s/share:%s" % (ins_path, env("XDG_DATA_HOME","")),
        "XDG_DATA_DIRS": "%s/share:%s" % (ins_path, env("XDG_DATA_DIRS","")),
        "PATH": "%s/bin:%s:%s" % (ins_path, ins_path, env("PATH","")),
        "PS1": "[mono] \\w \\$? @ ",
    },

    "packages": {
        libgdiplus: {
            "svnurl": "%s/%s" % (svnbase, libgdiplus),
            "configure": conf,
            "build": build,
            "install": install,
        },
        mcs: {
            "svnurl": "%s/%s" % (svnbase, mcs),
            "deps": [libgdiplus],
        },
        olive: {
            "svnurl": "%s/%s" % (svnbase, olive),
            "deps": [libgdiplus],
        },
        mono: {
            "svnurl": "%s/%s" % (svnbase, mono),
            "configure": conf,
            "build": "make get-monolite-latest && %s" % build,
            "install": install,
            "deps": [libgdiplus, mcs, olive],
        },
        # ... remaining packages elided from this excerpt
    },
}

The only thing emerge reads is the dict called project in the module's global scope. It has an optional member called environment, and the packages themselves are declared under packages. It should be pretty easy to figure out. If src_path isn't set, fetching defaults to /tmp.
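For the curious, loading such a build file takes only a few lines of Python. This is a sketch of one way to do it with the standard runpy module, not necessarily how emerge does it. The load_project helper and the demo build file are made up for illustration.

```python
import os
import runpy
import tempfile


def load_project(path):
    """Execute a build file and return its `project` dict with defaults applied."""
    namespace = runpy.run_path(path)          # the module's global scope, as a dict
    project = namespace["project"]
    project.setdefault("src_path", "/tmp")    # default fetch location
    project.setdefault("environment", {})     # the environment member is optional
    return project


# Write a tiny, made-up build file and load it back.
with tempfile.TemporaryDirectory() as tmp:
    build_file = os.path.join(tmp, "demo.py")
    with open(build_file, "w") as handle:
        handle.write('project = {"packages": {"demo": {"build": "make"}}}\n')
    project = load_project(build_file)

print(project["src_path"], sorted(project["packages"]))
```

Since the build file is ordinary Python, a syntax error in it surfaces as a regular Python traceback, which is the error-reporting benefit over YAML mentioned earlier.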

Anyway, at about 500 lines it's decent bang for buck I think. It's a classic 80/20 effort, getting 80% payoff for 20% effort.

Love it or hate it, here's the goodies:

Introducing SolarBeam

May 2nd, 2009

I don't usually code GUIs. It takes too much work to do it right and I can generally get a perfectly usable interface on the CLI with much less effort. But exceptions can be made. These past few months I've been busy with a C# project written in Windows Forms. It's not a big application, but it's bigger than the kind of stuff I do most of the time, about 10kloc. I did quite relish this opportunity to try out Mono, which I've been meaning to do for some time.

solarbeam_diagrampane

The subject matter is solar diagrams. I guess there is no precise meaning for this phrase but I use it to talk about charts of the Sun's trajectory over the Earth. You have to be able to compute where the Sun is at any given time, and then you can draw these charts. Here's one for Utrecht.

solarbeam_diagram

The axis along the radius is the elevation, the Sun's angle above the horizon. The other axis, along the circumference, is called the azimuth, the compass direction in which the Sun appears. With those two angles you can express where the Sun is in the sky. And the diagram depicts that trajectory with you, the observer, standing in the center. The application also tells you a few other related details, like the time of sunrise and sunset, dawn and dusk.
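To make the two axes concrete, here is a small Python sketch of how an (azimuth, elevation) pair could be placed on such a polar diagram. The conventions are my assumptions for illustration and may differ from what SolarBeam does: the zenith (90 degrees elevation) sits at the center, the horizon (0 degrees) on the rim, azimuth is measured clockwise from north, and north points up. The function name to_diagram_xy is hypothetical.

```python
import math


def to_diagram_xy(azimuth_deg, elevation_deg, radius=1.0):
    """Map a sky position to x, y on a polar diagram of the given radius."""
    # Distance from the center shrinks as the Sun climbs toward the zenith.
    r = radius * (90.0 - elevation_deg) / 90.0
    theta = math.radians(azimuth_deg)
    # Clockwise from north: east (90 degrees) lands on the positive x axis.
    return (r * math.sin(theta), r * math.cos(theta))


# Sun due south, halfway up the sky: halfway between center and rim, below center.
x, y = to_diagram_xy(180.0, 45.0)
print(round(x, 6), round(y, 6))
```

Drawing a day's trajectory then amounts to computing the Sun's azimuth and elevation at regular intervals and connecting the resulting points.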

Technically speaking, it's a portable app, so you can plop it down anywhere on the file system and it runs. This was one of the key requirements actually, so that people can run it on their work machines or in labs, which tend to be restricted environments.

Diagrams can also be saved as images and scaled. Considering how much computation is going on there, I'm quite pleased with how responsive it turned out. There's also a funky clickable map and a collection of 2000 predefined locations, complete with timezone information.

solarbeam_map

Visit the project: