Mon, 12 Dec 2011
Twisted high scores
Living in the Boston area, I've had the chance to spend time with the lovely maintainers of the Twisted project.
Twisted is an event-driven network programming framework in Python. It's also a community of people for whom software is never good enough -- and they're right.
I visited the Twisted November sprint at the Smarterer.com office a few weeks ago and reviewed a ticket. So now I am on the Twisted high scores list for November!
It was one of the most rewarding short periods of time I've ever spent contributing to an open source project. I took someone's contribution and turned it into a patch, and also gave some feedback. This counted as reviewing a ticket, for which I was immediately and strongly socially rewarded: J.P. (exarkun) turned to me and said, "Thanks for contributing to Twisted." An IRC bot pinged me with a note saying my ticket review was complete. And now I appear in the high scores list for November!
[] permanent link and comments
Mon, 06 Jun 2011
How much do I charge?
A conversation between me and a wiser housemate, when I lived in San Francisco.
Sun, 24 Apr 2011
Announcing the Scala Crash Course (for women & their friends)
Here is an email I just sent to the email list for PHASE, the Philly Area Scala Enthusiasts:
Thu, 24 Sep 2009
Award for the best clickable button in a mobile app
I just saw a screenshot of one of my favorite Android apps - at least, as far as user interface design goes.
A simple interface, and a single button that creates a Blue Screen of Death. My compliments to the chef.
(Found via the XDA-Developers forum: http://forum.xda-developers.com/showthread.php?t=563891 . It appears to be a UI-improved version of someone else's vulnerability tester called BSODroid.)
Sat, 25 Apr 2009
Explainer: "Why do some URLs have www in them, and what difference does it make?"
Katy (who I know from the CC internship in 2006) asked me this question recently:
To understand, I have to explain how a browser gets a web page from the Internet. When a browser is asked to load a URL like http://www.asheesh.org/scribble/enlightened-but-confused.html, it breaks it apart into components.
HTTP, the "scheme", tells the browser what protocol (or network language) to speak when it requests the page from the server.
The domain name is where things get interesting. This alone tells the browser who to ask for the page. The browser looks up www.asheesh.org in the domain name system, an Internet phone book service that converts names to numbers (so-called "IP addresses"). Once it knows the IP address for that name, it connects to it and prepares to speak HTTP.
The browser connects to that IP address, and asks (in the network language of HTTP):
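As a rough sketch of those steps in Python: the function below splits a URL into its scheme, domain name, and path, and then builds the kind of request a browser might send. The name build_http_request is made up for illustration, and the request is a minimal HTTP/1.1 GET, not a copy of what any particular browser sends.

```python
from urllib.parse import urlsplit


def build_http_request(url):
    """Split a URL into components and build a minimal HTTP/1.1 GET request.

    Hypothetical helper for illustration; real browsers send many more headers.
    """
    parts = urlsplit(url)
    # An empty path (as in "http://asheesh.org") means "the root page".
    path = parts.path or "/"
    request = (
        f"GET {path} HTTP/1.1\r\n"
        f"Host: {parts.hostname}\r\n"
        "Connection: close\r\n"
        "\r\n"
    )
    return parts.scheme, parts.hostname, path, request


if __name__ == "__main__":
    scheme, host, path, request = build_http_request(
        "http://www.asheesh.org/scribble/enlightened-but-confused.html"
    )
    print("scheme:", scheme)
    print("domain name:", host)
    print("path:", path)
    print(request)
```

Note that the domain name shows up twice: once when the browser looks up the IP address to connect to, and again in the Host line, which tells the server which site is being asked for.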
So now, let's think about how http://www.asheesh.org/ and http://google.com/ differ: Their scheme is the same, and their path is the same. But the domain name is different.
The same is true for http://asheesh.org/ and http://www.asheesh.org/. You get the same content because, as luck has it, the administrator for asheesh.org is the same as the administrator for www.asheesh.org, and I decided to make them work the same way.
For some websites, if you add the www component, you do get different contents back: for example, http://cs.rochester.edu/ does not load, whereas http://www.cs.rochester.edu/ does.
So the final answer to Katy's question: You're lucky you ever get the same page for two URLs that are different, even if just by "www".
Tue, 21 Oct 2008
Hixie Limerick
RDFa is a standard now. Meanwhile, Ian Hixie is considering how (and if) to make this part of HTML5. To celebrate, I wrote this Limerick as a vision from the future.
Sun, 19 Oct 2008
Packaging, and other joys of Debconf
I was trying to explain to my friend Emily (C.) some of the fun things about Debconf.
On one of the first days I attended, I was standing around while some people I didn't yet know discussed piuparts, an automated Debian package tester.
At this point when talking to Emily, I thought, Maybe I shouldn't bother explaining what piuparts is. If I do explain it, it will make me much more interested in the telling of the story, as well as let her make sense of the story. Or I could be vague to avoid boring her, but then I'll bore myself by only telling the skeleton of a story.
I know Emily well enough that she'll forgive me boring her, I decided. So I'll give it a try.
"The bulk of work in Debian is packaging, which means finding up-to-date open source software and bundling it up into a nice installer," I began. "Windows installers, if you're lucky, will create an entry in Add/Remove Programs. But Debian installers, to comply with Debian Policy, have to do a lot more."
"Let's say you already had the Safari web browser installed and you wanted to install Google Chrome, their new browser based on the same core as Safari. When you upgrade Safari, it would be nice if Google Chrome also benefitted from the upgrade."
"In Debian, it would." I continued with another obscure fact about the Debian Policy. "Another element of the Policy is that when a package is fully removed, it must leave no files and leave no programs running."
Suddenly she was interested! For a moment I didn't understand why. Then I realized what I had said: Something I take for granted in Debian, the "leave no trace" element, is something Windows users often wish they had.
I continued, "There is an automated tool called piuparts which takes packages, creates a small virtual install of Debian, runs the package's installer, then uninstalls it and verifies that the package does in fact leave no trace."
Explaining the rest was easy: The first day I was at Debconf, I ran into some people discussing piuparts. Lucas explained he was slow to program in Python, the language piuparts is written in, and Emily rightly picked up on the fact that Python is my favorite programming language. Lucas explained that piuparts needed a machine-readable report format so that you could automatically run it on the whole Debian archive and get a list of which packages have problems. I volunteered to add that.
After a few days of hardly working on this, I finally was sitting with some new friends Thursday night. They left, and I worked on everything I could possibly justify working on. Then it was 5 a.m., and I knew there was no more time to waste if I wanted to actually finish the modification to piuparts. So I began it, ate breakfast, and finished it.
It was really great having a comfortable environment to work all night in. It was even better that I had people to stay up late talking to about geeky things that came from a shared interest in programming, system administration, and Free Software principles. When people left, there was always a great reason to stay awake: more great people to talk to, or finally the assignment I gave myself at the start of Debconf. I had that joyous feeling from the people every evening at Debconf, and Thursday night the feeling brought me all the way to morning.
Wed, 15 Oct 2008
Mining Wikipedia for style edits
I had an old project, which never quite succeeded, to mine Wikipedia for style edits. Through this, one could learn what made a style improvement, and attempt to generalize that to other texts. Think of it as the minimal bootstrapping of a purely statistical "grammar checker." It was originally to be my master's project under the awesome Jason Eisner. (Instead, I contributed to a storage layer branch of Dyna's compiler.)
It has some nice filters (written using SAX, so they don't chew up all your RAM for pages with 2GB of history) for filtering down MediaWiki page dumps into just what we want, also optionally modifying their text so that the new outputted versions contain the results of data processing.
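To give a flavor of what a SAX-based filter looks like, here is a minimal sketch in Python. It is not the code from the repository: the class name, the keyword test on edit comments, and the sample dump are all assumptions made for illustration. The point is that the parser hands over one element at a time, so only a single revision's text is held in memory at once.

```python
import io
import xml.sax


class RevisionCommentFilter(xml.sax.ContentHandler):
    """Stream a MediaWiki XML dump and report revisions whose edit comment
    contains a keyword. Hypothetical filter, for illustration only."""

    def __init__(self, keyword, on_match):
        super().__init__()
        self.keyword = keyword.lower()
        self.on_match = on_match
        self.path = []      # stack of currently open element names
        self.title = ""
        self.comment = []
        self.text = []

    def startElement(self, name, attrs):
        self.path.append(name)
        if name == "page":
            self.title = ""
        elif name == "revision":
            self.comment, self.text = [], []

    def characters(self, content):
        # SAX may deliver one text node in several chunks, so accumulate.
        current = self.path[-1] if self.path else ""
        if current == "title":
            self.title += content
        elif current == "comment":
            self.comment.append(content)
        elif current == "text":
            self.text.append(content)

    def endElement(self, name):
        self.path.pop()
        if name == "revision":
            comment = "".join(self.comment)
            if self.keyword in comment.lower():
                self.on_match(self.title, comment, "".join(self.text))
            # Drop this revision before the next one is read.
            self.comment, self.text = [], []


def filter_dump(stream, keyword):
    """Collect matching (title, comment, text) tuples from a dump stream."""
    matches = []
    handler = RevisionCommentFilter(keyword, lambda *m: matches.append(m))
    xml.sax.parse(stream, handler)
    return matches


# A tiny made-up dump, standing in for a real page history.
SAMPLE = b"""<mediawiki>
  <page>
    <title>Example</title>
    <revision><comment>add section</comment><text>First draft.</text></revision>
    <revision><comment>copyedit for style</comment><text>A first draft.</text></revision>
  </page>
</mediawiki>"""

if __name__ == "__main__":
    for title, comment, text in filter_dump(io.BytesIO(SAMPLE), "style"):
        print(title, "|", comment, "|", text)
```

For a real multi-gigabyte history you would write each match out from the callback instead of collecting them in a list.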
The code also has some hilarious (and useful!) Makefiles that treat a bunch of heterogeneous computers as a compute cluster. The higher "-j" you pass into make, the more machines it will SSH into, copy your code onto, run your code, and rescue the output.
I'm putting it in my git repository, having dug it out of my JHU NLP Subversion repository. Check it out in my gitweb.
It's not the prettiest thing ever....
Sun, 28 Sep 2008
Colorizing standard error: Adventures in LD_PRELOAD
Kristian again asked an interesting question on the SF-LUG mailing list. This time, it was: "How can one get stderr and stdout to appear in different colors?" He was asking on behalf of someone, in turn on behalf of a Java programmer.
I thought about this and discussed it with Jesse Zbikowski, who I happened to be sitting next to at the Tenderloin Computer Help Day that Christian Einfeldt invited the list to (which turned out to be a lot more interesting and orderly than I had imagined!).
Jesse and I talked and we thought of named pipes, which Jesse got to work on and produced a nice Perl tool for. I thought about LD_PRELOAD and got off to a few false starts, and finally came up with a tool I called stderred (tarball of v1.2). It includes a demo program in Java and a README.
LD_PRELOAD
LD_PRELOAD wrappers are a way to change the way a program executes by replacing library functions, like write() or gettimeofday(), with your own homebrew versions. You can think of the dynamic linker as allowing you to stack your own things "above" the C library, but "below" the actual program that runs. So in looking for a symbol (a function name, typically), the program searches down until it finds it, and uses that.
"stderred" is a C program and a Makefile that you can demonstrate works properly; it includes a sample Java program and a README. Because it intercepts the Java JRE's calls to write() to write out messages to stdout, stderr, or whatever, and only modifies the ones to stderr, it should be safe to use everywhere. Plus there are no race conditions; it runs right in the context of the program, so it also avoids the performance penalty of context switches.
This LD_PRELOAD wrapper is interesting, I think, because (thanks to Eric Northup for the idea) it calls the real system write() function by yanking it out of libc using dlopen()+dlsym(). I was also (you can see this in the first few revisions) trying a #define hack to get access to libc definitions without the real symbols; however, this failed at link time. I don't see how it could work.
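The same idea can be sketched in Python with ctypes, which uses dlopen() to load a library and dlsym() to look up each function. This is not stderred itself (that is written in C and loaded via LD_PRELOAD); the names wrapped_write and colorize_if_stderr are made up, and the sketch assumes a POSIX system where ctypes.CDLL(None) can find libc's write().

```python
import ctypes
import os

# CDLL(None) dlopen()s the running program and its already-loaded libraries;
# attribute access then looks the symbol up with dlsym().
_libc = ctypes.CDLL(None, use_errno=True)
_real_write = _libc.write
_real_write.argtypes = [ctypes.c_int, ctypes.c_char_p, ctypes.c_size_t]
_real_write.restype = ctypes.c_ssize_t

RED = b"\x1b[31m"    # ANSI escape: switch to red
RESET = b"\x1b[0m"   # ANSI escape: back to the default color
STDERR_FD = 2


def colorize_if_stderr(fd, data):
    """Wrap data in red escape codes only when it is headed for stderr."""
    if fd == STDERR_FD:
        return RED + data + RESET
    return data


def wrapped_write(fd, data):
    """Hypothetical stand-in for the interposed write(): colorize, then
    hand the bytes to the real libc write()."""
    payload = colorize_if_stderr(fd, data)
    written = _real_write(fd, payload, len(payload))
    if written < 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err))
    return written


if __name__ == "__main__":
    wrapped_write(1, b"this goes to stdout\n")
    wrapped_write(2, b"this goes to stderr\n")
```

A real wrapper would also want to report the caller's original byte count rather than the longer, colorized one, so the program does not think it wrote more than it asked to.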
The problem with named pipes: Buffering can change the order of outputted lines
Jesse pointed out to me that the named pipe approach has a serious buffering issue related to timing: if the process writes to stderr and stdout in quick succession, the lines could appear colorized in the wrong order. Jesse showed me some variations of his script that changed which wrong order it generated, but we couldn't quite figure out how to make it always right. This seems like a race condition to me.
That's because when the named pipe in question is read from, the Perl script doesn't know *how much* to read. So in this case:
    one line to stderr
    one line to stdout
    one line to stderr

After Jesse explained this to me a few times, I understood it would get printed as either:

    one line to stdout
    one line to stderr
    one line to stderr

or the same with stderr's lines on top. Note that the interweaving is gone; this is because the information of how *much* was printed each time is thrown away by the OS. Because the read()s are happening out-of-process in both the ZSH and Perl ways to do this, I don't see how they could get around this issue. An implementation based on select() or epoll() would have the same issues, I believe.
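A small Python sketch shows the loss of ordering. It is a simplification: it uses two anonymous pipes inside one process where Jesse's script used named pipes, and the function name is made up. Still, the effect is the same. By the time the reader gets around to reading, each pipe holds one undifferentiated run of bytes.

```python
import os


def replay_through_separate_pipes(events):
    """Write (stream, line) events, in order, into two separate pipes,
    then read each pipe once, as an out-of-process colorizer would."""
    out_r, out_w = os.pipe()
    err_r, err_w = os.pipe()
    writers = {"stdout": out_w, "stderr": err_w}
    for stream, line in events:
        os.write(writers[stream], line.encode() + b"\n")
    os.close(out_w)
    os.close(err_w)
    # Each read returns whatever has piled up in that pipe. Nothing records
    # where one write() ended, or what happened on the other pipe meanwhile.
    out_data = os.read(out_r, 65536)
    err_data = os.read(err_r, 65536)
    os.close(out_r)
    os.close(err_r)
    return out_data, err_data


if __name__ == "__main__":
    out_data, err_data = replay_through_separate_pipes([
        ("stderr", "one line to stderr"),
        ("stdout", "one line to stdout"),
        ("stderr", "one line to stderr"),
    ])
    print("from the stdout pipe:", out_data)
    print("from the stderr pipe:", err_data)
```

Both stderr lines come back from a single read, with no hint that a stdout line was written between them.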
Why my solution doesn't work for "ls"
stderred is as simple as it is because it only overrides write(). The JRE only seems to use write(), not any of the helper functions like straight-up printf(), or error(), or fprintf(), that also write to file descriptors. Unfortunately, if you try to stderred-ify "ls", none of stderr appears red! That's because ls uses fprintf_unlocked() and error(), which themselves *inside libc* call write().
If you think of ls as standing on top of a library stack that looks like this:

    ls
    [stderred]
    [libc]

and you know that symbol resolution only looks "down," it's clear that the functions *inside libc* don't go back *up* to stderred to find my hacked write(). So they use the libc write(), which doesn't colorize.
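Here is a toy model of that stack in Python. It imitates the lookup order described above, not the real dynamic linker, and every name in it (make_libc, make_stderred, resolve) is invented for the illustration. Each layer is a dictionary of functions, and the "program" resolves a name by searching from the top layer down.

```python
RED = "\x1b[31m"
RESET = "\x1b[0m"


def make_libc(log):
    """A pretend libc: write() records output, fprintf() calls write()."""
    def write(fd, text):
        log.append((fd, text))

    def fprintf(fd, text):
        # Inside "libc" the call goes straight to libc's own write();
        # it never looks back up the stack.
        write(fd, text)

    return {"write": write, "fprintf": fprintf}


def make_stderred(libc):
    """A pretend stderred layer: overrides only write()."""
    def write(fd, text):
        if fd == 2:
            text = RED + text + RESET
        libc["write"](fd, text)

    return {"write": write}


def resolve(name, stack):
    """Search the layers from top to bottom; the first definition wins."""
    for layer in stack:
        if name in layer:
            return layer[name]
    raise LookupError(name)


if __name__ == "__main__":
    log = []
    libc = make_libc(log)
    stack = [make_stderred(libc), libc]
    resolve("write", stack)(2, "via write")      # found in stderred: colorized
    resolve("fprintf", stack)(2, "via fprintf")  # found in libc: not colorized
    for entry in log:
        print(entry)
```

The direct write() call picks up the override, while the fprintf() call lands in libc and stays there.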
Therefore, I started down the long road of modifying "all the important" functions to colorize if the output was going to stderr. Trying to colorize "ls" is where I started, so I wrote quite a few of those before actually checking what Java used. "ls" nearly gets colorized properly; you can look through the with_error branch for the latest work down that path. But I stopped once I figured out Java seems okay with just write(), and for cleanliness's sake I left that out of the released version (currently 1.1). Patches welcome!
zsh, python, and further reading
According to the Gentoo-Wiki, zsh users have an easy way to enable colorizing stderr. I know little about zsh but something about UNIX, and it seems to me that when zsh forks to run the new program, it close()s fd #2 (stderr) and opens it as a pipe to the colorizer. I don't see how that solves the races brought up by the Perl thing; it seems to me it'd have the same race.
This is the same path that Jesse and I started down in the beginning; we read http://tldp.org/HOWTO/Bash-Prog-Intro-HOWTO-3.html and noticed it didn't discuss setting stderr to a pipe, and then we talked about named pipes....
The Pythonic way to do this would have been to "simply" globally override what "sys.stderr" is. I don't know if such a thing is possible in Java.
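For the curious, a minimal sketch of that override; the class name RedStream is made up. It only catches output that goes through Python's sys.stderr object. Anything that writes straight to file descriptor 2 (a C extension, a child process) slips past it.

```python
import sys


class RedStream:
    """Wrap a text stream so everything written to it comes out in ANSI red."""

    RED = "\x1b[31m"
    RESET = "\x1b[0m"

    def __init__(self, stream):
        self._stream = stream

    def write(self, text):
        if text:
            self._stream.write(self.RED + text + self.RESET)
        # Report the caller's length, not the length with escape codes.
        return len(text)

    def __getattr__(self, name):
        # flush(), isatty(), fileno() and friends pass through unchanged.
        return getattr(self._stream, name)


if __name__ == "__main__":
    sys.stderr = RedStream(sys.stderr)
    print("this goes to stdout")
    print("this goes to stderr", file=sys.stderr)
```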
You can read a quick tutorial on LD_PRELOAD in the IBM DeveloperWorks article, "Override the GNU C Library -- Painlessly." You can read a lot more about dynamic linking in the exhaustive "How To Write Shared Libraries" by Ulrich Drepper.
Fri, 06 Jun 2008
Build failure
A package I am working on fails to build. Mako helps me understand why: