
Fri, 04 Dec 2015

How this website gets published

In case you're interested, here are the technical details on how this website gets published.

That's a large collection of things, but I like how they fit together. And, moreover, the fact that you can see this content means that the whole pipeline is working!

[] permanent link and comments

Sat, 02 Nov 2013

Censored on Facebook

For the first time in what feels like years, I wanted to share something with my friends on Facebook.

The background was that I read a note on Slashdot that Linus Torvalds thought a presidential candidate's remarks on a topic related to airline security were "moron"ic. So I did my own research, and I disagreed. I figured this was a topic of general enough interest that all my Facebook friends might be interested in knowing my position, so I wanted to share that.

Facebook didn't let me.

I tried first with a link to snopes.com, which Facebook blocked with the rationale that http://snopes.com/images/template/snopes.gif is "spammy or unsafe":

You can't post this because it has a blocked link. The content you're trying to share includes a link that's been blocked for being spammy or unsafe. http://snopes.com/images/template/snopes.gif For more information, visit the Help Center. If you think you're seeing this by mistake, please let us know.

Then I thought I'd be clever, and I linked to the .nyud.net version of the snopes page on the topic. I earned the same message that my post included a blocked link.

So then I tried again, with a link to a video on YouTube of the same clip.

That's when I first got the extremely generic message that "The message could not be posted to this Wall." You can see the animation of what happened next by hovering below.

Finally, I removed all the links, and kept the first bit of text. For this, I got the same generic error: "The message could not be posted to this Wall."

Update: Patrick points out I should link to the actual video. Here it is, embedded:

(BTW: The first thing I did was to click "let us know" to indicate that I think I'm seeing this by mistake. I filled out the form to indicate there was a problem in an honest, respectful way. I got back an email autoresponse that said, "Thanks for taking the time to submit this report. While we don't currently provide individual support for this issue, this information will help us identify bugs on our site.")

[] permanent link and comments

Tue, 11 Jun 2013

De-spammed this blog (with Naive Bayes)

This morning, I was trying to decrease the amount of email in my inbox. I had a few messages with subjects like:

But all the comments in this case were spam. I'm using an Akismet API plugin for pyblosxom, but that has a few shortcomings. Like anything else, it misses some spam, but moreover, it doesn't help me find and remove old spam comments in bulk.

My pattern with email is basically to ignore it for a while, and then deal with it in bulk, sometimes missing messages from the past. The result is that I have often missed these comment notifications, and it was a bit of a drag to figure out which comments I had dealt with already.

So I wrote a small tool this morning. Here is how it works:

Voila! A spam moderation queue with artificial intelligence.

You can find it here, on my Github account: https://github.com/paulproteus/spambayes-pyblosxom

Permission to re-use the code is granted under the terms of CC Zero or Apache License 2.0, at your option.
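For readers curious about the idea behind the name, here is a minimal sketch of a Naive Bayes spam scorer. It is not the code from the repository above, which I'd point you to for the real thing; the class name, the tokenizer, and the add-one smoothing are all assumptions made for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


class NaiveBayes:
    """Toy two-class (spam/ham) Naive Bayes scorer. Hypothetical, for illustration."""

    def __init__(self):
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.doc_counts = {"spam": 0, "ham": 0}

    def train(self, text, label):
        """Record one hand-labeled comment; label is "spam" or "ham"."""
        self.doc_counts[label] += 1
        self.word_counts[label].update(tokenize(text))

    def spam_probability(self, text):
        """Return an estimate of P(spam | text) between 0 and 1."""
        vocab = set(self.word_counts["spam"]) | set(self.word_counts["ham"])
        total_docs = sum(self.doc_counts.values())
        log_score = {}
        for label in ("spam", "ham"):
            total_words = sum(self.word_counts[label].values())
            # Log prior for the class, smoothed so an empty class is not log(0).
            score = math.log((self.doc_counts[label] + 1) / (total_docs + 2))
            for word in tokenize(text):
                count = self.word_counts[label][word]
                # Add-one smoothing; the extra +1 leaves room for unseen words.
                score += math.log((count + 1) / (total_words + len(vocab) + 1))
            log_score[label] = score
        # Turn the two log scores into a probability, clamped to avoid overflow.
        diff = max(min(log_score["ham"] - log_score["spam"], 700.0), -700.0)
        return 1.0 / (1.0 + math.exp(diff))


classifier = NaiveBayes()
classifier.train("cheap pills buy now", "spam")
classifier.train("buy cheap watches online", "spam")
classifier.train("thanks for the great post", "ham")
classifier.train("I had the same problem with git", "ham")
```

With a classifier like this, a moderation queue amounts to sorting comments by `classifier.spam_probability(comment_text)` and reviewing the highest-scoring ones first, training on each decision as you go.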

Moreover, now I believe there are zero spam comments left lying around this blog!

[] permanent link and comments

Fri, 08 Feb 2013

Notes from attempting to despam a wiki with git-remote-mediawiki

I just tried to despam a mediawiki instance with git-remote-mediawiki.

The idea is as follows:

git log --since='Wed Dec 5 22:57:06 2012 +0000'

(You can process that with either 'grep ^Author' and so on, or you can use an overwrought Python script I wrote.)

git log --author=bad_user_1 --author=bad_user_2 --pretty="format:%H"
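The 'grep ^Author' tally mentioned above can be sketched in a few lines of Python. This is not the overwrought script referred to earlier; the function name and the sample log text (user names, hashes, dates, and the wiki.example domain) are made up for illustration.

```python
from collections import Counter


def count_commits_by_author(git_log_text):
    """Tally commits per author from the text printed by plain `git log`."""
    counts = Counter()
    for line in git_log_text.splitlines():
        # Default `git log` output has one unindented "Author:" line per commit;
        # commit messages are indented, so they cannot match by accident.
        if line.startswith("Author:"):
            counts[line[len("Author:"):].strip()] += 1
    return counts


# Made-up sample in the shape of default `git log` output.
sample_log = """\
commit aaaaaaa1
Author: bad_user_1 <bad_user_1@wiki.example>
Date:   Thu Dec 6 01:00:00 2012 +0000

    Buy cheap watches

commit aaaaaaa2
Author: good_user <good_user@wiki.example>
Date:   Thu Dec 6 02:00:00 2012 +0000

    Fix typo on the front page

commit aaaaaaa3
Author: bad_user_1 <bad_user_1@wiki.example>
Date:   Thu Dec 6 03:00:00 2012 +0000

    Buy more cheap watches
"""

tally = count_commits_by_author(sample_log)
for author, n in tally.most_common():
    print(n, author)
```

In practice you would feed it the real output, for example by piping `git log --since=...` into a script that reads `sys.stdin.read()`, and then eyeball the busiest authors.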

Here's where things start to go wrong.

You might try to revert them all:

git log --author=bad_user_1 --author=bad_user_2 --pretty="format:%H" |
xargs -n1 git revert

That works great until the first merge conflict.

So then you write a wrapper script that does "git revert $1 || git revert --abort", and you can still only revert the first few hundred (out of ~800) spam edits because one of the commits causes a conflict when you try to revert it.

Why a conflict? I suspect it's because there are spam edits that I neglected to include in the revert stream. (Update: The conflict was actually a real conflict -- some kind soul on the web had already reverted a bunch of the spam edits!)
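For reference, the wrapper idea ("git revert $1 || git revert --abort") might look like this in Python. It is a sketch under assumptions: the function names are hypothetical, `--no-edit` is my addition so git does not open an editor for every commit, and the `run` parameter exists only so the loop can be exercised without a real repository.

```python
import subprocess


def run_git(args):
    """Run one git command in the current directory; True means exit status 0."""
    return subprocess.run(["git"] + list(args)).returncode == 0


def revert_each(commit_ids, run=run_git):
    """Try to revert each commit in turn, skipping the ones that conflict.

    Returns (reverted, skipped) so the skipped commits can be handled by hand.
    """
    reverted, skipped = [], []
    for sha in commit_ids:
        if run(["revert", "--no-edit", sha]):
            reverted.append(sha)
        else:
            # Same as `|| git revert --abort`: restore the pre-revert state.
            run(["revert", "--abort"])
            skipped.append(sha)
    return reverted, skipped
```

You would call `revert_each(hashes)` with the list of hashes printed by the `git log --pretty="format:%H"` command above. Keeping the skipped list around is the main gain over the one-liner: those are the commits someone else may already have reverted.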

In our case, there are fairly few pages getting spammed, so it'd be simpler to 'git log' the pages we care about and revert back to the commit IDs that look clean. 'git revert' could still be useful in the case of tangled history, but (apparently) there is a limit to how useful it can be, anyway.

Oh, also:

It'd be useful to be able to create MediaWiki dump files from git-remote-mediawiki exports. That way, I could use 'git rebase -i' to clean up history. (That would break links *unless* the MediaWiki revision IDs somehow stayed constant for the revisions with the same content. Maybe that's feasible. Actually, the simplest way might be to write a tool that filters the dump file itself, rather than exporting straight from git-remote-mediawiki.)

Also also, I fixed a format string bug in git mergetool, one of my favorite little pieces of git.

P.S. In this corpus, of the IP address editors (i.e., not logged in), 0 (of 16) are spammers. About 80% of the logged-in editors are spammers. (Admittedly our wiki does require you to log in if you are posting new URLs to a page.)

Update: It is way faster if you run it with low latency to the MediaWiki server in question. It probably could be adjusted to make fewer API calls, and to make more of them in parallel.

[] permanent link and comments

Sun, 29 Jul 2012

Twisted high scores

Living in the Boston area, I've had the chance to spend time with the lovely maintainers of the Twisted project.

Twisted is an event-driven network programming framework in Python. It's also a community of people for whom software is never good enough -- and they're right.

I visited the Twisted November sprint at the Smarterer.com office a few weeks ago and reviewed a ticket. So now I am on the Twisted high scores list for November!

It was one of the most rewarding short periods of time I've ever spent contributing to an open source project. I took someone's contribution and turned it into a patch, and also gave some feedback. This counted as reviewing a ticket, for which I was immediately and strongly socially rewarded: J.P. (exarkun) turned to me and said, "Thanks for contributing to Twisted."

An IRC bot pinged me with a note saying my ticket review was complete. And now I appear in the high scores list for November!

[] permanent link and comments

RHEL 7 will (probably) have GNOME 3

While chatting with Greg Price earlier this evening about the coming Linpocalypse, I said something I wanted to research. Upon further review, it seems that Red Hat Enterprise Linux 7 will ship GNOME 3.

You can see a video of Jonathan Blandford talking about it, where he says:

And then looking forward to the future, RHEL 7.... We're giving demos of GNOME3, and the new desktop is a huge change; we're doing some pretty exciting things there. So if you're interested in it, please come by and take a look!

"CubedRoot" on fedoraforum.org did take a look, and (s)he writes:

I went down to the Partner Pavillion and spent over an hour with Jonathan and the RHEL7 demo they had running. Besides a new wallpaper (Which was very beautiful BTW) they were running Gnome 3.5 on the demo, an the only other major changes were a few more account service providers and chat plugins (like Sametime, Yahoo, and stuff). It did not handle multi-monitors with different resolutions worth a flip (but it is beta after all). This is when I asked him if they planned on putting XFCE or LXDE or even Cinnamon in the Extra's channel, and they very confidently said they would not be in there. They had no plans to offer them.

[] permanent link and comments

Mon, 06 Jun 2011

How much do I charge?

A conversation between me and a wiser housemate, when I lived in San Francisco.

Asheesh: "Hey, so I'm going to do some consulting work. It's the first time I've done this as a professional, like, not a student. How much should I be charging per hour?"

Matt: "It's easy; do what I do. Think of the biggest amount you can ask for with a straight face, then double it."

[] permanent link and comments

Sat, 23 Apr 2011

Announcing the Scala Crash Course (for women & their friends)

Here is an email I just sent to the email list for PHASE, the Philly Area Scala Enthusiasts:

First, let me introduce myself: Hi, everybody!

I'm Asheesh Laroia, as the "From:" header on this email message indicates. I'm also a Python user, former ocaml user in college, and Debian developer. (Maybe none of those will make me any friends in PHASE....)

I'm writing this because Yuvi and I are planning an event on the Friday evening before Scalathon: a low-cost Scala crash course.

http://www.meetup.com/scala-phase/events/17397558/ says a little more; I can, too:

My goal (I won't claim to speak for Yuvi) is to help encourage a diverse Scalathon that brings new people into the community. By being a "crash course" for people who already know some Scala or another functional programming language, we aim to select for people who can contribute during the big hackathon on so many Scala projects during the weekend. By insisting that attendees of the crash course attend Scalathon, we hope to use the time to enrich the Scalathon event.

The event has a stipulation that I'm borrowing from an effort called RailsBridge: to attend, you must either be a woman, or find a woman who will bring you as her guest. The idea here is to simultaneously make sure anyone who wants to can attend, while also inviting women to join Scala-based communities.

The event is separate from Scalathon; it's an effort to feed people into Scalathon.

If you are a woman who is thinking about attending Scalathon but wants to make sure you have time to sharpen your Scala chops, this event is for you. If you know such a person, please send her a copy of this email. If you're a man in the same situation, we hope you can find the woman in your life who is, too, so that she can invite you.

So -- that's what I'm working on with Yuvi.

A bit more personally, I met Yuvi because I work on an open source community outreach website called OpenHatch: http://openhatch.org/. He and I met at a meet-up for that site last year, when I lived in West Philly (on a beautiful street called Hazel). He and I organized a Penn-based open source hackathon you can read about at http://opensource.com/life/10/11/introducing-students-world-open-source-day-1 . Right now I'm based in the Boston area, in Somerville, MA.

I'd love to hear what you all think.

--
-- Asheesh.

http://asheesh.org/

FORTUNE PROVIDES QUESTIONS FOR THE GREAT ANSWERS: #4
A: Go west, young man, go west!
Q: What do wabbits do when they get tiwed of wunning awound?

[] permanent link and comments

Thu, 24 Sep 2009

Award for the best clickable button in a mobile app

I just saw a screenshot of one of my favorite Android apps - at least, as far as user interface design goes.

A simple interface, and a single button that creates a Blue Screen of Death. My compliments to the chef.

(Found via the XDA-Developers forum: http://forum.xda-developers.com/showthread.php?t=563891 . It appears to be a UI-improved version of someone else's vulnerability tester called BSODroid.)

[] permanent link and comments

Sat, 25 Apr 2009

Explainer: "Why do some URLs have www in them, and what difference does it make?"

Katy (who I know from the CC internship in 2006) asked me this question recently:

Why do different pages show up depending on whether there's a www or not in the URL?

To understand, I have to explain how a browser gets a web page from the Internet. When a browser is asked to load a URL like http://www.asheesh.org/scribble/enlightened-but-confused.html, it breaks it apart into components.

HTTP, the "scheme", tells the browser what protocol (or network language) to speak when it requests the page from the server.

The domain name is where things get interesting. This alone tells the browser who to ask for the page. The browser looks up www.asheesh.org in the domain name system, an Internet phone book service that converts names to numbers (so-called "IP addresses"). Once it knows the IP address for that name, it connects to it and prepares to speak HTTP.

The browser connects to that IP address, and asks (in the network language of HTTP):

So now, let's think about how http://www.asheesh.org/ and http://google.com/ differ: Their scheme is the same, and their path is the same. But the domain name is different.
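To make the components concrete, here is a small sketch using Python's standard urllib.parse module. The function name is hypothetical, and the request text it builds is just a typical minimal HTTP/1.1 request, assumed for illustration rather than copied from any particular browser.

```python
from urllib.parse import urlsplit


def describe_request(url):
    """Split a URL into scheme, domain name, and path, plus a minimal request."""
    parts = urlsplit(url)
    # An empty path means the browser asks for "/".
    path = parts.path or "/"
    # The domain name travels inside the request as the Host header,
    # which is how one server can tell www.asheesh.org from asheesh.org.
    request = (
        f"GET {path} HTTP/1.1\r\n"
        f"Host: {parts.hostname}\r\n"
        "\r\n"
    )
    return parts.scheme, parts.hostname, path, request


for url in ("http://asheesh.org/", "http://www.asheesh.org/", "http://google.com/"):
    scheme, host, path, _ = describe_request(url)
    print(scheme, host, path)
```

For all three URLs the scheme and path come out identical; only the domain name differs, and that is the part that decides which server gets asked and what it is asked for.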

The same is true for http://asheesh.org/ and http://www.asheesh.org/. You get the same content because, as luck has it, the administrator for asheesh.org is the same as the administrator for www.asheesh.org, and I decided to make them work the same way.

For some websites, if you add the www component, you do get different contents back: for example, http://cs.rochester.edu/ does not load, whereas http://www.cs.rochester.edu/ does.

So the final answer to Katy's question: You're lucky you ever get the same page for two URLs that are different, even if just by "www".

[] permanent link and comments