
Use wget(1) To Expand Shortened URLs

I'm a fan of all things microblogging, but let's face it: until URLs become part of the XML rather than part of your character count (which is ridiculous anyway), shortened URLs are going to be a way of life. Unfortunately, those shortened URLs can be problematic. They could host malicious scripts or software that could infect your browser or system. They could lead you to an inappropriate site, or just to something you don't want to see. And because these URLs are a part of our microblogging lives, they've also become a part of our email, SMS, IM, and IRC lives, as well as other online activity.

So, the question is: do you trust the short URL? Well, I've generally gotten into the habit of asking people to expand the shortened URL for me on IRC, email, or IM, and that's worked just fine. But I got curious whether there was a way to do it automagically, and thankfully, you can use wget(1) for this very purpose. Here's a "quick and dirty" approach to expanding shortened URLs:

$ wget --max-redirect=0 -O - http://t.co/LDWqmtDM
--2011-10-18 07:59:53--  http://t.co/LDWqmtDM
Resolving t.co (t.co)... 199.59.148.12
Connecting to t.co (t.co)|199.59.148.12|:80... connected.
HTTP request sent, awaiting response... 301 Moved Permanently
Location: http://is.gd/jAdSZ3 [following]
0 redirections exceeded.

So, in this case "http://t.co/LDWqmtDM" points to "http://is.gd/jAdSZ3", another shortened URL. (Thank you, Twitter, for shortening what is already short. Other services do this too, and it's annoying. I'm looking at you, StatusNet.) So, let's increase our "--max-redirect":

$ wget --max-redirect=1 -O - http://t.co/LDWqmtDM
--2011-10-18 08:02:12--  http://t.co/LDWqmtDM
Resolving t.co (t.co)... 199.59.148.12
Connecting to t.co (t.co)|199.59.148.12|:80... connected.
HTTP request sent, awaiting response... 301 Moved Permanently
Location: http://is.gd/jAdSZ3 [following]
--2011-10-18 08:02:13--  http://is.gd/jAdSZ3
Resolving is.gd (is.gd)... 89.200.143.50
Connecting to is.gd (is.gd)|89.200.143.50|:80... connected.
HTTP request sent, awaiting response... 301 Moved Permanently
Location: https://wiki.ubuntu.com/UbuntuOpenWeek [following]
1 redirections exceeded.

So, in this case, the link finally points to https://wiki.ubuntu.com/UbuntuOpenWeek. I'm familiar enough with the Ubuntu Wiki, that I know I should be safe visiting the initial shortened URL. If you want to add this to a script or shell function, then you can get a bit more fancy:

$ expandurl() { wget -O - --max-redirect="$2" "$1" 2>&1 | grep '^Location'; }
$ expandurl http://t.co/LDWqmtDM 1
Location: http://is.gd/jAdSZ3 [following]
Location: https://wiki.ubuntu.com/UbuntuOpenWeek [following]

In this case, our "expandurl()" function takes two arguments: the first being the URL you wish to expand, and the second being the maximum number of redirects. You'll notice further that I added "-O -" so that any page content goes to standard output rather than being saved to a file, and "2>&1" merges wget's status messages (including the "Location:" lines) into that stream so grep can see them. Because you're grepping for "^Location" anyway, technically you could get rid of "--max-redirect" altogether. But dropping it does seriously increase the time it takes to get the locations, since wget will then follow every hop and download the final page. Whatever works for you.
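As a small variation of my own (not part of the original function): if you only care about the final destination rather than every hop, you can keep just the last "Location:" line and strip wget's " [following]" suffix. The helper names here are hypothetical, and the parsing assumes the "Location: <url> [following]" line format shown in the transcripts above:

```shell
# last_location: read a wget transcript on stdin and print the final
# redirect target URL. Pure text filtering, no network needed.
last_location() { grep '^Location' | tail -n 1 | awk '{print $2}'; }

# expandurl_final: hypothetical wrapper around the same wget invocation,
# defaulting to 10 redirects if no second argument is given.
expandurl_final() { wget -O - --max-redirect="${2:-10}" "$1" 2>&1 | last_location; }
```

With the example above, expandurl_final would print only the Ubuntu wiki URL instead of both hops.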

UPDATE (Oct 18, 2011): After some comments have come in on the post, and some discussion on IRC, there is a better way to handle this. According to the wget(1) manpage, "-S" or "--server-response" will print the headers and responses sent by the HTTP/FTP servers. So, here's the updated function, which you might find to be less chatty and faster to execute as well:

$ expandurl() { wget -S "$1" 2>&1 | grep '^Location'; }
$ expandurl http://t.co/LDWqmtDM
Location: http://is.gd/jAdSZ3 [following]
Location: https://wiki.ubuntu.com/UbuntuOpenWeek [following]

Perfect.
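One last tweak of my own, not from the original discussion: if you'd rather see bare URLs than wget's "Location: … [following]" lines, you can let awk do the grep's job and print only the second field. Again, the function names are hypothetical, and the filter assumes the line format shown in the output above:

```shell
# location_urls: read a wget transcript on stdin and print just the
# redirect target URLs, one per line, dropping the "Location:" label
# and the trailing "[following]" annotation.
location_urls() { awk '/^Location:/ {print $2}'; }

# expandurl_urls: hypothetical variant of the updated expandurl above.
expandurl_urls() { wget -S "$1" 2>&1 | location_urls; }
```

Run against the same t.co link, this would print the is.gd hop and the Ubuntu wiki URL on two lines, with nothing else around them.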
