Twitter changes the rules for URLs in tweets: the end of privacy or the end of the 140 character limit?

Twitter has decided that for our good and their own it would be better if any time you click a link in a tweet the request first went to Twitter before being redirected to the intended destination. This blog entry announces the decision, but a lot of the interesting details are hidden in the more technical description of the change sent to the Twitter developers mailing list.

Here is a quick analysis of the announcement and its ramifications.

The advertised benefits

For users:

  • Twitter scans the links for malware, offering a layer of protection
  • It becomes easy to shorten links directly from the Tweet box on twitter.com (or from any client that doesn’t have a built-in link shortening feature)

For Twitter:

  • they collect a lot of profiling data on user behavior, which can be used to improve the “Promoted Tweets” system (and possibly plenty of other uses)

You don’t have to be much of a cynic to notice that the user benefits are already available today for those who care about them (get a link scanner on your computer, get a Twitter client with built-in link shortening) while the benefit to Twitter is a brand new and major addition…

One interesting side-effect: the erosion of the 140 character limitation

Without going into the technical details of the new system, one change is that each URL will now “cost” 20 characters (out of the 140 allowed per tweet), no matter how long it really is. But in most cases the user will still see the complete URL in the tweet (clients may choose to display something else but I doubt they will, at least by default, except for SMS). So you could now see tweets like (line breaks added):

In the town where I was born / Lived a man who sailed to sea /
And he told us of his life / In the land of submarines, http://more.com/
So.we.sailed.on.to.the.sun/Till.we.found.the.sea.green/
And.we.lived.beneath.the.waves/In.our.yellow.submarine/
We.all.live.in.yellow.submarine/Yellow.submarine,yellow.submarine/
We.all.live.in.yellow.submarine/Yellow.submarine,yellow.submarine.

Based on the Twitter proposal, clicking on this link would send you to a Twitter-operated link shortener (e.g. http://t.co/J7erFi3) which would then redirect you to the full URL above. The site (e.g. more.com in this example) could be trivially set up so that such URLs are all valid and they return a clean version of the encoded text (for the benefit of users of Twitter clients that may not show the full URL).
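To make the "clean version" idea concrete, here is a minimal sketch of what the destination site could do with such a URL. It assumes the convention used in the example above (dots stand for spaces, slashes separate lines); the function name decode_overflow is made up for illustration, and nothing here is part of any Twitter API.

```python
from urllib.parse import urlsplit, unquote


def decode_overflow(url: str) -> str:
    """Turn the path of an 'overflow' URL back into readable text.

    Assumed convention (from the example tweet, not from Twitter):
    '.' encodes a space and '/' encodes a line separator.
    """
    path = unquote(urlsplit(url).path)
    # Drop the empty segments produced by leading/trailing slashes.
    segments = [seg for seg in path.split("/") if seg]
    # Replace dots with spaces and trim the space left by a trailing dot.
    lines = [seg.replace(".", " ").strip() for seg in segments]
    return " / ".join(lines)


if __name__ == "__main__":
    print(decode_overflow(
        "http://more.com/So.we.sailed.on.to.the.sun/Till.we.found.the.sea.green/"
    ))
```

One limitation of this naive convention: a real period at the end of a line cannot be told apart from an encoded space, so it gets trimmed.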

This long URL example may seem a bit overkill (just post 3 tweets), but if you are only short by 20 or 30 characters and just can’t find another way to shorten the tweet the temptation will be big to take this easy escape.
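The accounting behind this temptation is simple enough to sketch. The snippet below assumes the flat 20-character cost per URL and the 140-character limit described above; the URL-matching pattern is a deliberately crude assumption of mine (anything from http:// or https:// up to the next whitespace), not Twitter's actual link detection.

```python
import re

TWEET_LIMIT = 140  # character limit per tweet
URL_COST = 20      # flat cost of any URL under the new scheme

# Crude URL matcher (an assumption): scheme followed by non-whitespace.
URL_PATTERN = re.compile(r"https?://\S+")


def charged_length(text: str) -> int:
    """Length of `text` when every URL is counted as URL_COST characters."""
    urls = URL_PATTERN.findall(text)
    raw_url_chars = sum(len(url) for url in urls)
    return len(text) - raw_url_chars + URL_COST * len(urls)


def fits(text: str) -> bool:
    """True if the tweet is within the limit under flat-cost accounting."""
    return charged_length(text) <= TWEET_LIMIT


if __name__ == "__main__":
    tweet = "Lived a man who sailed to sea, http://more.com/" + "And.he.told.us/" * 20
    print(len(tweet), charged_length(tweet), fits(tweet))
```

Under this accounting the length of the URL itself never matters, which is exactly what makes the escape hatch possible.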

A cool new URL shortening domain

You may have noticed the t.co domain in the previous paragraph. Yes, it’s a real one. That’s the hard-to-beat domain that Twitter was able to score for this. Cute. But frankly I am tired of the whole URL shortening deal and these short domain names have ceased to amuse me. You?

Enforcement

How, you may wonder, can Twitter ensure that the clicks go through its gateway if the full URL is available as part of the Tweet? Simple: they change the terms of service to forbid doing otherwise. It’s interesting how the paragraph in the email to developers which announces that aspect starts by asking nicely “we really do hope that…”, “please send the user through the t.co link”, “please still send him or her through t.co” and ends with a more constraining “we will be updating the TOS to require you to check t.co and register the click”. Speak softly and carry a big stick.

It will obviously be easy to avoid this, and you won’t have to resort to copy/pasting URLs. Even if the client developers play ball, open source clients can be recompiled. Proxies can be put in front of clients to remember the mapping and do the substitution without ever hitting t.co. Plug-ins and Greasemonkey scripts can be developed. Etc. Twitter surely knows this and probably doesn’t care. As long as by default most users go through t.co the company will get the metrics it needs. It’s actually to Twitter’s benefit to make it easy enough (but not too easy) to circumvent this, as it ensures that those who care will find a solution and therefore keep using the service without too much of a fuss. We can’t tell you to cheat but we’ll hint that we don’t mind if you do.
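For the curious, the "remember the mapping and do the substitution" part of such a proxy fits in a few lines. This is only a sketch: the LinkCache class, its method names and the pattern used to spot t.co links are all invented for illustration, and how the proxy learns the mapping in the first place is left out.

```python
import re

# Assumed shape of a t.co link, based on the example http://t.co/J7erFi3.
T_CO_PATTERN = re.compile(r"http://t\.co/\w+")


class LinkCache:
    """Remembers short-to-full URL mappings and rewrites text accordingly."""

    def __init__(self) -> None:
        self._expanded = {}

    def remember(self, short_url: str, full_url: str) -> None:
        """Record that `short_url` redirects to `full_url`."""
        self._expanded[short_url] = full_url

    def rewrite(self, text: str) -> str:
        """Replace every known t.co link in `text`; leave unknown ones alone."""
        def substitute(match):
            short_url = match.group(0)
            return self._expanded.get(short_url, short_url)

        return T_CO_PATTERN.sub(substitute, text)


if __name__ == "__main__":
    cache = LinkCache()
    cache.remember("http://t.co/J7erFi3", "http://cnn.com/some.story")
    print(cache.rewrite("Worth a read: http://t.co/J7erFi3"))
```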

The privacy angle

This is a big deal, and disappointing to me. Obviously the hopes I had for Twitter to become the backbone of an open, user-controlled, social data bus are not shared by its management. Until now, Twitter was a good citizen in a world of privacy-violating social networks because the data it shared (your tweets and your basic profile data) had always and unambiguously been expected to be public. Not true with your click stream. An average user will have no idea that when he clicks on http://cnn.com/some.story the request first goes to Twitter. Twitter now has access to identified personal data (the click stream) that its users do not mean to share. I realize that this is old news in a world of syndicated web advertising and centralized analytics, but this is new for Twitter and now puts them in position to mishandle this data, purposely or not, in the way they store it, use it and share it.

I wouldn’t be surprised if they end up forced to offer an option to not go through t.co, adding metadata to the user profile to inform the Twitter client that it is OK for this user not to go through the gateway when they click on a link. We’ll see.

The impact on the Twitter application ecosystem

The assault on URL shorteners was expected since the Chirp conference (and even before). Most Twitter application developers had already swallowed the “if you just fill a hole in the platform we’ll eventually crush you” message. What’s new in this announcement is that it also shows that Twitter is willing to use its Terms of Service document as a competitive tool, just like Steve Jobs does with Flash. Not unexpected, but it wasn’t quite as clear before.

It’s not a pretty implementation

Here is what it looks like at the API level: the actual tweet text contains the t.co shortened URL. And for each such URL there is a piece of metadata that gives you the t.co URL, the corresponding full-length URL (which you should display instead of the t.co one) and the begin/end character position of the t.co URL in the original tweet so you can easily pull it out (not sure why you need the end position since it’s always beginning+20, but serialized Twitter messages are already happily duplicative).
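In client code, using that metadata to show the full URL might look roughly like the sketch below. The field names (short_url, expanded_url, start, end) are placeholders I picked for illustration; the announcement describes the information carried, and I am not reproducing the actual payload format here.

```python
def expand_links(text: str, url_entities: list) -> str:
    """Replace each shortened URL in `text` with its full-length URL.

    Each entity is a dict with hypothetical keys:
      short_url, expanded_url, start (inclusive), end (exclusive).
    """
    # Work from right to left so that earlier offsets stay valid
    # after a replacement changes the length of the text.
    for entity in sorted(url_entities, key=lambda e: e["start"], reverse=True):
        start, end = entity["start"], entity["end"]
        # Skip entities whose positions do not match the text.
        if text[start:end] != entity["short_url"]:
            continue
        text = text[:start] + entity["expanded_url"] + text[end:]
    return text


if __name__ == "__main__":
    short = "http://t.co/J7erFi3"
    tweet = "Great read: " + short + " (via a friend)"
    begin = tweet.index(short)
    entities = [{
        "short_url": short,
        "expanded_url": "http://cnn.com/some.story",
        "start": begin,
        "end": begin + len(short),
    }]
    print(expand_links(tweet, entities))
```

Processing the entities from the end of the text backwards is the one design choice worth noting: it means the positions supplied for earlier links never need to be recomputed.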

As a modeling and serialization geek, I am not impressed by the technical approach taken here. But before I flame Twitter I should acknowledge the obvious: that the Twitter API has seen a rate of adoption several orders of magnitude larger than any protocol I had anything to do with. Still, it would take more than this detail of history to prevent me from pontificating.

For such a young company, the payload of a Twitter message is already quite a mess, mixing duplications, backward-compatible band-aids, real technical constraints and self-imposed constraints. Why, pray tell, do we even need to shorten URLs? If you’re an outsider forced to live within the constraints of the Twitter rules (chiefly, the 140 character limit), they make sense. But if you’re Twitter itself? With the amount of cruft and repetition in a serialized Twitter message, don’t tell me these characters actually matter on the wire. I know they do for SMS, but then just shorten the links in tweets sent over SMS. In the other cases, it reminds me of the frustrating experience of being told by the owner of a Mom-and-Pop store that they can’t accede to your demand because of “company policy”.

Isn’t it time for the text of tweets to contain real markup so that rather than staring at a URL we see highlighted words that point somewhere? Just like… any web page. Isn’t it the easiest for an application to process and doesn’t it offer the reader a more fluid text (Gillmor and Carr notwithstanding)? By now, isn’t this how people are used to consuming hypertext?

Couldn’t the backward compatibility issue of such an approach be solved simply by allowing client applications to specify in their Twitter registration settings that yes, they are able to handle such an earth-shattering concept as an <a href=""></a> element? This doesn’t prevent a URL from “costing” you some fixed number of characters, and it doesn’t prevent the use of a tracker/filter gateway if that’s your business decision.

We’ll see how users (and application developers) react to this change. As fans of Douglas Adams know, the risk of claiming that “all your click are belong to us” is that you expose yourself to hearing “so long, and thanks for all the whale” as an answer…

11 Responses to Twitter changes the rules for URLs in tweets: the end of privacy or the end of the 140 character limit?

  1. This comment is going to look like a spam because I just wanted to say: Great post. I appreciate your analysis.

  2. Once a browser/client caches the 301 redirect from the URL shortener it will never again defer to t.co for that URL. Even if it should become malicious.

    I respect Twitter’s right to monetize their freely available service, but would agree the layers of complexity are mounding.

  3. So why doesn’t everyone switch to Identica already? At least there, the software running Identica (Status.Net) can be set up so that you run your own private service that interacts with other Status.Net installations (such as talking to other users on other sites or following other users on other sites) in the very small chance Identica turns evil.

  4. Nick

    Isn’t it about time people got over the whole twitter fad? With the iPhone, Android, and other smart mobile devices 140 characters has not been a limit for a long time. People talk about how the limit forces them to be concise, but they’re only trying to justify why twitter sucks. Combine that with the all-too-cheesy twitter events… hey, anyone up for a Tweet Chirp Egg NestUp? Bring your twiphones and twandroid devices, we’ll be serving plentwy of tweer and other twalocoholic twinks. Twahahahaah!

  5. Greg

    @Nick no smart mobile device exceeds the 160 character limit for SMS; the OS is merely taking the parts and combining them based on information embedded within the message.

  6. amuse.me, good idear for a new url shortening service :-)

  7. UJK

    ROTFL … “end of privacy”? You’ve simply got to be kidding. How is having Twitter do the exact same kind of clickstream analysis that bit.ly has been doing all along the “end of privacy”? Big freakin’ deal. This is all for links we are gladly publishing in public and clicking through on whilst logged into Twitter — hardly anything people are going to lengths to keep particularly private. The “end of privacy” headline really detracts from some otherwise good analysis here.

  8. UJK: I’ll grant you that “end of privacy” is a bit over-used these days. This is just one more cut to contribute to its bleeding death.

    On the other hand, there is a difference. When I click on a bit.ly URL:
    – the purveyor of the link has decided to use bit.ly
    – I know I am going to bit.ly first

    With Twitter’s new system:
    – the purveyor of the link has no control
    – I don’t know I go to twitter, to me it looks like I am clicking on a link that goes straight to the real destination.

    I think that’s significant. I’ll grant you that it’s hardly unheard of (e.g. the way Google redirects some of your clicks on its search page to its servers so it knows which result you liked). But I had higher hopes for Twitter, as “the backbone of an open, user-controlled, social data bus” as I explained in the post.

    But it’s ok to call me naive. :-)

    Oh, and isn’t it ironic how this announcement came out at the same time as yet another massive service disruption by Twitter. Way to make us feel good about them inserting themselves in our browsing experience. I wonder if I violate the ToS if I switch to use the “real” link when the t.co service is down?

  9. Jerffrey A. Williams

    Neither Twitter nor Google is even remotely interested in your privacy, and they never were. This is why I don’t use either service.

  10. @Jeffrey and every other privacy worrier: Neither Twitter nor Google, nor any other corporation are even remotely interested in you and never were. Google (and Twitter) only cares about your wallet, a far less personal interest. This is not privacy; and by the way, Walmart and every other brick-and-mortar merchant is doing the same thing, not to mention online retailers like Amazon. So if that is the privacy you are concerned about, make sure you pay cash for everything.

    True privacy is a concern on a social networking site like Facebook and MySpace, only because people you know may see things you don’t want them to. This is the same privacy concern that has plagued small communities–filled with busybodies–for millennia.

  11. Pingback: William Vambenepe — URL shorteners and privacy: The Good, the Bad and the Cookie