Category Archives: Testing

URL shorteners and privacy: The Good, the Bad and the Cookie

The table below compares various URL shorteners based on how much they value service performance and the privacy of their users.

Here is the short version of the reading guide: a URL shorterner which gives a high priority to reliability, performance and privacy will use a 301 (“Moved Permanently”) response code, will not use cache control headers and will not use cookies. A URL shortener which gives high priority to its own ability to monetize its traffic by tracking users will do one or more of these things.

Here is how a few of the most popular shorteners perform by this measure (red is bad).

For the long version (and an explanation of how I came to create this table) read below the table.

Service name Cookie Status code Caching limitations
t.co (Twitter) 301 5 min
bit.ly tracking 301
tinyurl.com 301
goo.gl (Google) 301 24h
wp.me (WordPress) 301
snurl.com 301 10h
fb.me (Facebook) (*) 301
twurl.nl tracking 301
is.gd
ping.fm 301
p.ly tracking 301 no caching
ff.im tracking 301 (**)
u.nu 301
tiny.cc tracking 301
snipurl.com 301 10h
chkit.in tracking 301
ur1.ca 302 no caching
digs.by 302 no caching

Notes:

(*) Facebook’s service, fb.me, tries to set a cookie but its content is “locale=en_US” and cannot be used for identification. In addition, it sets the domain to “.facebook.com” in the Set-Cookie directive but since the response comes from another domain (fb.me) the cookie is actually never returned by the browser and therefore useless. It looks like this is a leftover configuration setting copied from the normal facebook.com servers. Defying all expectations, Facebook comes out as one of the most privacy-friendly URL shorteners.

(**) ff.im limits the cache to being “private” which means that your browser can cache the result but a shared proxy (e.g. your company’s proxy) should not cache it. Forcing each user behind that proxy to resolve the URL once. I magnanimously did not ding them for this, even though it’s sub-optimal.

Now for the longer explanation

Despite the potential it offers to stretch out our tweets, I wasn’t too impressed when I learned of Twitter’s plan to roll out (and mandate) its own URL shortening service. My fundamental issue is that URL shortening is made necessary by an arbitrary decision on Twitter’s part (the 140 character limit and the fact that URLs count toward it) and that it would be entirely within their power to make these abominations unneeded. Or, at least, much more rarely needed (when tinyurl.com came out, the main use case was to insert a very long URL in an email without having problems with carriage returns, not to turn third-world countries into purveyors of silly domain names).

Beyond this fundamental issue, my main concerns about Twitter’s t.co mechanism are that it reduces privacy and it demands that you break the HTTP specification.

From a privacy perspective, the issue is that anyone who clicks on these links tells Twitter where they are going. And Twitter can collect and correlate these actions. The easiest way for them (or any other URL shortener) to do this is to use cookies. Cookies aren’t often used as part of redirections, but technically nothing prevents them. So I wanted to see if Twitter used them.

[Side note: in practice there are ways to track your browser without using identifying cookies, not to mention simply using the IP address which works quite well on people who browse from home. Still, identifying cookies are the preferred method.]

From a specification conformance perspective, the problem is that Twitter announced that they would modify the Terms of Service of their API to prevent you from replacing the short URL with the real location once you’ve resolved it the first time (as of this writing they apparently haven’t yet made the ToS change). That behavior would be in violation of the HTTP specification if the redirection used status code 301 (“Moved Permanently”) which states that “any future references to this resource SHOULD use one of the returned URIs” and “clients with link editing capabilities ought to automatically re-link references to the Request-URI to one or more of the new references returned by the server“. So I wanted to see whether t.co indeed returns a 301 (and asks us to violate the spec) or if they use a Temporary Redirect (302 or the new 307) in which case the specification would not be violated but other problems would arise (for example, search engines would not give you PageRank karma for such a link).

The other (spec-compliant) way to force a 301 to call back home once a while is the (strange but legal) practice of using cache control headers on permanent redirections. So I also wanted to see how t.co behaves on that front.

And then I decided to also test a few other services, which is how the table above came to be.

Comments Off on URL shorteners and privacy: The Good, the Bad and the Cookie

Filed under Everything, Facebook, Google, Protocols, Security, Social networks, Tech, Testing, Twitter

Dear Cloud API, your fault line is showing

Most APIs are like hospital gowns. They seem to provide good coverage, until you turn around.

I am talking about the dreadful state of fault reporting in remote APIs, from Twitter to Cloud interfaces. They are badly described in the interface documentation and the implementations often don’t even conform to what little is documented.

If, when reading a specification, you get the impression that the “normal” part of the specification is the result of hours of whiteboard debate but that the section that describes the faults is a stream-of-consciousness late-night dump that no-one reviewed, well… you’re most likely right. And this is not only the case for standard-by-committee kind of specifications. Even when the specification is written to match the behavior of an existing implementation, error handling is often incorrectly and incompletely described. In part because developers may not even know what their application returns in all error conditions.

After learning the lessons of SOAP-RPC, programmers are now more willing to acknowledge and understand the on-the-wire messages received and produced. But when it comes to faults, there is still a tendency to throw their hands in the air, write to the application log and then let the stack do whatever it does when an unhandled exception occurs, on-the-wire compliance be damned. If that means sending an HTML error message in response to a request for a JSON payload, so be it. After all, it’s just a fault.

But even if fault messages may only represent 0.001% of the messages your application sends, they still represent 85% of those that the client-side developers will look at.

Client developers can’t even reverse-engineer the fault behavior by hitting a reference implementation (whether official or de-facto) the way they do with regular messages. That’s because while you can generate response messages for any successful request, you don’t know what error conditions to simulate. You can’t tell your Cloud provider “please bring down your user account database for five minutes so I can see what faults you really send me when that happens”. Also, when testing against a live application you may get a different fault behavior depending on the time of day. A late-night coder (or a daytime coder in another time zone) might never see the various faults emitted when the application (like Twitter) is over capacity. And yet these will be quite common at peak time (when the coder is busy with his day job… or sleeping).

All these reasons make it even more important to carefully (and accurately) document fault behavior.

The move to REST makes matters even worse, in part because it removes SOAP faults. There’s nothing magical about SOAP faults, but at least they force you to think about providing an information payload inside your fault message. Many REST APIs replace that with HTTP error codes, often accompanied by a one-line description with a sometimes unclear relationship with the semantics of the application. Either it’s a standard error code, which by definition is very generic or it’s an application-defined code at which point it most likely overlaps with one or more standard codes and you don’t know when you should expect one or the other. Either way, there is too much faith put in the HTTP code versus the payload of the error. Let’s be realistic. There are very few things most applications can do automatically in response to a fault. Mainly:

  • Ask the user to re-enter credentials (if it’s an authentication/permission issue)
  • Retry (immediately or after some time)
  • Report a problem and fail

So make sure that your HTTP errors support this simple decision tree. Beyond that point, listing a panoply of application-specific error codes looks like an attempt to look “RESTful” by overdoing it. In most cases, application-specific error codes are too detailed for most automated processing and not detailed enough to help the developer understand and correct the issue. I am not against using them but what matters most is the payload data that comes along.

On that aspect, implementations generally fail in one of two extremes. Some of them tell you nothing. For example the payload is a string that just repeats what the documentation says about the error code. Others dump the kitchen sink on you and you get a full stack trace of where the error occurred in the server implementation. The former is justified as a security precaution. The latter as a way to help you debug. More likely, they both just reflect laziness.

In the ideal world, you’d get a detailed error payload telling you exactly which of the input parameters the application choked on and why. Not just vague words like “invalid”. Is parameter “foo” invalid for syntactical reasons? Is it invalid because inconsistent with another parameter value in the request? Is it invalid because it doesn’t match the state on the server side? Realistically, implementations often can’t spend too many CPU cycles analyzing errors and generating such detailed reports. That’s fine, but then they can include a link to a wiki a knowledge base where more details are available about the error, its common causes and the workarounds.

Your API should document all messages accurately and comprehensively. Faults are messages too.

9 Comments

Filed under API, Application Mgmt, Automation, Cloud Computing, Everything, Mgmt integration, Protocols, REST, SOAP, Specs, Standards, Tech, Testing, Twitter, Utility computing

Here comes WSTF

A new Web services-related industry body has been announced today: WSTF (Web Services Test Forum). More details about it from Infoworld. My employer (Oracle) seems to be one of the drivers (along with IBM) but I am not personally involved.

A lot of hand-wringing, of course, about its relationship with WS-I. Which is understandable if you consider what WS-I was originally supposed to deliver (profiles, sample applications and testing tools). But not if you consider what it has actually delivered that is relevant (a couple of profiles, some time ago). WSTF could also be compared with the SOAPBuilders Yahoo group, but since that group has seen only two emails messages so far in 2008 (last one dated April 2nd), it seems safe to consider it dead. It would be interesting to know why that is though (it used to be pretty active in the early days) and what lesson WSTF may learn from it. Another effort you may want to compare this to is the Microsoft Web Services Protocol Workshops Process. It’s too early to tell, but they may turn out to be more closely related than meets the eye.

I noticed this innocuous-sounding sentence in the press release (warning, PDF): “As an open community, WSTF has made it easy to introduce new interoperability scenarios and approve work through simple majority governance”. You may wonder why this is important enough to figure in the press release.

I interpret it as a dog whistle call (heard only by those to whom it is intended) for the WS-I board. Microsoft’s Paul Cotton responds to it in his quote for Infoworld: “WS-I provides a proven and open organization and process that best suits our customers’ needs”. He also talks about “the incredible industry-wide momentum and leadership of WS-I”, which is indeed not very credible (especially the momentum part). The WS-I process, associated board politics and resulting inaction is what I was talking about in this entry (“veto rules being commonly invoked, stopping most of the activities that the resort was originally planning to offer”).

Speaking of this “Standardstown” blog entry, I should probably soon update it to include WSTF. What should it be? Maybe a trailer park in which customers bring their own lodging, put them side by side and see how they line up?

The current test scenarios seem to focus on fixing the interoperability mess that is WS-Addressing. I assume more will soon be added to test the different WS-* specifications out there. It will be interesting to see what direction WSTF takes after that. Will the payloads of the test messages be obvious dummy payloads (so that the focus is on testing the implementation of the WS-* protocols)? Or will they start to include real payloads (e.g. real purchase orders from real enterprise applications)? How about this: “dear vendor, I will only buy your wonderfully open, standard and interoperable Web services-based application when it is available as a WSTF endpoint and there are three other real-life products (including one from your main competitor) that successfully interoperate with the exact same SOAP messages I will be using”. This could become an interesting tussle between vendors as well as between vendors and buyers.

Alternatively, of course, WSTF could turn into a test of how much difference there is between a “standard” and a publicly specified and interop-tested interaction scenario…

A quick (and unsuccesful) Technorati search for some blog comments returns the “WSTF Dark Retribution Dinorobots Limited Giftset” which “includes all five Dinorobots in their sinister evil incarnation”. Can’t say you were not warned…

[UPDATED 2008/12/10: Gilbert Pilz, who was involved with WSTF from the start (and also left a comment on this entry, see below), wrote a detailed description of the problem WSTF tries to address and how Gilbert and others have structured WSTF to solve it.]

[UPDATED 2008/12/15: Via InfoQ, another long description of the goals and processes of WSTF, this time from Doug Davis.]

[UPDATES 2009/1/5: Chris Ferris also weighs in, including his view on the relationship with WS-I. Having participated in several of the early WS-I plenary meetings, I have to wonder if Chris had any double-entente in mind when he wrote that WS-I helps “the community understand where the bar is”.]

[UPDATED 2009/2/17: A response from Redmond.]

1 Comment

Filed under Everything, IBM, Implementation, Microsoft, Oracle, SOAP, Specs, Standards, Testing

Progress Software acquires IONA… and MindReef too

The acquisition of IONA by Progress Software has been pretty widely written about (sometimes ironically). But that’s not the only thing happening in that neck of the wood. Less widely reported (but still covered on ZDNet, here and here) was their acquisition of MindReef. For the inside perspective, head over to Dan Foody’s blog.

This is yet another confirmation of the fact that testing and IT management are getting ever closer together. And for good reasons, if you want to better integrate application management tasks across the application’s lifecycle. Other signs of this were the recent acquisition of the e-Test suite from Empirix by Oracle (driven by Oracle’s application management team, not by the JDev team) and, some time ago, the fact that HP decided to hang on to Mercury’s testing business rather than spinning it off.

From the Progress perspective, the IT management side of course comes from the earlier Actional acquisition (that company itself having previously merged with WestBridge). From my earlier standards work I have worked with people from Actional, WestBridge and MindReef and there is an impressive wealth of SOAP and WS experience in these teams. What remains to be seen is how much management value can be derived from a very “on the wire” (as opposed to deep in the implementation) view of the interaction. Another challenge for SOAP-centric vendors, which might have been a driver for these acquisitions, is realization that SOAP is going to be only one component of the integration landscape, not its foundation. It may be the JMS of B2B integration, but it won’t be its TCP/IP as was once assumed.

2 Comments

Filed under Everything, IT Systems Mgmt, SOAP, Testing

Oracle is now a leading vendor of application testing tools

The e-TEST suite (previously from Empirix) has turned into a set of Oracle products for application testing (sorry, application quality management). Ever since the announcement of the deal, a couple of months ago, Oracle sales reps have received many unsolicited requests for that product, validating its good reputation in the market. If you have spent any time around sales reps, you know that for them to tell their customers that for the time being they should purchase from Empirix was about as pleasant as for Hillary Clinton to endorse Barack Obama. Fortunately, this awkward period is over. Not only can people buy the products from Oracle, all the technical support (even for people who bought from Empirix) is now provided by Oracle.

I probably don’t need to say it (since the products were created by an independent company) but just to be clear nothing in these products require the target Web applications to use Oracle infrastructure (e.g. DB, Middleware) or to be otherwise managed by Oracle Enterprise Manager. They will work happily with your Ruby-on-Rails application hosted on RedHat or your .NET Web application.

And it’s not just for HTML-driven, user-facing applications: XML-producing web services can also be tested that way.

Comments Off on Oracle is now a leading vendor of application testing tools

Filed under Application Mgmt, Everything, IT Systems Mgmt, Manageability, Oracle, Testing

Emulating a long-running process (and a scheduler) in Google App Engine

As previously described, Google App Engine (GAE) doesn’t support long running processes. Each process lives in the context of an HTTP request handler and needs to complete within a few seconds. If you’re trying to get extra CPU cycles for some task then Amazon EC2, not GAE, is the right tool (including the option to get high-CPU instances for the CPU-intensive tasks).

More surprising is the fact that GAE doesn’t offer a scheduler. Your app can only get invoked when someone sends it an HTTP request and you can’t ask GAE to generate a canned request every so often (crontab-style). That seems both limiting and arbitrary. In fact, I would be surprised if GAE didn’t soon add support for this.

In the meantime, your best bet is to get an account on a separate server that lets you schedule jobs, at which point you can drive your GAE application from that external scheduler (through HTTP requests to your GAE app). But just for the intellectual exercise, how would one meet the need while staying entirely within the confines of the Google-provided infrastructure?

  • The most obvious option is to piggyback on HTTP requests from your visitors. But:
    • this assumes that you consistently get visitors at a frequency greater than your scheduler’s interval,
    • since you can’t launch sub-processes in GAE, this delays your responses to the visitor,
    • more worrisome, if your scheduled task takes more than a few seconds this means your application might be interrupted by GAE before you respond to the visitor, resulting in a failed request from their perspective.
  • You can try to improve a bit on this by doing this processing not as part of the main request from your visitor but rather by putting in the response HTML some JavaScript that will asynchronously send you HTTP requests in the background (typically not visible to the user). This way, a given visitor will give you repeated invocations for as long as the page is open in the browser. And you can set the invocation interval. You can even create some kind of server-controlled auto-modulation of the interval (increasing it as your number of concurrent visitors increases) so that you don’t eat all your Google-allocated incoming HTTP quota with these XMLHttpRequest invocations. This would probably be a very workable way to do it in practice even though:
    • it only works if your application has visitors who use web browsers, not if it only consumed by programs (e.g. through RSS feeds or other XML format),
    • it puts the burden on your visitors who may or may not appreciate it, assuming they realize it is happening (how would you feel if your real estate agent had to borrow your cell phone to arrange home visits for you and their other customers?).
  • While GAE doesn’t offer a scheduler, another Google service, Google Reader, offers one of sorts. If you register a feed there, Google’s FeedReader will retrieve it once a while (based on my logs, it happens approximately every hour for each of the two feeds for this blog). You can create multiple URLs that all map to the same handler and return some dummy RSS. If you register these feeds with Google Reader, they’ll get pulled once a while. Of course there is no guarantee that the pulling of the different feeds will be nicely spread out, but if you register enough of them you should manage to get invoked with a frequency compatible with you desired scheduler’s frequency.

That’s all nice, but it doesn’t entirely live within the GAE application. It depends on either the visitors or Google Reader. Can we do this entirely within GAE?

The idea is that since a GAE app can only executes within an HTTP request handler, which only runs for a few seconds, you can emulate a long-running process by automatically starting a successor request when the previous one is killed. This is made possible by two characteristics of the GAE runtime:

  • When an HTTP request is canceled on the client side, the request execution on the server is permitted to continue (until it returns or GAE kills it for having run too long).
  • When GAE kills a request for having run too long, it does it through an exception that you have a chance to handle (at least for a few seconds, until you get killed for good), which is when you initiate the HTTP request that spawns the successor process.

If you’ve watched (or played) Rugby, this is equivalent to passing the ball to a teammate during that short interval between when you’re tackled and when you hit the ground (I have no idea whether the analogy also applies to Rugby’s weird cousin called American Football).

In practice, all you have to do is structure your long running task like this:

class StartHandler(webapp.RequestHandler):
  def get(self):
    if (StopExec.all().count() == 0):
      try:
        id = int(self.request.get("id"))
        logging.debug("Request " + str(id) + " is starting its work.")
        # This is where you do your work
      finally:
        logging.debug("Request " + str(id) + " has been stopped.")
        # Save state to the datastore as needed
        logging.debug("Launching successor request with id=" + str(id+1))
        res = urlfetch.fetch("http://myGaeApp.appspot.com/start?id=" + str(id+1))

Once you have deployed this app, just point your browser to http://myGaeApp.appspot.com/start?id=0 (assuming of course that your GAE app is called “myGaeApp”) and the long-running process is started. You can hit the “stop” button on your browser and turn off your computer, the process (or more exactly the succession of processes) has a life of its own entirely within the GAE infrastructure.

The “if (StopExec.all().count() == 0)” statement is my way of keeping control over the beast (if only Dr. Frankenstein had as much foresight). StopExec is an entity type in the datastore for my app. If I want to kill this self-replicating process, I just need to create an entity of this type and the process will stop replicating. Without this, the only way to stop it would be to delete the whole application through the GAE dashboard. In general, using the datastore as shared memory is the way to communicate with this emulation of a long-running process.

A scheduler is an obvious example of a long-running process that could be implemented that way. But there are other examples. The only constraint is that your long-running process should expect to be interrupted (approximately every 9 seconds based on what I have seen so far). It will then re-start as part of a new instance of the same request handler class. You can communicate state between one instance and its successor either via the request parameters (like the “id” integer that I pass in the URL) or by writing to the datastore (in the “finally” clause) and reading from it (at the beginning of your task execution).

By the way, you can’t really test such a system using the toolkit Google provides for local testing, because that toolkit behaves very differently from the real GAE infrastructure in the way it controls long-running processes. You have to run it in the real GAE environment.

Does it work? For a while. The first time I launched it, it worked for almost 30 minutes (that’s a lot of 9 second-long processes). But I started to notice these worrisome warnings in the logs: “This request used a high amount of CPU, and was roughly 21.7 times over the average request CPU limit. High CPU requests have a small quota, and if you exceed this quota, your app will be temporarily disabled.”

And indeed, after 30 minutes of happiness my app was disabled for a bit.

My quota figures on the dashboard actually looked pretty good. This was not a very busy application.

CPU Used 175.81 of 199608.00 Gigacycles (0%)
Data Sent 0.00 of 2048.00 Megabytes (0%)
Data Received 0.00 of 2048.00 Megabytes (0%)
Emails Sent 0.00 of 2000.00 Emails (0%)
Megabytes Stored 0.05 of 500.00 Megabytes (0%)

But the warning in the logs points to some other restriction. Google doesn’t mind if you use a given number of CPU cycles through a lot of small requests, but it complains if you use the same number of cycles through a few longer requests. Which is not really captured in the “understanding application quotas” page. I also question whether my long requests actually consume more CPU than normal (shorter) requests. I stripped the application down to the point where the “this is where you do your work” part was doing nothing. The only actual work, in the “finally” clause, was to opens an HTTP connection and wait for it to return (which never happens) until the GAE runtime kills the request completely. Hard to see how this would actually use much CPU. Yet, same warning. The warning text is probably not very reflective of the actual algorithm that flags my request as a hog.

What this means is that no matter how small and slim the task is, the last line (with the urlfetch.fetch() call) by itself is enough to get my request identified as a hog. Which means that eventually the app is going to get disabled. Which is silly really because by that the time fetch() gets called nothing useful is happening in this request (the work has transitioned to the successor request) and I’d be happy to have it killed as soon as the successor has been spawned. But GAE doesn’t give you a way to set client-side timeout on outgoing HTTP requests. Neither can you configure the GAE cop to kill you early so that you don’t enter the territory of “this request used a high amount of CPU”.

I am pretty confident that the ability to set client-side HTTP timeout will be added to the urlfetch API. Even Google’s documentation acknowledges this limitation: “Note: Since your application must respond to the user’s request within several seconds, a URL fetch action to a slow remote server may cause your application to return a server error to the user. There is currently no way to specify a time limit to the URL fetch action.” Of course, by the time they fix this they may also have added a real scheduler…

In the meantime, this was a fun exploration of the GAE environment. It makes it clear to me that this environment is still a toy. But a very interesting and promising one.

[UPDATED 2009/28: Looks like a real GAE scheduler is coming.]

15 Comments

Filed under Brain teaser, Everything, Google, Google App Engine, Implementation, Testing, Utility computing

Oracle acquires e-TEST from Empirix

Somewhat lost in the news about Oracle’s recent earning report is the announcement that Oracle just purchased e-TEST suite from Empirix. We are not purchasing the company, just some of their products (they also sell VoIP testing tools, for example, which will stay with Empirix). Most importantly, we are also getting the people who made the product, not just a code dump. They’ll join the Enterprise Manager team (my group). Welcome aboard!

The e-TEST suite is made of three integrated components (I am describing the current e-TEST suite, not necessarily the resulting Oracle offering):

  • e-Manager Enterprise is a process management application for application testing.
  • e-Tester lets you easily create sophisticated tests for functional and regression testing.
  • e-Load is a load and performance testing framework.

(these product names make me feel like I am back in HP’s e-speak team)

This is a mature product suite that will increases the scope and depth of EM’s application testing capabilities. It extends the existing EM recorder/beacon infrastructure. It offers a sophisticated test transaction model (remember VBA?). It offers load testing capabilities. Not to mention the process management capabilities around test cases.

My toddler daughter loves her book about the solar system. She has learned to say “hot!” whenever we look at Mercury. Tonight I’ll have to teach her to say “feeling the heat” instead.

More info here. That is probably also where specific product plans will be released.

If you’ve ever been to an Oracle Open World presentation, you won’t be surprised to see that this post ends with a disclaimer that:

It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality and should not be relied upon in making a purchasing decision. The development, release and timing of any features or functionality described for Oracle’s products remains at the sole discretion of Oracle.

3 Comments

Filed under Automation, Everything, IT Systems Mgmt, Oracle, Testing