I write the code

Monday, January 3, 2011

I created a simple web application and put it up on Amazon Web Services for about a week. The cost for a single instance and a single load balancer was around $21, including a few cents for storage and traffic. It would be about $82 a month. So I ported it to Google App Engine, where a low-traffic web application could be hosted for free. It took about a day. I replaced the storage layer that used Amazon S3 with one that used JDO. I replaced the memcached layer using the net.spy.memcached client with one using the JCache (JSR107) API. Those were straightforward and most of the rest of the code didn't need any changes. I also added the configuration files appengine-web.xml and jdoconfig.xml.

I ran into a few snags with the appengine servlet container, some of which might have been due to strange interactions with Google Guice, but that's just speculation without any real investigation.

The first one is probably a bug in the appengine (or jetty) implementation of HttpServletResponse.encodeRedirectURL(). It erroneously added ;jsessionid=sessionid to external URLs. I worked around that by not calling encodeRedirectURL() for external URLs, even though the javadoc says "All URLs sent to the HttpServletResponse.sendRedirect method should be run through this method."
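A minimal sketch of that workaround, assuming the decision can be made by comparing the redirect target's host with the application's own host. The class name `RedirectUrls` and the helper `isExternal` are made up for illustration, and the host names in `main` are placeholders.

```java
import java.net.URI;
import java.net.URISyntaxException;

public class RedirectUrls {
    /**
     * Hypothetical helper: returns true when the URL names a host other than appHost.
     * Relative URLs have no host and are treated as internal.
     */
    public static boolean isExternal(String url, String appHost) {
        try {
            String host = new URI(url).getHost();
            return host != null && !host.equalsIgnoreCase(appHost);
        } catch (URISyntaxException e) {
            // Treat unparseable URLs as external so they are never rewritten.
            return true;
        }
    }

    public static void main(String[] args) {
        // Placeholder host names, not the real application.
        System.out.println(isExternal("/index.jsp", "example.appspot.com"));
        System.out.println(isExternal("http://www.facebook.com/dialog", "example.appspot.com"));
    }
}
```

In a servlet, the check would guard the call, roughly: pass the URL to sendRedirect() unchanged when isExternal(url, request.getServerName()) is true, and through response.encodeRedirectURL(url) otherwise.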

The next one seemed like some weird difference in implementations. I used guice-servlet to configure servlet filters on /index.jsp, but they weren't being invoked for /, while the filters were being invoked under Tomcat and Glassfish. I worked around that by changing filter("/index.jsp").through(MyFilter.class) to filter("/","/index.jsp").through(MyFilter.class).

The last one was the weirdest, and has to be some kind of bug in either appengine, jetty, or guice. Once I got the filters passing the request for / through, instead of serving up /index.jsp, the service returned a redirect to //, which turned into a redirect to ///, etc, until the browser stopped due to too many redirects. I worked around that by kludging in a special servlet for /:


    @Singleton
    public static class RedirectToIndexJSP extends javax.servlet.http.HttpServlet {
        private static final long serialVersionUID = 0L;
        @Override
        protected void doGet(javax.servlet.http.HttpServletRequest request, javax.servlet.http.HttpServletResponse response) throws javax.servlet.ServletException, java.io.IOException {
            request.getRequestDispatcher("/index.jsp").forward(request, response);
        }
    }

After that, it all worked. Compared to Amazon, the service on Google App Engine is a lot slower for the first request. This is a very low-traffic application, so it's pretty much never running, and a new virtual machine has to start up when a request does come in, which seems to take around 10 seconds. Subsequent requests are reasonable, though still slower than Amazon.

Monday, December 27, 2010

In order to access a Garmin device from Javascript in a web page, the Prototype framework needs to be loaded. The bad thing about Prototype is that it messes other things up. One nice thing about jQuery is that it is nonintrusive, defining only one global symbol. Prototype, on the other hand, screws with basic Javascript datatypes.

The first thing that I had to deal with after loading Prototype was that I could no longer use for (i in array) { ... } to iterate through array elements without getting a bunch of extra junk. So I rewrote my for loops to go from 0 to array.length - 1 instead.

The next thing really confused me for hours. I wanted to encode a Javascript data structure as JSON, and JSON.stringify([]) would inexplicably result in "\"[]\"". But if I used the browser's Javascript console, JSON.stringify([]) would correctly result in "[]". Finally, I came across a web page that identified Prototype as causing this problem by setting Array.prototype.toJSON, among other things, which also explained to me why Prototype was screwing up my for loops iterating through arrays.

Monday, December 20, 2010

When building a war file with ant, I found it convenient to include multiple <webinf/> elements in the war task when some of the WEB-INF files were generated, so that the source files and the generated files could be in separate directory trees. However, if there were entries under WEB-INF that were duplicated between the directory trees, one of them would always cause ant to find the war file out of date and rebuild it. It's not that I had any duplicated regular files, but I did have duplicated subdirectories, which had different timestamps. I had been puzzled by why ant was always rebuilding the war file, even if nothing had changed, before looking into it. I finally changed the war task to have a single <webinf/> element pointing at the generated files, and added a task that copies the source files into the directory tree of the generated files.

Monday, December 13, 2010

Facebook authorization has some weirdness that I could only handle with an ugly workaround. My understanding of OAuth is that I redirect the user to a Facebook URL. Then, Facebook sends the user back to me with a code that I exchange for an access token with Facebook. This works fine if the user has already authorized my application: the user gets redirected to Facebook, and then redirected back to me without a hitch.

However, if Facebook needs the user to authorize my application, then there is a problem. My application runs in an iframe in a Facebook page. If I redirect the user to Facebook, which shows the user the authorization page, Facebook detects that it's in an iframe and grays out the page. If the user tries to interact with it, the user gets sent to the page out of the iframe. Once the user authorizes my application, the user gets sent back to me, but not in the iframe. Fortunately, when the user comes to my application, Facebook passes a flag to me that says whether the user has already authorized my application. So if the user has already authorized my application, I redirect to Facebook and things work fine. If the user has not, then I show the user a page that has some Javascript that sends the user to Facebook out of the iframe, and when the user comes back from Facebook, I redirect the user to the Facebook page that frames my application.
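The page that breaks out of the iframe could be generated along these lines. This is only a sketch: it assumes the script sets top.location.href, which is the usual way to navigate the top-level window from inside a frame, and the class and method names (`FrameBreakout`, `page`, `jsString`) are hypothetical.

```java
public class FrameBreakout {
    /** Quotes a string as a Javascript string literal that is safe inside a script element. */
    static String jsString(String s) {
        StringBuilder out = new StringBuilder("\"");
        for (char c : s.toCharArray()) {
            if (c == '"' || c == '\\') {
                out.append('\\').append(c);
            } else if (c == '<' || c == '>' || c == '&' || c < 0x20) {
                // Unicode escapes keep markup characters out of the page source.
                out.append(String.format("\\u%04x", (int) c));
            } else {
                out.append(c);
            }
        }
        return out.append('"').toString();
    }

    /** Builds a page whose only job is to send the top-level window to authorizeUrl. */
    public static String page(String authorizeUrl) {
        return "<html><body><script type=\"text/javascript\">"
             + "top.location.href = " + jsString(authorizeUrl) + ";"
             + "</script></body></html>";
    }
}
```

The servlet would return this page only when the flag from Facebook says the user has not yet authorized the application, and send an ordinary redirect otherwise.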

Monday, December 6, 2010

I have an ant target that runs some scripts on some Amazon EC2 instances. So, I first generate a file with a list of the hostnames of the running instances. Then, ant runs a shell command that iterates over the hostnames and runs ssh to each with a command. Since I didn't want to run the command as root, the command was su -c "\"command with some parameters\"" username, which worked fine for me. However, on cygwin on Microsoft Windows, the nested quoting didn't work, so that su was trying to set the user to the first command parameter rather than the username I specified. None of the variations of single and double quotes and backslash quoting the space characters worked.

Finally, I gave up on trying to deal with the quirks of the Bourne shell under Microsoft Windows, and moved the su into a script on EC2:

    if [ ! `id -un` = username ]; then su -c "sh -c \"$0 $*\"" username; exit; fi

Of course, none of the command parameters are expected to have spaces in them, as the $* would be trouble. And $@ wouldn't help because of the nested quoting.

Monday, November 29, 2010

For storing objects in a database, I chose to serialize the objects in a way that maintains compatibility between object versions. So Java serialization was out. My first implementation was to serialize to JSON using Jackson, which was simple because it could serialize and deserialize objects that I was already using. However, there were more compact serialization schemes. Apache Thrift and Google Protocol Buffers looked fairly equivalent in functionality. Both required generating code from some IDL, making them less convenient than JSON. I chose Protocol Buffers since its serialization seemed to be slightly smaller than that of Thrift.

In order to continue using the objects that I was using before, since they included annotations and logic that can't be cleanly added to the protobuf-generated code, I used java.beans.Introspector to extract values from the objects to be stored in the database and put them into the protobuf objects and vice-versa.
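A rough sketch of the java.beans.Introspector part, under some assumptions: `Track` is a made-up domain object, and a plain Map stands in for the protobuf-generated message and builder, which are not shown here. Only the property extraction and the copy back are illustrated.

```java
import java.beans.BeanInfo;
import java.beans.Introspector;
import java.beans.PropertyDescriptor;
import java.lang.reflect.Method;
import java.util.Map;
import java.util.TreeMap;

public class BeanValues {
    /** Hypothetical domain object standing in for the real ones. */
    public static class Track {
        private String name;
        private int points;
        public String getName() { return name; }
        public void setName(String name) { this.name = name; }
        public int getPoints() { return points; }
        public void setPoints(int points) { this.points = points; }
    }

    /** Reads every readable property of the bean into a name-to-value map. */
    public static Map<String, Object> extract(Object bean) {
        try {
            // Stopping at Object.class skips the "class" property from getClass().
            BeanInfo info = Introspector.getBeanInfo(bean.getClass(), Object.class);
            Map<String, Object> values = new TreeMap<String, Object>();
            for (PropertyDescriptor pd : info.getPropertyDescriptors()) {
                Method getter = pd.getReadMethod();
                if (getter != null) {
                    values.put(pd.getName(), getter.invoke(bean));
                }
            }
            return values;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    /** Writes the map's values back into the bean through its setters. */
    public static void populate(Object bean, Map<String, Object> values) {
        try {
            BeanInfo info = Introspector.getBeanInfo(bean.getClass(), Object.class);
            for (PropertyDescriptor pd : info.getPropertyDescriptors()) {
                Method setter = pd.getWriteMethod();
                if (setter != null && values.containsKey(pd.getName())) {
                    setter.invoke(bean, values.get(pd.getName()));
                }
            }
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }
}
```

With protobuf, the map would be replaced by calls on the generated builder and message classes, matched to bean properties by name.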

Monday, November 22, 2010

AWS (Amazon Web Services) is pretty nice. It not only has what's needed to deploy and scale a web service, but everything can be controlled through their REST (or SOAP) API, so that everything can be automated.

One thing that should have been obvious was how a service would start when an instance starts. The first time, I created an instance from a prebuilt image, copied the code to it, then logged in and started it manually. Eventually, I created an image that included all the code. I finally figured out that I could start it in /etc/rc.local.

The next thing I needed to figure out was how to update the software. Everything would be in a single war file (except for static assets pushed up to the CloudFront CDN (Content Delivery Network)), but automatically building a new image from an existing image with the war file replaced looked difficult. I could do it manually by starting an instance, replacing the war file, and then creating a new image. I could automate that manual process. Another way seems to involve turning the image into a loopback filesystem, replacing the war file in the filesystem, then turning the filesystem back into an image, then uploading the image. Finally, I figured out that I could just upload the war file to Amazon S3 (Simple Storage Service), and the instance could download the war file from S3 when it starts up, and creating new images is unnecessary.

However, the process of getting a working image was slow and tedious, since starting and stopping instances, and creating images were all really slow. Once I had an image that worked, setting up load balancing was trivially easy. Setting up Auto Scaling also looks very easy, once I figure out what metrics to use for launching and terminating instances.