Monday, May 23, 2011

I was writing some Haskell code where I was constructing a value with lots of record fields, but only cared about one of them initially.

data Type = Type { field1 :: Type1, field2 :: Type2 }

But, omitting the irrelevant fields from the initializer causes a big "Fields not initialised" warning.

So, I figured that, instead of writing out

Type { field1 = value, field2 = undefined }

which was verbose, or

Type { field1 = value }

which resulted in the big warning, I could use

undefined { field1 = value }

which did eliminate the warning.

However,

field1 (undefined { field1 = value })

resulted in an exception. I had expected it to be equivalent to

field1 (Type { field1 = value, field2 = field2 undefined })

but it's not. Since types can have multiple constructors, it's actually equivalent to

field1 (case undefined of { Type { field2 = field2 } -> Type { field1 = value, field2 = field2 } })

according to section 3.15.3 (Updates Using Field Labels) in the Haskell report.
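
The difference can be checked with a small sketch. The field types (Int and String) and the helper throws are assumptions made up for illustration; only the shape of Type comes from the example above.

```haskell
import Control.Exception (SomeException, evaluate, try)

-- Stand-in for the record above; Int and String are placeholder field types.
data Type = Type { field1 :: Int, field2 :: String }

-- Record update has to pattern match on the constructor of its argument,
-- so it forces `undefined` before field1 can be selected.
viaUpdate :: Int
viaUpdate = field1 (undefined { field1 = 1 })

-- Plain construction never looks at field2, so it stays unevaluated.
viaConstruction :: Int
viaConstruction = field1 (Type { field1 = 1, field2 = undefined })

-- Hypothetical helper: True if evaluating the value to WHNF throws.
throws :: a -> IO Bool
throws x = do
  r <- try (evaluate x)
  return $ case r of
    Left e  -> const True (e :: SomeException)
    Right _ -> False
```

In GHCi, throws viaUpdate should give True and throws viaConstruction should give False. If the goal is just to avoid the warning, one option is to define a single default value with every field set to undefined and update that instead.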

Monday, May 9, 2011

While writing concurrent code in GHCi 6.10.4 (a pretty old version) that forks off threads which sleep before doing some delayed actions, I found some strange behavior. I used map (flip addUTCTime time) [1..5] to generate the times at which the delayed actions should be performed, and the interpreter would lock up. When I changed it to map (flip addUTCTime time) [1,2,3,4,5], everything worked as expected. Maybe there is something tricky in the implementation of enumFromTo :: NominalDiffTime -> NominalDiffTime -> [NominalDiffTime].
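
A possible explanation, stated as an assumption rather than something checked against the time library that shipped with GHC 6.10.4: the documentation for current versions of the time library says that enumeration functions treat NominalDiffTime as picoseconds, so [1..5] would step by one picosecond and produce trillions of elements instead of five. Whatever the cause, the Enum instance can be sidestepped by enumerating Integers and converting each one, since fromInteger treats its argument as seconds. The names delays and schedule below are made up for the sketch.

```haskell
import Data.Time.Clock (NominalDiffTime, UTCTime, addUTCTime)

-- Enumerate plain Integers, then convert each one to a number of seconds.
-- This never touches the Enum instance of NominalDiffTime.
delays :: [NominalDiffTime]
delays = map fromInteger [1 .. 5]

-- The times at which the delayed actions should run, relative to `time`.
schedule :: UTCTime -> [UTCTime]
schedule time = map (flip addUTCTime time) delays
```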

Monday, April 25, 2011

I started using Control.Concurrent to write concurrent code in Haskell. It's easier than Java's concurrency model because values are immutable in Haskell. In Java, one has to worry about values being changed by other threads, which means using locks in the right places and knowing how to use the java.util.concurrent classes.

Of course, it's still possible to deadlock in Haskell, such as with do { a <- takeMVar ma; b <- readMVar mb; putMVar ma a } and do { b <- takeMVar mb; a <- readMVar ma; putMVar mb b }.
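
One common way to avoid this particular deadlock is to have every thread acquire the MVars in the same order. The sketch below is an assumption about how the two snippets above could be restructured, not code from any real program; thread1, thread2, and runBoth are names made up for the example.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar

-- Same as the first snippet: holds ma while reading mb.
thread1 :: MVar Int -> MVar Int -> IO ()
thread1 ma mb = do
  a <- takeMVar ma
  _ <- readMVar mb
  putMVar ma a

-- Reordered second snippet: takes ma first, and only then mb.
-- mb is only ever empty while this thread holds ma, so thread1
-- can never be stuck on mb while holding ma.
thread2 :: MVar Int -> MVar Int -> IO ()
thread2 ma mb = do
  a <- takeMVar ma
  b <- takeMVar mb
  putMVar mb b
  putMVar ma a

-- Runs both threads to completion and returns the final contents.
runBoth :: IO (Int, Int)
runBoth = do
  ma <- newMVar 1
  mb <- newMVar 2
  done1 <- newEmptyMVar
  done2 <- newEmptyMVar
  _ <- forkIO (thread1 ma mb >> putMVar done1 ())
  _ <- forkIO (thread2 ma mb >> putMVar done2 ())
  takeMVar done1
  takeMVar done2
  a <- readMVar ma
  b <- readMVar mb
  return (a, b)
```

The cost is that thread2 now holds ma for longer than it strictly needs to, which is the usual trade-off of a fixed lock order.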

Also, there is no getting around dealing with external concurrency issues, such as with databases.

Monday, April 11, 2011

Thinking about real numbers in the 01_ programming language, the fractional part can be represented as big endian base 1/2. (Or is it little endian? Bits to the left represent larger numbers, but smaller powers of 1/2.) Infinite lists of bits can represent numbers that are ≥ 0 and ≤ 1.

Addition of fractional numbers can be defined as

+/fractional 0a 0b = +/fractional/carry a b 0_ 1_ +/fractional a b.
+/fractional 1a 0b = +/fractional/carry a b 1_ 0_ +/fractional a b.
+/fractional 0a 1b = +/fractional/carry a b 1_ 0_ +/fractional a b.
+/fractional 1a 1b = +/fractional/carry a b 0_ 1_ +/fractional a b.

where evaluating the carry of fractional addition is

+/fractional/carry 0a 0b carry-zero carry-one = carry-zero.
+/fractional/carry 1a 0b carry-zero carry-one = +/fractional/carry a b carry-zero carry-one.
+/fractional/carry 0a 1b carry-zero carry-one = +/fractional/carry a b carry-zero carry-one.
+/fractional/carry 1a 1b carry-zero carry-one = carry-one.

And the subtraction of fractional numbers is

-/fractional 0a 0b = -/fractional/borrow a b 0_ 1_ -/fractional a b.
-/fractional 1a 0b = -/fractional/borrow a b 1_ 0_ -/fractional a b.
-/fractional 0a 1b = -/fractional/borrow a b 1_ 0_ -/fractional a b.
-/fractional 1a 1b = -/fractional/borrow a b 0_ 1_ -/fractional a b.

where evaluating the borrow of fractional subtraction is

-/fractional/borrow 0a 0b borrow-zero borrow-one = -/fractional/borrow a b borrow-zero borrow-one.
-/fractional/borrow 1a 0b borrow-zero borrow-one = borrow-zero.
-/fractional/borrow 0a 1b borrow-zero borrow-one = borrow-one.
-/fractional/borrow 1a 1b borrow-zero borrow-one = -/fractional/borrow a b borrow-zero borrow-one.

Unlike the addition and subtraction of integers, these operations, in general, require infinite time and memory to calculate a finite number of bits, because the carry or borrow into a bit can depend on all of the bits after it.
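
As a rough illustration, the addition rules can be transliterated into Haskell using lazy lists of Bool. This is only a sketch of the same equations, not how 01_ works internally: the carry function returns a single Bool that is consed onto the result, where the 01_ version returns a one-bit list that is concatenated. The names Fraction, zeros, half, and quarter are made up for the example.

```haskell
-- A fractional number as an infinite list of bits: the first bit is
-- worth 1/2, the second 1/4, and so on.
type Fraction = [Bool]

zeros, half, quarter :: Fraction
zeros   = repeat False
half    = True : zeros
quarter = False : True : zeros

-- Each result bit is chosen by whether the bits after it produce a carry.
addFractional :: Fraction -> Fraction -> Fraction
addFractional (False : a) (False : b) = carry a b False True  : addFractional a b
addFractional (True  : a) (False : b) = carry a b True  False : addFractional a b
addFractional (False : a) (True  : b) = carry a b True  False : addFractional a b
addFractional (True  : a) (True  : b) = carry a b False True  : addFractional a b
addFractional _ _ = error "addFractional: expected infinite bit lists"

-- Returns carryZero if adding the two bit lists produces no carry out of
-- the top bit, and carryOne if it does.
carry :: Fraction -> Fraction -> r -> r -> r
carry (False : _) (False : _) carryZero _        = carryZero
carry (True  : a) (False : b) carryZero carryOne = carry a b carryZero carryOne
carry (False : a) (True  : b) carryZero carryOne = carry a b carryZero carryOne
carry (True  : _) (True  : _) _         carryOne = carryOne
carry _ _ _ _ = error "carry: expected infinite bit lists"
```

For example, take 4 (addFractional quarter quarter) gives the first four bits of one half. Adding 0.0101... to 0.1010... never yields even the first bit, because carry keeps recursing; that is the non-termination described above.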

Monday, March 28, 2011

Thinking about numbers in the 01_ programming language, the natural way to represent integers would be to use little-endian base 2. To further simplify things, consider only infinite lists of bits. So the important numbers are

zero = 0 zero.
one = 1 zero.

Negative numbers can also be represented

-one = 1 -one.

Integer addition can be defined as

+/integer 0a 0b = 0 +/integer a b.
+/integer 1a 0b = 1 +/integer a b.
+/integer 0a 1b = 1 +/integer a b.
+/integer 1a 1b = 0 +/integer/carry a b.

where integer addition with carry is

+/integer/carry 0a 0b = 1 +/integer a b.
+/integer/carry 1a 0b = 0 +/integer/carry a b.
+/integer/carry 0a 1b = 0 +/integer/carry a b.
+/integer/carry 1a 1b = 1 +/integer/carry a b.

And integer subtraction is

-/integer 0a 0b = 0 -/integer a b.
-/integer 1a 0b = 1 -/integer a b.
-/integer 0a 1b = 1 -/integer/borrow a b.
-/integer 1a 1b = 0 -/integer a b.

where integer subtraction with borrow is

-/integer/borrow 0a 0b = 1 -/integer/borrow a b.
-/integer/borrow 1a 0b = 0 -/integer a b.
-/integer/borrow 0a 1b = 0 -/integer/borrow a b.
-/integer/borrow 1a 1b = 1 -/integer/borrow a b.
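
The integer definitions above map almost line for line onto Haskell functions over lazy lists of Bool. This is a sketch for experimenting with the equations, not part of any 01_ implementation; Bits, minusOne, and the function names are made up for the example.

```haskell
-- An integer as an infinite little-endian list of bits.
type Bits = [Bool]

zero, one, minusOne :: Bits
zero     = False : zero
one      = True  : zero
minusOne = True  : minusOne

addInteger :: Bits -> Bits -> Bits
addInteger (False : a) (False : b) = False : addInteger a b
addInteger (True  : a) (False : b) = True  : addInteger a b
addInteger (False : a) (True  : b) = True  : addInteger a b
addInteger (True  : a) (True  : b) = False : addIntegerCarry a b
addInteger _ _ = error "addInteger: expected infinite bit lists"

-- Addition with a carry coming in from the previous bit.
addIntegerCarry :: Bits -> Bits -> Bits
addIntegerCarry (False : a) (False : b) = True  : addInteger a b
addIntegerCarry (True  : a) (False : b) = False : addIntegerCarry a b
addIntegerCarry (False : a) (True  : b) = False : addIntegerCarry a b
addIntegerCarry (True  : a) (True  : b) = True  : addIntegerCarry a b
addIntegerCarry _ _ = error "addIntegerCarry: expected infinite bit lists"

subInteger :: Bits -> Bits -> Bits
subInteger (False : a) (False : b) = False : subInteger a b
subInteger (True  : a) (False : b) = True  : subInteger a b
subInteger (False : a) (True  : b) = True  : subIntegerBorrow a b
subInteger (True  : a) (True  : b) = False : subInteger a b
subInteger _ _ = error "subInteger: expected infinite bit lists"

-- Subtraction with a borrow owed to the previous bit.
subIntegerBorrow :: Bits -> Bits -> Bits
subIntegerBorrow (False : a) (False : b) = True  : subIntegerBorrow a b
subIntegerBorrow (True  : a) (False : b) = False : subInteger a b
subIntegerBorrow (False : a) (True  : b) = False : subIntegerBorrow a b
subIntegerBorrow (True  : a) (True  : b) = True  : subIntegerBorrow a b
subIntegerBorrow _ _ = error "subIntegerBorrow: expected infinite bit lists"
```

Because each output bit depends only on the bits before it, take n works on any result: take 8 (addInteger one minusOne) is eight zero bits, and take 8 (subInteger zero one) is eight one bits.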

Monday, March 14, 2011

For work, some of the code is in JSTL (JavaServer Pages Standard Tag Library) EL (Expression Language). JSTL EL is weakly typed and dynamically typed. There is no compile-time checking.

One day, a coworker sent me a message saying some stuff stopped working after merging in some of my changes. So I tried running it and it didn't work. I also added logging to the code I changed, which was all Java, and it was working fine. I then tracked it down to some JSTL (untouched by me):

<c:set var="flag" value="{flag1 || flag2}"/>
...
<c:if test="${flag}">
... stuff that failed to appear ...
</c:if>

The first line should have been

<c:set var="flag" value="${flag1 || flag2}"/>

This is the type of stupid mistake that compile-time checking, especially with static typing, can catch.

Monday, February 28, 2011

I wrote a compiler for 01_ to LLVM (Low Level Virtual Machine) in a week of weekends and evenings. LLVM's static type-checking caught numerous silly mistakes. However, I got bitten twice because LLVM does not warn, at least with the default options, when the calling convention declarations of the caller and callee do not match. (I use fastcc because tail-call elimination is important for 01_ programs, and I failed to specify it in the caller those two times.) This seems like something that could be checked by the computer, and it reminds me why I prefer statically typed programming languages over dynamically typed ones.

I wrote the parser in one evening, which I had done before, so it was mostly a matter of getting reacquainted with the Parsec parsing library.

I spent another evening and a weekend learning the LLVM assembly language and writing the runtime library.

I spent another couple of evenings writing the code generator.

I spent the last evening chasing down memory leaks in the generated code.

The code is available at github.com.

Monday, February 14, 2011

I had been thinking about compiling 01_ to LLVM for a while, and finally decided to get started on it by playing around with LLVM assembly language. One thing I like about LLVM assembly language is the static type checking. Anyhow, I started writing a runtime library for 01_. The only data type in 01_ is a lazy list of bits. The data type does not permit circular references, so I'll use reference counting garbage collection.

Here's my first stab at the data type for 01_ values:

%val = type { i32, i1, %val*, { i1, %val* } (i8*)*, void (i8*)*, i8* }

which, in a C-like syntax, would be:

struct val;
struct eval_result { bool bit; struct val *next; };
struct val {
    int refcount;
    bool bit;
    struct val *next;
    struct eval_result (*eval)(void *env);
    void (*free_env)(void *env);
    void *env;
};

where the meaning of the fields depends on the state of the value. For an unevaluated promise, eval is non-null and bit and next are undefined. For a value that evaluates to nil, eval is null, next is null, and bit is undefined. For a non-nil value, eval is null, bit contains the bit value, and next points to the next value. That's a pretty large data structure for a single element of a bit list. I could shrink it by the size of a pointer by using the same location for next and env and casting, as env and free_env are never valid at the same time as next and bit. I won't do that, though, because it would make the code less clear, and having more understandable code is more important to me in this project.

Monday, January 31, 2011

I've integrated purchasing with Facebook Credits into a Facebook application. In the end, the API was good, but the documentation was terrible, making it hard to get started. To start with, I needed to handle a callback from Facebook, and the documentation said:
There are two callbacks Facebook makes on the application back end. The application needs to verify the fb_sig parameter to make sure that the request is coming from facebook.

I then had to guess and use trial and error to get it to work. What the documentation should have said, but didn't, was:
  • The callback is an HTTP POST.
  • The content-type of the posted content is application/x-www-form-urlencoded.
  • The two callbacks are indicated by the method parameter.
  • The fb_sig parameter only needs to be verified if the application does not have the OAuth 2.0 for Canvas setting enabled. If that setting is enabled, the fb_sig parameter is not sent, and all the parameters are in the signed_request, which includes a signature that needs to be verified.
  • The order_details parameter is a string containing the original JSON, which means it needs to be double parsed.
  • They provide an example for the response to the payments_get_items callback, but not for the payments_status_update callback. Following the given example for the payments_status_update response results in an unhelpful error message to the user, with no feedback pointing to the problem. As the documentation was unhelpful, and the Facebook developer forums had a few posts from someone facing the same problem with no response, I resorted to trial-and-error. (I'm not creating yet another account and password just to post to that forum.) The problem was that the content field in the payments_get_items response is supposed to be an array, but it is supposed to be a single item in the payments_status_update response.

Monday, January 24, 2011

I've said that I've found the Python programming language uncompelling. My impression is that it's a better language than perl, yet I still use perl from time to time and don't use Python. The main way I use perl is from the command line to run one-off scripts. From the early 1990s, when I first heard about Python, until 2009, when I decided to try using it, the one thing I knew about it was that indentation is syntactically significant, which means you can't write one-liners (even though the one line could contain 500 or more characters) like you can with braces and semicolons. perl -e 'while (<>) { ... }' is very convenient when trying to do something that would be too complicated with pure shell constructs.

Monday, January 17, 2011

I found out that in Tomcat 6, HttpServletRequest.getRequestURL() omits the ";jsessionid=[sessionid]", if present, while Tomcat 7 returns it. In either case, HttpServletResponse.encodeRedirectURL() adds it in or replaces it with the actual session id when the client has cookies disabled. However, what I needed was to have the ";jsessionid=[requested sessionid]", and not the actual session id, because I needed to reconstruct the URL that was actually sent for the redirect_uri parameter when obtaining an OAuth 2.0 access token.

What had happened was that when multiple hosts were added to the load balancer, clients with cookies disabled would get authentication failures. Since this was a new service in private beta, the immediate fix was to have a single host in the load balancer. I would then add a second host to the load balancer, do a quick test, remove the second host, and look at the logs to see what was going on. I had originally constructed the redirect_uri parameter using HttpServletResponse.encodeRedirectURL() on the results of HttpServletRequest.getRequestURL() and HttpServletRequest.getQueryString(). It worked fine with clients that had cookies enabled, because the session id wouldn't be in the URL, and the load balancer used cookies for host stickiness. However, for clients with cookies disabled, the load balancer would bounce the client between hosts on subsequent requests. Since sessions weren't really being shared between hosts, a client request sent to a different host would get a new session. (I did have a scheme to recover a very small set of essential session data through memcached, but I didn't think that saving redirect_uri in it was necessary, since it ought to be available when the client is retrieving that exact URL. I also should have a scheme to make sure the session ids from one host never collide with session ids from another host by prepending the hostname to the session id.)

After I figured out what was happening, I removed the HttpServletResponse.encodeRedirectURL() call in the construction of the redirect_uri parameter. It seemed fine to me, because I was using Tomcat 7. However, once it was running on other systems, there were immediate failures for clients with cookies disabled. From the logs, I could tell that HttpServletRequest.getRequestURL() was omitting the ";jsessionid=[session id]", so I theorized that it was because of differences between Tomcat 6 and Tomcat 7, and quickly confirmed it by running Tomcat 6 myself.

I finally settled on an ugly hack: if HttpServletRequest.isRequestedSessionIdFromURL() returned true and ";jsessionid=" was not in the result of HttpServletRequest.getRequestURL(), I would append ";jsessionid=" + HttpServletRequest.getRequestedSessionId(), which works on both Tomcat 6 and Tomcat 7.

Monday, January 10, 2011

When doing some shell scripting for build and deploy processes, I ran into something I found strange and confusing. I had a loop

while read host; do ssh -i id $host command; done < hosts

However, it only executed the remote command on the first host, which was odd. I threw in some echos to see if something strange was happening, but it only confirmed that the loop only went through one iteration. Then I changed ssh to echo, and the loop iterated through all the hosts. Finally, I figured that ssh was slurping up stdin for some reason, causing the loop to end, and fixed the problem with

while read host; do ssh -n -i id $host command; done < hosts

Monday, January 3, 2011

I created a simple web application and put it up on Amazon Web Services for about a week. The cost for a single instance and a single load balancer was around $21, including a few cents for storage and traffic. It would be about $82 a month. So I ported it to Google App Engine, where a low-traffic web application could be hosted for free. It took about a day. I replaced the storage layer that used Amazon S3 with one that used JDO. I replaced the memcached layer using the net.spy.memcached client with one using the JCache (JSR107) API. Those were straightforward and most of the rest of the code didn't need any changes. I also added the configuration files appengine-web.xml and jdoconfig.xml.

I ran into a few snags with the appengine servlet container, some of which might have been due to strange interactions with Google Guice, but that's just speculation without any real investigation.

The first one is probably a bug in the appengine (or jetty) implementation of HttpServletResponse.encodeRedirectURL(). It erroneously added ;jsessionid=sessionid to external URLs. I worked around that by not calling encodeRedirectURL() for external URLs, even though the javadoc says "All URLs sent to the HttpServletResponse.sendRedirect method should be run through this method."

The next one seemed like some weird difference in implementations. I used guice-servlet to configure servlet filters on /index.jsp, but they weren't being invoked for /, while the filters were being invoked under Tomcat and Glassfish. I worked around that by changing filter("/index.jsp").through(MyFilter.class) to filter("/","/index.jsp").through(MyFilter.class).

The last one was the weirdest, and has to be some kind of bug in either appengine, jetty, or guice. Once I got the filters passing the request for / through, instead of serving up /index.jsp, the service returned a redirect to //, which turned into a redirect to ///, etc, until the browser stopped due to too many redirects. I worked around that by kludging in a special servlet for /:

@Singleton
public static class RedirectToIndexJSP extends javax.servlet.http.HttpServlet {
    private static final long serialVersionUID = 0L;

    @Override
    protected void doGet(javax.servlet.http.HttpServletRequest request, javax.servlet.http.HttpServletResponse response) throws javax.servlet.ServletException, java.io.IOException {
        request.getRequestDispatcher("/index.jsp").forward(request, response);
    }
}

After that, it all worked. Compared to Amazon, the service on Google App Engine is a lot slower for the first request. Since this is a very low-traffic application, it's pretty much never running, so a new virtual machine has to start up when a request does come in, which seems to take around 10 seconds. Subsequent requests are reasonable, though still slower than Amazon.