I write the code: July 2009

Friday, July 31, 2009

A month ago, I posted some obfuscated code. It's an Unlambda interpreter, implemented in 45 lines of code in a language in which the only data type is a list of bits and the only operations are list concatenation and pattern matching.

Here is the code with comments and meaningful identifiers.


== unlambda 2 interpreter

unl2 input code = eval parse code _ _ _ _ input.

==============================================

== bit list utilities

drop 0x 0y = drop x y.
drop 0x 1y = drop x y.
drop 1x 0y = drop x y.
drop 1x 1y = drop x y.
drop _ y = y.

take 0x 0y = 0 take x y.
take 0x 1y = 1 take x y.
take . . = _.

concat x y = x y.

==============================================

== list processing
== 00 - 8 bits of data follow
== 01 - start sublist
== 11 - end sublist

car data = /car data _.
/car _ _ = _.
/car 00data _ = 00 take 00000000 data.
/car 00data nesting = 00 take 00000000 data /car drop 00000000 data nesting.
/car 01data nesting = 01 /car data concat 0 nesting.
/car 11. _ = _.
/car 11. 0_ = 11.
/car 11data 0nesting = 11 /car data nesting.

cdr data = drop car data data.

unlist 01data = /unlist data.
/unlist 11_ = _.
/unlist data = car data /unlist cdr data.

==============================================

parse _ = _.

== whitespace
parse 00100000input = parse input.
parse 00001001input = parse input.
parse 00001010input = parse input.
parse 00001101input = parse input.

== comment
parse 00100011input = strip-comment input.

strip-comment _ = _.
strip-comment 00001010input = parse input.
strip-comment input = strip-comment drop 00000000 input.

== `
parse 01100000input = 00 01100000 parse input.

== k
parse 01101011input = 00 01101011 parse input.

== s
parse 01110011input = 00 01110011 parse input.

== i
parse 01101001input = 00 01101001 parse input.

== v
parse 01110110input = 00 01110110 parse input.

== c
parse 01100011input = 00 01100011 parse input.

== d
parse 01100100input = 00 01100100 parse input.

== r
parse 01110010input = 01 00 00101110 00 00001010 11 parse input.

== .c
parse 00101110input = 01 00 00101110 00 take 00000000 input 11 parse drop 00000000 input.

== e
parse 01100101input = 00 01100101 parse input.

== @
parse 01000000input = 00 01000000 parse input.

== ?c
parse 00111111input = 01 00 00111111 00 take 00000000 input 11 parse drop 00000000 input.

== |
parse 01111100input = 00 01111100 parse input.

== ignore unrecognized characters
parse input = parse drop 00000000 input.

==============================================

first-expr _ . = _.
first-expr . _ = _.
first-expr nest 00 01100000code = 00 01100000 first-expr concat 0 nest code.
first-expr 0nest 01 00 00101110 00 code = 01 00 00101110 00 take 0000000000 code first-expr nest drop 0000000000 code.
first-expr 0nest 01 00 00111111 00 code = 01 00 00111111 00 take 0000000000 code first-expr nest drop 0000000000 code.
first-expr 0nest code = take 0000000000 code first-expr nest drop 0000000000 code.

==============================================

== eval: end of code
eval _ . . _ . . = _.

== eval: delay
== stack was: d apply REST
== stack becomes: `dF REST
== `dF is: (D (F))
eval code 00 01100100. 00 01100000. stack-rest current-char input = eval drop first-expr 0 code code concat 01 00 01000100 01 concat first-expr 0 code 11 11 car stack-rest cdr stack-rest current-char input.

== eval
== stack was: apply REST
eval code 00 01100000. stack-second stack-rest current-char input = eval cdr code car code 00 01100000 concat stack-second stack-rest current-char input.

== eval
== stack was: X apply REST
eval code stack-first 00 01100000. stack-rest current-char input = eval cdr code car code stack-first concat 00 01100000 stack-rest current-char input.

== eval: apply
== stack was: X Y apply REST
eval code stack-first stack-second 00 01100000stack-rest current-char input = apply code stack-first stack-second stack-rest current-char input.

== eval
eval code stack-first stack-second stack-rest current-char input = eval cdr code car code stack-first concat stack-second stack-rest current-char input.

==============================================

== apply: k
== stack was: X k apply REST
== stack becomes: `kX REST
== `kX is: (K X)
apply code stack-first 00 01101011. stack-rest current-char input = eval code concat 01 00 01001011 concat stack-first 11 car stack-rest cdr stack-rest current-char input.

== apply: `kX = (K X)
== stack was: Y `kX apply REST
== stack becomes: X REST
apply code . 01 00 01001011stack-second stack-rest current-char input = eval code car stack-second car stack-rest cdr stack-rest current-char input.

== apply: s
== stack was: X s apply REST
== stack becomes: `sX REST
== `sX is: (S X)
apply code stack-first 00 01110011. stack-rest current-char input = eval code concat 01 00 01010011 concat stack-first 11 car stack-rest cdr stack-rest current-char input.

== apply: `sX = (S X)
== stack was: Y `sX apply REST
== stack becomes: `sXY REST
== `sXY is: (s8 Y X)
apply code stack-first 01 00 01010011stack-second stack-rest current-char input = eval code concat 01 00 11110011 concat stack-first stack-second car stack-rest cdr stack-rest current-char input.

== apply: `sXY = (s8 Y X)
== stack was: Z `sXY apply REST
== stack becomes: Z X apply `SYZ apply REST
== `SYZ is: (S8 Y Z)
apply code stack-first 01 00 11110011stack-second stack-rest current-char input = eval code stack-first car cdr stack-second concat 00 01100000 concat 01 00 11010011 concat car stack-second concat stack-first concat 11 00 01100000 stack-rest current-char input.

== apply: `SYZ = (S8 Y Z)
== stack was: X `SYZ apply REST
== stack becomes: Z Y apply X apply REST
apply code stack-first 01 00 11010011stack-second stack-rest current-char input = eval code car cdr stack-second car stack-second concat 00 01100000 concat stack-first concat 00 01100000 stack-rest current-char input.

== apply: .c
== stack was: X .c apply REST
== stack becomes: X REST
apply code stack-first 01 00 00101110 00 stack-second stack-rest current-char input = take 00000000 stack-second eval code stack-first car stack-rest cdr stack-rest current-char input.

== apply: i
== stack was: X i apply REST
== stack becomes: X REST
apply code stack-first 00 01101001. stack-rest current-char input = eval code stack-first car stack-rest cdr stack-rest current-char input.

== apply: v
== stack was: X v apply REST
== stack becomes: v REST
apply code stack-first 00 01110110. stack-rest current-char input = eval code 00 01110110 car stack-rest cdr stack-rest current-char input.

== apply: c
== stack was: X c apply REST
== stack becomes: (continuation) X apply REST
== (continuation) is: (C (code) (REST))
apply code stack-first 00 01100011. stack-rest current-char input = eval code concat 01 00 01000011 01 concat code concat 11 01 concat stack-rest 11 11 stack-first concat 00 01100000 stack-rest current-char input.

== apply: (continuation) = (C (code) (cREST))
== stack was: X (continuation) apply REST
== stack becomes: X cREST
apply . stack-first 01 00 01000011 stack-second . current-char input = eval unlist car stack-second stack-first car unlist car cdr stack-second cdr unlist car cdr stack-second current-char input.

== apply: d
== stack was: X d apply REST
== stack becomes: X REST
apply code stack-first 00 01100100. stack-rest current-char input = eval code stack-first car stack-rest cdr stack-rest current-char input.

== apply: `dF = (D (F))
== stack was: X `dF apply REST
== stack becomes: `DX apply REST
== `DX is (d8 X)
apply code stack-first 01 00 01000100stack-second stack-rest current-char input = eval concat unlist car stack-second code concat 01 00 11100100 concat stack-first 11_ 00 01100000 stack-rest current-char input.

== apply: `DX = (d8 X)
== stack was: F `DX apply REST
== stack becomes: X F apply REST
apply code stack-first 01 00 11100100stack-second stack-rest current-char input = eval code car stack-second stack-first concat 00 01100000 stack-rest current-char input.

== apply: e
apply . . 00 01100101. . . . = _.

== apply: @
== stack was: X @ apply REST
== stack becomes: v X apply REST
apply code stack-first 00 01000000. stack-rest . _ = eval code 00 01110110 stack-first concat 00 01100000 stack-rest _ _.

== stack becomes: i X apply REST
apply code stack-first 00 01000000. stack-rest . input = eval code 00 01101001 stack-first concat 00 01100000 stack-rest take 00000000 input drop 00000000 input.

== apply: ?c
== stack was: X ?c apply REST
== stack becomes: v X apply REST
apply code stack-first 01 00 00111111 00stack-second stack-rest _ input = eval code 00 01110110 stack-first concat 00 01100000 stack-rest _ input.

== stack becomes: ? X apply REST
apply code stack-first 01 00 00111111 00stack-second stack-rest current-char input = eval code ? current-char stack-second stack-first concat 00 01100000 stack-rest current-char input.

? _ 11 = 00 01101001.
? 0current-char 0compare = ? current-char compare.
? 1current-char 1compare = ? current-char compare.
? . . = 00 01110110.

== apply: |
== stack was: X | apply REST
== stack becomes: v X apply REST
apply code stack-first 00 01111100. stack-rest _ input = eval code 00 01110110 stack-first concat 00 01100000 stack-rest _ input.

== stack becomes: .c X apply REST
apply code stack-first 00 01111100. stack-rest current-char input = eval code concat 01 00 00101110 00 concat current-char 11 stack-first concat 00 01100000 stack-rest current-char input.

Wednesday, July 29, 2009

I recently found a single-character bug that had been around for months, and was even in production. In a stored procedure, there was something like


SELECT COLUMN1, COLUMN2, COLUMN3
COLUMN4, COLUMN5 FROM TABLE1 WHERE ...

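It wasn't discovered, since COLUMN3 and COLUMN4 weren't being used. Without the comma after COLUMN3, COLUMN4 is parsed as an alias for COLUMN3, so the query is still valid and simply returns no column named COLUMN3. Once I started implementing a feature that used COLUMN3, trying to retrieve it using the stored procedure resulted in a "no such column" exception, and once I looked at the source code for the stored procedure, the cause was obvious.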

Monday, July 27, 2009

One minor annoyance is when things are declared public when they should have been declared private. It makes maintaining code more difficult, because changing public signatures means having to find out where they are used. If there are public signatures that are used externally, then changing them is even more difficult and might not even be worth it. Private signatures, on the other hand, have limited scope, making them much easier to change. Dealing with public signatures that should have been private means having to look and see that they aren't being used, shouldn't be used, and won't be used outside the class, before changing them (and making them private).

Using the Spring Framework usually means making a bunch of public setter methods. Another option would be to have them all as constructor arguments, which is more error-prone, since constructor arguments aren't passed by name and can't be defaulted. The public setter methods aren't visible in practice, though, since the rest of the code sees the object through an interface that does not include the setters.

However, since having things set in the constructor better guarantees that the object is properly initialized when it is instantiated than having setter methods does, I'm considering something like making an auxiliary parameter class for a constructor argument, and using public setters in the parameter class. Something like


    public class Bean {
        public Bean(Parameters parameters) {
            ...
        }

        public static class Parameters {
            private Object parameter;
            ...

            public void setParameter(Object parameter) {
                this.parameter = parameter;
            }

            ...
        }
    }

and the configuration would be


  <bean class="Bean">
    <constructor-arg>
      <bean class="Bean$Parameters">
        <property name="parameter">...</property>
        ...
      </bean>
    </constructor-arg>
  </bean>

which has the advantage of named properties with default values without adding public setter methods to the main object.
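
A slightly fuller sketch of the same idea, runnable without Spring, might look like the following. The names Greeter, greeting, and repeat are hypothetical and chosen only for illustration; the point is that the defaults and setters live on the parameter class, while the main object only has a constructor.

```java
// Hypothetical example: Greeter, greeting, and repeat are illustrative names,
// not taken from any real project.
class Greeter {
    private final String greeting;
    private final int repeat;

    // Everything is copied out of the parameter object here, so the
    // Greeter is fully initialized as soon as it is constructed.
    public Greeter(Parameters parameters) {
        if (parameters.repeat < 1) {
            throw new IllegalArgumentException("repeat must be at least 1");
        }
        this.greeting = parameters.greeting;
        this.repeat = parameters.repeat;
    }

    public String greet(String name) {
        StringBuilder result = new StringBuilder();
        for (int i = 0; i < repeat; i++) {
            result.append(greeting).append(", ").append(name).append("! ");
        }
        return result.toString().trim();
    }

    // The public setters live here, not on Greeter itself.
    public static class Parameters {
        private String greeting = "Hello"; // default when not configured
        private int repeat = 1;            // default when not configured

        public void setGreeting(String greeting) { this.greeting = greeting; }
        public void setRepeat(int repeat) { this.repeat = repeat; }
    }
}

class GreeterDemo {
    public static void main(String[] args) {
        Greeter.Parameters parameters = new Greeter.Parameters();
        parameters.setRepeat(2); // greeting keeps its default
        System.out.println(new Greeter(parameters).greet("world"));
    }
}
```

Here the main method plays the role that the XML configuration would play in a Spring application: it sets only the properties it cares about, by name, and leaves the rest at their defaults.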

Friday, July 24, 2009

I have a work computer running Microsoft Windows that I pretty much only use for email, since the company email is on Microsoft Exchange. The administration of that computer is done remotely by company IT, and it's been getting pretty slow with all the crap that they keep installing on it. Recently, whenever restarting the computer, it comes to a halt with a dialog window with title "SetWallpaper" saying "Invalid Image" or something. I've never set any wallpaper or anything like that for that computer, so what is that junk? I have to click on the button on the window to get past that before anything else happens. There is also other crap that is trying to dereference a null pointer, and a window comes up saying a program tried to access memory at 0x000000F or something like that.

Which reminds me that there are low-level issues that I'm glad I don't have to deal with when using Java (or other good programming languages), such as pointer arithmetic, memory aliasing, or manual memory management. And I also don't miss other low level mechanisms, such as setjmp/longjmp.

Back to corporate annoyances, the HR department comes up with these initiatives with required participation, and they have their own web applications with their own passwords. And these passwords have different restrictions and requirements with regards to capitalized and uncapitalized letters, numbers, and punctuation, which means I can't use the password I use for the Microsoft Windows login. So I never remember the passwords that I have to use. So every time I have to log into this or that HR application, I have to go to the forgot password page and get a temporary password sent to me.

Wednesday, July 22, 2009

When I was young, my family had an Atari 800. We had a number of cartridge games. One thing my brother did was to copy cartridges to disk. He would start the computer with the cartridge door open and the switch jammed on, and then stick in the cartridge while the computer was running his program. His first attempts failed. He got it to work by copying the cartridge ROM into RAM, and then saving the copied data instead of saving from the ROM directly.

The Atari disk, which had 128 byte sectors, used the last 2 or 3 bytes to link to the next sector of the file. When saving to disk, the Atari would overwrite the RAM with the links, save the sector, then restore the overwritten bytes. When saving the screen memory, spurious characters would pop up at intervals and then disappear. That's also the reason why saving directly from ROM didn't work: the link bytes couldn't be written into ROM.

One of the motivations for copying the cartridges was to be able to cheat by getting extra lives by tweaking some data, or by getting infinite lives by modifying a branch or increment or decrement operation, or something like that.

The Atari also went beep beep beep when reading from the disk, and clunk clunk clunk when writing to the disk.

Monday, July 20, 2009

One thing that I didn't understand when I first started my current job was the difference between internal and external hostnames (and ip addresses and ports). The use of load-balancers for internal as well as external transactions also complicates the issue.

It remains a stumbling block, since, in development and QA environments, the internal and external hostnames are the same, causing many developers to be either ignorant of the difference, or to treat it rather cavalierly.

It's not terribly complicated, but an awareness is required. The internal and external URLs have to be configured separately. Using a single URL configuration is one stumbling block. When sending callback URLs or redirect URLs, for example, one needs to know to send the internal URL when passing it to an internal system, and to send the external URL when passing it to an external system.
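
A minimal sketch of keeping the two configurations separate might look like this. The host names, the port, and the names ServiceUrls, Audience, and callbackFor are all made up for illustration; the only point is that the caller has to say which kind of system will receive the URL.

```java
import java.net.URI;

// Hypothetical sketch: every name and host below is an assumption
// made for illustration.
enum Audience { INTERNAL, EXTERNAL }

final class ServiceUrls {
    private final URI internalBase;
    private final URI externalBase;

    // The two base URLs are configured separately, even in environments
    // where they happen to be the same.
    ServiceUrls(URI internalBase, URI externalBase) {
        this.internalBase = internalBase;
        this.externalBase = externalBase;
    }

    // Builds a callback or redirect URL for the system that will receive it.
    URI callbackFor(Audience audience, String path) {
        URI base = (audience == Audience.INTERNAL) ? internalBase : externalBase;
        return base.resolve(path);
    }
}

class ServiceUrlsDemo {
    public static void main(String[] args) {
        ServiceUrls urls = new ServiceUrls(
                URI.create("http://app01.internal.example:8080/"),
                URI.create("https://www.example.com/"));
        System.out.println(urls.callbackFor(Audience.INTERNAL, "callbacks/payment"));
        System.out.println(urls.callbackFor(Audience.EXTERNAL, "callbacks/payment"));
    }
}
```

In a development or QA environment, both base URLs could be set to the same value, and the code that picks between them would still be exercised.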

Friday, July 17, 2009

One project that I worked on had a cumbersome check-in process. Due to this or that problem, it was instituted that a brief write-up of what changes would be made had to be created, reviewed, and approved before any work could be done on the code. Furthermore, after the work was done, every change had to be reviewed before it could be checked in.

At first, getting the changes reviewed was horrible. The person making the changes would have to find someone to review the code, then they both had to go over to the computer with the changes, and go over the changes. Most of the time, the reviewer wouldn't really be able to find any errors other than really glaring superficial ones anyhow.

Eventually, a coworker came up with some shell scripts that packaged up the diffs of a change into a zip file and attached it to the Jira issue, and others that scraped the Jira page for the zip file, then unpacked it and applied the changes to a source tree. That made finding someone to review the changes asynchronous and less painful. For the reviewer, it was now easier to examine the changes more in depth, including doing some testing of the changes if warranted. So, the review process became almost bearable.

All this heavy-handed process would be fine in a locked-down release branch, but this was for everything.

It got so that if I happened to notice a bug in some code that I wasn't reviewing, I'd just let it be. Before then and these days, I'd just fix it if it was trivial. But, at that time, if someone else didn't come across it, it probably wasn't important enough to fix. And if someone else did come across it, I'd know just how to fix it pretty much immediately. It just wasn't worth the hassle to have to make a write-up, have it reviewed and approved, just for what was often some one-line fix, and then have to get that reviewed before checking it in.

Wednesday, July 15, 2009

I'm not very good at or about writing documentation. I have some ideas on what I think documentation should be. I think documenting interfaces is more important than documenting implementations. I like documentation systems like Javadoc, since they keep the documentation right with the code, allow boilerplate to be automatically generated, and encourage the documentation of interface over implementation.

For external documents, there's often the essential information, and then there's lots of verbiage. I know how to stick in the essential information. I don't know how to generate the verbiage.

Also, where I worked in the past, documents had to be in Microsoft Word format, and were emailed around. I didn't like using Microsoft Word. If I could get away with it, I'd write the documentation in a plain text file using emacs.

Nowadays, the documentation is on Atlassian Confluence, which I greatly prefer over Microsoft Word documents. The Confluence search is horrible, but, other than that, I think it's pretty good. There was an intermediary period where documentation was either in Microsoft Word or in Confluence, or in both, and in discussions on where it should be, I'd always vote for Confluence.

Monday, July 13, 2009

I first started using Linux in January 1992. I downloaded the bootdisk and rootdisk of version 0.11 from tsx-11.mit.edu. My computer had 2 megabytes of RAM and a 40 megabyte disk, and a 16MHz processor and a 2400 baud modem, so it was super slow. The lack of RAM was the biggest issue, and gcc would be swapping forever to build anything.

So, what I did was to build gcc on the university SunOS computer (which had replaced an Ultrix computer a year or two earlier) to cross-compile to 80386. That worked pretty well. I also tried to build gas to cross-assemble to object files, but that failed. But gas was fast enough on my computer, so, whenever I wanted to build anything, I'd download it to the university computer and cross-compile it, then download the .s files to my Linux computer, and assemble and link them. It beat waiting hours for gcc on my Linux computer. That's how I built nethack, which I played quite a bit back then.

Back then, I thought I'd be switching to some GNU operating system eventually. After all, Linux was 386-only at the time. But nowadays, Linux is very widely used. My main computer at work now runs Linux, though I also have one with Microsoft Windows, as the company email and meetings are on Microsoft Exchange. The production application servers are all Linux, though the databases are Solaris.

Friday, July 10, 2009

I first started using source control when I was a student. I started with rcs. I had also read about sccs, but rcs was free and it was available. I liked having a history of changes and the ability to get at old versions of files, though I didn't fully appreciate its value at the time. A while later, I started using cvs on projects worked on by 2-4 people. This was where CVSROOT was on the filesystem.

Then, I took a job that used Microsoft Visual SourceSafe. It had its advantages and disadvantages compared to cvs. I disliked the interface, though. From time to time, I tried to figure out how to use it from the command line, but never really got anywhere.

A while later, the source control got switched to cvs (client-server). I really liked that change. Mainly because I could then do away with having to work on Microsoft Windows at all.

Where I work now uses perforce, which I like. It's a modern source control system with a command-line interface, and there's a nice emacs package for it that I also use often.

I imagine other modern source control systems are pretty much like perforce, but the most I've ever done with them is to download source code from public subversion repositories. I've also played with GNU arch a little, but with just a local repository. It was like going back to rcs, in a sense.

Wednesday, July 8, 2009

One of the times where the motivation to work on code is lower than I'd like is at the beginning of a project, where nothing is there. This happens quite a bit with personal projects. At work, starting projects from scratch is very rare. One problem is that there are lots of things that I have ideas about, but there is no framework that they'll fit in. Once the framework is in place, the motivation to work on something becomes much higher, because, when I have an idea, I can get straight to work on it and see how it works.

I generally get started by working bottom-up, making some components that I think I'll be using. Once I have some components, I'll work top-down, building out a skeleton framework. Once a sufficient framework is in place, it gets a lot more fun, where I can add stuff and test it out immediately.

The Spring Framework inversion of control container is really helpful throughout this process. I can make an initial implementation of a component by hard-coding its return values. After testing the framework around that component, I can swap the implementation for another test implementation that saves everything in memory. After that, I can swap it for a real implementation that stores stuff in a database. All that is also possible using hand-coded factory methods, but with the Spring Framework, it is available at every level -- each component can be injected with other components that can be swapped in or out in the configuration file, all without having to code up more factory methods.
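
The progression can be sketched without Spring at all. In the following, the names NameStore, HardCodedNameStore, InMemoryNameStore, and WelcomeService are hypothetical, and the wiring in main merely stands in for what the Spring configuration file would do; a database-backed implementation would be a third class behind the same interface.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical component interface; the rest of the code only sees this.
interface NameStore {
    void save(int id, String name);
    String find(int id);
}

// First implementation: hard-coded return values, enough to get the
// surrounding framework running.
class HardCodedNameStore implements NameStore {
    public void save(int id, String name) { /* deliberately ignored */ }
    public String find(int id) { return "test user"; }
}

// Second implementation: keeps everything in memory.
class InMemoryNameStore implements NameStore {
    private final Map<Integer, String> names = new HashMap<>();
    public void save(int id, String name) { names.put(id, name); }
    public String find(int id) { return names.get(id); }
}

// A component that is injected with whichever NameStore is configured.
class WelcomeService {
    private final NameStore store;
    WelcomeService(NameStore store) { this.store = store; }
    String welcome(int id) { return "Welcome, " + store.find(id); }
}

class WiringDemo {
    public static void main(String[] args) {
        // Swapping implementations changes only this wiring,
        // not WelcomeService.
        System.out.println(new WelcomeService(new HardCodedNameStore()).welcome(1));

        NameStore store = new InMemoryNameStore();
        store.save(1, "Alice");
        System.out.println(new WelcomeService(store).welcome(1));
    }
}
```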

Monday, July 6, 2009

The first bug tracking system I used had some Microsoft Windows-only client. I think it was called Track. Its database got corrupted too often. The next one I used was Bugzilla, which was a huge improvement. Plus, I didn't have to use Microsoft Windows in order to use it. The next one was Rational ClearQuest. It had a web interface, but it, rather pointlessly, I thought, used Java applets. It got replaced by Jira, another huge improvement, which is still being used.

When I get assigned a bug, it's sometimes clear from the report what code needs to be fixed. Most of the time, it's not.

Sometimes, logs are attached to the bug, and those are sometimes enough to determine what the fix should be. Sometimes there aren't logs attached, or the logs attached aren't enough. If the bug was filed by QA, then I go to the QA system and look at the logs there. The logs on the QA systems usually go back a week or two, and the bugs are usually assigned to me three or four days after they were filed. I can usually go back and see more context if logs were attached, or otherwise try to guess which logs from shortly before the bug was filed are applicable.

Most of the developers where I work will wipe out the logs when reproducing bugs, probably because they consider old logs clutter. But, it does mean losing some context, and sometimes when they come to me for help, the logs they've wiped out would have been helpful. I've never wiped out the logs on my systems, and have logs going back for years.

Finally, once all else fails, I'll try to reproduce the bug on my system.

Friday, July 3, 2009

The problem with restructuring a bunch of code to make adding a bunch of features much cleaner is that management wants some of those features that I added after restructuring the code in a branch that was made before the restructuring. And of course, the restructuring can't go into that branch.

That's what happened to me recently. Version 1.x had been branched off, and I did lots of restructuring of the main branch for version 2.0, as well as adding a bunch of features. Version 1.x.y was frozen except for critical bug fixes, but version 1.x.z was given an extended schedule, and some features I put in 2.0 now need to be in 1.x.z, and the implementations of most of those features depend on the restructuring to be done cleanly. I think I'll be hacking in throwaway implementations for a lot of them in 1.x.z.

Wednesday, July 1, 2009

One thing I've wished for from time to time when using Java is tuples and maybe some syntactical sugar on top of them for returning multiple values. This happens when I have a method that needs to return one more thing.

One solution I've used in the past was to pass in an array, and put the value to be returned in the array. It's really ugly, and it's not my preferred way of doing things.

What I do now is declare a new class to hold the multiple values.
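
The two approaches can be compared side by side in a small sketch. Integer division with a remainder is used here only as a stand-in for a method that needs to return one more thing; the names Division, Result, and divide are hypothetical.

```java
// Hypothetical example: returning a quotient and a remainder from one call.
class Division {
    // The array workaround: the extra value is written into a
    // caller-supplied one-element array.
    static int divide(int dividend, int divisor, int[] remainderOut) {
        remainderOut[0] = dividend % divisor;
        return dividend / divisor;
    }

    // The dedicated class: both values are named and returned together.
    static final class Result {
        final int quotient;
        final int remainder;

        Result(int quotient, int remainder) {
            this.quotient = quotient;
            this.remainder = remainder;
        }
    }

    static Result divide(int dividend, int divisor) {
        return new Result(dividend / divisor, dividend % divisor);
    }
}

class DivisionDemo {
    public static void main(String[] args) {
        int[] remainder = new int[1];
        int quotient = Division.divide(17, 5, remainder);
        System.out.println(quotient + " remainder " + remainder[0]);

        Division.Result result = Division.divide(17, 5);
        System.out.println(result.quotient + " remainder " + result.remainder);
    }
}
```

The dedicated class costs a few more lines, but the call site no longer needs a throwaway array, and each returned value has a name.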

It would be nicer to be able to declare something like this


    Tuple<Integer,Integer,Integer> get3Numbers();

and then call it with


    a, b, c = get3Numbers();

or


    (a, b, c) = get3Numbers();

where a, b, and c are ints in this example.
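
Without such syntax, the closest approximation in plain Java is probably a hand-written generic class. The following is only a sketch of that approximation; Tuple3 is a hypothetical class, not something from the standard library, and the unpacking still has to be done by hand.

```java
// Hypothetical generic three-element tuple; not a standard library class.
final class Tuple3<A, B, C> {
    final A first;
    final B second;
    final C third;

    private Tuple3(A first, B second, C third) {
        this.first = first;
        this.second = second;
        this.third = third;
    }

    // Factory method so callers can rely on type inference.
    static <A, B, C> Tuple3<A, B, C> of(A first, B second, C third) {
        return new Tuple3<>(first, second, third);
    }
}

class Numbers {
    static Tuple3<Integer, Integer, Integer> get3Numbers() {
        return Tuple3.of(1, 2, 3);
    }

    public static void main(String[] args) {
        // No destructuring assignment, so each element is unpacked by hand.
        Tuple3<Integer, Integer, Integer> numbers = get3Numbers();
        int a = numbers.first;
        int b = numbers.second;
        int c = numbers.third;
        System.out.println(a + b + c);
    }
}
```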
