Saturday, March 5, 2011

Using Exceptions for Pattern Matching in Java

For a while I've had an idea that Java's try/catch scaffolding could be used to simulate variable return types and execute different code based on that. I decided the simplest way to demonstrate this would be to implement a pattern matcher, so I've done that here. I don't think this is a particularly elegant solution to pattern matching in Java, and I don't advocate it's use - think of it more as a different way to think about exceptions in your code. If you do decide to use something like this in production code (for whatever reason) I recommend having the MultiReturnTypeFunction and Matcher extend the Function type in Google's Guava library.

The important classes are Case, Matcher, MultiReturnTypeFunction, and Switch.

Case is a special type of Exception which does not fill in its stack trace when thrown (for performance reasons). MultiReturnTypeFunction throws a variety of Cases, which are caught by Switch and matched against the available Matchers until a match is found. The Match's apply method is called and the result is the Switch's result.


public abstract class Case extends Exception {

    /**
     * It's important to override this to reduce the time it takes to throw.
     */
    @Override
    public Case fillInStackTrace() {
        return this;
    }
}


public interface Matcher<U> {

    public boolean matches(Case c);

    public U apply(Case c);
}



public interface MultiReturnTypeFunction<T> {

    /**
     * Never returns normally, always throws a Case.
     */
    public void apply(T param) throws Case;
}



public class Switch <T, U> {
   public <T, U> U match (MultiReturnTypeFunction gen, T param, Matcher<U> defaultCase, Matcher<U>... matchers) {
        try {
            gen.apply(param); // always throws a Case
        }
        catch (Case e) {
            for (Matcher<U> m : matchers)
                if (m.matches(e))
                    return m.apply(e);
            return defaultCase.apply(e);
        }
        throw new IllegalStateException("No matcher found for " + param + " in " + gen);
    }
}


I have some example code to show this in use. It's simply finds the sign of a series of random Integers and prints them to standard output.

First we need Case corresponding to each case - Positive, Negative, and Zero. We could also have null, but I think that's overkill for this example.


class CaseNegative extends Case {}
class CasePositive extends Case {}
class CaseZero extends Case {}


And the example code:


public class Main {
    public static void main(String[] args) {

        MultiReturnTypeFunction<Integer> signOf = new MultiReturnTypeFunction<Integer>() {
            public void apply(Integer param) throws Case {
                if (param > 0) {
                    throw new CasePositive();
                }
                if (param == 0) {
                    throw new CaseZero();
                }
                throw new CaseNegative();
            }
        };

        Matcher<String> throwExceptionMatcher = new Matcher<String>() {
            public boolean matches(Case c) {
                return true;
            }

            public String apply(Case c) {
                throw new IllegalStateException("No matching matcher found for " + c);
            }
        };

        Matcher<String> positiveValueMatcher = new Matcher<String>() {
            public boolean matches(Case c) {
                return c instanceof CasePositive;
            }

            public String apply(Case c) {
                return "Positive";
            }
        };

        Matcher<String> negativeValueMatcher = new Matcher<String>() {
            public boolean matches(Case c) {
                return c instanceof CaseNegative;
            }

            public String apply(Case c) {
                return "Negative";
            }
        };

        Matcher<String> zeroValueMatcher = new Matcher<String>() {
            public boolean matches(Case c) {
                return c instanceof CaseZero;
            }

            public String apply(Case c) {
                return "Zero";
            }
        };

        Switch<Integer, String> _switch = new Switch <Integer, String>();

        for (int i = 0; i < 10; i++)
        {
            int val = new Random().nextInt(3) - 1;
            String sign = _switch.match(signOf, val, throwExceptionMatcher,
                    zeroValueMatcher, negativeValueMatcher, positiveValueMatcher);

            System.out.println(val + " is " + sign);
        }
    }
}


Wednesday, November 10, 2010

Participating in NaNoWriMo 2010

This year I am participating in NaNoWriMo and I have decided to publish my story as I go. It's pretty standard High Fantasy, so if you enjoy reading about magical events in other worlds, please feel free to head over to http://eric-nanowrimo-2010.blogspot.com to read it - I'd love to get your feedback! Yes, yours!

Monday, August 30, 2010

Loose Change

I'm sure someone has posted this somewhere before, but I've been curious about the number of different coins I need to carry around to ensure I can always reach the exact change for any amount between 5c and $5 (in Australia 5c is the lowest denomination of coin). Tonight I decided to work out a set of coins I would be happy carrying around if minimising the weight of my wallet ever actually becomes important to me. Interestingly, the weight/value ratio of 5 (0.566g/c), 10 (0.565g/c) and 20 (0.565g/c) cent pieces is so similar that it doesn't really matter (on a weight/value basis) which of these is chosen. On the other hand, I also dislike carrying large numbers of coins, so I came up with this combination of coins:

  • 2 × $2 (13.2g)

  • $1 (9g)

  • 50c (15.55g)

  • 2 × 20c (22.6g)

  • 10c (5.65g)

  • 5c (2.83g)


This gives a total of 8 coins, adding up to $6.05 and weighing 68.83g total. Replacing a $2 coin with a $1 coin will give you the same flexibility while carrying around one dollar less, but you'll also be carrying 2.4g more (according to the source of all knowledge).

Wednesday, June 3, 2009

Replacing Booleans with Enums for Great Justice

From time to time I see method signatures that look like the following contrived example:


public interface Copier {
public void copy (boolean readOnly, boolean fuzz);
}


If someone calling this function doesn't have access to the documentation or source code, they won't be able to see the names of the variables, and therefore will not know that the first variable signifies read-only status, and the second whether or not to use fuzz. This leads to coding by guesswork, which is a bad thing.

Enumerations can fix this:


public interface Copier {
public static final enum ReadMode {READ_ONLY, READ_WRITE}
public static final enum FuzzType {NO_FUZZ, REGULAR_FUZZ}

public void copy (ReadMode readOnly, FuzzType fuzz);
}


This means that the developer does not have to look up documentation to see what each argument is for, and the compiler will shout at people if they mix up the parameters.

This way of doing things is also much more extensible. Suppose the code needed to handle a new type of fuzz for the latest version of the product: super fuzz. The FuzzType implementation could easily be replaced with the following:


FuzzType {NONE, DEFAULT, SUPER}


Although this breaks the API, it does so in a less destructive way than changing the type of the fuzz parameter - the code will still compile, anyone using a default clause in their case statements (or else clause in their if statements) will already be able to handle the API change, most of the time things will continue working and babies will not die. Another sticky situation occurs if a client iterates over the enumeration. If the client does this, then (according to Murphy's Law) the client will do something bad and things will break.

Replacing booleans with enumerations is not a new idea (others have suggested similar things) and, like anything, is not a silver bullet: an enum takes up space just like any other class, which is less efficient space-wise than using a boolean; a third possible value (null) is added which should be checked for; and the code becomes a little more complex. The main benefit from using an enum instead of a boolean is self-documenting code.

Tuesday, May 12, 2009

Doin' the backup

I've finally convinced myself that a) I should backup, and b) it's not all that expensive. Following jwz's advice, I bought myself a drive, stuck it in a USB enclosure and ran the backup command. After that was all finished I realised I had to disconnect the drive. This is where things got a little tricky. If I was running OSX or Windows, I could use the standard eject command (remove hardware on Windows); the disc would be unmounted, stopped, and (possibly) turned off. I wasn't too worried about turning the disc off because that happens when the plug is pulled. I also wasn't too worried about unmounting the disc because I know how to do that. What I was worried about was spinning the disc down.

When power is cut to a typical hard disc, one of two things happens. First, the head will just sit there between the platters, and if you shake the drive or drop it (or whatever), then the disc will get scratched and you will lose data. Second, (preferably) the head will perform an emergency retraction and get itself out from between the platters. This is a good thing because it means that it won't stay in there and delete your precious bits, but it's still not ideal because there is the potential for it to scratch your disc on the way home. So, ideally, the disc should be stopped before the power is removed.

But, in this case, I'm not running OSX or Windows - I'm running Linux, and Linux doesn't have a simple, built-in, all-in-one solution for this.

The first program I tried was scsiadd. Random Bits has a pretty good post on scsiadd, and I was hopeful, but I found that, for whatever reason, it didn't work for me.

sdparm is what I was looking for, and it works like a charm.

As a side note, I probably should also suspend the USB port as Yan Li suggests, but my kernel doesn't like that so I didn't bother.

The script that I've settled on (for now) is:

#!/bin/sh

mount /dev/sdb /media/backup

rsync -vaxE --delete --ignore-errors / /media/backup/

umount /dev/sdb

## spin the disc down
sdparm -C stop /dev/sdb

## wait for disc to spin down
sleep 10

echo "Done"

I'll probably add a few more lines to check stuff like whether something is actually at /dev/sdb but this script already does pretty much everything I need it to.