Tuesday, September 11, 2007

Lesson 3: case

Overview


Today we will start learning about the case statement (strictly speaking, case is an expression in Haskell, but this lesson uses the more familiar term). Here is some code to get us started:
module Main where

main =
   do putStrLn "Do you like Haskell? [yes/no]"
      answer <- getLine
      case answer of
         "yes" -> putStrLn "yay!"
         "no" -> putStrLn "I am sorry to hear that :("
         _ -> putStrLn "say what???"

A Closer Look at case


The first line of the case statement looks like this:

case answer of

You can put any valid Haskell expression between the keywords case and of. In this example we have a very simple expression: the variable answer.
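
As a small sketch of this point, the version below puts the expression map toLower answer between case and of, so that YES and Yes are treated like yes. The function name respond is made up for this illustration and is not part of the lesson's program; toLower comes from the standard Data.Char module. To keep the sketch easy to run, main feeds in a few fixed strings instead of reading from the user.

```haskell
module Main where

import Data.Char (toLower)

-- 'respond' is a hypothetical helper, introduced only for this sketch.
respond :: String -> String
respond answer =
   case map toLower answer of   -- any expression may appear here
      "yes" -> "yay!"
      "no"  -> "I am sorry to hear that :("
      _     -> "say what???"

main :: IO ()
main = mapM_ (putStrLn . respond) ["yes", "No", "maybe"]
```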

The next two lines look like this:

"yes" -> putStrLn "yay!"
"no" -> putStrLn "I am sorry to hear that :("

Notice that they are indented more than the case line. This is another example of whitespace-sensitive layout in Haskell. As with the do statement, each alternative of the case statement must be indented by the same amount. If a line is indented more, then it is a continuation of the previous line. If a line is indented less, then the previous line is the last alternative in the case statement.

The case statement checks each alternative, in the order they are listed, until it finds a pattern that matches. Once a match is found, the expression on the right-hand side of the -> is evaluated, and no further alternatives are considered.
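
Here is a rough sketch of the layout rule. The name classify is hypothetical and exists only for this illustration. The second line of the "yes" alternative is indented more than the alternatives themselves, so it continues that alternative instead of starting a new one.

```haskell
module Main where

-- 'classify' is a hypothetical function used only for this sketch.
classify :: String -> String
classify answer =
   case answer of
      "yes" -> "yay! " ++
                  "(still part of the yes alternative)"
      "no"  -> "I am sorry to hear that :("
      _     -> "say what???"

main :: IO ()
main = mapM_ (putStrLn . classify) ["yes", "no", "whee"]
```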

The Default Wild Card Alternative

The final line is:

_ -> putStrLn "say what???"

The underscore is a wild card pattern that will match anything. So this alternative will match when the user enters something besides yes or no.
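
A related sketch, not taken from the lesson's program: a plain variable name used as a pattern also matches anything, and it additionally gives the matched value a name that the right-hand side can use. The names reply and other below are made up for this example.

```haskell
module Main where

-- 'reply' is a hypothetical function used only for this sketch.
reply :: String -> String
reply answer =
   case answer of
      "yes" -> "yay!"
      "no"  -> "I am sorry to hear that :("
      -- 'other' matches anything, like _, but also names the value
      other -> "say what??? I do not understand " ++ show other

main :: IO ()
main = mapM_ (putStrLn . reply) ["yes", "whee"]
```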

case Always Matches Exactly One Alternative


The case statement will always evaluate exactly one alternative. Let's see what happens when there is more than one match or no matches at all.

Overlapping Patterns

Let's say we stay up too late hacking Haskell code, and we accidentally put in the "yes" alternative twice:
main =
   do putStrLn "Do you like Haskell? [yes/no]"
      answer <- getLine
      case answer of
         "yes" -> putStrLn "yay!"
         "yes" -> putStrLn "awesome!"
         "no" -> putStrLn "I am sorry to hear that :("
         _ -> putStrLn "say what???"

When we load this into GHCi (or compile it), we get a warning:

Prelude> :load "/tmp/Overlap.hs"
[1 of 1] Compiling Main ( /tmp/Overlap.hs, interpreted )

/tmp/Overlap.hs:6:7:
Warning: Pattern match(es) are overlapped
In a case alternative: "yes" -> ...
Ok, modules loaded: Main.
*Main>

If you try running the code, you will see that entering yes always prints yay! and never awesome!, because the first matching alternative wins. If we put the wild card pattern first, we also get an overlapping pattern warning:
main =
   do putStrLn "Do you like Haskell? [yes/no]"
      answer <- getLine
      case answer of
         _ -> putStrLn "say what???"
         "yes" -> putStrLn "yay!"
         "no" -> putStrLn "I am sorry to hear that :("

GHCi tells us that "yes" and "no" will never be considered, since _ matches everything:

*Main> :load "/tmp/Overlap.hs"
[1 of 1] Compiling Main ( /tmp/Overlap.hs, interpreted )

/tmp/Overlap.hs:6:7:
Warning: Pattern match(es) are overlapped
In a case alternative:
"yes" -> ...
"no" -> ...
Ok, modules loaded: Main.
*Main>

Incomplete Patterns

Let's see what happens if we don't provide a default alternative:
main =
   do putStrLn "Do you like Haskell? [yes/no]"
      answer <- getLine
      case answer of
         "yes" -> putStrLn "yay!"
         "no" -> putStrLn "I am sorry to hear that :("

When we load this code into GHCi, it loads without any errors or warnings:
Prelude> :load "/tmp/Incomplete.hs"
[1 of 1] Compiling Main ( /tmp/Incomplete.hs, interpreted )
Ok, modules loaded: Main.
*Main>

But if we enter a string other than yes or no, it throws an exception:
*Main> main
Do you like Haskell? [yes/no]
whee
*** Exception: /tmp/Incomplete.hs:(6,7)-(8,54): Non-exhaustive patterns in case

*Main>

It's nice that the exception tells us the file and position of the non-exhaustive pattern, but it would be even nicer to find out before we run the code. GHC can do this if we enable some extra warnings with the -W flag. In GHCi, we can set this flag by typing :set -W at the prompt:
*Main> :set -W
*Main> :load "/tmp/Incomplete.hs"
[1 of 1] Compiling Main ( /tmp/Incomplete.hs, interpreted )

/tmp/Incomplete.hs:6:7:
Warning: Pattern match(es) are non-exhaustive
In a case alternative:
Patterns not matched:
[]
(GHC.Base.C# #x) : _ with #x `notElem` ['y', 'n']
[GHC.Base.C# 'y']
(GHC.Base.C# 'y') : ((GHC.Base.C# #x) : _) with #x `notElem` ['e']
...
Ok, modules loaded: Main.
*Main>

Now GHCi produces a (somewhat bizarre) warning, telling us that we have a non-exhaustive pattern. The last part of the warning is not very easy to understand, but if we just look at the first two lines, things make sense:
/tmp/Incomplete.hs:6:7:
Warning: Pattern match(es) are non-exhaustive

This tells us that the case statement at Line 6, Column 7 in the file Incomplete.hs does not have alternatives for all possible values.

If you are compiling the code, you can just add the flag -W to the command-line:
 $ ghc --make -O2 -W Incomplete.hs -o incomplete
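
Another option, sketched here as an assumption about how you might prefer to organize things, is to turn the flag on for a single file with an OPTIONS_GHC pragma on its first line. The function name answerText is hypothetical. As written, the case below is complete; with the wild card alternative deleted, GHC should report the non-exhaustive pattern warning when the file is loaded or compiled.

```haskell
{-# OPTIONS_GHC -W #-}
module Main where

-- 'answerText' is a hypothetical function used only for this sketch.
answerText :: String -> String
answerText answer =
   case answer of
      "yes" -> "yay!"
      "no"  -> "I am sorry to hear that :("
      _     -> "say what???"   -- delete this line to see the warning

main :: IO ()
main = putStrLn (answerText "yes")
```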

You may wonder why the warning for incomplete patterns is not enabled by default. Consider the following example:
main =
   case 2 of
      2 -> putStrLn "2"

With the extra warnings enabled, this produces the warning:
*Main> :load "/tmp/Complete.hs"
[1 of 1] Compiling Main ( /tmp/Complete.hs, interpreted )

/tmp/Complete.hs:2:4:
Warning: Pattern match(es) are non-exhaustive
In a case alternative:
Patterns not matched: #x with #x `notElem` [2#]
Ok, modules loaded: Main.
*Main>

The warning says that we only matched on the value 2 and have not handled the cases where the value is not equal to 2 (e.g. 1, 3, 4, 5, 6, ...). Obviously 2 is the only value that will ever come up, so it does not matter that the other alternatives are not matched.

In this case, it is rather obvious that the warning can be ignored. A more sophisticated compiler might be able to figure this out as well, and not bother to warn you. In fact, there is a program called Catch, by Neil Mitchell, which does just that. I expect Catch will be integrated into GHC someday.

Cool Stuff


We are not done learning about the case statement yet, but we have already seen some cool stuff. If you have used other languages such as C, C++, or Java, you are probably familiar with a similar construct known as the switch statement. In many of those languages, however, switch only works with a few (mostly numeric) data types, whereas the case statement in Haskell can be used with (almost) all data types. To compare strings in C, we would have to use a chain of if / else if statements like:
if (!strcmp(answer, "yes"))
    printf("yay!\n");
else if (!strcmp(answer, "no"))
    printf("I am sorry to hear that :(\n");
else
    printf("say what??\n");

I think you will agree that the Haskell version looks a lot more elegant and easier to comprehend. At the very least, the Haskell version is easier on the fingers to type.

Aesthetics aside, Haskell can help us avoid bugs by noticing overlapping patterns or incomplete patterns. A C compiler is not likely to notice if we have overlapping or incomplete patterns in our if-then-else-if... statement.
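
To give a feel for case working on more than strings, here is a minimal sketch that matches on a user-defined type and on a tuple containing a list and a Maybe value. The type Answer and the functions parse, respond, and describe are all hypothetical names invented for this example.

```haskell
module Main where

-- 'Answer', 'parse', 'respond', and 'describe' are hypothetical names.
data Answer = Yes | No | Unknown String

parse :: String -> Answer
parse s =
   case s of
      "yes" -> Yes
      "no"  -> No
      _     -> Unknown s

respond :: Answer -> String
respond a =
   case a of
      Yes       -> "yay!"
      No        -> "I am sorry to hear that :("
      Unknown s -> "say what??? " ++ s

-- case also works on tuples, lists, and Maybe values.
describe :: (Int, [Int], Maybe Char) -> String
describe t =
   case t of
      (0, [], Nothing)   -> "all empty"
      (_, x : _, Just c) -> "starts with " ++ show x ++ " and has " ++ [c]
      _                  -> "something else"

main :: IO ()
main =
   do putStrLn (respond (parse "yes"))
      putStrLn (respond (parse "whee"))
      putStrLn (describe (1, [2, 3], Just 'z'))
```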

Warnings

GHC has lots of warnings that you can enable. They are documented in the GHC User's Guide. Some projects, such as xmonad, enable all the warnings using the -Wall flag and fix all of them before shipping. All the extra warnings can be bothersome when you are developing. But enabling and fixing the warnings is a good way to clean up your code and perhaps kill a few bugs before a release.

5 comments:

Wei Hu said...

This is a good read, thanks.
But right after you cited the program 'catch', you forgot to change the font back.

markstos said...

This was helpful and well-written. Thanks.

Jamin said...

Thanks for the post! I was playing around and ran into something I don't understand, could you explain?

Changing "no" to a constant string called testStr seems to not work for the non-"yes"/"no" cases. It seems to match to the testStr case rather than the _ case:



testStr :: String
testStr = "no"

main =
   do putStrLn "Do you like Haskell? [yes/no]"
      answer <- getLine
      case answer of
         "yes" -> putStrLn "yay!"
         testStr -> putStrLn "I am sorry to hear that :("
         _ -> putStrLn "say what???"

sipa said...

@Jamin: the 'testStr' you use in the case expression is a new local variable, not the constant you defined at the top level. As a pattern it behaves just like _, matching anything, except that it also binds the matched value to the name testStr (shadowing the top-level definition). To do what I believe you intend, use:

case answer of
   "yes" -> putStrLn "yay!"
   x | x == testStr -> putStrLn "I am sorry..."
   _ -> putStrLn "say what?"

Anonymous said...

Do you have any sample ATM machine program written in haskell