Overview of Haskell

29 minute read

Introduction to Haskell

Haskell is a programming language that is:

  • Functional– the basic building blocks of programs are functions. Functions can return functions and take functions as arguments. Also, the only looping construct in Haskell is recursion.

  • Pure- Haskell functions are pure, that is, they don’t have side effects. Side effects mean things like reading a file, printing out text, or changing global variables. All inputs to a function must be in its arguments, and all outputs from a function in its return value. This sounds restricting, but makes reasoning about programs easier, and allows for more optimizations by the compiler.

  • Lazy- values are only evaluated when they are needed. This makes it possible to work with infinite data structures, and also makes pure programs more efficient.

  • Strongly typed- every Haskell value and expression has a type. The compiler checks the types at compile-time and guarantees that no type errors can happen at runtime. This means no AttributeErrors (a la Python), ClassCastExceptions (a la Java) or segmentation faults (a la C). The Haskell type system is very powerful and can help you design better programs.

  • Type inferred- in addition to checking the types, the compiler can deduce the types for most programs. This makes working with a strongly typed language easier. Indeed, most Haskell functions can be written completely without types. However programmers can still give functions and values type annotations to make finding type errors easier. Type annotations also make reading programs easier.

  • Garbage-collected - like most high-level languages these days, Haskell has automatic memory management via garbage collection. This means that the programmer doesn’t need to worry about allocating or freeing memory, the language runtime handles all of it automatically.

  • Compiled- even though we mostly use Haskell via the interactive GHCi environment on this course, Haskell is a compiled language. Haskell programs can be compiled to very efficient binaries, and the GHC compiler is very good at optimizing functional code into efficient machine code.

You’ll learn what these terms mean in practice during this course. Don’t worry if some of them sound abstract right now.

See also: The Haskell Wiki page on Functional programming.

Features

Here’s a showcase of some of the Haskell’s cool features:

Higher-order functions – functions can take functions as arguments:

map length ["abc","abcdef"]

This results in [3,6].

Anonymous functions aka lambdas – you can define single-use helper functions without giving them a name

filter (\x -> length x > 1) ["abc","d","ef"]

This results in ["abc","ef"].

Partial application – you can define new functions by giving another function only some of the arguments it needs. For example this multiplies all elements in a list by 3:

map (*3) [1,2,3]

Algebraic datatypes – a syntax for defining datatypes that can contain a number of different cases:

data Shape = Point | Rectangle Double Double | Circle Double

Now the type Shape can have values like Point, Rectangle 3 6 and Circle 5

Pattern matching – defining functions based on cases that correspond to your data definitions:

area Point = 0
area (Rectangle width height) = width * height
area (Circle radius) = 2 * pi * radius

Lists – Unlike many languages, Haskell has a concise built-in syntax for lists. Lists can be built from other lists using list comprehensions. Here’s a snippet that generates names of even length from a set of options for first and last names:

[whole | first <- ["Eva", "Mike"],
         last <- ["Smith", "Wood", "Odd"],
         let whole = first ++ last,
         even (length whole)]

This results in ["EvaSmith","EvaOdd","MikeWood"]. Thanks to the laziness of Haskell, we can even create so-called infinite lists:

primes = [ n | n <- [2..]
             , all (\k -> n `mod` k /= 0) [2..n `div` 2] ]

The first ten prime numbers can be then obtained by evaluating

take 10 primes

This evaluates to [2,3,5,7,11,13,17,19,23,29].

Parameterized types – you can define types that are parameterized by other types. For example [Int] is a list of Ints and [Bool] is a list of booleans. You can define typed functions that work on all kinds of lists, for example reverse has the type [a] -> [a] which means it takes a list containing any type a, and returns a list of the same type.

Type classes – another form of polymorphism where you can give a function a different implementation depending on the arguments’ types. For example the Show type class defines the function show that can convert values of various types to strings. The Num type class defines arithmetic operators like + that work on all number types (Int, Double, Complex, …).

Some History

A brief timeline of Haskell:

  • 1930s: Lamba Calculus
  • 1950s-1970s: Lisp, Pattern Matching, Scheme, ML
  • 1978 John Backus: Can programming be liberated from the von Neumann style?
  • 1987 A decision was made to unify the field of pure functional languages
  • 1990 Haskell 1.0
  • 1991 Haskell 1.1 (let-syntax, sections)
  • 1992 Haskell 1.2, GHC
  • 1996 Haskell 1.3 (Monads, do-syntax, type system improvements)
  • 1999 Haskell 98
  • 2000’s: GHC development, many extensions to the language
  • 2009 The Haskell 2010 standard
  • 2010’s: GHC development, Haskell Platform, Haskell Stack

The word ‘haskel’ means wisdom in Hebrew, but the name of the Haskell programming language comes from the logician Haskell Curry. The name Haskell comes from the Old Norse words áss (god) and ketill (helmet).

Uses of Haskell

Here are some examples of software projects that were written in Haskell.

  • The Darcs distributed version control system
  • The Sigma spam-prevention tool at Facebook
  • The implementations of the PureScript and Elm programming languages are written in Haskell
  • The Pandoc tool for converting between different document formats – it’s also used to produce this course material
  • The PostgREST server that exposes a HTTP REST API for a PostgreSQL database
  • Functional consulting companies like Galois and Well-Typed have a long history of developing critical systems for clients in Haskell

See The Haskell Wiki and this blog post for more!

GHCi

GHCi is the interactive Haskell interpreter. Here’s an example session:

$ ghci
GHCi, version 8.0.1  :? for help
Prelude> 1+1
2
Prelude> "asdf"
"asdf"
Prelude> reverse "asdf"
"fdsa"
Prelude> :type "asdf"
"asdf" :: [Char]
Prelude> tail "asdf"
"sdf"
Prelude> :type tail "asdf"
tail "asdf" :: [Char]
Prelude> :type tail
tail :: [a] -> [a]
Prelude> :quit
Leaving GHCi.

Let’s walk through this. Don’t worry if you don’t understand things yet, this is just a first brush with expressions and types.

Prelude> 1+1
2

The Prelude> is the GHCi prompt. It indicates we can use the functions from the Haskell base library called Prelude. We evaluate 1 plus 1, and the result is 2.

Prelude> "asdf"
"asdf"

Here we evaluate a string literal, and the result is the same string.

Prelude> reverse "asdf"
"fdsa"

Here we compute the reverse of a string by applying the function reverse to the value "asdf".

Prelude> :type "asdf"
"asdf" :: [Char]

In addition to evaluating expressions we can also ask for their type with the :type (abbreviated :t) GHCi command. The type of "asdf" is a list of characters. Commands that start with : are part of the user interface of GHCi, not part of the Haskell language.

Prelude> tail "asdf"
"sdf"
Prelude> :t tail "asdf"
tail "asdf" :: [Char]

The tail function works on lists and returns all except the first element of the list. Here we see tail applied to "asdf". We also check the type of the expression, and it is a list of characters, as expected.

Prelude> :t tail
tail :: [a] -> [a]

Finally, here’s the type of the tail function. It takes a list of any type as an argument, and returns a list of the same type.

Prelude> :quit
Leaving GHCi.

That’s how you quit GHCi.

Expressions and Types

Just like we saw in the GHCi example above, expressions and types are the bread and butter of Haskell. In fact, almost everything in a Haskell program is an expression. In particular, there are no statements like in Python, Java or C.

An expression has a value and a type. We write an expression and its type like this: expression :: type. Here are some examples:

Expression Type Value
True Bool True
not True Bool False
"as" ++ "df" [Char] "asdf"

Syntax of Expressions

Expressions consist of functions applied to arguments. Functions are applied (i.e. called) by placing the arguments after the name of the function – there is no special syntax for a function call.

Haskell Python, Java or C
f 1 f(1)
f 1 2 f(1,2)

Parentheses can be used to group expressions (just like in math and other languages).

Haskell Python, Java or C
g h f 1 g(h,f,1)
g h (f 1) g(h,f(1))
g (h f 1) g(h(f,1))
g (h (f 1)) g(h(f(1)))

Some function names are made special characters and they are used as operators: between their arguments instead of before them. Function calls bind tighter than operators, just like multiplication binds tighter than addition.

Haskell Python, Java or C
a + b a + b
f a + g b f(a) + g(b)
f (a + g b) f(a+g(b))

P.S. In Haskell, function application associates left, that is, f g x y is actually the same as (((f g) x) y). We’ll get back to this topic later. For now you can just think that f g x y is f applied to the arguments g, x and y.

Syntax of Types

Here are some basic types of Haskell to get you started.

Type Literals Use Operations
Int 1, 2, -3 Number type (signed, 64bit) +, -, *, div, mod
Integer 1, -2, 900000000000000000 Unbounded number type +, -, *, div, mod
Double 0.1, 1.2e5 Floating point numbers +, -, *, /, sqrt
Bool True, False Truth values &&, ||, not
String aka [Char] "abcd", " Strings of characters reverse, ++

As you can see, the names of types in Haskell start with a capital letter. Some values like True also start with a capital letter, but variables and functions start with a lower case letter (reverse, not, x). We’ll get back to the meaning of capital letters in the next reading.

Function types are written using the -> syntax:

  • A function of one argument: argumentType -> returnType
  • … of two arguments: argument1Type -> argument2Type -> returnType
  • … of three arguments: argument1Type -> argument2Type -> argument3Type -> returnType

Looks a bit weird, right? We’ll get back to this as well.

Note About Misleading Types

Sometimes, the types you see in GHCi are a bit different than what you’d assume. Here are two common cases.

Prelude> :t 1+1
1+1 :: Num a => a

For now, you should read the type Num a => a as “any number type”. In Haskell, number literals are overloaded which means that they can be interpreted as any number type (e.g. Int or Double). We’ll get back to what Num a actually means when we talk about type classes later.

Prelude> :t "asdf"
"asdf" :: [Char]

The type String is just an alias for the type [Char] which means “list of characters”. We’ll get back to lists on the next lecture! In any case, you can use String and [Char] interchangeably, but GHCi will mostly use [Char] when describing types to you.

The Structure of a Haskell Program

Here’s a simple Haskell program that does some arithmetic and prints some values.

module Gold where

-- The golden ratio
phi :: Double
phi = (sqrt 5 + 1) / 2

polynomial :: Double -> Double
polynomial x = x^2 - x - 1

f x = polynomial (polynomial x)

main = do
  print (polynomial phi)
  print (f phi)

If you put this in a file called Gold.hs and run it with (for example) runghc Gold.hs, you should see this output

0.0
-1.0

Let’s walk through the file.

module Gold where

There is one Haskell module per source file. A module consists of definitions.

-- The golden ratio

This is a comment. Comments are not part of the actual program, but text for human readers of the program.

phi :: Double
phi = (sqrt 5 + 1) / 2

This is a definition of the constant phi, with an accompanying type annotation (also known as a type signature) phi :: Double. The type annotation means that phi has type Double. The line with a equals sign (=) is called an equation. The left hand side of the = is the expression we are defining, and the right hand side of the = is the definition.

In general a definition (of a function or constant) consists of an optional type annotation and one or more equations

polynomial :: Double -> Double
polynomial x = x^2 - x - 1

This is the definition of a function called polynomial. It has a type annotation and an equation. Note how an equation for a function differs from the equation of a constant by the presence of a parameter x left of the = sign. Note also that ^ is the power operator in Haskell, not bitwise xor like in many other languages.

f x = polynomial (polynomial x)

This is the definition of a function called f. Note the lack of type annotation. What is the type of f?

main = do
  print (polynomial phi)
  print (f phi)

This is a description of what happens when you run the program. It uses do-syntax and the IO Monad. We’ll get back to those later in the course.

Working with Examples

When you see an example definition like this

polynomial :: Double -> Double
polynomial x = x^2 - x - 1

you should usually play around with it. Start by running it. There are a couple of ways to do this.

If a definition fits on one line, you can just define it in GHCi using let:

Prelude> let polynomial x = x^2 - x - 1
Prelude> polynomial 3.0
5.0

For a multi-line definition, you can either use ; to separate lines, or use the special :{ :} syntax to paste a block of code into GHCi:

Prelude> :{
Prelude| polynomial :: Double -> Double
Prelude| polynomial x = x^2 - x - 1
Prelude| :}
Prelude> polynomial 3.0
5.0

Finally, you can paste the code into a new or existing .hs file, and then :load it into GHCi. If the file has already been loaded, you can also use :reload.

-- first copy and paste the definition into Example.hs,
-- then run GHCi
Prelude> :load Example.hs
[1 of 1] Compiling Main          ( Example.hs, interpreted )
Ok, one module loaded.
*Main> polynomial 3.0
5.0
-- now you can edit the definition
*Main> :reload
[1 of 1] Compiling Main          ( Example.hs, interpreted )
Ok, one module loaded.
*Main> polynomial 3
3.0

After you’ve run the example, try modifying it, or making another function that is similar but different. You learn programming by programming, not by reading!

Dealing with Errors

Since Haskell is a typed language, you’ll pretty quickly bump into type errors. Here’s an example of an error during a GHCi session:

Prelude> "string" ++ True

<interactive>:1:13: error:
    • Couldn't match expected type ‘[Char]’ with actual
      type ‘Bool’
    • In the second argument of ‘(++)’, namely ‘True’
      In the expression: "string" ++ True
      In an equation for ‘it’: it = "string" ++ True

This is the most common type error, “Couldn’t match expected type”. Even though the error looks long and scary, it’s pretty simple if you just read through it.

  • The first line of the error message, <interactive>:1:13: error: tells us that the error occurred in GHCi. If we had loaded a file, we might instead get something like Sandbox.hs:3:17: error:, where Sandbox.hs is the name of the file, 3 is the line number and 17 is the number of a character in the line.

  • The line • Couldn't match expected type ‘[Char]’ with actual type ‘Bool’ tells us that the immediate cause for the error is that there was an expression of type Bool, when GHCi was expecting to find an expression of type [Char]”. The location of this error was indicated in the first line of the error message. Note that the expected type is not always right. Giving type annotations by hand can help debugging typing errors.

  • The line • In the second argument of ‘(++)’, namely ‘True’ tells that the expression that had the wrong type was the second argument of the operator (++). We’ll learn later why it’s surrounded by parentheses.

  • The full expression with the error was "string" ++ True. As mentioned above, String is a type alias for [Char], the type of character lists. The first argument to ++ was a list of characters, and since ++ can only combine two lists of the same type, the second argument should’ve been of type [Char] too.

  • The line In an equation for ‘it’: it = "string" ++ True says that the expression occurred in the definition of the variable it, which is a default variable name that GHCi uses for standalone expressions. If we had a line x = "string" ++ True in a file, or a declaration let x = "string" ++ True in GHCi, GHCi would print In an equation for ‘x’: x = "string" ++ True instead.

There are also others types of errors.

Prelude> True + 1

<interactive>:6:1: error:
    • No instance for (Num Bool) arising from a use of ‘+’
    • In the expression: True + 1
      In an equation for ‘it’: it = True + 1

This is the kind of error you get when you try to use a numeric function like + on something that’s not a number.

The hardest error to track down is usually this:

Prelude> True +

<interactive>:10:7: error:
    parse error (possibly incorrect indentation or
    mismatched brackets)

There are many ways to cause it. Probably you’re missing some characters somewhere. We’ll get back to indentation later in this lecture.

Arithmetic

There’s one thing in Haskell arithmetic that often trips up beginners, and that’s division.

In Haskell there are two division functions, the / operator and the div function. The div function does integer division:

Prelude> 7 `div` 2
3

The / operator performs the usual division:

Prelude> 7.0 / 2.0
3.5

However, you can only use div on whole number types like Int and Integer, and you can only use / on decimal types like Double. Here’s an example of what happens if you try to mix them up:

halve :: Int -> Int
halve x = x / 2
error:
    • No instance for (Fractional Int) arising from a use
      of ‘/’
    • In the expression: x / 2
      In an equation for ‘halve’: halve x = x / 2

Just try to keep this in mind for now. We’ll get back to the difference between / and div, and what Num and Fractional mean when talking about type classes.

How Do I Get Anything Done?

So far you’ve seen some arithmetic, reversing a string and so on. How does one write actual programs in Haskell? Many of the usual programming constructs like loops, statements and assignment are missing from Haskell. Next, we’ll go through the basic building blocks of Haskell programs:

  • Conditionals
  • Local definitions
  • Pattern matching
  • Recursion

Conditionals

In other languages, if is a statement. It doesn’t have a value, it just conditionally executes other statements.

In Haskell, if is an expression. It has a value. It selects between two other expressions. It corresponds to the ?: operator in C or Java.

// Java
int price = product.equals("milk") ? 1 : 2;

Python’s conditional expressions are quite close to haskell’s if:

# Python
price = 1 if product == "milk" else 2

This is how the same example looks in Haskell:

price = if product == "milk" then 1 else 2

Because Haskell’s if returns a value, you always need an else!

Functions Returning Bool

In order to write if expressions, you need to know how to get values of type Bool. The most common way is comparisons. The usual ==, <, <=, > and >= operators work in Haskell. You can do ordered comparisons (<, >) on all sorts of numbers, and equality comparisons (==) on almost anything:

Prelude> "foo" == "bar"
False
Prelude> 5.0 <= 7.2
True
Prelude> 1 == 1
True

One oddity of Haskell is that the not-equals operator is written /= instead of the usual !=:

Prelude> 2 /= 3
True
Prelude> "bike" /= "bike"
False

Remember that in addition to these comparisons, you can get Bool values out of other Bool values by using the && (“and”) and || (“or”) operators, and the not function.

Examples

checkPassword password = if password == "swordfish"
                         then "You're in."
                         else "ACCESS DENIED!"
absoluteValue n = if n < 0 then -n else n
login user password = if user == "unicorn73"
                      then if password == "f4bulous!"
                           then "unicorn73 logged in"
                           else "wrong password"
                      else "unknown user"

Local Definitions

Haskell has two different ways for creating local definitions: let...in and where.

where adds local definitions to a definition:

circleArea :: Double -> Double
circleArea r = pi * rsquare
    where pi = 3.1415926
          rsquare = r * r

let...in is an expression:

circleArea r = let pi = 3.1415926
                   rsquare = r * r
               in pi * rsquare

Local definitions can also be functions:

circleArea r = pi * square r
    where pi = 3.1415926
          square x = x * x
circleArea r = let pi = 3.1415926
                   square x = x * x
               in pi * square r

We’ll get back to the differences between let and where, but mostly you can use which ever you like.

A Word About Immutability

Even though things like pi above are often called variables, I’ve chosen to call them definitions here. This is because unlike variables in Python or Java, the values of these definitions can’t be changed. Haskell variables aren’t boxes into which you can put new values, Haskell variables name a value (or rather, an expression) and that’s it.

We’ll talk about immutability again later on this course, but for now it’s enough to know that things like this don’t work.

increment x = let x = x+1
              in x

This is just an infinite loop, because it tries to define a new variable x with the property x = x+1. Thus when evaluating x, Haskell just keeps computing 1+1+1+1+... indefinitely.

compute x = let a = x+1
                a = a*2
            in a
error:
    Conflicting definitions for ‘a’
    Bound at: <interactive>:14:17
              <interactive>:15:17

Here we get a straightforward error when we’re trying to “update” the value of a.

As a remark, local definitions can shadow the names of variables defined elsewhere. Shadowing is not a side-effect. Instead, shadowing creates a new variable within a more restricted scope that uses the same name as some variable in the outer scope. For example, all of the functions f, g, and h below are legal:

x :: Int
x = 5

f :: Int -> Int
f x = 2 * x

g :: Int -> Int
g y = x where x = 6

h :: Int -> Int
h x = x where x = 3

If we apply them to the global constant x, we see the effects of shadowing:

f 1 ==> 2
g 1 ==> 6
h 1 ==> 3

f x ==> 10
g x ==> 6
h x ==> 3

It is best to always choose new names for local variables, so that shadowing never happens. That way, the reader of the code will understand where the variables that are used in an expression come from. Note that in the following example, f and g don’t shadow each others’ arguments:

f :: Int -> Int
f x = 2 * x + 1

g :: Int -> Int
g x = x - 2

Pattern Matching

A definition (of a function) can consist of multiple equations. The equations are matched in order against the arguments until a suitable one is found. This is called pattern matching.

Pattern matching in Haskell is very powerful, and we’ll keep learning new things about it along this course, but here are a couple of first examples:

greet :: String -> String -> String
greet "Finland" name = "Hei, " ++ name
greet "Italy"   name = "Ciao, " ++ name
greet "England" name = "How do you do, " ++ name
greet _         name = "Hello, " ++ name

The function greet generates a greeting given a country and a name (both Strings). It has special cases for three countries, and a default case. This is how it works:

Prelude> greet "Finland" "Pekka"
"Hei, Pekka"
Prelude> greet "England" "Bob"
"How do you do, Bob"
Prelude> greet "Italy" "Maria"
"Ciao, Maria"
Prelude> greet "Greenland" "Jan"
"Hello, Jan"

The special pattern _ matches anything. It’s usually used for default cases. Because patterns are matched in order, it’s important to (usually) put the _ case last. Consider:

brokenGreet _         name = "Hello, " ++ name
brokenGreet "Finland" name = "Hei, " ++ name

Now the first case gets selected for all inputs.

Prelude> brokenGreet "Finland" "Varpu"
"Hello, Varpu"
Prelude> brokenGreet "Sweden" "Ole"
"Hello, Ole"

GHC even gives you a warning about this code:

<interactive>:1:1: warning: [-Woverlapping-patterns]
    Pattern match is redundant
    In an equation for ‘brokenGreet’: brokenGreet "Finland"
    name = ...

Some more examples follow. But first let’s introduce the standard library function show that can turn (almost!) anything into a string:

Prelude> show True
"True"
Prelude> show 3
"3"

So, here’s an example of a function with pattern matching and a default case that actually uses the value (instead of just ignoring it with _):

describe :: Integer -> String
describe 0 = "zero"
describe 1 = "one"
describe 2 = "an even prime"
describe n = "the number " ++ show n

This is how it works:

Prelude> describe 0
"zero"
Prelude> describe 2
"an even prime"
Prelude> describe 7
"the number 7"

You can even pattern match on multiple arguments. Again, the equations are tried in order. Here’s a reimplementation of the login function from earlier:

login :: String -> String -> String
login "unicorn73" "f4bulous!" = "unicorn73 logged in"
login "unicorn73" _           = "wrong password"
login _           _           = "unknown user"

Recursion

In Haskell, all sorts of loops are implemented with recursion. Function calls are very efficient, so you don’t need to worry about performance. (We’ll talk about performance later).

Learning how to do simple things with recursion in Haskell will help you use recursion on more complex problems later. Recursion is also often a useful way for thinking about solving harder problems.

Here’s our first recursive function which computes the factorial. In mathematics, factorial is the product of n first positive integers and is written as n!. The definition of factorial is

n! = n * (n-1) * … * 1

For example, 4! = 4*3*2*1 = 24. Well anyway, here’s the Haskell implementation of factorial:

factorial :: Int -> Int
factorial 1 = 1
factorial n = n * factorial (n-1)

This is how it works. We use ==> to mean “evaluates to”.

factorial 3
  ==> 3 * factorial (3-1)
  ==> 3 * factorial 2
  ==> 3 * 2 * factorial 1
  ==> 3 * 2 * 1
  ==> 6

What happens when you evaluate factorial (-1)?

Here’s another example:

-- compute the sum 1^2+2^2+3^2+...+n^2
squareSum 0 = 0
squareSum n = n^2 + squareSum (n-1)

A function can call itself recursively multiple times. As an example let’s consider the Fibonacci sequence from mathematics. The Fibonacci sequence is a sequence of integers with the following definition.

The sequence starts with 1, 1. To get the next element of the sequence, sum the previous two elements of the sequence.

The first elements of the Fibonacci sequence are 1, 1, 2, 3, 5, 8, 13 and so on. Here’s a function fibonacci which computes the nth element in the Fibonacci sequence. Note how it mirrors the mathematical definition.

-- Fibonacci numbers, slow version
fib 1 = 1
fib 2 = 1
fib n = fib (n-2) + fib (n-1)

Here’s how fib 5 evaluates:

fib 5
  ==> fib 3           + fib 4
  ==> (fib 1 + fib 2) + fib 4
  ==> (    1 +     1) + fib 4
  ==> (    1 +     1) + (fib 2 + fib 3)
  ==> (    1 +     1) + (fib 2 + (fib 1 + fib 2))
  ==> (    1 +     1) + (    1 + (    1 +     1))
  ==> 5

Note how fib 3 gets evaluated twice and fib 2 three times. This is not the most efficient implementation of the fib function. We’ll get back to this later. Another way to think about the evaluation of the fib function is to visualize it as a tree:

This tree then exaclty corresponds with the expression (1 + 1) + (1 + (1 + 1)). Recursion can often produce chain-like, tree-like, nested, or loopy structures and computations. Recursion is one of the main techniques in functional programming, so it’s worth spending some effort in learning it.

All Together Now!

Finally, here’s a complete Haskell module that uses ifs, pattern matching, local defintions and recursion. The module is interested in the Collatz conjecture, a famous open problem in mathematics. It asks:

Does the Collatz sequence eventually reach 1 for all positive integer initial values?

The Collatz sequence is defined by taking any number as a starting value, and then repeatedly performing the following operation:

  • if the number is even, divide it by two
  • if the number is odd, triple it and add one

As an example, the Collatz sequence for 3 is: 3, 10, 5, 16, 8, 4, 2, 1, 4, 2, 1, 4, 2, 1 … As you can see, once the number reaches 1, it gets caught in a loop.

module Collatz where

-- one step of the Collatz sequence
step :: Integer -> Integer
step x = if even x then down else up
  where down = div x 2
        up = 3*x+1

-- collatz x computes how many steps it takes for the
-- Collatz sequence to reach 1 when starting from x
collatz :: Integer -> Integer
collatz 1 = 0
collatz x = 1 + collatz (step x)

-- longest finds the number with the longest Collatz
-- sequence for initial values between 0 and upperBound
longest :: Integer -> Integer
longest upperBound = longest' 0 0 upperBound

-- helper function for longest
longest' :: Integer -> Integer -> Integer -> Integer
-- end of recursion, return longest length found
longest' number _ 0 = number
-- recursion step: check if n has a longer Collatz sequence
-- than the current known longest
longest' number maxlength n = if length > maxlength
               then longest' n length (n-1)
               else longest' number maxlength (n-1)
  where length = collatz n

We can load the program in GHCi and play with it.

$ ghci
GHCi, version 8.2.2:  :? for help
Prelude> :load Collatz.hs
[1 of 1] Compiling Collatz     ( Collatz.hs, interpreted )
Ok, one module loaded.
*Collatz>

Let’s verify that our program computes the start of the Collatz sequence for 3 correctly.

*Collatz> step 3
10
*Collatz> step 10
5
*Collatz> step 5
16

How many steps does it take for 3 to reach 1?

*Collatz> collatz 3
7

What’s the longest Collatz sequence for a starting value under 10? What about 100?

*Collatz> longest 10
9
*Collatz> longest 100
97

The lengths of these Collatz sequences are:

*Collatz> collatz 9
19
*Collatz> collatz 97
118

A Word About Indentation

The previous examples have been fancily indented. In Haskell indentation matters, a bit like in Python. The complete set of rules for indentation is hard to describe, but you should get along fine with these rules of thumb:

  1. Things that are grouped together start from the same column
  2. If an expression (or equation) has to be split on to many lines, increase indentation

While you can get away with using tabs, it is highly recommended to use spaces for all indenting.

Some examples are in order.

These all are ok:

i x = let y = x+x+x+x+x+x in div y 5

-- let and in are grouped together, an expression is split
j x = let y = x+x+x
              +x+x+x
      in div y 5

-- the definitions of a and b are grouped together
k = a + b
  where a = 1
        b = 1

l = a + b
  where
    a = 1
    b = 1

These are not ok:

-- indentation not increased even though expression split
-- on many lines
i x = let y = x+x+x+x+x+x
in div y 5

-- indentation not increased even though expression is split
j x = let y = x+x+x
      +x+x+x
      in div y 5

-- grouped things are not aligned
k = a + b
  where a = 1
      b = 1

-- grouped things are not aligned
l = a + b
  where
    a = 1
     b = 1

-- where is part of the equation, so indentation needs to
-- increase
l = a + b
where
  a = 1
  b = 1

If you make a mistake with the indentation, you’ll typically get a parse error like this:

Indent.hs:2:1: error: parse error on input ‘where’

The error includes the line number, so just go over that line again. If you can’t seem to get indentation to work, try putting everything on just one long line at first.

Self Checks

Check 1

What is the Haskell equivalent of the C/Java/Python expression:

combine(prettify(lawn),construct(house,concrete))
  1. combine prettify (lawn) construct (house concerete)
  2. combine (prettify lawn (counstruct house concrete))
  3. combine (prettify lawn) (construct house concrete)

Check 2

What is the C/Java/Python equivalent of the Haskell expression

send metric (double population + increase)
  1. send(metric(double(population+increase)))
  2. send(metric(double(population)+increase))
  3. send(metric,double(population)+increase)
  4. send(metric,double(population+increase))

Check 3

Which one of the following claims is true in Haskell?

  1. Every value has a type
  2. Every type has a value
  3. Every statement has a type

Check 4

Which one of the following claims is true in Haskell?

  1. It’s impossible to reuse the name of a variable
  2. It’s possible to reassign a value to a variable
  3. An if always requires both then and else

Check 5

What does the following function do?

f x = if even (x + 1) then x + 1 else f (x - 1)
  1. Maps every value x to the least even number greater than or equal to x
  2. Maps every value x to the greatest even number less than or equal to x
  3. Maps every value to itself

Check 6

Why is 3 * "F00" not valid Haskell?

  1. 3 and "F00" have different types
  2. All numeric values need a decimal point
  3. "F00" needs the prefix “0x”

Check 7

Why does 7.0 `div` 2 give an error?

  1. Because div is not defined for the type Double
  2. Because div is not defined for the type Int
  3. Because `...` is used for delimiting strings.

Acknowledgment:

This reading was originally written by Joel Kaasinen and John Lång in their open source textbook. The book is licensed under the Creative Commons Attribution-ShareAlike 4.0 International License, which allows users to copy, modify, and distribute the book.

This reading was modified by Titus H. Klinge in 2021 and presented above under the same license in order to better serve the students of this course.