Haskell Lecture #3
July 8, 2015
Anonymous functions
Often, you want to use a function without naming it; that is, create a λ-expression. In Haskell, use \ and -> like so.�
ghci> (\x -> x + 6) 10
16
ghci> (\x y -> y + x^3) 9 1000
1729
Multi-parameter lambdas
In the λ-calculus, one typically allows the shorthand:
λx. (λy. term) <==> λx y. term
Haskell also honors this in anonymous functions.
\x -> (\y -> <expr>) <==> \x y -> <expr>
\x -> (\y -> (\z -> <expr>)) <==> \x y z -> <expr>
Let’s talk about types.
You’ve realized by now that sometimes, the type signature is more general than what you expected.
ghci> :t 17
17 :: Num a => a
ghci> :t (\x -> x + 17)
(\x -> x + 17) :: Num a => a -> a
Let’s talk about types (2)
You’ve realized by now that sometimes, the type signature is more general than what you expected.
This is most often encountered at ghci. In source files, you may have already specified types.
-- Foo.hs
add17 :: Int -> Int
add17 = \x -> x + 17
17 :: (Num a) => a
What does 17 :: (Num a) => a mean?
Remember that types live at the expression (not value) level. The literal 17 might refer to an Int, a Double, or a Word8 (byte).
The compiler can figure out that it lives in some Num type, but not which one without more context.
:t (\x -> x)
Let’s look at the type for the identity function, id, defined as id x = x.
ghci> :t id
id :: a -> a
How do we know that id (5 :: Int) is well-typed? How about id id? Is that well-typed?
id (5 :: Int) is well typed.
The type variable a unifies with any concrete type. In this case, a -> a is shorthand for:
forall a. a -> a
meaning that id can be specialized to
id :: Int -> Int by letting a = Int.
Exercise: is id id well-typed?
id id :: ??
id id is well-typed.
Treat the two instances of id as separate:
id2 id1 :: ??
id2 :: b -> b id1 :: a -> a
The renaming is necessary because they’re different type variables. With b = a -> a, we get a well-typed expression. (id2 :: (a -> a) -> (a -> a) so id2 id1 :: a -> a).
A simple matter of typing...
Type checking is solving a system of equations based on context clues.
:t [Left 'z', Right False]
??
A simple matter of typing...
Type checking is solving a system of equations based on context clues.
:t [Left 'z', Right False]
[a] for some a (list literal)
a = Either Char x for some x (Left 'z')
a = Either y Bool for some y (Right False)
Solution exists: x = Bool, y = Char.
A simple matter of typing...
Type checking is solving a system of equations based on context clues.
:t [Left 'z', Right False]
[a] for some a (list literal)
a = Either Char x for some x (Left 'z')
a = Either y Bool for some y (Right False)
Solution exists: x = Bool, y = Char.
[Either Char Bool] is the answer.
A simple matter of typing… (2)
:t [Left 'z', Left False]
????
A simple matter of typing… (2)
:t [Left 'z', Left False]
[a] for some a (list literal)
a = Either Char x for some x (Left 'z')
a = Either Bool y for some y (Left False)
No solution exists! Char ≠ Bool, so it’s ill-typed.
Function type quiz (1)
λ :t \x -> if x then Left "eye" else Right "ear"
????
λ :t \x -> (x, x)
????
λ :t \x -> "fail"
????
Function type quiz (1)
λ :t \x -> if x then Left "eye" else Right "ear"
Bool -> Either String String
λ :t \x -> (x, x)
????
λ :t \x -> "fail"
????
Function type quiz (1)
λ :t \x -> if x then Left "eye" else Right "ear"
Bool -> Either String String
λ :t \x -> (x, x)
a -> (a, a)
λ :t \x -> "fail"
????
Function type quiz (1)
λ :t \x -> if x then Left "eye" else Right "ear"
Bool -> Either String String
λ :t \x -> (x, x)
a -> (a, a)
λ :t \x -> "fail"
a -> String
Function type quiz (2)
λ :t \x -> x + 1
????
λ :t \x -> case x of {[] -> True; _ -> False}
????
λ :t \x -> length x + 1
????
Function type quiz (2)
λ :t \x -> x + 1
(Num a) => a -> a
λ :t \x -> case x of {[] -> True; _ -> False}
????
λ :t \x -> length x + 1
????
Function type quiz (2)
λ :t \x -> x + 1
(Num a) => a -> a
λ :t \x -> case x of {[] -> True; _ -> False}
[a] -> Bool
λ :t \x -> length x + 1
????
Function type quiz (2)
λ :t \x -> x + 1
(Num a) => a -> a
λ :t \x -> case x of {[] -> True; _ -> False}
[a] -> Bool
λ :t \x -> length x + 1
[a] -> Int
Function type quiz (3)
-- Add17.hs
-- With which type signatures will the code complile?
add17 = \x -> x + 17
(A) add17 :: Int -> Int
(B) add17 :: Double -> Double
(C) add17 :: Int -> Double
(D) add17 :: (Num a) => a -> a
Function type quiz (3)
-- Add17.hs
-- With which type signatures will the code complile?
add17 = \x -> x + 17
(A) add17 :: Int -> Int
(B) add17 :: Double -> Double
(C) add17 :: Int -> Double
(D) add17 :: (Num a) => a -> a
Function type quiz (3)
(A) add17 :: Int -> Int
(B) add17 :: Double -> Double
(C) add17 :: Int -> Double
(D) add17 :: (Num a) => a -> a
(Num a) => a -> a is the general type signature. It allows the term \x -> x + 17 to specialize to a single Num type a, but not to convert from one to another. So (C) is the only illegal type.
(->) “function” is just a parameterized type
Recall the parameterized types we’ve encountered:
a -> b is syntactic sugar for (->) a b.
(->) as a parameterized type:
The type-level “operator” (->) is right-associative.
a -> b <==> (->) a b
a -> b -> c <==> (->) a ((->) b c)
a -> b -> c -> d <==>
(->) a ((->) b ((->) c d))
Some useful functions
Don’t fear the type class
(+) :: Num a => a -> a -> a
entails the following (they’re all Num classes):
(+) :: Int -> Int -> Int
(+) :: Double -> Double -> Double
(+) :: Integer -> Integer -> Integer
(+) :: Word8 -> Word8 -> Word8
Code reuse
Type classes (which we’ll cover in much more detail, later) enable code reuse. You don’t have to write this sum function for each type. It works for all Num types.
sum [] = 0
sum (x:xs) = x + sum xs
λ :t sum
sum :: Num a => [a] -> a
λ sum [(1 :: Integer), 2, 3]
6
λ sum [1.3, 7.9, 8.1]
17.3
λ sum [(39 :: Word8), 255]
38
Conversions (Int -> Double)
λ :t fromIntegral
fromIntegral :: (Integral a, Num b) => a -> b
All the integer types (Int, Integer, Word, WordN, IntN for N = 8, 16, 32, 64) are Integral.
Since Int is Integral and Double is Num, Int -> Double is a subcase of the above. Thus, you can use fromIntegral as Int -> Double for the conversion.
Conversions (Double -> Integer)
λ :t truncate
(RealFrac a, Integral b) => a -> b
What’s RealFrac? Use :i or :info to find instances of a typeclass.
λ :i RealFrac
[...]
instance RealFrac Float -- Defined in ‘GHC.Float’
instance RealFrac Double -- Defined in ‘GHC.Float’
Conversions (Double -> Int?)
λ :t truncate
(RealFrac a, Integral b) => a -> b
You can use truncate as Double -> Integer, or as Double -> Int, but in general, you should convert to Integer. Unless you know for sure that you’re working with a value that will fit in an Int, prefer Integer.
Integer vs. Int
Most Haskell programmers consider the Int type to be “legacy”; its size is system-dependent (30 to 64 bits) and, because it’s boxed, it’s not much faster than Integer. (Unboxed types are used in high-performance computing, but won’t be used now).
Many functions use Int, but (arbitrary-precision) Integer is preferable if there is any risk of integer overflow. It’s only slightly slower than Int.
Conversions (a -> String)
For any type that is in the Show type class, you have the function:
show :: (Show a) => a -> String�
Almost all of the base types are Show. Function types are not Show.
Int -> Int is not Show
ghci> \x -> (x :: Int)
<interactive>:22:1:
No instance for (Show (Int -> Int)) arising from a use of ‘print’
In a stmt of an interactive GHCi command: print it
There’s nothing wrong with the function value. This fails because ghci (a “read-eval-print loop”) cannot call print :: (Show a) => a -> IO () on and Int -> Int, which has no Show instance.
Conversions (String -> a)
The inverse of Show/show is Read/read.
ghci> :t read
read :: Read a => String -> a
ghci> read "15" :: Int
15
ghci> read "meow" :: Int
*** Exception: Prelude.read: no parse
read is out of fashion.
General Haskell consensus is that non-IO code shouldn’t throw exceptions, which read will do if it can’t parse the String.
In general, prefer the Safe module’s readMay.
ghci> import Safe
ghci> :t readMay
readMay :: Read a => String -> Maybe a
ghci> readMay "12.5" :: Maybe Int
Nothing
ghci> readMay "12.5" :: Maybe Double
Just 12.5
Recursion vs. looping
Or: “Dude, where’s my FOR-loop?”
Usually, Haskell prefers recursion over looping.
Recursion vs. looping
int factorial(int n) {
int out = 1;
for (int i = n; i > 0; i--)
out = out * n;
return out;
}
factorial n =
if n == 0 then 1
else n * factorial (n - 1)
factorial n = loop 1 n
where loop acc n =
if (n == 0) then acc
else loop (acc * n) (n - 1)
factorial n = product [1..n]
sequence, mapM and cousins
For effectful computations, looping is sometimes a useful idiom:
sequence :: [IO a] -> IO [a]
sequence [a1, …, aN] = do r1 <- a1
...
rN <- rN
return [r1, …, rN]
sequence example
import System.Posix.Files
getFileSize :: String -> IO Int
getFileSize fname = do
stat <- getFileStatus fname
return $ fromIntegral $
fileSize stat
λ getFileSize "Lab1.hs"
7038
-- return is IO Int,
-- ghci prints the Int
λ getFileSize "Lab2.hs"
21845
λ sequence [
getFileSize "Lab1.hs",
getFileSize "Lab2.hs",
getFileSize "README.md"]
[7038,21845,1051]
sequence, mapM and cousins
For effectful computations, looping is sometimes a useful idiom:
sequence :: [IO a] -> IO [a]
sequence [a1, …, aN] = do r1 <- a1
...
rN <- rN
return [r1, …, rN]
Exercise (answer given in Lecture #4): write sequence.
sequence, mapM and cousins
sequence :: [IO a] -> IO [a]
mapM :: (a -> IO b) -> [a] -> IO [b]
mapM f xs = sequence $ map f xs
mapM_ :: (a -> IO b) -> [a] -> IO ()
mapM_ f xs = mapM f xs >> return ()
forM = flip mapM, forM_ flip mapM_
mapM and forM examples
λ sequence [getFileSize "Lab1.hs", getFileSize "Lab2.hs", getFileSize "README.md"]
[7038,21845,1051]
λ mapM getFileSize ["Lab1.hs", "Lab2.hs", "README.md"]
[7038,21845,1051]
λ forM ["Lab1.hs", "Lab2.hs", "README.md"] getFileSize
[7038,21845,1051]
looping with forM_
λ import Control.Monad
λ forM_ [1..5] $ \i ->
print i
1
2
3
4
5
-- returns ()
λ sequence $
map (\x -> (print $ x^2)
>> (return $ x^2))
[1..4]
1
4
9
16
[1,4,9,16]
Change of topic: laziness
Yes, Haskell is “lazy”.
You’ve heard that Haskell’s “laziness” makes it hard to reason about code. The concern is mostly overblown. It is true that:
Yes, Haskell is “lazy”.
You’ve heard that Haskell’s “laziness” makes it hard to reason about code. The concern is mostly overblown. It is true that:
Advantages of laziness
Infinite collections
-- This is clearly NOT the most efficient
-- way to do this computation, but it is
-- correct.
taxicab :: Integer -> Bool
taxicab n =
let m = last $
takeWhile (\x -> x^3 <= n) [1..]
solns = [(x, y) | x <- [1..m],
y <- [1..m],
x <= y,
x^3 + y^3 == n]
in length solns >= 2
ghci> take 5 $
filter taxicab [1..]
[1729,4104,13832,20683,32832]
[1..] is infinite, so evaluating
filter taxicab [1..]
in its entirety would take (literally) forever, but we can take its first 5 elements in finite time.
Thunks
A “lazy” thunk is a piece of code that hasn’t been evaluated yet.
In a strict-by-default language like Scheme or OCaml, you might define a lazy X as a 0-argument function returning X.
Thunks in Clojure and OCaml
(defn force [th]
(th))
(defn thunk-7 []
(fn [] (+ 3 4)))
type ‘a thunk = unit -> ‘a
force th = th ()
let thunk_7 =
(fun _ -> 3 + 4)
Thunks in Haskell
In Haskell, an a is, by default, a thunk producing an a when evaluated.
thunk7 :: Int
thunk7 = 3 + 4
Lazy-by-default makes the infinite data structure “magic” possible.
take 2 [1..] ({} as thunk)
take 0 _ = []
take n [] = []
take n (x:xs) = x:(take (n-1) xs)
take 2 {[1..]} = take 2 (1:{[2..]})
= 1:(take 1 {[2..]})
= 1:(take 1 2:{[3..]})
= 1:(2:(take 0 {[3..]}))
= 1:(2:[]) = [1, 2]
take 2 [1..] ({} as thunk)
take 0 _ = []
take n [] = []
take n (x:xs) = x:(take (n-1) xs)
take 2 {[1..]} = take 2 (1:{[2..]})
= 1:(take 1 {[2..]})
= 1:(take 1 2:{[3..]})
= 1:(2:(take 0 {[3..]}))
= 1:(2:[]) = [1, 2]
never evaluated!
Short-circuiting (&&) for free
In many languages, you can’t define short-circuiting AND as a function (hence macros).
clojure> (defn my-and [x y] (if x y false))
#'user/my-and
clojure> (my-and false
(println "DROP TABLE Students"))
DROP TABLE Students
false
Short-circuiting (&&) for free
In many languages, you can’t define short-circuiting AND as a function. With Haskell’s thunks, you can. If it weren’t provided, you could write this:
(&&) :: Bool -> Bool -> Bool
(&&) True b = b
(&&) False _ = False
“Short-circuiting” function example
(&&) True b = b
(&&) False _ = False
{False && error "barf"} = (&&) {False} {error "barf"}
= (&&) False {error "barf"}
= False
Non-strict (or lazy) evaluation
λ 1 `quot` 0
*** Exception: divide by zero
λ length [7, 1 `quot` 0]
???
Non-strict (or lazy) evaluation
λ 1 `quot` 0
*** Exception: divide by zero
λ length [7, 1 `quot` 0]
2
{} means “deferred” (thunk) here
length {[7, 1 `quot` 0]}
= length ( (:) {7} {[1 `quot` 0]})
= 1 + length {[1 `quot` 0]}
= 1 + length ( (:) {1 `quot` 0} {[]})
= 1 + 1 + length {[]}
= 1 + 1 + length []
= 1 + 1 + 0
= 2
A danger at scale: thunk leaks!
foldl :: (a -> b -> a) -> a -> [b] -> a
Haskell’s fold (foldl, foldr, etc.) family of functions represent “reductions” or accumulations.
foldl :: (a -> b -> a) -> a -> [b] -> a
foldl f z [] = z
foldl f z (x:xs) = (foldl f (z `f` x) xs)
foldl f x0 [x1, … , xN] = ((x0 `f` x1) `f` … xN_1) `f` xN)
foldl and foldr in action
foldl (+) 867 [5, 3, 0, 9] =
(((867 + 5) + 3) + 0) + 9 = 884
foldr (++) "!"
["A ", "Red Dragon ", "draws near"] =
"A " ++ ("Red Dragon " ++ ("draws near" ++ "!")) =
"A Red Dragon draws near!"
foldl and foldr
λ foldl (+) 0 [1..5]
????
λ foldl (*) 1 [1..4]
????
λ foldr (++) "Zebra" ["Apple", "Bear"]
????
λ foldr (\x y -> x) 0 [1, 2, 3]
????
λ foldr (\x y -> x) 0 []
????
foldl and foldr
λ foldl (+) 0 [1..5]
15
λ foldl (*) 1 [1..4]
????
λ foldr (++) "Zebra" ["Apple", "Bear"]
????
λ foldr (\x y -> x) 0 [1, 2, 3]
????
λ foldr (\x y -> x) 0 []
????
foldl and foldr
λ foldl (+) 0 [1..5]
15
λ foldl (*) 1 [1..4]
24
λ foldr (++) "Zebra" ["Apple", "Bear"]
????
λ foldr (\x y -> x) 0 [1, 2, 3]
????
λ foldr (\x y -> x) 0 []
????
foldl and foldr
λ foldl (+) 0 [1..5]
15
λ foldl (*) 1 [1..4]
24
λ foldr (++) "Zebra" ["Apple", "Bear"]
"AppleBearZebra"
λ foldr (\x y -> x) 0 [1, 2, 3]
????
λ foldr (\x y -> x) 0 []
????
foldl and foldr
λ foldl (+) 0 [1..5]
15
λ foldl (*) 1 [1..4]
24
λ foldr (++) "Zebra" ["Apple", "Bear"]
"AppleBearZebra"
λ foldr (\x y -> x) 0 [1, 2, 3]
1
λ foldr (\x y -> x) 0 []
????
foldl and foldr
λ foldl (+) 0 [1..5]
15
λ foldl (*) 1 [1..4]
24
λ foldr (++) "Zebra" ["Apple", "Bear"]
"AppleBearZebra"
λ foldr (\x y -> x) 0 [1, 2, 3]
1
λ foldr (\x y -> x) 0 []
0
The bad news: “thunk leaks”
foldl :: (a -> b -> a) -> a -> [b] -> a
foldl f z [] = z
foldl f z (x:xs) = (foldl f (z `f` x) xs)
This can* leak memory: foldl (+) 0 [1..10^9]
foldl (+) 0 [1..10^9] = foldl (+) (0 + 1) [2..10^9]
= foldl (+) ((0 + 1) + 2) [3..10^9]
= ...
*Technically, it’s undefined behavior. The compiler may optimize the leak away.
Building up a nested thunk
foldl (+) 0 [1, 2, 3] =
foldl (+) (0 + 1) [2, 3] =
foldl (+) ((0 + 1) + 2) [3] =
foldl (+) (((0 + 1) + 2) + 3) [] =
(((0 + 1) + 2) + 3) =
((1 + 2) + 3) = 3 + 3 = 6
It builds up a nested thunk. This is a non-issue for [1..3], but (unless the compiler optimizes the thunks away) a show-stopper for [1..10^9], which should be summable in constant space, but instead can blow out memory.
foldl' and seq
foldl' is a non-leaky version of foldl. To understand it, we need to understand seq:
seq :: a -> b -> b
seq x y forces x and returns y.
foldl' f z [] = z
foldl' f z (x:xs) =
let z' = f z x in z' `seq` foldl' f z' xs
Comparison: foldl vs. foldl'
foldl f z [] = z
foldl f z (x:xs) =
let z' = f z x in
foldl f z' xs
foldl (+) 0 [1, 2, 3] =
foldl (+) (0 + 1) [2, 3] =
foldl (+) ((0 + 1) + 2) [3] =
foldl (+) (((0 + 1) + 2) + 3) [] =
(((0 + 1) + 2) + 3) =
((1 + 2) + 3) = 3 + 3 = 6
foldl' f z [] = z
foldl' f z (x:xs) =
let z' = f z x in
z' `seq` foldl' f z' xs
foldl' (+) 0 [1, 2, 3] =
let z' = 0 + 1 in
z' `seq` foldl' (+) z' [2, 3] =
foldl' (+) 1 [2, 3] =
foldl' (+) 3 [3] =
foldl' (+) 6 [] = 6
Technical note: lazy vs. nonstrict
While lazy (vs. eager) and nonstrict (vs. strict) are often used interchangeably, they aren’t always the same thing.
Laziness is a matter of when work is done. Nonstrictness is a formal semantic property (covered in the next slide).
You could have nonstrictness without laziness.
Defining strictness
Recall ⊥, the “value” assigned to failure (inf. loop or exn). We use “= ⊥” to mean “computation fails”.
1 `quot` 0 = ⊥ (exception)
head [] = ⊥ (pattern match failure)
take 1 $ filter (\_ -> False) [1..] = ⊥ (non-term.)
length [1 `quot` 0] = 1
True && ⊥ = ⊥
False && ⊥ = False
Defining strictness
We use “= ⊥” to mean “computation fails”.
An N-ary function f is strict in argument x* ∈ {x1, … xN} if:
x* = ⊥ ⇒ f x1 … xN = ⊥
(+) x1 x2 is strict in both arguments.
(&&) b1 b2 is nonstrict in b2; False && ⊥ = False
What Haskell actually does
Thunks (e.g. Int) are evaluated, actions (e.g. IO Int) are executed (and may have side effects).
When a program runs, the action called main, with type signature IO (), is run. Actions are executed, and thunks are evaluated, as needed for main to do its job.
Which bits of code will run?
a :: Int
a = 5 + 6
b :: Int
b = 4 * 7
c :: Int
c = 5 + 3
d :: Int
d = a - c
action1 :: IO ()
action1 = print $ b + c
action2 :: IO Int
action2 = return $ a * a
action3 :: IO ()
action3 = print d
action4 :: IO Int
action4 = error "barf"
main :: IO ()
main = do
n <- action2
print n
action3
Which bits of code will run?
a :: Int
a = 5 + 6
b :: Int
b = 4 * 7
c :: Int
c = 5 + 3
d :: Int
d = a - c
action1 :: IO ()
action1 = print $ b + c
action2 :: IO Int
action2 = return $ a * a
action3 :: IO ()
action3 = print d
action4 :: IO Int
action4 = error "barf"
main :: IO ()
main = do
n <- action2
print n
action3
stdout:
121
a = 11
Which bits of code will run?
a :: Int
a = 5 + 6
b :: Int
b = 4 * 7
c :: Int
c = 5 + 3
d :: Int
d = a - c
action1 :: IO ()
action1 = print $ b + c
action2 :: IO Int
action2 = return $ a * a
action3 :: IO ()
action3 = print d
action4 :: IO Int
action4 = error "barf"
main :: IO ()
main = do
n <- action2
print n
action3
stdout:
121
3
a = 11
c = 8
d = 3
Which bits of code will run?
a :: Int
a = 5 + 6
b :: Int
b = 4 * 7
c :: Int
c = 5 + 3
d :: Int
d = a - c
action1 :: IO ()
action1 = print $ b + c
action2 :: IO Int
action2 = return $ a * a
action3 :: IO ()
action3 = print d
action4 :: IO Int
action4 = error "barf"
main :: IO ()
main = do
n <- action2
print n
action3
stdout:
121
3
a = 11
c = 8
d = 3
Questions?