Cactus

Arrow's place in the Applicative/Monad hierarchy

12 July 2014 (programming haskell language)

I've been thinking lately about arrows in relation to applicative functors and monads. The difference between the latter two is easy to intuit (and I'll describe it via an example below), but I never managed to get the same level of understanding about arrows. There's a somewhat famous paper about this question, which has a very clear-cut diagram showing that applicatives embed into arrows and arrows embed into monads (and both containments are non-trivial), which I understood as meaning every arrow is an applicative functor, and every monad is an arrow.

At first glance, this makes sense, given the well-known result that monads are exactly equivalent to arrows that are also instances of ArrowApply, as witnessed by the Haskell types Kleisli and ArrowMonad. However, there's no immediately obvious reason why you couldn't also turn an applicative functor into an arrow, so how does that leave any room for arrows to be different from applicatives? (As an aside, the fact that applicative functors have kind ⋆ → ⋆ and arrows have kind ⋆ → ⋆ → ⋆ has been a huge complication for me in trying to compare them).

Now, finally, based on the helpful replies to that StackOverflow question and the associated Reddit thread, I am confident enough to answer my own question.

Tom Ellis suggested thinking about a concrete example involving file I/O, so let's compare three approaches to it using the three typeclasses. To make things simple, we will only care about two operations: reading a string from a file and writing a string to a file. Files are going to be identified by their file path:

type FilePath = String

Monadic I/O

Our first I/O interface is defined as follows:

data IOM ∷ ⋆ → ⋆
instance Monad IOM
readFile ∷ FilePath → IOM String
writeFile ∷ FilePath → String → IOM ()

Using this interface, we can for example copy a file from one path to another:

copy ∷ FilePath → FilePath → IOM ()
copy from to = readFile from >>= writeFile to

However, we can do much more than that: the choice of files we manipulate can depend on effects upstream. For example, the below function takes an index file which contains a filename, and copies it to the given target directory:

copyIndirect ∷ FilePath → FilePath → IOM ()
copyIndirect index target = do
    from ← readFile index
    copy from (target ⟨/⟩ to)

On the flip side, this means there is no way to know upfront the set of filenames that are going to be manipulated by a given value action ∷ IOM α. By "upfront", what I mean is the ability to write a pure function fileNames :: IOM α → [FilePath].

Of course, for non-IO-based monads (such as ones for which we have some kind of extractor function μ α → α), this distinction becomes a bit more fuzzy, but it still makes sense to think about trying to extract information without evaluating the effects of the monad (so for example, we could ask "what can we know about a Reader Γ α without having a value of type Γ at hand?").

The reason we can't really do static analysis in this sense on monads is because the function on the right-hand side of a bind is in the space of Haskell functions, and as such, is completely opaque.

So let's try restricting our interface to just an applicative functor.

Applicative I/O

data IOF ∷ ⋆ → ⋆
instance Applicative IOF
readFile ∷ FilePath → IOF String
writeFile ∷ FilePath → String → IOF ()

Since IOF is not a monad, there's no way to compose readFile and writeFile, so all we can do with this interface is to either read from a file and then postprocess its contents purely, or write to a file; but there's no way to write the contents of a file into another one.

How about changing the type of writeFile?

writeFile′ ∷ FilePath → IOF (String → ())

The main problem with this interface is that while it would allow writing something like

copy ∷ FilePath → FilePath → IOF ()
copy from to = writeFile′ to ⟨*⟩ readFile from

it leads to all kind of nasty problems because String → () is such a horrible model of writing a string to a file, since it breaks referential transparency. For example, what do you expect the contents of out.txt to be after running this program?

(λ write → [write "foo", write "bar", write "foo"]) ⟨$⟩ writeFile′ "out.txt"

Two approaches to arrowized I/O

First of all, let's get two arrow-based I/O interfaces out of the way that don't (in fact, can't) bring anything new to the table: Kleisli IOM and Applicarrow IOF.

The Kleisli-arrow of IOM, modulo currying, is:

readFile ∷ Kleisli IOM FilePath String
writeFile ∷ Kleisli IOM (FilePath, String) ()

Since writeFile's input still contains both the filename and the contents, we can still write copyIndirect (using arrow notation for simplicity). Note how the ArrowApply instance of Kleisli IOM is not even used.

copyIndirect ∷ Kleisli IOM (FilePath, FilePath) ()
copyIndirect = proc (index, target) → do
    from ← readFile ↢ index
    s ← readFile ↢ from
    writeFile ↢ (to, s)

The Applicarrow of IOF would be:

readFile ∷ FilePath → Applicarrow IOF () String
writeFile ∷ FilePath → String → Applicarrow IOF () ()

which of course still exhibits the same problem of being unable to compose readFile and writeFile.

A proper arrowized I/O interface

Instead of transforming IOM or IOF into an arrow, what if we start from scratch, and try to create something in between, in terms of where we use Haskell functions and where we make an arrow? Take the following interface:

data IOA ∷ ⋆ → ⋆ → ⋆
instance Arrow IOA
readFile ∷ FilePath → IOA () String
writeFile ∷ FilePath → IOA String ()

Because writeFile takes the content from the input side of the arrow, we can still implement copy:

copy ∷ FilePath → FilePath → IOA () ()
copy from to = readFile from >>> writeFile to

However, the other argument of writeFile is a purely functional one, and so it can't depend on the output of e.g. readFile; so copyIndirect can't be implemented with this interface.

If we turn this argument around, this also means that while we can't know in advance what will end up being written to a file (before running the full IOA pipeline itself), but we can statically determine the set of filenames that will be modified.

Conclusion

Monads are opaque to static analysis, and applicative functors are poor at expressing dynamic-time data dependencies. It turns out arrows can provide a sweet spot between the two: by choosing the purely functional and the arrowized inputs carefully, it is possible to create an interface that allows for just the right interplay of dynamic behaviour and amenability to static analysis.


« Typed embedding of STLC into Haskell 
All posts
 My first computer »