#Global State in Haskell with IORef

AKA “The IORef trick”.

In Haskell, functions are pure and there’s no global state. But sometimes global state is convenient! Suppose you want to add an extra bit of configuration (for example, “are we running in verbose mode” or “is this a dry run”) that you would otherwise need to thread through a large portion of code in a fairly mechanical fashion. If your program is written in IO or some other context that’s not very amenable to being extended with additional data, or if you’re prototyping a solution before implementing it “the right way”, then a little bit of mutable global state can be a very useful thing indeed. To be clear, I’m not saying you should add global state to your programs. In fact, I will caution you against it: this is the sort of thing that gets your PRs rejected.

This technique, which I will call “the IORef trick”, allows us to implement a global variable that can be read or written in any IO action. I believe the IORef trick is fairly well-known but not talked about much: I couldn’t find it documented anywhere when I needed it (and of course I didn’t really need it). I suspect that this is because your coworkers are worried you will start littering global mutable state everywhere you go, if only you knew how, and they would rather you do things “the right way”. Admittedly, they’re probably correct, so be responsible with this technique!

#The IORef trick

Here’s how the trick works:

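```haskell
module MyParameter (getMyParameter, setMyParameter) where

import Data.IORef (IORef, atomicModifyIORef', newIORef)
import System.IO.Unsafe (unsafePerformIO)

-- The global variable itself. NOINLINE keeps GHC from duplicating the
-- unsafePerformIO call, so newIORef runs only once.
{-# NOINLINE myParameterRef #-}
myParameterRef :: IORef Int
myParameterRef = unsafePerformIO (newIORef 0)

-- Replace the stored value, ignoring the old one.
setMyParameter :: Int -> IO ()
setMyParameter newValue =
  atomicModifyIORef' myParameterRef (\_ -> (newValue, ()))

-- Read the stored value by writing it back unchanged and returning it.
getMyParameter :: IO Int
getMyParameter =
  atomicModifyIORef' myParameterRef (\value -> (value, value))
```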

First, we create a new IORef called myParameterRef to store our global variable and initialize it with its default value. This is where the naughtiness lives: we have to use unsafePerformIO to create the IORef. It’s very important to mark this value as NOINLINE or GHC may end up running newIORef multiple times, causing you an enormous headache. If your global variable has no sensible default value, you may want to adapt this code to work with an MVar instead.
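For the case with no sensible default, here is a minimal sketch of what an MVar-based variant might look like. The names `myConfigVar`, `initMyConfig`, `getMyConfig`, and `tryGetMyConfig` are hypothetical, and the choice to block readers until the value is set is an assumption about what you want, not the only option.

```haskell
import Control.Concurrent.MVar (MVar, newEmptyMVar, putMVar, readMVar, tryReadMVar)
import System.IO.Unsafe (unsafePerformIO)

-- Hypothetical global with no default value: it starts out empty.
{-# NOINLINE myConfigVar #-}
myConfigVar :: MVar String
myConfigVar = unsafePerformIO newEmptyMVar

-- Fill the variable. putMVar blocks if the MVar is already full,
-- so this is meant to be called exactly once, at startup.
initMyConfig :: String -> IO ()
initMyConfig = putMVar myConfigVar

-- readMVar waits until a value is available and leaves it in place.
getMyConfig :: IO String
getMyConfig = readMVar myConfigVar

-- Non-blocking variant: Nothing means "not initialised yet".
tryGetMyConfig :: IO (Maybe String)
tryGetMyConfig = tryReadMVar myConfigVar

main :: IO ()
main = do
  before <- tryGetMyConfig
  print before            -- Nothing
  initMyConfig "verbose"
  after <- getMyConfig
  putStrLn after          -- verbose
```

Reading before initialisation blocks rather than returning a made-up default, which tends to surface ordering bugs early.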

Next, we write a setter for our variable using atomicModifyIORef'. Evaluating atomicModifyIORef' ref (\old -> (new, result)) will set ref to new and return result. Note that the supplied lambda has access to the previous value as well. (This is explained in the documentation for atomicModifyIORef, but only in the seventh paragraph, so I thought it might be worth calling attention to.)
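To make the role of the lambda concrete, here is a small sketch that actually uses the previous value. The helper `fetchAndIncrement` is a hypothetical name, not something from the module above.

```haskell
import Data.IORef (IORef, atomicModifyIORef', newIORef)

-- Hypothetical helper: bump a counter and return the value it held
-- before the bump. The lambda receives the old value, and its result
-- is (new value to store, value to return).
fetchAndIncrement :: IORef Int -> IO Int
fetchAndIncrement ref = atomicModifyIORef' ref (\old -> (old + 1, old))

main :: IO ()
main = do
  counter <- newIORef 10
  previous <- fetchAndIncrement counter
  current <- fetchAndIncrement counter
  print (previous, current)  -- (10,11)
```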

We could simplify setMyParameter by writing it in terms of atomicWriteIORef, but there’s no corresponding atomicReadIORef function, so we would still need to import atomicModifyIORef' to write the getter.
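For comparison, a sketch of that atomicWriteIORef alternative might look like the following. One difference worth knowing: atomicWriteIORef does not force the value it stores, so this sketch uses `$!` to keep the setter strict like the original. Whether you want that is a judgment call.

```haskell
import Data.IORef (IORef, atomicModifyIORef', atomicWriteIORef, newIORef)
import System.IO.Unsafe (unsafePerformIO)

{-# NOINLINE myParameterRef #-}
myParameterRef :: IORef Int
myParameterRef = unsafePerformIO (newIORef 0)

-- Setter via atomicWriteIORef. The ($!) forces newValue to weak head
-- normal form first, because atomicWriteIORef itself stores it lazily.
setMyParameter :: Int -> IO ()
setMyParameter newValue = atomicWriteIORef myParameterRef $! newValue

-- The getter still needs atomicModifyIORef'.
getMyParameter :: IO Int
getMyParameter =
  atomicModifyIORef' myParameterRef (\value -> (value, value))

main :: IO ()
main = do
  setMyParameter 7
  getMyParameter >>= print  -- 7
```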

Finally, we write a getter in terms of atomicModifyIORef', which writes the current value back into the IORef unchanged and returns it.
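As a usage illustration, here is roughly how the same pattern could back a “verbose mode” flag. Everything named here (`verboseRef`, `setVerbose`, `isVerbose`, `logVerbose`) is hypothetical, and logging to stderr is just an assumption for the sketch.

```haskell
import Control.Monad (when)
import Data.IORef (IORef, atomicModifyIORef', newIORef)
import System.IO (hPutStrLn, stderr)
import System.IO.Unsafe (unsafePerformIO)

-- Hypothetical global flag, off by default.
{-# NOINLINE verboseRef #-}
verboseRef :: IORef Bool
verboseRef = unsafePerformIO (newIORef False)

setVerbose :: Bool -> IO ()
setVerbose flag = atomicModifyIORef' verboseRef (\_ -> (flag, ()))

isVerbose :: IO Bool
isVerbose = atomicModifyIORef' verboseRef (\flag -> (flag, flag))

-- Any IO action can consult the flag without it being passed in.
logVerbose :: String -> IO ()
logVerbose message = do
  verbose <- isVerbose
  when verbose (hPutStrLn stderr message)

main :: IO ()
main = do
  logVerbose "not printed: verbose mode is off by default"
  setVerbose True
  logVerbose "printed: verbose mode is on"
```

Note that logVerbose takes no configuration argument at all, which is the convenience (and the hazard) of the trick.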

#Strictness

The single quote at the end of atomicModifyIORef' means (in this context) that it’s the strict version of atomicModifyIORef. (You may hear this symbol referred to as a tick mark or a prime.) Strictness here means that running setMyParameter newValue will evaluate newValue to weak head normal form before storing it. One alternative is to leave the value unevaluated, which may cause memory leaks through thunk buildup and can defer an exception hidden in the value until some later read. Another alternative is to use deepseq to fully force the value. I think that atomicModifyIORef' is a good default, but the correct choice will depend on the semantics of your particular application.
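The difference is easiest to see with a value that blows up when evaluated. This sketch stores `error "boom"` three ways and reports whether the write itself throws. The helper names are hypothetical.

```haskell
import Control.Exception (ErrorCall, try)
import Data.IORef (atomicModifyIORef, atomicModifyIORef', newIORef)

-- Hypothetical helper: run an action and report whether it threw an ErrorCall.
throwsError :: IO a -> IO Bool
throwsError action = either isError (const False) <$> try action
  where
    isError :: ErrorCall -> Bool
    isError _ = True

-- Lazy version: the bad value is stored as an unevaluated thunk.
lazyWriteThrows :: IO Bool
lazyWriteThrows = do
  ref <- newIORef (0 :: Int)
  throwsError (atomicModifyIORef ref (\_ -> (error "boom", ())))

-- Strict version: the new value is forced, so the write throws.
strictWriteThrows :: IO Bool
strictWriteThrows = do
  ref <- newIORef (0 :: Int)
  throwsError (atomicModifyIORef' ref (\_ -> (error "boom", ())))

-- Weak head normal form only reaches the outermost constructor,
-- so an error hidden inside Just is not noticed.
nestedWriteThrows :: IO Bool
nestedWriteThrows = do
  ref <- newIORef (Nothing :: Maybe Int)
  throwsError (atomicModifyIORef' ref (\_ -> (Just (error "boom"), ())))

main :: IO ()
main = do
  lazyWriteThrows >>= print    -- False
  strictWriteThrows >>= print  -- True
  nestedWriteThrows >>= print  -- False
```

The last case is where deepseq would come in, if you need the whole structure forced.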

#Thread safety

The “atomic” in atomicModifyIORef' means that the read-modify-write happens as one indivisible step and acts as a memory barrier, so calls to the getter and setter return consistent values even with instruction reordering on multiple threads, at the cost of some performance. If you don’t require thread safety, you can rewrite the getter and setter in terms of readIORef and writeIORef.
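A sketch of that non-atomic variant, assuming a single-threaded program, could look like this. The `$!` is an addition to keep the setter strict, since writeIORef stores its argument lazily.

```haskell
import Data.IORef (IORef, newIORef, readIORef, writeIORef)
import System.IO.Unsafe (unsafePerformIO)

{-# NOINLINE myParameterRef #-}
myParameterRef :: IORef Int
myParameterRef = unsafePerformIO (newIORef 0)

-- Plain write: no memory barrier. ($!) forces the value before storing it.
setMyParameter :: Int -> IO ()
setMyParameter newValue = writeIORef myParameterRef $! newValue

-- Plain read: no memory barrier.
getMyParameter :: IO Int
getMyParameter = readIORef myParameterRef

main :: IO ()
main = do
  setMyParameter 3
  getMyParameter >>= print  -- 3
```

Keep in mind that a get followed by a set is two separate steps, so another thread could slip in between them. An update that depends on the old value is better written with atomicModifyIORef'.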