#Global State in Haskell with IORef

AKA “The IORef trick”.

In Haskell, functions are pure and there’s no global state. But sometimes
global state is convenient! If you want to add some extra bit of configuration
that you would need to thread through some large portion of code in a fairly
mechanical fashion (for example, “are we running in verbose mode” or “is this a
dry run”) and your program is written in IO or some other context that’s not
very amenable to being extended with additional data, or if you’re prototyping
a solution before implementing it “the right way”, then a little bit of mutable
global state can be a very useful thing indeed. To be clear, I’m not saying you
should add global state to your programs. In fact, I will caution you against
it: this is the sort of thing that gets your PRs rejected.

This technique, which I will call “the IORef trick”, allows us to implement a
global variable that can be read or written in any IO action. I believe the
IORef trick is fairly well-known, but not talked about much; I
couldn’t find it documented anywhere when I needed it (and of course I didn’t
really need it). I suspect that this is because your coworkers are worried
you will start littering global mutable state everywhere you go, if only you
knew how, and they would rather you do things “the right way”. Admittedly,
they’re probably correct, so be responsible with this technique!

#The IORef trick

Here’s how the trick works:

    module MyParameter (getMyParameter, setMyParameter) where

    import Data.IORef (IORef, atomicModifyIORef', newIORef)
    import System.IO.Unsafe (unsafePerformIO)

    {-# NOINLINE myParameterRef #-}
    myParameterRef :: IORef Int
    myParameterRef = unsafePerformIO (newIORef 0)

    setMyParameter :: Int -> IO ()
    setMyParameter newValue =
      atomicModifyIORef' myParameterRef (\_ -> (newValue, ()))

    getMyParameter :: IO Int
    getMyParameter =
      atomicModifyIORef' myParameterRef (\value -> (value, value))

First, we create a new IORef called myParameterRef to store our global variable
and initialize it with its default value. This is where the naughtiness lives:
we have to use unsafePerformIO to create the IORef. It’s very important to mark
this value as NOINLINE, or GHC may inline the definition and end up running
newIORef multiple times, leaving you with several independent references and an
enormous headache. If your global variable has no sensible default value, you
may want to adapt this code to work with an MVar instead.

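For illustration, here is a rough sketch of what that MVar adaptation might look like. It assumes the variable is meant to be set once at startup; the names myParameterVar, initMyParameter and tryGetMyParameter are made up for this example, and other designs (for instance a getter that blocks with readMVar until the value arrives) would work too.

```haskell
import Control.Concurrent.MVar (MVar, newEmptyMVar, tryPutMVar, tryReadMVar)
import System.IO.Unsafe (unsafePerformIO)

-- Hypothetical global with no default value: it starts out empty.
{-# NOINLINE myParameterVar #-}
myParameterVar :: MVar Int
myParameterVar = unsafePerformIO newEmptyMVar

-- Fill the variable if it is still empty.
-- Returns False (and changes nothing) if it was already initialised.
initMyParameter :: Int -> IO Bool
initMyParameter = tryPutMVar myParameterVar

-- Read without blocking; Nothing means "not initialised yet".
tryGetMyParameter :: IO (Maybe Int)
tryGetMyParameter = tryReadMVar myParameterVar

main :: IO ()
main = do
  tryGetMyParameter >>= print   -- Nothing
  initMyParameter 42 >>= print  -- True
  initMyParameter 99 >>= print  -- False
  tryGetMyParameter >>= print   -- Just 42
```

The NOINLINE pragma matters here for the same reason as with the IORef: without it, GHC could end up creating more than one MVar.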
Next, we write a setter for our variable using atomicModifyIORef'. Evaluating
atomicModifyIORef' ref (\old -> (new, result)) will set ref to new and return
result. Note that the supplied lambda has access to the previous value as well.
(This is explained in the documentation for atomicModifyIORef, but only in the
seventh paragraph, so I thought it might be worth calling attention to.)

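As a small illustration of a lambda that actually uses the previous value, here is a sketch of a counter that is bumped by one while the old value is handed back. The helper name fetchAndIncrement is hypothetical and not part of Data.IORef.

```haskell
import Data.IORef (IORef, atomicModifyIORef', newIORef, readIORef)

-- Hypothetical helper: store (old + 1) in the IORef and return old.
fetchAndIncrement :: IORef Int -> IO Int
fetchAndIncrement ref = atomicModifyIORef' ref (\old -> (old + 1, old))

main :: IO ()
main = do
  counter <- newIORef 10
  previous <- fetchAndIncrement counter
  current <- readIORef counter
  print (previous, current)  -- (10,11)
```

The first component of the tuple is what gets written back and the second is what the caller receives, which is easy to mix up.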
We could simplify setMyParameter by writing it in terms of atomicWriteIORef,
but there’s no corresponding atomicReadIORef function, so we’d still need to
import atomicModifyIORef' to write the getter.

Finally, we write a getter in terms of atomicModifyIORef', which stores the old
value back in the IORef unchanged and then returns it.

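To make the walkthrough concrete, here is a small self-contained sketch that applies the same pattern to the “are we running in verbose mode” example from the introduction. The names verboseRef, setVerbose, getVerbose and logVerbose are invented for illustration. The setter uses the atomicWriteIORef variant mentioned above, with $! added because atomicWriteIORef does not force the value it stores.

```haskell
import Control.Monad (when)
import Data.IORef (IORef, atomicModifyIORef', atomicWriteIORef, newIORef)
import System.IO.Unsafe (unsafePerformIO)

-- Hypothetical global flag, defaulting to "not verbose".
{-# NOINLINE verboseRef #-}
verboseRef :: IORef Bool
verboseRef = unsafePerformIO (newIORef False)

-- Setter via atomicWriteIORef; ($!) evaluates the value to WHNF first.
setVerbose :: Bool -> IO ()
setVerbose newValue = atomicWriteIORef verboseRef $! newValue

-- Getter: write the current value back unchanged and return it.
getVerbose :: IO Bool
getVerbose = atomicModifyIORef' verboseRef (\value -> (value, value))

-- Code deep in the program can consult the flag without an extra argument.
logVerbose :: String -> IO ()
logVerbose message = do
  verbose <- getVerbose
  when verbose (putStrLn ("[verbose] " ++ message))

main :: IO ()
main = do
  logVerbose "first message"   -- prints nothing: the flag is still False
  setVerbose True
  logVerbose "second message"  -- prints "[verbose] second message"
```

Note that logVerbose takes no configuration parameter at all, which is exactly the convenience (and the danger) of this technique.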
#Strictness

The single quote at the end of atomicModifyIORef' means (in this context) that
it’s the strict version of atomicModifyIORef. (You may hear this symbol
referred to as a tick mark or a prime.) Strictness here means that running
setMyParameter newValue will evaluate newValue to weak head normal form. One
alternative is to leave the value unevaluated, which may cause lazy IO
problems, memory leaks, or thunk buildup. Another alternative is to use deepseq
to fully force the value. I think that atomicModifyIORef' is a good default,
but the correct choice will depend on the semantics of your particular
application.

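The difference is easiest to see with a value that blows up when evaluated. The following sketch writes such a value with both variants and reports whether the write itself threw; outcome is a hypothetical helper written just for this demonstration.

```haskell
import Control.Exception (ErrorCall, try)
import Data.IORef (atomicModifyIORef, atomicModifyIORef', newIORef)

-- Hypothetical helper: run an action and report whether it threw an ErrorCall.
outcome :: IO () -> IO String
outcome action = do
  result <- try action :: IO (Either ErrorCall ())
  pure (either (const "threw") (const "ok") result)

main :: IO ()
main = do
  lazyRef <- newIORef (0 :: Int)
  strictRef <- newIORef (0 :: Int)
  whnfRef <- newIORef (Just (0 :: Int))

  -- Lazy variant: the failing thunk is stored without being evaluated.
  lazyResult <- outcome (atomicModifyIORef lazyRef (\_ -> (error "boom", ())))
  putStrLn ("lazy:   " ++ lazyResult)    -- lazy:   ok

  -- Strict variant: the new value is forced, so the error surfaces at the write.
  strictResult <- outcome (atomicModifyIORef' strictRef (\_ -> (error "boom", ())))
  putStrLn ("strict: " ++ strictResult)  -- strict: threw

  -- WHNF only evaluates the outermost constructor, so Just hides the error.
  whnfResult <- outcome (atomicModifyIORef' whnfRef (\_ -> (Just (error "boom"), ())))
  putStrLn ("whnf:   " ++ whnfResult)    -- whnf:   ok
```

The last case is the one that deepseq would change: fully forcing the value would also evaluate the contents of the Just.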
#Thread safety

The “atomic” in atomicModifyIORef' means that the update is applied as a single
indivisible step and that a memory fence is added when executing this function.
This makes calls to the getter and setter return consistent values even with
instruction reordering on multiple threads, at the cost of some performance. If
you don’t require thread safety, you can rewrite the getter and setter in terms
of readIORef and writeIORef.
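A minimal sketch of that non-atomic variant might look like the following, assuming the program only ever touches the variable from a single thread. The $! in the setter is my own addition to keep the weak head normal form behaviour described in the strictness section, since writeIORef by itself stores the value unevaluated.

```haskell
import Data.IORef (IORef, newIORef, readIORef, writeIORef)
import System.IO.Unsafe (unsafePerformIO)

{-# NOINLINE myParameterRef #-}
myParameterRef :: IORef Int
myParameterRef = unsafePerformIO (newIORef 0)

-- Plain write: no memory fence; ($!) evaluates the value to WHNF first.
setMyParameter :: Int -> IO ()
setMyParameter newValue = writeIORef myParameterRef $! newValue

-- Plain read: no write-back is needed.
getMyParameter :: IO Int
getMyParameter = readIORef myParameterRef

main :: IO ()
main = do
  getMyParameter >>= print  -- 0
  setMyParameter 7
  getMyParameter >>= print  -- 7
```

The getter also becomes simpler to read, since it no longer has to store the value back just to return it.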