Cracking post on the Elm discourse by john_s in response to rupert’s question regarding “intensely monadic code”. In his response john_s details some findings from a review of 300k lines of Purescript/Haskell code:
In surveying over 300K lines of PureScript / Haskell code in 4 large applications I believe the most genuinely needed monadic binds I ever saw in any properly sized function was 4 and was 1 on average!!!. In other cases,
- No Monad was even necessary because it collapsed to identity!!!
- Functor was the real operation.
- Applicative was the real operation.
- The engineer did not understand the State monad or the State monad transformer.
- The function was too large.
He details 4 of these patterns. I can almost see how these patterns emerge as more complicated code is thought needed, or even written in the first place, and then that is simplified down, but then that simplification is just not taken down as far as it can go, here is the first pattern:
noWorkDone = do
x <- someResult
pure x
-- the above is basically like the following in Elm
noWorkDone =
someResult |> Result.andThen Ok
-- so... um... this doesn't do anything! It unwraps the result and then re-wraps it
-- without doing any work
noWorkDone = someResult
I think this survey lends credence to the idea that do
notation was an error, and hence to Elm’s rejection of that. I’m not saying it settles the case, but it’s definitely a point in favour not adding do
notation to your language.
I did quibble with the second variant of the third pattern, which was given as:
mostlyApplicative = do
a <- aResult
b <- bResult
c <- f a b
pure { a, b, c }
-- re-written in PureScript
mostlyApplicative = do
Tuple a b <- lift2 Tuple aResult bResult
map (\c -> { a, b, c }) (f a b)
-- in Elm... this is messy and should be re-framed but I am just trying to make
-- a point of showing only one real `andThen` needed
mostlyApplicative =
Result.map2 pair aResult bResult
|> Result.andThen (\(a, b) -> Result.map (\c -> { a = a, b = b, c = c }) (f a b))
I think the do
notation is very much easier to read. In fact the middle effort here is somewhat unreadable to me.
Here is how I replied:
Here I find the do notation version at least as readable as the others. In 3a I frankly find the middle
-- re-written in PureScript
pretty unreadable (though I do think it’s instantly recognisable as ‘type-munging’ code, for which you tend to really only have to look at the type signature). However, I think in most actual cases of this pattern, there is a bit of actual logic as well and that can get lost in amongst all the “type munging”. Anyway in 3a I find that the core thing to understand about the code would likely be thatc
is basically the result of applyingf
toa
andb
(albeit and then extracting the result), and I find that much easier to see/understand in thedo
notation version than either of the other two. Though I guess if you have more realistic names thanc
andf
it’s perhaps clearer, e.g. ifc
is actuallydistance
andf
isgetCartesianDistance
and perhapsa
isstartPoint
andb
isendPoint
or something like that.
Anyway food for thought. Here’s a link to john_s’s post again.