Problem
After I started learning about [FP] a couple of years ago, my code started to resemble pseudo-purely-functional code:
```ruby
# Version A
# builds a reverse item-to-category map
def items_to_categories_map
  reducer = ->(acc, (cat, items)) do
    items.reduce(acc) { |acc, item| acc.merge(item => cat) }
  end
  CAT_TO_ITEM_MAP.reduce({}, &reducer)
end

CAT_TO_ITEM_MAP = {
  'numbers' => %w[0 1 2 3],
  'letters' => %w[a b c d],
}
```
Today I’d write the same code this way:
```ruby
# Version B
def items_to_categories_map
  acc = {}
  reducer = ->(cat, items) do
    items.each { |item| acc[item] = cat }
  end
  CAT_TO_ITEM_MAP.each(&reducer)
  acc
end

# same CAT_TO_ITEM_MAP as above
```
The surprise here is that [Version B is 3 times faster than Version A].
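You can check the comparison yourself with a sketch like the following, using Ruby’s stdlib `Benchmark` module (the iteration count `N` is an arbitrary choice of mine; exact ratios will vary by machine and Ruby version):

```ruby
require 'benchmark'

CAT_TO_ITEM_MAP = {
  'numbers' => %w[0 1 2 3],
  'letters' => %w[a b c d],
}

# Version A: immutable, allocates a brand-new hash on every merge
def items_to_categories_map_a
  reducer = ->(acc, (cat, items)) do
    items.reduce(acc) { |a, item| a.merge(item => cat) }
  end
  CAT_TO_ITEM_MAP.reduce({}, &reducer)
end

# Version B: mutates a single local accumulator in place
def items_to_categories_map_b
  acc = {}
  CAT_TO_ITEM_MAP.each do |cat, items|
    items.each { |item| acc[item] = cat }
  end
  acc
end

N = 100_000
Benchmark.bmbm do |x|
  x.report('Version A (immutable)') { N.times { items_to_categories_map_a } }
  x.report('Version B (mutable)')   { N.times { items_to_categories_map_b } }
end
```

Both methods return the same hash, so the timing difference is purely the cost of the intermediate allocations.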
Why hadn’t I been preferring the more performant version before? Read on…
Immutable by default
[FP] is great for a few reasons, and one of them is this:

> pure (immutable) code is simpler to reason about than impure code

While I agree with the above statement, it led me to develop an “immutable by default” habit when writing code.
“Immutable by default” principle generally works because:
- it’s concurrency-safe by definition.
- it’s simpler to reason about: the result depends only on the input (`y = f(x)`), a single and simple data flow: input comes in, result comes out.
- careless mutation, on the other hand, introduces implicit coupling and may lead to unpredictable results.
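To illustrate that last point, here is a hypothetical example of mine (not from the original gist): a method that quietly mutates its argument couples the caller to the callee’s hidden side effect, while the pure variant leaves the caller’s data alone.

```ruby
# Impure: sorts the caller's array in place as a hidden side effect
def top_item!(items)
  items.sort!
  items.first
end

# Pure: builds a sorted copy and leaves the input untouched
def top_item(items)
  items.sort.first
end

list = %w[c a b]
top_item!(list)
list  # => ["a", "b", "c"] -- the caller's data changed behind its back

list2 = %w[c a b]
top_item(list2)
list2 # => ["c", "a", "b"] -- unchanged
```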
Despite the benefits, applying the “Immutable by default” principle blindly is not always the best approach: the application context must be considered.
Reality check
Version A of the example code is a direct application of the “Immutable by default” principle: it avoids any mutation.
But once we consider the context (assuming the code must be concurrency-safe):
- It’s [Ruby].
- The resulting hash is a local variable and is not shared with any other process; therefore it’s safe to mutate within its scope.
then it becomes obvious that the “Immutable by default” principle has its flaws in this particular case.
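One idiomatic middle ground in Ruby (my suggestion, not part of the original benchmark) is `each_with_object`: it keeps the functional, expression-oriented shape while mutating a single local accumulator that never escapes before it’s fully built.

```ruby
CAT_TO_ITEM_MAP = {
  'numbers' => %w[0 1 2 3],
  'letters' => %w[a b c d],
}

# The accumulator hash stays local to this expression until it is
# finished being built, so mutating it in place is safe.
def items_to_categories_map
  CAT_TO_ITEM_MAP.each_with_object({}) do |(cat, items), acc|
    items.each { |item| acc[item] = cat }
  end
end

items_to_categories_map
# => {"0"=>"numbers", "1"=>"numbers", "2"=>"numbers", "3"=>"numbers",
#     "a"=>"letters", "b"=>"letters", "c"=>"letters", "d"=>"letters"}
```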
It’s Ruby
[Ruby] is a general-purpose scripting language and is not optimised for immutable programming: my conclusion is based purely on the fact that Version A (immutable) performs worse than its mutable counterpart, Version B.
(In fact, it’s optimised to please developers and make them happy!)
Safe to mutate
Sometimes the mutable approach produces better results due to various factors:
- mutation avoids new object allocations, thus reducing CPU and memory usage.
- mutation removes the need for the GC to collect all the intermediate objects, reducing CPU usage as well.
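The allocation difference can be observed directly. The sketch below (my own, not from the gist) uses `GC.stat`’s `:total_allocated_objects` counter, available since Ruby 2.1; exact numbers vary by Ruby version, but the immutable variant consistently allocates far more intermediate objects.

```ruby
PAIRS = ('a'..'z').map.with_index { |c, i| [c, i] }

# Counts how many objects the block allocates
def allocations
  before = GC.stat(:total_allocated_objects)
  yield
  GC.stat(:total_allocated_objects) - before
end

# Immutable: every `merge` allocates a brand-new hash (plus the
# one-pair argument hash), all of which become garbage immediately
immutable = allocations do
  PAIRS.reduce({}) { |acc, (k, v)| acc.merge(k => v) }
end

# Mutable: one hash, updated in place
mutable = allocations do
  PAIRS.each_with_object({}) { |(k, v), acc| acc[k] = v }
end

puts "immutable: #{immutable} allocations, mutable: #{mutable} allocations"
```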
Conclusion
- “Immutable by default” is a safe bet and generally works.
- Careful mutation allows achieving better performance and lower memory usage.
- In contrast, misapplied mutation results in code that’s hard to reason about and leads to unpredictable results.
Know your options.
Links
- Gist with code and benchmarks

[Version B is 3 times faster than Version A]:https://gist.github.com/gmarik/69e55f1dc28261f17aa4#file-code-rb-L56-L59
[FP]:https://en.wikipedia.org/wiki/Functional_programming
[Ruby]:https://ruby-lang.org