Mutability is hard to reason about¶
A lot of programming languages support mutability. For example, some objects in Python are mutable:
x = [1, 2, 3]
x.reverse()
x
This may not seem problematic at first. A lot of people would argue that it is indeed necessary to program. However, when things can change, we sometimes are forced to understand more details than the bare minimum necessary. For example:
# This function is just for ilustration purposes.
# Imagine a situation where a very long and complex method mutates one of it's arguments...
from typing import List, TypeVar
T = TypeVar('T')
def m1(x: List[T]) -> None:
"""Reverses its argument"""
x.reverse()
return None
vowels = ['a', 'e', 'i', 'o', 'u']
m1(vowels)
vowels
Now, we have to dig into the implementation of m1
, to understand how the method affects its arguments.
A simpler approach is to rely on immutable data structures/variables. This may seem like a more difficult approach, but it makes programming easier in the long run.
# Note: The example above serves to illustrate the problems with mutation.
# Of course, it is not the *only* way to do it on Python.
# For example, a more functional approach would be (using `List[T]`):
def m2(x: List[T]) -> List[T]:
return x[::-1]
vowels2 = ['a', 'e', 'i', 'o', 'u']
print(m2(vowels2))
print(vowels2) # Remains unmodified
Let's use an immutable approach to the previous problem with pyrsistent
Python's library:
from pyrsistent import plist
ns1 = plist([1, 2, 3])
ns1
ns2 = ns1.reverse()
ns2
# Notice that original list remains unmodified (it is an immutable/persistent data structure!)
ns1
The following script, is a complete application of the concepts just presented.
from pyrsistent import PRecord, field
from typing import Callable, Optional, TypeVar
from scipy.optimize import newton
import matplotlib.pyplot as plt
import numpy as np
A = TypeVar('A')
B = TypeVar('B')
F1 = Callable[[A], B]
RealF = F1[float, float]
class RootPlot(PRecord):
def inv(self):
return self.x_min <= self.x_max, 'x_min bigger than x_max'
__invariant__ = inv
x_min = field(type=float, mandatory=True)
x_max = field(type=float, mandatory=True)
x_init = field(type=float)
output_file = field(type=str)
def plot(self,
y: RealF,
dy: Optional[RealF] = None,
dy2: Optional[RealF] = None) -> None:
root = newton(func=y, x0=self.x_init, fprime=dy, fprime2=dy2)
x = np.linspace(self.x_min, self.x_max)
plt.clf()
plt.plot(x, np.vectorize(y)(x))
plt.plot(root, 0.0, 'r+')
plt.grid()
plt.savefig(self.output_file)
plt.close()
def y(x: float) -> float:
return ((2*x - 11.7)*x + 17.7)*x - 5.0
def dy(x: float) -> float:
return (6.0*x - 23.4)*x + 17.7
def dy2(x: float) -> float:
return 12*x - 23.4
p = RootPlot(x_min=0.0,
x_max=4.0,
x_init=3.0,
output_file="simple_plot.png")
# This wouldn't change final result. You would still get a plot
# from 0.0 to 4.0
# p.set(x_init=2.0)
p.plot(y)
Functional programming relies on immutability¶
Immutable data structures/collections exist in a lot of programming languages:
- Haskell
- Scala
- FSharp
- Clojure
- C#
- JavaScript
- etc
You may have a lot of questions on the practicality and performance of Immutable Data Structures. There has been a lot of work and research on this topic. To give an example, Chris Okasaki received his PhD for his work on Purely Functional Data Structures. Take a look at https://www.cs.cmu.edu/~rwh/theses/okasaki.pdf
Immutability in the context of Object Oriented Programming¶
I will use examples from several programming languages that support Object Oriented Programming, mutability as well as immutability: Java, Scala, F#, C#.
Using Java¶
We are going to use Java to give an example (taken from Reactive Design Patterns by Roland Kuhn, et. al.) of an unsafe mutable class, which may hide unexpected behavior:
import java.util.Date;
public class Unsafe {
private Date timestamp;
private final StringBuffer message;
public Unsafe(Date timestamp, StringBuffer message) {
this.timestamp = timestamp;
this.message = message;
}
public synchronized Date getTimestamp() {
return timestamp;
}
public synchronized void setTimestamp(Date timestamp) {
this.timestamp = timestamp;
}
public StringBuffer getMessage() {
return message;
}
}
Can you spot the problems?
The following behaves predictably and is easier to reason about:
import java.util.Date;
public class Immutable {
private final Date timestamp;
private final String message;
public Immutable(final Date timestamp, final String message) {
this.timestamp = new Date(timestamp.getTime());
this.message = message;
public Date getTimestamp() {
return new Date(timestamp.getTime());
}
public String getMessage() {
return message;
}
}}
Using Scala¶
Let's start with an example that stresses that using mutability forces to understand the context where this technique is used.
class Counter {
private var value = 0
def increment() { value += 1} // <== This method *mutates* value
def current = value
}
Assume we create a Counter
instance, and then call "several times" the increment
method:
val counter = new Counter
// Block1 of code using increment(), possibly several times.
// ...
// ...
val count = counter.current
Can you guess which is the current count? Why? Do you need to know more information to give the exact answer? Do you think this requires more effort/time from you?
Now, lets compare with an the following immutable definition (also supported by the language):
final case class ImmutableCounter(current: Int = 0) {
def increment: ImmutableCounter = ImmutableCounter(current + 1)
}
NOTE (Scala specific): When you declare a
case class
, several things happen automatically:
- Each of the constructor parameters becomes a
val
unless it is explicitly declared as avar
.- An
apply
method is provided for the companion object that lets you construct objects withoutnew
.- An
unapply
method is provided that makes pattern matching work.Methods
toString
,equals
,hashCode
andcopy
are generated unless they are explicitly provided.To get the equivalent functionality in other languages, like Java, you would have to write much more code, and/or use libraries like Lombok. Hopefully we will see Java evolving. Take a look at Data Classes for Java from Project Amber and Value Types from Project Valhalla.
Now, for a given ImmutableCounter
instance, it is impossible to mutate the current
count. You would need to create new instances of the class to be able to get different values. For example:
val initialCount = ImmutableCounter(0)
val counter1 = initialCount.increment
// Possibly big chunk of code manipulating counters
// ...
// ...
val someCount = counter1.current
Can you guess which is the value of someCount without studying the "Possibly big chunk of code"? Which is the value of someCount
?
Whereas the above example may feel fictitious, it illustrates one important point: Immutability allows you to focus in less code, so it will be easier for you to catch errors, and the compiler can protect you from making mistakes. Final result: you will make less mistakes in your code (less bugs!).
In Scala, it is a best practice to avoid var
s, and try to use val
s for primitive types (the story has some subtleties for reference types) to avoid mutation and make your life easier.
Using .NET (F# and C#)¶
Take a look at this blog post: https://fsharpforfunandprofit.com/posts/correctness-immutability/
Immutable Data Structures allow easier concurrency¶
Take a look at https://clojure.org/about/concurrent_programming to read how immutable data structures will ease multicore/multithreaded programming on the JVM with Clojure.
Global Data and Mutable Variables¶
Using mutable global variables can be very dangerous (AFAIK JavaScript allows this). Take a look at a thorough discussion on this topic on Section 13.3 Global Data of Code Complete 2nd Edition, by Steve McConnell.
Extra: Avoiding Null Reference Exceptions by using descriptive types.¶
Sometimes people allow mutation of variables to encode the possibility that a value sometimes does not exist.
To encode the absence of a value, they use null
s. Like this:
final case class Configuration(numberOfCores: Int)
var configuration: Configuration = null
// Block1 of code logic depending on configuration
// ...
// Some time later
configuration = Configuration(4)
(Assume you "have to" use var
s here, because you have no control over the whole source code)
Can you spot a potential problem in Block1
above while trying to now the number of cores that have been configured?
If there is a possibility that sometimes a value may not exist, you can encode that using Option
:
final case class Configuration(numberOfCores: Int)
var configuration: Option[Configuration] = None
// Block1 of code logic depending on configuration
// ...
// some time later
configutation = Some(Configuration(4))
Now, our program won't crush at runtime if we try to get the number of cores configured in Block1
. We will simply get None
, meaning that we have not configured our system yet. No more runtime crashes. You just need to allow the type system work for you, and encode the possibility of absence of a value using an appropriate type.
We have been using Scala to exemplify this, but optionals have been included in mainstream languages also. For example, take a look at the following references:
- From the Java world: https://docs.oracle.com/javase/8/docs/api/java/util/Optional.html (We are now near to Java 11 Release Date)
- From the .NET world: https://docs.microsoft.com/en-us/dotnet/fsharp/language-reference/options
- C++17: https://en.cppreference.com/w/cpp/utility/optional
- Python: Look for
Optional
here https://docs.python.org/3/library/typing.html (supported with type annotations, for python 3.6+) - PureScript (a language that compiles to JavaScript): https://pursuit.purescript.org/packages/purescript-maybe/4.0.0