# com.wizardbeard/prng

Pseudorandom number generators in Clojure.

**This project is in pre-release development and subject to change.**

Orignally, this package was com.climate/prng, but that project appears to be abandoned, and no
artifacts have been published to Clojars.  

This source code was originally written by The Climate Corporation, and the original repo can
be found at:

https://github.com/TheClimateCorporation/prng

## Concepts

A pseudorandom number generator (PRNG) is a deterministic algorithm that
generates a sequence of numbers that appear to be random.

A PRNG has the following components:

- State space, S: a finite set of possible states.
- Output space, U: a finite set of possible outputs (pseudorandom numbers)
  that can be generated.
- Transition function, f: a pure function that takes a state as its
  argument and returns a (usually different) state.
- Output function, g: a pure function that maps a state to an element in
  the output space. Different output functions may be used to generate
  different numeric types (e.g. longs or doubles) from the same
  underlying sequence of states.

The state of a PRNG evolves according to the recurrence relationship
(LaTeX math markup):

```
s_{i} = f(s_{i - 1})
u_{i} = g(s{i})
```

where

```
f: S \rightarrow S
```

is the state transition function and

```
g: S \rightarrow U
```

is the output function.

The initial state of a PRNG is typically constructed by provided a
"seed" (a source of bits) to a factory function. When initialized with
the same seed, a PRNG is guaranteed to subsequently generate exactly the
same sequence of numbers.

Most widely used PRNG libraries represent PRNGs as stateful objects. A
typical example is the `java.util.Random` class in Java, which
encapsulates a hidden state. Calling a method to generate the next
pseudorandom number in the sequence has the side effect of mutating this
hidden state:

```clojure
(def r (java.util.Random. 1234567))
(.nextDouble r)
;=> 0.24283348968807417
; nextDouble is impure! the second call returns a different result!
(.nextDouble r)
;=> 0.7520183689538884
```

Such statefulness makes it difficult to reason about concurrency or
reproducibility. Also, as shown in this example, stateful PRNG libraries
typically conflate the transition function and the output function in
a single method.

The `prng` library takes a different approach to PRNGs that is more
functional and Clojure-like. Instead of implicitly mutating a stateful
object, the state of the PRNG is directly evolved by calling a
transition function that returns the next state, and a pseudorandom
number is directly generated by calling an output function on a state.
States are values that can be passed to functions that produce other
values. The transition function and the output function are separated
concerns. This functional design is intended to make it easy to reason
about concurrency and reproducibility, especially when using PRNGs in
large distributed systems in which coordination is costly.

## Usage

This is a quick demo that you can follow in a Clojure REPL to get
acquainted with the `prng` library.

The namespace `com.wizardbeard.prng.core` defines a `PRNGState` protocol
whose methods can be invoked on all implementing types.

```clojure
(require '[com.wizardbeard.prng.core :as prng])
```

Specific PRNG implementations are found in namespaces under the prefix
`com.wizardbeard.prng.generators`. Currently, two implementations are
available:

- A 64-bit version of the well-known MT19937 [Mersenne
  Twister](http://www.math.sci.hiroshima-u.ac.jp/~m-mat/MT/emt.html)
  PRNG
- A counter-based PRNG that uses the [Threefish block
  cipher](https://www.schneier.com/threefish.html) as the source of
  pseudorandomness

Try using the Mersenne Twister:

```clojure
(require '[com.wizardbeard.prng.generators.mersenne-twister :as mt])
```

To instantiate a Mersenne Twister state, call a factory function with a
`long[]` seed argument.

```clojure
(def seed (long-array [0 1 2 3]))
(def initial-state (mt/seed-state seed)))
```

The `seed-state` factory function returns a state value that satisfies
the `PRNGState` protocol.

```clojure
(satisfies? prng/PRNGState initial-state)
;=> true
```

For a given seed argument, the same return value ("same" in the sense of
`equals()`) is returned:

```clojure
(= initial-state (mt/seed-state seed))
;=> true
```

The `PRNGState` protocol methods are pure functions that take a state
value argument and return other values, without mutating the input state
value.

To sample a long from this PRNG, simply call the `->long` protocol
method on the state. `->long` is a pure function; for a given state, it
always returns the same value, and it does not mutate its state argument.

```clojure
(prng/->long initial-state)
;=> -3722652993627521228
; Multiple calls return the same value.
(prng/->long initial-state)
;=> -3722652993627521228
```

Similarly, the `->double` protocol method returns a double:

```clojure
(prng/->double initial-state)
;=> 0.4049867569919572
```

The `next-state` protocol method takes a state argument and returns a
new state that is one step ahead in the state sequence of the PRNG. The
input state is not mutated.

```clojure
(def new-state (prng/next-state initial-state))
(= new-state initial-state)
;=> false
; Get a long from this new state
(prng/->long new-state)
;=> -909579606142348662
; Get a long from the old state (compare to previous calls)
(prng/->long initial-state)
;=> -3722652993627521228
```

The `jump-state` protocol method takes two arguments, a state and a jump
size, and jumps ahead by the provided jump size. `jump-state` returns
the same state value as iterative composition of `next-state`, but it is
much faster.

```clojure
(def jump-size 1000000)
(def state-after-jump (prng/jump-state initial-state jump-size))
(= state-after-jump ((apply comp (repeat jump-size prng/next-state)) initial-state))
;=> true
```

## Tests

`lein test` runs unit tests.

`lein test :benchmark` runs tests with the `:benchmark` test selector,
which are a few performance benchmarks. They may take a while to
complete.

## Known issues

### Concurrency

The Java classes `MersenneTwisterState` and `ThreefishState` have public
methods that are untested for thread safety and are probably unsafe.
Although the `PRNGState` protocol is designed for concurrency, it
possible to bypass the protocol and directly invoke the unsafe methods
of the implementation classes. To ensure thread safety, these
implementation classes ought to be re-implemented with Clojure-style
persistent data structures.

### Speed

For the `MersenneTwisterState` implementation class, the `jump-state`
protocol method uses a naive algorithm to compute jump-ahead from the
remainder polynomial. Maromoto and colleauges [demonstrated two
alternative algorithms](http://dx.doi.org/10.1007/978-3-540-85912-3_26)
to compute jump-ahead more quickly. The `jump-state` method ought to be
reimplemented with one of these faster algorithms.

The implementation of the `jump-state` method for `ThreefishState` uses
kludgy `BigInteger` and hexadecimal string manipulations to perform
binary arithmetic. Performance can almost certainly be improved by
thoughtful bit twiddling.

### Pseudorandomness

The Mersenne Twister is a well-known, high-quality PRNG that has been
thoroughly tested for pseudorandomness. The implementation in this
library is a bit-exact match to the reference implementation in C.

The counter-based Threefish PRNG has not been as well tested, and its
pseudorandomness ought to be teststed with a suite of statistical tests.

## License

Copyright © 2014 The Climate Corporation

Distributed under the Apache License, Version 2.0
