clojuress.v1.tutorial-test

clojuress.v1.tutorial-test - created by notespace, Tue Jan 21 01:27:06 IST 2020.
Checks: 38 PASSED

Clojuress tutorial

Basic examples

Let us start with some basic usage examples of Clojuress.

(require '[clojuress.v1.r :as r :refer
           [r eval-r->java r->java java->r java->clj clj->java r->clj clj->r
            ->code r+ colon function]]
         '[clojuress.v1.require :refer [require-r]]
         '[clojuress.v1.robject :as robject]
         '[clojuress.v1.session :as session]
         '[tech.ml.dataset :as dataset]
         '[notespace.v1.util :refer [check]])

First, let us make sure there are no R sessions currently running.

(r/discard-all-sessions)

Now let us run some R code, and keep a Clojure handle to the return value.

(def x (r "1+2"))

Convert the R object to Clojure data:

(->> x
     r->clj
     (check = [3.0]))

[:PASSED [3.0]]

Run some code in a separate session (specifying an Rserve port, rather than using the default one).

(-> "1+2"
    (r :session-args {:port 4444})
    r->clj
    (->> (check = [3.0])))

[:PASSED [3.0]]

Convert Clojure data to R data. Note that nil is turned into NA.

(-> [1 nil 3]
    clj->r)

[1]  1 NA  3

Functions

We can define a Clojure function wrapping an R function.

(def f (function (r "function(x) x*10")))

Let us apply it to Clojure data (implicitly converting that data to R).

(->> 5
     f
     r->clj
     (check = [50.0]))

[:PASSED [50.0]]

We can also apply it to R data.

(->> "5*5"
     r
     f
     r->clj
     (check = [250.0]))

[:PASSED [250.0]]

Functions can take named arguments. Here we pass the na.rm argument, which tells R whether to remove missing values when computing the mean.

(->> ((function (r "mean")) [1 nil 3] :na.rm true)
     r->clj
     (check = [2.0]))

[:PASSED [2.0]]

An alternative call syntax:

(->> ((function (r "mean")) [1 nil 3] [:= :na.rm true])
     r->clj
     (check = [2.0]))

[:PASSED [2.0]]

Another example:

(let [f (->> "function(w,x,y=10,z=20) w+x+y+z"
             r
             function)]
  (->> [(f 1 2) (f 1 2 :y 100) (f 1 2 :z 100)]
       (map r->clj)
       (check = [[33.0] [123.0] [113.0]])))

[:PASSED ([33.0] [123.0] [113.0])]
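For comparison, here is a rough plain-Clojure analogue of the R function above, using keyword arguments with defaults. It is only a sketch to show how the named arguments line up; f-clj is a hypothetical name, no R session is involved, and the results are Clojure integers rather than R numeric vectors.

```clojure
;; Hypothetical plain-Clojure analogue of the R function
;; function(w,x,y=10,z=20) w+x+y+z. Not part of Clojuress.
(defn f-clj
  [w x & {:keys [y z] :or {y 10 z 20}}]
  (+ w x y z))

;; The same three calls as above: positional only, then overriding y, then z.
(mapv #(apply f-clj %) [[1 2] [1 2 :y 100] [1 2 :z 100]])
;; => [33 123 113]
```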

Some functions are already created in Clojuress and given special names for convenience. For example:

(->> (r+ 1 2 3)
     r->clj
     (check = [6]))

[:PASSED [6]]

(->> (colon 0 9)
     r->clj
     (check = (range 10)))

[:PASSED [0 1 2 3 4 5 6 7 8 9]]

R dataframes and tech.ml.dataset datasets

Create a tech.ml.dataset dataset object, pass it to an R function to compute the row means, and convert the return value to Clojure.

(let [row-means (-> "function(data) rowMeans(data)"
                    r
                    function)]
  (->> {:x [1 2 3], :y [4 5 6]}
       dataset/name-values-seq->dataset
       row-means
       r->clj
       (check = [2.5 3.5 4.5])))

[:PASSED [2.5 3.5 4.5]]

Load the R package 'dplyr' (assuming it is installed).

(r "library(dplyr)")

Use dplyr to process a Clojure dataset, and convert the result back to a Clojure dataset.

(let [filter-by-x (-> "function(data) filter(data, x>=2)"
                      r
                      function)
      add-z-column (-> "function(data) mutate(data, z=x+y)"
                       r
                       function)]
  (->> {:x [1 2 3], :y [4 5 6]}
       dataset/name-values-seq->dataset
       filter-by-x
       add-z-column
       r->clj
       (check (fn [d]
                (-> d
                    dataset/->flyweight
                    (= [{:x 2.0, :y 5.0, :z 7.0} {:x 3.0, :y 6.0, :z 9.0}]))))))

[:PASSED _unnamed [2 3]:

|    :x |    :y |    :z |
|-------+-------+-------|
| 2.000 | 5.000 | 7.000 |
| 3.000 | 6.000 | 9.000 |
]

Tibbles are also supported, as a special case of data frames.

(r "library(tibble)")

(let [tibble (function (r "tibble"))] (tibble :x [1 2 3] :y [4 5 6]))

# A tibble: 3 x 2
      x     y
  <dbl> <dbl>
1     1     4
2     2     5
3     3     6

(let [tibble (function (r "tibble"))]
  (->> (tibble :x [1 2 3] :y [4 5 6])
       r->clj
       dataset/->flyweight
       (check = [{:x 1.0, :y 4.0} {:x 2.0, :y 5.0} {:x 3.0, :y 6.0}])))

[:PASSED ({:x 1.0, :y 4.0} {:x 2.0, :y 5.0} {:x 3.0, :y 6.0})]

R objects

Clojuress holds handles to R objects, which are stored in memory in the R session, where they are assigned random names.

(def one+two (r "1+2"))

(->> one+two
     class
     (check = clojuress.v1.robject.RObject))

[:PASSED clojuress.v1.robject.RObject]

(:object-name one+two)

"x7ac9f5c42b3843ba"

We can figure out the place in R memory corresponding to an object's name.

(-> one+two
    :object-name
    clojuress.v1.objects-memory/object-name->memory-place)

".MEM$x7ac9f5c42b3843ba"
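Judging by the output above, the memory place is the object name prefixed with ".MEM$". A minimal sketch of that mapping follows, assuming nothing beyond what the printed result shows; memory-place-sketch is a hypothetical name, not the Clojuress implementation.

```clojure
;; Hypothetical stand-in for object-name->memory-place,
;; inferred only from the printed result above.
(defn memory-place-sketch
  [object-name]
  (str ".MEM$" object-name))

(memory-place-sketch "x7ac9f5c42b3843ba")
;; => ".MEM$x7ac9f5c42b3843ba"
```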

Generating code

Let us see the code-generation mechanism of Clojuress, and the rules defining it.

We will need a reference to the R session:

(def session (session/fetch-or-make nil))

For the following examples, we will use some dummy handles to R objects:

(def x (robject/->RObject "x" session nil))
(def y (robject/->RObject "y" session nil))

... and some real handles to R objects:

(def minus-eleven (r "-11"))
(def abs (r "abs"))

For an R object, we generate code referring to that object's location in the R session memory.

(->> x
     ->code
     (check = ".MEM$x"))

[:PASSED ".MEM$x"]

For a Clojure value, we implicitly convert it to an R object and generate code referring to that object.

(->> "hello"
     ->code
     (check re-matches #"\.MEM\$.*"))

[:PASSED ".MEM$x0c7b547fc70d411b"]

For a symbol, we generate the code with the corresponding R symbol.

(->code 'x)

"x"

A sequential structure (list, vector, etc.) can be interpreted as a compound expression, for which code generation is defined according to the first element.

For a list beginning with the symbol 'function, we generate an R function definition.

(->> '(function [x y] x)
     ->code
     (check = "function(x, y) {x}"))

[:PASSED "function(x, y) {x}"]

For a vector instead of a list, we have the same behaviour.

(->> '[function [x y] x]
     ->code
     (check = "function(x, y) {x}"))

[:PASSED "function(x, y) {x}"]

For a list beginning with the symbol 'tilde, we generate an R ~-formula.

(->> '(tilde x y)
     ->code
     (check = "(x ~ y)"))

[:PASSED "(x ~ y)"]

For a list beginning with a symbol known to be a binary operator, we generate the code with that operator between all arguments.

(->> '(+ x y z)
     ->code
     (check = "(x + y + z)"))

[:PASSED "(x + y + z)"]

For a list beginning with another symbol, we generate a function call with that symbol as the function name.

(->> '(f x)
     ->code
     (check = "f(x)"))

[:PASSED "f(x)"]

For a list beginning with an R object that is a function, we generate a function call with that object as the function.

(->> [abs 'x]
     ->code
     (check re-matches #"\.MEM\$.*\(x\)"))

[:PASSED ".MEM$x69323b1f8cfd4361(x)"]

All other sequential things (that is, those not beginning with a symbol or an R function) are interpreted as data, and converted implicitly to R data.

(->> [abs '(1 2 3)]
     ->code
     (check re-matches #"\.MEM\$.*\(\.MEM\$.*\)"))

[:PASSED ".MEM$x69323b1f8cfd4361(.MEM$x2b478f26d89c4629)"]

Some more examples, showing how these rules compose:

(->code '(function [x y] (f y)))

"function(x, y) {f(y)}"

(->code '(function [x y] (+ x y)))

"function(x, y) {(x + y)}"

(->code ['function '[x y] ['+ 'x y]])

"function(x, y) {(x + .MEM$y)}"

(->code '(function [x y] (print x) (f x)))

"function(x, y) {print(x); f(x)}"

(->code ['function '[x y] [abs 'x]])

"function(x, y) {.MEM$x69323b1f8cfd4361(x)}"

(->code [abs minus-eleven])

".MEM$x69323b1f8cfd4361(.MEM$x9e229a1a4e304133)"

(->code [abs -11])

".MEM$x69323b1f8cfd4361(.MEM$xa5242746a53e40bf)"
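To make the symbolic rules above concrete, here is a toy code generator in plain Clojure. It is a simplified sketch and not the Clojuress implementation: ->code-sketch and binary-operators are hypothetical names, the operator set is an assumed subset, and R object handles and implicit data conversion (the .MEM$ cases) are deliberately left out.

```clojure
(require '[clojure.string :as string])

;; Assumption: an illustrative subset of operators, not Clojuress's actual list.
(def binary-operators '#{+ - * / > >= < <=})

(defn ->code-sketch
  "Toy version of the symbolic code-generation rules (hypothetical helper)."
  [form]
  (cond
    ;; A symbol becomes the corresponding R symbol.
    (symbol? form)
    (name form)

    ;; A list or vector beginning with a symbol is a compound expression.
    (and (sequential? form) (symbol? (first form)))
    (let [[head & args] form
          codes #(map ->code-sketch %)]
      (cond
        ;; (function [params...] body...) => function definition
        (= head 'function)
        (let [[params & body] args]
          (format "function(%s) {%s}"
                  (string/join ", " (map name params))
                  (string/join "; " (codes body))))

        ;; (tilde a b) => formula
        (= head 'tilde)
        (format "(%s)" (string/join " ~ " (codes args)))

        ;; (+ a b c) => operator placed between all arguments
        (contains? binary-operators head)
        (format "(%s)" (string/join (str " " head " ") (codes args)))

        ;; Any other symbol => function call by name
        :else
        (format "%s(%s)" (name head) (string/join ", " (codes args)))))

    ;; Clojuress would convert other values to R data; this sketch does not.
    :else
    (throw (ex-info "Not covered by this sketch" {:form form}))))

(->code-sketch '(function [x y] (print x) (f x)))
;; => "function(x, y) {print(x); f(x)}"
```

The rules compose because each branch recurs on its arguments, which is why nested forms such as (function [x y] (+ x y)) need no special handling.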

Running generated code

Clojure forms can be run as R code. For example:

(->> [abs (range -3 0)]
     r
     r->clj
     (check = [3 2 1]))

[:PASSED [3 2 1]]

Let us repeat the basic examples from the beginning of this tutorial, this time generating the code rather than writing it as strings.

(def x (r '(+ 1 2)))

"checking again... "
(->> x
     r->clj
     (check = [3]))

[:PASSED [3]]

(def f (function (r '(function [x] (* x 10)))))

"checking again... "
(->> 5
     f
     r->clj
     (check = [50]))

[:PASSED [50]]

"checking again... "
(->> "5*5"
     r
     f
     r->clj
     (check = [250.0]))

[:PASSED [250.0]]

(let [row-means (-> '(function [data] (rowMeans data))
                    r
                    function)]
  (->> {:x [1 2 3], :y [4 5 6]}
       dataset/name-values-seq->dataset
       row-means
       r->clj
       (check = [2.5 3.5 4.5])))

[:PASSED [2.5 3.5 4.5]]

(r '(library dplyr))

(let [filter-by-x (-> '(function [data] (filter data (>= x 2)))
                      r
                      function)
      add-z-column (-> '(function [data] (mutate data (= z (+ x y))))
                       r
                       function)]
  (->> {:x [1 2 3], :y [4 5 6]}
       dataset/name-values-seq->dataset
       filter-by-x
       add-z-column
       r->clj))

_unnamed [2 3]:

|    :x |    :y | :(z = (x + y)) |
|-------+-------+----------------|
| 2.000 | 5.000 |          7.000 |
| 3.000 | 6.000 |          9.000 |

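The strange column name is due to dplyr's mutate behaviour: the generated code wraps the expression in extra parentheses, as in (z = (x + y)), so mutate does not treat z as an argument name and instead names the column after the whole expression.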

Requiring R packages

We have seen earlier that R functions can be wrapped by Clojure functions. Sometimes we want to bring functions from R packages into the Clojure world. Here, we try to follow the require-python syntax of libpython-clj (though currently in a less sophisticated way).

(require-r '[stats :as statz :refer [median]])

(->> [1 2 3]
     r.stats/median
     r->clj
     (check = [2]))

[:PASSED [2]]

(->> [1 2 3]
     statz/median
     r->clj
     (check = [2]))

[:PASSED [2]]

(->> [1 2 3]
     median
     r->clj
     (check = [2]))

[:PASSED [2]]

Data visualization

Functions creating R plots can be wrapped in a way that returns an SVG.

Currently there is a bug that sometimes causes axes and labels to disappear when the SVG is rendered inside a larger HTML page.

(require-r '[graphics :refer [plot]])
(require-r '[ggplot2 :refer [ggplot aes geom_point xlab ylab labs]])
(require '[clojuress.v1.applications.plotting :refer
           [plotting-function->svg ggplot->svg]])

(plotting-function->svg (fn []
                          (->> rand
                               (repeatedly 30)
                               (reductions +)
                               (plot :xlab "t" :ylab "y" :type "l"))))

ggplot2 plots can also be turned into SVG.

(ggplot->svg (let [x (repeatedly 99 rand)
                   y (map + x (repeatedly 99 rand))]
               (-> {:x x, :y y}
                   dataset/name-values-seq->dataset
                   (ggplot (aes :x x :y y :color '(+ x y) :size '(/ x y)))
                   (r+ (geom_point) (xlab "x") (ylab "y")))))

[Plot: scatter plot of y against x, with point colour mapped to (x + y) and point size mapped to (x/y).]

Intermediary representation as Java objects

Clojuress relies on an intermediary representation of R data as Java objects. This is usually hidden from the user, but may sometimes be useful. In the current implementation, this is based on REngine.

(import (org.rosuda.REngine REXP REXPInteger REXPDouble))

We can convert data between R and Java.

(->> "1:9"
     r
     r->java
     class
     (check = REXPInteger))

[:PASSED org.rosuda.REngine.REXPInteger]

(->> (REXPInteger. 1)
     java->r
     r->clj
     (check = [1]))

[:PASSED [1]]

We can evaluate R code and immediately return the result as a Java object, without ever creating a handle to an R object holding the result:

(->> "1+2"
     eval-r->java
     class
     (check = REXPDouble))

[:PASSED org.rosuda.REngine.REXPDouble]

(->> "1+2"
     eval-r->java
     (.asDoubles)
     vec
     (check = [3.0]))

[:PASSED [3.0]]

More data conversion examples

Conversion between R and Clojure always passes through Java. To stress this, we write it explicitly in the following examples.

(->> "list(a=1:2,b='hi!')"
     r
     r->java
     java->clj
     (check = {:a [1 2], :b ["hi!"]}))

[:PASSED {:a [1 2], :b ["hi!"]}]

(->> "table(c('a','b','a','b','a','b','a','b'), c(1,1,2,2,3,3,1,1))"
     r
     r->java
     java->clj
     (check =
            {["1" "a"] 2,
             ["1" "b"] 2,
             ["2" "a"] 1,
             ["2" "b"] 1,
             ["3" "a"] 1,
             ["3" "b"] 1}))

[:PASSED
 {["1" "a"] 2,
  ["1" "b"] 2,
  ["2" "a"] 1,
  ["2" "b"] 1,
  ["3" "a"] 1,
  ["3" "b"] 1}]
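The key layout of the converted table may not be obvious. Judging only by the result above, each key lists the factor levels as strings, with the level of the second argument first, and each value is the count of that combination. A plain-Clojure sketch that reproduces the same map follows; table-sketch is a hypothetical helper, not part of Clojuress, and it does not involve R.

```clojure
;; Hypothetical helper: count co-occurrences of two sequences, keyed by
;; [second-level first-level] as strings, mirroring the result shown above.
(defn table-sketch
  [xs ys]
  (frequencies (map (fn [x y] [(str y) (str x)]) xs ys)))

(table-sketch ["a" "b" "a" "b" "a" "b" "a" "b"]
              [1 1 2 2 3 3 1 1])
;; => a map equal to
;; {["1" "a"] 2, ["1" "b"] 2, ["2" "a"] 1, ["2" "b"] 1, ["3" "a"] 1, ["3" "b"] 1}
```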

(->> {:a [1 2], :b "hi!"}
     clj->java
     java->r
     r->java
     java->clj
     (check = {:a [1 2], :b ["hi!"]}))

[:PASSED {:a [1 2], :b ["hi!"]}]

(->> {:a [1 2], :b "hi!"}
     clj->java
     java->r
     ((r/function (r "deparse")))
     r->java
     java->clj
     (check = ["list(a = 1:2, b = \"hi!\")"]))

[:PASSED ["list(a = 1:2, b = \"hi!\")"]]

