# szew/io

File to data and back.

[![szew/io](https://clojars.org/szew/io/latest-version.svg)](https://clojars.org/szew/io)

[API Codox][1]

## Why

I've been dogfooding my private Clo*j*ure toolbox (named `szew`) since 2012.

This library splits out and releases its non-proprietary parts.

## What

Processes files into data and data into files, behind a common protocol
for both operations. Pretty handy for ad hoc (read: REPL) single-input
to single-output tasks.

Consists of Records that act as specifications and implement the protocols:

- `(Input/in! spec source)` reads the source and returns whatever your
  `:processor` callable returns; the usual default is to materialize everything
- `(Output/sink spec target)` returns a callable that consumes
  data, puts it in the target and returns nil
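
The spec-plus-protocol idea above can be sketched in plain Clojure. This is a minimal illustration, not szew.io's actual source: `DemoInput`, `demo-in!`, `DemoLines` and `demo-lines` are hypothetical names, and the default processor (`vec`) is an assumption standing in for "materialize everything".

```clojure
(require '[clojure.java.io :as jio])

;; Hypothetical stand-in for an input protocol, not szew.io's own definition.
(defprotocol DemoInput
  (demo-in! [spec source]
    "Open source and feed its contents to (:processor spec)."))

;; The record is the specification; its fields are the options.
(defrecord DemoLines [processor]
  DemoInput
  (demo-in! [spec source]
    (with-open [r (jio/reader source)]
      ;; The processor runs while the reader is still open,
      ;; and its return value becomes the result of demo-in!.
      ((:processor spec) (line-seq r)))))

;; Constructor with an assumed default: realize all lines into a vector.
(defn demo-lines
  ([] (demo-lines {}))
  ([spec] (map->DemoLines (merge {:processor vec} spec))))
```

Calling `(demo-in! (demo-lines {:processor count}) "input.txt")` would return the line count, because the result is whatever the processor returns. A processor that hands back an unrealized lazy sequence would typically fail when realized after the reader has closed, which is one reason to materialize by default.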

Each constructor carries documentation for its spec. Currently these are:

- `Lines`, constructed with `lines`

    * Input: text file in, sequence of Strings out
    * Output: sequence of Strings in, text file propagated

- `CSV`, constructed with `csv` or `tsv`

    * Input: \*SV in, sequence of vectors of Strings out
    * Output: sequence of vectors of Strings in, \*SV propagated

- `FixedWidth`, constructed with `fixed-width`

    * Input: fixed width lines in, sequence of vectors of Strings out
    * Output: sequence of vectors of Strings in, fixed width file propagated

- `XML`, constructed with `xml`

    * Input: XML in, `data.xml/parse` result out
    * Output: data in, `data.xml/emit` put in file

- `Files`, constructed with `files`

    * Input: file or directory in, sequence of files out
    * Output: N/A

- `Hasher`, constructed with `hasher`

    * Input: file in, hash out
    * Output: N/A

### Input processing

You prepare a processor: a data-eating function or a composition of such
functions. You shove that into a spec, and it is then fed data while your
source file is open. Just remember to let go of the head if you're short on
memory!

```clojure
(require '[szew.io :as io])

(let [proc (partial into [] (comp (drop 2) (take 2)))]
  (println (io/in! (io/csv {:processor proc}) "input.csv")))

;; => prints a vector of the third and fourth rows of input.csv

```
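
The advice about letting go of the head can be illustrated with a processor that folds rows into a small summary instead of collecting them. A minimal sketch: `summarize` is a hypothetical helper, and the commented call assumes `io/lines` accepts `:processor` the same way the Usage section shows.

```clojure
;; Counts lines and sums their lengths in a single pass.
;; reduce only keeps the accumulator, so rows already seen can be collected.
(defn summarize
  [rows]
  (reduce (fn [acc row]
            (-> acc
                (update :rows inc)
                (update :chars + (count row))))
          {:rows 0 :chars 0}
          rows))

(comment
  ;; Assumed usage with szew.io on the classpath:
  (require '[szew.io :as io])
  (io/in! (io/lines {:processor summarize}) "input.txt"))
```

Because the processor returns a small map rather than the rows themselves, memory use should stay flat however large the input is.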

### Output processing

On the other hand, you've got output sink creators. These accept a spec
and a path, giving you a callable that will consume a sequence and dump it
into the target output file.

```clojure
(require '[szew.io :as io])

(let [sink (io/sink (io/csv) "out.csv")]
  (io/in! (io/tsv {:processor sink}) "input.tsv"))

;; => returns nil, converts TSV into CSV

```
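
Since a sink is just a callable that consumes a sequence, it can also be fed data built in memory. A minimal sketch, assuming the CSV sink takes a sequence of vectors of Strings as listed under What; `maps->rows` is a hypothetical helper, not part of szew.io.

```clojure
;; Turns a seq of maps into a header row followed by rows of Strings.
;; Missing keys become empty strings, since (str nil) is "".
(defn maps->rows
  [ks ms]
  (cons (mapv name ks)
        (map (fn [m] (mapv #(str (get m %)) ks)) ms)))

(comment
  ;; Assumed usage with szew.io on the classpath:
  (require '[szew.io :as io])
  (let [sink (io/sink (io/csv) "out.csv")]
    (sink (maps->rows [:id :name]
                      [{:id 1 :name "Ann"} {:id 2 :name "Bob"}]))))
```

Stringifying values up front keeps the sink's input in the shape the spec list describes.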

## Usage

Old-fashioned composed partials.

```clojure
(require '[szew.io :as io])
(require '[szew.io.util :as util])

;; A seq of lines from input.txt, processed with composed
;; functions and written to out.txt

(def p (comp (io/sink (io/lines) "out.txt")
             (partial take 10)
             (partial filter true?)
             (partial map #(or % false))
             (partial drop 1)))

(io/in! (io/lines {:processor #'p}) "input.txt")

;; A seq of vectors from in.csv, processed with composed functions
;; and written to out.tsv
(let [adj (util/row-adjuster ["default #1" "default #2" "default #3"])
      out (io/sink (io/tsv) "out.tsv")
      pro (comp out
                (partial cons ["col #1" "col #2" "col #3"])
                (partial map adj)
                (partial take 10)
                (partial filter true?)
                (partial map #(or % false))
                (partial drop 1))]
  (io/in! (io/csv {:processor pro, :strict true}) "input.csv"))

```

## License

Copyright © 2012-2016 Sławek Gwizdowski

MIT License, text can be found in the LICENSE file.

[1]: http://spottr.bitbucket.org/szew-io/0.2.0/

