Counters
Overview
We often want to count things, and one way to do that is to create a dictionary that maps objects to their counts. A Counter object simplifies that process. Say we want to count values of type String. We would create a counter for that type like this:
julia> c = Counter{String}()
Counter{String} with 0 entries
The two primary operations for a Counter are value increment and value retrieval. To increment the value of a counter we do this:
julia> c["hello"] += 1
1
To access the count, we use square brackets:
julia> c["hello"]
1
julia> c["bye"]
0
Notice that we need not worry about whether or not a key is already known to the Counter. If presented with an unknown key, the Counter assumes its value is 0.
A Counter may be assigned to directly, as in c["alpha"]=4, but the more likely use case is c["bravo"]+=1, invoked each time a value such as "bravo" is encountered.
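The default-to-zero behavior can be imitated with an ordinary dictionary, which may help clarify what the Counter does for us. The following is a minimal sketch using only Base Julia. It is not how the package is implemented, and the helper names bump! and count_of are made up for this illustration.

```julia
# Sketch only: a plain Dict{String,Int} standing in for a Counter.
# `bump!` and `count_of` are hypothetical helper names, not package functions.
counts = Dict{String,Int}()

# Increment: read the current count (0 if the key is unseen), then store count + 1.
bump!(d, key) = (d[key] = get(d, key, 0) + 1)

# Retrieve: unknown keys report 0 instead of throwing a KeyError.
count_of(d, key) = get(d, key, 0)

bump!(counts, "hello")
bump!(counts, "hello")

println(count_of(counts, "hello"))  # 2
println(count_of(counts, "bye"))    # 0
```

With a Counter, both helpers collapse into the square-bracket syntax shown above.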
Counting the elements of a list
The function counter (lowercase 'c') counts the elements of a list/array or set. The multiplicity of an element is the number of times it appears in the list.
julia> A = [ "alpha", "bravo", "alpha", "gamma" ];
julia> C = counter(A);
julia> C
Counter{String} with these nonzero values:
alpha ==> 2
bravo ==> 1
gamma ==> 1
Addition of counters
If c and d are counters (of the same type of object) their sum c+d creates a new counter by adding the values in c and d. That is, if a=c+d and k is any key, then a[k] equals c[k]+d[k].
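The rule a[k] = c[k] + d[k] can be checked with plain dictionaries. This is a sketch under the assumption that a missing key counts as 0. It uses Base's mergewith (Julia 1.5 or later) rather than the package's own + method.

```julia
# Sketch only: plain Dicts standing in for two counters of the same key type.
c = Dict("alpha" => 2, "bravo" => 1)
d = Dict("alpha" => 1, "gamma" => 4)

# mergewith(+, c, d) builds a new Dict; keys present in both have their values added.
a = mergewith(+, c, d)

# Check a[k] == c[k] + d[k] for every key, treating missing keys as 0.
for k in keys(a)
    @assert a[k] == get(c, k, 0) + get(d, k, 0)
end

println(a["alpha"])  # 3
println(a["bravo"])  # 1
println(a["gamma"])  # 4
```

Neither c nor d is modified here. The in-place variant mergewith!(+, c, d) would instead update c.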
Incrementing
To increment the count of an item x in a counter c we may either use c[x]+=1 or the increment function like this: incr!(c,x).
The increment function incr! is more useful for incrementing a collection of items. Use incr!(c,items) to add 1 to the count for each element held in items. If an element is present in items multiple times, its count is incremented for each occurrence.
julia> c = Counter{Int}()
Counter{Int64} with 0 entries
julia> items = [1,2,3,4,1,2,1]
7-element Array{Int64,1}:
1
2
3
4
1
2
1
julia> incr!(c,items)
julia> c
Counter{Int64} with these nonzero values:
1 ==> 3
2 ==> 2
3 ==> 1
4 ==> 1
In addition, incr! may be used to increment one counter by the amount held in another. Note that it's the first argument c that gets changed; there is no effect on the second argument d.
Note: incr!(c,d) and c += d have the same effect, but the first is more efficient.
julia> c = Counter{Int}()
Counter{Int64} with these nonzero values:
julia> items = [1,2,3,4,1,2,1]
7-element Vector{Int64}:
1
2
3
4
1
2
1
julia> incr!(c,items)
julia> c
Counter{Int64} with these nonzero values:
1 ==> 3
2 ==> 2
3 ==> 1
4 ==> 1
More functions
sum(c) returns the sum of the values in c; that is, the total of all the counts.
length(c) returns the number of values held in c. Note that this might include objects with value 0.
nnz(c) returns the number of nonzero values held in c.
keys(c) returns an iterator for the keys held by c.
values(c) returns an iterator for the values held by c.
display(c) gives a print out of all the keys and their nonzero values in c.
clean!(c) removes all keys from c whose value is 0. This won't change its behavior, but will free up some memory.
Listing elements
We can convert a Counter into a one-dimensional array in which each element appears with its appropriate multiplicity using collect:
julia> C = Counter{Int}()
Counter{Int64} with 0 entries
julia> C[3] = 4
4
julia> C[5] = 0
0
julia> C[-2] = 2
2
julia> collect(C)
6-element Array{Int64,1}:
3
3
3
3
-2
-2
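The expansion performed by collect can be pictured as repeating each key as many times as its count. Here is a minimal sketch using a plain Dict in place of the Counter. The order in which a Dict yields its keys is unspecified, so the result is sorted before printing and need not match the order shown above.

```julia
# Sketch only: a plain Dict of counts standing in for the Counter C above.
counts = Dict(3 => 4, 5 => 0, -2 => 2)

# Repeat each key as many times as its count; a count of 0 contributes nothing.
expanded = Int[]
for (key, n) in counts
    append!(expanded, fill(key, n))
end

println(length(expanded))  # 6
println(sort(expanded))    # [-2, -2, 3, 3, 3, 3]
```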
The function collect_by_counts lists the elements of a Counter once each, but in decreasing order of their counts. That is, the element with the highest count is first, the element with the second highest count is second, and so forth. Elements whose count is zero are not listed.
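The ordering just described can be sketched with a plain Dict and Base's sort. This is an illustration of the idea only, not the package's implementation, and it says nothing about how ties between equal counts are broken.

```julia
# Sketch only: a plain Dict of counts standing in for a Counter.
counts = Dict(3 => 4, 5 => 0, -2 => 2)

# Keep only the keys whose count is nonzero.
nonzero_keys = [k for (k, n) in counts if n != 0]

# Sort those keys by their counts, largest count first.
ordered = sort(nonzero_keys; by = k -> counts[k], rev = true)

println(ordered)  # [3, -2]
```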
julia> collect_by_counts(C)
2-element Vector{Int64}:
3
-2
Average value
If the objects counted in C are numbers, then we compute the weighted average of those numbers with mean(C).
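Here "weighted average" means that each number is weighted by its count: the sum of key times count, divided by the total count. A minimal sketch of that arithmetic with a plain Dict, offered as an illustration rather than as the package's code:

```julia
# Sketch only: a plain Dict of counts; the key 2 has count 3 and the key 3 has count 7.
counts = Dict(2 => 3, 3 => 7)

total = sum(values(counts))                  # 3 + 7 = 10
weighted = sum(k * n for (k, n) in counts)   # 2*3 + 3*7 = 27

println(weighted / total)  # 2.7
```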
julia> C = Counter{Int}()
Counter{Int64} with 0 entries
julia> C[2] = 3
3
julia> C[3] = 7
7
julia> mean(C)
2.7
Hashing
hash(C::Counter) returns a hash value for C. Note that clean! is applied to C before computing the hash. This is done to ensure that equal counters give the same hash value.
May also be invoked as hash(C::Counter, h::UInt).
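To see why zero-valued entries need to be removed first, consider two plain dictionaries that represent the same counts but differ in whether a zero entry is stored. The sketch below uses only Base Julia. The helper drop_zeros! is a made-up stand-in for clean! and is not part of the package.

```julia
# Sketch only: two Dicts that describe the same counts.
a = Dict("x" => 1, "y" => 0)   # stores an explicit zero
b = Dict("x" => 1)             # omits the zero entry

# As raw dictionaries they are not equal, because their key sets differ.
println(a == b)  # false

# Hypothetical helper: delete every entry whose value is 0.
drop_zeros!(d) = filter!(p -> p.second != 0, d)
drop_zeros!(a)

# After cleaning, the dictionaries are equal and so are their hashes.
println(a == b)              # true
println(hash(a) == hash(b))  # true
```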
It's an AbstractDict
A Counter is a subtype of AbstractDict (named Associative before Julia 0.7) and therefore we can use methods such as keys and values to get iterators to those items.
CSV Printing
The function csv_print writes a Counter to the screen in comma-separated format. This can be readily used for importing into a spreadsheet.
julia> C = Counter{Float64}()
Counter{Float64} with 0 entries
julia> C[3.4]=10
10
julia> C[2.2]=3
3
julia> csv_print(C)
2.2, 3
3.4, 10
Counting in parallel
See the parallel-example directory for an illustration of how to use Counters in multiple parallel processes.