14 CategoricalArrays
using CategoricalArrays
To represent categorical variables in Julia, we can use the CategoricalArray type from CategoricalArrays.jl.
14.1 Create CategoricalArray with categorical()
x = ["a", "c", "d", "b", "a", "a", "d", "c"]
8-element Vector{String}:
"a"
"c"
"d"
"b"
"a"
"a"
"d"
"c"
xc = categorical(x)
8-element CategoricalArrays.CategoricalArray{String,1,UInt32}:
"a"
"c"
"d"
"b"
"a"
"a"
"d"
"c"
The same can be achieved by calling the CategoricalArray constructor directly:
xc = CategoricalArray(x)
8-element CategoricalArrays.CategoricalArray{String,1,UInt32}:
"a"
"c"
"d"
"b"
"a"
"a"
"d"
"c"
14.2 The underlying UInt32 vector
A CategoricalArray stores each element as a UInt32 reference code, and each code maps to one entry in the array's set of levels.
You can access the underlying integers:
xc.refs
8-element Vector{UInt32}:
0x00000001
0x00000003
0x00000004
0x00000002
0x00000001
0x00000001
0x00000004
0x00000003
Convert them to Int32:
xc.refs .% Int32
8-element Vector{Int32}:
1
3
4
2
1
1
4
3
or using convert():
convert(Array{Int32}, xc.refs)
8-element Vector{Int32}:
1
3
4
2
1
1
4
3
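The mapping between reference codes and levels can be checked directly: indexing the levels with the codes should reproduce the original values. This is a minimal sketch that assumes a freshly created array with no missing values (a missing value is stored as code 0, which is not a valid index). Note that refs is an internal field; levelcode() is the exported accessor for an element's integer code.

```julia
using CategoricalArrays

x = ["a", "c", "d", "b", "a", "a", "d", "c"]
xc = categorical(x)

# Each reference code is an index into the vector of levels,
# so indexing the levels with the codes rebuilds the data
rebuilt = levels(xc)[xc.refs]
rebuilt == x

# levelcode() returns the integer code of a single element;
# broadcasting it gives a signed-integer vector of codes
codes = levelcode.(xc)
codes == xc.refs
```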
14.3 Get levels of a CategoricalArray with levels()
levels(xc)
4-element Vector{String}:
"a"
"b"
"c"
"d"
14.4 Set new level labels with recode() & recode!()
recode!(xc,
    "a" => "alpha",
    "b" => "beta",
    "c" => "gamma",
    "d" => "delta")
8-element CategoricalArrays.CategoricalArray{String,1,UInt32}:
"alpha"
"gamma"
"delta"
"beta"
"alpha"
"alpha"
"delta"
"gamma"
14.5 Reorder levels with levels() & levels!()
levels() in Julia vs. R
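In Julia, levels() returns the levels of a CategoricalArray and levels!() sets their order without changing the stored values. In R, by contrast, assigning to levels() recodes the data by changing the level labels.
xc
8-element CategoricalArrays.CategoricalArray{String,1,UInt32}: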
"alpha"
"gamma"
"delta"
"beta"
"alpha"
"alpha"
"delta"
"gamma"
levels(xc)
4-element Vector{String}:
"alpha"
"beta"
"gamma"
"delta"
levels!(xc, ["delta", "gamma", "beta", "alpha"])
8-element CategoricalArrays.CategoricalArray{String,1,UInt32}:
"alpha"
"gamma"
"delta"
"beta"
"alpha"
"alpha"
"delta"
"gamma"
levels(xc)
4-element Vector{String}:
"delta"
"gamma"
"beta"
"alpha"