# Mastering Efficient Array Operations with StaticArrays.jl in Julia

## Discover techniques to accelerate array operations and matrix calculations using StaticArrays.jl.

The Julia programming language is known for being a high-level language that can still compete with C in terms of performance. As such, Julia already has performant data structures built-in, such as arrays. But what if arrays could be even faster? That's where the StaticArrays.jl package comes in.

StaticArrays.jl provides drop-in replacements for `Array`,
the standard Julia array type.
These `StaticArray`s work just like `Array`s,
but they provide one additional piece of information
in the type:
the size of the array.
Consequently,
you can't insert or remove elements of a `StaticArray`;
they are statically sized arrays
(hence the name).
However,
this restriction allows more information
to be given to Julia's compiler,
which in turn results in more efficient machine code
(for example, via loop unrolling and SIMD operations).
The resulting speed-up can often be 10x or more!
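
As a small illustration of what "size in the type" means, the sketch below compares the type of an `SVector` with that of an ordinary `Vector`. The variable names are made up for this example. It also uses `push`, the non-mutating counterpart of `push!` that StaticArrays.jl exports, to show that "adding" an element produces a new array of a different type rather than growing the original.

```julia
using StaticArrays

v = SVector(1.0, 2.0, 3.0)   # statically sized: the length 3 is part of the type
a = [1.0, 2.0, 3.0]          # ordinary Vector: the length is only known at run time

println(typeof(v))           # the type parameters include the size
println(typeof(a))           # no size information in the type

# The size can be queried from the type alone, without an instance.
static_size = size(typeof(v))

# The non-mutating `push` returns a new, longer static vector
# and leaves `v` untouched.
w = push(v, 4.0)
```

Because `w` has a different size, it also has a different type than `v`, which is exactly the information the compiler can exploit.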

In this post,
we will learn how to use StaticArrays.jl
and compare the performance of `StaticArray`s
to that of regular `Array`s
for several different operations.

Note that the code examples in this post assume StaticArrays.jl has been installed and loaded:

```
# Press ] to enter the package prompt.
pkg> add StaticArrays
# Press Backspace to return to the Julia prompt.
julia> using StaticArrays
```

(Check out our post on the Julia REPL for more details about the package prompt and navigating the REPL.)

# How to Use StaticArrays.jl

When working with StaticArrays.jl,
typically one will use the `SVector` type
or the `SMatrix` type.
(There is also the `SArray` type for N-dimensional arrays,
but we will focus on 1D and 2D arrays in this post.)
`SVector`s and `SMatrix`es have both static size
and static data,
meaning the data contained in such objects
cannot be modified.
For statically sized arrays
whose contents can be modified,
StaticArrays.jl provides `MVector` and `MMatrix`
(and `MArray`).
We will stick with `SVector`s and `SMatrix`es in this post
unless we specifically need mutability.
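
To make the difference between the immutable and mutable variants concrete, here is a minimal sketch (the variable names are chosen just for this example). It assumes the non-mutating `setindex` function exported by StaticArrays.jl, which returns a modified copy instead of changing the array in place.

```julia
using StaticArrays

s = SVector(1, 2, 3)   # static size, immutable data
m = MVector(1, 2, 3)   # static size, mutable data

# In-place assignment works for the mutable variant.
m[1] = 10

# For an SVector, build a modified copy instead; `s` itself is unchanged.
s2 = setindex(s, 10, 1)

# In-place assignment on an SVector throws an error.
svector_rejects_assignment = try
    s[1] = 10
    false
catch
    true
end
```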

## Constructors

There are three ways to construct `StaticArray`s.

- Convenience constructor `SA`:

  ```
  julia> SA[1, 2, 3]
  3-element SVector{3, Int64} with indices SOneTo(3):
   1
   2
   3

  julia> SA[1 2; 3 4]
  2×2 SMatrix{2, 2, Int64, 4} with indices SOneTo(2)×SOneTo(2):
   1  2
   3  4
  ```

- Normal constructor functions:

  ```
  julia> SVector(1, 2)
  2-element SVector{2, Int64} with indices SOneTo(2):
   1
   2

  julia> SMatrix{2,3}(1, 2, 3, 4, 5, 6)
  2×3 SMatrix{2, 3, Int64, 6} with indices SOneTo(2)×SOneTo(3):
   1  3  5
   2  4  6
  ```

- Macros:

  ```
  julia> @SVector [1, 2, 3]
  3-element SVector{3, Int64} with indices SOneTo(3):
   1
   2
   3

  julia> @SMatrix [1 2; 3 4]
  2×2 SMatrix{2, 2, Int64, 4} with indices SOneTo(2)×SOneTo(2):
   1  2
   3  4
  ```

  Note that using macros also enables a convenient way to create `StaticArray`s from common array-creation functions (eliminating the need to create an `Array` first just to convert it immediately to a `StaticArray`):

  ```
  @SVector [10 * i for i = 1:10]
  @SVector zeros(5)
  @SVector rand(7)
  @SMatrix [(i, j) for i = 1:2, j = 1:3]
  @SMatrix zeros(2, 2)
  @SMatrix randn(6, 6)
  ```

## Conversion to/from `Array`

It may occasionally be necessary
to convert to or from `Array`s.
To convert from an `Array` to a `StaticArray`,
use the appropriate constructor function.
However, because `Array`s do not have size information in the type,
we ourselves must provide the size to the constructor:

```
SVector{3}([1, 2, 3])
SMatrix{4,4}(zeros(4, 4))
```

To convert back to an `Array`, use the `collect` function:

```
julia> collect(SVector(1, 2))
2-element Vector{Int64}:
1
2
```
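
Putting the two directions together, the following sketch round-trips a small matrix through `SMatrix` and back. The variable names are illustrative only. It also checks that a size that does not match the input is rejected with an error rather than silently truncated or padded.

```julia
using StaticArrays

a = [1.0 2.0; 3.0 4.0]

# Array -> StaticArray: the size must be supplied as type parameters.
s = SMatrix{2,2}(a)

# StaticArray -> Array: `collect` returns an ordinary, mutable Matrix.
b = collect(s)
b[1, 1] = 100.0   # modifying the copy does not affect `s`

# A size that does not match the input array is rejected with an error.
mismatch_rejected = try
    SVector{3}([1, 2])
    false
catch
    true
end
```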

# Comparing `StaticArray`s to `Array`s

Once a `StaticArray` is created,
it can be operated on in the same way
as an `Array`.
To illustrate,
we will run a simple benchmark,
both to compare the run-time speeds
of the two types of arrays
and to show that the same code can work
with either type of array.

Here's the benchmark code, inspired by StaticArrays.jl's benchmark:

```
using BenchmarkTools, StaticArrays, LinearAlgebra, Printf

add!(C, A, B) = C .= A .+ B

function run_benchmarks(N)
    A = rand(N, N); A = A' * A
    B = rand(N, N)
    C = Matrix{eltype(A)}(undef, N, N)
    D = rand(N)
    SA = SMatrix{N,N}(A)
    SB = SMatrix{N,N}(B)
    MA = MMatrix{N,N}(A)
    MB = MMatrix{N,N}(B)
    MC = MMatrix{N,N}(C)
    SD = SVector{N}(D)
    speedup = [
        @belapsed($A + $B) / @belapsed($SA + $SB),
        @belapsed(add!($C, $A, $B)) / @belapsed(add!($MC, $MA, $MB)),
        @belapsed($A * $B) / @belapsed($SA * $SB),
        @belapsed(mul!($C, $A, $B)) / @belapsed(mul!($MC, $MA, $MB)),
        @belapsed(norm($D)) / @belapsed(norm($SD)),
        @belapsed(det($A)) / @belapsed(det($SA)),
        @belapsed(inv($A)) / @belapsed(inv($SA)),
        @belapsed($A \ $D) / @belapsed($SA \ $SD),
        @belapsed(eigen($A)) / @belapsed(eigen($SA)),
        @belapsed(map(abs, $A)) / @belapsed(map(abs, $SA)),
        @belapsed(sum($D)) / @belapsed(sum($SD)),
        @belapsed(sort($D)) / @belapsed(sort($SD)),
    ]
    return speedup
end

function main()
    benchmarks = [
        "Addition",
        "Addition (in-place)",
        "Multiplication",
        "Multiplication (in-place)",
        "L2 Norm",
        "Determinant",
        "Inverse",
        "Linear Solve (A \\ b)",
        "Symmetric Eigendecomposition",
        "`map`",
        "Sum of Elements",
        "Sorting",
    ]
    N = [3, 5, 10, 30]
    speedups = map(run_benchmarks, N)
    fmt_header = Printf.Format("%-$(maximum(length.(benchmarks)))s" * " | %7s"^length(N))
    header = Printf.format(fmt_header, "Benchmark", string.("N = ", N)...)
    println(header)
    println("="^length(header))
    fmt = Printf.Format("%-$(maximum(length.(benchmarks)))s" * " | %7.1f"^length(N))
    for i = 1:length(benchmarks)
        println(Printf.format(fmt, benchmarks[i], getindex.(speedups, i)...))
    end
end

main()
```

Notice that all the functions called
when creating the array `speedup` in `run_benchmarks`
are the same whether using `Array`s or `StaticArray`s,
illustrating that `StaticArray`s
are drop-in replacements for standard `Array`s.
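
The same point can be seen on a smaller scale. In the sketch below, `unit_direction` is a hypothetical helper written for this example (it is not part of StaticArrays.jl); it is defined once, without type annotations, and then called with both kinds of arrays.

```julia
using StaticArrays, LinearAlgebra

# Hypothetical helper: unit vector pointing from point `p` to point `q`.
# Nothing in it is specific to either array type.
unit_direction(p, q) = normalize(q - p)

# Called with ordinary Arrays, it returns a Vector.
d_array = unit_direction([0.0, 0.0, 0.0], [3.0, 0.0, 4.0])

# Called with StaticArrays, it returns an SVector.
d_static = unit_direction(SVector(0.0, 0.0, 0.0), SVector(3.0, 0.0, 4.0))
```

The results agree numerically; only the return type differs, following the type of the inputs.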

Running the above code
prints the following results on my laptop
(the numbers indicate the speedup
of `StaticArray`s over normal `Array`s;
e.g., a value of 17.7 means
using `StaticArray`s was 17.7 times faster
than using `Array`s):

```
Benchmark                    |   N = 3 |   N = 5 |  N = 10 |  N = 30
====================================================================
Addition                     |    17.7 |    14.5 |     7.9 |     2.0
Addition (in-place)          |     1.6 |     1.3 |     1.4 |     0.7
Multiplication               |     8.2 |     7.0 |     4.2 |     2.6
Multiplication (in-place)    |     1.9 |     5.9 |     3.0 |     1.0
L2 Norm                      |     4.2 |     4.0 |     5.4 |     9.7
Determinant                  |    66.6 |     2.5 |     1.3 |     0.9
Inverse                      |    54.8 |     5.9 |     1.8 |     0.9
Linear Solve (A \ b)         |    65.5 |     3.7 |     1.8 |     0.9
Symmetric Eigendecomposition |     3.7 |     1.0 |     1.0 |     1.0
`map`                        |    10.6 |     8.2 |     4.9 |     2.1
Sum of Elements              |     1.5 |     1.1 |     1.7 |     2.1
Sorting                      |     7.1 |     2.9 |     1.5 |     1.1
```

There are two main conclusions from this table.
First,
using `StaticArray`s instead of `Array`s
can result in some nice speed-ups!
Second,
the gains from using `StaticArray`s tend to diminish
as the sizes of the arrays increase.
So,
you can't expect StaticArrays.jl
to always magically make your code faster,
but if your arrays are small enough
(the recommendation being fewer than about 100 elements)
then you can expect to see some good speed-ups.

Of course, the above code timed just individual operations; how much faster a particular application would be is a different matter.

For example,
consider a physical simulation
where many 3D vectors
are manipulated over several time steps.
Since 3D vectors are static in size
(i.e., are 1D arrays with exactly three elements),
such a situation is a prime example
of where StaticArrays.jl is useful.
To illustrate,
here is an example
(taken from the field of magnetic resonance imaging)
of a physical simulation
using `Array`s vs. using `StaticArray`s:

```
using BenchmarkTools, StaticArrays, LinearAlgebra

function sim_arrays(N)
    M = Matrix{Float64}(undef, 3, N)
    M[1,:] .= 0.0
    M[2,:] .= 0.0
    M[3,:] .= 1.0
    M2 = similar(M)
    (sinα, cosα) = sincosd(30)
    R = [1 0 0; 0 cosα sinα; 0 -sinα cosα]
    E1 = exp(-0.01)
    E2 = exp(-0.1)
    (sinθ, cosθ) = sincosd(1)
    F = [E2 * cosθ E2 * sinθ 0; -E2 * sinθ E2 * cosθ 0; 0 0 E1]
    FR = F * R
    C = [0, 0, 1 - E1]
    # Run for 100 time steps (each loop iteration does 2 time steps).
    for t = 1:50
        mul!(M2, FR, M)
        M2 .+= C
        mul!(M, FR, M2)
        M .+= C
    end
    total = sum(M; dims = 2)
    return complex(total[1], total[2])
end

function sim_staticarrays(N)
    M = fill(SVector(0.0, 0.0, 1.0), N)
    (sinα, cosα) = sincosd(30)
    R = @SMatrix [1 0 0; 0 cosα sinα; 0 -sinα cosα]
    E1 = exp(-0.01)
    E2 = exp(-0.1)
    (sinθ, cosθ) = sincosd(1)
    F = @SMatrix [E2 * cosθ E2 * sinθ 0; -E2 * sinθ E2 * cosθ 0; 0 0 E1]
    FR = F * R
    C = @SVector [0, 0, 1 - E1]
    # Run for 100 time steps (each loop iteration does 1 time step).
    for t = 1:100
        # Apply simulation dynamics to each 3D vector.
        for i = 1:length(M)
            M[i] = FR * M[i] + C
        end
    end
    total = sum(M)
    return complex(total[1], total[2])
end

function main(N)
    r1 = @btime sim_arrays($N)
    r2 = @btime sim_staticarrays($N)
    @assert r1 ≈ r2 # Make sure the results are the same.
end
```

The speed-ups on my laptop
for different values of `N`
were as follows:

- `N = 10`: 14.6x faster
- `N = 100`: 7.1x faster
- `N = 1000`: 5.2x faster

(Here, `N` is the number of 3D vectors in the simulation,
not the size of the `StaticArray`s.)

Note also that I wrote `sim_arrays`
to be as performant as possible
by doing in-place operations
(like `mul!`),
which has the unfortunate side effect
of making the code a bit harder to read.
Therefore,
`sim_staticarrays`
is both faster *and* easier to read!
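
To see that the two formulations compute the same update, here is a sketch of a single time step written both ways. The numeric values of `FR`, `C`, and `M` are arbitrary placeholders chosen for this example, not the values used in the simulation above.

```julia
using StaticArrays, LinearAlgebra

# Arbitrary example values (placeholders, not the simulation's parameters).
FR = [0.9 0.1 0.0; -0.1 0.9 0.0; 0.0 0.0 0.99]
C = [0.0, 0.0, 0.01]
M = [0.0 0.1 0.2; 0.0 0.3 0.4; 1.0 0.9 0.8]   # 3×N matrix, one 3D vector per column

# Array formulation: one matrix product updates every column at once.
M_next = FR * M .+ C

# StaticArray formulation: a Vector of SVectors, updated one vector at a time.
FRs = SMatrix{3,3}(FR)
Cs = SVector{3}(C)
vs = [SVector{3}(col) for col in eachcol(M)]
vs_next = [FRs * v + Cs for v in vs]

# Each updated SVector should match the corresponding column of M_next.
same_result = all(collect(vs_next[i]) ≈ M_next[:, i] for i in eachindex(vs_next))
```

The per-vector version reads almost like the underlying formula, which is what makes `sim_staticarrays` easier to follow.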

As another example of how StaticArrays.jl can speed up a more involved application, see the DifferentialEquations.jl docs.

# Summary

In this post,
we discussed StaticArrays.jl.
We saw that `StaticArray`s are drop-in replacements
for regular Julia `Array`s.
We also saw that using `StaticArray`s
can result in some nice speed-ups
over using `Array`s,
at least when the sizes of the arrays
are not too big.

Are array operations a bottleneck in your code? Try out StaticArrays.jl and then comment below how it helps!

# Additional Links

- StaticArrays.jl Docs
  - Documentation for StaticArrays.jl.

Cover image background from https://openverse.org/image/875bf026-11ef-47a8-a63c-ee1f1877c156?q=circuit%20board%20array.