<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0"><channel><title><![CDATA[Great Lakes Consulting]]></title><description><![CDATA[Premier IT consultants give you Julia Language best practices. Read to succeed in Data Analytics and Dashboards, Scientific Computing, and HPC.]]></description><link>https://blog.glcs.io</link><image><url>https://cdn.hashnode.com/res/hashnode/image/upload/v1696442655327/HoKccgZVN.png</url><title>Great Lakes Consulting</title><link>https://blog.glcs.io</link></image><generator>RSS for Node</generator><lastBuildDate>Fri, 17 Apr 2026 06:46:50 GMT</lastBuildDate><atom:link href="https://blog.glcs.io/rss.xml" rel="self" type="application/rss+xml"/><language><![CDATA[en]]></language><ttl>60</ttl><item><title><![CDATA[How to Integrate Julia Code Within a Python Program]]></title><description><![CDATA[This post was written by Steven Whitaker.

Ever wish your Python code could run faster
on heavy calculations or simulations?
With juliacall,
you can call Julia straight from Python
and instantly access blazing-fast performance
and powerful scientific...]]></description><link>https://blog.glcs.io/julia-python-juliacall</link><guid isPermaLink="true">https://blog.glcs.io/julia-python-juliacall</guid><category><![CDATA[Julia]]></category><category><![CDATA[General Programming]]></category><category><![CDATA[Python]]></category><dc:creator><![CDATA[Great Lakes Consulting]]></dc:creator><pubDate>Mon, 08 Dec 2025 18:06:44 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1755296562465/M5kv-PYRk.png?auto=format" length="0" type="image/jpeg"/><content:encoded><![CDATA[<blockquote>
<p>This post was written by <strong>Steven Whitaker</strong>.</p>
</blockquote>
<p>Ever wish your Python code could run faster
on heavy calculations or simulations?
With <a target="_blank" href="https://juliapy.github.io/PythonCall.jl/stable/juliacall/"><code>juliacall</code></a>,
you can call Julia straight from Python
and instantly access blazing-fast performance
and powerful scientific libraries,
all without rewriting your existing code.
Supercharge your Python workflows today and
elevate your data science and engineering projects
to new heights!</p>
<p>In this <em>Julia for Devs</em> post,
learn step-by-step how to install and
utilize <code>juliacall</code>
to boost the performance of critical code.
Unlock the powerful combination of
Python’s vast ecosystem
and Julia’s speed,
making it easy to experiment,
optimize,
or gradually migrate key components.</p>
<p>Let's dig in!</p>
<h1 id="heading-installing-juliacall">Installing <code>juliacall</code></h1>
<p>Installation is a breeze;
all you need is</p>
<pre><code class="lang-shell">pip install juliacall
</code></pre>
<p>You can test your installation
by running the following in Python:</p>
<pre><code class="lang-python"><span class="hljs-keyword">from</span> juliacall <span class="hljs-keyword">import</span> Main <span class="hljs-keyword">as</span> jl
</code></pre>
<p>The first time this runs,
it will install the Julia packages
needed for communicating
between Julia and Python.</p>
<p>Then you can try it out:</p>
<pre><code class="lang-python"><span class="hljs-keyword">import</span> numpy <span class="hljs-keyword">as</span> np
A = np.array(jl.rand(<span class="hljs-number">5</span>, <span class="hljs-number">3</span>))
x = np.array(jl.randn(<span class="hljs-number">3</span>))
y = A @ x
</code></pre>
<p>Great,
Julia-Python interoperability
works for this small example!
Now let's see
how we can extend this
to a larger example.</p>
<h1 id="heading-calling-custom-code">Calling Custom Code</h1>
<p>In practice,
we might have written some custom code in Julia
that we want to integrate
into our Python workflow.
Let's walk through the process
of this integration.</p>
<h2 id="heading-julia-code">Julia Code</h2>
<p>Typically,
the Julia code will be organized
into a package,
including its own package environment
and dependencies.</p>
<p>We'll work with an example
that runs a simulation
using <a target="_blank" href="https://github.com/SciML/OrdinaryDiffEq.jl">OrdinaryDiffEq.jl</a> and <a target="_blank" href="https://github.com/JuliaArrays/StaticArrays.jl">StaticArrays.jl</a>.
The example package
has the following directory structure:</p>
<pre><code class="lang-plaintext">JuliaExample
├── Project.toml
└── src
    └── JuliaExample.jl
</code></pre>
<p>The <code>Project.toml</code> has the following content:</p>
<pre><code class="lang-toml"><span class="hljs-attr">name</span> = <span class="hljs-string">"JuliaExample"</span>
<span class="hljs-attr">uuid</span> = <span class="hljs-string">"0b6476de-1cea-499f-93be-749bc74a9c07"</span>
<span class="hljs-attr">authors</span> = [<span class="hljs-string">"Author Name &lt;address@email.com&gt;"</span>]
<span class="hljs-attr">version</span> = <span class="hljs-string">"0.1.0"</span>

<span class="hljs-section">[deps]</span>
<span class="hljs-attr">OrdinaryDiffEq</span> = <span class="hljs-string">"1dea7af3-3e70-54e6-95c3-0bf5283fa5ed"</span>
<span class="hljs-attr">StaticArrays</span> = <span class="hljs-string">"90137ffa-7385-5640-81b9-e52037218182"</span>
</code></pre>
<p>And <code>JuliaExample.jl</code> contains:</p>
<pre><code class="lang-julia"><span class="hljs-keyword">module</span> JuliaExample

<span class="hljs-keyword">using</span> OrdinaryDiffEq: ODEProblem, Tsit5, solve
<span class="hljs-keyword">using</span> StaticArrays: SVector

<span class="hljs-keyword">struct</span> Params
    α::<span class="hljs-built_in">Float64</span>
    β::<span class="hljs-built_in">Float64</span>
    <span class="hljs-literal">γ</span>::<span class="hljs-built_in">Float64</span>
<span class="hljs-keyword">end</span>

<span class="hljs-keyword">function</span> f(u, p, t)

    (; α, β, <span class="hljs-literal">γ</span>) = p

    dx = α * (u[<span class="hljs-number">2</span>] - u[<span class="hljs-number">1</span>])
    dy = u[<span class="hljs-number">1</span>] * (β - u[<span class="hljs-number">3</span>]) - u[<span class="hljs-number">2</span>]
    dz = u[<span class="hljs-number">1</span>] * u[<span class="hljs-number">2</span>] - <span class="hljs-literal">γ</span> * u[<span class="hljs-number">3</span>]

    <span class="hljs-keyword">return</span> SVector(dx, dy, dz)

<span class="hljs-keyword">end</span>

<span class="hljs-keyword">function</span> simulate(u0, t_start, t_end, α, β, <span class="hljs-literal">γ</span>)

    u0 = SVector{<span class="hljs-number">3</span>, <span class="hljs-built_in">Float64</span>}(u0)
    tspan = (t_start, t_end)
    p = Params(α, β, <span class="hljs-literal">γ</span>)
    prob = ODEProblem{<span class="hljs-literal">false</span>}(f, u0, tspan, p)
    sol = solve(prob, Tsit5())

    <span class="hljs-keyword">return</span> sol

<span class="hljs-keyword">end</span>

<span class="hljs-keyword">end</span>
</code></pre>
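<p>For comparison,
and to make the math in <code>f</code> concrete,
here is a minimal pure-Python sketch
of the same Lorenz dynamics.
(This uses a simple fixed-step RK4 integrator
for illustration only;
it is not the adaptive Tsit5 solver
used in the Julia code above.)</p>
<pre><code class="lang-python">import numpy as np

def lorenz(u, alpha, beta, gamma):
    # Same right-hand side as the Julia function f above.
    x, y, z = u
    return np.array([
        alpha * (y - x),
        x * (beta - z) - y,
        x * y - gamma * z,
    ])

def rk4(u0, t_start, t_end, alpha, beta, gamma, dt=0.01):
    # Classic fixed-step fourth-order Runge-Kutta integration.
    n = round((t_end - t_start) / dt)
    t = t_start + dt * np.arange(n + 1)
    u = np.empty((n + 1, 3))
    u[0] = u0
    for i in range(n):
        k1 = lorenz(u[i], alpha, beta, gamma)
        k2 = lorenz(u[i] + dt / 2 * k1, alpha, beta, gamma)
        k3 = lorenz(u[i] + dt / 2 * k2, alpha, beta, gamma)
        k4 = lorenz(u[i] + dt * k3, alpha, beta, gamma)
        u[i + 1] = u[i] + dt / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
    return t, u
</code></pre>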
<h2 id="heading-python-code">Python Code</h2>
<p>We have our custom Julia code,
so now let's see what a Python workflow
that calls out to Julia looks like.</p>
<p>We'll have our code organized
in the following directory structure:</p>
<pre><code class="lang-plaintext">python_example
├── pyproject.toml
├── scripts
│   └── run.py
└── src
    └── python_example
        ├── __init__.py
        └── analysis.py
</code></pre>
<p>The main functionality of our Python code
is in <code>analysis.py</code>:</p>
<pre><code class="lang-python"><span class="hljs-keyword">from</span> juliacall <span class="hljs-keyword">import</span> Main <span class="hljs-keyword">as</span> jl
jl.seval(<span class="hljs-string">"using JuliaExample: simulate"</span>)

<span class="hljs-keyword">import</span> numpy <span class="hljs-keyword">as</span> np
<span class="hljs-keyword">import</span> matplotlib.pyplot <span class="hljs-keyword">as</span> plt

<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">simulate</span>(<span class="hljs-params">*args</span>):</span>
    result = jl.simulate(*args)
    t = np.array(result.t)
    sol = np.array([np.array(u) <span class="hljs-keyword">for</span> u <span class="hljs-keyword">in</span> result.u])
    <span class="hljs-keyword">return</span> t, sol

<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">plot_results</span>(<span class="hljs-params">t, sol</span>):</span>
    plt.figure(figsize=(<span class="hljs-number">10</span>, <span class="hljs-number">6</span>))
    labels = [<span class="hljs-string">"x"</span>, <span class="hljs-string">"y"</span>, <span class="hljs-string">"z"</span>]
    colors = [<span class="hljs-string">"tab:blue"</span>, <span class="hljs-string">"tab:orange"</span>, <span class="hljs-string">"tab:green"</span>]

    <span class="hljs-keyword">for</span> i <span class="hljs-keyword">in</span> range(<span class="hljs-number">3</span>):
        plt.plot(t, sol[:, i], label=labels[i], color=colors[i])

    plt.xlabel(<span class="hljs-string">"Time [s]"</span>)
    plt.ylabel(<span class="hljs-string">"Value"</span>)
    plt.title(<span class="hljs-string">"Solution Components Over Time"</span>)
    plt.legend()
    plt.grid(<span class="hljs-literal">True</span>)
    plt.tight_layout()
    plt.show()
</code></pre>
<p>This code provides two functions:
one for calling out to Julia
to run a simulation,
and another for plotting the simulation results.</p>
<p>Let's break down some of this code
to see how Julia is incorporated:</p>
<ul>
<li><pre><code class="lang-python"><span class="hljs-keyword">from</span> juliacall <span class="hljs-keyword">import</span> Main <span class="hljs-keyword">as</span> jl
</code></pre>
We saw this earlier;
this is how to load <code>juliacall</code>.</li>
<li><pre><code class="lang-python">jl.seval(<span class="hljs-string">"using JuliaExample: simulate"</span>)
</code></pre>
Here,
we load our Julia package,
specifically bringing the function <code>simulate</code>
into scope.</li>
<li>The Python function <code>simulate</code>
calls the Julia <code>simulate</code>,
passing along all its inputs:<pre><code class="lang-python">result = jl.simulate(*args)
</code></pre>
The Python function
then does some processing
to convert the Julia results
into something more easily utilized
by further Python processing.</li>
</ul>
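<p>To make the conversion concrete,
here is the same reshaping
with a plain Python list
standing in for the Julia solution's <code>u</code> field
(the values below are made up for illustration):</p>
<pre><code class="lang-python">import numpy as np

# Stand-in for result.u: one 3-element state vector per saved timestep.
u = [
    [1.0, 0.0, 0.0],
    [0.98, 0.12, 0.03],
    [0.95, 0.25, 0.07],
    [0.91, 0.37, 0.12],
]

# Each state vector becomes a row, giving an (n_timesteps, 3) array,
# so plot_results can slice out each component with sol[:, i].
sol = np.array([np.array(v) for v in u])
print(sol.shape)  # (4, 3)
</code></pre>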
<p>This functionality is exercised
in the <code>run.py</code> script:</p>
<pre><code class="lang-python"><span class="hljs-keyword">from</span> python_example <span class="hljs-keyword">import</span> simulate, plot_results
<span class="hljs-keyword">import</span> numpy <span class="hljs-keyword">as</span> np

u0 = np.array([<span class="hljs-number">1</span>, <span class="hljs-number">0</span>, <span class="hljs-number">0</span>])
t_start = <span class="hljs-number">0</span>
t_end = <span class="hljs-number">100</span>
alpha = <span class="hljs-number">10</span>
beta = <span class="hljs-number">28</span>
gamma = <span class="hljs-number">8</span>/<span class="hljs-number">3</span>
t, sol = simulate(u0, t_start, t_end, alpha, beta, gamma)
plot_results(t, sol)
</code></pre>
<p>Finally,
for completeness,
here's <code>__init__.py</code>:</p>
<pre><code class="lang-python"><span class="hljs-keyword">from</span> .analysis <span class="hljs-keyword">import</span> simulate, plot_results
</code></pre>
<h2 id="heading-finding-julia">Finding Julia</h2>
<p>We have all our code set up,
so now we need Python
to be able to find the Julia code
so we can call out to it.
In other words,
we need the Julia package environment
used by <code>juliacall</code>
to have <code>JuliaExample</code> as a dependency.</p>
<p>We can accomplish this
by creating a <code>juliapkg.json</code> file
in our Python project directory
(i.e., <code>python_example/juliapkg.json</code>).
The file should contain the following JSON:</p>
<pre><code class="lang-json">{
    <span class="hljs-attr">"packages"</span>: {
        <span class="hljs-attr">"JuliaExample"</span>: {
            <span class="hljs-attr">"uuid"</span>: <span class="hljs-string">"0b6476de-1cea-499f-93be-749bc74a9c07"</span>,
            <span class="hljs-attr">"path"</span>: <span class="hljs-string">"path/to/JuliaExample"</span>,
            <span class="hljs-attr">"dev"</span>: <span class="hljs-literal">true</span>
        }
    }
}
</code></pre>
<p>Note that the <code>uuid</code> here
needs to match the <code>uuid</code>
in <code>JuliaExample/Project.toml</code>.
And the <code>"path"</code> and <code>"dev": true</code> fields are necessary
because our Julia package exists locally on our machine;
it is not a registered Julia package.
See the <a target="_blank" href="https://juliapy.github.io/PythonCall.jl/stable/juliacall/#julia-deps"><code>juliacall</code> docs</a>
for more information about <code>juliapkg.json</code>.</p>
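<p>For comparison,
if the Julia package were registered in the General registry,
the <code>juliapkg.json</code> entry could pin a version
instead of pointing at a local path.
A hypothetical sketch
(the package name, UUID, and version here are placeholders):</p>
<pre><code class="lang-json">{
    "packages": {
        "SomeRegisteredPackage": {
            "uuid": "00000000-0000-0000-0000-000000000000",
            "version": "0.1"
        }
    }
}
</code></pre>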
<h2 id="heading-putting-it-together">Putting It Together</h2>
<p>Now we have all the components we need:
Python code to run,
Julia code to call out to,
and <code>juliapkg.json</code> to tell <code>juliacall</code>
where to find our Julia code.</p>
<p>So, what happens when we run <code>run.py</code>?</p>
<p>The first time it is run,
<code>juliacall</code> will set up the Julia package environment,
installing the dependencies of <code>JuliaExample</code>.
Then, the script proceeds to run the simulation
(calling out to Julia to do so)
and plot the results:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1755282331427/ejdtNXcQk.png?auto=format" alt="Simulation results" /></p>
<p>Awesome,
we now have a working example
showing how to call out to Julia
from a Python project!</p>
<h1 id="heading-summary">Summary</h1>
<p>In this post,
we saw how to install and use <code>juliacall</code>
to call out to Julia from within Python.
We looked at a trivial example
as well as a more realistic example
of integrating custom Julia code
into a Python project.</p>
<p>What custom Julia code
do you want to integrate
into your Python projects?
<a target="_blank" href="https://glcs.io/modeling-simulation/">Contact us</a>, and we can make it happen!</p>
<h1 id="heading-additional-links">Additional Links</h1>
<ul>
<li><a target="_blank" href="https://github.com/JuliaPy/PythonCall.jl">PythonCall.jl</a><ul>
<li>One cool thing about <code>juliacall</code>
is it is maintained in the same GitHub repo
as <code>PythonCall</code>,
which is the recommended way
to call Python code from Julia.</li>
</ul>
</li>
<li><a target="_blank" href="https://glcs.io/modeling-simulation/">GLCS Modeling &amp; Simulation</a><ul>
<li>Connect with us for Julia Modeling &amp; Simulation consulting.</li>
</ul>
</li>
</ul>
]]></content:encoded></item><item><title><![CDATA[Simplifying Julia Package Integration with Extensions]]></title><description><![CDATA[This post was written by Steven Whitaker.

The Julia programming language
is a high-level language
that is known, at least in part,
for its outstanding composability.
Much of Julia's composability
stems from its multiple dispatch,
which allows functi...]]></description><link>https://blog.glcs.io/package-extensions</link><guid isPermaLink="true">https://blog.glcs.io/package-extensions</guid><category><![CDATA[Julia]]></category><category><![CDATA[library]]></category><category><![CDATA[Open Source]]></category><category><![CDATA[package]]></category><category><![CDATA[General Programming]]></category><dc:creator><![CDATA[Great Lakes Consulting]]></dc:creator><pubDate>Mon, 17 Nov 2025 18:05:57 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1754678023597/Ii3WtIkQm.png?auto=format" length="0" type="image/jpeg"/><content:encoded><![CDATA[<blockquote>
<p>This post was written by <strong>Steven Whitaker</strong>.</p>
</blockquote>
<p>The <a target="_blank" href="https://julialang.org/">Julia programming language</a>
is a high-level language
that is known, at least in part,
for its outstanding composability.
Much of Julia's composability
stems from its multiple dispatch,
which allows functions written in one package
to work with objects from another package
without either package needing to depend on or even know about the other.
(See <a target="_blank" href="https://blog.glcs.io/multiple-dispatch">another blog post</a> for more details.)</p>
<p>Sometimes, however,
it is useful for a package
to be able to extend its functions
to provide additional functionality
when given an object of a specific type
from another package.
One way to do so
is to add the other package as an explicit dependency
so that its type is available
for the first package to use
to define a specific method for it.</p>
<p>But what if the package can function just fine
without the additional functionality?
What if the extra functionality
isn't integral to what the package does
and only applies
if the user
wants to work with objects
of that specific type?
In this case,
it doesn't make much sense
to make the other package a direct dependency,
because then <em>every</em> user
pays the price of extra package load time
for functionality that only some users actually want.</p>
<p>The solution is package extensions.
A package extension is code
that gets loaded <em>conditionally</em>,
depending on what other packages
the user has explicitly loaded.
In other words,
when a user loads both the package
and the dependency the extension depends on,
the extension gets loaded automatically.
This way,
users who want to use the package
can do so without the added dependency,
while users who want the extra functionality
can load the dependency themselves.</p>
<p>In this post,
we will learn about some package extensions
that exist in the Julia package ecosystem.
We will also learn how to write a package extension
and how to load the extension.</p>
<p>This post assumes you are familiar
with the structure of a Julia package.
If you need to learn more,
check out <a target="_blank" href="https://blog.glcs.io/package-creation">our post on creating Julia packages</a>.</p>
<h1 id="heading-package-extensions-in-the-wild">Package Extensions in the Wild</h1>
<ul>
<li><a target="_blank" href="https://github.com/JuliaDiff/ForwardDiff.jl">ForwardDiff.jl</a> has an extension for <a target="_blank" href="https://github.com/JuliaArrays/StaticArrays.jl">StaticArrays.jl</a>,
enabling forward-mode automatic differentiation
with the performance benefits of StaticArrays.</li>
<li><a target="_blank" href="https://github.com/LuxDL/Lux.jl">Lux.jl</a> has an extension for <a target="_blank" href="https://github.com/FluxML/Flux.jl">Flux.jl</a>
that adds functionality
for converting deep learning models defined in Flux to Lux.</li>
<li>Some other packages with extensions
(just look in the <code>ext</code> folder of these git repos):<ul>
<li><a target="_blank" href="https://github.com/SciML/ModelingToolkit.jl">ModelingToolkit.jl</a></li>
<li><a target="_blank" href="https://github.com/aviatesk/JET.jl">JET.jl</a></li>
<li><a target="_blank" href="https://github.com/JuliaStats/Distributions.jl">Distributions.jl</a></li>
<li><a target="_blank" href="https://github.com/JuliaDiff/ChainRulesCore.jl">ChainRulesCore.jl</a></li>
<li><a target="_blank" href="https://github.com/SciML/LinearSolve.jl">LinearSolve.jl</a></li>
<li><a target="_blank" href="https://github.com/itsdfish/PackageExtensionsExample.jl">PackageExtensionsExample.jl</a><ul>
<li>Okay, this one isn't really a package extension in the wild,
but it could be a good resource
for learning how they work.</li>
</ul>
</li>
</ul>
</li>
</ul>
<h1 id="heading-writing-a-package-extension">Writing a Package Extension</h1>
<p>To create a package extension,
one needs to create a module
that adds method definitions
to functions from one of the packages
(either the package being extended
or the package that triggers loading the extension)
that dispatch on types from the other package.
This module will live in the <code>ext</code> directory
of the package being extended.
Additionally,
the extended package's <code>Project.toml</code>
needs to be updated
to inform the package manager
of the existence of the extension
and when to load it.</p>
<p>Let's look at a concrete example.</p>
<h2 id="heading-example-package-to-extend">Example Package to Extend</h2>
<p>This example will build on a custom package called Averages.jl
that we discussed in <a target="_blank" href="https://blog.glcs.io/package-testing">our blog post on testing Julia packages</a>.
The package code is as follows:</p>
<pre><code class="lang-julia"><span class="hljs-keyword">module</span> Averages

<span class="hljs-keyword">using</span> Statistics: mean

<span class="hljs-keyword">export</span> compute_average

compute_average(x) = (check_real(x); mean(x))

<span class="hljs-keyword">function</span> compute_average(a, b...)

    check_real(a)

    N = length(a)
    <span class="hljs-keyword">for</span> (i, x) <span class="hljs-keyword">in</span> enumerate(b)
        check_real(x)
        check_length(i + <span class="hljs-number">1</span>, x, N)
    <span class="hljs-keyword">end</span>

    T = float(promote_type(eltype(a), eltype.(b)...))
    average = <span class="hljs-built_in">Vector</span>{T}(undef, N)
    average .= a
    <span class="hljs-keyword">for</span> x <span class="hljs-keyword">in</span> b
        average .+= x
    <span class="hljs-keyword">end</span>
    average ./= length(b) + <span class="hljs-number">1</span>

    <span class="hljs-keyword">return</span> a <span class="hljs-keyword">isa</span> <span class="hljs-built_in">Real</span> ? average[<span class="hljs-number">1</span>] : average

<span class="hljs-keyword">end</span>

<span class="hljs-keyword">function</span> check_real(x)

    T = eltype(x)
    T &lt;: <span class="hljs-built_in">Real</span> || throw(<span class="hljs-built_in">ArgumentError</span>(<span class="hljs-string">"only real numbers are supported; unsupported type <span class="hljs-variable">$T</span>"</span>))

<span class="hljs-keyword">end</span>

<span class="hljs-keyword">function</span> check_length(i, x, expected)

    N = length(x)
    N == expected || throw(<span class="hljs-built_in">DimensionMismatch</span>(<span class="hljs-string">"the length of input <span class="hljs-variable">$i</span> does not match the length of the first input: <span class="hljs-variable">$N</span> != <span class="hljs-variable">$expected</span>"</span>))

<span class="hljs-keyword">end</span>

<span class="hljs-keyword">end</span>
</code></pre>
<h2 id="heading-creating-the-extension">Creating the Extension</h2>
<p>For this example,
we will create an extension
that implements additional functionality for <code>DataFrame</code>s.
These are the tasks we need to do
to implement the extension:</p>
<ol>
<li><p>Create the extension
at <code>Averages/ext/AveragesDataFramesExt.jl</code>.
Note that this follows the naming convention for extensions:
<code>&lt;PackageName&gt;&lt;NameOfPackageThatTriggersExtension&gt;Ext</code>.
Inside this file,
we create a module called <code>AveragesDataFramesExt</code>
(same name as the file)
and put the code we want to be included
when Averages.jl and DataFrames.jl are loaded together:</p>
<pre><code class="lang-julia"><span class="hljs-keyword">module</span> AveragesDataFramesExt

<span class="hljs-keyword">import</span> Averages
<span class="hljs-keyword">using</span> Averages: compute_average
<span class="hljs-keyword">using</span> DataFrames: All, DataFrame, combine

<span class="hljs-keyword">function</span> Averages.compute_average(df::DataFrame)

    <span class="hljs-meta">@info</span> <span class="hljs-string">"Running code in AveragesDataFramesExt!"</span>

    df_avg = combine(df, All() .=&gt; compute_average)

    <span class="hljs-keyword">return</span> df_avg

<span class="hljs-keyword">end</span>

<span class="hljs-keyword">end</span>
</code></pre>
</li>
<li><p>Add <code>[weakdeps]</code> and <code>[extensions]</code> sections
to the <code>Project.toml</code> of Averages.jl.
(See <a target="_blank" href="https://blog.glcs.io/package-testing">our previous blog post</a> for the original <code>Project.toml</code>.)
In <code>[weakdeps]</code>,
specify DataFrames.jl and its UUID,
and, in <code>[extensions]</code>,
specify our extension (AveragesDataFramesExt)
and its dependency (DataFrames.jl).
The UUID of DataFrames.jl can be found
in <a target="_blank" href="https://github.com/JuliaData/DataFrames.jl/blob/main/Project.toml">DataFrames.jl's <code>Project.toml</code></a>.</p>
<p>Here's the updated <code>Project.toml</code> for Averages.jl:</p>
<pre><code class="lang-toml"><span class="hljs-attr">name</span> = <span class="hljs-string">"Averages"</span>
<span class="hljs-attr">uuid</span> = <span class="hljs-string">"1fc6e63b-fe0f-463a-8652-42f2a29b8cc6"</span>
<span class="hljs-attr">version</span> = <span class="hljs-string">"0.1.0"</span>

<span class="hljs-section">[deps]</span>
<span class="hljs-attr">Statistics</span> = <span class="hljs-string">"10745b16-79ce-11e8-11f9-7d13ad32a3b2"</span>

<span class="hljs-section">[weakdeps]</span>
<span class="hljs-attr">DataFrames</span> = <span class="hljs-string">"a93c6f00-e57d-5684-b7b6-d8193f3e46c0"</span>

<span class="hljs-section">[extensions]</span>
<span class="hljs-attr">AveragesDataFramesExt</span> = <span class="hljs-string">"DataFrames"</span>

<span class="hljs-section">[extras]</span>
<span class="hljs-attr">Test</span> = <span class="hljs-string">"8dfed614-e22c-5e08-85e1-65c5234f0b40"</span>

<span class="hljs-section">[targets]</span>
<span class="hljs-attr">test</span> = [<span class="hljs-string">"Test"</span>]
</code></pre>
<p>(Note that,
just as compatible versions of the <code>[deps]</code> packages
can be specified in a <code>[compat]</code> section,
so too can the compatible versions of the <code>[weakdeps]</code> packages
be specified.)</p>
</li>
</ol>
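<p>As noted above,
compatible versions of the <code>[weakdeps]</code> packages
can be declared in a <code>[compat]</code> section
just like regular dependencies.
A hypothetical <code>[compat]</code> section for Averages.jl
(the version bounds are illustrative only;
note that package extensions require Julia 1.9 or later):</p>
<pre><code class="lang-toml">[compat]
DataFrames = "1"
Statistics = "1"
julia = "1.9"
</code></pre>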
<h2 id="heading-using-the-extension">Using the Extension</h2>
<p>First,
let's see what happens
if we try this without the extension:</p>
<pre><code class="lang-julia">julia&gt; compute_average(DataFrame(a = [<span class="hljs-number">1</span>, <span class="hljs-number">2</span>], b = [<span class="hljs-number">3.0</span>, <span class="hljs-number">4.0</span>]))
ERROR: <span class="hljs-built_in">ArgumentError</span>: only real numbers are supported; unsupported <span class="hljs-keyword">type</span> <span class="hljs-built_in">Any</span>
Stacktrace:
 [<span class="hljs-number">1</span>] check_real(x::DataFrame)
   @ Averages /path/to/Averages/src/Averages.jl:<span class="hljs-number">34</span>
 [<span class="hljs-number">2</span>] compute_average(x::DataFrame)
   @ Averages /path/to/Averages/src/Averages.jl:<span class="hljs-number">7</span>
 [<span class="hljs-number">3</span>] top-level scope
   @ REPL[<span class="hljs-number">5</span>]:<span class="hljs-number">1</span>
</code></pre>
<p>So,
now let's see if the extension
allows this function call to work.</p>
<p>To use the extension,
install and load Averages.jl and DataFrames.jl
(for Averages.jl, use the <code>dev</code> command,
i.e., <code>pkg&gt; dev /path/to/Averages</code>)
and then call <code>compute_average</code>:</p>
<pre><code class="lang-julia">julia&gt; <span class="hljs-keyword">using</span> Averages, DataFrames

julia&gt; compute_average(DataFrame(a = [<span class="hljs-number">1</span>, <span class="hljs-number">2</span>], b = [<span class="hljs-number">3.0</span>, <span class="hljs-number">4.0</span>]))
[ Info: Running code <span class="hljs-keyword">in</span> AveragesDataFramesExt!
<span class="hljs-number">1</span>×<span class="hljs-number">2</span> DataFrame
 Row │ a_compute_average  b_compute_average
     │ <span class="hljs-built_in">Float64</span>            <span class="hljs-built_in">Float64</span>
─────┼──────────────────────────────────────
   <span class="hljs-number">1</span> │               <span class="hljs-number">1.5</span>                <span class="hljs-number">3.5</span>
</code></pre>
<p>Nice, it works!
And with that,
we have an example package extension
that illustrates how to implement your own.</p>
<p>And remember,
a user of Averages.jl
will only incur the cost of loading AveragesDataFramesExt
if they load DataFrames.jl.
For more details,
see the slide annotations
in this screenshot from JuliaCon 2023:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1725921992130/7_kyasRPu.png?auto=format" alt="JuliaCon 2023 package extensions talk" /></p>
<p>(See also the full <a target="_blank" href="https://youtu.be/TiIZlQhFzyk">talk on package extensions</a>
for even more details.)</p>
<h2 id="heading-note-where-should-an-extension-live">Note: Where Should an Extension Live?</h2>
<p>By the way,
if you're wondering why we put the extension in Averages.jl
instead of DataFrames.jl,
the answer is
that it doesn't really matter
because the user experience
will be the same regardless.
If you still want some rules to follow,
I'm not aware of any established Julia best practices
in this regard,
but here are some rules that make sense to me:</p>
<ul>
<li>If one of the two packages in question
defines an interface,
the extension should go in the package
that implements the interface.</li>
<li>Otherwise,
put the extension in the package
that owns the functions
that are being extended.
In our example,
we extended the <code>compute_average</code> function.
Since this function is defined in Averages.jl,
we put the extension in Averages.jl.</li>
<li>An exception to the previous rule
is if getting the new functionality right
requires a good understanding
of the internals of the new data type
that's being dispatched on,
in which case the extension
should belong in the package
that defines the type.
For example,
if <code>compute_average</code> were super complicated
for some reason
when working with <code>DataFrame</code>s,
it would make sense for those with the needed expertise
(i.e., the developers of DataFrames.jl)
to own and maintain the extension.</li>
</ul>
<h1 id="heading-summary">Summary</h1>
<p>In this post,
we listed some real Julia packages
that have their own package extensions.
We also demonstrated creating our own extension
for an example package
and showed how to use the extension's code.</p>
<p>What package extensions have you found useful?
Let us know in the comments below!</p>
<h1 id="heading-additional-links">Additional Links</h1>
<ul>
<li><a target="_blank" href="https://pkgdocs.julialang.org/v1/creating-packages/#Conditional-loading-of-code-in-packages-(Extensions)">Julia Package Extension Docs</a><ul>
<li>Official Julia documentation on package extensions.</li>
</ul>
</li>
<li><a target="_blank" href="https://youtu.be/TiIZlQhFzyk">JuliaCon 2023 Talk Introducing Package Extensions</a><ul>
<li>Talk by Kristoffer Carlsson in which package extensions are introduced.</li>
</ul>
</li>
</ul>
]]></content:encoded></item><item><title><![CDATA[Discover the Key Features and Updates in Julia 1.12]]></title><description><![CDATA[This post was written by Steven Whitaker.

A new version of the Julia programming language
was just released!
Version 1.12 is now the latest stable version of Julia.
This release is a minor release,
meaning it includes language enhancements
and bug f...]]></description><link>https://blog.glcs.io/julia-1-12</link><guid isPermaLink="true">https://blog.glcs.io/julia-1-12</guid><category><![CDATA[Julia]]></category><dc:creator><![CDATA[Great Lakes Consulting]]></dc:creator><pubDate>Thu, 09 Oct 2025 16:53:47 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1750800672398/94_gjVLKb.png?auto=format" length="0" type="image/jpeg"/><content:encoded><![CDATA[<blockquote>
<p>This post was written by <strong>Steven Whitaker</strong>.</p>
</blockquote>
<p>A new version of the <a target="_blank" href="https://julialang.org">Julia programming language</a>
was just released!
Version 1.12 is now the latest stable version of Julia.</p>
<p>This release is a minor release,
meaning it includes language enhancements
and bug fixes
but should also be fully compatible
with code written in previous Julia versions
(from version 1.0 and onward).</p>
<p>In this post,
we will check out some of the features and improvements
introduced in this newest Julia version.
Read the full post,
or click on the links below
to jump to the features that interest you.</p>
<ul>
<li><a class="post-section-overview" href="#heading-improved-quality-of-life-with-redefinable-types">Improved Quality of Life with Redefinable Types</a></li>
<li><a class="post-section-overview" href="#heading-new-way-to-fix-function-arguments">New Way to Fix Function Arguments</a></li>
<li><a class="post-section-overview" href="#heading-progress-towards-more-reasonable-executables">Progress Towards More Reasonable Executables</a></li>
</ul>
<p>If you are new to Julia
(or just need a refresher),
feel free to check out our <a target="_blank" href="https://blog.glcs.io/series/julia-basics-programmers">Julia tutorial series</a>,
beginning with <a target="_blank" href="https://blog.glcs.io/install-julia-and-vscode">how to install Julia and VS Code</a>.</p>
<h1 id="heading-improved-quality-of-life-with-redefinable-types">Improved Quality of Life with Redefinable Types</h1>
<p>Julia 1.12 introduces a major quality of life improvement
for package development
by allowing types to be redefined.
To illustrate,
prior to Julia 1.12:</p>
<pre><code class="lang-julia">julia&gt; <span class="hljs-keyword">struct</span> A <span class="hljs-keyword">end</span>

julia&gt; <span class="hljs-keyword">struct</span> A
           x::<span class="hljs-built_in">Int</span>
       <span class="hljs-keyword">end</span>
ERROR: invalid redefinition of constant Main.A
Stacktrace:
 [<span class="hljs-number">1</span>] top-level scope
   @ REPL[<span class="hljs-number">2</span>]:<span class="hljs-number">1</span>
</code></pre>
<p>But in Julia 1.12,
no error is thrown!
Note, however,
that any objects of the old type
will still be of the old type; they don't automatically update
to conform to the new type definition.
For example:</p>
<pre><code class="lang-julia">julia&gt; <span class="hljs-keyword">struct</span> A <span class="hljs-keyword">end</span>

julia&gt; a1 = A()
A()

julia&gt; <span class="hljs-keyword">struct</span> A
           x::<span class="hljs-built_in">Int</span>
       <span class="hljs-keyword">end</span>

julia&gt; a2 = A(<span class="hljs-number">1</span>)
A(<span class="hljs-number">1</span>)

julia&gt; a1
<span class="hljs-meta">@world</span>(A, <span class="hljs-number">38513</span>:<span class="hljs-number">38515</span>)()
</code></pre>
<p>Note that after redefining <code>A</code>,
<code>a1</code> is denoted as being of type <code>A</code>
from a previous so-called "world age".</p>
<p>Importantly,
when defining a method that dispatches on a type,
it will be defined for the version of the type
that existed at the time of method definition.
Continuing from the previous example:</p>
<pre><code class="lang-julia">julia&gt; f(a::A) = println(<span class="hljs-string">"hello"</span>)
f (generic <span class="hljs-keyword">function</span> with <span class="hljs-number">1</span> method)

julia&gt; f(a2)
hello

julia&gt; f(a1) <span class="hljs-comment"># Method doesn't exist for the old type</span>
ERROR: <span class="hljs-built_in">MethodError</span>: no method matching f(::<span class="hljs-meta">@world</span>(A, <span class="hljs-number">38513</span>:<span class="hljs-number">38515</span>))
The <span class="hljs-keyword">function</span> <span class="hljs-string">`f`</span> exists, but no method is defined <span class="hljs-keyword">for</span> this combination of argument types.

Closest candidates are:
  f(::A)
   @ Main REPL[<span class="hljs-number">6</span>]:<span class="hljs-number">1</span>

Stacktrace:
 [<span class="hljs-number">1</span>] top-level scope
   @ REPL[<span class="hljs-number">8</span>]:<span class="hljs-number">1</span>

julia&gt; <span class="hljs-keyword">struct</span> A <span class="hljs-comment"># Update the type</span>
           x::<span class="hljs-built_in">Float64</span>
       <span class="hljs-keyword">end</span>

julia&gt; f(A(<span class="hljs-number">2.0</span>)) <span class="hljs-comment"># Method doesn't exist for the new type</span>
ERROR: <span class="hljs-built_in">MethodError</span>: no method matching f(::A)
The <span class="hljs-keyword">function</span> <span class="hljs-string">`f`</span> exists, but no method is defined <span class="hljs-keyword">for</span> this combination of argument types.

Closest candidates are:
  f(::<span class="hljs-meta">@world</span>(A, <span class="hljs-number">38516</span>:<span class="hljs-number">38521</span>))
   @ Main REPL[<span class="hljs-number">6</span>]:<span class="hljs-number">1</span>

Stacktrace:
 [<span class="hljs-number">1</span>] top-level scope
   @ REPL[<span class="hljs-number">18</span>]:<span class="hljs-number">1</span>
</code></pre>
<p>In other words,
if updating a type,
be sure to update methods as well.</p>
<p>Often, however,
when someone is working with a type
and many methods that use the type for dispatch,
they are developing a package.
Package development works differently
than just working in the REPL,
so let's see how package development
is affected by the ability to update types.</p>
<p>First,
without proper tooling,
Julia package development can be a slog,
regardless of whether types (or anything else)
can be updated.
This is because normally,
after a package is loaded,
changes made to the package's source code
don't take effect
until Julia is restarted
and the package is loaded again.
In other words,
the whole Julia runtime
has to be started up again
and code has to be reloaded
even if just a single function in the package was updated.</p>
<p>Fortunately,
<a target="_blank" href="https://github.com/timholy/Revise.jl">Revise.jl</a> exists.
Revise.jl provides massive quality of life improvements
for package developers
by allowing changes to the package source code
to take effect immediately,
without restarting Julia.</p>
<p>The biggest caveat?
Revise.jl couldn't handle changes to struct definitions.
So,
if you decide a struct in your package needs an extra field,
you're out of luck,
you have to restart Julia.
And then if you decide the struct actually didn't need that field,
too bad again,
restart Julia.</p>
<p>But that all changes with Julia 1.12!
This release adds a mechanism
for redefining types
that Revise.jl hooks into,
removing arguably the most significant remaining pain point
of Julia package development.
The number of times a developer needs to restart Julia
will decrease significantly
with Julia 1.12,
allowing for greater productivity.</p>
<h1 id="heading-new-way-to-fix-function-arguments">New Way to Fix Function Arguments</h1>
<p>Julia 1.12 introduces the <code>Fix</code> struct
that can be used to fix function arguments.
(Here, "to fix" means "to set",
not "to correct".)
Think of it as another way to define a closure.</p>
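<p>For context, <code>Fix</code> generalizes the <code>Base.Fix1</code> and <code>Base.Fix2</code> helpers that have long been part of Julia. A minimal sketch of what "fixing" an argument means, using only the older helpers (so it runs on pre-1.12 Julia as well):</p>

```julia
# Fix1 fixes the 1st argument; Fix2 fixes the 2nd.
sub = (x, y) -> x - y

sub_from_10 = Base.Fix1(sub, 10)  # behaves like y -> sub(10, y)
minus_3 = Base.Fix2(sub, 3)       # behaves like x -> sub(x, 3)

sub_from_10(4)  # 6, i.e., 10 - 4
minus_3(4)      # 1, i.e., 4 - 3
```

<p>The new <code>Base.Fix{N}</code> extends this same idea to any argument position.</p>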
<p>One of the main uses of <code>Fix</code>
is purely for convenience,
particularly when creating closures
of functions with many inputs.
For example,
suppose we have the following (nonsensical) physical model:</p>
<pre><code class="lang-julia"><span class="hljs-keyword">function</span> dynamics(velocity, resistance, gravity, position, friction, length, mass)
    <span class="hljs-keyword">return</span> velocity + resistance + gravity + position + friction + length + mass
<span class="hljs-keyword">end</span>
</code></pre>
<p>Now,
suppose we have another function
that computes
(or accepts as input)
a particular value for <code>mass</code>
and then creates a new function
that computes <code>dynamics</code> with <code>mass</code> fixed.
Previously,
an anonymous function typically would be used,
so let's compare using an anonymous function
to using <code>Fix</code>:</p>
<pre><code class="lang-julia">mass = <span class="hljs-number">30</span> <span class="hljs-comment"># Some computed or given value</span>

<span class="hljs-comment"># Using an anonymous function.</span>
<span class="hljs-comment"># Note that the `let mass = mass` is essential</span>
<span class="hljs-comment"># to guarantee Julia doesn't box `mass`</span>
<span class="hljs-comment"># (see &lt;https://docs.julialang.org/en/v1/manual/performance-tips/#man-performance-captured&gt;).</span>
<span class="hljs-comment"># (Though also note that sometimes boxing does not occur</span>
<span class="hljs-comment"># even if the `let` block is omitted.)</span>
dynamics_with_mass = <span class="hljs-keyword">let</span> mass = mass
    (velocity, resistance, gravity, position, friction, length) -&gt; dynamics(velocity, resistance, gravity, position, friction, length, mass)
<span class="hljs-keyword">end</span>

<span class="hljs-comment"># Using `Fix`.</span>
<span class="hljs-comment"># The `7` indicates that we want to fix the 7th argument.</span>
dynamics_with_mass_fix = Base.Fix{<span class="hljs-number">7</span>}(dynamics, mass)
</code></pre>
<p>As you can see,
using <code>Fix</code> is much more concise.
But both
<code>dynamics_with_mass</code>
and <code>dynamics_with_mass_fix</code>
act as a function of six arguments
and compute <code>dynamics</code> with <code>mass</code> fixed
to a specific value.</p>
<p>Next,
suppose we want to compute the derivative of <code>dynamics</code>
as a function of <code>velocity</code>
for fixed values of the other parameters.
In this case,
there are six arguments that need to be fixed.
<code>Fix</code> allows for fixing just one argument,
but <code>Fix</code>es can be nested
to fix multiple arguments.
Let's compare using an anonymous function
to using <code>Fix</code>:</p>
<pre><code class="lang-julia"><span class="hljs-comment"># Fixed values, either computed or given.</span>
resistance = <span class="hljs-number">2</span>
gravity = <span class="hljs-number">9.8</span>
position = <span class="hljs-number">0</span>
friction = <span class="hljs-number">0</span>
length = <span class="hljs-number">0.5</span>
mass = <span class="hljs-number">30</span>

<span class="hljs-comment"># Using an anonymous function.</span>
f = <span class="hljs-keyword">let</span> resistance = resistance,
        gravity = gravity,
        position = position,
        friction = friction,
        length = length,
        mass = mass
    velocity -&gt; dynamics(velocity, resistance, gravity, position, friction, length, mass)
<span class="hljs-keyword">end</span>

<span class="hljs-comment"># Using `Fix`.</span>
f_fix = Base.Fix{<span class="hljs-number">2</span>}(Base.Fix{<span class="hljs-number">2</span>}(Base.Fix{<span class="hljs-number">2</span>}(Base.Fix{<span class="hljs-number">2</span>}(Base.Fix{<span class="hljs-number">2</span>}(Base.Fix{<span class="hljs-number">2</span>}(dynamics, resistance), gravity), position), friction), length), mass)

<span class="hljs-comment"># Using `Fix` with piping for nicer formatting.</span>
pipefix(::<span class="hljs-built_in">Val</span>{N}, x) <span class="hljs-keyword">where</span> {N} = Base.Fix{<span class="hljs-number">2</span>}(Base.Fix{N}, x)
f_pipefix = Base.Fix{<span class="hljs-number">2</span>}(dynamics, resistance) |&gt;
    pipefix(<span class="hljs-built_in">Val</span>{<span class="hljs-number">2</span>}(), gravity) |&gt;
    pipefix(<span class="hljs-built_in">Val</span>{<span class="hljs-number">2</span>}(), position) |&gt;
    pipefix(<span class="hljs-built_in">Val</span>{<span class="hljs-number">2</span>}(), friction) |&gt;
    pipefix(<span class="hljs-built_in">Val</span>{<span class="hljs-number">2</span>}(), length) |&gt;
    pipefix(<span class="hljs-built_in">Val</span>{<span class="hljs-number">2</span>}(), mass)
</code></pre>
<p>In this case,
I would argue using an anonymous function
is clearer to read.
However,
<code>Fix</code> can have some performance advantages
over anonymous functions.
For example,
using <code>Fix</code> can reduce the need to compile
anonymous functions that are used repeatedly.</p>
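<p>One way to see why: every anonymous function literal is compiled as its own distinct type, whereas <code>Fix</code> values wrapping the same function and value type share a single type, so compiled specializations can be reused. A small illustration (using <code>Base.Fix2</code>, which predates 1.12):</p>

```julia
# Each anonymous function literal gets its own type,
# even when the bodies are identical...
g1 = x -> x + 1
g2 = x -> x + 1
typeof(g1) == typeof(g2)  # false

# ...but `Fix` values wrapping the same function share one type,
# so code compiled for one can be reused for the other.
h1 = Base.Fix2(+, 1)
h2 = Base.Fix2(+, 1)
typeof(h1) == typeof(h2)  # true
```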
<p>For a lot more context and discussion,
see the <a target="_blank" href="https://github.com/JuliaLang/julia/pull/54653">pull request</a>
that introduced <code>Fix</code>.</p>
<h1 id="heading-progress-towards-more-reasonable-executables">Progress Towards More Reasonable Executables</h1>
<p>Perhaps surprisingly,
an experimental feature
is probably the most highly anticipated addition in Julia 1.12:
a new command-line argument <code>--trim</code>
that removes unnecessary code
from compiled executables.</p>
<p>Along with the release of Julia 1.12,
<a target="_blank" href="https://github.com/JuliaLang/JuliaC.jl">JuliaC.jl</a> was also created
to streamline the creation of executables.
JuliaC.jl can be installed as a Julia app
(another new feature with 1.12!):</p>
<pre><code class="lang-julia">pkg&gt; app add JuliaC
</code></pre>
<p>Installing JuliaC.jl in this way
creates the <code>juliac</code> executable
at <code>~/.julia/bin/juliac</code>.
(After installation,
you may want to add <code>~/.julia/bin</code>
to your <code>PATH</code>.)</p>
<p>Let's see how to use <code>juliac</code>
with a simple program:</p>
<pre><code class="lang-julia"><span class="hljs-comment"># main.jl</span>

<span class="hljs-keyword">function</span> (<span class="hljs-meta">@main</span>)(args::<span class="hljs-built_in">Vector</span>{<span class="hljs-built_in">String</span>})::<span class="hljs-built_in">Cint</span>
    <span class="hljs-keyword">for</span> arg <span class="hljs-keyword">in</span> args
        <span class="hljs-comment"># Note the use of `Core.stdout` instead of `stdout`</span>
        <span class="hljs-comment"># (which is used by default if an `io` argument is omitted).</span>
        println(Core.stdout, arg)
    <span class="hljs-keyword">end</span>
    <span class="hljs-keyword">return</span> <span class="hljs-number">0</span>
<span class="hljs-keyword">end</span>
</code></pre>
<p>This program simply prints to the console
all its given command-line arguments.</p>
<p>To compile it into an executable,
just run the following command:</p>
<pre><code class="lang-shell">juliac --output-exe print_args --project . --bundle bundle --trim main.jl
</code></pre>
<p>To verify it works:</p>
<pre><code class="lang-shell">$ ./bundle/bin/print_args hello world
hello
world
</code></pre>
<p>So,
what exactly is the impact of <code>--trim</code>?
Without the new (experimental) feature,
the size of the <code>print_args</code> executable
was 205 MB on my computer.
With <code>--trim</code>,
that number improved by over 100x,
dropping to just 1.7 MB!</p>
<h1 id="heading-summary">Summary</h1>
<p>In this post,
we learned about
some of the new features
and improvements
introduced in Julia 1.12,
including redefinable types,
<code>Fix</code>,
and <code>--trim</code>.
Curious readers can
check out the <a target="_blank" href="https://github.com/JuliaLang/julia/blob/v1.12.0/NEWS.md">release notes</a>
for the full list of changes.</p>
<p>What are you most excited about
in Julia 1.12?
Let us know in the comments below!</p>
<h1 id="heading-additional-links">Additional Links</h1>
<ul>
<li><a target="_blank" href="https://github.com/JuliaLang/julia/blob/v1.12.0/NEWS.md">Julia v1.12 Release Notes</a><ul>
<li>Full list of changes made in Julia 1.12.</li>
</ul>
</li>
<li><a target="_blank" href="https://blog.glcs.io/series/julia-basics-programmers">Julia Basics for Programmers</a><ul>
<li>Series of blog posts covering Julia basics.</li>
</ul>
</li>
<li><a target="_blank" href="https://blog.glcs.io/series/diving-into-julia">Diving into Julia</a><ul>
<li>Series of blog posts covering more advanced Julia concepts.</li>
</ul>
</li>
</ul>
]]></content:encoded></item><item><title><![CDATA[Integrating Julia and MATLAB: Julia and MATLAB Can Coexist]]></title><description><![CDATA[This post is a cleaned-up transcript
of the JuliaCon 2025 talk
Integrating Julia and MATLAB: Julia and MATLAB Can Coexist
by Steven Whitaker.
Check out the submission page
for the talk abstract and summary.
To download a PowerPoint of the talk slides...]]></description><link>https://blog.glcs.io/juliacon-2025</link><guid isPermaLink="true">https://blog.glcs.io/juliacon-2025</guid><category><![CDATA[integration]]></category><category><![CDATA[Julia]]></category><category><![CDATA[Matlab]]></category><category><![CDATA[modeling]]></category><category><![CDATA[open source]]></category><dc:creator><![CDATA[Great Lakes Consulting]]></dc:creator><pubDate>Wed, 01 Oct 2025 22:36:04 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1756936911393/bD5U1v9rM.png?auto=format" length="0" type="image/jpeg"/><content:encoded><![CDATA[<blockquote>
<p>This post is a cleaned-up transcript
of the JuliaCon 2025 talk
<a target="_blank" href="https://www.youtube.com/watch?v=vvnfyVMwu_Y"><em>Integrating Julia and MATLAB: Julia and MATLAB Can Coexist</em></a>
by <strong>Steven Whitaker</strong>.
Check out the <a target="_blank" href="https://pretalx.com/juliacon-2025/talk/9WCTQR/">submission page</a>
for the talk abstract and summary.
To download a PowerPoint of the talk slides, <a target="_blank" href="https://pretalx.com/media/juliacon-2025/submissions/9WCTQR/resources/Integrating_EYNjCiI.pptx">click here</a>.</p>
</blockquote>
<h1 id="heading-introduction">Introduction</h1>
<p>Thank you for the introduction.
I'm going to be talking about Julia-MATLAB integration.
I'll start by
talking about why we might want to do
this. Why not just stick with MATLAB<sup>®</sup> or
why not transition entirely to Julia?
I'll talk about how to actually do the
integration and I'll walk through an
example model. I'll show you
the MATLAB code, show how it needs to be
translated to Julia and then
reintegrated back into the MATLAB
workflow that we're working with.
And I'll demonstrate some benchmark results
from before and after doing the Julia-MATLAB integration.</p>
<h1 id="heading-matlab-is-everywhere-julia-is-fast">MATLAB Is Everywhere, Julia Is Fast</h1>
<p>To begin, I do want to say that from
our perspective MATLAB is great
software. It's used by businesses and
people all over the world, and it has
withstood the test of time. MATLAB has
excellent tooling and documentation. It
has been the go-to for scientific
computing and engineering, and it is
ubiquitous in modeling and simulation.
By a quick raise of hands, who here
has written a model or a function in MATLAB?
I think literally everyone (in the JuliaCon audience).
Me too.
Engineers have had time to
amass decades worth of MATLAB models
and workflows, which is to say MATLAB
isn't going anywhere.</p>
<p>But is MATLAB the best option in all
cases? Now, there's this trade-off
between speed and what I'll call
reputation. MATLAB has an incredibly
good reputation. It has excellent
tooling. It's robust and it has a ton of
other awesome features. It has gained a
reputation for being reliable and it is
an established industry standard. Large
companies may have hundreds of MATLAB
models and thousands upon thousands of
lines of MATLAB code. But that makes it
impractical to switch all this MATLAB code
entirely over to Julia all at
once.</p>
<p>However, there may be instances where
MATLAB isn't the best choice to use.
For example, if you have performance
critical or large-scale models where speed
is absolutely paramount,
a language like Julia may be more
suitable for those cases.
Let's look at some concrete numbers.</p>
<h1 id="heading-benchmark-comparisons">Benchmark Comparisons</h1>
<p>I pulled some benchmarks from the
<a target="_blank" href="https://docs.sciml.ai/SciMLBenchmarksOutput/dev/MultiLanguage/ode_wrapper_packages/">SciML benchmarks website</a>. And as an aside for
those of you who haven't heard about the
SciML ecosystem, the Julia scientific
machine learning ecosystem is
incredible. So check it out if you
haven't yet.</p>
<p>So what these benchmarks do is
compare ODE solvers from various
languages on various ODEs and see
how these solvers compare in speed.
So for one example we have a non-stiff
problem here. We're plotting how long it
takes these ODE solvers to solve this
non-stiff problem. The Julia solvers are
in green, MATLAB in orange. And looking
at this plot, we see that for this
particular non-stiff problem, Julia is
10 to 1000 times faster than MATLAB.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1758905746087/oH3JBltqL.png?auto=format" alt="SciML benchmark for non-stiff problem" /></p>
<p>Now what about a stiff problem?
Similar story for this particular
stiff problem. Julia is 10 to 1000
times faster than MATLAB.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1758905762464/Qkqbc2NoH.png?auto=format" alt="SciML benchmark for stiff problem" /></p>
<p>Now consider if your long simulations
or parameter optimizations take hours or
even days to run in MATLAB.
You have to start asking yourself: is it
worth using MATLAB when time is money?
Does the reputation of MATLAB really
outweigh the cost in time?</p>
<h1 id="heading-introducing-julia-with-minimal-disruptions">Introducing Julia with Minimal Disruptions</h1>
<p>The key question is:
How do we get Julia's speed
but without rewriting all of our MATLAB
code in Julia? How can we get that speed
without major disruption, without
completely retooling everything?</p>
<p>So another question: does anyone here,
by a raise of hands, work with large MATLAB codebases?
All right, a few of you (in the JuliaCon audience).
Like I said,
companies have had decades to amass
tons of MATLAB code. And so if you're
working at a company that has a large
MATLAB codebase, if you use MATLAB
every day, your colleagues use
MATLAB every day, and, frankly, the code
works, it runs, it gives the results
that you need—these are all
valid reasons for sticking with MATLAB.
But the speed—Julia is fast! And
some might say that Julia is the
superior language. That's why
we're here at JuliaCon—we want to
use Julia!</p>
<p>The good news is that there
is a way to introduce Julia into your
MATLAB workflows, and you can do this
with minimal disruption.
You want to start small. Pinpoint the
particular pain point in your MATLAB
workflow and start
targeting that area. You want to keep
your existing MATLAB workflows intact.
That way you help keep people on board
with your project. But most
importantly, you also want to show
meaningful performance wins because
that ultimately gets people's
attention.</p>
<h1 id="heading-example-model">Example Model</h1>
<p>For the rest of my talk,
I will walk through the process of
doing Julia-MATLAB integration
using an example model.
The example model that I'll use
includes 50 equations and state
variables along with 37 model
parameters.</p>
<p>We'll also have discrete
events and continuous events.
For the discrete events, we will
update one of the parameters at four
specific time points along the
simulation. An example of this in
practice might be injecting a dose of
medicine or modifying the thrust
provided by a rocket.</p>
<p>We'll also have a continuous event where
we modify a state variable when it
reaches a certain value. A
classic example is when modeling
the velocity of objects that collide
with boundaries. For example, if you
drop a ball, that ball is going to fall
until it hits the ground. And when it
hits the ground, it doesn't keep
falling. There's an event that occurs
that causes the ball to bounce.
That's the sort of thing that's captured
by these continuous events.</p>
<p>For our example, we'll look at two
problems of interest. One is just simply
solving the ODE one time. How long does
it take to do that? But then in
practice, we don't normally solve an ODE
a single time and then call it
quits. Normally, we're solving ODEs many,
many times. So, we're also going to look
at doing a parameter sweep where we'll
solve this example model ODE with
about 200,000 different sets of model
parameters and then choose the one to
use in production.</p>
<h1 id="heading-model-plots">Model Plots</h1>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1758905775945/kpl8Nti_V.png?auto=format" alt="Example model time course plots" /></p>
<p>To illustrate this model a bit more,
here are three of the state variables
plotted over time. The first plot is a
state variable of interest. This
might represent
the position of a mechanical part, or
maybe it's the temperature of concrete
or some other substance that we're
modeling over time.</p>
<p>The second plot illustrates the discrete
events. At these particular time
points, we're changing one of the model
parameters, and we see how that
influences the time course for the state
variable.</p>
<p>And then the third plot
illustrates the continuous events where,
when this state variable reaches a value
of -10 or -2, we negate its derivative
and we see how the time course is
influenced as a result.</p>
<p>Looking back at the first state
variable here, I'm plotting it for two
different sets of model parameters. The
first set of model parameters results in
these large oscillations while the
second set of model parameters has
virtually no oscillations.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1758905785647/351tFwMhb.png?auto=format" alt="State variable time course for two parameter sets" /></p>
<h1 id="heading-matlab-ode-solve-timing">MATLAB ODE Solve Timing</h1>
<p>The first question is how long did it
take to solve this ODE one time so we could
get this plot.
Solving this ODE in MATLAB took about 20
milliseconds. Okay, so maybe that's not
so bad.</p>
<p>Now, going back to this plot,
like I mentioned, this state variable
might represent the position of a
mechanical part. And so if we have these
oscillations, we can imagine that those
oscillations cause wear on this
mechanical part. And as a result, we
would love to use the second set of
model parameters so that there's less
wear on that part, which means we don't
have to replace it as often, which saves
our company money. The problem is we
don't know what those parameters
are beforehand. Okay. So, what we'll do
is pick 200,000 different
sets of parameter values. We'll solve
the ODE with each of those sets of
parameter values and choose the
parameter set that minimizes the
oscillations so that we don't have to
replace this part so often.</p>
<p>Now, if we do this parameter sweep in
MATLAB, it takes about <strong>five hours</strong> to
run on my laptop.</p>
<p>Now, if you run this exactly one time,
maybe that's not such a huge deal. But
if this is part of continuous
integration or testing or if you do a
parameter sweep for many different
models and many different workflows, you
can see how the time starts to add up
quickly.</p>
<p>So, we want to see how Julia can fare in
this situation.
What we're going to do is we're going
to take our MATLAB parameter sweep and
translate that
into Julia but then reconnect it into
our MATLAB workflow.</p>
<h1 id="heading-julia-vs-matlab-for-simulations">Julia vs MATLAB for Simulations</h1>
<p>The first thing to mention when
converting from MATLAB to Julia is the
syntax. MATLAB and Julia
have a very similar syntax. MATLAB
was the language that I used primarily
before I started using Julia, and when I
started learning Julia it almost didn't
feel like I was learning a new language.
That's how similar the syntax was to me.</p>
<p>In terms of the dynamics, Julia is
designed for performance. In MATLAB,
every time the ODE solver calls out to your
dynamics function it has to allocate new
memory to store the results of
the computed state derivatives. Whereas
in Julia, you can allocate that memory
up front and then every time the ODE
solver calls out to your dynamics
function, it can reuse that memory,
which boosts performance.</p>
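<p>As a sketch of what this looks like in code (the function and parameter names here are illustrative, not the talk's actual model): in the in-place style that OrdinaryDiffEq.jl supports, the dynamics function writes its results into a preallocated derivative buffer instead of returning a freshly allocated array.</p>

```julia
# In-place dynamics: the solver passes a preallocated `du` buffer,
# and we write the state derivatives into it instead of allocating.
function dynamics!(du, u, p, t)
    du[1] = -p.k * u[1]  # simple exponential decay, for illustration
    return nothing
end

u = [2.0]          # state
du = similar(u)    # allocated once, reused on every call
p = (k = 3.0,)     # parameters
dynamics!(du, u, p, 0.0)
du[1]  # -6.0
```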
<h1 id="heading-events-and-callbacks">Events and Callbacks</h1>
<p>I found that in Julia working with
events and callbacks was more intuitive
and easier to do. For this particular
example, I could define them in
fewer lines of code in Julia. Julia
provides for better code organization
and it also provides many pre-built
callbacks that you can use out of the
box.</p>
<p>So, looking more into events and
callbacks:
here's some example code in MATLAB
and in Julia for defining events and
callbacks.</p>
<pre><code class="lang-matlab">%% MATLAB

% In events_func.m
v(n+1) = u(u49) + 2;
v(n+2) = u(u49) + 10;

% In callback_func.m
if any(ismember([n + 1, n + 2], ie))
    u(u50) = -u(u50);
end
</code></pre>
<pre><code class="lang-julia"><span class="hljs-comment"># Julia</span>

<span class="hljs-comment"># In callbacks.jl</span>
event1(u, t, integrator) = u[u49] + <span class="hljs-number">2</span>
event2(u, t, integrator) = u[u49] + <span class="hljs-number">10</span>
callback!(integrator) = integrator.u[u50] = -integrator.u[u50]
ContinuousCallback(event1, callback!)
ContinuousCallback(event2, callback!)
</code></pre>
<p>In MATLAB we have one file for the
events and then a separate file for all
the callbacks. Whereas in Julia we can
define the events and the corresponding
callbacks together in the same file
which allows for better code
organization.</p>
<p>Additionally, in MATLAB we have a single
function that defines all of the events
and then we have a single function that
defines all of the callbacks. In Julia
we can be more modular. We can define a
single function per event, a single
function per callback.</p>
<p>And then finally,
in MATLAB there's more bookkeeping that
we have to do on our own. We have to
keep track of indexes of the events. We
have to manually check if events occur
and manually call the correct callbacks
if those events occur. Whereas Julia
takes care of all of that for us. In
Julia, all we have to do is explicitly
pair an event with its callback and then
Julia takes care of the rest. Julia will
check if the event occurs. It will then
call the correct callback that you
assigned it and we don't have to worry
about that ourselves.</p>
<p>So again, I found
that working with events and callbacks
in Julia was more intuitive and
easier to do.</p>
<h1 id="heading-julia-is-easy-to-learn">Julia Is Easy to Learn</h1>
<p>Now, another consideration when adopting
new software is how much learning
you have to do to get up and
running with it. I already mentioned
the syntax. Julia syntax is similar to
MATLAB, and so that isn't much of a
barrier to entry.</p>
<p>Then there's also what packages are
available and how quickly can I
learn those packages. For this
particular example I only needed to use
two Julia packages to get it up and
running. I used <a target="_blank" href="https://github.com/SciML/DiffEqCallbacks.jl">DiffEqCallbacks.jl</a> for
the events and callbacks and
<a target="_blank" href="https://github.com/SciML/OrdinaryDiffEq.jl">OrdinaryDiffEq.jl</a> for setting up and solving the
ODE.</p>
<p>And then perhaps more importantly:
part of learning the package is going
through its documentation and
learning how to use the package from the
documentation. So how good is the
documentation?</p>
<p>Particularly for the SciML ecosystem,
but also other ecosystems in Julia, the
documentation is excellent. So if you
look at the <a target="_blank" href="https://docs.sciml.ai/OrdinaryDiffEq/stable/">OrdinaryDiffEq.jl documentation</a>
it will clearly go through how to set up
an ODE problem, how to solve it, and what
options are available for solving. It
also has a list of different
solvers and shows you how to swap
out different solvers with essentially
just a single line of code change in
Julia.</p>
<p>Then the <a target="_blank" href="https://docs.sciml.ai/DiffEqCallbacks/stable/">DiffEqCallbacks.jl documentation</a>
is similarly good. It
describes quite a few ready-to-use
callbacks that you can use. There's the
<code>PresetTimeCallback</code> for the discrete
events at specific times. There are also
more advanced callbacks such as those
for step size control.</p>
<p>And then kind of the cherry on top for
converting from MATLAB to Julia is
there's one of the pages of the
documentation that is a <a target="_blank" href="https://docs.sciml.ai/DiffEqDocs/stable/solvers/ode_solve/#Translations-from-MATLAB/Python/R">solver comparison</a>.
Basically, it
lists the MATLAB solvers and
then the corresponding
Julia solvers you can use. So if in MATLAB
you're using the <code>ode45</code> solver, this
web page will say, "okay, the corresponding
Julia solver is <code>DP5</code>, but then there are
also these other solvers you might want
to try that might perform better."</p>
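<p>To make that concrete,
here is roughly what the one-line solver swap looks like
(a sketch; <code>prob</code> is assumed to be an already-defined <code>ODEProblem</code>):</p>
<pre><code class="lang-julia">using OrdinaryDiffEq

sol = solve(prob, DP5())    # closest analogue of MATLAB's ode45

# Trying a different solver is a one-line change:
sol = solve(prob, Tsit5())  # often a good default for non-stiff problems
</code></pre>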
<p>So, the excellent documentation, the
excellent package ecosystem, and the
similar syntax between Julia and MATLAB
really make transitioning from MATLAB
to Julia a breeze.</p>
<h1 id="heading-julia-matlab-integration-demo">Julia-MATLAB Integration Demo</h1>
<p>To do the actual integration, we use
the <a target="_blank" href="https://github.com/ASML-Labs/MATFrost.jl">MATFrost.jl</a> package. MATFrost.jl is a
package that was developed by ASML. So,
thank you to ASML for developing the
package and open sourcing it for the
Julia community to use. And I will
illustrate the process of integrating
Julia back into MATLAB via a live demo.
So, I'm going to switch over to MATLAB.</p>
<pre><code class="lang-matlab">% demo.m

[u0, tspan, p] = get_inputs();
[p_opt_indexes, grid_vals, lb_cons, ub_cons] = get_gridsearch_inputs();

tic
p_opt = gridsearch(u0, tspan, p, p_opt_indexes, grid_vals, lb_cons=lb_cons, ub_cons=ub_cons);
toc

sol0 = solve(u0, tspan, p);
p.p(p_opt_indexes) = p_opt;
sol = solve(u0, tspan, p);

plot(sol0.Time, sol0.Solution(1,:), Color="r", LineWidth=2);
hold on
plot(sol.Time, sol.Solution(1,:), Color="b", LineWidth=2);
hold off
legend("Before", "After");
title("u1 Before and After Parameter Sweep");
xlabel("Time");
ylabel("Value");
</code></pre>
<p>Alright. So, here's the MATLAB
workflow we're going to work with.
In this case, it's fairly minimal.
That's for demonstration purposes, but
the main point is we have some
workflow, but there's a particular part
of the workflow that's causing us grief,
and we want that to be faster, but we
want to make it faster without entirely
destroying or disrupting our workflow.</p>
<p>So, I'm going to start this running.</p>
<h2 id="heading-the-workflow">The Workflow</h2>
<p>In this workflow, we have some
pre-processing. We're just getting
inputs ready for our computation. The
core computation in this workflow is
this <code>gridsearch</code> function.
This <code>gridsearch</code> function is our
implementation of that parameter sweep
that I mentioned.
And then once we have the output, we do
some post-processing. In this case,
we're just plotting values.</p>
<p>But again,
the main point is that we have a
workflow, but there's a pain point in
that workflow that we want to be better.
We want to make it better with Julia.</p>
<h2 id="heading-matlab-results">MATLAB Results</h2>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1758905796751/XKLhs8oZP.png?auto=format" alt="Demo MATLAB results" /></p>
<p>So there are our results.
Alright, so here I'm plotting the first
state variable before the parameter
sweep and after the parameter sweep. And
as expected, we see that the
oscillations are smaller than before.
And then if we look at the elapsed time,
it took about 40 seconds for this to run
in MATLAB.</p>
<h2 id="heading-translating-to-julia">Translating to Julia</h2>
<p>Okay. So now we want Julia. So, the first
step is to take the part of the workflow
that we want to improve, and we're going
to translate that into
Julia.</p>
<pre><code class="lang-julia"><span class="hljs-comment"># JuliaModel/src/gridsearch.jl</span>

<span class="hljs-keyword">function</span> gridsearch(
    u0::<span class="hljs-built_in">Vector</span>{<span class="hljs-built_in">Float64</span>},
    tspan::<span class="hljs-built_in">Vector</span>{<span class="hljs-built_in">Float64</span>},
    p::<span class="hljs-built_in">Vector</span>{<span class="hljs-built_in">Float64</span>},
    cb_times::<span class="hljs-built_in">Vector</span>{<span class="hljs-built_in">Float64</span>},
    cb_values::<span class="hljs-built_in">Vector</span>{<span class="hljs-built_in">Float64</span>},
    p_opt_indexes::<span class="hljs-built_in">Vector</span>{<span class="hljs-built_in">Int64</span>},
    grid_vals::<span class="hljs-built_in">NTuple</span>{<span class="hljs-number">5</span>, <span class="hljs-built_in">Vector</span>{<span class="hljs-built_in">Float64</span>}},
    lb_cons::<span class="hljs-built_in">Float64</span>,
    ub_cons::<span class="hljs-built_in">Float64</span>,
    abstol::<span class="hljs-built_in">Float64</span>,
    reltol::<span class="hljs-built_in">Float64</span>,
)

    <span class="hljs-comment"># `p` is updated in place during the callbacks,</span>
    <span class="hljs-comment"># so take a copy that will be used to re-initialize `p`</span>
    <span class="hljs-comment"># at the start of each loop iteration below.</span>
    p_original = copy(p)

    callback = get_callbacks(cb_times, cb_values)

    prob = ODEProblem{<span class="hljs-literal">true</span>}(dynamics!, u0, (tspan[<span class="hljs-number">1</span>], tspan[<span class="hljs-number">2</span>]), p)
    solver = DP5()

    sol0 = OrdinaryDiffEq.solve(prob, solver; callback, abstol, reltol)
    loss0 = max_variation(sol0)

    losses = <span class="hljs-built_in">Array</span>{<span class="hljs-built_in">Float64</span>, length(grid_vals)}(undef, length.(grid_vals)...)
    loss_opt = loss0
    p_opt = p_original[p_opt_indexes]
    <span class="hljs-keyword">for</span> (i, p_vals) <span class="hljs-keyword">in</span> enumerate(Iterators.product(grid_vals...))
        p .= p_original
        p[p_opt_indexes] .= p_vals
        new_prob = remake(prob; p)
        sol = OrdinaryDiffEq.solve(new_prob, solver; callback, abstol, reltol)
        <span class="hljs-keyword">if</span> constraints_met(sol, lb_cons, ub_cons)
            l = max_variation(sol)
            <span class="hljs-keyword">if</span> l &lt; loss_opt
                loss_opt = l
                p_opt .= p_vals
            <span class="hljs-keyword">end</span>
            losses[i] = l
        <span class="hljs-keyword">else</span>
            losses[i] = <span class="hljs-literal">Inf</span>
        <span class="hljs-keyword">end</span>
    <span class="hljs-keyword">end</span>

    <span class="hljs-keyword">return</span> (p_opt, loss_opt, loss0, losses)

<span class="hljs-keyword">end</span>
</code></pre>
<p>So, we take the <code>gridsearch</code>
function, we look at its implementation,
we translate it into Julia, and we end up
with a Julia function. I'm calling it
<code>gridsearch</code> just like the MATLAB
function. It takes the same inputs.
It does the parameter sweep. It returns
the same outputs as the MATLAB version.
So the same computation, just now written
in Julia instead of MATLAB.</p>
<p>And then we're going to house this
function inside of a Julia package. In
this case I called the Julia package
<code>JuliaModel</code>. So now we have a Julia
package. It has our Julia implementation
of <code>gridsearch</code>, our parameter sweep.</p>
<h2 id="heading-using-matfrostjl">Using MATFrost.jl</h2>
<p>Now
we need to build a bridge using MATFrost.jl.
So, MATFrost.jl needs to be a dependency
of our Julia package, even if the
package doesn't use it at all, because
what we're going to do is
instantiate our Julia package's package
environment. Then import
MATFrost.jl and then call
<code>MATFrost.install</code>.
And what this does is it takes our Julia
package and creates a MATLAB class out
of that package; in this case, it
uses the same name.</p>
<p>So now we have
a <code>JuliaModel</code> Julia package, but now
we have a <code>JuliaModel</code> MATLAB class,
which we'll use to interface with
Julia. So this is what MATFrost.jl
created for us.</p>
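<p>In code, building the bridge looks something like the following.
This is a sketch of the process just described;
check the MATFrost.jl documentation for the exact
arguments <code>MATFrost.install</code> accepts.</p>
<pre><code class="lang-julia"># Run from within the JuliaModel package directory.
using Pkg
Pkg.activate(".")   # activate the package's environment
Pkg.instantiate()   # install its dependencies (including MATFrost.jl)

using MATFrost
MATFrost.install()  # generate the JuliaModel MATLAB class
</code></pre>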
<h2 id="heading-minimizing-disruptions">Minimizing Disruptions</h2>
<p>So now that we have that MATLAB class,
we need to start using it to
interface with Julia. Now, we
could add this directly to the workflow
itself, but again, we want to minimize
disruptions to the workflow, especially
if we have colleagues that are also
working on this workflow simultaneously
with us.</p>
<p>So, the key idea here is we want
calling out to Julia to be
an implementation detail from the
perspective of the workflow. So we don't
want the workflow to care that we're
calling out to Julia. We want the
workflow to work as is.</p>
<p>So, what we're
going to do is
create a new MATLAB function
that internally
calls out to Julia. This MATLAB
function I'm going to call <code>gridsearch_julia</code>,
and importantly it needs the same
API as whatever function we're
replacing. So, I'm going to have it take
the same inputs. It's going to return
the same output,
but then internally there's not much to
it, again, because all the functionality
is in Julia.</p>
<pre><code class="lang-matlab">% gridsearch_julia.m

function [p_opt, loss_opt, loss0, losses] = gridsearch_julia(u0, tspan, p, p_opt_indexes, grid_vals, options)

    arguments

        u0
        tspan
        p
        p_opt_indexes
        grid_vals
        options.lb_cons = 1.65
        options.ub_cons = 2.25
        options.abstol = 1e-7
        options.reltol = 1e-7

    end

    jl = get_julia();

    out = jl.gridsearch(u0, tspan, p.p, p.cb_times, p.cb_values, p_opt_indexes, grid_vals, ...
        options.lb_cons, options.ub_cons, options.abstol, options.reltol);

    [p_opt, loss_opt, loss0, losses] = out{:};

end
</code></pre>
<p>Inside this function, I first call
<code>get_julia</code>,
which is a function I wrote that
instantiates a <code>JuliaModel</code> object. Then
with that object I can interface with
Julia. So if I do <code>jl.gridsearch</code> right
here, this <code>gridsearch</code> is not the MATLAB
function. This is the <code>gridsearch</code> that I
wrote in Julia in my Julia package. So
what we're doing here is taking
our MATLAB values and passing
them over to Julia. Julia is going to
run the <code>gridsearch</code> function I defined
in Julia. It'll do the parameter sweep
and then it will return a <code>Tuple</code> of
values which is a cell array in MATLAB.
So I just unpack that cell array to get
the appropriate output for this
function.</p>
<p>But again the main point to
minimize disruption for the workflow is
that this new function needs to
have the same API so that from
the workflow's perspective nothing
changes.</p>
<h2 id="heading-julia-matlab-results">Julia-MATLAB Results</h2>
<p>Let's see how this works. So,
I'm just going to change the one
function.
(Change <code>gridsearch</code> to <code>gridsearch_julia</code> in <code>demo.m</code>.)
Importantly I'm not changing any of the
pre-processing or post-processing. I'm
not changing any of the inputs or
outputs. And then if I run it I get my
result.
The plot looks pretty much the same as
last time, maybe exactly the same.
And if we look at the timing, it was
<strong>100 times faster</strong>.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1758905806380/jl1yrB7uk.png?auto=format" alt="Demo Julia-MATLAB results" /></p>
<p>So, from the workflow's perspective,
essentially nothing changed. I called
out to a new function, but didn't have
to change any of the pre- or
post-processing,
but now the workflow runs 100 times
faster than before.</p>
<p>So this was a 40-second version of
the parameter sweep, not the five-hour
version. So, does it scale? Let's look
at some other benchmarks.</p>
<h1 id="heading-ode-solve-timing-matlab-vs-julia-matlab">ODE Solve Timing: MATLAB vs Julia-MATLAB</h1>
<p>Alright, first looking at a single ODE
solve.
If we just did a single ODE solve in
Julia and then did the Julia-MATLAB
integration, that ODE solve takes 10
milliseconds. So, it's twice as fast as
the pure MATLAB version. I will note
that there is a significant overhead to
using MATFrost.jl for the first call, but
subsequent calls are faster. So a 2x
speed up, that's something, but it's not
the 10 to 1000 times speed up that
we were seeing with the SciML benchmarks.
But again, we don't typically solve
an ODE a single time.</p>
<p>So what about the
parameter sweep? You'll recall that in
MATLAB the parameter sweep with 200,000
ODE solves took about five hours to run.
In Julia, that time dropped to <strong>less than two minutes</strong>.
So that's <strong>163 times faster</strong> than the
pure MATLAB version.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1758905818353/_gUQrpx5p.png?auto=format" alt="Julia is speed" /></p>
<p>So I think that's
fast enough for this meme. I am speed.
Julia is speed. So that's amazing. We
took our workflow,
we created a new MATLAB function
that called out to Julia internally (but
from the workflow's perspective nothing
really changed) but now it's 163 times
faster.</p>
<h1 id="heading-uncompromised-simulation-accuracy">Uncompromised Simulation Accuracy</h1>
<p>Okay, so it's faster, but now the other
question is are the results still good
or are they garbage now?</p>
<p>The results
are still good. Here, I'm plotting the
first state variable before and after
the parameter sweep; the MATLAB results
are in red, the blue dots are
the Julia results. And we see that the
Julia results align with the MATLAB
results.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1758905827819/EunsrZ3jf.png?auto=format" alt="MATLAB vs Julia-MATLAB results" /></p>
<p>I also checked all the other
state variables, and in all cases all of
the Julia-MATLAB integrated results
are within 1/100th of a percent compared
to the MATLAB results. So, we
achieve the accuracy that we want for
our simulations at a fraction of the
computational cost.</p>
<h1 id="heading-julia-matlab-integration-with-glcs">Julia-MATLAB Integration with GLCS</h1>
<p>So now if you want these speed ups in
your code, but you're not quite sure how to
start, feel free to reach out to us at
GLCS. We have an approach for helping
clients pinpoint these areas of concern
in their MATLAB workflows. We help
translate the code to Julia and
reintegrate it back into MATLAB and we
ensure that we meet the accuracy and
performance metrics that we need to.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1758905837333/Ml-T2Y7DH.png?auto=format" alt="Julia-MATLAB integration with GLCS" /></p>
<h1 id="heading-conclusion">Conclusion</h1>
<p>So, unlock up to <strong>100 times faster performance</strong>
by integrating Julia. In
today's example, we even saw speedups
greater than 100x. Adopt Julia
seamlessly. Integrate it step by step
without disrupting your MATLAB
workflows. So, enhance the MATLAB code
that you have. Elevate its capabilities
by integrating Julia and taking
advantage of its speed and flexibility.</p>
<p>Thank you.</p>
<h1 id="heading-qampa">Q&amp;A</h1>
<ol>
<li><p><em>Audience:
Can you show us that <code>get_julia</code> function you mentioned?</em></p>
<p>Yeah. So we want to see the <code>get_julia</code>
function that I wrote in MATLAB.
So yes,
here it is.
So basically I made use of a persistent
variable to make sure I'm not
reinitializing
the Julia instance every time.
So I'm reusing the same Julia instance
every time I call this function. But the
first time there's that overhead that I
mentioned.</p>
<p><em>Audience:
Your MATLAB code isn't showing.</em></p>
<pre><code class="lang-matlab">function jl = get_julia()

    persistent jl_
    if isempty(jl_)
        mjl = JuliaModel();
        jl_ = mjl.JuliaModel;
    end

    jl = jl_;

end
</code></pre>
<p>Oh, thank you. I need to actually
get out of the slideshow.
There we go.
Okay. Persistent variable. So that way
on subsequent calls to the function, I'm
using the same initialization of the
Julia runtime. So we're not wasting
resources every time.
But yeah, here is where we're actually
initializing the <code>JuliaModel</code> MATLAB
object.
Yeah, there's nothing to it. There's no
parameters to—I mean you can change the Julia version
if you want but if you have everything
set up in your PATH you don't need to
pass anything to it.</p>
</li>
<li><p><em>Audience:
Just a quick question, about how long did
it take to, when you call MATFrost.jl
to install <code>JuliaModel</code>, how long
does that take?</em></p>
<p>That's a good question. I don't remember
off the top of my head.</p>
<p><em>Audience:
Well, it was quick, I'm assuming, or
it just builds a little bit of MATLAB
code, right?</em></p>
<p>Right, if I remember correctly, it
was on the order of seconds, maybe minutes, but
nothing terrible.</p>
</li>
<li><p><em>Audience:
Does MATFrost.jl work with Linux or
is it still like only for Windows?</em></p>
<p>As far as I know, it's only supported
for Windows.</p>
</li>
<li><p><em>Audience:
Some of these numbers scare me a little
with the performance comparisons
between Julia and MATLAB. I know
there's a huge difference in LAPACK
and BLAS versioning, right, MATLAB
typically goes for a more numerical
stable version of LAPACK versus if
Julia will use OpenBLAS by default, I
think (I might be wrong on that). Could
you tell me what version of each you're
using for both?</em></p>
<p>I do not know what versions I'm
using.</p>
<p><em>Audience:
You just, in the MATLAB terminal, just
do <code>version -lapack</code>.
There's an "i" in "version".</em></p>
<p>Oh, thank you. Is it capital LAPACK or
lower?</p>
<p><em>Audience:
That's correct. Yeah.
Okay, that is a faster one. Okay.</em></p>
</li>
<li><p><em>Audience:
Can you actually like create structs or
more complicated types than arrays with
this?</em></p>
<p>So can you create structs and arrays in
Julia, are you saying?</p>
<p><em>Audience:
No no, like in MATLAB can you like create
structs from Julia? Like if you define a
data type in Julia can you like create
it here in MATLAB?</em></p>
<p>Yeah. So one of the key things to
working with MATFrost.jl is everything
needs to be concretely typed.
So, like, if you create a struct, you can
have nested structs and they basically
get translated to MATLAB structs, but you
need to concretely type it as a <code>Float64</code>
or an <code>Int</code> depending on whatever data
type you have. So, no abstract types,
and the
function calls also need to be typed.</p>
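<p>As a hypothetical illustration
of what "concretely typed" means here:</p>
<pre><code class="lang-julia"># OK: every field has a concrete type,
# so it can be translated to a MATLAB struct.
struct Settings
    abstol::Float64
    reltol::Float64
end

struct ModelInputs
    u0::Vector{Float64}
    settings::Settings  # nested structs are fine
end

# Not OK: abstract or untyped fields.
struct BadInputs
    u0::AbstractVector  # abstract type
    extra               # untyped (i.e., Any)
end
</code></pre>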
</li>
<li><p><em>Audience:
I remember a long time ago
when I was calling C code from MATLAB it
had to use these MEX files. Is that
what it's using under the hood here or
is there some new fancier way of calling
external code and MEX files that this is
using?</em></p>
<p>Yeah, so MATFrost.jl does use a MEX
file to work.</p>
</li>
</ol>
<h1 id="heading-additional-links">Additional Links</h1>
<ul>
<li><a target="_blank" href="https://pretalx.com/juliacon-2025/talk/9WCTQR/">JuliaCon Talk Submission Page</a><ul>
<li>Submission page for the talk,
including an abstract and summary.</li>
</ul>
</li>
<li><a target="_blank" href="https://glcs.io/julia-matlab">Julia and MATLAB<sup>®</sup> Integration at GLCS</a><ul>
<li>Our web page detailing our Julia and MATLAB<sup>®</sup> integration service.</li>
</ul>
</li>
</ul>
<p>MATLAB is a registered trademark
of The MathWorks, Inc.</p>
<p>Cover image:
The JuliaCon 2025 logo
was obtained from <a target="_blank" href="https://juliacon.org/2025/">https://juliacon.org/2025/</a>.</p>
]]></content:encoded></item><item><title><![CDATA[MATLAB vs Julia: Best Programming Language for Renewable Energy Simulations]]></title><description><![CDATA[This post was written by Steven Whitaker.

At GLCS, we're proud to have delivered innovative projects 
across top industries, including renewable energy, aerospace, and biomedical engineering.
Our comprehensive Modeling and Simulation Services
empowe...]]></description><link>https://blog.glcs.io/matlab-vs-julia-renewable-energy</link><guid isPermaLink="true">https://blog.glcs.io/matlab-vs-julia-renewable-energy</guid><category><![CDATA[#differential-equations]]></category><category><![CDATA[Julia]]></category><category><![CDATA[Matlab]]></category><category><![CDATA[General Programming]]></category><category><![CDATA[renewable-energy]]></category><dc:creator><![CDATA[Great Lakes Consulting]]></dc:creator><pubDate>Mon, 29 Sep 2025 18:06:32 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1758741434158/cdLbQr39d.png?auto=format" length="0" type="image/jpeg"/><content:encoded><![CDATA[<blockquote>
<p>This post was written by <strong>Steven Whitaker</strong>.</p>
</blockquote>
<p>At GLCS, we're proud to have delivered innovative projects 
across top industries, including renewable energy, aerospace, and biomedical engineering.
Our comprehensive Modeling and Simulation Services
empower clients to elevate their designs,
whether rewriting models in Julia
for greater efficiency or unlocking cutting-edge features
to solve complex problems.
With deep expertise spanning several engineering and scientific domains,
including computational fluid dynamics,
thermodynamics,
controls,
biomedical engineering,
and chemistry,
we are your trusted partner in pushing modeling boundaries
and achieving breakthrough results.</p>
<p>In this post,
we'll focus on systems modeling
for renewable energy.</p>
<p>Renewable energy systems,
from wind farms to wave power,
require precise modeling and simulation.
Selecting the optimal computational tools is crucial;
it can significantly accelerate
development, reduce costs, and drive the transition
to a sustainable future.</p>
<p>Both MATLAB<sup>®</sup> and Julia are widely used in engineering,
particularly for solving differential equations.
This post compares how each handles
a renewable energy modeling scenario,
the <strong>steady axisymmetric turbulent wake</strong>
behind a wind turbine,
ultimately showing why Julia has the edge.
Maximize efficiency and energy output
with cutting-edge technologies
designed for tomorrow’s energy.</p>
<h1 id="heading-steady-axisymmetric-turbulent-wake">Steady Axisymmetric Turbulent Wake</h1>
<p>When air flows past a wind turbine,
a wake forms downstream.
The wake velocity deficit impacts turbine spacing,
efficiency,
and power output.</p>
<p>A simplified steady axisymmetric turbulent wake equation is:</p>
<p>\[
\frac{\partial U}{\partial x} = \nu_t \cdot \left( \frac{\partial^2 U}{\partial r^2} + \frac{1}{r} \frac{\partial U}{\partial r} \right)
\]</p><p>where:</p>
<ul>
<li>\( U \) is the axial velocity.</li>
<li>\( x \) is the downstream distance.</li>
<li>\( r \) is the radial coordinate.</li>
<li>\( \nu_t \) is the turbulent viscosity.</li>
</ul>
<p>This is a reduced form of the momentum equation,
capturing diffusion of momentum
due to turbulence.</p>
<h1 id="heading-matlab-vs-julia-odepde-system-set-up">MATLAB vs Julia: ODE/PDE System Set-up</h1>
<p>Let's see how to convert the math into code
and solve this PDE.</p>
<ul>
<li><p>MATLAB:</p>
<pre><code class="lang-matlab">function dUdx = wake_eq(x, U, p)
    % Radial step size
    dr = p.r(2) - p.r(1);

    % First derivative (central difference)
    dUdr = (U(3:end) - U(1:end-2)) / (2 * dr);

    % Second derivative (central difference)
    d2Udr2 = (U(3:end) - 2 * U(2:end-1) + U(1:end-2)) / dr^2;

    dUdx = zeros(size(U));
    dUdx(2:end-1) = p.nu_t * (d2Udr2 + dUdr ./ p.r(2:end-1));
end

U0 = initial_profile();
xspan = [0 100];
p.nu_t = 0.05;
p.r = linspace(0, 1, numel(U0));

prob = ode;
prob.ODEFcn = @wake_eq;
prob.InitialTime = xspan(1);
prob.InitialValue = U0;
prob.Parameters = p;
prob.Solver = "ode45";

sol = solve(prob, xspan(1), xspan(2));
</code></pre>
</li>
<li><p>Julia:</p>
<pre><code class="lang-julia"><span class="hljs-keyword">using</span> DifferentialEquations

<span class="hljs-meta">@kwdef</span> <span class="hljs-keyword">mutable struct</span> WakeParams{T}
    ν_t::<span class="hljs-built_in">Float64</span>
    <span class="hljs-keyword">const</span> r::T
<span class="hljs-keyword">end</span>

<span class="hljs-keyword">function</span> wake_eq!(dU, U, p, x)
    <span class="hljs-comment"># Radial step size</span>
    dr = p.r[<span class="hljs-number">2</span>] - p.r[<span class="hljs-number">1</span>]

    <span class="hljs-comment"># First derivative (central difference)</span>
    dUdr = (U[<span class="hljs-number">3</span>:<span class="hljs-keyword">end</span>] .- U[<span class="hljs-number">1</span>:<span class="hljs-keyword">end</span>-<span class="hljs-number">2</span>]) ./ (<span class="hljs-number">2</span> * dr)

    <span class="hljs-comment"># Second derivative (central difference)</span>
    d2Udr2 = (U[<span class="hljs-number">3</span>:<span class="hljs-keyword">end</span>] .- <span class="hljs-number">2</span> .* U[<span class="hljs-number">2</span>:<span class="hljs-keyword">end</span>-<span class="hljs-number">1</span>] .+ U[<span class="hljs-number">1</span>:<span class="hljs-keyword">end</span>-<span class="hljs-number">2</span>]) ./ dr^<span class="hljs-number">2</span>

    dU[<span class="hljs-number">1</span>] = dU[<span class="hljs-keyword">end</span>] = <span class="hljs-number">0</span>
    dU[<span class="hljs-number">2</span>:<span class="hljs-keyword">end</span>-<span class="hljs-number">1</span>] .= p.ν_t .* (d2Udr2 .+ dUdr ./ p.r[<span class="hljs-number">2</span>:<span class="hljs-keyword">end</span>-<span class="hljs-number">1</span>])
<span class="hljs-keyword">end</span>

U0 = initial_profile()
xspan = (<span class="hljs-number">0.0</span>, <span class="hljs-number">100.0</span>)
p = WakeParams(; ν_t = <span class="hljs-number">0.05</span>, r = range(<span class="hljs-number">0.0</span>, <span class="hljs-number">1.0</span>, length(U0)))

prob = ODEProblem(wake_eq!, U0, xspan, p)
solver = Tsit5()

sol = solve(prob, solver)
</code></pre>
</li>
</ul>
<p>(Note that, in practice,
the boundary conditions
for \( r = 0 \) and \( r = 1 \)
would need to be handled with more care.)</p>
<p>As you can see,
the syntax for Julia and MATLAB
is quite similar.
However,
there are some key differences
between the two approaches:</p>
<ul>
<li>Julia's broadcasting (<code>.=</code>, <code>.*</code>, etc.) is explicit and fast.</li>
<li>Julia can use in-place functions
to minimize memory allocations,
boosting performance.</li>
<li>The Julia code for the dynamics above was written
to look more like the MATLAB code.
However,
additional easy performance optimizations are possible
to reduce memory allocations.</li>
<li>Julia's <a target="_blank" href="https://github.com/SciML/DifferentialEquations.jl">DifferentialEquations.jl</a> supports <a target="_blank" href="https://docs.sciml.ai/DiffEqDocs/stable/solvers/ode_solve/">many solvers</a>
with a unified interface.
MATLAB supports only a limited set of solvers.</li>
</ul>
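<p>As an example of one such optimization:
the array slicing in <code>wake_eq!</code> allocates temporary arrays
on every call,
while a loop-based version (a sketch, not benchmarked here)
computes the same derivatives without allocating:</p>
<pre><code class="lang-julia"># Allocation-free variant of `wake_eq!`:
# compute both finite differences point-by-point in a single loop.
function wake_eq_noalloc!(dU, U, p, x)
    dr = p.r[2] - p.r[1]
    dU[1] = dU[end] = 0
    @inbounds for i in 2:length(U)-1
        dUdr = (U[i+1] - U[i-1]) / (2 * dr)
        d2Udr2 = (U[i+1] - 2 * U[i] + U[i-1]) / dr^2
        dU[i] = p.ν_t * (d2Udr2 + dUdr / p.r[i])
    end
    return nothing
end
</code></pre>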
<h1 id="heading-event-handling-and-callbacks-wind-gusts">Event Handling and Callbacks: Wind Gusts</h1>
<p>Now let's add a gust at \( x = 50 \)
that will change \( \nu_t \)
for \( x \ge 50 \).</p>
<ul>
<li><p>MATLAB:</p>
<pre><code class="lang-matlab">function v = events_func(x, U, p, gust_position)
    % Event occurs when `x == gust_position`.
    v = x - gust_position;
end

function [stop, U, p] = callbacks_func(x, U, ie, p)
    stop = false;
    % Check if the event occurred.
    if ismember(1, ie)
        p.nu_t = 0.08;
    end
end

gust_position = 50;

event = odeEvent;
event.EventFcn = @(x, U, p) events_func(x, U, p, gust_position);
event.Response = "callback";
event.CallbackFcn = @callbacks_func;

prob.EventDefinition = event;

sol = solve(prob);
</code></pre>
</li>
<li><p>Julia:</p>
<pre><code class="lang-julia"><span class="hljs-keyword">function</span> gust_affect!(integrator)
    integrator.p.ν_t = <span class="hljs-number">0.08</span>
<span class="hljs-keyword">end</span>

gust_position = <span class="hljs-number">50.0</span>;
callback = PresetTimeCallback([gust_position], gust_affect!)

sol = solve(prob, solver; callback)
</code></pre>
</li>
</ul>
<p>When it comes to events and callbacks,
Julia's approach is better:</p>
<ul>
<li>Julia's callbacks are better organized;
events and corresponding callbacks
can be defined next to each other
in the code,
instead of separated across files
as is typical in MATLAB.</li>
<li>Multiple events can be combined cleanly
with <code>CallbackSet</code>,
no need to try to cram multiple events
into a single function
like you have to do in MATLAB.</li>
<li>Julia keeps track of what events are triggered.
In MATLAB,
you have to check
what events were triggered manually.</li>
<li>Julia provides many pre-defined callbacks
(in <a target="_blank" href="https://docs.sciml.ai/DiffEqCallbacks/stable/">DiffEqCallbacks.jl</a>)
that often have to be manually implemented in MATLAB.</li>
</ul>
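<p>For example, a second, hypothetical gust event
could be combined with the first
using <code>CallbackSet</code>
(a sketch building on the Julia snippet above;
the second event's time and value are made up):</p>
<pre><code class="lang-julia">calm_affect!(integrator) = (integrator.p.ν_t = 0.03)  # hypothetical second event
calm = PresetTimeCallback([75.0], calm_affect!)

# Combine both callbacks; `callback` is the gust callback defined above.
sol = solve(prob, solver; callback = CallbackSet(callback, calm))
</code></pre>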
<h1 id="heading-other-considerations">Other Considerations</h1>
<p>In addition to the differences
in solving differential equations,
here are some other key differences
between Julia and MATLAB:</p>
<ul>
<li><em>Performance:</em>
Julia's JIT-compilation and method specialization
yield C-like speeds.
Large-scale renewable energy simulations
run faster and scale better,
especially for parameter sweeps.
You can expect to see 50--150 times faster code with Julia.</li>
<li><em>Workflow:</em>
MATLAB offers a polished GUI and plotting out of the box.
However,
Julia offers open-source freedom,
easy integration with Python/C,
and a thriving ecosystem.</li>
<li><em>Licensing:</em>
MATLAB requires a paid license;
Julia is entirely free and open-source.</li>
</ul>
<h1 id="heading-summary">Summary</h1>
<p>In this post,
we saw how Julia and MATLAB compare
for defining and solving
a steady axisymmetric turbulent wake.
Both languages can model renewable energy systems effectively.
However, Julia offers:</p>
<ul>
<li><strong>Performance</strong>: JIT speed for large systems.</li>
<li><strong>Flexibility</strong>: Powerful callbacks and open integrations.</li>
<li><strong>Cost</strong>: No license fees.</li>
<li><strong>Modern syntax</strong>: Designed for productivity and clarity.</li>
</ul>
<p>Ready to revolutionize your renewable energy simulations? 
Transition your MATLAB models to Julia
and experience unparalleled speed, flexibility, and long-term maintainability.
<a target="_blank" href="https://glcs.io/modeling-simulation/">Reach out</a> today and let's accelerate
your green energy innovations together!</p>
<p>Worried about the technical hurdles and costs
of switching from MATLAB to Julia?
Discover our cutting-edge
<a target="_blank" href="https://glcs.io/julia-matlab/">Julia-MATLAB Integration</a>.
We develop high-performance Julia models
that seamlessly connect with
your existing MATLAB codebase,
minimizing risks while maximizing
your return on investment.
Transition smarter, faster, and more cost-effectively
with our expert solutions!</p>
<h1 id="heading-additional-links">Additional Links</h1>
<ul>
<li><a target="_blank" href="https://docs.sciml.ai/Overview/stable/">Julia SciML Docs</a><ul>
<li>Documentation for the scientific machine learning ecosystem
(including <a target="_blank" href="https://docs.sciml.ai/DiffEqDocs/stable/">DifferentialEquations.jl</a>).</li>
</ul>
</li>
<li><a target="_blank" href="https://docs.sciml.ai/SciMLBenchmarksOutput/dev/MultiLanguage/ode_wrapper_packages/">SciML Benchmarks</a><ul>
<li>ODE solver performance comparisons,
including Julia and MATLAB solvers (and others).</li>
</ul>
</li>
<li><a target="_blank" href="https://glcs.io/modeling-simulation/">GLCS Modeling &amp; Simulation</a><ul>
<li>Connect with us for Julia development help.</li>
</ul>
</li>
<li><a target="_blank" href="https://glcs.io/julia-matlab/">GLCS Julia-MATLAB Integration</a><ul>
<li>A smart way to begin your Julia journey, fully modeled and seamlessly integrated with MATLAB.</li>
</ul>
</li>
</ul>
<p>MATLAB is a registered trademark
of The MathWorks, Inc.</p>
]]></content:encoded></item><item><title><![CDATA[Accelerate Your Julia Code with Effective Profiling Methods]]></title><description><![CDATA[This post was written by Steven Whitaker.

In this Julia for Devs post,
we will discuss
using Julia's Profile standard library
for performance and allocation profiling.
We will illustrate these tools
with example code
and then show how to improve the...]]></description><link>https://blog.glcs.io/profiling_allocations</link><guid isPermaLink="true">https://blog.glcs.io/profiling_allocations</guid><category><![CDATA[Julia]]></category><category><![CDATA[performance]]></category><category><![CDATA[profiling]]></category><category><![CDATA[General Programming]]></category><dc:creator><![CDATA[Great Lakes Consulting]]></dc:creator><pubDate>Mon, 08 Sep 2025 17:31:04 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1754509671366/tERWVgSLT.png?auto=format" length="0" type="image/jpeg"/><content:encoded><![CDATA[<blockquote>
<p>This post was written by <strong>Steven Whitaker</strong>.</p>
</blockquote>
<p>In this <em>Julia for Devs</em> post,
we will discuss
using Julia's <a target="_blank" href="https://docs.julialang.org/en/v1/manual/profile/"><code>Profile</code> standard library</a>
for performance and allocation profiling.
We will illustrate these tools
with example code
and then show how to improve the code.</p>
<p>This post will showcase
the powerful techniques we used
to significantly accelerate
our client's simulations,
resulting in a remarkable 25% reduction in run time
of code that was <strong>already highly optimized for performance</strong>.
The impact of these techniques
on our client's work
is a testament to their effectiveness
and should inspire you
in your own projects.</p>
<p>Many new Julia developers
face poor code performance,
but these techniques can net
10x or even 100x faster run times!</p>
<p>While we won't be sharing client-specific code,
we will provide similar examples
that are practical and applicable
to a wide range of simulations.
This approach
should give you the confidence
to apply these techniques
in your work.</p>
<p>Here are some of the key ideas we will focus on:</p>
<ul>
<li>Pinpoint areas of improvement with <code>Profile.@profile</code>.</li>
<li>Track down allocations with <code>Profile.Allocs.@profile</code>.</li>
<li>Improve run time and reduce allocations with <code>@generated</code> functions.</li>
</ul>
<p>Let's dive in!</p>
<h1 id="heading-profiling-in-julia">Profiling in Julia</h1>
<p>Profiling is a crucial tool
for locating performance bottlenecks.
This knowledge is invaluable
as it guides development efforts
to the parts of the code
that will have the most significant impact on run time.
Understanding this process
will keep you informed
and in control
of your code's performance.</p>
<p>Here is some example code
we will use
to illustrate profiling in Julia:</p>
<pre><code class="lang-julia"><span class="hljs-keyword">using</span> StaticArrays: SVector

<span class="hljs-keyword">function</span> kernel_original(x::SVector{N}) <span class="hljs-keyword">where</span> N

    y = zeros(N + <span class="hljs-number">1</span>)
    y[<span class="hljs-number">1</span>] = x[<span class="hljs-number">1</span>]
    <span class="hljs-keyword">for</span> i = <span class="hljs-number">2</span>:N
        y[i] = <span class="hljs-number">0.5</span> * y[i-<span class="hljs-number">1</span>] + x[i]
    <span class="hljs-keyword">end</span>
    y[<span class="hljs-keyword">end</span>] = sum(<span class="hljs-meta">@view</span>(y[<span class="hljs-number">1</span>:<span class="hljs-keyword">end</span>-<span class="hljs-number">1</span>]))

    <span class="hljs-keyword">return</span> SVector{N+<span class="hljs-number">1</span>}(y)

<span class="hljs-keyword">end</span>

<span class="hljs-keyword">function</span> workflow_original()

    total = <span class="hljs-number">0.0</span>
    <span class="hljs-keyword">for</span> i = <span class="hljs-number">1</span>:<span class="hljs-number">100</span>
        x = SVector{<span class="hljs-number">10</span>, <span class="hljs-built_in">Float64</span>}(rand(<span class="hljs-number">10</span>))
        y = kernel_original(x)
        total += y[<span class="hljs-keyword">end</span>]
    <span class="hljs-keyword">end</span>
    <span class="hljs-keyword">return</span> total

<span class="hljs-keyword">end</span>
</code></pre>
<p>The main idea with this example
is that there is a workflow,
<code>workflow_original</code>,
that calls out to a core computation
for many different input values.</p>
<p>To profile this code,
we will use the <code>Profile</code> standard library.
Since <code>Profile</code> implements a statistical profiler,
we need to ensure the code we profile
runs long enough
to reduce the impact of noise in the measurements
(see the <a target="_blank" href="https://docs.julialang.org/en/v1/manual/profile/">documentation</a> for more info).
So,
let's see how long <code>workflow_original</code> takes
(using <a target="_blank" href="https://github.com/JuliaCI/BenchmarkTools.jl">BenchmarkTools.jl</a>):</p>
<pre><code class="lang-julia">julia&gt; <span class="hljs-keyword">using</span> BenchmarkTools

julia&gt; <span class="hljs-meta">@btime</span> workflow_original();
  <span class="hljs-number">6.410</span> μs (<span class="hljs-number">300</span> allocations: <span class="hljs-number">25.00</span> KiB)
</code></pre>
<p>6 microseconds is very fast
compared to the default profiling sample delay
of 1 millisecond.
Therefore,
we must run the workflow many times
to get enough samples.
We have found that 1000–5000 samples
are usually plenty
for identifying performance bottlenecks,
though slower code will likely need more samples
and/or a larger sample delay.</p>
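<p>If you do need a larger sample delay
(or a larger sample buffer),
both can be adjusted before profiling
with <code>Profile.init</code>.
The values below are illustrative:</p>
<pre><code class="lang-julia">julia&gt; using Profile

julia&gt; Profile.init(n = 10^7, delay = 0.01)  # bigger buffer; sample every 10 ms
</code></pre>
<p>Calling <code>Profile.init()</code> with no arguments
reports the current settings.</p>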
<p>So,
to get approximately 1000 samples,
we will need to run the workflow
\( 1000 \cdot \frac{1\,\mathrm{ms}}{0.006\,\mathrm{ms}} \approx 166{,}667 \) times.
We'll round up to 200,000 for good measure.</p>
<p>Let's see how to profile the code:</p>
<pre><code class="lang-julia">julia&gt; <span class="hljs-keyword">using</span> Profile

julia&gt; Profile.clear()

julia&gt; workflow_original();

julia&gt; Profile.<span class="hljs-meta">@profile</span> <span class="hljs-keyword">for</span> _ <span class="hljs-keyword">in</span> <span class="hljs-number">1</span>:<span class="hljs-number">200_000</span> workflow_original() <span class="hljs-keyword">end</span>
</code></pre>
<p>A couple of notes:</p>
<ul>
<li>The <code>Profile.clear()</code> step
is not strictly necessary.
However, habitually clearing the profile data
ensures stale samples from previous runs
don't contaminate the new results.</li>
<li>We called <code>workflow_original</code> once before profiling
to avoid profiling JIT compilation.</li>
</ul>
<p>Now we want to display the profile data.
One way is via <code>Profile.print</code>,
which prints a textual representation
of the profile to the console,
but this isn't the most efficient method
for inspecting the data.
Typically,
profile data is visualized
using a flamegraph.</p>
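<p>That said,
for a quick look without leaving the REPL,
<code>Profile.print</code> accepts options
to condense its output.
A sketch
(see the Profile docs for the full set of options):</p>
<pre><code class="lang-julia">julia&gt; Profile.print(format = :flat, sortedby = :count, mincount = 10)
</code></pre>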
<p>The <a target="_blank" href="https://docs.julialang.org/en/v1/manual/profile/">Profiling docs</a> list several packages
for visualizing profiles.
We will use <a target="_blank" href="https://github.com/pfitzseb/ProfileCanvas.jl">ProfileCanvas.jl</a>,
which creates an HTML file
we can view in a web browser
and interactively inspect the profiling results:</p>
<pre><code class="lang-julia">julia&gt; <span class="hljs-keyword">using</span> ProfileCanvas

julia&gt; ProfileCanvas.view()
</code></pre>
<p>This code creates and displays a flamegraph:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1754500561646/MTkONJeJu.jpeg?auto=format" alt="Profile of original workflow" /></p>
<p>One thing that stands out in this flamegraph
is the three yellow rectangles.
These indicate occurrences of garbage collection (GC),
which implies the code allocated memory.
Since these yellow blocks
represent a decent portion of the run time
(as indicated by the width of the rectangles),
let's investigate.</p>
<h1 id="heading-allocation-profiling">Allocation Profiling</h1>
<p>We will use Julia's allocation profiling tools
to further inspect
the allocations we know the workflow has.
The process is similar to performance profiling,
with the following differences:</p>
<ul>
<li>We will use <code>Profile.Allocs.@profile</code>
and pass the option <code>sample_rate = 1.0</code>
to record all allocations.
In a larger workflow with many more allocations,
a smaller <code>sample_rate</code> is advised
(the default value is <code>0.1</code>).</li>
<li>We will run the workflow just one time.</li>
<li>We will visualize the results with <code>ProfileCanvas.view_allocs</code>.</li>
</ul>
<p>Here's how to profile allocations:</p>
<pre><code class="lang-julia">julia&gt; <span class="hljs-keyword">using</span> Profile, ProfileCanvas

julia&gt; Profile.Allocs.clear()

julia&gt; workflow_original();

julia&gt; Profile.Allocs.<span class="hljs-meta">@profile</span> sample_rate = <span class="hljs-number">1.0</span> workflow_original();

julia&gt; ProfileCanvas.view_allocs()
</code></pre>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1754500536894/ys22spY7n.jpeg?auto=format" alt="Allocation profile of original workflow" /></p>
<p>In this flamegraph,
the widths of each rectangle
correspond to how many allocations were made.
In this example,
each yellow block
is the same width,
meaning they all contributed
the same number of allocations.</p>
<p>Now we need to determine
whether we can do anything about these allocations.</p>
<p>Moving up from the bottom of the flamegraph
lets us trace the call stack
to see where these allocations came from.
For example,
we can see that <code>GenericMemory</code>
was called from <code>Array</code>,
which itself was called from <code>Array</code>,
and so on.
Eventually,
we get to <code>workflow_original</code>.</p>
<p>If we hover our mouse cursor over that block,
it will display the file and line number:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1754500517314/J-aU0QZ6C.png?auto=format" alt="Mouse hover displaying file and line info" /></p>
<p>(In this case,
I defined <code>workflow_original</code>
in the first REPL prompt
of a fresh Julia session,
so that's why <code>REPL[1]</code> shows up for the file name.)</p>
<p>Looking at these profile results
shows us that the offending lines are:</p>
<ul>
<li><code>x = SVector{10, Float64}(rand(10))</code> (in <code>workflow_original</code>)</li>
<li><code>y = zeros(N + 1)</code> (in <code>kernel_original</code>)</li>
</ul>
<p>This makes sense;
in each of these lines,
we explicitly create an array
(via <code>rand</code> and <code>zeros</code>),
which allocates memory.</p>
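<p>A quick REPL sanity check
confirms such calls heap-allocate:</p>
<pre><code class="lang-julia">julia&gt; (@allocated zeros(11)) &gt; 0  # nonzero bytes means a heap allocation
true
</code></pre>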
<p>But it turns out
we can eliminate these allocations!</p>
<p>Eliminating the first allocation
requires knowledge
of the StaticArrays.jl package.
In particular,
the <code>@SVector</code> macro
can create an <code>SVector</code>
directly from standard array expressions
(like <code>rand</code>)
without allocating memory.
So,
instead of using the <code>SVector</code> constructor,
we can write:</p>
<pre><code class="lang-julia">x = <span class="hljs-meta">@SVector</span> rand(<span class="hljs-number">10</span>)
</code></pre>
<p>The second allocation,
however,
is a bit trickier to remove.</p>
<h1 id="heading-reducing-allocations-with-generated-functions">Reducing Allocations with <code>@generated</code> Functions</h1>
<p>What makes it difficult
to remove the allocation
in <code>kernel_original</code>
is the elements of the vector <code>y</code>
depend on previous elements
of the vector.
That means we have to compute one element
to compute the next,
which means we have to store that element somehow.
If we knew exactly how long <code>y</code> would be,
we could store the computations
in local (scalar) variables.
However, the function needs to work with any input size;
we don't want to create a separate method
for each possible input size.</p>
<p>At least, not by hand.</p>
<p>It turns out
Julia can automate
the creation of these specialized methods.
The trick is to use the <a target="_blank" href="https://docs.julialang.org/en/v1/manual/metaprogramming/#Generated-functions"><code>@generated</code> macro</a>.</p>
<p>A method annotated with <code>@generated</code>
uses type information
to produce specialized implementations
of the method
depending on the input types.
And since type information is available at compile time,
this specialization occurs at compile time,
leading to run time improvements.</p>
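<p>To make the mechanics concrete
before tackling our kernel,
here is a minimal,
purely illustrative <code>@generated</code> function
that unrolls a tuple sum at compile time,
using only the tuple's length <code>N</code>:</p>
<pre><code class="lang-julia">@generated function unrolled_sum(x::NTuple{N,Float64}) where N
    # Build the expression x[1] + x[2] + ... + x[N] at compile time.
    terms = [:(x[$i]) for i in 1:N]
    return :(+($(terms...)))
end

unrolled_sum((1.0, 2.0, 3.0))  # 6.0
</code></pre>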
<p>Using a <code>@generated</code> function
to replace <code>kernel_original</code>
will allow us to, essentially,
move the run time allocation
to compile time.</p>
<p>Here's what the new function looks like:</p>
<pre><code class="lang-julia"><span class="hljs-meta">@generated</span> <span class="hljs-keyword">function</span> kernel_generated(x::SVector{N}) <span class="hljs-keyword">where</span> N

    assignments = [:(y1 = x[<span class="hljs-number">1</span>])]

    <span class="hljs-keyword">for</span> i = <span class="hljs-number">2</span>:N
        yprev = <span class="hljs-built_in">Symbol</span>(:y, i - <span class="hljs-number">1</span>)
        yi = <span class="hljs-built_in">Symbol</span>(:y, i)
        push!(assignments, :($yi = <span class="hljs-number">0.5</span> * $yprev + x[$i]))
    <span class="hljs-keyword">end</span>

    yend = <span class="hljs-built_in">Symbol</span>(:y, N + <span class="hljs-number">1</span>)
    sum_expr = reduce((a, b) -&gt; :($a + $b), (<span class="hljs-built_in">Symbol</span>(:y, i) <span class="hljs-keyword">for</span> i = <span class="hljs-number">1</span>:N))
    push!(assignments, :($yend = $sum_expr))

    vars = ntuple(i -&gt; <span class="hljs-built_in">Symbol</span>(:y, i), N + <span class="hljs-number">1</span>)

    <span class="hljs-keyword">return</span> <span class="hljs-keyword">quote</span>
        $(assignments...)
        <span class="hljs-keyword">return</span> SVector{$(N + <span class="hljs-number">1</span>), <span class="hljs-built_in">Float64</span>}($(vars...))
    <span class="hljs-keyword">end</span>

<span class="hljs-keyword">end</span>
</code></pre>
<p>The first thing you might notice,
especially if you are unfamiliar with metaprogramming,
is that this function looks quite different
from <code>kernel_original</code>.
So, let's unpack this a bit:</p>
<ul>
<li>A <code>@generated</code> function
needs to return an expression.
The compiler will then compile
the code resulting from the expression.
Finally, at run time,
the compiled code will be used,
<em>not</em> the code used
to generate the compiled expression.
In other words,
this function must return
the specialized code itself,
not the result of a run time computation.</li>
<li>The returned expression
is created with a <code>quote</code> block.
This <code>quote</code> block
interpolates (using <code>$</code>) expressions
built up earlier in the function.
The computations
to build up these expressions
occur only at compile time.</li>
</ul>
<p>If we want to inspect
what the generated function looks like,
we need to refactor the code just a bit.
Essentially,
we'll create a regular Julia function
that returns an expression,
and the <code>@generated</code> function
will just call that function:</p>
<pre><code class="lang-julia"><span class="hljs-keyword">function</span> _gen_kernel_generated(::<span class="hljs-built_in">Type</span>{&lt;:SVector{N}}) <span class="hljs-keyword">where</span> N
    <span class="hljs-comment"># Same code as `kernel_generated` above.</span>
<span class="hljs-keyword">end</span>

<span class="hljs-meta">@generated</span> kernel_generated(x::SVector) = _gen_kernel_generated(x)
</code></pre>
<p>Then we can call <code>_gen_kernel_generated</code> directly
to see what code actually runs:</p>
<pre><code class="lang-julia">julia&gt; _gen_kernel_generated(typeof(<span class="hljs-meta">@SVector</span> rand(<span class="hljs-number">1</span>)))
<span class="hljs-keyword">quote</span>
    <span class="hljs-comment">#= REPL[2]:18 =#</span>
    y1 = x[<span class="hljs-number">1</span>]
    y2 = y1
    <span class="hljs-comment">#= REPL[2]:19 =#</span>
    <span class="hljs-keyword">return</span> SVector{<span class="hljs-number">2</span>, <span class="hljs-built_in">Float64</span>}(y1, y2)
<span class="hljs-keyword">end</span>

julia&gt; _gen_kernel_generated(typeof(<span class="hljs-meta">@SVector</span> rand(<span class="hljs-number">2</span>)))
<span class="hljs-keyword">quote</span>
    <span class="hljs-comment">#= REPL[2]:18 =#</span>
    y1 = x[<span class="hljs-number">1</span>]
    y2 = <span class="hljs-number">0.5</span>y1 + x[<span class="hljs-number">2</span>]
    y3 = y1 + y2
    <span class="hljs-comment">#= REPL[2]:19 =#</span>
    <span class="hljs-keyword">return</span> SVector{<span class="hljs-number">3</span>, <span class="hljs-built_in">Float64</span>}(y1, y2, y3)
<span class="hljs-keyword">end</span>

julia&gt; _gen_kernel_generated(typeof(<span class="hljs-meta">@SVector</span> rand(<span class="hljs-number">3</span>)))
<span class="hljs-keyword">quote</span>
    <span class="hljs-comment">#= REPL[2]:18 =#</span>
    y1 = x[<span class="hljs-number">1</span>]
    y2 = <span class="hljs-number">0.5</span>y1 + x[<span class="hljs-number">2</span>]
    y3 = <span class="hljs-number">0.5</span>y2 + x[<span class="hljs-number">3</span>]
    y4 = (y1 + y2) + y3
    <span class="hljs-comment">#= REPL[2]:19 =#</span>
    <span class="hljs-keyword">return</span> SVector{<span class="hljs-number">4</span>, <span class="hljs-built_in">Float64</span>}(y1, y2, y3, y4)
<span class="hljs-keyword">end</span>
</code></pre>
<p>Yep, the implementation looks right!</p>
<p>We can also see
that the generated code
avoids creating an array
to store intermediate results,
instead storing computations
in local variables.
But we didn't have to write
any of those methods ourselves!
Using <code>@generated</code> allows us
to maintain just one function
to generate all these specialized implementations.</p>
<p>Let's benchmark the new implementation:</p>
<pre><code class="lang-julia">julia&gt; <span class="hljs-meta">@btime</span> workflow_generated();
  <span class="hljs-number">1.093</span> μs (<span class="hljs-number">0</span> allocations: <span class="hljs-number">0</span> bytes)
</code></pre>
<p>Nice, six times faster
and no allocations!</p>
<p>And here's the profile:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1754500550643/D0jALwUVg.jpeg?auto=format" alt="Profile of optimized workflow" /></p>
<details>

<summary>(Click here to see the code for profiling.)</summary>

<pre><code class="lang-julia">using StaticArrays: @SVector, SVector

@generated function kernel_generated(x::SVector{N}) where N

    assignments = [:(y1 = x[1])]

    for i = 2:N
        yprev = Symbol(:y, i - 1)
        yi = Symbol(:y, i)
        push!(assignments, :($yi = 0.5 * $yprev + x[$i]))
    end

    yend = Symbol(:y, N + 1)
    sum_expr = reduce((a, b) -&gt; :($a + $b), (Symbol(:y, i) for i = 1:N))
    push!(assignments, :($yend = $sum_expr))

    vars = ntuple(i -&gt; Symbol(:y, i), N + 1)

    return quote
        $(assignments...)
        return SVector{$(N + 1), Float64}($(vars...))
    end

end

function workflow_generated()

    total = 0.0
    for i = 1:100
        x = @SVector rand(10)
        y = kernel_generated(x)
        total += y[end]
    end
    return total

end

using BenchmarkTools
@btime workflow_generated();

using Profile, ProfileCanvas
Profile.clear()
Profile.@profile for _ in 1:200_000 workflow_generated() end
ProfileCanvas.view()
</code></pre>

<p>Note,
there's no need for allocation profiling
because there are no allocations.</p>
</details>

<p>There aren't obvious places for improvement,
so I'd say we did a pretty good job
optimizing the code.</p>
<h1 id="heading-summary">Summary</h1>
<p>In this post,
we saw how to use the <code>Profile</code> standard library
to profile the run time
and the allocations
of a piece of code.
We also illustrated
how a <code>@generated</code> function
can eliminate run time allocations
and speed up the code.
These were key ideas we used
to help one of our clients
speed up their simulations.</p>
<p>Do you need help pinpointing performance bottlenecks
or tracking down allocations
in your code?
<a target="_blank" href="https://glcs.io/modeling-simulation/">Contact us</a>, and we can help you out!</p>
<h1 id="heading-additional-links">Additional Links</h1>
<ul>
<li><a target="_blank" href="https://docs.julialang.org/en/v1/manual/profile/">Profiling in Julia</a><ul>
<li>Julia manual section on profiling.</li>
</ul>
</li>
<li><a target="_blank" href="https://docs.julialang.org/en/v1/manual/metaprogramming/">Metaprogramming in Julia</a><ul>
<li>Julia manual section on metaprogramming,
including <code>@generated</code> functions.</li>
</ul>
</li>
<li><a target="_blank" href="https://glcs.io/modeling-simulation/">GLCS Modeling &amp; Simulation</a><ul>
<li>Connect with us for Julia Modeling &amp; Simulation.</li>
</ul>
</li>
</ul>
]]></content:encoded></item><item><title><![CDATA[Enhance Model Accuracy Using Universal Differential Equations]]></title><description><![CDATA[This post was written by Steven Whitaker.

In this Julia for Devs post,
we will explore the fascinating world of using universal differential equations (UDEs)
for model discovery.
A UDE is a differential equation
where part (or all) of it
is defined ...]]></description><link>https://blog.glcs.io/ude_symbolic_regression</link><guid isPermaLink="true">https://blog.glcs.io/ude_symbolic_regression</guid><category><![CDATA[Julia]]></category><category><![CDATA[modeling]]></category><category><![CDATA[General Programming]]></category><dc:creator><![CDATA[Great Lakes Consulting]]></dc:creator><pubDate>Mon, 18 Aug 2025 16:21:57 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1753987736520/UvLfeQ_O1.png?auto=format" length="0" type="image/jpeg"/><content:encoded><![CDATA[<blockquote>
<p>This post was written by <strong>Steven Whitaker</strong>.</p>
</blockquote>
<p>In this <em>Julia for Devs</em> post,
we will explore the fascinating world of using <a target="_blank" href="https://arxiv.org/abs/2001.04385">universal differential equations</a> (UDEs)
for model discovery.
A UDE is a differential equation
where part (or all) of it
is defined by a universal approximator
(typically a neural network).
These UDEs play a pivotal role
in uncovering missing aspects
of established models.</p>
<p>UDEs enable seamless integration
of known mechanistic models
with data-driven modeling approaches.
Rather than omit or guess at unknown aspects of a model,
a neural network can be embedded into the model,
which can then be trained using observed data.</p>
<p>Once trained,
the embedded neural network
can be approximated by a mathematical expression.
This allows us to replace the neural network
with an interpretable formula,
improving transparency
while filling in the missing aspects
of the original model
with new structure learned from observed data.</p>
<p>This post will illustrate
key ideas
and design decisions made
while helping one of our clients
uncover missing dynamics in their model,
a complex system representing
the human body under treatment
of a rare disease.</p>
<p>By replacing just one aspect of our client's model
with a neural network,
we achieved a significant milestone:
a 30% reduction in the error of the model predictions.
This success story is a testament to the power
of UDEs in enhancing model performance.</p>
<p>We will not share client-specific code,
but will provide similar examples
to illustrate our approach.</p>
<p>Here are some of the key ideas we will focus on:</p>
<ul>
<li>Ensure nonnegativity of state variables (as needed).</li>
<li>Appropriately initialize neural network weights.</li>
<li>Normalize neural network inputs.</li>
<li>Make dynamics modular.</li>
<li>Fit symbolic expression.</li>
</ul>
<p>Note that we will discuss
fully connected neural networks,
so some comments below
may not apply if other architectures are used.</p>
<p>And with that,
let's dive in!</p>
<h1 id="heading-ensure-nonnegativity-of-states">Ensure Nonnegativity of States</h1>
<p>When working with a model
representing a real phenomenon,
chances are that at least some of the model's state variables
are naturally nonnegative.
For example,
the energy of a system
or the concentration of a chemical
cannot be negative quantities.</p>
<p>However,
a differential equation solver
sees only math,
not the physical phenomenon
the math represents.
So,
we have to tell the solver
to respect the natural constraints
that exist.</p>
<p>One way to enforce nonnegativity
is to pass the <code>isoutofdomain</code> option
to <a target="_blank" href="https://docs.sciml.ai/DiffEqDocs/stable/basics/common_solver_opts/"><code>solve</code></a>,
where <code>isoutofdomain</code> is a function
that returns <code>true</code>
when at least one of the provided state variables
violates the nonnegativity constraint.
For example,
if all state variables should be nonnegative,
the function can be defined as:</p>
<pre><code class="lang-julia">isoutofdomain(u, p, t) = any(&lt;(<span class="hljs-number">0</span>), u)
</code></pre>
<p>As another example,
if only the first and third states should be nonnegative:</p>
<pre><code class="lang-julia">isoutofdomain(u, p, t) = u[<span class="hljs-number">1</span>] &lt; <span class="hljs-number">0</span> || u[<span class="hljs-number">3</span>] &lt; <span class="hljs-number">0</span>
</code></pre>
<p>For more information
and other tools
for tackling nonnegativity,
check out the <a target="_blank" href="https://docs.sciml.ai/DiffEqDocs/stable/basics/faq/#My-ODE-goes-negative-but-should-stay-positive,-what-tools-can-help?">SciML docs</a>.</p>
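<p>Wiring this into the solver
is then a single keyword argument.
A sketch,
assuming <code>prob</code> is an already-constructed <code>ODEProblem</code>:</p>
<pre><code class="lang-julia">using OrdinaryDiffEq: Tsit5, solve

isoutofdomain(u, p, t) = any(&lt;(0), u)
sol = solve(prob, Tsit5(); isoutofdomain = isoutofdomain)
</code></pre>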
<p>The idea
of telling the solver
about the nonnegativity constraints
may seem obvious.
Still, it has important implications
for the initialization of neural networks
embedded into the model.
Care should be taken
to ensure the initial network weights
(before training)
do not cause states to become negative.
See the next section below for more details.</p>
<h1 id="heading-appropriately-initialize-neural-network-weights">Appropriately Initialize Neural Network Weights</h1>
<p>To learn missing dynamics,
the neural network in a UDE
needs to be trained.</p>
<p>Training the neural network can occur
only if the UDE can be solved
using the initial network weights.
Otherwise,
the gradient of the loss,
which is used by optimization algorithms
to train the neural network,
cannot be computed.</p>
<p>As a result,
it is necessary
to choose a good initialization
for the neural network weights.
Here are some approaches for initialization
and how well they work:</p>
<ul>
<li><em>Random initialization:</em>
Typically,
neural network weights
are initialized randomly.
However,
this results in random UDE dynamics,
so whether a state variable
stays positive
(or satisfies any other constraints)
is left to chance.
And if constraints are violated,
but we try to enforce them
via, e.g., <code>isoutofdomain</code>,
the solver will be unable to solve the UDE,
preventing computation of the loss gradient.
So,
random initialization might work,
but it might not.</li>
<li><em>Zero initialization:</em>
Don't do this.
Network weights won't ever update during training
because the gradients with respect to those weights
will always be zero.</li>
<li><em>Constant initialization:</em>
Don't do this, either.
Initializing all the weights
to the same, non-zero value
will result in some amount of learning,
but it severely limits
the expressivity of the network.</li>
<li><em>Pre-training:</em>
If the neural network in the UDE
is used to replace an expression
that is believed to be incorrect
(but otherwise gives a usable model),
one way to initialize the network weights
for UDE training
is to initialize the weights randomly
but then train the network directly
(i.e., not the UDE as a whole)
to fit the expression the network is to replace.
This works because no constraints are involved
during pre-training
(so randomly initializing the weights is okay),
but when training the UDE,
the network weights have been set
to avoid running into issues with constraints.</li>
<li><em>Smoothed Collocation:</em>
Smoothed collocation is another approach
for pre-training the neural network
without involving constraints.
See the <a target="_blank" href="https://docs.sciml.ai/DiffEqFlux/stable/examples/collocation/">SciML docs</a> for more info.</li>
</ul>
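<p>As a rough sketch of the pre-training idea,
suppose <code>g_guess</code> is the believed-incorrect
(but usable) expression the network replaces.
The expression, network sizes, and hyperparameters below
are all illustrative:</p>
<pre><code class="lang-julia">using Lux, Optimisers, Zygote, Random

g_guess(a, b) = a / (1 + b)  # hypothetical stand-in expression

nn = Chain(Dense(2, 5, tanh), Dense(5, 5, tanh), Dense(5, 1))
ps, st = Lux.setup(Random.default_rng(), nn)

# Sample the expected input range to build training data.
X = rand(2, 1000)
y = [g_guess(X[1, i], X[2, i]) for i in 1:1000]'

loss(p) = sum(abs2, first(nn(X, p, st)) .- y) / length(y)

# Train the network directly (no ODE solve, no constraints).
opt = Optimisers.setup(Adam(0.01), ps)
for _ in 1:1000
    grad = only(Zygote.gradient(loss, ps))
    opt, ps = Optimisers.update(opt, ps, grad)
end
</code></pre>
<p>The resulting <code>ps</code> then serves
as the initial weights for UDE training.</p>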
<p>For our client,
randomly initialized weights
failed to give a usable gradient
for training,
so we opted for pre-training,
as our client had a reasonable approximate expression
to which we could fit the initial network weights.</p>
<h1 id="heading-normalize-neural-network-inputs">Normalize Neural Network Inputs</h1>
<p>One key to ensuring UDE training proceeds nicely
is to normalize the inputs
to the neural network
to ensure all inputs
are roughly of the same magnitude.</p>
<p>Normalizing is essential
because the gradient of the training loss function
is proportional to the network inputs.
If one input tends to be 1000 times larger
than another,
the gradient will also tend to be that much larger,
dominating the learning algorithm.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1753980775280/_V2hdczW9.png?auto=format" alt="Effect of input scale imbalance on training" /></p>
<p>To illustrate how to normalize,
suppose we have the following neural network
(created using <a target="_blank" href="https://github.com/LuxDL/Lux.jl">Lux.jl</a>):</p>
<pre><code class="lang-julia">nn = Chain(
    Dense(<span class="hljs-number">2</span>, <span class="hljs-number">5</span>, tanh),
    Dense(<span class="hljs-number">5</span>, <span class="hljs-number">5</span>, tanh),
    Dense(<span class="hljs-number">5</span>, <span class="hljs-number">1</span>),
)
</code></pre>
<p>Then we can add a function to the <code>Chain</code>
that performs the normalization:</p>
<pre><code class="lang-julia">nn_normalized = Chain(
    x -&gt; x ./ normalization,
    <span class="hljs-comment"># Other layers as before</span>
)
</code></pre>
<p>In this example,
<code>normalization</code> is a <code>Vector</code>
containing the values
to scale each input variable.
And of course,
we're not limited to scaling the inputs;
we can implement whatever function we want
to process the inputs
(such as z-score standardization).</p>
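<p>For instance,
a z-score variant might look like the following,
with per-input means and standard deviations
estimated from representative data
(the values shown are made up):</p>
<pre><code class="lang-julia">μ = [10.0, 1000.0]  # per-input means (assumed, from data)
σ = [2.0, 300.0]    # per-input standard deviations (assumed)

nn_standardized = Chain(
    x -&gt; (x .- μ) ./ σ,
    Dense(2, 5, tanh),
    Dense(5, 5, tanh),
    Dense(5, 1),
)
</code></pre>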
<p>The neural network in our client's UDE
did not learn without normalization.
After normalizing the inputs,
we got the neural network
to learn meaningful dynamics.</p>
<h1 id="heading-make-dynamics-modular">Make Dynamics Modular</h1>
<p>While not strictly necessary for UDEs,
making the dynamics modular
vastly simplifies comparing
between different approaches.</p>
<p>For our client,
we wanted to compare
three versions of the dynamics:
the original dynamics,
the UDE
(where part of the original dynamics
was replaced with a neural network),
and the updated dynamics
(where the trained neural network
was replaced with a symbolic expression
fit to the neural network).</p>
<p>Rather than duplicate the dynamics function three times
with each slight variation,
we allowed the variable part of the dynamics
to be passed in as an option
to the dynamics function.
To illustrate:</p>
<pre><code class="lang-julia"><span class="hljs-keyword">function</span> f!(du, u, p, t; g = original_g)
    <span class="hljs-comment"># ... some code ...</span>

    <span class="hljs-comment"># `a` and `b` are values computed above</span>
    <span class="hljs-comment"># or pulled from `u`, etc.</span>
    val = g(a, b, p)
    <span class="hljs-comment"># `val` is used to update `du` in some way.</span>

    <span class="hljs-comment"># ... some code ...</span>
<span class="hljs-keyword">end</span>
</code></pre>
<p>However,
<code>ODEProblem</code>s don't allow passing options
to the dynamics function.
But that's not a problem;
we can create a closure
to set the option
before handing the function to an <code>ODEProblem</code>.
For example,
if we want <code>g</code> to use a Lux.jl neural network:</p>
<pre><code class="lang-julia">nn = Chain(...)
(p_nn, st) = Lux.setup(...)
<span class="hljs-comment"># Be sure to include `p_nn` in the `ODEProblem` parameters `p`.</span>
g_nn = (a, b, p) -&gt; nn([a, b], p.p_nn, st)[<span class="hljs-number">1</span>][<span class="hljs-number">1</span>]
f_nn! = (du, u, p, t) -&gt; f!(du, u, p, t; g = g_nn)
</code></pre>
<p>Setting up our client's code like this
reduced code duplication
while allowing easy comparisons
across different versions of the dynamics.</p>
<h1 id="heading-fit-symbolic-expression">Fit Symbolic Expression</h1>
<p>One of the key steps
to learning <em>interpretable</em> dynamics
from a UDE
is to fit a symbolic expression
to the trained neural network.
Doing so allows us to replace
the black box neural network
with a mathematical expression
we can understand.</p>
<p>The first step
is to collect the data to use
for the fitting.
If the neural network takes two inputs,
we will need a <code>Matrix</code> of values:</p>
<pre><code class="lang-julia">X = [
    a1 a2 ... aN;
    b1 b2 ... bN;
]
</code></pre>
<p>The (<code>ai</code>, <code>bi</code>) pairs
should cover the expected range of input values.
One way to determine this range
is to solve the UDE
and save off the values
that end up going into the neural network.</p>
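<p>For instance,
assuming we observed inputs
<code>a ∈ [0, 2]</code> and <code>b ∈ [-1, 1]</code>
when solving the UDE
(these ranges are hypothetical),
we could build <code>X</code> on a grid covering them:</p>
<pre><code class="lang-julia"># Hypothetical input ranges observed while solving the UDE.
as = range(0.0, 2.0; length = 50)
bs = range(-1.0, 1.0; length = 50)

# All (a, b) pairs on the grid.
pairs = vec(collect(Iterators.product(as, bs)))

# 2 × N matrix: first row holds the `a` values, second row the `b` values.
X = [first.(pairs)'; last.(pairs)']
</code></pre>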
<p>To get the target values,
evaluate the neural network
with the <code>X</code> we just created:</p>
<pre><code class="lang-julia">y = nn(X, p_nn, st)[<span class="hljs-number">1</span>] |&gt; vec
</code></pre>
<p>(Here we assume the neural network
outputs a single number
per input pair.)</p>
<p>Now we can figure out
a mathematical expression
relating <code>X</code> to <code>y</code>.
One way to do this
is to use <a target="_blank" href="https://github.com/MilesCranmer/SymbolicRegression.jl">SymbolicRegression.jl</a>.</p>
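<p>Based on the README for SymbolicRegression.jl,
the search might be set up as follows.
(This is a sketch;
the operator set is an assumption
and should be tailored to the dynamics being learned.)</p>
<pre><code class="lang-julia">using SymbolicRegression

options = Options(;
    binary_operators = [+, -, *, /],
    unary_operators = [exp],
)
hall_of_fame = equation_search(X, y; options, niterations = 40)

# Choose an expression from the accuracy/complexity Pareto frontier;
# the last entry is the most accurate (and most complex).
dominating = calculate_pareto_frontier(hall_of_fame)
eq = dominating[end].tree
</code></pre>
<p>In practice,
we would inspect the whole frontier
and pick the simplest expression
that fits the data well.</p>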
<p>After obtaining an <code>Expression</code> object <code>eq</code>
(called <code>tree</code> in the README for SymbolicRegression.jl),
we can use it in our dynamics function
as described in the previous section:</p>
<pre><code class="lang-julia"><span class="hljs-comment"># `p` needs to be an input even though it's not used.</span>
g_eq = (a, b, p) -&gt; eq([a; b;;])[<span class="hljs-number">1</span>]
f_eq! = (du, u, p, t) -&gt; f!(du, u, p, t; g = g_eq)
</code></pre>
<p>Using SymbolicRegression.jl
to fit the trained UDE neural network,
we could provide our client
with a precise mathematical expression
representing the dynamics that were missing
from their original differential equation.</p>
<h1 id="heading-summary">Summary</h1>
<p>In this post,
we discussed vital aspects
of implementing UDEs
to learn missing dynamics.
We learned some tips
for helping UDE training,
including appropriately initializing network weights
and normalizing network inputs.
We also saw the benefits
of code modularity
and how that can easily allow us
to insert a learned mathematical expression
into the dynamics.
For our client,
we were able to use these ideas
to train a UDE
to achieve 30% less error
in the model predictions.</p>
<p>Do you want to leverage UDEs
to improve your models' predictive capabilities?
<a target="_blank" href="https://glcs.io/modeling-simulation/">Contact us</a>, and we can help you out!</p>
<h1 id="heading-additional-links">Additional Links</h1>
<ul>
<li><a target="_blank" href="https://docs.sciml.ai/Overview/stable/showcase/missing_physics/">SciML UDE Tutorial</a><ul>
<li>UDE tutorial from official Julia SciML docs.</li>
</ul>
</li>
<li><a target="_blank" href="https://docs.sciml.ai/DiffEqDocs/stable/basics/faq/#My-ODE-goes-negative-but-should-stay-positive,-what-tools-can-help?">SciML Docs about Nonnegativity</a><ul>
<li>More information about debugging problems with negativity,
including a discussion about <code>isoutofdomain</code>.</li>
</ul>
</li>
<li><a target="_blank" href="https://github.com/LuxDL/Lux.jl">Lux.jl</a><ul>
<li>Package for creating and training neural networks.</li>
</ul>
</li>
<li><a target="_blank" href="https://github.com/MilesCranmer/SymbolicRegression.jl">SymbolicRegression.jl</a><ul>
<li>Package for fitting expressions to data.</li>
</ul>
</li>
<li><a target="_blank" href="https://glcs.io/modeling-simulation/">GLCS Modeling &amp; Simulation</a><ul>
<li>Connect with us for Julia Modeling &amp; Simulation consulting.</li>
</ul>
</li>
</ul>
]]></content:encoded></item><item><title><![CDATA[Optimize Flight Simulations with Improved Type-Stability]]></title><description><![CDATA[This post was written by Steven Whitaker.

In this Julia for Devs post,
we will discuss improving the performance
of simulation code written in Julia
by eliminating sources of type-instabilities.
We wrote another post
detailing what type-stability is...]]></description><link>https://blog.glcs.io/sim-performance-type-stability</link><guid isPermaLink="true">https://blog.glcs.io/sim-performance-type-stability</guid><category><![CDATA[Julia]]></category><category><![CDATA[performance]]></category><category><![CDATA[General Programming]]></category><category><![CDATA[Types]]></category><dc:creator><![CDATA[Great Lakes Consulting]]></dc:creator><pubDate>Thu, 26 Jun 2025 15:57:59 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1748029622661/STIg1TaT_.png?auto=format" length="0" type="image/jpeg"/><content:encoded><![CDATA[<blockquote>
<p>This post was written by <strong>Steven Whitaker</strong>.</p>
</blockquote>
<p>In this <em>Julia for Devs</em> post,
we will discuss improving the performance
of simulation code written in Julia
by eliminating sources of type-instabilities.</p>
<p>We wrote <a target="_blank" href="https://blog.glcs.io/type-stability">another post</a>
detailing what type-stability is
and how type-instabilities can degrade performance.
We also showed how <a target="_blank" href="https://github.com/timholy/SnoopCompile.jl">SnoopCompile.jl</a> and <a target="_blank" href="https://github.com/JuliaDebug/Cthulhu.jl">Cthulhu.jl</a>
can be used to pinpoint causes of type-instability.</p>
<p>This post will cover some of the type-instabilities
we helped one of our clients overcome.</p>
<p>Our client is a technology innovator.
They are building a first-of-its-kind logistics system
focused on autonomous electric delivery
to reduce traffic and air pollution.
Their aim is to provide
efficient delivery services for life-saving products
to people in both urban and rural areas.
Julia is helping them power critical
Guidance, Navigation, and Control (GNC) systems. </p>
<p>With this client, we:</p>
<ul>
<li>Eliminated slowdown-related failures
on the most important simulation scenario.</li>
<li>Decreased the compilation time of the scenario by 30%.</li>
<li>Improved the slowest time steps
from 300 ms to 10 ms (30x speedup),
enabling 2x real-time performance.</li>
</ul>
<p>We will not share client-specific code,
but will provide similar examples
to illustrate root-cause issues
and suggested resolutions.</p>
<p>Here are the root-cause issues and resolutions we will focus on:</p>
<ul>
<li>Help type-inference by unrolling recursion.</li>
<li>Standardize the output of different branches.</li>
<li>Avoid loops over <code>Tuple</code>s.<ul>
<li>Use SnoopCompile.jl to reveal dynamic dispatches.</li>
<li>Investigate functions with Cthulhu.jl.</li>
</ul>
</li>
<li>Avoid dictionaries that map to functions.</li>
</ul>
<p>Let's dive in!</p>
<h1 id="heading-help-type-inference-by-unrolling-recursion">Help Type-Inference by Unrolling Recursion</h1>
<p>One of the interesting problems we saw
was that there was a part of the client's code
that SnoopCompile.jl reported
was resulting in calls to inference,
but when we inspected the code with Cthulhu.jl
the code looked perfectly type-stable.</p>
<p>This code consisted of a set of functions
that recursively called each other,
traversing the model tree
to grab data from all the submodels.</p>
<p>As it turns out,
recursion can pose difficulties for Julia's type-inference.
Basically,
if type-inference detects recursion
but cannot prove it terminates
(based only on the types of inputs---remember
that type-inference occurs before runtime),
inference gives up,
resulting in code that runs like it is type-unstable.
(See this <a target="_blank" href="https://discourse.julialang.org/t/make-julia-complete-the-inference-of-some-recursive-code">Discourse post</a> and comments and links therein
for more information.)</p>
<p>The solution we implemented
was to use a <a target="_blank" href="https://docs.julialang.org/en/v1/manual/metaprogramming/#Generated-functions"><code>@generated</code> function</a>
to unroll the recursion at compile time,
resulting in a flat implementation
that could be correctly inferred.</p>
<p>Here's an example that illustrates
the essence of the recursive code:</p>
<pre><code class="lang-julia"><span class="hljs-comment"># Grab all the data in the entire model tree beginning at `model`.</span>
get_data(model::NamedTuple) = (; data = model.data, submodels = get_submodel_data(model.submodels))

<span class="hljs-comment"># This generated function is necessary for type-stability.</span>
<span class="hljs-comment"># It calls `get_data` on each of the fields of `submodels`</span>
<span class="hljs-comment"># and returns a `NamedTuple` of the results.</span>
<span class="hljs-comment"># (This is not the generated function implemented in the solution.)</span>
<span class="hljs-meta">@generated</span> <span class="hljs-keyword">function</span> get_submodel_data(submodels::NamedTuple)
    assignments = map(fieldnames(submodels)) <span class="hljs-keyword">do</span> field
        <span class="hljs-built_in">Expr</span>(:kw, field, :(get_data(submodels.$field)))
    <span class="hljs-keyword">end</span>
    <span class="hljs-keyword">return</span> <span class="hljs-built_in">Expr</span>(:tuple, <span class="hljs-built_in">Expr</span>(:parameters, assignments...))
<span class="hljs-keyword">end</span>

get_submodel_data(::<span class="hljs-meta">@NamedTuple</span>{}) = (;)
</code></pre>
<p>Note that in this example
<code>get_data</code> calls <code>get_submodel_data</code>,
which in turn calls <code>get_data</code>
on the submodels.</p>
<p>Here's the code for unrolling the recursion:</p>
<pre><code class="lang-julia"><span class="hljs-keyword">function</span> _gen_get_data(T, path)
    subT = _typeof_field(T, :submodels)
    subpath = :($path.submodels)
    <span class="hljs-keyword">return</span> <span class="hljs-keyword">quote</span>
        (;
            data = $path.data,
            submodels = $(_gen_get_submodel_data(subT, subpath)),
        )
    <span class="hljs-keyword">end</span>
<span class="hljs-keyword">end</span>

<span class="hljs-comment"># This function determines the type of `model.field`</span>
<span class="hljs-comment"># given just the type of `model` (so we can't just call `typeof(model.field)`).</span>
<span class="hljs-comment"># This function is necessary because we need to unroll the recursion</span>
<span class="hljs-comment"># using a generated function, which means we have to work in the type domain</span>
<span class="hljs-comment"># (because the generated function is generated before runtime).</span>
<span class="hljs-keyword">function</span> _typeof_field(::<span class="hljs-built_in">Type</span>{NamedTuple{names, T}}, field::<span class="hljs-built_in">Symbol</span>) <span class="hljs-keyword">where</span> {names, T}
    i = findfirst(n -&gt; n === field, names)
    <span class="hljs-keyword">return</span> T.parameters[i]
<span class="hljs-keyword">end</span>

<span class="hljs-keyword">function</span> _gen_get_submodel_data(::<span class="hljs-built_in">Type</span>{<span class="hljs-meta">@NamedTuple</span>{}}, subpath)
    <span class="hljs-keyword">return</span> <span class="hljs-keyword">quote</span>
        (;)
    <span class="hljs-keyword">end</span>
<span class="hljs-keyword">end</span>

<span class="hljs-keyword">function</span> _gen_get_submodel_data(subT, subpath)
    assignments = map(fieldnames(subT)) <span class="hljs-keyword">do</span> field
        T = _typeof_field(subT, field)
        path = :($subpath.$field)
        <span class="hljs-built_in">Expr</span>(:kw, field, _gen_get_data(T, path))
    <span class="hljs-keyword">end</span>
    <span class="hljs-keyword">return</span> <span class="hljs-built_in">Expr</span>(:tuple, <span class="hljs-built_in">Expr</span>(:parameters, assignments...))
<span class="hljs-keyword">end</span>

<span class="hljs-meta">@generated</span> get_data_generated(model::NamedTuple) = _gen_get_data(model, :model)
</code></pre>
<p>Unfortunately,
this example doesn't reproduce the issue our client had,
but it does show how to use a <code>@generated</code> function
to unroll recursion.
Note that there is still recursion:
<code>_gen_get_data</code> and <code>_gen_get_submodel_data</code> call each other.
The key, though, is that this recursion happens <em>before</em> inference,
which means that when <code>get_data_generated</code> is inferred,
the recursion has already taken place,
resulting in unrolled code
without any recursion
that might cause inference issues.</p>
<p>When we implemented this solution for our client,
we saw the total memory utilization of their simulation
decrease by ~35%.
This was enough to allow them to disable garbage collection
during the simulation,
speeding it up
to faster than real-time,
the first time this simulation
had ever run that fast!</p>
<h1 id="heading-standardize-output-of-different-branches">Standardize Output of Different Branches</h1>
<p>The client had different parts of their model
update at different frequencies.
As a result,
at any particular time step
only a subset of all the submodels
actually needed to update.
Here's an example of what this might look like:</p>
<pre><code class="lang-julia"><span class="hljs-keyword">function</span> get_output(model, t)
    <span class="hljs-keyword">if</span> should_update(model, t)
        out = update_model(model, t)
    <span class="hljs-keyword">else</span>
        out = get_previous_output(model)
    <span class="hljs-keyword">end</span>
    <span class="hljs-keyword">return</span> out
<span class="hljs-keyword">end</span>
</code></pre>
<p>Unfortunately,
<code>update_model</code> and <code>get_previous_output</code>
returned values of different types,
resulting in type-instability:
the output type of <code>get_output</code>
depended on the runtime result of <code>should_update</code>.</p>
<p>Furthermore,
this function was called at every time point
on every submodel (and every sub-submodel, etc.),
so the type-instability in this function
affected the whole simulation.</p>
<p>The issue was that <code>update_model</code>
typically returned the minimal subset of information
actually needed for the specific model,
whereas <code>get_previous_output</code> was generic
and returned a wider set of information.
For example,
maybe <code>update_model</code> would return a <code>NamedTuple</code>
like <code>(x = [1, 2], xdot = [0, 0])</code>,
while <code>get_previous_output</code> would return
something like <code>(x = [1, 2], xdot = [0, 0], p = nothing, stop_sim = false)</code>.</p>
<p>To fix this issue,
rather than manually updating the return values
of all the methods of <code>update_model</code>
for all the submodels in the system,
we created a function <code>standardize_output</code>
that took the <code>NamedTuple</code> returned by <code>update_model</code>
and added the missing fields
that <code>get_previous_output</code> included.
Then,
the only change needed in <code>get_output</code>
was to call <code>standardize_output</code>:</p>
<pre><code class="lang-julia">out = update_model(model, t) |&gt; standardize_output
</code></pre>
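<p>A minimal sketch of such a function,
assuming the extra fields and their defaults
are <code>p = nothing</code> and <code>stop_sim = false</code>
as in the example above:</p>
<pre><code class="lang-julia"># Fields `get_previous_output` includes but `update_model` may omit.
const DEFAULT_FIELDS = (p = nothing, stop_sim = false)

# `merge` keeps the fields of `out` and adds any defaults it lacks.
# On `NamedTuple`s, `merge` is fully inferrable, so this is type-stable.
standardize_output(out::NamedTuple) = merge(DEFAULT_FIELDS, out)
</code></pre>
<p>For example,
<code>standardize_output((x = [1, 2], xdot = [0, 0]))</code>
returns a <code>NamedTuple</code> with all four fields.</p>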
<p>The result of making this change
was a 30% decrease in compilation time
for their simulation!</p>
<h1 id="heading-avoid-loops-over-tuples">Avoid Loops over <code>Tuple</code>s</h1>
<p>The client stored submodels
of a parent model
as a <code>Tuple</code> or <code>NamedTuple</code>.
This makes sense for type-stability
because each submodel was of a unique type,
so storing them in this way
preserved the type information
when accessing the submodels.
In contrast,
storing the submodels as a <code>Vector{Any}</code>
would lose the type information
of the submodels.</p>
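<p>A toy example of the difference:</p>
<pre><code class="lang-julia">subs_tuple = (1, "a", [2.0])   # Tuple{Int64, String, Vector{Float64}}
subs_vec = Any[1, "a", [2.0]]  # Vector{Any}

# At a constant index, the compiler knows `subs_tuple[2]` is a `String`.
# In contrast, `subs_vec[2]` can only be inferred as `Any`.
typeof(subs_tuple[2])  # String
eltype(subs_vec)       # Any
</code></pre>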
<p>However,
type-stability problems arise
when looping over <code>Tuple</code>s
of different types of objects.
The problem is that the compiler needs
to compile code for the body of the loop,
but the body of the loop needs
to be able to handle
all types included in the <code>Tuple</code>.
As a result,
the compiler must resort to dynamic dispatch
in the loop body
(but see the note on union-splitting further below).</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1750893863141/FJQxy8eDD.png?auto=format&amp;height=600" alt="President waiting for receptionist to look up his office" /></p>
<p>Here's an example of the issue:</p>
<pre><code class="lang-julia"><span class="hljs-keyword">module</span> TupleLoop

<span class="hljs-keyword">function</span> tupleloop(t::<span class="hljs-built_in">Tuple</span>)

    <span class="hljs-keyword">for</span> val <span class="hljs-keyword">in</span> t
        do_something(val)
    <span class="hljs-keyword">end</span>

<span class="hljs-keyword">end</span>

do_something(val::<span class="hljs-built_in">Number</span>) = val + <span class="hljs-number">1</span>
do_something(val::<span class="hljs-built_in">String</span>) = val * <span class="hljs-string">"!"</span>
do_something(val::<span class="hljs-built_in">Vector</span>{T}) <span class="hljs-keyword">where</span> {T} = isempty(val) ? zero(T) : val[<span class="hljs-number">1</span>]
do_something(val::<span class="hljs-built_in">Dict</span>{<span class="hljs-built_in">String</span>,<span class="hljs-built_in">Int</span>}) = get(val, <span class="hljs-string">"hello"</span>, <span class="hljs-number">0</span>)

<span class="hljs-keyword">end</span>
</code></pre>
<p>Using SnoopCompile.jl reveals dynamic dispatches to <code>do_something</code>:</p>
<pre><code class="lang-julia">julia&gt; <span class="hljs-keyword">using</span> SnoopCompileCore

julia&gt; tinf = <span class="hljs-meta">@snoop_inference</span> TupleLoop.tupleloop((<span class="hljs-number">1</span>, <span class="hljs-number">2.0</span>, <span class="hljs-string">"hi"</span>, [<span class="hljs-number">10.0</span>], <span class="hljs-built_in">Dict</span>{<span class="hljs-built_in">String</span>,<span class="hljs-built_in">Int</span>}()));

julia&gt; <span class="hljs-keyword">using</span> SnoopCompile

julia&gt; tinf
InferenceTimingNode: <span class="hljs-number">0.019444</span>/<span class="hljs-number">0.020361</span> on Core.Compiler.Timings.ROOT() with <span class="hljs-number">4</span> direct children

julia&gt; itrigs = inference_triggers(tinf); mtrigs = accumulate_by_source(<span class="hljs-built_in">Method</span>, itrigs)
<span class="hljs-number">2</span>-element <span class="hljs-built_in">Vector</span>{SnoopCompile.TaggedTriggers{<span class="hljs-built_in">Method</span>}}:
 eval_user_input(ast, backend::REPL.REPLBackend, mod::<span class="hljs-built_in">Module</span>) @ REPL ~/.julia/juliaup/julia-<span class="hljs-number">1.11</span><span class="hljs-number">.5</span>+<span class="hljs-number">0.</span>x64.linux.gnu/share/julia/stdlib/v1<span class="hljs-number">.11</span>/REPL/src/REPL.jl:<span class="hljs-number">247</span> (<span class="hljs-number">1</span> callees from <span class="hljs-number">1</span> callers)
 tupleloop(t::<span class="hljs-built_in">Tuple</span>) @ Main.TupleLoop /path/to/TupleLoop.jl:<span class="hljs-number">3</span> (<span class="hljs-number">3</span> callees from <span class="hljs-number">1</span> callers)
</code></pre>
<p>Looking at <code>tupleloop</code> with Cthulhu.jl:</p>
<pre><code class="lang-julia">julia&gt; <span class="hljs-keyword">using</span> Cthulhu

julia&gt; ascend(mtrigs[<span class="hljs-number">2</span>].itrigs[<span class="hljs-number">1</span>])
Choose a call <span class="hljs-keyword">for</span> analysis (q to quit):
     do_something(::<span class="hljs-built_in">String</span>)
 &gt;     tupleloop(::<span class="hljs-built_in">Tuple</span>{<span class="hljs-built_in">Int64</span>, <span class="hljs-built_in">Float64</span>, <span class="hljs-built_in">String</span>, <span class="hljs-built_in">Vector</span>{<span class="hljs-built_in">Float64</span>}, <span class="hljs-built_in">Dict</span>{<span class="hljs-built_in">String</span>, <span class="hljs-built_in">Int64</span>}}) at /path/to/TupleLoop.jl
tupleloop(t::<span class="hljs-built_in">Tuple</span>) @ Main.TupleLoop /path/to/TupleLoop.jl:<span class="hljs-number">3</span>
 <span class="hljs-number">3</span> <span class="hljs-keyword">function</span> tupleloop(t::<span class="hljs-built_in">Tuple</span>{<span class="hljs-built_in">Int64</span>, <span class="hljs-built_in">Float64</span>, <span class="hljs-built_in">String</span>, <span class="hljs-built_in">Vector</span>{<span class="hljs-built_in">Float64</span>}, <span class="hljs-built_in">Dict</span>{<span class="hljs-built_in">String</span>, <span class="hljs-built_in">Int64</span>}}::<span class="hljs-built_in">Tuple</span>)::Core.Const(<span class="hljs-literal">nothing</span>)
 <span class="hljs-number">5</span>
 <span class="hljs-number">5</span>     <span class="hljs-keyword">for</span> val::<span class="hljs-built_in">Any</span> <span class="hljs-keyword">in</span> t::<span class="hljs-built_in">Tuple</span>{<span class="hljs-built_in">Int64</span>, <span class="hljs-built_in">Float64</span>, <span class="hljs-built_in">String</span>, <span class="hljs-built_in">Vector</span>{<span class="hljs-built_in">Float64</span>}, <span class="hljs-built_in">Dict</span>{<span class="hljs-built_in">String</span>, <span class="hljs-built_in">Int64</span>}}
 <span class="hljs-number">6</span>         do_something(val::<span class="hljs-built_in">Any</span>)
 <span class="hljs-number">7</span>     <span class="hljs-keyword">end</span>
 <span class="hljs-number">9</span>
 <span class="hljs-number">9</span> <span class="hljs-keyword">end</span>
</code></pre>
<p>And we see the problem!
Even though the <code>Tuple</code> is inferred,
the loop variable <code>val</code> is inferred as <code>Any</code>,
which means that calling <code>do_something(val)</code>
must be a dynamic dispatch.</p>
<p>Note that in some cases
Julia can perform <a target="_blank" href="https://julialang.org/blog/2018/08/union-splitting/">union-splitting</a> automatically
to remove the dynamic dispatch
caused by this type-instability.
In this example,
union-splitting occurs when the <code>Tuple</code>
contains <a target="_blank" href="https://docs.julialang.org/en/v1/manual/types/#footnote-1">4 (by default)</a> or fewer unique types.
However,
it's not a general solution.</p>
<p>One way to remove the dynamic dispatch
without relying on union-splitting
is to eliminate the loop:</p>
<pre><code class="lang-julia">do_something(t[<span class="hljs-number">1</span>])
do_something(t[<span class="hljs-number">2</span>])
⋮
</code></pre>
<p>But we can quickly see
that writing this code
is not at all generic;
we have to hard-code
the number of calls to <code>do_something</code>,
which means the code will only work
with <code>Tuple</code>s of a particular length.
Fortunately,
there's a way around this issue.
We can write a <a target="_blank" href="https://docs.julialang.org/en/v1/manual/metaprogramming/#Generated-functions"><code>@generated</code> function</a>
to have the compiler unroll the loop
for us in a generic way:</p>
<pre><code class="lang-julia"><span class="hljs-meta">@generated</span> <span class="hljs-keyword">function</span> tupleloop_generated(t::<span class="hljs-built_in">Tuple</span>)

    body = [:(do_something(t[$i])) <span class="hljs-keyword">for</span> i <span class="hljs-keyword">in</span> fieldnames(t)]
    <span class="hljs-keyword">return</span> <span class="hljs-keyword">quote</span>
        $(body...)
        <span class="hljs-keyword">return</span> <span class="hljs-literal">nothing</span>
    <span class="hljs-keyword">end</span>

<span class="hljs-keyword">end</span>
</code></pre>
<p>(Note that this code would also work
if we specified <code>t::NamedTuple</code>
in the method signature.)</p>
<p>Due to the way <code>@generated</code> functions work,
SnoopCompile.jl still detects dynamic dispatches,
but note that <code>tupleloop_generated</code>
does not have any dynamic dispatches reported:</p>
<pre><code class="lang-julia">julia&gt; <span class="hljs-keyword">using</span> SnoopCompileCore

julia&gt; tinf = <span class="hljs-meta">@snoop_inference</span> TupleLoop.tupleloop_generated((<span class="hljs-number">1</span>, <span class="hljs-number">2.0</span>, <span class="hljs-string">"hi"</span>, [<span class="hljs-number">10.0</span>], <span class="hljs-built_in">Dict</span>{<span class="hljs-built_in">String</span>,<span class="hljs-built_in">Int</span>}()));

julia&gt; <span class="hljs-keyword">using</span> SnoopCompile

julia&gt; tinf
InferenceTimingNode: <span class="hljs-number">0.022208</span>/<span class="hljs-number">0.050369</span> on Core.Compiler.Timings.ROOT() with <span class="hljs-number">5</span> direct children

julia&gt; itrigs = inference_triggers(tinf); mtrigs = accumulate_by_source(<span class="hljs-built_in">Method</span>, itrigs)
<span class="hljs-number">3</span>-element <span class="hljs-built_in">Vector</span>{SnoopCompile.TaggedTriggers{<span class="hljs-built_in">Method</span>}}:
 (g::Core.GeneratedFunctionStub)(world::<span class="hljs-built_in">UInt64</span>, source::<span class="hljs-built_in">LineNumberNode</span>, args...) @ Core boot.jl:<span class="hljs-number">705</span> (<span class="hljs-number">1</span> callees from <span class="hljs-number">1</span> callers)
 eval_user_input(ast, backend::REPL.REPLBackend, mod::<span class="hljs-built_in">Module</span>) @ REPL ~/.julia/juliaup/julia-<span class="hljs-number">1.11</span><span class="hljs-number">.5</span>+<span class="hljs-number">0.</span>x64.linux.gnu/share/julia/stdlib/v1<span class="hljs-number">.11</span>/REPL/src/REPL.jl:<span class="hljs-number">247</span> (<span class="hljs-number">1</span> callees from <span class="hljs-number">1</span> callers)
 <span class="hljs-string">var"#s1#1"</span>(::<span class="hljs-built_in">Any</span>, t) @ Main.TupleLoop none:<span class="hljs-number">0</span> (<span class="hljs-number">3</span> callees from <span class="hljs-number">1</span> callers)
</code></pre>
<p>And we can verify with Cthulhu.jl
that there are no more dynamic dispatches in <code>tupleloop_generated</code>:</p>
<pre><code class="lang-julia">julia&gt; <span class="hljs-keyword">using</span> Cthulhu

julia&gt; ascend(mtrigs[<span class="hljs-number">2</span>].itrigs[<span class="hljs-number">1</span>])
Choose a call <span class="hljs-keyword">for</span> analysis (q to quit):
 &gt;   tupleloop_generated(::<span class="hljs-built_in">Tuple</span>{<span class="hljs-built_in">Int64</span>, <span class="hljs-built_in">Float64</span>, <span class="hljs-built_in">String</span>, <span class="hljs-built_in">Vector</span>{<span class="hljs-built_in">Float64</span>}, <span class="hljs-built_in">Dict</span>{<span class="hljs-built_in">String</span>, <span class="hljs-built_in">Int64</span>}})
       eval at ./boot.jl:<span class="hljs-number">430</span> =&gt; eval_user_input(::<span class="hljs-built_in">Any</span>, ::REPL.REPLBackend, ::<span class="hljs-built_in">Module</span>) at /cache/build/tester-amdci5-<span class="hljs-number">12</span>/julialang/julia-release-<span class="hljs-number">1</span>-dot-<span class="hljs-number">1</span>
tupleloop_generated(t::<span class="hljs-built_in">Tuple</span>) @ Main.TupleLoop /path/to/TupleLoop.jl:<span class="hljs-number">11</span>
Variables
  <span class="hljs-comment">#self#::Core.Const(Main.TupleLoop.tupleloop_generated)</span>
  t::<span class="hljs-built_in">Tuple</span>{<span class="hljs-built_in">Int64</span>, <span class="hljs-built_in">Float64</span>, <span class="hljs-built_in">String</span>, <span class="hljs-built_in">Vector</span>{<span class="hljs-built_in">Float64</span>}, <span class="hljs-built_in">Dict</span>{<span class="hljs-built_in">String</span>, <span class="hljs-built_in">Int64</span>}}

Body::Core.Const(<span class="hljs-literal">nothing</span>)
    @ /path/to/TupleLoop.jl:<span class="hljs-number">11</span> within <span class="hljs-string">`tupleloop_generated`</span>
   ┌ @ /path/to/TupleLoop.jl:<span class="hljs-number">15</span> within <span class="hljs-string">`macro expansion`</span>
<span class="hljs-number">1</span> ─│ %<span class="hljs-number">1</span>  = Main.TupleLoop.do_something::Core.Const(Main.TupleLoop.do_something)
│  │ %<span class="hljs-number">2</span>  = Base.getindex(t, <span class="hljs-number">1</span>)::<span class="hljs-built_in">Int64</span>
│  │       (%<span class="hljs-number">1</span>)(%<span class="hljs-number">2</span>)
│  │ %<span class="hljs-number">4</span>  = Main.TupleLoop.do_something::Core.Const(Main.TupleLoop.do_something)
│  │ %<span class="hljs-number">5</span>  = Base.getindex(t, <span class="hljs-number">2</span>)::<span class="hljs-built_in">Float64</span>
│  │       (%<span class="hljs-number">4</span>)(%<span class="hljs-number">5</span>)
│  │ %<span class="hljs-number">7</span>  = Main.TupleLoop.do_something::Core.Const(Main.TupleLoop.do_something)
│  │ %<span class="hljs-number">8</span>  = Base.getindex(t, <span class="hljs-number">3</span>)::<span class="hljs-built_in">String</span>
│  │       (%<span class="hljs-number">7</span>)(%<span class="hljs-number">8</span>)
│  │ %<span class="hljs-number">10</span> = Main.TupleLoop.do_something::Core.Const(Main.TupleLoop.do_something)
│  │ %<span class="hljs-number">11</span> = Base.getindex(t, <span class="hljs-number">4</span>)::<span class="hljs-built_in">Vector</span>{<span class="hljs-built_in">Float64</span>}
│  │       (%<span class="hljs-number">10</span>)(%<span class="hljs-number">11</span>)
│  │ %<span class="hljs-number">13</span> = Main.TupleLoop.do_something::Core.Const(Main.TupleLoop.do_something)
│  │ %<span class="hljs-number">14</span> = Base.getindex(t, <span class="hljs-number">5</span>)::<span class="hljs-built_in">Dict</span>{<span class="hljs-built_in">String</span>, <span class="hljs-built_in">Int64</span>}
│  │       (%<span class="hljs-number">13</span>)(%<span class="hljs-number">14</span>)
│  │ @ /path/to/TupleLoop.jl:<span class="hljs-number">16</span> within <span class="hljs-string">`macro expansion`</span>
└──│       <span class="hljs-keyword">return</span> Main.TupleLoop.<span class="hljs-literal">nothing</span>
   └
</code></pre>
<p>Here we have to examine the so-called "Typed" code
(since the source code was generated via metaprogramming),
but we see that there is no loop in this code.
As a result,
each call to <code>do_something</code>
is a static dispatch
with a concretely inferred input.
Hooray!</p>
<h1 id="heading-avoid-dictionaries-that-map-to-functions">Avoid Dictionaries that Map to Functions</h1>
<p>The client registered functions
for updating their simulation visualization
via a dictionary that mapped from a <code>String</code> key
to the appropriate update function.</p>
<p>Sometimes it can be convenient
to have a dictionary of functions,
for example:</p>
<pre><code class="lang-julia"><span class="hljs-keyword">using</span> LinearAlgebra: norm

d = <span class="hljs-built_in">Dict</span>{<span class="hljs-built_in">String</span>, <span class="hljs-built_in">Function</span>}(
    <span class="hljs-string">"sum"</span> =&gt; sum,
    <span class="hljs-string">"norm"</span> =&gt; norm,
    <span class="hljs-comment"># etc.</span>
)
x = [<span class="hljs-number">1.0</span>, <span class="hljs-number">2.0</span>, <span class="hljs-number">3.0</span>]
d[<span class="hljs-string">"sum"</span>](x) <span class="hljs-comment"># Compute the sum of the elements of `x`</span>
d[<span class="hljs-string">"norm"</span>](x) <span class="hljs-comment"># Compute the norm of `x`</span>
</code></pre>
<p>This allows you to write generic code
that can call the appropriate intermediate function
based on a key supplied by the caller.</p>
<p>You could use multiple dispatch to achieve similar results,
but it requires a bit more thought
to organize the code
so that the caller has access to the types to dispatch on.</p>
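<p>For illustration,
here is a minimal sketch of that multiple-dispatch alternative
(the singleton types <code>SumOp</code> and <code>NormOp</code>
are hypothetical stand-ins for the string keys):</p>
<pre><code class="lang-julia"># Hypothetical singleton types take the place of string keys.
struct SumOp end
struct NormOp end

apply(::SumOp, x) = sum(x)
apply(::NormOp, x) = sqrt(sum(abs2, x))  # 2-norm, written out to avoid a LinearAlgebra dependency

x = [1.0, 2.0, 3.0]
apply(SumOp(), x)   # 6.0
apply(NormOp(), x)  # sqrt(14.0)
</code></pre>
<p>Because the method is selected by the argument's type,
each call site with a known operation type is type-stable.</p>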
<p>As another alternative,
you could have the caller
pass in the function to call.
But again,
it takes a bit more effort
to organize the code
to make it work.</p>
<p>Unfortunately,
using a dictionary in this way
is type-unstable:
Julia can't figure out what function
will be called
until runtime,
when the precise dictionary key is known.
And since the function is unknown,
the type of the result of the function is also unknown.</p>
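<p>One way to see the instability directly
is <code>Test.@inferred</code>,
which throws an error when the inferred return type
is wider than the actual one.
A small sketch
(the <code>lookup</code> wrapper is hypothetical,
introduced just to give inference a function to analyze):</p>
<pre><code class="lang-julia">using Test

d = Dict{String, Function}("sum" =&gt; sum)
lookup(d, key, x) = d[key](x)

x = [1.0, 2.0, 3.0]
lookup(d, "sum", x)            # works fine: 6.0
@inferred lookup(d, "sum", x)  # errors: inferred return type is Any, not Float64
</code></pre>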
<p>One partial solution
is to use type annotations:</p>
<pre><code class="lang-julia">d[func_key](x)::<span class="hljs-built_in">Float64</span>
</code></pre>
<p>Then at least the output of the function
can be used in a type-stable way.
However,
this only works if all the functions in the dictionary
return values of the same type
given the same input.</p>
<p>A slightly less stringent alternative
is to explicitly convert
the result to a common type,
but this requires conversion to be possible.</p>
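<p>Continuing the earlier dictionary example,
that conversion might look like this
(assuming every registered function
returns something convertible to <code>Float64</code>):</p>
<pre><code class="lang-julia">d = Dict{String, Function}("sum" =&gt; sum, "first" =&gt; first)
x = [1, 2, 3]
# `d["sum"](x)` returns an Int here, but the explicit conversion
# gives the compiler a predictable Float64 result.
result = convert(Float64, d["sum"](x))  # 6.0
</code></pre>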
<p>Our client updated a dictionary
using the output of the registered function,
so the full solution we implemented for our client
was to remove the dictionary
and instead have explicit branches in the code.
That is,
instead of</p>
<pre><code class="lang-julia">updates[key] = d[key](updates[key])
</code></pre>
<p>we had</p>
<pre><code class="lang-julia"><span class="hljs-keyword">if</span> key == <span class="hljs-string">"k1"</span>
    updates[key] = f1(updates[key]::OUTPUT_TYPE_F1)
<span class="hljs-keyword">elseif</span> key == <span class="hljs-string">"k2"</span>
    updates[key] = f2(updates[key]::OUTPUT_TYPE_F2)
<span class="hljs-comment"># Additional branches as needed</span>
<span class="hljs-keyword">end</span>
</code></pre>
<p>Note that we needed the type annotations
<code>OUTPUT_TYPE_F1</code> and <code>OUTPUT_TYPE_F2</code>
because <code>updates</code> had an abstractly typed value type.
The key that makes this solution work
is recognizing that in the first branch
<code>updates[key]</code> is the output of <code>f1</code>
from the previous time step in the simulation
(and similarly for the other branches).
Therefore,
in each branch we know what the type of <code>updates[key]</code> is,
so we can give the compiler that type information.</p>
<p>Also note that the previously mentioned ideas
of using multiple dispatch
or just passing in the functions to use
don't work in this situation
without removing the <code>updates</code> dictionary
(and refactoring the affected code).</p>
<p>Making the above change
completely removed type-instabilities
in that part of the client's code.</p>
<h1 id="heading-summary">Summary</h1>
<p>In this post,
we explored a few problems
related to type-stability
that we helped our client resolve.
We were able to diagnose issues
using SnoopCompile.jl and Cthulhu.jl
and make code improvements
that enabled our client's
most important simulation scenario
to pass tests for the first time.
This was possible
because our solutions enabled the scenario
to run faster than real-time
and reduced compilation time by 30%.</p>
<p>Do you have type-instabilities that plague your Julia code?
<a target="_blank" href="https://glcs.io/software-development/">Contact us</a>, and we can help you out!</p>
<h1 id="heading-additional-links">Additional Links</h1>
<ul>
<li><a target="_blank" href="https://timholy.github.io/SnoopCompile.jl/stable/">SnoopCompile.jl Docs</a><ul>
<li>Documentation for SnoopCompile.jl.</li>
</ul>
</li>
<li><a target="_blank" href="https://github.com/JuliaDebug/Cthulhu.jl">Cthulhu.jl Docs</a><ul>
<li>Documentation (the package's README) for Cthulhu.jl.</li>
</ul>
</li>
<li><a target="_blank" href="https://docs.julialang.org/en/v1/manual/performance-tips/">Julia Performance Tips</a><ul>
<li>Very good tips for improving the performance of Julia code.</li>
</ul>
</li>
<li><a target="_blank" href="https://glcs.io/software-development/">GLCS Software Development</a><ul>
<li>Connect with us for Julia development help.</li>
</ul>
</li>
<li><a target="_blank" href="https://blog.glcs.io/juliacon-2025-preview">Upcoming JuliaCon Talk Announcement</a><ul>
<li>Check out our JuliaCon 2025 talk announcement!</li>
</ul>
</li>
</ul>
<p>The cover image background
of a person at the start of a race
was found at <a target="_blank" href="https://image.freepik.com/free-photo/male-runner-starting-sprint-from-starting-line_23-2148162153.jpg">freepik.com</a>.</p>
<p>The cartoon about looping over <code>Tuple</code>s
was generated with AI.</p>
]]></content:encoded></item><item><title><![CDATA[Type-Stability with SnoopCompile.jl and Cthulhu.jl for High-Performance Julia]]></title><description><![CDATA[This post was written by Steven Whitaker.

The Julia programming language
is a high-level language
that boasts the ability to achieve C-like speeds.
Julia can run fast,
despite being a dynamic language,
because it is compiled
and has smart type-infer...]]></description><link>https://blog.glcs.io/type-stability</link><guid isPermaLink="true">https://blog.glcs.io/type-stability</guid><category><![CDATA[Julia]]></category><category><![CDATA[performance]]></category><category><![CDATA[General Programming]]></category><category><![CDATA[Types]]></category><dc:creator><![CDATA[Great Lakes Consulting]]></dc:creator><pubDate>Wed, 21 May 2025 15:25:27 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1746482085556/ydXlhcdbL.png?auto=format" length="0" type="image/jpeg"/><content:encoded><![CDATA[<blockquote>
<p>This post was written by Steven Whitaker.</p>
</blockquote>
<p>The <a target="_blank" href="https://julialang.org/">Julia programming language</a>
is a high-level language
that boasts the ability to achieve C-like speeds.
Julia can run fast,
despite being a dynamic language,
because it is compiled
and has smart type-inference.</p>
<p>Type-inference is the process
of the Julia compiler
reasoning about the types of objects,
enabling compilation to create efficient machine code
for the types at hand.</p>
<p>However,
Julia code can be written in a way
that prevents type-inference from succeeding---specifically,
by writing type-unstable functions.
(I'll explain type-instability later on.)
When type-inference fails,
Julia has to compile generic machine code
that can handle any type of input,
sacrificing the C-like performance
and instead running more like an interpreted language
such as Python.</p>
<p>Fortunately,
there are tools that Julia developers can use
to track down what code causes type-inference to fail.
Among the most powerful of these tools
are <a target="_blank" href="https://github.com/timholy/SnoopCompile.jl">SnoopCompile.jl</a> and <a target="_blank" href="https://github.com/JuliaDebug/Cthulhu.jl">Cthulhu.jl</a>.
Using these tools,
developers can fix type-inference failures
and restore the C-like performance
they were hoping to achieve.</p>
<p>In this post,
we will learn about type-stability
and how it impacts performance.
Then we will see how to use
SnoopCompile.jl and Cthulhu.jl
to locate and resolve type-instabilities.</p>
<h1 id="heading-type-stability">Type-Stability</h1>
<p>A function is type-stable
if the type of the function's output can be concretely determined
given the types of the inputs to the function,
without any runtime information.</p>
<p>To illustrate, consider the following function methods:</p>
<pre><code class="lang-julia">f(x::<span class="hljs-built_in">Int</span>) = <span class="hljs-string">"stable"</span>
f(x::<span class="hljs-built_in">Float64</span>) = rand(<span class="hljs-built_in">Bool</span>) ? <span class="hljs-number">1</span> : <span class="hljs-number">2.0</span>
</code></pre>
<p>In this example,
if we call <code>f(x)</code> where <code>x</code> is an <code>Int</code>,
the compiler can figure out
that the output will be a <code>String</code>
without knowing the value of <code>x</code>,
so <code>f(x::Int)</code> is type-stable.
In other words,
it doesn't matter whether <code>x</code> is <code>1</code>, <code>-1</code>, or <code>176859431</code>;
the return value will always be a <code>String</code>
if <code>x</code> is an <code>Int</code>.</p>
<p>On the other hand,
if we call <code>f(x)</code> where <code>x</code> is a <code>Float64</code>,
the compiler doesn't know
whether the output will be an <code>Int</code> or a <code>Float64</code>
because that depends on the result of <code>rand(Bool)</code>,
which is computed at runtime.
Therefore,
<code>f(x::Float64)</code> is type-unstable.</p>
<p>Here's a more subtle example of type-instability:</p>
<pre><code class="lang-julia"><span class="hljs-keyword">function</span> g(x)
    <span class="hljs-keyword">if</span> x &lt; <span class="hljs-number">0</span>
        <span class="hljs-keyword">return</span> <span class="hljs-number">0</span>
    <span class="hljs-keyword">else</span>
        <span class="hljs-keyword">return</span> x
    <span class="hljs-keyword">end</span>
<span class="hljs-keyword">end</span>
</code></pre>
<p>In this example,
<code>g(x)</code> is type-unstable
because the output will either be an <code>Int</code>
or whatever the type of <code>x</code> is,
and it all depends on the value of <code>x</code>,
which isn't known at compile time.
(Note, however,
that <code>g(x)</code> <em>is</em> type-stable
<em>if</em> <code>x</code> is an <code>Int</code>
because then both branches of the <code>if</code> statement
return the same type of value.)</p>
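<p>A common fix is to make both branches
return the same type,
for example by replacing the literal <code>0</code>
with <code>zero(x)</code>:</p>
<pre><code class="lang-julia"># Type-stable version: both branches return the type of `x`.
function g_stable(x)
    if x &lt; 0
        return zero(x)
    else
        return x
    end
end

g_stable(-1.5)  # 0.0 (a Float64, matching the input type)
g_stable(2.5)   # 2.5
</code></pre>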
<p>And sometimes a function that might look type-stable
can be type-unstable
depending on the input.
For example:</p>
<pre><code class="lang-julia">h(x::<span class="hljs-built_in">Array</span>) = x[<span class="hljs-number">1</span>] + <span class="hljs-number">1</span>
</code></pre>
<p>In this case,
<code>h([1])</code> is type-stable,
but <code>h(Any[1])</code> is not.
Why?
Because with <code>h([1])</code>,
<code>x</code> is a <code>Vector{Int}</code>,
so the compiler knows
that the type of <code>x[1]</code> will be <code>Int</code>.
On the other hand,
with <code>h(Any[1])</code>,
<code>x</code> is a <code>Vector{Any}</code>,
so the compiler thinks
<code>x[1]</code> could be of any type.</p>
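<p>You can verify this yourself with <code>@code_warntype</code>,
which highlights non-concrete inferred types
(shown in red in the REPL):</p>
<pre><code class="lang-julia">h(x::Array) = x[1] + 1

@code_warntype h([1])     # Body::Int64 -- fully inferred
@code_warntype h(Any[1])  # Body::Any -- inference gave up on `x[1]`
</code></pre>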
<p>To reiterate:
a function is type-stable
if the compiler can figure out
the concrete type of the return value
given only the types of the inputs,
without any runtime information.</p>
<h1 id="heading-when-compilation-occurs">When Compilation Occurs</h1>
<p>Another aspect of type-inference
that is useful to understand
is when compilation (including type-inference) occurs.</p>
<p>In a static language like C,
an entire program is compiled
before any code runs.
This is possible because the types of all variables
are known in advance,
so machine code specific to those types
can be generated in advance.</p>
<p>In an interpreted language like Python,
code is not compiled to machine code ahead of time
because variables are dynamic,
meaning their types aren't known
until the variables are actually used
(i.e., at runtime).</p>
<p>Julia programs can lie
pretty much anywhere between
the extremes of C and Python,
and where on that spectrum a program lies
depends on type-stability.</p>
<p>In a just-in-time (JIT) compiled language like Julia,
compilation occurs once types are known.</p>
<ul>
<li>If a Julia program is completely type-stable,
type-inference can figure out the types of all variables
in the program
<em>before running any code</em>.
As a result,
the entire program can be compiled
as if it were written in a static language.
This is what allows Julia to achieve C-like speeds.</li>
<li>If a Julia program is entirely type-unstable,
every function has to be compiled individually.
In this case,
compilation occurs at the moment
the function is called
because that's when the runtime information
of all the input types is finally known.
Furthermore,
the machine code for a type-unstable function
cannot be efficient
because it must be able to
handle a wide range
of potential types.
As a result,
despite being compiled,
the code runs essentially like an interpreted language.</li>
</ul>
<p>Running a Julia program with type-instabilities
is like driving down the street
and hitting all the red lights.
Julia will compile all the code
for which type-inference succeeds
and then start running.
But when the program reaches a function call
that could not be inferred,
that's like a car stopping at a red light;
Julia stops running the code
to compile the function call
now that it knows the runtime types of the inputs.
After the function is compiled,
the program can continue execution,
like how the car can continue driving
once the light turns green.</p>
<h1 id="heading-type-stability-and-performance">Type-Stability and Performance</h1>
<p>As this analogy implies,
and as I've stated before,
type-stability has performance implications.
Type-instabilities can cause various performance degradations, including:</p>
<ul>
<li><em>Dynamic (aka runtime) dispatch.</em>
If the compiler knows the input types to a function,
the generated machine code can include a call
to the specific method determined by those types.
But if the compiler doesn't know those types,
the machine code has to include instructions
to perform dynamic dispatch.
As a result,
rather than jumping directly to the correct method,
Julia has to spend runtime CPU cycles
to look up the correct method to call.</li>
<li><em>Increased memory allocations.</em>
If the compiler doesn't know what type
a variable will have,
it's impossible to put it in a register
or even allocate stack space for it.
As a result,
it has to be heap-allocated
and managed by the garbage collector.</li>
<li><em>Suboptimal compiled code.</em>
Imagine summing the contents of an array in a loop.
If the compiler knows the array contains just <code>Float64</code>s,
it can perform optimizations to compute the sum
as efficiently as possible,
e.g., by using specialized CPU instructions.
Such optimizations cannot occur
if the compiler doesn't know what type of data
it's working with.</li>
</ul>
<p>Here's an example
(inspired by this <a target="_blank" href="https://stackoverflow.com/a/56586862">Stack Overflow answer</a>)
that illustrates the impact
type-stability can have on performance:</p>
<pre><code class="lang-julia"><span class="hljs-comment"># Type-unstable because `x` is a non-constant global variable.</span>
x = <span class="hljs-number">0</span>
f() = [i + x <span class="hljs-keyword">for</span> i = <span class="hljs-number">1</span>:<span class="hljs-number">10</span>^<span class="hljs-number">6</span>]

<span class="hljs-comment"># Type-stable because `y` is constant and therefore always an `Int`.</span>
<span class="hljs-keyword">const</span> y = <span class="hljs-number">0</span>
g() = [i + y <span class="hljs-keyword">for</span> i = <span class="hljs-number">1</span>:<span class="hljs-number">10</span>^<span class="hljs-number">6</span>]

<span class="hljs-keyword">using</span> BenchmarkTools
<span class="hljs-meta">@btime</span> f() <span class="hljs-comment">#  16.868 ms (1998983 allocations: 38.13 MiB)</span>
<span class="hljs-meta">@btime</span> g() <span class="hljs-comment"># 190.755 μs (3 allocations: 7.63 MiB)</span>
</code></pre>
<p>Note that the type-unstable version
is two orders of magnitude slower!
Also note, however,
that this is an extreme example
where essentially the entire computation
is type-unstable.
In practice,
some type-instabilities will not impact performance very much.
Type-stability mainly matters in "hot loops",
i.e., in parts of the code
that run very frequently
and contribute to a significant portion
of the program's overall run time.</p>
<h1 id="heading-detecting-type-instabilities-with-snoopcompilejl">Detecting Type-Instabilities with SnoopCompile.jl</h1>
<p>Now the question is,
how do we know if or where our code is type-unstable?
One excellent tool
for discovering where type-instabilities occur in code
is SnoopCompile.jl.
This package provides functionality
for reporting how many times
a Julia program needs to stop to compile code.
(Remember that a perfectly type-stable program
can compile everything in one go,
so every time execution stops for compilation
indicates a type-instability was encountered.)</p>
<p>Let's use an example to illustrate how to use SnoopCompile.jl.
First, the code we want to analyze:</p>
<pre><code class="lang-julia"><span class="hljs-keyword">module</span> Original

<span class="hljs-keyword">struct</span> Alg1 <span class="hljs-keyword">end</span>
<span class="hljs-keyword">struct</span> Alg2 <span class="hljs-keyword">end</span>

<span class="hljs-keyword">function</span> process(alg::<span class="hljs-built_in">String</span>)

    <span class="hljs-keyword">if</span> alg == <span class="hljs-string">"alg1"</span>
        a = Alg1()
    <span class="hljs-keyword">elseif</span> alg == <span class="hljs-string">"alg2"</span>
        a = Alg2()
    <span class="hljs-keyword">end</span>

    data = get_data(a)

    result = _process(a, data)

    <span class="hljs-keyword">return</span> result

<span class="hljs-keyword">end</span>

get_data(::Alg1) = (<span class="hljs-number">1</span>, <span class="hljs-number">1.0</span>, <span class="hljs-number">0x00</span>, <span class="hljs-number">1.0f0</span>, <span class="hljs-string">"hi"</span>, [<span class="hljs-number">0.0</span>], (<span class="hljs-number">1</span>, <span class="hljs-number">2.0</span>))

<span class="hljs-keyword">function</span> _process(::Alg1, data)

    val = data[<span class="hljs-number">1</span>]
    <span class="hljs-keyword">if</span> val &lt; <span class="hljs-number">0</span>
        val = -val
    <span class="hljs-keyword">end</span>

    result = map(data) <span class="hljs-keyword">do</span> d
        process_item(d, val)
    <span class="hljs-keyword">end</span>

    <span class="hljs-keyword">return</span> result

<span class="hljs-keyword">end</span>

process_item(d::<span class="hljs-built_in">Int</span>, val) = d + val
process_item(d::<span class="hljs-built_in">AbstractFloat</span>, val) = d * val
process_item(d::<span class="hljs-built_in">Unsigned</span>, val) = d - val
process_item(d::<span class="hljs-built_in">String</span>, val) = d * string(val)
process_item(d::<span class="hljs-built_in">Array</span>, val) = d .+ val
process_item(d::<span class="hljs-built_in">Tuple</span>, val) = d .- val

get_data(::Alg2) = rand(<span class="hljs-number">5</span>)
_process(::Alg2, data) = error(<span class="hljs-string">"not implemented"</span>)

<span class="hljs-keyword">end</span>
</code></pre>
<p>We'll use the <code>@snoop_inference</code> macro
to analyze this code.
Note that this macro should be used
in a fresh Julia session
(after loading the code to be analyzed,
but before running anything)
to get the most accurate analysis results.</p>
<pre><code class="lang-julia">julia&gt; <span class="hljs-keyword">using</span> SnoopCompileCore

julia&gt; tinf = <span class="hljs-meta">@snoop_inference</span> Original.process(<span class="hljs-string">"alg1"</span>);

julia&gt; <span class="hljs-keyword">using</span> SnoopCompile

julia&gt; tinf
InferenceTimingNode: <span class="hljs-number">0.144601</span>/<span class="hljs-number">0.247183</span> on Core.Compiler.Timings.ROOT() with <span class="hljs-number">8</span> direct children
</code></pre>
<p>You can consult the <a target="_blank" href="https://timholy.github.io/SnoopCompile.jl/stable/">SnoopCompile.jl docs</a>
for more information about what we just did,
but for now,
notice that displaying <code>tinf</code> revealed 8 direct children.
That means compilation occurred 8 times
while running <code>Original.process("alg1")</code>.
If this function were completely type-stable,
<code>@snoop_inference</code> would have reported just 1 direct child,
so we know there are type-instabilities somewhere.</p>
<p>Each of the 8 direct children
is an inference trigger,
i.e., calling the specific method
indicated in the inference trigger
caused compilation to occur.
We can collect the inference triggers:</p>
<pre><code class="lang-julia">julia&gt; itrigs = inference_triggers(tinf)
 Inference triggered to call process(::<span class="hljs-built_in">String</span>) from eval (./boot.jl:<span class="hljs-number">430</span>) inlined into REPL.eval_user_input(::<span class="hljs-built_in">Any</span>, ::REPL.REPLBackend, ::<span class="hljs-built_in">Module</span>) (/cache/build/tester-amdci5-<span class="hljs-number">12</span>/julialang/julia-release-<span class="hljs-number">1</span>-dot-<span class="hljs-number">11</span>/usr/share/julia/stdlib/v1<span class="hljs-number">.11</span>/REPL/src/REPL.jl:<span class="hljs-number">261</span>)
 Inference triggered to call process_item(::<span class="hljs-built_in">Int64</span>, ::<span class="hljs-built_in">Int64</span>) from <span class="hljs-comment">#1 (./REPL[1]:30) with specialization (::var"#1#2")(::Int64)</span>
 Inference triggered to call process_item(::<span class="hljs-built_in">Float64</span>, ::<span class="hljs-built_in">Int64</span>) from <span class="hljs-comment">#1 (./REPL[1]:30) with specialization (::var"#1#2")(::Float64)</span>
 Inference triggered to call process_item(::<span class="hljs-built_in">UInt8</span>, ::<span class="hljs-built_in">Int64</span>) from <span class="hljs-comment">#1 (./REPL[1]:30) with specialization (::var"#1#2")(::UInt8)</span>
 Inference triggered to call process_item(::<span class="hljs-built_in">Float32</span>, ::<span class="hljs-built_in">Int64</span>) from <span class="hljs-comment">#1 (./REPL[1]:30) with specialization (::var"#1#2")(::Float32)</span>
 Inference triggered to call process_item(::<span class="hljs-built_in">String</span>, ::<span class="hljs-built_in">Int64</span>) from <span class="hljs-comment">#1 (./REPL[1]:30) with specialization (::var"#1#2")(::String)</span>
 Inference triggered to call process_item(::<span class="hljs-built_in">Vector</span>{<span class="hljs-built_in">Float64</span>}, ::<span class="hljs-built_in">Int64</span>) from <span class="hljs-comment">#1 (./REPL[1]:30) with specialization (::var"#1#2")(::Vector{Float64})</span>
 Inference triggered to call process_item(::<span class="hljs-built_in">Tuple</span>{<span class="hljs-built_in">Int64</span>, <span class="hljs-built_in">Float64</span>}, ::<span class="hljs-built_in">Int64</span>) from <span class="hljs-comment">#1 (./REPL[1]:30) with specialization (::var"#1#2")(::Tuple{Int64, Float64})</span>
</code></pre>
<p>The first inference trigger
corresponds to compiling the top-level <code>process</code> function
we called
(this is the inference trigger we always expect to see).
But then it looks like Julia had to stop running
to compile several different methods of <code>process_item</code>.</p>
<p>Inference triggers tell us that type-instabilities existed
when calling the given functions,
but what we really want to know is
where these type-instabilities originated.
You'll note that each displayed inference trigger above
also indicates the calling function
by specifying <code>from &lt;calling function&gt;</code>.
(Note that the <code>from #1</code> in the above example
indicates <code>process_item</code> was called
from an anonymous function.)</p>
<p>We can use <code>accumulate_by_source</code>
to get an aggregated view
of what functions made calls via dynamic dispatch:</p>
<pre><code class="lang-julia">julia&gt; mtrigs = accumulate_by_source(<span class="hljs-built_in">Method</span>, itrigs)
<span class="hljs-number">2</span>-element <span class="hljs-built_in">Vector</span>{SnoopCompile.TaggedTriggers{<span class="hljs-built_in">Method</span>}}:
 eval_user_input(ast, backend::REPL.REPLBackend, mod::<span class="hljs-built_in">Module</span>) @ REPL ~/.julia/juliaup/julia-<span class="hljs-number">1.11</span><span class="hljs-number">.5</span>+<span class="hljs-number">0.</span>x64.linux.gnu/share/julia/stdlib/v1<span class="hljs-number">.11</span>/REPL/src/REPL.jl:<span class="hljs-number">247</span> (<span class="hljs-number">1</span> callees from <span class="hljs-number">1</span> callers)
 (::<span class="hljs-string">var"#1#2"</span>)(d) @ Main REPL[<span class="hljs-number">1</span>]:<span class="hljs-number">30</span> (<span class="hljs-number">7</span> callees from <span class="hljs-number">7</span> callers)
</code></pre>
<p>From this,
we can see that the example code
really has only one problematic function:
the anonymous function <code>var"#1#2"</code>.</p>
<h1 id="heading-diving-in-with-cthulhujl">Diving in with Cthulhu.jl</h1>
<p>Now that we have a rough idea
of where the type-instabilities come from,
we can drill down into the code
and pinpoint the precise causes
with Cthulhu.jl.
We can use the <code>ascend</code> function
on an inference trigger
to start investigating:</p>
<pre><code class="lang-julia">julia&gt; <span class="hljs-keyword">using</span> Cthulhu

julia&gt; ascend(itrigs[<span class="hljs-number">2</span>]) <span class="hljs-comment"># Skip `itrigs[1]` because that's the top-level compilation that should always occur.</span>
</code></pre>
<p><code>ascend</code> provides a menu that shows <code>process_item</code>
and the anonymous function.
Select the anonymous function and press Enter.
Here's a screenshot of the Cthulhu output:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1746228129119/ENLnH247B.png?auto=format" alt="Cthulhu output" /></p>
<p>Reading the output of Cthulhu.jl
takes some time to get used to
(especially when it can't display source code,
as in this example),
but the main thing to remember
is that red is bad.
See the <a target="_blank" href="https://github.com/JuliaDebug/Cthulhu.jl">Cthulhu.jl README</a> for more information.</p>
<p>In this example,
the source of the type-instability
was fairly easy to pinpoint.
I annotated the screenshot
to indicate from where the type-instability arose,
which is this <code>Core.Box</code> thing.
These are always bad;
they are essentially containers
that can hold values of any type,
hence the type-instability that arises
when accessing the contents.
In this particular case,
<code>Core.getfield(#self#, :val)</code> indicates
<code>val</code> is a variable
that was captured by the anonymous function.</p>
<p>Once we determine what caused the type-instability,
the solution varies on a case-by-case basis.
Some potential solutions may include:</p>
<ul>
<li>Ensure different branches of an <code>if</code> statement
return data of the same type.</li>
<li>Add a type annotation to help out inference.
For example,<pre><code class="lang-julia">x = <span class="hljs-built_in">Any</span>[<span class="hljs-number">1</span>]
y = do_something(x[<span class="hljs-number">1</span>]::<span class="hljs-built_in">Int</span>)
</code></pre>
</li>
<li>Make sure a container type has a concrete element type.
For example, <code>x = Int[]</code>, not <code>x = []</code>.</li>
<li>Avoid loops over heterogeneous <code>Tuple</code>s.</li>
<li>Use <code>let</code> blocks to define closures.
(See <a target="_blank" href="https://docs.julialang.org/en/v1/manual/performance-tips/#man-performance-captured">this section</a> of the Julia manual for more details.)</li>
</ul>
<p>We'll use this last solution in our example.
The anonymous function in question
is defined by the <code>do</code> block in <code>_process</code>.
So, let's fix the issue of the captured variable <code>val</code>:</p>
<pre><code class="lang-julia"><span class="hljs-keyword">module</span> Corrected

<span class="hljs-comment"># All other code is the same as in module `Original`.</span>

<span class="hljs-keyword">function</span> _process(::Alg1, data)

    val = data[<span class="hljs-number">1</span>]
    <span class="hljs-keyword">if</span> val &lt; <span class="hljs-number">0</span>
        val = -val
    <span class="hljs-keyword">end</span>

    f = <span class="hljs-keyword">let</span> val = val
        d -&gt; process_item(d, val)
    <span class="hljs-keyword">end</span>

    result = map(f, data)

    <span class="hljs-keyword">return</span> result

<span class="hljs-keyword">end</span>

<span class="hljs-comment"># All other code is the same as in module `Original`.</span>

<span class="hljs-keyword">end</span>
</code></pre>
<p>Now let's see what <code>@snoop_inference</code> says:</p>
<pre><code class="lang-julia">julia&gt; <span class="hljs-keyword">using</span> SnoopCompileCore

julia&gt; tinf = <span class="hljs-meta">@snoop_inference</span> Corrected.process(<span class="hljs-string">"alg1"</span>);

julia&gt; <span class="hljs-keyword">using</span> SnoopCompile

julia&gt; tinf
InferenceTimingNode: <span class="hljs-number">0.113669</span>/<span class="hljs-number">0.183888</span> on Core.Compiler.Timings.ROOT() with <span class="hljs-number">1</span> direct children
</code></pre>
<p>There's just one direct child.
Hooray, type-stability!</p>
<p>Let's see how performance compares:</p>
<pre><code class="lang-julia">julia&gt; <span class="hljs-keyword">using</span> BenchmarkTools

julia&gt; <span class="hljs-meta">@btime</span> Original.process(<span class="hljs-string">"alg1"</span>);
  <span class="hljs-number">220.506</span> ns (<span class="hljs-number">16</span> allocations: <span class="hljs-number">496</span> bytes)

julia&gt; <span class="hljs-meta">@btime</span> Corrected.process(<span class="hljs-string">"alg1"</span>);
  <span class="hljs-number">51.104</span> ns (<span class="hljs-number">8</span> allocations: <span class="hljs-number">288</span> bytes)
</code></pre>
<p>Awesome, the improved code is ~4 times faster!</p>
<h1 id="heading-summary">Summary</h1>
<p>In this post,
we learned about type-stability
and how type-instabilities affect compilation
and runtime performance.
We also walked through an example
that demonstrated how to use SnoopCompile.jl and Cthulhu.jl
to pinpoint the sources of type-instability in a program.
Even though the example in this post
was a relatively easy fix,
the principles discussed apply to more complicated programs as well.
And, of course,
check out the documentation for SnoopCompile.jl and Cthulhu.jl
for further examples to bolster your understanding.</p>
<p>Do you have type-instabilities that plague your Julia code?
<a target="_blank" href="https://glcs.io/software-development/">Contact us</a>, and we can help you out!</p>
<h1 id="heading-additional-links">Additional Links</h1>
<ul>
<li><a target="_blank" href="https://timholy.github.io/SnoopCompile.jl/stable/">SnoopCompile.jl Docs</a><ul>
<li>Documentation for SnoopCompile.jl.</li>
</ul>
</li>
<li><a target="_blank" href="https://github.com/JuliaDebug/Cthulhu.jl">Cthulhu.jl Docs</a><ul>
<li>Documentation (the package's README) for Cthulhu.jl.</li>
</ul>
</li>
<li><a target="_blank" href="https://docs.julialang.org/en/v1/manual/performance-tips/">Julia Performance Tips</a><ul>
<li>Very good tips for improving the performance of Julia code.</li>
</ul>
</li>
<li><a target="_blank" href="https://glcs.io/software-development/">GLCS Software Development</a><ul>
<li>Connect with us for Julia development help.</li>
</ul>
</li>
<li><a target="_blank" href="https://blog.glcs.io/juliacon-2025-preview">Upcoming JuliaCon Talk Announcement</a><ul>
<li>Check out our JuliaCon 2025 talk announcement!</li>
</ul>
</li>
</ul>
]]></content:encoded></item><item><title><![CDATA[Julia and MATLAB can coexist. Let us show you how.]]></title><description><![CDATA[This post was written by Steven Whitaker.
This post is a teaser for my JuliaCon 2025 talk.
If you want to see the talk itself,
check out this other blog post!

Have you ever wished you could start using the Julia programming language
to develop custo...]]></description><link>https://blog.glcs.io/juliacon-2025-preview</link><guid isPermaLink="true">https://blog.glcs.io/juliacon-2025-preview</guid><category><![CDATA[integration]]></category><category><![CDATA[Julia]]></category><category><![CDATA[Matlab]]></category><category><![CDATA[modeling]]></category><category><![CDATA[Open Source]]></category><dc:creator><![CDATA[Great Lakes Consulting]]></dc:creator><pubDate>Mon, 21 Apr 2025 18:15:21 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1745255311774/Pax_MdfL6.png?auto=format" length="0" type="image/jpeg"/><content:encoded><![CDATA[<blockquote>
<p>This post was written by Steven Whitaker.</p>
<p>This post is a teaser for my JuliaCon 2025 talk.
If you want to see the talk itself,
check out this other <a target="_blank" href="https://blog.glcs.io/juliacon-2025">blog post</a>!</p>
</blockquote>
<p>Have you ever wished you could start using the <a target="_blank" href="https://julialang.org/">Julia programming language</a>
to develop custom models?
Does the idea of replacing
outdated MATLAB<sup>®</sup> code and models
seem overwhelming?</p>
<p>Or maybe you don't plan to replace all MATLAB code,
but wouldn't it be exciting
to integrate Julia code
into existing workflows?</p>
<p>Also, technicalities aside,
how do you convince your colleagues
to make the leap
into the Julia ecosystem?</p>
<p>I'm excited to share
an announcement!
At this year's <a target="_blank" href="https://juliacon.org/2025/">JuliaCon</a>,
I will be speaking about
a small but significant step
you can take to start adding Julia
to your MATLAB codebase.</p>
<p>Great news!
You can transition to Julia smoothly
without completely abandoning MATLAB.
There's a straightforward method
to embrace the best of both worlds,
so you won't need
to rewrite your legacy models from scratch.</p>
<p>I'll give my full <a target="_blank" href="https://pretalx.com/juliacon-2025/talk/9WCTQR/">talk</a> in July,
but if you don't want to wait,
keep reading
for a sneak peek!</p>
<h1 id="heading-background">Background</h1>
<p>The <a target="_blank" href="https://glcs.io">GLCS.io</a> team
has been developing Julia-based solutions since 2015.
Over the past 4 years,
we've had the pleasure of redesigning and enhancing Julia models
for our clients in the finance, science, and engineering sectors.
Julia's incredible speed and versatility have transformed
how we tackle complex computations together.
However,
we also fully acknowledge the reality:
MATLAB continues to hold a significant place
in countless companies and research labs worldwide.</p>
<p>For decades,
MATLAB has been the benchmark
for data analysis, modeling, and simulation
across scientific and engineering fields.
There are likely hundreds of thousands of MATLAB licenses in use,
with millions of users
supporting an unimaginable number of models and codebases.</p>
<p>Even for a single company,
fully transitioning to Julia
often feels insurmountable.
The vast amount of existing MATLAB code
presents a significant challenge for any team considering adopting Julia.</p>
<p>Yet, unlocking Julia's power is vital for companies
aiming to excel in today's competitive landscape.
The question isn't <strong>if</strong> companies
should adopt Julia---it's <strong>how</strong> to do it.</p>
<p>Companies should blend Julia
with their MATLAB environments,
ensuring minimal disruption and optimal resource use.
This strategic integration
delivers meaningful gains
in accuracy, performance, and scalability
to transform operations and drive success.</p>
<h1 id="heading-juliacon-preview">JuliaCon Preview</h1>
<p>At JuliaCon,
I'm excited to share how you
can seamlessly integrate Julia
into existing MATLAB workflows---a process
that has delivered up to 100x performance improvements
while enhancing code quality and functionality.
Through a real-world model,
I'll highlight design patterns,
benchmark comparisons,
and valuable business case insights
to demonstrate the transformative potential of integrating Julia.</p>
<p>(Spoiler alert:
the performance improvement is <strong>more than 100x</strong>
for the example I will show at JuliaCon.)</p>
<h1 id="heading-what-we-offer">What We Offer</h1>
<p>Unlock high-performance modeling!
Our dedicated team is here
to integrate Julia into your MATLAB workflows.
Experience a strategic, step-by-step process tailored
for seamless Julia-MATLAB integration,
focused on efficiency and delivering measurable results:</p>
<ol>
<li><em>Tailored Assessment</em>:
Pinpoint challenges and opportunities for Julia to address.</li>
<li><em>MATLAB Benchmarking</em>:
Establish a performance baseline to measure progress and impact.</li>
<li><em>Julia Model Development</em>:
Convert MATLAB models to Julia
or assist your team in doing so.</li>
<li><em>Julia Integration</em>:
Combine Julia's capabilities with your existing MATLAB workflows for optimal results.</li>
<li><em>Roadmap Alignment</em>:
Validate performance improvements,
create a strong business case for leadership,
and agree on future support and innovation.</li>
</ol>
<p>Check out our <a target="_blank" href="https://glcs.io/julia-matlab">website</a> for more details.</p>
<h1 id="heading-summary">Summary</h1>
<p>By attending my JuliaCon talk,
you will learn
how to seamlessly integrate Julia
into your existing MATLAB codebase.
And by leveraging our support at GLCS,
you can adopt Julia
without disruption---unlocking faster computations,
improved models,
and better scalability
while retaining the strengths
of your MATLAB codebase.</p>
<p>Are you or someone you know
excited about harnessing the power of Julia and MATLAB together?
Let's connect! Schedule a consultation today
to discover incredible performance gains of 100x or more.</p>
<h1 id="heading-additional-links">Additional Links</h1>
<ul>
<li><a target="_blank" href="https://blog.glcs.io/juliacon-2025">JuliaCon 2025 Talk</a><ul>
<li>Check out the actual JuliaCon talk!
Find links to the recording, talk submission, and presentation slides.</li>
</ul>
</li>
<li><a target="_blank" href="https://glcs.io/julia-matlab">Julia and MATLAB<sup>®</sup> Integration at GLCS</a><ul>
<li>Our web page detailing our Julia and MATLAB<sup>®</sup> integration service.</li>
</ul>
</li>
</ul>
<p>MATLAB is a registered trademark
of The MathWorks, Inc.</p>
<p>Cover image:
The JuliaCon 2025 logo
was obtained from <a target="_blank" href="https://juliacon.org/2025/">https://juliacon.org/2025/</a>.</p>
]]></content:encoded></item><item><title><![CDATA[Best Practices for Testing Your Julia Packages]]></title><description><![CDATA[This post was written by Steven Whitaker.

The Julia programming language
is a high-level language
that is known, at least in part,
for its excellent package manager
and outstanding composability.
(See another blog post that illustrates this composab...]]></description><link>https://blog.glcs.io/package-testing</link><guid isPermaLink="true">https://blog.glcs.io/package-testing</guid><category><![CDATA[Julia]]></category><category><![CDATA[library]]></category><category><![CDATA[Open Source]]></category><category><![CDATA[package]]></category><category><![CDATA[General Programming]]></category><dc:creator><![CDATA[Great Lakes Consulting]]></dc:creator><pubDate>Thu, 26 Dec 2024 19:03:02 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1721947378410/FNqqlwn5u.png?auto=format" length="0" type="image/jpeg"/><content:encoded><![CDATA[<blockquote>
<p>This post was written by Steven Whitaker.</p>
</blockquote>
<p>The <a target="_blank" href="https://julialang.org/">Julia programming language</a>
is a high-level language
that is known, at least in part,
for its excellent package manager
and outstanding composability.
(See <a target="_blank" href="https://blog.glcs.io/multiple-dispatch">another blog post</a> that illustrates this composability.)</p>
<p>Julia makes it easy
for anybody to create their own package,
and Julia's package manager streamlines package development and testing.
This ease of package development
encourages developers to split reusable chunks of code
into individual packages,
further enhancing Julia's composability.</p>
<p>In <a target="_blank" href="https://blog.glcs.io/package-creation">our previous post</a>,
we discussed how to create and register your own package.
However,
to encourage people to actually use your package,
it helps to have an assurance
that the package works.
This is why testing is important.
(Plus, you also want to know your package works, right?)</p>
<p>In this post,
we will learn about some of the tools
Julia provides for testing packages.
We will also learn how to use GitHub Actions
to run package tests
against commits and/or pull requests
to check whether code changes break package functionality.</p>
<p>This post assumes you are comfortable navigating the Julia REPL.
If you need a refresher,
check out <a target="_blank" href="https://blog.glcs.io/julia-repl">our post on the Julia REPL</a>.</p>
<h1 id="heading-example-package">Example Package</h1>
<p>We will use a custom package called Averages.jl
to illustrate how to implement testing in Julia.</p>
<p>The <code>Project.toml</code> looks like:</p>
<pre><code class="lang-toml"><span class="hljs-attr">name</span> = <span class="hljs-string">"Averages"</span>
<span class="hljs-attr">uuid</span> = <span class="hljs-string">"1fc6e63b-fe0f-463a-8652-42f2a29b8cc6"</span>
<span class="hljs-attr">version</span> = <span class="hljs-string">"0.1.0"</span>

<span class="hljs-section">[deps]</span>
<span class="hljs-attr">Statistics</span> = <span class="hljs-string">"10745b16-79ce-11e8-11f9-7d13ad32a3b2"</span>

<span class="hljs-section">[extras]</span>
<span class="hljs-attr">Test</span> = <span class="hljs-string">"8dfed614-e22c-5e08-85e1-65c5234f0b40"</span>

<span class="hljs-section">[targets]</span>
<span class="hljs-attr">test</span> = [<span class="hljs-string">"Test"</span>]
</code></pre>
<p>Note that this <code>Project.toml</code> has two more sections besides <code>[deps]</code>:</p>
<ul>
<li><code>[extras]</code> is used to indicate additional packages
that are not direct dependencies of the package.
In this example,
Test is not used in Averages.jl itself;
Test is used only when running tests.</li>
<li><code>[targets]</code> is used to specify what packages are used where.
In this example,
<code>test = ["Test"]</code> indicates that the Test package should be used
when testing Averages.jl.</li>
</ul>
<p>The actual package code in <code>src/Averages.jl</code> looks like:</p>
<pre><code class="lang-julia"><span class="hljs-keyword">module</span> Averages

<span class="hljs-keyword">using</span> Statistics

<span class="hljs-keyword">export</span> compute_average

compute_average(x) = (check_real(x); mean(x))

<span class="hljs-keyword">function</span> compute_average(a, b...)

    check_real(a)

    N = length(a)
    <span class="hljs-keyword">for</span> (i, x) <span class="hljs-keyword">in</span> enumerate(b)
        check_real(x)
        check_length(i + <span class="hljs-number">1</span>, x, N)
    <span class="hljs-keyword">end</span>

    T = float(promote_type(eltype(a), eltype.(b)...))
    average = <span class="hljs-built_in">Vector</span>{T}(undef, N)
    average .= a
    <span class="hljs-keyword">for</span> x <span class="hljs-keyword">in</span> b
        average .+= x
    <span class="hljs-keyword">end</span>
    average ./= length(b) + <span class="hljs-number">1</span>

    <span class="hljs-keyword">return</span> a <span class="hljs-keyword">isa</span> <span class="hljs-built_in">Real</span> ? average[<span class="hljs-number">1</span>] : average

<span class="hljs-keyword">end</span>

<span class="hljs-keyword">function</span> check_real(x)

    T = eltype(x)
    T &lt;: <span class="hljs-built_in">Real</span> || throw(<span class="hljs-built_in">ArgumentError</span>(<span class="hljs-string">"only real numbers are supported; unsupported type <span class="hljs-variable">$T</span>"</span>))

<span class="hljs-keyword">end</span>

<span class="hljs-keyword">function</span> check_length(i, x, expected)

    N = length(x)
    N == expected || throw(<span class="hljs-built_in">DimensionMismatch</span>(<span class="hljs-string">"the length of input <span class="hljs-variable">$i</span> does not match the length of the first input: <span class="hljs-variable">$N</span> != <span class="hljs-variable">$expected</span>"</span>))

<span class="hljs-keyword">end</span>

<span class="hljs-keyword">end</span>
</code></pre>
<h1 id="heading-adding-tests">Adding Tests</h1>
<p>Tests for a package live in <code>test/runtests.jl</code>.
(The file name is important!)
Inside this file,
two main testing utilities are used:
<code>@testset</code> and <code>@test</code>.
<code>@test_throws</code> is also useful
for testing error paths.
The Test standard library provides all of these macros.</p>
<ul>
<li><code>@testset</code> is used to organize tests into cohesive blocks.</li>
<li><code>@test</code> is used to actually test package functionality.</li>
<li><code>@test_throws</code> is used to ensure the package throws the errors it should.</li>
</ul>
<p>Here is how <code>test/runtests.jl</code> might look for Averages.jl:</p>
<pre><code class="lang-julia"><span class="hljs-keyword">using</span> Averages
<span class="hljs-keyword">using</span> Test

<span class="hljs-meta">@testset</span> <span class="hljs-string">"Averages.jl"</span> <span class="hljs-keyword">begin</span>

    a = [<span class="hljs-number">1</span>, <span class="hljs-number">2</span>, <span class="hljs-number">3</span>]
    b = [<span class="hljs-number">4.0</span>, <span class="hljs-number">5.0</span>, <span class="hljs-number">6.0</span>]
    c = (<span class="hljs-built_in">BigInt</span>(<span class="hljs-number">7</span>), <span class="hljs-number">8f0</span>, <span class="hljs-built_in">Int32</span>(<span class="hljs-number">9</span>))
    d = <span class="hljs-number">10</span>
    <span class="hljs-literal">e</span> = <span class="hljs-number">11.0</span>
    bad = [<span class="hljs-string">"hi"</span>, <span class="hljs-string">"hello"</span>, <span class="hljs-string">"hey"</span>]

    <span class="hljs-meta">@testset</span> <span class="hljs-string">"`compute_average(x)`"</span> <span class="hljs-keyword">begin</span>

        <span class="hljs-meta">@test</span> compute_average(a) == <span class="hljs-number">2</span>
        <span class="hljs-meta">@test</span> compute_average(a) <span class="hljs-keyword">isa</span> <span class="hljs-built_in">Float64</span>
        <span class="hljs-meta">@test</span> compute_average(c) == <span class="hljs-number">8</span>
        <span class="hljs-meta">@test</span> compute_average(c) <span class="hljs-keyword">isa</span> <span class="hljs-built_in">BigFloat</span>
        <span class="hljs-meta">@test</span> compute_average(d) == <span class="hljs-number">10</span>

    <span class="hljs-keyword">end</span>

    <span class="hljs-meta">@testset</span> <span class="hljs-string">"`compute_average(a, b...)`"</span> <span class="hljs-keyword">begin</span>

        <span class="hljs-meta">@test</span> compute_average(a, a) == a
        <span class="hljs-meta">@test</span> compute_average(a, b) == [<span class="hljs-number">2.5</span>, <span class="hljs-number">3.5</span>, <span class="hljs-number">4.5</span>]
        <span class="hljs-meta">@test</span> compute_average(a, b, c) == b
        <span class="hljs-meta">@test</span> compute_average(a, b, c) <span class="hljs-keyword">isa</span> <span class="hljs-built_in">Vector</span>{<span class="hljs-built_in">Float64</span>}
        <span class="hljs-meta">@test</span> compute_average(b, b, b) == b
        <span class="hljs-meta">@test</span> compute_average(d, <span class="hljs-literal">e</span>) == <span class="hljs-number">10.5</span>

    <span class="hljs-keyword">end</span>

    <span class="hljs-meta">@testset</span> <span class="hljs-string">"Error Handling"</span> <span class="hljs-keyword">begin</span>

        <span class="hljs-meta">@test_throws</span> <span class="hljs-built_in">ArgumentError</span> compute_average(<span class="hljs-literal">im</span>)
        <span class="hljs-meta">@test_throws</span> <span class="hljs-built_in">ArgumentError</span> compute_average(a, bad)
        <span class="hljs-meta">@test_throws</span> <span class="hljs-built_in">ArgumentError</span> compute_average(bad, c)
        <span class="hljs-meta">@test_throws</span> <span class="hljs-built_in">DimensionMismatch</span> compute_average(a, b[<span class="hljs-number">1</span>:<span class="hljs-number">2</span>])
        <span class="hljs-meta">@test_throws</span> <span class="hljs-built_in">DimensionMismatch</span> compute_average(a[<span class="hljs-number">1</span>:<span class="hljs-number">2</span>], b)

    <span class="hljs-keyword">end</span>

<span class="hljs-keyword">end</span>
</code></pre>
<p>Now let's look more closely at the macros used:</p>
<ul>
<li><code>@testset</code> can be given a label
to help organize the reporting Julia does
at the end of testing.
Besides that,
<code>@testset</code> wraps around a set of tests
(including other <code>@testset</code>s).</li>
<li><code>@test</code> is given an expression
that evaluates to a boolean.
If the boolean is <code>true</code>, the test passes;
otherwise it fails.</li>
<li><code>@test_throws</code> takes two inputs:
an error type and then an expression.
The test passes if the expression
throws an error of the given type.</li>
</ul>
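<p>To see these macros in isolation,
independent of the Averages.jl example,
the following snippet runs on its own in the REPL:</p>
<pre><code class="lang-julia">using Test

@testset "macro demo" begin
    @test 1 + 1 == 2                   # passes: the expression is `true`
    @test sqrt(2)^2 ≈ 2                # `≈` (`isapprox`) for floating-point comparisons
    @test_throws DomainError sqrt(-1)  # passes: the expected error is thrown
end
</code></pre>
<p>Note the use of <code>≈</code> in the second test:
comparing floating-point results with <code>==</code>
is a common source of spurious test failures.</p>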
<h2 id="heading-testing-against-other-packages">Testing Against Other Packages</h2>
<p>In some cases,
you might want to ensure your package
is compatible with a type defined in another package.
For our example,
let's test against <a target="_blank" href="https://github.com/JuliaArrays/StaticArrays.jl">StaticArrays.jl</a>.
Our package does not depend on StaticArrays.jl,
so we need to add it as a test-only dependency
by editing the <code>[extras]</code> and <code>[targets]</code> sections
in the <code>Project.toml</code>:</p>
<pre><code class="lang-toml"><span class="hljs-section">[extras]</span>
<span class="hljs-attr">StaticArrays</span> = <span class="hljs-string">"90137ffa-7385-5640-81b9-e52037218182"</span>
<span class="hljs-attr">Test</span> = <span class="hljs-string">"8dfed614-e22c-5e08-85e1-65c5234f0b40"</span>

<span class="hljs-section">[targets]</span>
<span class="hljs-attr">test</span> = [<span class="hljs-string">"StaticArrays"</span>, <span class="hljs-string">"Test"</span>]
</code></pre>
<p>(Note that I grabbed the UUID for StaticArrays.jl
from its <a target="_blank" href="https://github.com/JuliaArrays/StaticArrays.jl/blob/master/Project.toml"><code>Project.toml</code> on GitHub</a>.)</p>
<p>Then we can add some tests
to make sure <code>compute_average</code> is generic enough
to work with <code>StaticArray</code>s:</p>
<pre><code class="lang-julia"><span class="hljs-keyword">using</span> Averages
<span class="hljs-keyword">using</span> Test
<span class="hljs-keyword">using</span> StaticArrays

<span class="hljs-meta">@testset</span> <span class="hljs-string">"Averages.jl"</span> <span class="hljs-keyword">begin</span>

    ⋮

    <span class="hljs-meta">@testset</span> <span class="hljs-string">"StaticArrays.jl"</span> <span class="hljs-keyword">begin</span>

        s = SA[<span class="hljs-number">12</span>, <span class="hljs-number">13</span>, <span class="hljs-number">14</span>]

        <span class="hljs-meta">@test</span> compute_average(s) == <span class="hljs-number">13</span>
        <span class="hljs-meta">@test</span> compute_average(s, s) == [<span class="hljs-number">12</span>, <span class="hljs-number">13</span>, <span class="hljs-number">14</span>]
        <span class="hljs-meta">@test</span> compute_average(a, b, s) == [<span class="hljs-number">17</span>/<span class="hljs-number">3</span>, <span class="hljs-number">20</span>/<span class="hljs-number">3</span>, <span class="hljs-number">23</span>/<span class="hljs-number">3</span>]
        <span class="hljs-meta">@test</span> compute_average(s, a, c) == [<span class="hljs-number">20</span>/<span class="hljs-number">3</span>, <span class="hljs-number">23</span>/<span class="hljs-number">3</span>, <span class="hljs-number">26</span>/<span class="hljs-number">3</span>]

    <span class="hljs-keyword">end</span>

<span class="hljs-keyword">end</span>
</code></pre>
<h1 id="heading-running-tests-locally">Running Tests Locally</h1>
<p>Now Averages.jl is ready for testing.
To run package tests on your own computer,
start Julia, activate the package environment,
and then run <code>test</code> from the package prompt:</p>
<pre><code class="lang-julia">(<span class="hljs-meta">@v1</span>.X) pkg&gt; activate /path/to/Averages

(Averages) pkg&gt; test
</code></pre>
<p>The first thing <code>test</code> does
is set up a temporary package environment for testing
that includes the packages defined in the <code>test</code> target
in the <code>Project.toml</code>.
Then it runs the tests and displays the result:</p>
<pre><code class="lang-plaintext">     Testing Running tests...
Test Summary: | Pass  Total  Time
Averages.jl   |   20     20  0.7s
     Testing Averages tests passed
</code></pre>
<p>If a test fails,
the result looks like this:</p>
<pre><code class="lang-plaintext">     Testing Running tests...
`compute_average(a, b...)`: Test Failed at /path/to/Averages/test/runtests.jl:27
  Expression: compute_average(a, b) == [2.0, 3.5, 4.5]
   Evaluated: [2.5, 3.5, 4.5] == [2.0, 3.5, 4.5]

Stacktrace:
 [1] macro expansion
   @ /path/to/julia-1.X.Y/share/julia/stdlib/v1.X/Test/src/Test.jl:672 [inlined]
 [2] macro expansion
   @ /path/to/Averages/test/runtests.jl:27 [inlined]
 [3] macro expansion
   @ /path/to/julia-1.X.Y/share/julia/stdlib/v1.X/Test/src/Test.jl:1577 [inlined]
 [4] macro expansion
   @ /path/to/Averages/test/runtests.jl:26 [inlined]
 [5] macro expansion
   @ /path/to/julia-1.X.Y/share/julia/stdlib/v1.X/Test/src/Test.jl:1577 [inlined]
 [6] top-level scope
   @ /path/to/Averages/test/runtests.jl:7
Test Summary:                | Pass  Fail  Total  Time
Averages.jl                  |   19     1     20  0.9s
  `compute_average(x)`       |    5            5  0.1s
  `compute_average(a, b...)` |    5     1      6  0.6s
  Error Handling             |    5            5  0.0s
  StaticArrays.jl            |    4            4  0.2s
ERROR: LoadError: Some tests did not pass: 19 passed, 1 failed, 0 errored, 0 broken.
in expression starting at /path/to/Averages/test/runtests.jl:5
ERROR: Package Averages errored during testing
</code></pre>
<p>Some things to note:</p>
<ul>
<li>When all tests in a test set pass,
the test summary does not report the individual results
of nested test sets.
When a test fails,
results of nested test sets are reported individually
to pinpoint where the failure occurred.</li>
<li>When a test fails,
the file and line number of the failing test are reported,
along with the expression that failed.
This information is displayed
for all failures that occur.</li>
<li>The test summary reports how many tests passed and how many failed
in each test set,
in addition to how long each test set took.</li>
<li>Tests in a test set continue to run after a test fails.
To have a test set stop on failure,
use the <code>failfast</code> option:<pre><code class="lang-julia"><span class="hljs-meta">@testset</span> failfast = <span class="hljs-literal">true</span> <span class="hljs-string">"Averages.jl"</span> <span class="hljs-keyword">begin</span>
</code></pre>
(This option is available only in Julia 1.9 and later.)</li>
</ul>
<p>Now, when developing Averages.jl,
we can run the tests locally
to ensure we don't break any functionality!</p>
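<p>The same tests can also be run non-interactively,
e.g., from a script,
by using the Pkg API instead of the package prompt
(the path below is illustrative):</p>
<pre><code class="lang-julia">using Pkg
Pkg.activate("/path/to/Averages")  # illustrative path to the package
Pkg.test()                         # equivalent to `test` at the package prompt
</code></pre>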
<h1 id="heading-running-tests-with-github-actions">Running Tests with GitHub Actions</h1>
<p>Besides running tests locally,
one can use GitHub Actions to run tests
on one of GitHub's servers.
One advantage of this approach
is that it enables automated testing
on various machines/operating systems
and across various Julia versions.
Automating tests in this way is an essential part of continuous integration (CI)
(so much so that the phrase "running CI"
is equivalent to "running tests via GitHub Actions",
even though CI technically involves more than just testing).</p>
<p>To enable testing via GitHub Actions,
we just need to add an appropriate <code>.yml</code> file
in the <code>.github/workflows</code> directory of our package.
As mentioned in <a target="_blank" href="https://blog.glcs.io/package-creation">our previous post</a>,
<a target="_blank" href="https://github.com/JuliaCI/PkgTemplates.jl">PkgTemplates.jl</a> can automatically generate
the necessary <code>.yml</code> file.
This is the default CI workflow generated by PkgTemplates.jl:</p>
<pre><code class="lang-yml"><span class="hljs-attr">name:</span> <span class="hljs-string">CI</span>
<span class="hljs-attr">on:</span>
  <span class="hljs-attr">push:</span>
    <span class="hljs-attr">branches:</span>
      <span class="hljs-bullet">-</span> <span class="hljs-string">main</span>
    <span class="hljs-attr">tags:</span> [<span class="hljs-string">'*'</span>]
  <span class="hljs-attr">pull_request:</span>
  <span class="hljs-attr">workflow_dispatch:</span>
<span class="hljs-attr">concurrency:</span>
  <span class="hljs-comment"># Skip intermediate builds: always.</span>
  <span class="hljs-comment"># Cancel intermediate builds: only if it is a pull request build.</span>
  <span class="hljs-attr">group:</span> <span class="hljs-string">${{</span> <span class="hljs-string">github.workflow</span> <span class="hljs-string">}}-${{</span> <span class="hljs-string">github.ref</span> <span class="hljs-string">}}</span>
  <span class="hljs-attr">cancel-in-progress:</span> <span class="hljs-string">${{</span> <span class="hljs-string">startsWith(github.ref,</span> <span class="hljs-string">'refs/pull/'</span><span class="hljs-string">)</span> <span class="hljs-string">}}</span>
<span class="hljs-attr">jobs:</span>
  <span class="hljs-attr">test:</span>
    <span class="hljs-attr">name:</span> <span class="hljs-string">Julia</span> <span class="hljs-string">${{</span> <span class="hljs-string">matrix.version</span> <span class="hljs-string">}}</span> <span class="hljs-bullet">-</span> <span class="hljs-string">${{</span> <span class="hljs-string">matrix.os</span> <span class="hljs-string">}}</span> <span class="hljs-bullet">-</span> <span class="hljs-string">${{</span> <span class="hljs-string">matrix.arch</span> <span class="hljs-string">}}</span> <span class="hljs-bullet">-</span> <span class="hljs-string">${{</span> <span class="hljs-string">github.event_name</span> <span class="hljs-string">}}</span>
    <span class="hljs-attr">runs-on:</span> <span class="hljs-string">${{</span> <span class="hljs-string">matrix.os</span> <span class="hljs-string">}}</span>
    <span class="hljs-attr">timeout-minutes:</span> <span class="hljs-number">60</span>
    <span class="hljs-attr">permissions:</span> <span class="hljs-comment"># needed to allow julia-actions/cache to proactively delete old caches that it has created</span>
      <span class="hljs-attr">actions:</span> <span class="hljs-string">write</span>
      <span class="hljs-attr">contents:</span> <span class="hljs-string">read</span>
    <span class="hljs-attr">strategy:</span>
      <span class="hljs-attr">fail-fast:</span> <span class="hljs-literal">false</span>
      <span class="hljs-attr">matrix:</span>
        <span class="hljs-attr">version:</span>
          <span class="hljs-bullet">-</span> <span class="hljs-string">'1.10'</span>
          <span class="hljs-bullet">-</span> <span class="hljs-string">'1.6'</span>
          <span class="hljs-bullet">-</span> <span class="hljs-string">'pre'</span>
        <span class="hljs-attr">os:</span>
          <span class="hljs-bullet">-</span> <span class="hljs-string">ubuntu-latest</span>
        <span class="hljs-attr">arch:</span>
          <span class="hljs-bullet">-</span> <span class="hljs-string">x64</span>
    <span class="hljs-attr">steps:</span>
      <span class="hljs-bullet">-</span> <span class="hljs-attr">uses:</span> <span class="hljs-string">actions/checkout@v4</span>
      <span class="hljs-bullet">-</span> <span class="hljs-attr">uses:</span> <span class="hljs-string">julia-actions/setup-julia@v2</span>
        <span class="hljs-attr">with:</span>
          <span class="hljs-attr">version:</span> <span class="hljs-string">${{</span> <span class="hljs-string">matrix.version</span> <span class="hljs-string">}}</span>
          <span class="hljs-attr">arch:</span> <span class="hljs-string">${{</span> <span class="hljs-string">matrix.arch</span> <span class="hljs-string">}}</span>
      <span class="hljs-bullet">-</span> <span class="hljs-attr">uses:</span> <span class="hljs-string">julia-actions/cache@v2</span>
      <span class="hljs-bullet">-</span> <span class="hljs-attr">uses:</span> <span class="hljs-string">julia-actions/julia-buildpkg@v1</span>
      <span class="hljs-bullet">-</span> <span class="hljs-attr">uses:</span> <span class="hljs-string">julia-actions/julia-runtest@v1</span>
</code></pre>
<p>For most users,
the most relevant fields to customize
are <code>version</code> and <code>os</code>
(under <code>jobs: test: strategy: matrix</code>).
Under <code>os</code>,
specify the operating systems to run tests on
(e.g., <code>ubuntu-latest</code>, <code>windows-latest</code>, <code>macOS-latest</code>).
Under <code>version</code>,
specify the versions of Julia to use when testing:</p>
<ul>
<li><code>'1.X'</code> means run on Julia 1.X.Y,
where Y is the latest patch release
of Julia 1.X.
For example,
<code>'1.9'</code> means run on Julia 1.9.4.</li>
<li><code>'1'</code> means run on the latest stable version of Julia.</li>
<li><code>'pre'</code> means run on the latest pre-release version of Julia.</li>
<li><code>'lts'</code> means run on Julia's long-term support (LTS) version.</li>
</ul>
<p>Usually,
it makes sense just to test <code>'1'</code> and <code>'pre'</code>
to ensure compatibility with the current
and upcoming Julia versions.</p>
<p>One can also fine-tune the <code>version</code> and <code>os</code> fields,
as well as other fields,
when generating a package
with PkgTemplates.jl.
For example,
to generate the <code>.yml</code> file
to run tests only on Windows
with Julia 1.8 and the latest pre-release version of Julia:</p>
<pre><code class="lang-julia"><span class="hljs-keyword">using</span> PkgTemplates
gha = GitHubActions(; linux = <span class="hljs-literal">false</span>, windows = <span class="hljs-literal">true</span>, extra_versions = [<span class="hljs-string">"1.8"</span>, <span class="hljs-string">"pre"</span>])
t = Template(; dir = <span class="hljs-string">"."</span>, plugins = [gha])
t(<span class="hljs-string">"MyPackage"</span>)
</code></pre>
<p>Note that the <code>.yml</code> file generated
will also include testing on Julia 1.6.
The <code>Template</code> constructor has a keyword argument <code>julia</code>
that sets the minimum version of Julia
you want your package to support,
and this version is included in testing.
As of this writing,
by default the minimum version is Julia 1.6.</p>
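<p>Based on the <code>julia</code> keyword argument described above,
raising the minimum supported version might look like this
(a sketch; adjust the version to your needs):</p>
<pre><code class="lang-julia">using PkgTemplates
gha = GitHubActions(; linux = false, windows = true, extra_versions = ["1.8", "pre"])
t = Template(; dir = ".", julia = v"1.10", plugins = [gha])  # minimum supported Julia: 1.10
t("MyPackage")
</code></pre>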
<p>See the PkgTemplates.jl docs
about <a target="_blank" href="https://juliaci.github.io/PkgTemplates.jl/stable/user/#PkgTemplates.Template"><code>Template</code></a> and <a target="_blank" href="https://juliaci.github.io/PkgTemplates.jl/stable/user/#PkgTemplates.GitHubActions"><code>GitHubActions</code></a>
for more details
on customizing the <code>.yml</code> file.
See also the <a target="_blank" href="https://docs.github.com/en/actions">GitHub Actions docs</a>,
and in particular the <a target="_blank" href="https://docs.github.com/en/actions/writing-workflows/workflow-syntax-for-github-actions">workflow syntax docs</a>,
for more details on what makes up the <code>.yml</code> file.
(Be warned, these docs are quite lengthy
and aren't the most practical starting point
for getting a CI workflow up and running.
For a more approachable overview of the <code>.yml</code> file,
consider looking at this <a target="_blank" href="https://docs.github.com/en/actions/use-cases-and-examples/building-and-testing/building-and-testing-python">tutorial for building and testing Python</a>.)</p>
<p>Once we push <code>.github/workflows/CI.yml</code> to GitHub,
whenever branch <code>main</code> is pushed to,
or a pull request (PR) is opened or pushed to,
our package's tests will run.
This is the essence of CI:
continuously making sure changes we make to our code
integrate well with the code base
(i.e., don't break anything).
By running tests against PRs,
we can be sure the changes we make
don't break existing functionality.</p>
<p>One neat thing about GitHub Actions
is that GitHub provides a status badge/icon
that you can display in your package's README.
This badge lets people know</p>
<ol>
<li>that your package is regularly tested, and</li>
<li>whether the current state of your package passes those tests.</li>
</ol>
<p>In other words,
this badge is a good way
to boost confidence that your package is suitable for use.
You can add this badge to your package's README
by adding something like the following markdown:</p>
<pre><code class="lang-markdown">[<span class="hljs-string">![CI</span>](<span class="hljs-link">https://github.com/username/Averages.jl/actions/workflows/CI.yml/badge.svg</span>)](<span class="hljs-link">https://github.com/username/Averages.jl/actions/workflows/CI.yml</span>)
</code></pre>
<p>And it will display as follows:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1721944941015/pqZBVp7JI.svg?auto=format" alt="GitHub CI badge" /></p>
<h1 id="heading-summary">Summary</h1>
<p>In this post,
we learned how to add tests
to our own Julia package.
We also learned how to enable CI with GitHub Actions
to run our tests against code changes
to ensure our package remains in working order.</p>
<p>How difficult was it for you to set up CI for the first time?
Do you have any tips for beginners?
Let us know in the comments below!</p>
<p>Now that you have testing down,
learn how to conditionally extend package functionality
in another
<a target="_blank" href="https://blog.glcs.io/package-extensions">post about package extensions</a>!</p>
<h1 id="heading-additional-links">Additional Links</h1>
<ul>
<li><a target="_blank" href="https://docs.julialang.org/en/v1/stdlib/Test/#Basic-Unit-Tests">Julia Testing Docs</a><ul>
<li>Official Julia documentation on testing.</li>
</ul>
</li>
<li><a target="_blank" href="https://juliaci.github.io/PkgTemplates.jl/stable/user/">PkgTemplates.jl Docs</a><ul>
<li>Documentation for PkgTemplates.jl,
including potential customizations to the generated CI workflow.</li>
</ul>
</li>
</ul>
]]></content:encoded></item><item><title><![CDATA[How to Create a Julia Package from Scratch]]></title><description><![CDATA[This post was written by Steven Whitaker.

The Julia programming language
is a high-level language
that is known, at least in part,
for its excellent package manager
and outstanding composability.
(See another blog post that illustrates this composab...]]></description><link>https://blog.glcs.io/package-creation</link><guid isPermaLink="true">https://blog.glcs.io/package-creation</guid><category><![CDATA[Julia]]></category><category><![CDATA[library]]></category><category><![CDATA[Open Source]]></category><category><![CDATA[package]]></category><category><![CDATA[General Programming]]></category><dc:creator><![CDATA[Great Lakes Consulting]]></dc:creator><pubDate>Mon, 25 Nov 2024 17:19:43 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1721837971281/TmI6oW0bU.png?auto=format" length="0" type="image/jpeg"/><content:encoded><![CDATA[<blockquote>
<p>This post was written by Steven Whitaker.</p>
</blockquote>
<p>The <a target="_blank" href="https://julialang.org/">Julia programming language</a>
is a high-level language
that is known, at least in part,
for its excellent package manager
and outstanding composability.
(See <a target="_blank" href="https://blog.glcs.io/multiple-dispatch">another blog post</a> that illustrates this composability.)</p>
<p>Julia makes it super easy
for anybody to create their own package.
Julia's package manager enables easy development and testing of packages.
The ease of package development
encourages developers to split reusable chunks of code
into individual packages,
further enhancing Julia's composability.</p>
<p>In this post,
we will learn what comprises a Julia package.
We will also discuss tools
that automate the creation of packages.
Finally,
we will talk about the basics of package development
and walk through how to publish (register) a package
for others to use.</p>
<p>This post assumes you are comfortable navigating the Julia REPL.
If you need a refresher,
check out <a target="_blank" href="https://blog.glcs.io/julia-repl">our post on the Julia REPL</a>.</p>
<h1 id="heading-components-of-a-package">Components of a Package</h1>
<p>Packages are easy enough to use:
just install them with <code>add PkgName</code> in the package prompt
and then run <code>using PkgName</code> in the julia prompt.
But what actually goes into a package?</p>
<p>Packages must follow a specific directory structure
and include certain information
to be recognized as a package by Julia.</p>
<p>Suppose we are creating a package called PracticePackage.jl.
First, we create a directory called <code>PracticePackage</code>.
This directory is the package root.
Within the root directory we need a file called <code>Project.toml</code>
and another directory called <code>src</code>.</p>
<p>The <code>Project.toml</code> requires the following information:</p>
<pre><code class="lang-toml"><span class="hljs-attr">name</span> = <span class="hljs-string">"PracticePackage"</span>
<span class="hljs-attr">uuid</span> = <span class="hljs-string">"11111111-2222-3333-aaaa-bbbbbbbbbbbb"</span>
<span class="hljs-attr">authors</span> = [<span class="hljs-string">"Your Name &lt;youremail@email.com&gt;"</span>]
<span class="hljs-attr">version</span> = <span class="hljs-string">"0.1.0"</span>
</code></pre>
<ul>
<li><code>uuid</code> stands for universally unique identifier,
and can be generated in Julia with
<code>using UUIDs; uuid4()</code>.
The purpose of a UUID is to allow different packages of the same name to coexist.</li>
<li><code>version</code> should be set to whatever version is appropriate for your package,
typically <code>"0.1.0"</code> or <code>"1.0.0"</code> for an initial release.
The versioning of Julia packages follows <a target="_blank" href="https://pkgdocs.julialang.org/v1/toml-files/#The-version-field">SemVer</a>.</li>
<li>The <code>Project.toml</code> will also include information
about package dependencies,
but more on that later.</li>
</ul>
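<p>For example,
you can generate a fresh UUID
in just a couple of lines in the REPL
(the printed value is random, so yours will differ):</p>
<pre><code class="lang-julia">using UUIDs  # standard library; no installation needed

id = uuid4()  # random (version 4) UUID suitable for the `uuid` field
println(id)
</code></pre>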
<p>The <code>src</code> directory requires one Julia file
named <code>PracticePackage.jl</code>
that defines a module named <code>PracticePackage</code>:</p>
<pre><code class="lang-julia"><span class="hljs-keyword">module</span> PracticePackage

<span class="hljs-comment"># Package code goes here.</span>

<span class="hljs-keyword">end</span>
</code></pre>
<p>So, the directory structure of the package
looks like the following:</p>
<pre><code class="lang-plaintext">PracticePackage
├── Project.toml
└── src
    └── PracticePackage.jl
</code></pre>
<p>And that's all there is to a package!
(Well, at least minimally.)</p>
<h2 id="heading-some-technicalities">Some Technicalities</h2>
<p>Feel free to skip this section,
but if you are curious about some technicalities
for what comprises a valid package,
read on.</p>
<ul>
<li>The <code>Project.toml</code> only needs the <code>name</code> and <code>uuid</code> fields
for Julia to recognize the package.
Without the <code>version</code> field,
Julia treats the version as <code>v0.0.0</code>.<ul>
<li>However, the <code>version</code> and <code>authors</code> fields are needed
to register the package.</li>
</ul>
</li>
<li>The name of the package root directory doesn't matter,
meaning it doesn't have to match the package name.
However, the <code>name</code> field in <code>Project.toml</code>
does have to match the name of the module
defined in <code>src/PracticePackage.jl</code>,
and the file name of <code>src/PracticePackage.jl</code> also has to match.<ul>
<li>For example,
we could change the name of the package
by setting <code>name = "Oops"</code> in <code>Project.toml</code>,
renaming <code>src/PracticePackage.jl</code> to <code>src/Oops.jl</code>,
and defining <code>module Oops</code> in that file.
We would not have to rename the package root directory
from <code>PracticePackage</code> to <code>Oops</code>
(though that would be a good idea to avoid confusion).</li>
</ul>
</li>
</ul>
<h1 id="heading-automatically-generating-packages">Automatically Generating Packages</h1>
<p>The basic structure of a package is pretty simple,
so there ought to be a way to automate it, right?
(I mean, who wants to manually generate a UUID?)
Good news: package creation can be automated!</p>
<h2 id="heading-package-generate-command">Package <code>generate</code> Command</h2>
<p>Julia comes with a <code>generate</code> package command built-in.
First, change directories
to where the package root directory should live,
then run <code>generate</code> in the Julia package prompt:</p>
<pre><code class="lang-julia">pkg&gt; generate PracticePackage
</code></pre>
<p>This command creates the package root directory <code>PracticePackage</code>
and the <code>Project.toml</code> and <code>src/PracticePackage.jl</code> files.
Some notes:</p>
<ul>
<li>The <code>Project.toml</code> is pre-filled with the correct fields and values,
including an automatically generated UUID.
When I ran <code>generate</code> on my computer,
it also pre-filled the <code>authors</code> field
with my name and email from my <code>~/.gitconfig</code> file.</li>
<li><code>src/PracticePackage.jl</code> is pre-filled
with a definition for the module <code>PracticePackage</code>.
It also defines a function <code>greet</code> in the module,
but typically you will replace that with your own code.</li>
</ul>
<h2 id="heading-pkgtemplatesjl">PkgTemplates.jl</h2>
<p>The <code>generate</code> command works fine,
but it's barebones.
For example,
if you are planning on hosting your package on GitHub,
you might want to include a GitHub Action
for continuous integration (CI),
so it would be nice
to automate the creation of the appropriate <code>.yml</code> file.
This is where <a target="_blank" href="https://github.com/JuliaCI/PkgTemplates.jl">PkgTemplates.jl</a> comes in.</p>
<p>PkgTemplates.jl is a normal Julia package,
so install it as usual and run <code>using PkgTemplates</code>.
Then we can create our PracticePackage.jl:</p>
<pre><code class="lang-julia">t = Template(; dir = <span class="hljs-string">"."</span>)
t(<span class="hljs-string">"PracticePackage"</span>)
</code></pre>
<p>Running this code creates the package
with the following directory structure:</p>
<pre><code class="lang-plaintext">PracticePackage
├── .git
│   ⋮
├── .github
│   ├── dependabot.yml
│   └── workflows
│       ├── CI.yml
│       ├── CompatHelper.yml
│       └── TagBot.yml
├── .gitignore
├── LICENSE
├── Manifest.toml
├── Project.toml
├── README.md
├── src
│   └── PracticePackage.jl
└── test
    └── runtests.jl
</code></pre>
<p>As you can see,
PkgTemplates.jl automatically generates a lot of files
that aid in following package development best practices,
like adding CI and tests.</p>
<p>Note that many options
can be supplied to <code>Template</code>
to customize what files are generated.
See the <a target="_blank" href="https://juliaci.github.io/PkgTemplates.jl/stable/user/">PkgTemplates.jl docs</a> for all the options.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1721837925565/X64TzFFIS.png?auto=format" alt="Checklist of settings" /></p>
<h1 id="heading-basic-package-development">Basic Package Development</h1>
<p>Once your package is set up,
the next step is to actually add code.
Add the functions, types, constants, etc.
that your package needs
directly in the <code>PracticePackage</code> module in <code>src/PracticePackage.jl</code>,
or add additional files in the <code>src</code> directory
and <code>include</code> them in the module.
(See a <a target="_blank" href="https://blog.glcs.io/modules-variable-scope">previous blog post</a> for more information about modules,
though note that using modules directly works slightly differently
than using packages.)</p>
<p>To add dependencies for your package to use,
you will need to activate your project's package environment
and then add packages.
For example,
if you want your package to use the DataFrames.jl package,
start Julia and navigate to your package root directory.
Then, activate the package environment and add the package:</p>
<pre><code class="lang-julia">(<span class="hljs-meta">@v1</span>.X) pkg&gt; activate .

(PracticePackage) pkg&gt; add DataFrames
</code></pre>
<p>After this,
you will be able to include <code>using DataFrames</code>
in your package code
to enable the functionality provided by DataFrames.jl.</p>
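<p>For example,
the package module might then look something like this
(<code>make_table</code> is a made-up function, just for illustration):</p>
<pre><code class="lang-julia">module PracticePackage

using DataFrames

# Hypothetical function that builds on the DataFrames.jl dependency:
make_table(xs) = DataFrame(; x = xs)

end
</code></pre>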
<p>Adding packages after activating the package environment
edits the package's <code>Project.toml</code> file.
It adds a <code>[deps]</code> section
that lists the added packages and their UUIDs.
In the example above,
adding DataFrames.jl
adds the following lines to the <code>Project.toml</code> file:</p>
<pre><code class="lang-toml"><span class="hljs-section">[deps]</span>
<span class="hljs-attr">DataFrames</span> = <span class="hljs-string">"a93c6f00-e57d-5684-b7b6-d8193f3e46c0"</span>
</code></pre>
<p>(And <code>(PracticePackage) pkg&gt; rm DataFrames</code> would remove the <code>DataFrames = ...</code> line,
so it is best not to edit the <code>[deps]</code> section manually.)</p>
<p>Finally,
to try out your package,
activate your package environment (as above)
and then load your package as usual:</p>
<pre><code class="lang-julia">julia&gt; <span class="hljs-keyword">using</span> PracticePackage <span class="hljs-comment"># No need to `add PracticePackage` first.</span>
</code></pre>
<p>Note that by default Julia will have to be restarted
to reload any changes you make to your package code.
If you want to avoid restarting Julia
whenever you make changes,
check out <a target="_blank" href="https://github.com/timholy/Revise.jl">Revise.jl</a>.</p>
<h1 id="heading-publishingregistering-a-package">Publishing/Registering a Package</h1>
<p>Once your package is in working order,
it is natural to want to publish the package
for others to use.</p>
<p>A package can be published
by registering it in a package registry,
which is basically a map telling the Julia package manager
where to find a package
so it can be downloaded.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1721837941732/4HcCoymOd.png?auto=format" alt="Treasure map" /></p>
<p>The <a target="_blank" href="https://github.com/JuliaRegistries/General">General registry</a> is the largest registry
as well as the default registry used by Julia;
most, if not all, of the most popular open-source packages
(DataFrames.jl, Plots.jl, StaticArrays.jl, ModelingToolkit.jl, etc.)
exist in General.
Once a package is registered in General,
it can be installed with <code>pkg&gt; add PracticePackage</code>.</p>
<p>(Note that if registering a package is not desired for some reason,
a package can be added via URL, e.g.,
<code>pkg&gt; add https://github.com/username/PracticePackage.jl</code>,
assuming the package is in a public git repository.
However,
the package manager has limited ability
to manage packages added in this way;
in particular,
managing package versions must be done manually.)</p>
<p>The most common way
to register a package in General
is to use <a target="_blank" href="https://github.com/JuliaRegistries/Registrator.jl">Registrator.jl</a> as a GitHub App.
See the README for detailed instructions,
but the process basically boils down to:</p>
<ol>
<li>Write/test package code.</li>
<li>Update the <code>version</code> field in the <code>Project.toml</code>
(e.g., to <code>"0.1.0"</code> or <code>"1.0.0"</code> for the first registered version).</li>
<li>Add a comment with <code>@JuliaRegistrator register</code>
to the latest commit that should be included
in the registered version of the package.</li>
</ol>
<p>Note that there are additional steps for preparing a package for publishing
that we did not discuss in this post
(such as specifying compatible versions
of Julia and package dependencies).
Refer to the <a target="_blank" href="https://github.com/JuliaRegistries/General?tab=readme-ov-file#registering-a-package-in-general">General registry's documentation</a> and links therein for details.</p>
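<p>As a rough sketch,
a release-ready <code>Project.toml</code>
with a <code>[compat]</code> section
might look like the following
(the compat bounds shown are illustrative):</p>
<pre><code class="lang-toml">name = "PracticePackage"
uuid = "11111111-2222-3333-aaaa-bbbbbbbbbbbb"
authors = ["Your Name &lt;youremail@email.com&gt;"]
version = "1.0.0"

[deps]
DataFrames = "a93c6f00-e57d-5684-b7b6-d8193f3e46c0"

[compat]
DataFrames = "1"
julia = "1.6"
</code></pre>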
<h1 id="heading-summary">Summary</h1>
<p>In this post,
we discussed creating Julia packages.
We learned what comprises a package,
how to automate package creation,
and how to register a package in Julia's General registry.</p>
<p>What package development tips do you have?
Let us know in the comments below!</p>
<p>Now that you know how to create a Julia package,
check out our
<a target="_blank" href="https://blog.glcs.io/package-testing">next post to learn about package testing</a>!
Or, if you're curious
about how to conditionally extend package functionality,
check out another
<a target="_blank" href="https://blog.glcs.io/package-extensions">post about package extensions</a>!</p>
<h1 id="heading-additional-links">Additional Links</h1>
<ul>
<li><a target="_blank" href="https://pkgdocs.julialang.org/v1/creating-packages/">Official Package Creation Docs</a><ul>
<li>Official Julia documentation on creating packages.</li>
</ul>
</li>
<li><a target="_blank" href="https://juliaci.github.io/PkgTemplates.jl/stable/user/">PkgTemplates.jl Docs</a><ul>
<li>Documentation for PkgTemplates.jl.</li>
</ul>
</li>
<li><a target="_blank" href="https://pkgdocs.julialang.org/v1/toml-files/"><code>Project.toml</code> Docs</a><ul>
<li>Documentation for the <code>Project.toml</code> file.</li>
</ul>
</li>
<li><a target="_blank" href="https://github.com/JuliaRegistries/General">Julia General Registry</a><ul>
<li>GitHub repo (including README) for Julia's General registry.</li>
</ul>
</li>
</ul>
<p>Cover image background provided by <a target="_blank" href="https://www.proflowers.com">www.proflowers.com</a> at
<a target="_blank" href="mailto:https://www.flickr.com/photos/127365614@N08/16011252136">https://www.flickr.com/photos/127365614@N08/16011252136</a>.</p>
<p>Treasure map image source: <a target="_blank" href="https://openclipart.org/detail/299283/x-marks-the-spot">https://openclipart.org/detail/299283/x-marks-the-spot</a></p>
]]></content:encoded></item><item><title><![CDATA[Julia 1.11: Top Features and Important Updates]]></title><description><![CDATA[This post was written by Steven Whitaker.
Note that Julia 1.11
is no longer the most recent Julia version.
To learn more about the most recent version,
check out our series of posts about Julia releases.

A new version of the Julia programming langua...]]></description><link>https://blog.glcs.io/julia-1-11</link><guid isPermaLink="true">https://blog.glcs.io/julia-1-11</guid><category><![CDATA[Julia]]></category><dc:creator><![CDATA[Steven Whitaker]]></dc:creator><pubDate>Thu, 10 Oct 2024 15:46:26 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1724862314414/LU-DgaMUH.png?auto=format" length="0" type="image/jpeg"/><content:encoded><![CDATA[<blockquote>
<p>This post was written by <strong>Steven Whitaker</strong>.</p>
<p>Note that Julia 1.11
is no longer the most recent Julia version.
To learn more about the most recent version,
check out our <a target="_blank" href="https://blog.glcs.io/series/julia-releases">series of posts about Julia releases</a>.</p>
</blockquote>
<p>A new version of the <a target="_blank" href="https://julialang.org">Julia programming language</a>
was just released!
Version 1.11 is now the latest stable version of Julia.</p>
<p>This release is a minor release,
meaning it includes language enhancements
and bug fixes
but should also be fully compatible
with code written in previous Julia versions
(from version 1.0 and onward).</p>
<p>In this post,
we will check out some of the features and improvements
introduced in this newest Julia version.
Read the full post,
or click on the links below
to jump to the features that interest you.</p>
<ul>
<li><a class="post-section-overview" href="#heading-improved-clarity-of-public-api-with-public-keyword">Improved Clarity of Public API with <code>public</code> Keyword</a></li>
<li><a class="post-section-overview" href="#heading-standardized-entry-point-for-scripts">Standardized Entry Point for Scripts</a></li>
<li><a class="post-section-overview" href="#heading-improved-string-styling">Improved String Styling</a></li>
<li><a class="post-section-overview" href="#heading-new-functions-for-testing-existence-of-documentation">New Functions for Testing Existence of Documentation</a></li>
<li><a class="post-section-overview" href="#heading-more-complete-timed-macro">More Complete <code>@timed</code> Macro</a></li>
<li><a class="post-section-overview" href="#heading-new-convenience-function-logrange">New Convenience Function <code>logrange</code></a></li>
</ul>
<p>If you are new to Julia
(or just need a refresher),
feel free to check out our <a target="_blank" href="https://blog.glcs.io/series/julia-basics-programmers">Julia tutorial series</a>,
beginning with <a target="_blank" href="https://blog.glcs.io/install-julia-and-vscode">how to install Julia and VS Code</a>.</p>
<h1 id="heading-improved-clarity-of-public-api-with-public-keyword">Improved Clarity of Public API with <code>public</code> Keyword</h1>
<p>An important aspect of semantic versioning
(aka SemVer, which Julia and Julia packages follow)
is the public API users have access to.
In essence,
SemVer states that minor package updates
should not break compatibility
with existing code that uses the public API.
However,
other parts of the code are free to change
in a minor update.
To illustrate:</p>
<ul>
<li>If I have <code>sum([1, 2, 3])</code> in my code
that I wrote in Julia 1.10,
it will continue to return <code>6</code> in Julia 1.11
because <code>sum</code> is part of Julia's public API.
But it could break in Julia 2.0.
(Hopefully not, though!)</li>
<li>If I have <code>SomePackage._internal_function(0)</code> in my code
that I wrote with SomePackage v1.2.0,
it might error when SomePackage upgrades to v1.2.1
(because, for example, <code>_internal_function</code> got deleted).
Such a change would be allowed
because <code>_internal_function</code> is not part of the public API.</li>
</ul>
<p>So, the question is,
how does a user know
what the public API of a package is?
Historically,
there have been some conventions followed:</p>
<ul>
<li>Names that are exported
(e.g., <code>export DataFrame</code>)
are part of the public API.</li>
<li>Unexported names that are prefixed with an underscore
(e.g., <code>SomePackage._internal_function</code>)
are not part of the public API.</li>
</ul>
<p>But what about a function like <code>ForwardDiff.gradient</code>?
That function is the reason why 99%
of users load the ForwardDiff package,
but it's not exported!
The good news is that it's still part of the public API
because, well, ForwardDiff's maintainers say so.
Or maybe it's because the documentation says so.
Or maybe it's because enough people use it?
Sometimes it's not entirely clear.</p>
<p>But now in Julia 1.11,
the code can indicate
the public API!
This is thanks to a new <code>public</code> keyword.
Now,
all symbols that are marked with <code>public</code>
are part of the public API.
(Note that this is in addition to exported symbols,
i.e., <code>func</code> would be considered public API
with either <code>export func</code> or <code>public func</code>.)</p>
<p>Usage of <code>public</code> is the same as <code>export</code>.
For example:</p>
<pre><code class="lang-julia"><span class="hljs-keyword">module</span> MyPackage

public add1

<span class="hljs-string">"""
Docstring for a public function.
"""</span>
add1(x) = x + <span class="hljs-number">1</span>

<span class="hljs-string">"""
Docstring for a private function.
"""</span>
private_add1(x) = x + <span class="hljs-number">1</span>

<span class="hljs-keyword">end</span>
</code></pre>
<p>With <code>using MyPackage</code>,
no symbols are made available (except <code>MyPackage</code>),
e.g., <code>add1</code> can only be called via <code>MyPackage.add1</code>,
because nothing is exported with <code>export</code>.
And both <code>MyPackage.add1(1)</code> and <code>MyPackage.private_add1(1)</code> work,
even though <code>add1</code> is public and <code>private_add1</code> is private.
So,
the <code>public</code> keyword doesn't change how MyPackage works or is used.</p>
<p>However,
the <code>public</code> keyword does change some behaviors.
The most notable difference is
when displaying documentation in the REPL's help mode:</p>
<pre><code class="lang-julia">help?&gt; MyPackage.add1
  Docstring <span class="hljs-keyword">for</span> a public <span class="hljs-keyword">function</span>.

help?&gt; MyPackage.private_add1
  │ Warning
  │
  │  The following bindings may be internal; they may change or be removed <span class="hljs-keyword">in</span> future versions:
  │
  │    •  MyPackage.private_add1

  Docstring <span class="hljs-keyword">for</span> a private <span class="hljs-keyword">function</span>.
</code></pre>
<p>See the warning with <code>private_add1</code>?
Such warnings,
in addition to the more straightforward documentation of the public API,
may help reduce usage of package internals
(particularly accidental usage),
which in turn may help improve
the stability of Julia packages.</p>
<p>To summarize,
even though the <code>public</code> keyword
doesn't change how a package works or is used,
it does provide a mechanism for clearly stating the public API
and providing warnings when viewing the documentation
for internal functions.
This, in turn,
may improve the stability of Julia packages
as they adhere more closely
to public APIs.</p>
<p>Read more about <code>public</code> in the <a target="_blank" href="https://docs.julialang.org/en/v1/manual/modules/#Export-lists">Julia manual</a>.</p>
<h1 id="heading-standardized-entry-point-for-scripts">Standardized Entry Point for Scripts</h1>
<p>There is now a standardized entry point
when running scripts from the command line.</p>
<p>Julia 1.11 introduces the <code>@main</code> macro.
This macro, when placed in a script,
and when the script is run via the command line,
tells Julia to run a function called <code>main</code>
after running the code in the script.
<code>main</code> will be passed a <code>Vector</code> of <code>String</code>s
containing the command line arguments passed to the script.</p>
<p>To use <code>@main</code>,
just include it in your script
on its own line
after defining your <code>main</code> function.
(If <code>@main</code> occurs before defining <code>main</code>,
for example at the top of the script,
an error will be thrown,
so the ordering matters.)</p>
<p>Of course,
for this to work
there has to be a function <code>main</code>
with a method that takes a <code>Vector{String}</code> as the only input.</p>
<p>Let's look at an example
to illustrate how this works.
Say we have the following in <code>test.jl</code>:</p>
<pre><code class="lang-julia">print_val(x) = print(x)

<span class="hljs-keyword">function</span> main(args)
    print_val(args)
<span class="hljs-keyword">end</span>
<span class="hljs-meta">@main</span>
</code></pre>
<p>If we run this file in the REPL
with <code>include("test.jl")</code>,
the functions <code>print_val</code> and <code>main</code>
will be defined,
but <code>main</code> will not get called.
This is the same behavior
as when <code>@main</code> is not present.</p>
<p>On the other hand,
if we run this file via the command line
with <code>julia test.jl</code>,
the functions <code>print_val</code> and <code>main</code>
will be defined
and then <code>main</code> will be called
with the command line arguments
as the input.
To illustrate:</p>
<ul>
<li><code>julia test.jl</code> will call <code>main(String[])</code>
(because no command line arguments were passed).</li>
<li><code>julia test.jl 1 hello</code> will call <code>main(["1", "hello"])</code>.</li>
</ul>
<p>As a result of <code>@main</code>,
a Julia file can have different behavior
depending on whether it is run as a script or not.</p>
<p>If you're familiar with Python,
<code>@main</code> might remind you of
<code>if __name__ == "__main__"</code>.
However,
there is one significant difference:</p>
<ul>
<li>In Python,
if <code>script1.py</code> imports <code>script2.py</code>
and <code>script2.py</code> has the "if main" check,
running <code>script1.py</code> as a script
will not run <code>script2.py</code>'s "if main" code.</li>
<li>In Julia,
if <code>script1.jl</code> includes <code>script2.jl</code>
and <code>script2.jl</code> uses <code>@main</code>,
running <code>script1.jl</code> as a script
<em>will</em> run <code>script2.jl</code>'s <code>main</code> function.
(Technicality:
Unless <code>script1.jl</code> defines an appropriate <code>main</code> method,
in which case <code>script1.jl</code>'s <code>main</code> would be called,
even if <code>script1.jl</code> did not include <code>@main</code>.)</li>
</ul>
<p>This isn't to say Julia's <code>@main</code> is bad or wrong;
it's just important to know that it works differently
than Python.
And it's still cool to have a standardized entry point
for Julia scripts now!</p>
<p>Read more about <code>@main</code> in the <a target="_blank" href="https://docs.julialang.org/en/v1/manual/command-line-interface/#The-Main.main-entry-point">Julia manual</a>.</p>
<h1 id="heading-improved-string-styling">Improved String Styling</h1>
<p>Julia 1.11 introduces a new <a target="_blank" href="https://github.com/JuliaLang/StyledStrings.jl">StyledStrings.jl</a>
standard library package.
This package provides a convenient way
to add styling to strings.
StyledStrings makes printing styled strings
much easier than calling <code>printstyled</code>,
particularly when different parts of the string
have different styles.</p>
<p>The easiest way to create a styled string
is with <code>styled"..."</code>.
For example:</p>
<pre><code class="lang-julia"><span class="hljs-keyword">using</span> StyledStrings
styled_string = <span class="hljs-string">styled"{italic:This} is a {bold,bright_cyan:styled string}!"</span>
</code></pre>
<p>Then, when printing the styled string,
it will display according to the provided annotations.</p>
<p>Also, because the style information is stored with the string,
it can easily be preserved across string manipulations
such as string concatenation or grabbing a substring.</p>
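<p>For example
(a minimal sketch, assuming Julia 1.11 or later;
the exact return types may vary):</p>
<pre><code class="lang-julia">using StyledStrings

a = styled"{bold:Hello}"
b = styled", world!"

both = a * b     # concatenation keeps the bold styling on "Hello"
sub = both[1:5]  # a substring carries its annotations along, too
</code></pre>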
<p>Check out the <a target="_blank" href="https://docs.julialang.org/en/v1/stdlib/StyledStrings/#stdlib-styledstring-literals">documentation</a>
for more information
about the variety of different annotations
StyledStrings supports.</p>
<p>And here's some more information
from the <a target="_blank" href="https://www.youtube.com/watch?v=sXhRWCV38iQ">State of Julia talk</a> at JuliaCon 2024:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1724867404200/3jkrwUs7m.png?auto=format" alt="Slide about StyledStrings" /></p>
<h1 id="heading-new-functions-for-testing-existence-of-documentation">New Functions for Testing Existence of Documentation</h1>
<p>Julia 1.11 makes it easy
to determine programmatically whether a function has a docstring.
This can be useful for, e.g., CI checks
to ensure a package is well documented.</p>
<p>There are two functions for this purpose.
The first is <code>Docs.hasdoc</code>,
which is used to query a particular function.
<code>hasdoc</code> takes two inputs:
the module to look in
and the name (as a <code>Symbol</code>) of the function.
For example:</p>
<pre><code class="lang-julia">julia&gt; Docs.hasdoc(Base, :sum)
<span class="hljs-literal">true</span>
</code></pre>
<p>The other function provided
is <code>Docs.undocumented_names</code>,
which returns a list of a module's public names
that have no docstrings.
(Note that public names include symbols exported via <code>export</code>
as well as symbols declared as public via <code>public</code>.)
For example:</p>
<pre><code class="lang-julia">julia&gt; <span class="hljs-keyword">module</span> Example

       <span class="hljs-keyword">export</span> f1, f4
       public f2, f5

       <span class="hljs-string">"Exported, documented"</span>
       f1() = <span class="hljs-number">1</span>

       <span class="hljs-string">"Public, documented"</span>
       f2() = <span class="hljs-number">2</span>

       <span class="hljs-string">"Internal, documented"</span>
       f3() = <span class="hljs-number">3</span>

       <span class="hljs-comment"># Exported, undocumented</span>
       f4() = <span class="hljs-number">4</span>

       <span class="hljs-comment"># Public, undocumented</span>
       f5() = <span class="hljs-number">5</span>

       <span class="hljs-comment"># Internal, undocumented</span>
       f6() = <span class="hljs-number">6</span>

       <span class="hljs-keyword">end</span>
Main.Example

<span class="hljs-comment"># Note that `f6` is not returned because it is neither exported nor public.</span>
julia&gt; Docs.undocumented_names(Example)
<span class="hljs-number">3</span>-element <span class="hljs-built_in">Vector</span>{<span class="hljs-built_in">Symbol</span>}:
 :Example
 :f4
 :f5
</code></pre>
<p>It will be interesting to see what tooling arises
to take advantage of these functions.</p>
<h1 id="heading-more-complete-timed-macro">More Complete <code>timed</code> Macro</h1>
<p>The <code>@timed</code> macro
provides more complete timing information
in Julia 1.11.</p>
<p>Previously,
<code>@timed</code> gave run time and allocation/garbage collection information,
but nothing about compilation time.
Now, compilation time is included.</p>
<p>But why care about <code>@timed</code>
when <code>@time</code> already gave all that info?
Because <code>@time</code> is hard-coded to print to <code>stdout</code>,
meaning there's no way to capture the information,
e.g., for logging purposes.</p>
<p>I actually had a project where I wanted to redirect the output of <code>@time</code>
to a log file.
I couldn't just use <code>redirect_stdio</code>
because that would also redirect the output
of the code being timed.
I ended up using <code>@timed</code> along with <code>Base.time_print</code>
to create the log statements,
but I was disappointed <code>@timed</code> didn't give me compilation time information.
Well, now it does!</p>
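<p>As a quick sketch of what capturing the information looks like
(field names per the Julia 1.11 documentation):</p>

```julia
# `@timed` returns the result plus timing data, instead of printing it.
stats = @timed sum(abs2, rand(1_000))

result    = stats.value          # the value the expression returned
elapsed   = stats.time           # run time in seconds
allocated = stats.bytes          # bytes allocated
compile   = stats.compile_time   # compilation time in seconds (new in 1.11)

# Handy for logging without touching stdout:
@info "timing" elapsed allocated compile
```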
<h1 id="heading-new-convenience-function-logrange">New Convenience Function <code>logrange</code></h1>
<p>Pop quiz:
Which of the following
is the correct way
to create a logarithmically spaced range of numbers?</p>
<ol>
<li><code>log.(range(exp(a), exp(b), N))</code></li>
<li><code>exp.(range(log(a), log(b), N))</code></li>
</ol>
<p>I have occasionally needed
to use logarithmically spaced ranges of numbers,
not so frequently that I memorized which expression to use,
but frequently enough that I developed a real distaste
for the mental gymnastics I had to go through
every time just to remember
where to put the <code>exp</code>s and <code>log</code>s.
Maybe I should have just taken some time
to memorize the answer...</p>
<p>But now it doesn't matter!
The correct answer is neither 1 nor 2,
but <code>logrange(a, b, N)</code>!
Here's an example usage:</p>
<pre><code class="lang-julia">julia&gt; logrange(<span class="hljs-number">1</span>, <span class="hljs-number">10</span>, <span class="hljs-number">5</span>)
<span class="hljs-number">5</span>-element Base.LogRange{<span class="hljs-built_in">Float64</span>, Base.TwicePrecision{<span class="hljs-built_in">Float64</span>}}:
 <span class="hljs-number">1.0</span>, <span class="hljs-number">1.77828</span>, <span class="hljs-number">3.16228</span>, <span class="hljs-number">5.62341</span>, <span class="hljs-number">10.0</span>
</code></pre>
<p>I know it's a fairly minor change,
but the addition of <code>logrange</code> in Julia 1.11
is probably the change I'm most excited about.
There was much rejoicing when I saw the news!</p>
<h1 id="heading-summary">Summary</h1>
<p>In this post,
we learned about
some of the new features
and improvements
introduced in Julia 1.11.
Curious readers can
check out the <a target="_blank" href="https://github.com/JuliaLang/julia/blob/v1.11.0/NEWS.md">release notes</a>
for the full list of changes.</p>
<p>Note also that with the new release,
Julia 1.10 will now become the LTS (long-term support) version,
replacing Julia 1.6.
As a result,
Julia 1.10 will receive maintenance updates
instead of Julia 1.6
(which has now reached end of support)
until the next LTS version is announced.
If you want to learn more about what changes Julia 1.10 brought,
check out <a target="_blank" href="https://blog.glcs.io/julia-1-10">our post</a>!</p>
<p>What are you most excited about
in Julia 1.11?
Let us know in the comments below!</p>
<h1 id="heading-additional-links">Additional Links</h1>
<ul>
<li><a target="_blank" href="https://github.com/JuliaLang/julia/blob/v1.11.0/NEWS.md">Julia v1.11 Release Notes</a><ul>
<li>Full list of changes made in Julia 1.11.</li>
</ul>
</li>
<li><a target="_blank" href="https://blog.glcs.io/series/julia-basics-programmers">Julia Basics for Programmers</a><ul>
<li>Series of blog posts covering Julia basics.</li>
</ul>
</li>
</ul>
]]></content:encoded></item><item><title><![CDATA[Maximizing Julia Development with VSCode Extension]]></title><description><![CDATA[In this article, we will review some of the features that the Julia extension offers! If you don't have Julia, VS Code, or the Julia extension already installed, look at this article to help you get set up!
Running a Julia File
Now that we have the e...]]></description><link>https://blog.glcs.io/julia-vs-code-extension</link><guid isPermaLink="true">https://blog.glcs.io/julia-vs-code-extension</guid><category><![CDATA[Julia]]></category><category><![CDATA[General Programming]]></category><dc:creator><![CDATA[Justyn Nissly]]></dc:creator><pubDate>Wed, 14 Aug 2024 18:49:30 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1723596840231/U-782bIxP.png?auto=format" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>In this article, we will review some of the features that the Julia extension offers! If you don't have Julia, VS Code, or the Julia extension already installed, look at <a target="_blank" href="https://blog.glcs.io/install-julia-and-vscode">this article</a> to help you get set up!</p>
<h1 id="heading-running-a-julia-file">Running a Julia File</h1>
<p>Now that we have the extension installed and configured, we can start using it.
One of the first things we will examine is the most basic feature - running code!
To run code, we do the following:</p>
<ol>
<li>Click the drop down in the top right of the code window</li>
<li>Click "Julia: Execute Code in REPL"</li>
<li>Enjoy the results!</li>
</ol>
<p>When you hit <code>Ctrl+Enter</code>, the line your cursor is currently on will run, and the cursor will advance to the next line.
This is one way to step through the code line by line.</p>
<p>You are also able to use <code>Ctrl+F5</code> (<code>Option+F5</code> for Macs) to run the entire file. You can learn more about running Julia code from the <a target="_blank" href="https://www.julia-vscode.org/docs/dev/userguide/runningcode/">official documentation</a>.</p>
<h1 id="heading-code-navigation">Code Navigation</h1>
<p>The information in this section applies to any language and is not exclusive to the Julia extension. The features below are helpful to know and can increase your productivity as you code.</p>
<p>Within VS Code, if you use <code>Ctrl+P</code>, a prompt will open at the top of your editor. In that prompt, you can start typing the name of a file you want to jump to in your project. Once you click the option (or press enter), VS Code will jump directly to that file. You can also use <code>Ctrl+G</code> to jump to a specific line number in the file if you know which line you are looking for. Being able to jump back and forth between files without having to search the file tree will greatly enhance your workflow. Imagine all the time you will save!</p>
<p>Using <code>Ctrl+Shift+O</code> (be sure to use the letter O, not the number 0) will allow you to navigate through individual symbols within your program. After using the shortcut above, type <code>:</code> and all your symbols will be grouped by type. You are then able to navigate between them all.</p>
<h1 id="heading-editing-code">Editing Code</h1>
<p>Code navigation is a great skill to develop, especially when given some wonderful legacy code to unravel...we've all been there. Being proficient in navigating your code does you no good unless you are also proficient at editing the code.</p>
<p>One way to get going quickly is to use the "rename symbol" shortcut. You can either right click on a symbol and press "rename" or hit <code>F2</code>. When you rename the symbol, it will be renamed everywhere it appears in the file. Pretty neat, huh?</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1723596410858/cBzhUQ8yL.gif?auto=format" alt="Change_Symbol" /></p>
<h1 id="heading-the-plot-viewer">The Plot Viewer</h1>
<p>Up until this point in the article, we have laid the groundwork for working with your code in VS Code. Next, we will look into some of the Julia-specific features that the Julia extension offers, starting with the plot viewer.</p>
<p>The plot viewer is a really handy tool that lets you...well...view plots. We can look at an example to see how it works.</p>
<p>First, we will install the Plots package if it hasn't been installed already.</p>
<pre><code class="lang-julia">julia&gt; <span class="hljs-keyword">using</span> Pkg
julia&gt; Pkg.add(<span class="hljs-string">"Plots"</span>)
<span class="hljs-comment"># OR</span>
<span class="hljs-comment"># We can type the "]" key and use the package interface</span>
(<span class="hljs-meta">@v1</span><span class="hljs-number">.10</span>) pkg&gt;
(<span class="hljs-meta">@v1</span><span class="hljs-number">.10</span>) pkg&gt; add Plots
</code></pre>
<p>After we do that, we can create a plot to visualize.</p>
<pre><code class="lang-julia"><span class="hljs-keyword">using</span> Plots

<span class="hljs-keyword">function</span> make_plot()
    <span class="hljs-comment"># Create a plot</span>
    p = plot()

    θ = range(<span class="hljs-number">0</span>, <span class="hljs-number">2</span><span class="hljs-literal">π</span>, length=<span class="hljs-number">100</span>)
    x = cos.(θ) * <span class="hljs-number">5</span>
    y = sin.(θ) * <span class="hljs-number">5</span>
    plot!(p, x, y, linecolor=:black, linewidth=<span class="hljs-number">2</span>, legend=:topright)

    x_left = <span class="hljs-number">1</span> .+ <span class="hljs-number">0.5</span> * cos.(θ)
    y_left = <span class="hljs-number">2</span> .+ <span class="hljs-number">0.5</span> * sin.(θ)
    plot!(p, x_left, y_left, linecolor=:black, linewidth=<span class="hljs-number">2</span>, fillalpha=<span class="hljs-number">0.2</span>)

    x_right = <span class="hljs-number">3</span> .+ <span class="hljs-number">0.5</span> * cos.(θ)
    y_right = <span class="hljs-number">2</span> .+ <span class="hljs-number">0.5</span> * sin.(θ)
    plot!(p, x_right, y_right, linecolor=:black, linewidth=<span class="hljs-number">2</span>, fillalpha=<span class="hljs-number">0.2</span>)

    θ_arc = range(<span class="hljs-number">0</span>, <span class="hljs-literal">π</span>, length=<span class="hljs-number">10</span>)
    x_arc = <span class="hljs-number">2</span> .+ cos.(θ_arc) * <span class="hljs-number">2</span>
    y_arc = -<span class="hljs-number">1</span> .+ sin.(θ_arc) * <span class="hljs-number">1</span>
    plot!(p, x_arc, y_arc, linecolor=:black, linewidth=<span class="hljs-number">2</span>)

    <span class="hljs-comment"># Adjust plot limits and display the final plot</span>
    xlims!(-<span class="hljs-number">6</span>, <span class="hljs-number">6</span>)
    ylims!(-<span class="hljs-number">6</span>, <span class="hljs-number">6</span>)
    display(p)
<span class="hljs-keyword">end</span>

<span class="hljs-comment"># Execute the function to plot</span>
make_plot()
</code></pre>
<p>Next we run this by using the keyboard shortcut we learned earlier (<code>Ctrl+Enter</code>), and we can see the result below!</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1723585530940/xQsqraxjR.gif?auto=format" alt="Make_Plot" /></p>
<p>Pretty cool! Now we can see our charts generated in real time right inside our editor!</p>
<h1 id="heading-the-table-viewer">The Table Viewer</h1>
<p>I don't know about you, but I never liked having to dump the contents of an array, matrix,
or other data structure to the console and parse through the data by hand. Lucky for us, we don't have to do that.
The Julia extension allows us to view any <code>Tables.jl</code> compatible table in the special table viewer.
There are two ways to do this.</p>
<p>The first way is by clicking the "View in VS Code" button next to your table in the "Workspace" section.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1723596439171/3Zi4igNAk.gif?auto=format" alt="Table_View_Button" /></p>
<p>The second way is to run <code>vscodedisplay(name_of_table)</code> directly in the REPL.</p>
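<p>For example, assuming DataFrames.jl is installed
(any Tables.jl-compatible table works the same way):</p>

```julia
using DataFrames

df = DataFrame(name = ["Ada", "Grace"], score = [95, 98])

# Open the interactive table viewer; this only works from the
# Julia REPL started by the VS Code extension.
vscodedisplay(df)

# An optional second argument sets the tab's title.
vscodedisplay(df, "Scores")
```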
<p>It's pretty cool if you ask me.
Having the ability to view the data is a nice feature, but to make it even better, you can sort the data in the UI by clicking the column headers. You can also copy data by using <code>Ctrl + C</code> just like you would in Excel!</p>
<p>A word of caution: all the table data you see is cached, so any changes you make to the underlying data will not be reflected in the table viewer. To see your changes, just re-display the table in the editor.</p>
<h1 id="heading-debugging">Debugging</h1>
<p>The last topic we will cover is the debugging tools.</p>
<p>There are many things you can do in the VS Code debugger that are not language-specific. We aren't going to cover all of those features here, so if you want a more in-depth look, check out the <a target="_blank" href="https://code.visualstudio.com/docs/editor/debugging">official documentation</a>.</p>
<p>Since the only truly bug-free code is no code, we will start by writing some code that we can test and try to catch bugs in.</p>
<pre><code class="lang-julia"><span class="hljs-keyword">function</span> do_math(a, b)
    c = a + b
    d = a * b
    c + d
<span class="hljs-keyword">end</span>

<span class="hljs-keyword">function</span> print_things()
    println(<span class="hljs-string">"First print"</span>)
    println(<span class="hljs-string">"Second print"</span>)

    var = do_math(<span class="hljs-number">5</span>,<span class="hljs-number">8</span>)

    println(<span class="hljs-string">"Third print and math: "</span>, var)

    var2 = do_math(<span class="hljs-string">"Bug"</span>,<span class="hljs-number">3</span>)

    println(<span class="hljs-string">"Fourth print and bug: "</span>, var2)
<span class="hljs-keyword">end</span>

print_things()
</code></pre>
<p>We have code to run, so we can run the debugger and see what we get. First, we must switch to the "Run and Debug" tab. We do this by either clicking on the tab (the one with the bug and play button) or by hitting <code>Ctrl+Shift+D</code>.</p>
<p>Once we are there, we will be greeted with a screen like this:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1723597211102/6TVn2vftR.png?auto=format" alt="Debug_Screen" /></p>
<p>From here we can observe the compiled Julia code, our breakpoints, and several other things as the program runs. We will want to run our code through the debugger, and to do that, we can either click the big "Run and Debug" button or hit <code>F5</code>.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1723597182460/jnkTxo0w7.png?auto=format" alt="Debug_Error_Message" /></p>
<p>We step through the code a bit and see some of what the debugger shows us.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1723597200023/NG6mHC1sN.png?auto=format" alt="Debugger_Annotated" /></p>
<p>1: The variables passed into the function</p>
<p>2: The variables local to the function</p>
<p>3: The indicator of which line we are currently on</p>
<p>4: A breakpoint indicator</p>
<p>We can set our breakpoints by double clicking next to the line numbers. After setting our breakpoints, we can step through the code. As we step through the code line by line, the variables that get created are populated in the top section along with their values. Another neat feature worth noting, though not highlighted above, is the "CALL STACK" section. As the name suggests, this shows the entire call stack as you step through the code. These are most likely features we have seen in other debuggers, but they are useful nonetheless.</p>
<h1 id="heading-keyboard-shortcuts">Keyboard Shortcuts</h1>
<p>To wrap up, let's look at a list of keyboard shortcuts for effective Julia extension usage.
Note that a command like <code>Alt+J</code> <code>Alt+C</code> means press <code>Alt+J</code> followed by <code>Alt+C</code>.</p>
<div class="hn-table">
<table>
<thead>
<tr>
<td>Command</td><td>Shortcut</td></tr>
</thead>
<tbody>
<tr>
<td><code>Execute Code in REPL and Move</code></td><td><code>Shift+Enter</code></td></tr>
<tr>
<td><code>Execute Code in REPL</code></td><td><code>Ctrl+Enter</code></td></tr>
<tr>
<td><code>Execute Code Cell in REPL</code></td><td><code>Alt+Enter</code></td></tr>
<tr>
<td><code>Execute Code Cell in REPL and Move</code></td><td><code>Alt+Shift+Enter</code></td></tr>
<tr>
<td><code>Interrupt Execution</code></td><td><code>Ctrl+C</code></td></tr>
<tr>
<td><code>Clear Current Inline Result</code></td><td><code>Escape</code></td></tr>
<tr>
<td><code>Clear Inline Results In Editor</code></td><td><code>Alt+J</code> <code>Alt+C</code></td></tr>
<tr>
<td><code>Select Current Module</code></td><td><code>Alt+J</code> <code>Alt+M</code></td></tr>
<tr>
<td><code>New Julia File</code></td><td><code>Alt+J</code> <code>Alt+N</code></td></tr>
<tr>
<td><code>Start REPL</code></td><td><code>Alt+J</code> <code>Alt+O</code></td></tr>
<tr>
<td><code>Stop REPL</code></td><td><code>Alt+J</code> <code>Alt+K</code></td></tr>
<tr>
<td><code>Restart REPL</code></td><td><code>Alt+J</code> <code>Alt+R</code></td></tr>
<tr>
<td><code>Change Current Environment</code></td><td><code>Alt+J</code> <code>Alt+E</code></td></tr>
<tr>
<td><code>Show Documentation</code></td><td><code>Alt+J</code> <code>Alt+D</code></td></tr>
<tr>
<td><code>Show Plot</code></td><td><code>Alt+J</code> <code>Alt+P</code></td></tr>
<tr>
<td><code>REPLVariables.focus</code></td><td><code>Alt+J</code> <code>Alt+W</code></td></tr>
<tr>
<td><code>Interrupt Execution</code></td><td><code>Ctrl+Shift+C</code></td></tr>
<tr>
<td><code>Browse Back Documentation</code></td><td><code>Left</code></td></tr>
<tr>
<td><code>Browse Forward Documentation</code></td><td><code>Right</code></td></tr>
<tr>
<td><code>Show Previous Plot</code></td><td><code>Left</code></td></tr>
<tr>
<td><code>Show Next Plot</code></td><td><code>Right</code></td></tr>
<tr>
<td><code>Show First Plot</code></td><td><code>Home</code></td></tr>
<tr>
<td><code>Show Last Plot</code></td><td><code>End</code></td></tr>
<tr>
<td><code>Delete plot</code></td><td><code>Delete</code></td></tr>
<tr>
<td><code>Delete All Plots</code></td><td><code>Shift+Delete</code></td></tr>
</tbody>
</table>
</div><h1 id="heading-summary">Summary</h1>
<p>We reviewed some of the basics of the Julia VS Code extension. We looked at running a Julia file, the basics of code navigation and editing, the usefulness of the plot and table viewers, and some basic debugging features. This was only an overview, and there is much more that the extension has to offer! If you would like to do a deeper dive into the Julia VS Code extension, visit the <a target="_blank" href="https://www.julia-vscode.org/docs/dev/">official documentation</a> or their <a target="_blank" href="https://github.com/julia-vscode/julia-vscode">GitHub</a>.</p>
<p>If you would like to learn more about Julia so you can fully take advantage of the extension's features, check out our <a target="_blank" href="https://blog.glcs.io/series/julia-basics-programmers">Julia Basics</a> series!</p>
]]></content:encoded></item><item><title><![CDATA[Enhancing Healthcare Revenue Forecasting with DataFrames.jl and MemPool.jl: A Case Study]]></title><description><![CDATA[Introduction
The Great Lakes Consulting team collaborated with a major healthcare client to develop a Revenue Forecasting application using the Julia framework. This application allows in-memory processing of large healthcare claims datasets, enablin...]]></description><link>https://blog.glcs.io/healthcare-dataframes-mempool-case-study</link><guid isPermaLink="true">https://blog.glcs.io/healthcare-dataframes-mempool-case-study</guid><category><![CDATA[MemPool.jl]]></category><category><![CDATA[DataFrames]]></category><category><![CDATA[Julia]]></category><category><![CDATA[data analytics]]></category><dc:creator><![CDATA[Jeff Dixon]]></dc:creator><pubDate>Fri, 02 Aug 2024 10:39:23 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/stock/unsplash/dBI_My696Rk/upload/9159dfe057aef38af3064cb707f3f25f.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h2 id="heading-introduction">Introduction</h2>
<p>The Great Lakes Consulting team collaborated with a major healthcare client to develop a Revenue Forecasting application using the Julia framework. This application allows in-memory processing of large healthcare claims datasets, enabling real-time scenario modeling and quick analysis.</p>
<p>Recently, our team upgraded the application, replacing JuliaDB.jl with DataFrames.jl. During this project, JuliaDB.jl <code>NDSparse</code> objects were replaced with a custom <code>DCTable</code> object. This new custom object wraps a <code>DataFrame</code> and manages disk caching with <a target="_blank" href="https://github.com/JuliaData/MemPool.jl">MemPool.jl</a>. High-level design decisions and lessons learned are detailed in this blog post.</p>
<p>As a bonus, our healthcare client is looking to hire new developers to continue development and support of the Net Revenue Forecasting application and its Julia framework. There are two positions posted on their Jobs Portal. Please consider applying if you are interested.</p>
<p><a target="_blank" href="https://jobs.trinity-health.org/job/00503530/App-Programmer-Specialist-REMOTE">Julia Developer Specialist (REMOTE)</a></p>
<p><a target="_blank" href="https://jobs.trinity-health.org/job/00499053/Senior-Applications-Programmer-Analyst-REMOTE">Senior Applications Programmer / Analyst (REMOTE)</a></p>
<h1 id="heading-background"><strong>Background</strong></h1>
<p>In 2018, the GLCS team developed a Net Revenue and Mid-Month Forecasting application (NRF) for one of the largest not-for-profit healthcare systems in the U.S. This application is part of a suite of revenue tools that provide A/R Valuation, Revenue Forecasting, Analytics, and Reporting. It helps the health system create accurate revenue forecasts for over 200 hospitals, continuing care facilities, and urgent care locations nationwide.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1721074173775/a4705a43-5150-4e2a-86a8-cbf907bbeb1f.png" alt class="image--center mx-auto" /></p>
<p>The most recent generation of the NRF application leverages a services-based architecture powered by Julia. The platform facilitates in-memory processing for very large multi-dimensional datasets (8 GB – 12 GB) to enable real-time scenario modeling for users. Core data manipulation was originally performed with <a target="_blank" href="https://github.com/JuliaData/JuliaDB.jl">JuliaDB.jl</a>.</p>
<p>When the NRF application was developed in 2018, JuliaDB.jl was the best solution for:</p>
<ul>
<li><p>Loading multi-dimensional datasets quickly.</p>
</li>
<li><p>Indexing data and performing FILTER, AGGREGATE, SORT and JOIN operations.</p>
</li>
<li><p>Saving results and loading them back efficiently.</p>
</li>
<li><p>Leveraging Julia's built-in parallelism to fully utilize available resources.</p>
</li>
</ul>
<p>Now for some bad news. The last official release of this package was August 3, 2020 (v0.13.1). JuliaDB.jl is effectively abandoned and no longer receives support or maintenance updates. Its developers recommended <a target="_blank" href="https://github.com/JuliaParallel/DTables.jl">DTables.jl</a> and <a target="_blank" href="https://github.com/JuliaData/DataFrames.jl">DataFrames.jl</a> as the preferred alternatives to JuliaDB.jl.</p>
<p>Earlier this year, the GLCS team set out to upgrade the NRF application and replace JuliaDB.jl components within the framework. The first design leveraged DTables.jl, an abstraction layer on top of <a target="_blank" href="https://github.com/JuliaParallel/Dagger.jl">Dagger.jl</a> that allows for manipulation of table-like structures in a distributed environment.</p>
<h1 id="heading-issues-encountered"><strong>Issues Encountered</strong></h1>
<p>The initial builds showed some promise. In general, query and processing performance improved. However, when deployed to the test environment (which mimicked the production server configuration) with real-time user interactions and very large datasets, individual processes occasionally failed or hung without explanation. Due to the constraints of the test environment, a C-library function call within Dagger.jl, a dependency of DTables.jl, failed when scheduling parallel tasks. Additionally, the new framework had excessive memory (RAM) demands, which exhausted available resources and resulted in poor performance.</p>
<p>The initial versions were less stable for single-threaded, multi-process applications like NRF. The GLCS team reported several issues to the maintainers of DTables.jl and Dagger.jl to enhance their stability for applications like NRF. The resulting bug fixes and improvements increased effectiveness and efficiency. Local testing ran smoothly, but the application would still occasionally hang or crash when deployed to the official test environment.</p>
<h1 id="heading-root-cause-analysis"><strong>Root Cause Analysis</strong></h1>
<p>DTables.jl does not allow for modifying data in-place. As a result, data often had to be copied to another structure, modified, and then stored again using DTables.jl. This frequent copying of very large datasets led to high memory usage and slow performance.</p>
<p>Despite working closely with the maintainers of DTables.jl and Dagger.jl, NRF still faced occasional issues in the official test environment that did not occur in personal test setups. This suggests an incompatibility between Dagger.jl and the operating system used in the official test environment (RHEL 7.9).</p>
<h1 id="heading-solution"><strong>Solution</strong></h1>
<p>Because the only feature NRF needed from DTables.jl was disk caching (i.e., being able to swap data seamlessly from memory to disk, and vice versa), the team dropped DTables.jl and instead used MemPool.jl. The team replaced JuliaDB.jl <code>NDSparse</code> objects with custom <code>DCTable</code> objects (DC meaning disk-cacheable). The new custom object wraps a DataFrames.jl <code>DataFrame</code> object and manages disk caching with MemPool.jl. In code, a <code>DCTable</code> is used in the same way one would use a <code>DataFrame</code>. The difference is that whenever a <code>DCTable</code> is used, first the underlying <code>DataFrame</code> must be grabbed from MemPool.jl, fetching the <code>DataFrame</code> from disk if it has been cached. Then the <code>DataFrame</code> is used as normal. The result of the operation (if it is a <code>DataFrame</code>) is then wrapped in a new <code>DCTable</code> to be managed by MemPool.jl.</p>
<pre><code class="lang-julia"><span class="hljs-string">"""
DCTable --&gt; Disk-cacheable table (`DataFrame`).
Disk caching is managed via MemPool.jl, and a `DCTable` stores a `DRef` 
returned by calling `poolset` on the `DataFrame` to store.
The `DataFrame` can be retrieved by calling `fetch` on a `DCTable` 
(which calls `poolget` on the stored `DRef`).
"""</span>
<span class="hljs-keyword">struct</span> DCTable
    ref::DRef
    DCTable(df::DataFrame) = new(poolset(df; size = nbytes(df)))
<span class="hljs-keyword">end</span>
</code></pre>
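<p>To make the pattern concrete, here is a hypothetical sketch
(not the client's actual code; <code>dc_filter</code> is an invented name for illustration)
of how an operation on a <code>DCTable</code> round-trips through MemPool.jl:</p>

```julia
using DataFrames, MemPool

# Retrieve the underlying DataFrame, fetching from disk if it was cached.
Base.fetch(t::DCTable) = poolget(t.ref)

# A hypothetical wrapped operation: fetch, run the DataFrame code as
# normal, then wrap the resulting DataFrame so MemPool.jl manages it.
function dc_filter(f, t::DCTable)
    df = fetch(t)
    DCTable(filter(f, df))
end
```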
<p>The <code>DCTable</code> objects resolved the issues experienced with DTables.jl. First, because operating on a <code>DCTable</code> works on the underlying <code>DataFrame</code>, all the benefits of using <code>DataFrame</code>s apply, including being able to modify data in-place. As a result, using <code>DCTable</code>s requires much less memory. And second, relying on MemPool.jl drops the complex task-scheduling logic of Dagger.jl that seemed to be incompatible with the official NRF test environment.</p>
<p>With <code>DCTable</code> objects on hand, the team was able to replace all JuliaDB.jl function calls with equivalent operations on <code>DCTable</code>/<code>DataFrame</code> objects. The team ensured feature parity between new and old back-ends. Tests were developed to ensure the results obtained via the new back-end were consistent with the old back-end, and benchmarks were created to compare the performance of the old and new back-ends. Next, the team deployed the new codebase to the official test environment. Finally, after extensive internal testing and user-acceptance testing, the new codebase was officially deployed to the production environment.</p>
<h1 id="heading-follow-up-opportunities"><strong>Follow-up Opportunities</strong></h1>
<p>Are you a Julia programmer looking for a new opportunity? Our client is seeking additional help to continue developing and supporting the Net Revenue Forecasting application and its Julia framework. They have posted two positions on their Jobs Portal. Links to those positions are below. Please consider applying if you are interested.</p>
<p><a target="_blank" href="https://jobs.trinity-health.org/job/00503530/App-Programmer-Specialist-REMOTE">Julia Developer Specialist (REMOTE)</a></p>
<p><a target="_blank" href="https://jobs.trinity-health.org/job/00499053/Senior-Applications-Programmer-Analyst-REMOTE">Senior Applications Programmer / Analyst (REMOTE)</a></p>
]]></content:encoded></item><item><title><![CDATA[Getting Started with DataFrames.jl: A Beginner's Guide]]></title><description><![CDATA[When doing any sort of development one will often find themselves in need of working with data in a
tabular format. This is especially true for those of us in data science, or data analysis, fields.
In the Julia programming language one of the more p...]]></description><link>https://blog.glcs.io/julia-dataframes</link><guid isPermaLink="true">https://blog.glcs.io/julia-dataframes</guid><category><![CDATA[Julia]]></category><category><![CDATA[General Programming]]></category><dc:creator><![CDATA[Joel Nelson]]></dc:creator><pubDate>Wed, 08 May 2024 17:33:08 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1714499723041/n2NxMfsUG.gif?auto=format" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>When doing any sort of development one will often find themselves in need of working with data in a
tabular format. This is especially true for those of us in data science or data analysis fields.
In the <a target="_blank" href="https://julialang.org">Julia programming language</a>, one of the more popular libraries for this type of data
wrangling is <a target="_blank" href="https://dataframes.juliadata.org/stable/">DataFrames.jl</a>. In this blog post we'll explore the basics of working with this
package.</p>
<h1 id="heading-introduction">Introduction</h1>
<p>The great thing about a package like <code>DataFrames.jl</code> is that it bridges the gap between traditional
programming and SQL (Structured Query Language). Databases are great tools for easily gaining insights
into your data by joining, filtering, aggregating, sorting, and so on. <code>DataFrames.jl</code> brings those goodies right
into your hands simply by adding the package to your Julia session. So, let's get started!</p>
<h1 id="heading-getting-started">Getting Started</h1>
<p>Adding the package takes just a few simple steps.</p>
<pre><code class="lang-julia">julia&gt; <span class="hljs-keyword">using</span> Pkg
julia&gt; Pkg.add(<span class="hljs-string">"DataFrames"</span>)
julia&gt; <span class="hljs-keyword">using</span> DataFrames
</code></pre>
<p>The constructor for a <code>DataFrame</code> provides the flexibility to create one from arrays, tuples, constants, or files. The
documentation covers all of these, but for this post we'll just explore one of the more common ways.</p>
<pre><code class="lang-julia">julia&gt; df = DataFrame(a = <span class="hljs-number">1</span>:<span class="hljs-number">4</span>, b = rand(<span class="hljs-number">4</span>), c = <span class="hljs-string">"My first DataFrame"</span>)
<span class="hljs-number">4</span>×<span class="hljs-number">3</span> DataFrame
 Row │ a      b         c                  
     │ <span class="hljs-built_in">Int64</span>  <span class="hljs-built_in">Float64</span>   <span class="hljs-built_in">String</span>             
─────┼─────────────────────────────────────
   <span class="hljs-number">1</span> │     <span class="hljs-number">1</span>  <span class="hljs-number">0.141874</span>  My first DataFrame
   <span class="hljs-number">2</span> │     <span class="hljs-number">2</span>  <span class="hljs-number">0.432084</span>  My first DataFrame
   <span class="hljs-number">3</span> │     <span class="hljs-number">3</span>  <span class="hljs-number">0.47098</span>   My first DataFrame
   <span class="hljs-number">4</span> │     <span class="hljs-number">4</span>  <span class="hljs-number">0.414639</span>  My first DataFrame
</code></pre>
<p>You'll notice in the code above we use a mix of datatypes: a range, an array, and a scalar. The underlying vectors must all be the same size,
and the scalar gets broadcasted, or repeated, for each row. Also, note that the type of each column is inferred
from the values passed into the constructor.</p>
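<p>If you ever want to double-check the inferred types, one quick way (a small sketch) is to broadcast <code>eltype</code> over the columns:</p>
<pre><code class="lang-julia">julia&gt; eltype.(eachcol(df))
3-element Vector{DataType}:
 Int64
 Float64
 String
</code></pre>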
<p>Now, to access a column of a <code>DataFrame</code> there are also a few different possibilities. Here are a few examples of accessing the
first column "a".</p>
<pre><code class="lang-julia">julia&gt; df.a
<span class="hljs-number">4</span>-element <span class="hljs-built_in">Vector</span>{<span class="hljs-built_in">Int64</span>}:
 <span class="hljs-number">1</span>
 <span class="hljs-number">2</span>
 <span class="hljs-number">3</span>
 <span class="hljs-number">4</span>

julia&gt; df.<span class="hljs-string">"a"</span>
<span class="hljs-number">4</span>-element <span class="hljs-built_in">Vector</span>{<span class="hljs-built_in">Int64</span>}:
 <span class="hljs-number">1</span>
 <span class="hljs-number">2</span>
 <span class="hljs-number">3</span>
 <span class="hljs-number">4</span>

julia&gt; df[!, <span class="hljs-string">"a"</span>]
<span class="hljs-number">4</span>-element <span class="hljs-built_in">Vector</span>{<span class="hljs-built_in">Int64</span>}:
 <span class="hljs-number">1</span>
 <span class="hljs-number">2</span>
 <span class="hljs-number">3</span>
 <span class="hljs-number">4</span>

julia&gt; df[!, :a]
<span class="hljs-number">4</span>-element <span class="hljs-built_in">Vector</span>{<span class="hljs-built_in">Int64</span>}:
 <span class="hljs-number">1</span>
 <span class="hljs-number">2</span>
 <span class="hljs-number">3</span>
 <span class="hljs-number">4</span>

julia&gt; df[:, :a]
<span class="hljs-number">4</span>-element <span class="hljs-built_in">Vector</span>{<span class="hljs-built_in">Int64</span>}:
 <span class="hljs-number">1</span>
 <span class="hljs-number">2</span>
 <span class="hljs-number">3</span>
 <span class="hljs-number">4</span>
</code></pre>
<p>In these examples, columns can be accessed directly with literals such as <code>df.a</code>, or more
dynamically using brackets (since a variable could be substituted for the column name). You may also find
yourself wondering about the difference between <code>!</code> and <code>:</code>, which is an important distinction!</p>
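<p>Here is a small sketch of that dynamic access: the bracket syntax accepts a column name stored in a variable, something the dot-literal syntax cannot do.</p>
<pre><code class="lang-julia">julia&gt; colname = :a;  # column name chosen at runtime

julia&gt; df[!, colname]
4-element Vector{Int64}:
 1
 2
 3
 4
</code></pre>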
<p>The <code>!</code> returns the underlying vector while <code>:</code> returns a copy. We can showcase this with an
example where we attempt to change the description of the second value in column <code>c</code>
to "I love Julia!"</p>
<pre><code class="lang-julia">julia&gt; df[:, :c][<span class="hljs-number">2</span>] = <span class="hljs-string">"I love Julia!"</span>
"I love Julia!"

julia&gt; df
<span class="hljs-number">4</span>×<span class="hljs-number">3</span> DataFrame
 Row │ a      b         c                  
     │ <span class="hljs-built_in">Int64</span>  <span class="hljs-built_in">Float64</span>   <span class="hljs-built_in">String</span>             
─────┼─────────────────────────────────────
   <span class="hljs-number">1</span> │     <span class="hljs-number">1</span>  <span class="hljs-number">0.394165</span>  My first DataFrame
   <span class="hljs-number">2</span> │     <span class="hljs-number">2</span>  <span class="hljs-number">0.809883</span>  My first DataFrame
   <span class="hljs-number">3</span> │     <span class="hljs-number">3</span>  <span class="hljs-number">0.124035</span>  My first DataFrame
   <span class="hljs-number">4</span> │     <span class="hljs-number">4</span>  <span class="hljs-number">0.886781</span>  My first DataFrame

julia&gt; df[!, :c][<span class="hljs-number">2</span>] = <span class="hljs-string">"I love Julia!"</span>
"I love Julia!"

julia&gt; df
<span class="hljs-number">4</span>×<span class="hljs-number">3</span> DataFrame
 Row │ a      b         c                  
     │ <span class="hljs-built_in">Int64</span>  <span class="hljs-built_in">Float64</span>   <span class="hljs-built_in">String</span>             
─────┼─────────────────────────────────────
   <span class="hljs-number">1</span> │     <span class="hljs-number">1</span>  <span class="hljs-number">0.394165</span>  My first DataFrame
   <span class="hljs-number">2</span> │     <span class="hljs-number">2</span>  <span class="hljs-number">0.809883</span>  I love Julia!
   <span class="hljs-number">3</span> │     <span class="hljs-number">3</span>  <span class="hljs-number">0.124035</span>  My first DataFrame
   <span class="hljs-number">4</span> │     <span class="hljs-number">4</span>  <span class="hljs-number">0.886781</span>  My first DataFrame
</code></pre>
<p>Notice how the change only persists in <code>df</code> when we access the column with <code>!</code>.</p>
<p>There is often a tradeoff between returning copies and returning the actual underlying vectors.
Returning a copy is generally considered safer: if the copy is later mutated, the underlying
DataFrame remains unchanged. However, with very large DataFrames, copying on every column access
results in increased memory usage. It is best to weigh those considerations and figure out which
approach works best for a given program.</p>
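<p>One way to see the copy-versus-view distinction for yourself (a quick sketch) is to compare object identity with <code>===</code>:</p>
<pre><code class="lang-julia">julia&gt; df[!, :a] === df.a  # both return the underlying vector itself
true

julia&gt; df[:, :a] === df.a  # a freshly allocated copy, so a different object
false
</code></pre>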
<h1 id="heading-data-wrangling">Data Wrangling</h1>
<h2 id="heading-import-export">Import / Export</h2>
<p>Another great feature of the Julia ecosystem is that many different packages interact well
when used together. For instance, <code>DataFrames.jl</code> and <code>CSV.jl</code> can be combined to very easily import and export
data.</p>
<p>First, we can save the <code>DataFrame</code> from above to CSV.</p>
<pre><code class="lang-julia">julia&gt; <span class="hljs-keyword">using</span> CSV

julia&gt; path = joinpath(homedir(), <span class="hljs-string">"my_df.csv"</span>)

julia&gt; CSV.write(path, df)
</code></pre>
<p>And, reading in the <code>DataFrame</code> from file is just as easy!</p>
<pre><code class="lang-julia">julia&gt; CSV.read(path, DataFrame)
<span class="hljs-number">4</span>×<span class="hljs-number">3</span> DataFrame
 Row │ a      b         c                  
     │ <span class="hljs-built_in">Int64</span>  <span class="hljs-built_in">Float64</span>   String31           
─────┼─────────────────────────────────────
   <span class="hljs-number">1</span> │     <span class="hljs-number">1</span>  <span class="hljs-number">0.601361</span>  My first DataFrame
   <span class="hljs-number">2</span> │     <span class="hljs-number">2</span>  <span class="hljs-number">0.178065</span>  My first DataFrame
   <span class="hljs-number">3</span> │     <span class="hljs-number">3</span>  <span class="hljs-number">0.729591</span>  My first DataFrame
   <span class="hljs-number">4</span> │     <span class="hljs-number">4</span>  <span class="hljs-number">0.280314</span>  My first DataFrame
</code></pre>
<p>There are many keyword arguments to explore when handling CSV files, and the
<a target="_blank" href="https://csv.juliadata.org/stable/">CSV.jl</a> documentation is the best place to see them all.</p>
<p><code>DataFrames.jl</code> also supports writing to and reading from multiple file types, such as Arrow, JSON, and Parquet, through companion packages.</p>
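<p>As one hedged example (assuming the <code>Arrow</code> package is installed alongside <code>DataFrames</code>), round-tripping our <code>DataFrame</code> through the Arrow format looks much like the CSV workflow:</p>
<pre><code class="lang-julia">julia&gt; using Arrow

julia&gt; arrow_path = joinpath(homedir(), "my_df.arrow");

julia&gt; Arrow.write(arrow_path, df);  # serialize the DataFrame to an Arrow file

julia&gt; df2 = DataFrame(Arrow.Table(arrow_path));  # read it back as a DataFrame

julia&gt; size(df2)
(4, 3)
</code></pre>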
<h2 id="heading-joins">Joins</h2>
<p>A join is a way to merge data from two DataFrames into a single DataFrame. There are several types of joins,
and they generally mimic the ones a database would support.</p>
<ul>
<li><code>innerjoin</code></li>
<li><code>leftjoin</code></li>
<li><code>rightjoin</code></li>
<li><code>outerjoin</code></li>
<li><code>semijoin</code></li>
<li><code>antijoin</code></li>
<li><code>crossjoin</code></li>
</ul>
<p>Definitions of each can be found in either the documentation or the docstrings, but let's take a look at a few
examples. Say we have the following <code>DataFrame</code>s containing information from a school.</p>
<pre><code class="lang-julia">julia&gt; student_df = DataFrame(student_id = <span class="hljs-number">1</span>:<span class="hljs-number">10</span>, student_name = [<span class="hljs-string">"Joe"</span>, <span class="hljs-string">"Sally"</span>, <span class="hljs-string">"Jim"</span>, <span class="hljs-string">"Sandy"</span>, <span class="hljs-string">"Beth"</span>, <span class="hljs-string">"Alex"</span>, <span class="hljs-string">"Tom"</span>, <span class="hljs-string">"Liz"</span>, <span class="hljs-string">"Bill"</span>, <span class="hljs-string">"Carl"</span>], teacher_id = repeat([<span class="hljs-number">1</span>,<span class="hljs-number">2</span>],<span class="hljs-number">5</span>))
<span class="hljs-number">10</span>×<span class="hljs-number">3</span> DataFrame
 Row │ student_id  student_name  teacher_id 
     │ <span class="hljs-built_in">Int64</span>       <span class="hljs-built_in">String</span>        <span class="hljs-built_in">Int64</span>      
─────┼──────────────────────────────────────
   <span class="hljs-number">1</span> │          <span class="hljs-number">1</span>  Joe                    <span class="hljs-number">1</span>
   <span class="hljs-number">2</span> │          <span class="hljs-number">2</span>  Sally                  <span class="hljs-number">2</span>
   <span class="hljs-number">3</span> │          <span class="hljs-number">3</span>  Jim                    <span class="hljs-number">1</span>
   <span class="hljs-number">4</span> │          <span class="hljs-number">4</span>  Sandy                  <span class="hljs-number">2</span>
   <span class="hljs-number">5</span> │          <span class="hljs-number">5</span>  Beth                   <span class="hljs-number">1</span>
   <span class="hljs-number">6</span> │          <span class="hljs-number">6</span>  Alex                   <span class="hljs-number">2</span>
   <span class="hljs-number">7</span> │          <span class="hljs-number">7</span>  Tom                    <span class="hljs-number">1</span>
   <span class="hljs-number">8</span> │          <span class="hljs-number">8</span>  Liz                    <span class="hljs-number">2</span>
   <span class="hljs-number">9</span> │          <span class="hljs-number">9</span>  Bill                   <span class="hljs-number">1</span>
  <span class="hljs-number">10</span> │         <span class="hljs-number">10</span>  Carl                   <span class="hljs-number">2</span>

julia&gt; teacher_df = DataFrame(teacher_id = <span class="hljs-number">1</span>:<span class="hljs-number">2</span>, teacher_name = [<span class="hljs-string">"Mr. Jackson"</span>, <span class="hljs-string">"Ms. Smith"</span>])
<span class="hljs-number">2</span>×<span class="hljs-number">2</span> DataFrame
 Row │ teacher_id  teacher_name 
     │ <span class="hljs-built_in">Int64</span>       <span class="hljs-built_in">String</span>       
─────┼──────────────────────────
   <span class="hljs-number">1</span> │          <span class="hljs-number">1</span>  Mr. Jackson
   <span class="hljs-number">2</span> │          <span class="hljs-number">2</span>  Ms. Smith

julia&gt; grade_df = DataFrame(exam_id = <span class="hljs-number">1</span>, student_id = vcat(<span class="hljs-number">1</span>:<span class="hljs-number">3</span>, <span class="hljs-number">5</span>:<span class="hljs-number">10</span>), grade = [<span class="hljs-number">0.95</span>, <span class="hljs-number">0.93</span>, <span class="hljs-number">0.81</span>, <span class="hljs-number">0.85</span>, <span class="hljs-number">0.73</span>, <span class="hljs-number">0.88</span>, <span class="hljs-number">0.77</span>, <span class="hljs-number">0.75</span>, <span class="hljs-number">0.93</span>])
<span class="hljs-number">9</span>×<span class="hljs-number">3</span> DataFrame
 Row │ exam_id  student_id  grade   
     │ <span class="hljs-built_in">Int64</span>    <span class="hljs-built_in">Int64</span>       <span class="hljs-built_in">Float64</span> 
─────┼──────────────────────────────
   <span class="hljs-number">1</span> │       <span class="hljs-number">1</span>           <span class="hljs-number">1</span>     <span class="hljs-number">0.95</span>
   <span class="hljs-number">2</span> │       <span class="hljs-number">1</span>           <span class="hljs-number">2</span>     <span class="hljs-number">0.93</span>
   <span class="hljs-number">3</span> │       <span class="hljs-number">1</span>           <span class="hljs-number">3</span>     <span class="hljs-number">0.81</span>
   <span class="hljs-number">4</span> │       <span class="hljs-number">1</span>           <span class="hljs-number">5</span>     <span class="hljs-number">0.85</span>
   <span class="hljs-number">5</span> │       <span class="hljs-number">1</span>           <span class="hljs-number">6</span>     <span class="hljs-number">0.73</span>
   <span class="hljs-number">6</span> │       <span class="hljs-number">1</span>           <span class="hljs-number">7</span>     <span class="hljs-number">0.88</span>
   <span class="hljs-number">7</span> │       <span class="hljs-number">1</span>           <span class="hljs-number">8</span>     <span class="hljs-number">0.77</span>
   <span class="hljs-number">8</span> │       <span class="hljs-number">1</span>           <span class="hljs-number">9</span>     <span class="hljs-number">0.75</span>
   <span class="hljs-number">9</span> │       <span class="hljs-number">1</span>          <span class="hljs-number">10</span>     <span class="hljs-number">0.93</span>
</code></pre>
<p>If we look at <code>grade_df</code> we can see there are 9 results, but <code>student_df</code> has 10 students.
So, someone must have missed the exam! Let's find out who, so that we can alert the teacher to schedule
a makeup.</p>
<p>Let's do a <code>leftjoin</code>, which means every row from the first <code>DataFrame</code> will persist regardless of whether it has
a match in the second <code>DataFrame</code>. The <code>leftjoin</code> function also takes an <code>on</code> keyword argument
that specifies which column to use to find matches.</p>
<pre><code class="lang-julia">julia&gt; student_grade_df = leftjoin(student_df, grade_df, on=:student_id)
<span class="hljs-number">10</span>×<span class="hljs-number">5</span> DataFrame
 Row │ student_id  student_name  teacher_id  exam_id  grade      
     │ <span class="hljs-built_in">Int64</span>       <span class="hljs-built_in">String</span>        <span class="hljs-built_in">Int64</span>       <span class="hljs-built_in">Int64</span>?   <span class="hljs-built_in">Float64</span>?   
─────┼───────────────────────────────────────────────────────────
   <span class="hljs-number">1</span> │          <span class="hljs-number">1</span>  Joe                    <span class="hljs-number">1</span>        <span class="hljs-number">1</span>        <span class="hljs-number">0.95</span>
   <span class="hljs-number">2</span> │          <span class="hljs-number">2</span>  Sally                  <span class="hljs-number">2</span>        <span class="hljs-number">1</span>        <span class="hljs-number">0.93</span>
   <span class="hljs-number">3</span> │          <span class="hljs-number">3</span>  Jim                    <span class="hljs-number">1</span>        <span class="hljs-number">1</span>        <span class="hljs-number">0.81</span>
   <span class="hljs-number">4</span> │          <span class="hljs-number">5</span>  Beth                   <span class="hljs-number">1</span>        <span class="hljs-number">1</span>        <span class="hljs-number">0.85</span>
   <span class="hljs-number">5</span> │          <span class="hljs-number">6</span>  Alex                   <span class="hljs-number">2</span>        <span class="hljs-number">1</span>        <span class="hljs-number">0.73</span>
   <span class="hljs-number">6</span> │          <span class="hljs-number">7</span>  Tom                    <span class="hljs-number">1</span>        <span class="hljs-number">1</span>        <span class="hljs-number">0.88</span>
   <span class="hljs-number">7</span> │          <span class="hljs-number">8</span>  Liz                    <span class="hljs-number">2</span>        <span class="hljs-number">1</span>        <span class="hljs-number">0.77</span>
   <span class="hljs-number">8</span> │          <span class="hljs-number">9</span>  Bill                   <span class="hljs-number">1</span>        <span class="hljs-number">1</span>        <span class="hljs-number">0.75</span>
   <span class="hljs-number">9</span> │         <span class="hljs-number">10</span>  Carl                   <span class="hljs-number">2</span>        <span class="hljs-number">1</span>        <span class="hljs-number">0.93</span>
  <span class="hljs-number">10</span> │          <span class="hljs-number">4</span>  Sandy                  <span class="hljs-number">2</span>  missing  missing
</code></pre>
<p>We notice Sandy has a <code>missing</code> value for both the <code>exam_id</code> and <code>grade</code> fields. <code>missing</code> is a
special value in Julia, similar to a <code>NULL</code> value in databases. It signifies
that there was no match in <code>grade_df</code>, meaning Sandy missed the exam. We can add one more
join to get each student's teacher's name.</p>
<pre><code class="lang-julia">julia&gt; result_df = innerjoin(student_grade_df, teacher_df, on=:teacher_id)
<span class="hljs-number">10</span>×<span class="hljs-number">6</span> DataFrame
 Row │ student_id  student_name  teacher_id  exam_id  grade       teacher_name 
     │ <span class="hljs-built_in">Int64</span>       <span class="hljs-built_in">String</span>        <span class="hljs-built_in">Int64</span>       <span class="hljs-built_in">Int64</span>?   <span class="hljs-built_in">Float64</span>?    <span class="hljs-built_in">String</span>       
─────┼─────────────────────────────────────────────────────────────────────────
   <span class="hljs-number">1</span> │          <span class="hljs-number">1</span>  Joe                    <span class="hljs-number">1</span>        <span class="hljs-number">1</span>        <span class="hljs-number">0.95</span>  Mr. Jackson
   <span class="hljs-number">2</span> │          <span class="hljs-number">2</span>  Sally                  <span class="hljs-number">2</span>        <span class="hljs-number">1</span>        <span class="hljs-number">0.93</span>  Ms. Smith
   <span class="hljs-number">3</span> │          <span class="hljs-number">3</span>  Jim                    <span class="hljs-number">1</span>        <span class="hljs-number">1</span>        <span class="hljs-number">0.81</span>  Mr. Jackson
   <span class="hljs-number">4</span> │          <span class="hljs-number">5</span>  Beth                   <span class="hljs-number">1</span>        <span class="hljs-number">1</span>        <span class="hljs-number">0.85</span>  Mr. Jackson
   <span class="hljs-number">5</span> │          <span class="hljs-number">6</span>  Alex                   <span class="hljs-number">2</span>        <span class="hljs-number">1</span>        <span class="hljs-number">0.73</span>  Ms. Smith
   <span class="hljs-number">6</span> │          <span class="hljs-number">7</span>  Tom                    <span class="hljs-number">1</span>        <span class="hljs-number">1</span>        <span class="hljs-number">0.88</span>  Mr. Jackson
   <span class="hljs-number">7</span> │          <span class="hljs-number">8</span>  Liz                    <span class="hljs-number">2</span>        <span class="hljs-number">1</span>        <span class="hljs-number">0.77</span>  Ms. Smith
   <span class="hljs-number">8</span> │          <span class="hljs-number">9</span>  Bill                   <span class="hljs-number">1</span>        <span class="hljs-number">1</span>        <span class="hljs-number">0.75</span>  Mr. Jackson
   <span class="hljs-number">9</span> │         <span class="hljs-number">10</span>  Carl                   <span class="hljs-number">2</span>        <span class="hljs-number">1</span>        <span class="hljs-number">0.93</span>  Ms. Smith
  <span class="hljs-number">10</span> │          <span class="hljs-number">4</span>  Sandy                  <span class="hljs-number">2</span>  missing  missing     Ms. Smith
</code></pre>
<p>We used an <code>innerjoin</code> this time since we know that every student has a teacher assigned.
Now we can let Ms. Smith know that she needs to reach out to Sandy to reschedule her exam.</p>
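<p>Incidentally, this is exactly the question <code>antijoin</code> answers in one step: it keeps only the rows of the first <code>DataFrame</code> that have no match in the second. A quick sketch with the same tables:</p>
<pre><code class="lang-julia">julia&gt; antijoin(student_df, grade_df, on=:student_id)
1×3 DataFrame
 Row │ student_id  student_name  teacher_id
     │ Int64       String        Int64
─────┼──────────────────────────────────────
   1 │          4  Sandy                  2
</code></pre>
<p>Sandy pops out immediately as the only student with no grade on record.</p>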
<h2 id="heading-sorting">Sorting</h2>
<p>Another helpful function for analysis is <code>sort</code>. Let's sort our <code>result_df</code> by the <code>grade</code> column.</p>
<pre><code class="lang-julia">julia&gt; sort(result_df, [:grade])
<span class="hljs-number">10</span>×<span class="hljs-number">6</span> DataFrame
 Row │ student_id  student_name  teacher_id  exam_id  grade       teacher_name 
     │ <span class="hljs-built_in">Int64</span>       <span class="hljs-built_in">String</span>        <span class="hljs-built_in">Int64</span>       <span class="hljs-built_in">Int64</span>?   <span class="hljs-built_in">Float64</span>?    <span class="hljs-built_in">String</span>       
─────┼─────────────────────────────────────────────────────────────────────────
   <span class="hljs-number">1</span> │          <span class="hljs-number">6</span>  Alex                   <span class="hljs-number">2</span>        <span class="hljs-number">1</span>        <span class="hljs-number">0.73</span>  Ms. Smith
   <span class="hljs-number">2</span> │          <span class="hljs-number">9</span>  Bill                   <span class="hljs-number">1</span>        <span class="hljs-number">1</span>        <span class="hljs-number">0.75</span>  Mr. Jackson
   <span class="hljs-number">3</span> │          <span class="hljs-number">8</span>  Liz                    <span class="hljs-number">2</span>        <span class="hljs-number">1</span>        <span class="hljs-number">0.77</span>  Ms. Smith
   <span class="hljs-number">4</span> │          <span class="hljs-number">3</span>  Jim                    <span class="hljs-number">1</span>        <span class="hljs-number">1</span>        <span class="hljs-number">0.81</span>  Mr. Jackson
   <span class="hljs-number">5</span> │          <span class="hljs-number">5</span>  Beth                   <span class="hljs-number">1</span>        <span class="hljs-number">1</span>        <span class="hljs-number">0.85</span>  Mr. Jackson
   <span class="hljs-number">6</span> │          <span class="hljs-number">7</span>  Tom                    <span class="hljs-number">1</span>        <span class="hljs-number">1</span>        <span class="hljs-number">0.88</span>  Mr. Jackson
   <span class="hljs-number">7</span> │          <span class="hljs-number">2</span>  Sally                  <span class="hljs-number">2</span>        <span class="hljs-number">1</span>        <span class="hljs-number">0.93</span>  Ms. Smith
   <span class="hljs-number">8</span> │         <span class="hljs-number">10</span>  Carl                   <span class="hljs-number">2</span>        <span class="hljs-number">1</span>        <span class="hljs-number">0.93</span>  Ms. Smith
   <span class="hljs-number">9</span> │          <span class="hljs-number">1</span>  Joe                    <span class="hljs-number">1</span>        <span class="hljs-number">1</span>        <span class="hljs-number">0.95</span>  Mr. Jackson
  <span class="hljs-number">10</span> │          <span class="hljs-number">4</span>  Sandy                  <span class="hljs-number">2</span>  missing  missing     Ms. Smith
</code></pre>
<p>The function takes the <code>DataFrame</code> and an array of columns to sort on. By default <code>sort</code> puts the lowest
grade first; if we want descending order, we can pass the <code>rev</code> keyword argument.</p>
<pre><code class="lang-julia">julia&gt; sort(result_df, [:grade], rev=<span class="hljs-literal">true</span>)
<span class="hljs-number">10</span>×<span class="hljs-number">6</span> DataFrame
 Row │ student_id  student_name  teacher_id  exam_id  grade       teacher_name 
     │ <span class="hljs-built_in">Int64</span>       <span class="hljs-built_in">String</span>        <span class="hljs-built_in">Int64</span>       <span class="hljs-built_in">Int64</span>?   <span class="hljs-built_in">Float64</span>?    <span class="hljs-built_in">String</span>       
─────┼─────────────────────────────────────────────────────────────────────────
   <span class="hljs-number">1</span> │          <span class="hljs-number">4</span>  Sandy                  <span class="hljs-number">2</span>  missing  missing     Ms. Smith
   <span class="hljs-number">2</span> │          <span class="hljs-number">1</span>  Joe                    <span class="hljs-number">1</span>        <span class="hljs-number">1</span>        <span class="hljs-number">0.95</span>  Mr. Jackson
   <span class="hljs-number">3</span> │          <span class="hljs-number">2</span>  Sally                  <span class="hljs-number">2</span>        <span class="hljs-number">1</span>        <span class="hljs-number">0.93</span>  Ms. Smith
   <span class="hljs-number">4</span> │         <span class="hljs-number">10</span>  Carl                   <span class="hljs-number">2</span>        <span class="hljs-number">1</span>        <span class="hljs-number">0.93</span>  Ms. Smith
   <span class="hljs-number">5</span> │          <span class="hljs-number">7</span>  Tom                    <span class="hljs-number">1</span>        <span class="hljs-number">1</span>        <span class="hljs-number">0.88</span>  Mr. Jackson
   <span class="hljs-number">6</span> │          <span class="hljs-number">5</span>  Beth                   <span class="hljs-number">1</span>        <span class="hljs-number">1</span>        <span class="hljs-number">0.85</span>  Mr. Jackson
   <span class="hljs-number">7</span> │          <span class="hljs-number">3</span>  Jim                    <span class="hljs-number">1</span>        <span class="hljs-number">1</span>        <span class="hljs-number">0.81</span>  Mr. Jackson
   <span class="hljs-number">8</span> │          <span class="hljs-number">8</span>  Liz                    <span class="hljs-number">2</span>        <span class="hljs-number">1</span>        <span class="hljs-number">0.77</span>  Ms. Smith
   <span class="hljs-number">9</span> │          <span class="hljs-number">9</span>  Bill                   <span class="hljs-number">1</span>        <span class="hljs-number">1</span>        <span class="hljs-number">0.75</span>  Mr. Jackson
  <span class="hljs-number">10</span> │          <span class="hljs-number">6</span>  Alex                   <span class="hljs-number">2</span>        <span class="hljs-number">1</span>        <span class="hljs-number">0.73</span>  Ms. Smith
</code></pre>
<p>In both of these cases a copy of the <code>DataFrame</code> is returned and <code>result_df</code> is left unchanged. But if we want to
sort in place, we can use the <code>sort!</code> function, which updates the passed <code>DataFrame</code>.</p>
<pre><code class="lang-julia">julia&gt; sort!(result_df, [:grade], rev=<span class="hljs-literal">true</span>)
<span class="hljs-number">10</span>×<span class="hljs-number">6</span> DataFrame
 Row │ student_id  student_name  teacher_id  exam_id  grade       teacher_name 
     │ <span class="hljs-built_in">Int64</span>       <span class="hljs-built_in">String</span>        <span class="hljs-built_in">Int64</span>       <span class="hljs-built_in">Int64</span>?   <span class="hljs-built_in">Float64</span>?    <span class="hljs-built_in">String</span>       
─────┼─────────────────────────────────────────────────────────────────────────
   <span class="hljs-number">1</span> │          <span class="hljs-number">4</span>  Sandy                  <span class="hljs-number">2</span>  missing  missing     Ms. Smith
   <span class="hljs-number">2</span> │          <span class="hljs-number">1</span>  Joe                    <span class="hljs-number">1</span>        <span class="hljs-number">1</span>        <span class="hljs-number">0.95</span>  Mr. Jackson
   <span class="hljs-number">3</span> │          <span class="hljs-number">2</span>  Sally                  <span class="hljs-number">2</span>        <span class="hljs-number">1</span>        <span class="hljs-number">0.93</span>  Ms. Smith
   <span class="hljs-number">4</span> │         <span class="hljs-number">10</span>  Carl                   <span class="hljs-number">2</span>        <span class="hljs-number">1</span>        <span class="hljs-number">0.93</span>  Ms. Smith
   <span class="hljs-number">5</span> │          <span class="hljs-number">7</span>  Tom                    <span class="hljs-number">1</span>        <span class="hljs-number">1</span>        <span class="hljs-number">0.88</span>  Mr. Jackson
   <span class="hljs-number">6</span> │          <span class="hljs-number">5</span>  Beth                   <span class="hljs-number">1</span>        <span class="hljs-number">1</span>        <span class="hljs-number">0.85</span>  Mr. Jackson
   <span class="hljs-number">7</span> │          <span class="hljs-number">3</span>  Jim                    <span class="hljs-number">1</span>        <span class="hljs-number">1</span>        <span class="hljs-number">0.81</span>  Mr. Jackson
   <span class="hljs-number">8</span> │          <span class="hljs-number">8</span>  Liz                    <span class="hljs-number">2</span>        <span class="hljs-number">1</span>        <span class="hljs-number">0.77</span>  Ms. Smith
   <span class="hljs-number">9</span> │          <span class="hljs-number">9</span>  Bill                   <span class="hljs-number">1</span>        <span class="hljs-number">1</span>        <span class="hljs-number">0.75</span>  Mr. Jackson
  <span class="hljs-number">10</span> │          <span class="hljs-number">6</span>  Alex                   <span class="hljs-number">2</span>        <span class="hljs-number">1</span>        <span class="hljs-number">0.73</span>  Ms. Smith
</code></pre>
<h2 id="heading-split-apply-combine">Split-apply-combine</h2>
<p>Now that we have some basics down, it's time to dive into aggregating results. In <code>DataFrames.jl</code> this is
referred to as the split-apply-combine strategy. It's a bit of a mouthful, so let's walk through exactly
what it means.</p>
<p>Split simply means breaking the <code>DataFrame</code> into groups using the <code>groupby</code> function. In our example, let's split
our <code>DataFrame</code> by the <code>teacher_name</code> column.</p>
<pre><code class="lang-julia">julia&gt; grouped_df = groupby(result_df, :teacher_name)
GroupedDataFrame with <span class="hljs-number">2</span> groups based on key: teacher_name
First Group (<span class="hljs-number">5</span> rows): teacher_name = <span class="hljs-string">"Mr. Jackson"</span>
 Row │ student_id  student_name  teacher_id  exam_id  grade     teacher_name 
     │ <span class="hljs-built_in">Int64</span>       <span class="hljs-built_in">String</span>        <span class="hljs-built_in">Int64</span>       <span class="hljs-built_in">Int64</span>?   <span class="hljs-built_in">Float64</span>?  <span class="hljs-built_in">String</span>       
─────┼───────────────────────────────────────────────────────────────────────
   <span class="hljs-number">1</span> │          <span class="hljs-number">1</span>  Joe                    <span class="hljs-number">1</span>        <span class="hljs-number">1</span>      <span class="hljs-number">0.95</span>  Mr. Jackson
   <span class="hljs-number">2</span> │          <span class="hljs-number">3</span>  Jim                    <span class="hljs-number">1</span>        <span class="hljs-number">1</span>      <span class="hljs-number">0.81</span>  Mr. Jackson
   <span class="hljs-number">3</span> │          <span class="hljs-number">5</span>  Beth                   <span class="hljs-number">1</span>        <span class="hljs-number">1</span>      <span class="hljs-number">0.85</span>  Mr. Jackson
   <span class="hljs-number">4</span> │          <span class="hljs-number">7</span>  Tom                    <span class="hljs-number">1</span>        <span class="hljs-number">1</span>      <span class="hljs-number">0.88</span>  Mr. Jackson
   <span class="hljs-number">5</span> │          <span class="hljs-number">9</span>  Bill                   <span class="hljs-number">1</span>        <span class="hljs-number">1</span>      <span class="hljs-number">0.75</span>  Mr. Jackson
⋮
Last Group (<span class="hljs-number">5</span> rows): teacher_name = <span class="hljs-string">"Ms. Smith"</span>
 Row │ student_id  student_name  teacher_id  exam_id  grade       teacher_name 
     │ <span class="hljs-built_in">Int64</span>       <span class="hljs-built_in">String</span>        <span class="hljs-built_in">Int64</span>       <span class="hljs-built_in">Int64</span>?   <span class="hljs-built_in">Float64</span>?    <span class="hljs-built_in">String</span>       
─────┼─────────────────────────────────────────────────────────────────────────
   <span class="hljs-number">1</span> │          <span class="hljs-number">2</span>  Sally                  <span class="hljs-number">2</span>        <span class="hljs-number">1</span>        <span class="hljs-number">0.93</span>  Ms. Smith
   <span class="hljs-number">2</span> │          <span class="hljs-number">6</span>  Alex                   <span class="hljs-number">2</span>        <span class="hljs-number">1</span>        <span class="hljs-number">0.73</span>  Ms. Smith
   <span class="hljs-number">3</span> │          <span class="hljs-number">8</span>  Liz                    <span class="hljs-number">2</span>        <span class="hljs-number">1</span>        <span class="hljs-number">0.77</span>  Ms. Smith
   <span class="hljs-number">4</span> │         <span class="hljs-number">10</span>  Carl                   <span class="hljs-number">2</span>        <span class="hljs-number">1</span>        <span class="hljs-number">0.93</span>  Ms. Smith
   <span class="hljs-number">5</span> │          <span class="hljs-number">4</span>  Sandy                  <span class="hljs-number">2</span>  missing  missing     Ms. Smith
</code></pre>
<p>The result of calling <code>groupby</code> is a <code>GroupedDataFrame</code>, which is essentially a wrapper
around one or more groups of a <code>DataFrame</code>. In our example we have two teachers, so the resulting
<code>GroupedDataFrame</code> has two groups.</p>
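<p>As a quick aside (not shown in the session above), individual groups can be pulled out of a <code>GroupedDataFrame</code> by integer position or by key; a small sketch:</p>
<pre><code class="lang-julia"># Index by position:
first_group = grouped_df[1]   # SubDataFrame for the first group

# Or index by the grouping key, using a NamedTuple:
smith_group = grouped_df[(teacher_name = "Ms. Smith",)]

# List all group keys:
keys(grouped_df)
</code></pre>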
<p>Now, let's try to compute the average exam grade for each of our two teachers. This introduces the <code>combine</code>
function, which takes a <code>GroupedDataFrame</code> and any number of aggregation functions. Let's also add
the <code>Statistics.jl</code> package, so we can take advantage of the <code>mean</code> function.</p>
<pre><code class="lang-julia">julia&gt; <span class="hljs-keyword">using</span> Statistics

julia&gt; combine(grouped_df, :grade =&gt; mean)
<span class="hljs-number">2</span>×<span class="hljs-number">2</span> DataFrame
 Row │ teacher_name  grade_mean  
     │ <span class="hljs-built_in">String</span>        <span class="hljs-built_in">Float64</span>?    
─────┼───────────────────────────
   <span class="hljs-number">1</span> │ Mr. Jackson         <span class="hljs-number">0.848</span>
   <span class="hljs-number">2</span> │ Ms. Smith     missing
</code></pre>
<p>The result is a <code>DataFrame</code> where the first column(s) match our <code>GroupedDataFrame</code> key(s) and the
subsequent column(s) match the function(s) we passed for aggregation. However, Ms. Smith has a <code>grade_mean</code>
of <code>missing</code>!?</p>
<p>In our earlier discussion we found that Sandy missed the exam, so her grade was set to <code>missing</code>. A <code>missing</code> value
behaves differently than normal numbers, which is problematic in our aggregation function. Take a look at a very simple
example.</p>
<pre><code class="lang-julia">julia&gt; <span class="hljs-number">1</span> + missing
missing
</code></pre>
<p>Notice that adding 1 to <code>missing</code> yields <code>missing</code>. This propagation is intentional, and you may be wondering:
why not just treat <code>missing</code> as 0? Let's see what happens to our results if we replace <code>missing</code> with 0.</p>
<pre><code class="lang-julia">julia&gt; combine(grouped_df, :grade =&gt; (x -&gt; mean(coalesce.(x, <span class="hljs-number">0</span>))))
<span class="hljs-number">2</span>×<span class="hljs-number">2</span> DataFrame
 Row │ teacher_name  grade_function 
     │ <span class="hljs-built_in">String</span>        <span class="hljs-built_in">Float64</span>        
─────┼──────────────────────────────
   <span class="hljs-number">1</span> │ Mr. Jackson            <span class="hljs-number">0.848</span>
   <span class="hljs-number">2</span> │ Ms. Smith              <span class="hljs-number">0.672</span>
</code></pre>
<p>In the above example, instead of just passing <code>mean</code> as the function, we create an <a target="_blank" href="https://docs.julialang.org/en/v1/manual/functions/#man-anonymous-functions">anonymous function</a>. This lets us get a little
more clever by adding a <code>coalesce</code> to replace the <code>missing</code> values with 0. The results show that Ms. Smith has a much lower
scoring average than Mr. Jackson. But if we think about it, the results are being incorrectly skewed. We know Sandy didn't actually
score a 0; she didn't take the test at all. Treating her result as a 0 drags the average much lower than it should be.</p>
<p>In some cases replacing with a 0 would make sense, but not in this scenario. Here are a few better options:</p>
<p>We could just drop the rows that contain <code>missing</code> values prior to aggregation. <code>DataFrames.jl</code> provides a <code>dropmissing</code> function
specifically for this.</p>
<pre><code class="lang-julia">julia&gt; result_no_missing_df = dropmissing(result_df)
<span class="hljs-number">9</span>×<span class="hljs-number">6</span> DataFrame
 Row │ student_id  student_name  teacher_id  exam_id  grade    teacher_name 
     │ <span class="hljs-built_in">Int64</span>       <span class="hljs-built_in">String</span>        <span class="hljs-built_in">Int64</span>       <span class="hljs-built_in">Int64</span>    <span class="hljs-built_in">Float64</span>  <span class="hljs-built_in">String</span>       
─────┼──────────────────────────────────────────────────────────────────────
   <span class="hljs-number">1</span> │          <span class="hljs-number">1</span>  Joe                    <span class="hljs-number">1</span>        <span class="hljs-number">1</span>     <span class="hljs-number">0.95</span>  Mr. Jackson
   <span class="hljs-number">2</span> │          <span class="hljs-number">2</span>  Sally                  <span class="hljs-number">2</span>        <span class="hljs-number">1</span>     <span class="hljs-number">0.93</span>  Ms. Smith
   <span class="hljs-number">3</span> │          <span class="hljs-number">3</span>  Jim                    <span class="hljs-number">1</span>        <span class="hljs-number">1</span>     <span class="hljs-number">0.81</span>  Mr. Jackson
   <span class="hljs-number">4</span> │          <span class="hljs-number">5</span>  Beth                   <span class="hljs-number">1</span>        <span class="hljs-number">1</span>     <span class="hljs-number">0.85</span>  Mr. Jackson
   <span class="hljs-number">5</span> │          <span class="hljs-number">6</span>  Alex                   <span class="hljs-number">2</span>        <span class="hljs-number">1</span>     <span class="hljs-number">0.73</span>  Ms. Smith
   <span class="hljs-number">6</span> │          <span class="hljs-number">7</span>  Tom                    <span class="hljs-number">1</span>        <span class="hljs-number">1</span>     <span class="hljs-number">0.88</span>  Mr. Jackson
   <span class="hljs-number">7</span> │          <span class="hljs-number">8</span>  Liz                    <span class="hljs-number">2</span>        <span class="hljs-number">1</span>     <span class="hljs-number">0.77</span>  Ms. Smith
   <span class="hljs-number">8</span> │          <span class="hljs-number">9</span>  Bill                   <span class="hljs-number">1</span>        <span class="hljs-number">1</span>     <span class="hljs-number">0.75</span>  Mr. Jackson
   <span class="hljs-number">9</span> │         <span class="hljs-number">10</span>  Carl                   <span class="hljs-number">2</span>        <span class="hljs-number">1</span>     <span class="hljs-number">0.93</span>  Ms. Smith

julia&gt; grouped_no_missing_df = groupby(result_no_missing_df, :teacher_name)
GroupedDataFrame with <span class="hljs-number">2</span> groups based on key: teacher_name
First Group (<span class="hljs-number">5</span> rows): teacher_name = <span class="hljs-string">"Mr. Jackson"</span>
 Row │ student_id  student_name  teacher_id  exam_id  grade    teacher_name 
     │ <span class="hljs-built_in">Int64</span>       <span class="hljs-built_in">String</span>        <span class="hljs-built_in">Int64</span>       <span class="hljs-built_in">Int64</span>    <span class="hljs-built_in">Float64</span>  <span class="hljs-built_in">String</span>       
─────┼──────────────────────────────────────────────────────────────────────
   <span class="hljs-number">1</span> │          <span class="hljs-number">1</span>  Joe                    <span class="hljs-number">1</span>        <span class="hljs-number">1</span>     <span class="hljs-number">0.95</span>  Mr. Jackson
   <span class="hljs-number">2</span> │          <span class="hljs-number">3</span>  Jim                    <span class="hljs-number">1</span>        <span class="hljs-number">1</span>     <span class="hljs-number">0.81</span>  Mr. Jackson
   <span class="hljs-number">3</span> │          <span class="hljs-number">5</span>  Beth                   <span class="hljs-number">1</span>        <span class="hljs-number">1</span>     <span class="hljs-number">0.85</span>  Mr. Jackson
   <span class="hljs-number">4</span> │          <span class="hljs-number">7</span>  Tom                    <span class="hljs-number">1</span>        <span class="hljs-number">1</span>     <span class="hljs-number">0.88</span>  Mr. Jackson
   <span class="hljs-number">5</span> │          <span class="hljs-number">9</span>  Bill                   <span class="hljs-number">1</span>        <span class="hljs-number">1</span>     <span class="hljs-number">0.75</span>  Mr. Jackson
⋮
Last Group (<span class="hljs-number">4</span> rows): teacher_name = <span class="hljs-string">"Ms. Smith"</span>
 Row │ student_id  student_name  teacher_id  exam_id  grade    teacher_name 
     │ <span class="hljs-built_in">Int64</span>       <span class="hljs-built_in">String</span>        <span class="hljs-built_in">Int64</span>       <span class="hljs-built_in">Int64</span>    <span class="hljs-built_in">Float64</span>  <span class="hljs-built_in">String</span>       
─────┼──────────────────────────────────────────────────────────────────────
   <span class="hljs-number">1</span> │          <span class="hljs-number">2</span>  Sally                  <span class="hljs-number">2</span>        <span class="hljs-number">1</span>     <span class="hljs-number">0.93</span>  Ms. Smith
   <span class="hljs-number">2</span> │          <span class="hljs-number">6</span>  Alex                   <span class="hljs-number">2</span>        <span class="hljs-number">1</span>     <span class="hljs-number">0.73</span>  Ms. Smith
   <span class="hljs-number">3</span> │          <span class="hljs-number">8</span>  Liz                    <span class="hljs-number">2</span>        <span class="hljs-number">1</span>     <span class="hljs-number">0.77</span>  Ms. Smith
   <span class="hljs-number">4</span> │         <span class="hljs-number">10</span>  Carl                   <span class="hljs-number">2</span>        <span class="hljs-number">1</span>     <span class="hljs-number">0.93</span>  Ms. Smith

julia&gt; combine(grouped_no_missing_df, :grade =&gt; mean)
<span class="hljs-number">2</span>×<span class="hljs-number">2</span> DataFrame
 Row │ teacher_name  grade_mean 
     │ <span class="hljs-built_in">String</span>        <span class="hljs-built_in">Float64</span>    
─────┼──────────────────────────
   <span class="hljs-number">1</span> │ Mr. Jackson        <span class="hljs-number">0.848</span>
   <span class="hljs-number">2</span> │ Ms. Smith          <span class="hljs-number">0.84</span>
</code></pre>
<p>We now see that the two teachers' average test scores are very similar. This approach works well
if we never again need the rows containing <code>missing</code> values.</p>
<p>But what if we want to keep those rows around and just exclude them from certain calculations?
We can use another function, <code>skipmissing</code>, which simply skips over the <code>missing</code> values.</p>
<pre><code class="lang-julia">julia&gt; combine(grouped_df, :grade =&gt; (x -&gt; mean(skipmissing(x))))
<span class="hljs-number">2</span>×<span class="hljs-number">2</span> DataFrame
 Row │ teacher_name  grade_function 
     │ <span class="hljs-built_in">String</span>        <span class="hljs-built_in">Float64</span>        
─────┼──────────────────────────────
   <span class="hljs-number">1</span> │ Mr. Jackson            <span class="hljs-number">0.848</span>
   <span class="hljs-number">2</span> │ Ms. Smith              <span class="hljs-number">0.84</span>
</code></pre>
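<p>Notice that the output column was named <code>grade_function</code> because we passed an anonymous function. <code>combine</code> also accepts a <code>source =&gt; function =&gt; destination</code> pair, so we can choose a nicer name ourselves; a small sketch:</p>
<pre><code class="lang-julia"># Name the output column explicitly with a three-element pair:
combine(grouped_df, :grade =&gt; (x -&gt; mean(skipmissing(x))) =&gt; :grade_mean)
</code></pre>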
<p>One last note on <code>missing</code> values: it is easy to identify whether one or more of your <code>DataFrame</code>
columns contain <code>missing</code> values. We noted earlier that <code>DataFrames.jl</code> infers the type of each
column and displays it in the output. You'll notice in <code>result_df</code> that the column <code>teacher_id</code> has type
<code>Int64</code> while <code>exam_id</code> has type <code>Int64?</code>. The <code>?</code> denotes that the column can contain <code>missing</code> values, so be careful!</p>
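<p>You can also check for <code>missing</code> values programmatically rather than reading the printed types. A couple of sketches (assuming <code>result_df</code> from above):</p>
<pre><code class="lang-julia"># Check a single column for missing values:
any(ismissing, result_df.exam_id)   # true, because Sandy missed the exam

# Summarize the number of missing values in every column:
describe(result_df, :nmissing)
</code></pre>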
<h1 id="heading-conclusion">Conclusion</h1>
<p>We've touched on some of the topics that make <code>DataFrames.jl</code> such a great general-purpose package. It is a helpful
tool both for quick, interactive data exploration and for manipulating tabular data in production code. I hope
you've enjoyed today's reading, and be sure to check out the rest of our blog posts at <a target="_blank" href="https://blog.glcs.io/">blog.glcs.io</a>!</p>
]]></content:encoded></item><item><title><![CDATA[Mastering Efficient Array Operations with StaticArrays.jl in Julia]]></title><description><![CDATA[The Julia programming language
is known for being a high-level language
that can still compete with C
in terms of performance.
As such,
Julia already has performant data structures built-in,
such as arrays.
But what if arrays could be even faster?
Th...]]></description><link>https://blog.glcs.io/staticarrays</link><guid isPermaLink="true">https://blog.glcs.io/staticarrays</guid><category><![CDATA[Julia]]></category><category><![CDATA[General Programming]]></category><dc:creator><![CDATA[Steven Whitaker]]></dc:creator><pubDate>Mon, 18 Mar 2024 15:03:32 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1709056050577/aa6UfGC7B.png?auto=format" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>The <a target="_blank" href="https://julialang.org/">Julia programming language</a>
is known for being a high-level language
that can still compete with C
in terms of performance.
As such,
Julia already has performant data structures built-in,
such as arrays.
But what if arrays could be even faster?
That's where the <a target="_blank" href="https://github.com/JuliaArrays/StaticArrays.jl">StaticArrays.jl</a> package comes in.</p>
<p>StaticArrays.jl provides drop-in replacements for <code>Array</code>,
the standard Julia array type.
These <code>StaticArray</code>s work just like <code>Array</code>s,
but they provide one additional piece of information
in the type:
the size of the array.
Consequently,
you can't insert or remove elements of a <code>StaticArray</code>;
they are statically sized arrays
(hence the name).
However,
this restriction allows more information
to be given to Julia's compiler,
which in turn results in more efficient machine code
(for example, via loop unrolling and SIMD operations).
The resulting speed-up can often be 10x or more!</p>
<p>In this post,
we will learn how to use StaticArrays.jl
and compare the performance of <code>StaticArray</code>s
to that of regular <code>Array</code>s
for several different operations.</p>
<p>Note that the code examples in this post
assume StaticArrays.jl has been installed and loaded:</p>
<pre><code class="lang-julia"><span class="hljs-comment"># Press ] to enter the package prompt.</span>
pkg&gt; add StaticArrays

<span class="hljs-comment"># Press Backspace to return to the Julia prompt.</span>
julia&gt; <span class="hljs-keyword">using</span> StaticArrays
</code></pre>
<p>(Check out <a target="_blank" href="https://blog.glcs.io/julia-repl">our post on the Julia REPL</a>
for more details about the package prompt
and navigating the REPL.)</p>
<h1 id="heading-how-to-use-staticarraysjl">How to Use StaticArrays.jl</h1>
<p>When working with StaticArrays.jl,
typically one will use the <code>SVector</code> type
or the <code>SMatrix</code> type.
(There is also the <code>SArray</code> type for N-dimensional arrays,
but we will focus on 1D and 2D arrays in this post.)
<code>SVector</code>s and <code>SMatrix</code>es have both static size
and static data,
meaning the data contained in such objects
cannot be modified.
For statically sized arrays
whose contents can be modified,
StaticArrays.jl provides <code>MVector</code> and <code>MMatrix</code> (and <code>MArray</code>).
We will stick with <code>SVector</code>s and <code>SMatrix</code>es in this post
unless we specifically need mutability.</p>
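<p>To make the mutability distinction concrete, here is a small sketch: elements of an <code>MVector</code> can be assigned in place, whereas attempting the same on an <code>SVector</code> throws an error.</p>
<pre><code class="lang-julia">mv = MVector(1, 2, 3)
mv[1] = 10          # OK: MVector contents are mutable

sv = SVector(1, 2, 3)
# sv[1] = 10        # would throw an error: SVector contents are immutable
</code></pre>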
<h2 id="heading-constructors">Constructors</h2>
<p>There are three ways to construct <code>StaticArray</code>s.</p>
<ol>
<li><p>Convenience constructor <code>SA</code>:</p>
<pre><code class="lang-julia">julia&gt; SA[<span class="hljs-number">1</span>, <span class="hljs-number">2</span>, <span class="hljs-number">3</span>]
<span class="hljs-number">3</span>-element SVector{<span class="hljs-number">3</span>, <span class="hljs-built_in">Int64</span>} with indices SOneTo(<span class="hljs-number">3</span>):
 <span class="hljs-number">1</span>
 <span class="hljs-number">2</span>
 <span class="hljs-number">3</span>

julia&gt; SA[<span class="hljs-number">1</span> <span class="hljs-number">2</span>; <span class="hljs-number">3</span> <span class="hljs-number">4</span>]
<span class="hljs-number">2</span>×<span class="hljs-number">2</span> SMatrix{<span class="hljs-number">2</span>, <span class="hljs-number">2</span>, <span class="hljs-built_in">Int64</span>, <span class="hljs-number">4</span>} with indices SOneTo(<span class="hljs-number">2</span>)×SOneTo(<span class="hljs-number">2</span>):
 <span class="hljs-number">1</span>  <span class="hljs-number">2</span>
 <span class="hljs-number">3</span>  <span class="hljs-number">4</span>
</code></pre>
</li>
<li><p>Normal constructor functions:</p>
<pre><code class="lang-julia">julia&gt; SVector(<span class="hljs-number">1</span>, <span class="hljs-number">2</span>)
<span class="hljs-number">2</span>-element SVector{<span class="hljs-number">2</span>, Int64} <span class="hljs-keyword">with</span> indices SOneTo(<span class="hljs-number">2</span>):
 <span class="hljs-number">1</span>
 <span class="hljs-number">2</span>

julia&gt; SMatrix{<span class="hljs-number">2</span>,<span class="hljs-number">3</span>}(<span class="hljs-number">1</span>, <span class="hljs-number">2</span>, <span class="hljs-number">3</span>, <span class="hljs-number">4</span>, <span class="hljs-number">5</span>, <span class="hljs-number">6</span>)
<span class="hljs-number">2</span>×<span class="hljs-number">3</span> SMatrix{<span class="hljs-number">2</span>, <span class="hljs-number">3</span>, Int64, <span class="hljs-number">6</span>} <span class="hljs-keyword">with</span> indices SOneTo(<span class="hljs-number">2</span>)×SOneTo(<span class="hljs-number">3</span>):
 <span class="hljs-number">1</span>  <span class="hljs-number">3</span>  <span class="hljs-number">5</span>
 <span class="hljs-number">2</span>  <span class="hljs-number">4</span>  <span class="hljs-number">6</span>
</code></pre></li>
<li><p>Macros:</p>
<pre><code class="lang-julia">julia&gt; @SVector [<span class="hljs-number">1</span>, <span class="hljs-number">2</span>, <span class="hljs-number">3</span>]
<span class="hljs-number">3</span>-element SVector{<span class="hljs-number">3</span>, Int64} <span class="hljs-keyword">with</span> indices SOneTo(<span class="hljs-number">3</span>):
 <span class="hljs-number">1</span>
 <span class="hljs-number">2</span>
 <span class="hljs-number">3</span>

julia&gt; @SMatrix [<span class="hljs-number">1</span> <span class="hljs-number">2</span>; <span class="hljs-number">3</span> <span class="hljs-number">4</span>]
<span class="hljs-number">2</span>×<span class="hljs-number">2</span> SMatrix{<span class="hljs-number">2</span>, <span class="hljs-number">2</span>, Int64, <span class="hljs-number">4</span>} <span class="hljs-keyword">with</span> indices SOneTo(<span class="hljs-number">2</span>)×SOneTo(<span class="hljs-number">2</span>):
 <span class="hljs-number">1</span>  <span class="hljs-number">2</span>
 <span class="hljs-number">3</span>  <span class="hljs-number">4</span>
</code></pre><p>Note that using macros
also enables a convenient way
to create <code>StaticArray</code>s from common array-creation functions
(eliminating the need to create an <code>Array</code> first
just to convert it immediately to a <code>StaticArray</code>):</p>
<pre><code class="lang-julia"><span class="hljs-meta">@SVector</span> [<span class="hljs-number">10</span> * i <span class="hljs-keyword">for</span> i = <span class="hljs-number">1</span>:<span class="hljs-number">10</span>]
<span class="hljs-meta">@SVector</span> zeros(<span class="hljs-number">5</span>)
<span class="hljs-meta">@SVector</span> rand(<span class="hljs-number">7</span>)
<span class="hljs-meta">@SMatrix</span> [(i, j) <span class="hljs-keyword">for</span> i = <span class="hljs-number">1</span>:<span class="hljs-number">2</span>, j = <span class="hljs-number">1</span>:<span class="hljs-number">3</span>]
<span class="hljs-meta">@SMatrix</span> zeros(<span class="hljs-number">2</span>, <span class="hljs-number">2</span>)
<span class="hljs-meta">@SMatrix</span> randn(<span class="hljs-number">6</span>, <span class="hljs-number">6</span>)
</code></pre>
</li>
</ol>
<h2 id="heading-conversion-tofrom-array">Conversion to/from <code>Array</code></h2>
<p>It may occasionally be necessary
to convert to or from <code>Array</code>s.
To convert from an <code>Array</code> to a <code>StaticArray</code>,
use the appropriate constructor function.
However, because <code>Array</code>s do not have size information in the type,
we ourselves must provide the size to the constructor:</p>
<pre><code class="lang-julia">SVector{<span class="hljs-number">3</span>}([<span class="hljs-number">1</span>, <span class="hljs-number">2</span>, <span class="hljs-number">3</span>])
SMatrix{<span class="hljs-number">4</span>,<span class="hljs-number">4</span>}(zeros(<span class="hljs-number">4</span>, <span class="hljs-number">4</span>))
</code></pre>
<p>To convert back to an <code>Array</code>, use the <code>collect</code> function:</p>
<pre><code class="lang-julia">julia&gt; collect(SVector(<span class="hljs-number">1</span>, <span class="hljs-number">2</span>))
<span class="hljs-number">2</span>-element <span class="hljs-built_in">Vector</span>{<span class="hljs-built_in">Int64</span>}:
 <span class="hljs-number">1</span>
 <span class="hljs-number">2</span>
</code></pre>
<h1 id="heading-comparing-staticarrays-to-arrays">Comparing <code>StaticArray</code>s to <code>Array</code>s</h1>
<p>Once a <code>StaticArray</code> is created,
it can be operated on in the same way
as an <code>Array</code>.
To illustrate,
we will run a simple benchmark,
both to compare the run-time speeds
of the two types of arrays
and to show that the same code can work
with either type of array.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1709056851531/HW_luh3Nv.png?auto=format" alt="Stopwatch" /></p>
<p>Here's the benchmark code,
inspired by <a target="_blank" href="https://github.com/JuliaArrays/StaticArrays.jl/blob/07c12450d1b3481dda4b503564ae4a5cb4e27ce4/perf/README_benchmarks.jl">StaticArrays.jl's benchmark</a>:</p>
<pre><code class="lang-julia"><span class="hljs-keyword">using</span> BenchmarkTools, StaticArrays, LinearAlgebra, Printf

add!(C, A, B) = C .= A .+ B

<span class="hljs-keyword">function</span> run_benchmarks(N)

    A = rand(N, N); A = A' * A
    B = rand(N, N)
    C = <span class="hljs-built_in">Matrix</span>{eltype(A)}(undef, N, N)
    D = rand(N)
    SA = SMatrix{N,N}(A)
    SB = SMatrix{N,N}(B)
    MA = MMatrix{N,N}(A)
    MB = MMatrix{N,N}(B)
    MC = MMatrix{N,N}(C)
    SD = SVector{N}(D)

    speedup = [
        <span class="hljs-meta">@belapsed</span>($A + $B) / <span class="hljs-meta">@belapsed</span>($SA + $SB),
        <span class="hljs-meta">@belapsed</span>(add!($C, $A, $B)) / <span class="hljs-meta">@belapsed</span>(add!($MC, $MA, $MB)),
        <span class="hljs-meta">@belapsed</span>($A * $B) / <span class="hljs-meta">@belapsed</span>($SA * $SB),
        <span class="hljs-meta">@belapsed</span>(mul!($C, $A, $B)) / <span class="hljs-meta">@belapsed</span>(mul!($MC, $MA, $MB)),
        <span class="hljs-meta">@belapsed</span>(norm($D)) / <span class="hljs-meta">@belapsed</span>(norm($SD)),
        <span class="hljs-meta">@belapsed</span>(det($A)) / <span class="hljs-meta">@belapsed</span>(det($SA)),
        <span class="hljs-meta">@belapsed</span>(inv($A)) / <span class="hljs-meta">@belapsed</span>(inv($SA)),
        <span class="hljs-meta">@belapsed</span>($A \ $D) / <span class="hljs-meta">@belapsed</span>($SA \ $SD),
        <span class="hljs-meta">@belapsed</span>(eigen($A)) / <span class="hljs-meta">@belapsed</span>(eigen($SA)),
        <span class="hljs-meta">@belapsed</span>(map(abs, $A)) / <span class="hljs-meta">@belapsed</span>(map(abs, $SA)),
        <span class="hljs-meta">@belapsed</span>(sum($D)) / <span class="hljs-meta">@belapsed</span>(sum($SD)),
        <span class="hljs-meta">@belapsed</span>(sort($D)) / <span class="hljs-meta">@belapsed</span>(sort($SD)),
    ]

    <span class="hljs-keyword">return</span> speedup

<span class="hljs-keyword">end</span>

<span class="hljs-keyword">function</span> main()

    benchmarks = [
        <span class="hljs-string">"Addition"</span>,
        <span class="hljs-string">"Addition (in-place)"</span>,
        <span class="hljs-string">"Multiplication"</span>,
        <span class="hljs-string">"Multiplication (in-place)"</span>,
        <span class="hljs-string">"L2 Norm"</span>,
        <span class="hljs-string">"Determinant"</span>,
        <span class="hljs-string">"Inverse"</span>,
        <span class="hljs-string">"Linear Solve (A \\ b)"</span>,
        <span class="hljs-string">"Symmetric Eigendecomposition"</span>,
        <span class="hljs-string">"`map`"</span>,
        <span class="hljs-string">"Sum of Elements"</span>,
        <span class="hljs-string">"Sorting"</span>,
    ]
    N = [<span class="hljs-number">3</span>, <span class="hljs-number">5</span>, <span class="hljs-number">10</span>, <span class="hljs-number">30</span>]
    speedups = map(run_benchmarks, N)
    fmt_header = Printf.Format(<span class="hljs-string">"%-<span class="hljs-subst">$(maximum(length.(benchmarks)</span>))s"</span> * <span class="hljs-string">" | %7s"</span>^length(N))
    header = Printf.format(fmt_header, <span class="hljs-string">"Benchmark"</span>, string.(<span class="hljs-string">"N = "</span>, N)...)
    println(header)
    println(<span class="hljs-string">"="</span>^length(header))
    fmt = Printf.Format(<span class="hljs-string">"%-<span class="hljs-subst">$(maximum(length.(benchmarks)</span>))s"</span> * <span class="hljs-string">" | %7.1f"</span>^length(N))
    <span class="hljs-keyword">for</span> i = <span class="hljs-number">1</span>:length(benchmarks)
        println(Printf.format(fmt, benchmarks[i], getindex.(speedups, i)...))
    <span class="hljs-keyword">end</span>

<span class="hljs-keyword">end</span>

main()
</code></pre>
<p>Notice that all the functions called
when creating the array <code>speedup</code>
in <code>run_benchmarks</code>
are the same whether using <code>Array</code>s or <code>StaticArray</code>s,
illustrating that <code>StaticArray</code>s
are drop-in replacements for standard <code>Array</code>s.</p>
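<p>As a tiny illustration of this genericity (a sketch using a made-up helper, <code>unitvec</code>), one function definition serves both array types:</p>
<pre><code class="lang-julia">using LinearAlgebra, StaticArrays

# Normalize a vector to unit length; works for any array type.
unitvec(v) = v ./ norm(v)

unitvec([3.0, 4.0])          # returns a Vector{Float64}: [0.6, 0.8]
unitvec(SVector(3.0, 4.0))   # returns an SVector{2, Float64}: [0.6, 0.8]
</code></pre>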
<p>Running the above code
prints the following results on my laptop
(the numbers indicate the speedup
of <code>StaticArray</code>s over normal <code>Array</code>s;
e.g., a value of 17.7 means
using <code>StaticArray</code>s was 17.7 times faster
than using <code>Array</code>s):</p>
<pre><code>Benchmark                    |   N = <span class="hljs-number">3</span> |   N = <span class="hljs-number">5</span> |  N = <span class="hljs-number">10</span> |  N = <span class="hljs-number">30</span>
====================================================================
Addition                     |    <span class="hljs-number">17.7</span> |    <span class="hljs-number">14.5</span> |     <span class="hljs-number">7.9</span> |     <span class="hljs-number">2.0</span>
Addition (<span class="hljs-keyword">in</span>-place)          |     <span class="hljs-number">1.6</span> |     <span class="hljs-number">1.3</span> |     <span class="hljs-number">1.4</span> |     <span class="hljs-number">0.7</span>
Multiplication               |     <span class="hljs-number">8.2</span> |     <span class="hljs-number">7.0</span> |     <span class="hljs-number">4.2</span> |     <span class="hljs-number">2.6</span>
Multiplication (<span class="hljs-keyword">in</span>-place)    |     <span class="hljs-number">1.9</span> |     <span class="hljs-number">5.9</span> |     <span class="hljs-number">3.0</span> |     <span class="hljs-number">1.0</span>
L2 Norm                      |     <span class="hljs-number">4.2</span> |     <span class="hljs-number">4.0</span> |     <span class="hljs-number">5.4</span> |     <span class="hljs-number">9.7</span>
Determinant                  |    <span class="hljs-number">66.6</span> |     <span class="hljs-number">2.5</span> |     <span class="hljs-number">1.3</span> |     <span class="hljs-number">0.9</span>
Inverse                      |    <span class="hljs-number">54.8</span> |     <span class="hljs-number">5.9</span> |     <span class="hljs-number">1.8</span> |     <span class="hljs-number">0.9</span>
Linear Solve (A \ b)         |    <span class="hljs-number">65.5</span> |     <span class="hljs-number">3.7</span> |     <span class="hljs-number">1.8</span> |     <span class="hljs-number">0.9</span>
Symmetric Eigendecomposition |     <span class="hljs-number">3.7</span> |     <span class="hljs-number">1.0</span> |     <span class="hljs-number">1.0</span> |     <span class="hljs-number">1.0</span>
<span class="hljs-string">`map`</span>                        |    <span class="hljs-number">10.6</span> |     <span class="hljs-number">8.2</span> |     <span class="hljs-number">4.9</span> |     <span class="hljs-number">2.1</span>
Sum <span class="hljs-keyword">of</span> Elements              |     <span class="hljs-number">1.5</span> |     <span class="hljs-number">1.1</span> |     <span class="hljs-number">1.7</span> |     <span class="hljs-number">2.1</span>
Sorting                      |     <span class="hljs-number">7.1</span> |     <span class="hljs-number">2.9</span> |     <span class="hljs-number">1.5</span> |     <span class="hljs-number">1.1</span>
</code></pre><p>There are two main conclusions from this table.
First,
using <code>StaticArray</code>s instead of <code>Array</code>s
can result in some nice speed-ups!
Second,
the gains from using <code>StaticArray</code>s tend to diminish
as the sizes of the arrays increase.
So,
you can't expect StaticArrays.jl
to always magically make your code faster,
but if your arrays are small enough
(the StaticArrays.jl documentation recommends fewer than about 100 elements),
then you can expect to see some good speed-ups.</p>
<p>Of course,
the above code timed just individual operations;
how much faster a particular application would be
is a different matter.</p>
<p>For example,
consider a physical simulation
where many 3D vectors
are manipulated over several time steps.
Since 3D vectors are static in size
(i.e., are 1D arrays with exactly three elements),
such a situation is a prime example
of where StaticArrays.jl is useful.
To illustrate,
here is an example
(taken from the field of magnetic resonance imaging)
of a physical simulation
using <code>Array</code>s vs using <code>StaticArrays</code>:</p>
<pre><code class="lang-julia"><span class="hljs-keyword">using</span> BenchmarkTools, StaticArrays, LinearAlgebra

<span class="hljs-keyword">function</span> sim_arrays(N)

    M = <span class="hljs-built_in">Matrix</span>{<span class="hljs-built_in">Float64</span>}(undef, <span class="hljs-number">3</span>, N)
    M[<span class="hljs-number">1</span>,:] .= <span class="hljs-number">0.0</span>
    M[<span class="hljs-number">2</span>,:] .= <span class="hljs-number">0.0</span>
    M[<span class="hljs-number">3</span>,:] .= <span class="hljs-number">1.0</span>
    M2 = similar(M)

    (sinα, cosα) = sincosd(<span class="hljs-number">30</span>)
    R = [<span class="hljs-number">1</span> <span class="hljs-number">0</span> <span class="hljs-number">0</span>; <span class="hljs-number">0</span> cosα sinα; <span class="hljs-number">0</span> -sinα cosα]
    E1 = exp(-<span class="hljs-number">0.01</span>)
    E2 = exp(-<span class="hljs-number">0.1</span>)
    (sinθ, cosθ) = sincosd(<span class="hljs-number">1</span>)
    F = [E2 * cosθ E2 * sinθ <span class="hljs-number">0</span>; -E2 * sinθ E2 * cosθ <span class="hljs-number">0</span>; <span class="hljs-number">0</span> <span class="hljs-number">0</span> E1]
    FR = F * R
    C = [<span class="hljs-number">0</span>, <span class="hljs-number">0</span>, <span class="hljs-number">1</span> - E1]
    <span class="hljs-comment"># Run for 100 time steps (each loop iteration does 2 time steps).</span>
    <span class="hljs-keyword">for</span> t = <span class="hljs-number">1</span>:<span class="hljs-number">50</span>
        mul!(M2, FR, M)
        M2 .+= C
        mul!(M, FR, M2)
        M .+= C
    <span class="hljs-keyword">end</span>
    total = sum(M; dims = <span class="hljs-number">2</span>)

    <span class="hljs-keyword">return</span> complex(total[<span class="hljs-number">1</span>], total[<span class="hljs-number">2</span>])

<span class="hljs-keyword">end</span>

<span class="hljs-keyword">function</span> sim_staticarrays(N)

    M = fill(SVector(<span class="hljs-number">0.0</span>, <span class="hljs-number">0.0</span>, <span class="hljs-number">1.0</span>), N)

    (sinα, cosα) = sincosd(<span class="hljs-number">30</span>)
    R = <span class="hljs-meta">@SMatrix</span> [<span class="hljs-number">1</span> <span class="hljs-number">0</span> <span class="hljs-number">0</span>; <span class="hljs-number">0</span> cosα sinα; <span class="hljs-number">0</span> -sinα cosα]
    E1 = exp(-<span class="hljs-number">0.01</span>)
    E2 = exp(-<span class="hljs-number">0.1</span>)
    (sinθ, cosθ) = sincosd(<span class="hljs-number">1</span>)
    F = <span class="hljs-meta">@SMatrix</span> [E2 * cosθ E2 * sinθ <span class="hljs-number">0</span>; -E2 * sinθ E2 * cosθ <span class="hljs-number">0</span>; <span class="hljs-number">0</span> <span class="hljs-number">0</span> E1]
    FR = F * R
    C = <span class="hljs-meta">@SVector</span> [<span class="hljs-number">0</span>, <span class="hljs-number">0</span>, <span class="hljs-number">1</span> - E1]
    <span class="hljs-comment"># Run for 100 time steps (each loop iteration does 1 time step).</span>
    <span class="hljs-keyword">for</span> t = <span class="hljs-number">1</span>:<span class="hljs-number">100</span>
        <span class="hljs-comment"># Apply simulation dynamics to each 3D vector.</span>
        <span class="hljs-keyword">for</span> i = <span class="hljs-number">1</span>:length(M)
            M[i] = FR * M[i] + C
        <span class="hljs-keyword">end</span>
    <span class="hljs-keyword">end</span>
    total = sum(M)

    <span class="hljs-keyword">return</span> complex(total[<span class="hljs-number">1</span>], total[<span class="hljs-number">2</span>])

<span class="hljs-keyword">end</span>

<span class="hljs-keyword">function</span> main(N)

    r1 = <span class="hljs-meta">@btime</span> sim_arrays($N)
    r2 = <span class="hljs-meta">@btime</span> sim_staticarrays($N)
    <span class="hljs-meta">@assert</span> r1 ≈ r2 <span class="hljs-comment"># Make sure the results are the same.</span>

<span class="hljs-keyword">end</span>
</code></pre>
<p>The speed-ups on my laptop
for different values of <code>N</code>
were as follows:</p>
<ul>
<li><code>N = 10</code>: 14.6x faster</li>
<li><code>N = 100</code>: 7.1x faster</li>
<li><code>N = 1000</code>: 5.2x faster</li>
</ul>
<p>(Here, <code>N</code> is the number of 3D vectors in the simulation,
not the size of the <code>StaticArray</code>s.)</p>
<p>Note also that I wrote <code>sim_arrays</code>
to be as performant as possible
by doing in-place operations
(like <code>mul!</code>),
which has the unfortunate side effect
of making the code a bit harder to read.
Therefore,
<code>sim_staticarrays</code> is both faster <em>and</em> easier to read!</p>
<p>As another example
of how StaticArrays.jl
can speed up a more involved application,
see the <a target="_blank" href="https://docs.sciml.ai/DiffEqDocs/stable/tutorials/faster_ode_example/#Example-Accelerating-a-Non-Stiff-Equation:-The-Lorenz-Equation">DifferentialEquations.jl docs</a>.</p>
<h1 id="heading-summary">Summary</h1>
<p>In this post,
we discussed StaticArrays.jl.
We saw that <code>StaticArray</code>s are drop-in replacements
for regular Julia <code>Array</code>s.
We also saw that using <code>StaticArray</code>s
can result in some nice speed-ups
over using <code>Array</code>s,
at least when the sizes of the arrays
are not too big.</p>
<p>Are array operations a bottleneck in your code?
Try out StaticArrays.jl
and comment below to let us know how it helps!</p>
<h1 id="heading-additional-links">Additional Links</h1>
<ul>
<li><a target="_blank" href="https://juliaarrays.github.io/StaticArrays.jl/stable/">StaticArrays.jl Docs</a><ul>
<li>Documentation for StaticArrays.jl.</li>
</ul>
</li>
</ul>
<p>Cover image background from
<a target="_blank" href="https://openverse.org/image/875bf026-11ef-47a8-a63c-ee1f1877c156?q=circuit%20board%20array">https://openverse.org/image/875bf026-11ef-47a8-a63c-ee1f1877c156?q=circuit%20board%20array</a>.</p>
]]></content:encoded></item><item><title><![CDATA[Efficient Julia Optimization through an MRI Case Study]]></title><description><![CDATA[Welcome to our first
Julia for Devs
blog post!
This will be a continuously running series of posts
where our team will discuss
how we have used Julia
to solve real-life problems.
So, let's get started!

In this Julia for Devs post,
we will discuss us...]]></description><link>https://blog.glcs.io/optimization-mri</link><guid isPermaLink="true">https://blog.glcs.io/optimization-mri</guid><category><![CDATA[Julia]]></category><category><![CDATA[MRI ]]></category><category><![CDATA[optimization]]></category><category><![CDATA[General Programming]]></category><dc:creator><![CDATA[Steven Whitaker]]></dc:creator><pubDate>Mon, 19 Feb 2024 15:03:46 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1708032289596/Tlem-6QXR.png?auto=format" length="0" type="image/jpeg"/><content:encoded><![CDATA[<blockquote>
<p>Welcome to our first
<em>Julia for Devs</em>
blog post!
This will be a continuously running series of posts
where our team will discuss
how we have used Julia
to solve real-life problems.
So, let's get started!</p>
</blockquote>
<p>In this <em>Julia for Devs</em> post,
we will discuss using Julia
to optimize scan parameters
in magnetic resonance imaging (MRI).</p>
<p>If you are interested
just in seeing example code
showing how to use Julia
to optimize your own cost function,
feel free to <a class="post-section-overview" href="#heading-optimization">skip ahead</a>.</p>
<p>While we will discuss MRI specifically,
the concepts
(and even some of the code)
we will see
will be applicable
in many situations where optimization is useful,
particularly when there is a model
for an observable signal.
The model could be, e.g.,
a mathematical equation
or a trained neural network,
and the observable signal
(aka the output of the model)
could be, e.g.,
the image intensity of a medical imaging scan
or the energy needed to heat a building.</p>
<h1 id="heading-problem-setup">Problem Setup</h1>
<p>In this post,
the signal model we will use
is a mathematical equation
that gives the image intensity
of a balanced steady-state free precession (bSSFP) MRI scan.
This model has two sets of inputs:
those that are user-defined
and those that are not.
In MRI,
user-defined inputs are scan parameters,
such as how long to run the scan for
or how much power to use to generate MRI signal.
The other input parameters are called tissue parameters
because they are properties that are intrinsic
to the imaged tissue (e.g., muscle, brain, etc.).</p>
<p>Tissue parameters often are assumed
to take on a pre-defined value
for each specific tissue type.
However,
because they are tissue-specific
and can vary with health and age,
sometimes tissue parameters are considered to be unknown values
that need to be estimated from data.</p>
<p>The problem we will discuss in this post
is how to estimate tissue parameters
from a set of bSSFP scans.
Then we will discuss
how to optimize the scan parameters
of the set of bSSFP scans
to improve the tissue parameter estimates.</p>
<p>To estimate tissue parameters,
we need the following:</p>
<ol>
<li>a signal model, and</li>
<li>an estimation algorithm.</li>
</ol>
<h2 id="heading-signal-model">Signal Model</h2>
<p>Here's the signal model:</p>
<pre><code class="lang-julia"><span class="hljs-comment"># Reference: https://www.mriquestions.com/true-fispfiesta.html</span>
<span class="hljs-keyword">function</span> bssfp(TR, flip, T1, T2)

    E1 = exp(-TR / T1)
    E2 = exp(-TR / T2)
    (sinα, cosα) = sincosd(flip)
    num = sinα * (<span class="hljs-number">1</span> - E1)
    den = <span class="hljs-number">1</span> - cosα * (E1 - E2) - E1 * E2
    signal = num / den

    <span class="hljs-keyword">return</span> signal

<span class="hljs-keyword">end</span>
</code></pre>
<p>This function returns the bSSFP signal
as a function of two scan parameters
(<code>TR</code>, a value in units of seconds,
and <code>flip</code>, a value in units of degrees)
and two tissue parameters
(<code>T1</code> and <code>T2</code>, each a value in units of seconds).</p>
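<p>For instance,
evaluating the model looks like this
(the parameter values here are made up,
purely to show the calling convention):</p>
<pre><code class="lang-julia"># Hypothetical scan parameters.
TR = 0.005 # seconds
flip = 30  # degrees
# Hypothetical tissue parameters.
T1 = 1.0   # seconds
T2 = 0.08  # seconds

signal = bssfp(TR, flip, T1, T2) # a scalar signal intensity
</code></pre>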
<h2 id="heading-estimation-algorithm">Estimation Algorithm</h2>
<p>There are various estimation algorithms out there,
but, for simplicity in this post,
we will stick with a grid search.
We will compute the bSSFP signal
over many pairs of <code>T1</code> and <code>T2</code> values.
We will compare these computed signals
to the actual observed signal,
and the pair of <code>T1</code> and <code>T2</code> values
that results in the closest match
will be chosen as the estimates of the tissue parameters.
Here's the algorithm in code:</p>
<pre><code class="lang-julia"><span class="hljs-comment"># `signal` is the observed signal.</span>
<span class="hljs-comment"># `TR` and `flip` are the scan parameters that were used,</span>
<span class="hljs-comment"># and we want to estimate `T1` and `T2`.</span>
<span class="hljs-comment"># `signal`, `TR` and `flip` should be vectors of the same length,</span>
<span class="hljs-comment"># representing running multiple scans</span>
<span class="hljs-comment"># and recording the observed signal for each pair of `TR` and `flip` values.</span>
<span class="hljs-keyword">function</span> gridsearch(signal, TR, flip)

    <span class="hljs-comment"># Specify the grid of values to search over.</span>
    <span class="hljs-comment"># `T1_grid` and `T2_grid` could optionally be inputs to this function.</span>
    T1_grid = exp10.(range(log10(<span class="hljs-number">0.5</span>), log10(<span class="hljs-number">3</span>), <span class="hljs-number">40</span>))
    T2_grid = exp10.(range(log10(<span class="hljs-number">0.005</span>), log10(<span class="hljs-number">0.7</span>), <span class="hljs-number">40</span>))

    T1_est = T1_grid[<span class="hljs-number">1</span>]
    T2_est = T2_grid[<span class="hljs-number">1</span>]
    best = <span class="hljs-literal">Inf</span>
    <span class="hljs-comment"># Pre-allocate memory for some computations to speed up the following loop.</span>
    residual = similar(signal)
    <span class="hljs-comment"># Iterate over the Cartesian product of `T1_grid` and `T2_grid`.</span>
    <span class="hljs-keyword">for</span> (T1, T2) <span class="hljs-keyword">in</span> Iterators.product(T1_grid, T2_grid)
        <span class="hljs-comment"># Physical constraint: T1 is greater than T2, and both are positive.</span>
        T1 &gt; T2 &gt; <span class="hljs-number">0</span> || <span class="hljs-keyword">continue</span>
        <span class="hljs-comment"># For the given T1 and T2 pair, compute the (noiseless) bSSFP signal</span>
        <span class="hljs-comment"># for each bSSFP scan and subtract it from the given signals.</span>
        <span class="hljs-comment"># Tip: In Julia, one can apply a scalar function elementwise on vector</span>
        <span class="hljs-comment">#      inputs using the "dot" notation (see the `.` after `bssfp` below).</span>
        residual .= signal .- bssfp.(TR, flip, T1, T2)
        <span class="hljs-comment"># Compute the norm squared of the above difference.</span>
        err = residual' * residual
        <span class="hljs-comment"># If the candidate T1 and T2 pair fit the given signals better,</span>
        <span class="hljs-comment"># keep them as the current estimate of T1 and T2.</span>
        <span class="hljs-keyword">if</span> err &lt; best
            best = err
            T1_est = T1
            T2_est = T2
        <span class="hljs-keyword">end</span>
    <span class="hljs-keyword">end</span>

    isinf(best) &amp;&amp; error(<span class="hljs-string">"no valid grid points; consider changing `T1_grid` and/or `T2_grid`"</span>)

    <span class="hljs-keyword">return</span> (T1_est, T2_est)

<span class="hljs-keyword">end</span>
</code></pre>
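<p>To sanity-check the estimator,
we can simulate noiseless signals
with known tissue parameters
and confirm the grid search recovers
the nearest grid values.
(This sketch assumes the <code>bssfp</code> and <code>gridsearch</code> functions above
have been defined;
the scan and tissue parameter values are hypothetical.)</p>
<pre><code class="lang-julia"># Scan parameters for three simulated bSSFP scans.
TR = [0.005, 0.005, 0.005]
flip = [30, 60, 80]

# Ground-truth tissue parameters (seconds).
(T1_true, T2_true) = (1.0, 0.08)

# Simulate noiseless signals and run the grid search.
signal = bssfp.(TR, flip, T1_true, T2_true)
(T1_est, T2_est) = gridsearch(signal, TR, flip)

# The estimates should be the grid points closest to the truth.
</code></pre>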
<h2 id="heading-cost-function">Cost Function</h2>
<p>Now,
to optimize scan parameters
we need a function to optimize,
also called a cost function.
In this case,
we want to minimize the error
between what our estimator tells us
and the true tissue parameter values.
But because we need to optimize scan parameters
<em>before</em> running any scans,
we will simulate bSSFP MRI scans
using the given sets of scan parameters
and average the estimator error
over several sets of tissue parameters.
Here's the code for the cost function:</p>
<pre><code class="lang-julia"><span class="hljs-comment"># Load the Statistics standard library package to get access to the `mean` function.</span>
<span class="hljs-keyword">using</span> Statistics

<span class="hljs-comment"># `TR` and `flip` are vectors of the same length</span>
<span class="hljs-comment"># specifying the pairs of scan parameters to use.</span>
<span class="hljs-keyword">function</span> estimator_error(TR, flip)

    <span class="hljs-comment"># Specify the grid of values to average over and the noise level to simulate.</span>
    <span class="hljs-comment"># For a given pair of `T1_true` and `T2_true` values,</span>
    <span class="hljs-comment"># bSSFP signals will be simulated, and then the estimator</span>
    <span class="hljs-comment"># will attempt to estimate what `T1_true` and `T2_true` values were used</span>
    <span class="hljs-comment"># based on the simulated bSSFP signals and the given scan parameters.</span>
    T1_true = [<span class="hljs-number">0.8</span>, <span class="hljs-number">1.0</span>, <span class="hljs-number">1.5</span>]
    T2_true = [<span class="hljs-number">0.05</span>, <span class="hljs-number">0.08</span>, <span class="hljs-number">0.1</span>]
    noise_level = <span class="hljs-number">0.01</span>

    T1_avg = mean(T1_true)
    T2_avg = mean(T2_true)
    <span class="hljs-comment"># Pre-allocate memory for some computations to speed up the following loop.</span>
    signal = float.(TR)
    <span class="hljs-comment"># The following computes the mean estimator error over all pairs</span>
    <span class="hljs-comment"># of true T1 and T2 values.</span>
    <span class="hljs-keyword">return</span> mean(Iterators.product(T1_true, T2_true)) <span class="hljs-keyword">do</span> (T1, T2)
        <span class="hljs-comment"># Ignore pairs for which `T2 &gt; T1`.</span>
        T2 &gt; T1 &amp;&amp; <span class="hljs-keyword">return</span> <span class="hljs-number">0</span>
        <span class="hljs-comment"># Simulate noisy signal using the true T1 and T2 values.</span>
        signal .= bssfp.(TR, flip, T1, T2) .+ noise_level .* randn.()
        <span class="hljs-comment"># Estimate T1 and T2 from the noisy signal.</span>
        (T1_est, T2_est) = gridsearch(signal, TR, flip)
        <span class="hljs-comment"># Compare estimates to truth.</span>
        T1_err = (T1_est - T1)^<span class="hljs-number">2</span>
        T2_err = (T2_est - T2)^<span class="hljs-number">2</span>
        <span class="hljs-comment"># Combine the T1 and T2 errors into one single error metric</span>
        <span class="hljs-comment"># by averaging the respective errors</span>
        <span class="hljs-comment"># after normalizing by the mean true value of each parameter.</span>
        err = (T1_err / T1_avg + T2_err / T2_avg) / <span class="hljs-number">2</span>
    <span class="hljs-keyword">end</span>

<span class="hljs-keyword">end</span>
</code></pre>
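<p>With the cost function in hand,
evaluating a candidate set of scan parameters
is a single call
(again with hypothetical values,
and assuming the <code>bssfp</code> and <code>gridsearch</code> functions above are defined;
note the result varies run to run
because the simulated noise is random):</p>
<pre><code class="lang-julia">TR = [0.005, 0.005, 0.005]
flip = [30, 60, 80]

err = estimator_error(TR, flip) # lower is better
</code></pre>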
<h1 id="heading-optimization">Optimization</h1>
<p>Now we are finally ready to optimize!
We will use <a target="_blank" href="https://github.com/SciML/Optimization.jl">Optimization.jl</a>.
Many optimizers are available,
but because we have a non-convex optimization problem,
we will use the adaptive particle swarm global optimization algorithm.
Here's a function that does the optimization:</p>
<pre><code class="lang-julia"><span class="hljs-comment"># Load needed packages.</span>
<span class="hljs-comment"># Optimization.jl provides the optimization infrastructure,</span>
<span class="hljs-comment"># while OptimizationOptimJL.jl wraps the Optim.jl package</span>
<span class="hljs-comment"># that provides the optimization algorithm we will use.</span>
<span class="hljs-keyword">using</span> Optimization, OptimizationOptimJL

<span class="hljs-comment"># Load the Random standard library package to enable setting the random seed.</span>
<span class="hljs-keyword">using</span> Random

<span class="hljs-comment"># `TR_init` and `flip_init` are vectors of the same length</span>
<span class="hljs-comment"># that provide a starting point for the optimization algorithm.</span>
<span class="hljs-comment"># The length of these vectors also determines the number of bSSFP scans to simulate.</span>
<span class="hljs-keyword">function</span> scan_optimization(TR_init, flip_init)

    <span class="hljs-comment"># Ensure randomly generated noise is the same for each evaluation.</span>
    Random.seed!(<span class="hljs-number">0</span>)

    N_scans = length(TR_init)
    length(flip_init) == N_scans || error(<span class="hljs-string">"`TR_init` and `flip_init` have different lengths"</span>)

    <span class="hljs-comment"># The following converts the cost function we created (`estimator_error`)</span>
    <span class="hljs-comment"># into a form that Optimization.jl can use.</span>
    <span class="hljs-comment"># Specifically, Optimization.jl needs a cost function that takes two inputs</span>
    <span class="hljs-comment"># (the first of which contains the parameters to optimize)</span>
    <span class="hljs-comment"># and returns a real number.</span>
    <span class="hljs-comment"># The input `x` is a concatenation of `TR` and `flip`,</span>
    <span class="hljs-comment"># i.e., `TR == x[1:N_scans]` and `flip == x[N_scans+1:end]`.</span>
    <span class="hljs-comment"># The input `p` is unused, but is needed by Optimization.jl.</span>
    cost_fun = (x, p) -&gt; estimator_error(x[<span class="hljs-number">1</span>:N_scans], x[N_scans+<span class="hljs-number">1</span>:<span class="hljs-keyword">end</span>])

    <span class="hljs-comment"># Specify constraints.</span>
    <span class="hljs-comment"># The lower and upper bounds are chosen to ensure reasonable bSSFP scans.</span>
    (min_TR, max_TR) = (<span class="hljs-number">0.001</span>, <span class="hljs-number">0.02</span>)
    (min_flip, max_flip) = (<span class="hljs-number">1</span>, <span class="hljs-number">90</span>)
    constraints = (;
        lb = [fill(min_TR, N_scans); fill(min_flip, N_scans)],
        ub = [fill(max_TR, N_scans); fill(max_flip, N_scans)],
    )

    <span class="hljs-comment"># Set up and solve the problem.</span>
    f = OptimizationFunction(cost_fun)
    prob = OptimizationProblem(f, [TR_init; flip_init]; constraints...)
    sol = solve(prob, ParticleSwarm(lower = prob.lb, upper = prob.ub, n_particles = <span class="hljs-number">3</span>))

    <span class="hljs-comment"># Extract the optimized TRs and flip angles, remembering that `sol.u == [TR; flip]`.</span>
    TR_opt = sol.u[<span class="hljs-number">1</span>:N_scans]
    flip_opt = sol.u[N_scans+<span class="hljs-number">1</span>:<span class="hljs-keyword">end</span>]

    <span class="hljs-keyword">return</span> (TR_opt, flip_opt)

<span class="hljs-keyword">end</span>
</code></pre>
<h1 id="heading-results">Results</h1>
<p>Let's see how scan parameter optimization
can improve tissue parameter estimates.
We will use the following function
to take scan parameters,
simulate bSSFP scans,
and then estimate the tissue parameters
from those scans.
We will compare the results of this function
with manually chosen scan parameters
to those with optimized scan parameters.</p>
<pre><code class="lang-julia"><span class="hljs-comment"># Load the LinearAlgebra standard library package for access to `norm`.</span>
<span class="hljs-keyword">using</span> LinearAlgebra

<span class="hljs-comment"># Load Plots.jl for displaying the results.</span>
<span class="hljs-keyword">using</span> Plots

<span class="hljs-comment"># Helper function for plotting.</span>
<span class="hljs-keyword">function</span> plot_true_vs_est(T_true, T_est, title, rmse, clim)

    <span class="hljs-keyword">return</span> heatmap([T_true T_est];
        title,
        clim,
        xlabel = <span class="hljs-string">"RMSE = <span class="hljs-subst">$(round(rmse; digits = <span class="hljs-number">4</span>)</span>) s"</span>,
        xticks = [],
        yticks = [],
        showaxis = <span class="hljs-literal">false</span>,
        aspect_ratio = <span class="hljs-number">1</span>,
    )

<span class="hljs-keyword">end</span>

<span class="hljs-keyword">function</span> run(TR, flip)

    <span class="hljs-comment"># Create a synthetic object to scan.</span>
    <span class="hljs-comment"># The background of the object will be indicated with values of `0.0`</span>
    <span class="hljs-comment"># for the tissue parameters.</span>
    nx = ny = <span class="hljs-number">128</span>
    object = map(Iterators.product(range(-<span class="hljs-number">1</span>, <span class="hljs-number">1</span>, nx), range(-<span class="hljs-number">1</span>, <span class="hljs-number">1</span>, ny))) <span class="hljs-keyword">do</span> (x, y)
        r = hypot(x, y)
        <span class="hljs-keyword">if</span> r &lt; <span class="hljs-number">0.5</span>
            <span class="hljs-keyword">return</span> (<span class="hljs-number">0.8</span>, <span class="hljs-number">0.08</span>)
        <span class="hljs-keyword">elseif</span> r &lt; <span class="hljs-number">0.8</span>
            <span class="hljs-keyword">return</span> (<span class="hljs-number">1.3</span>, <span class="hljs-number">0.09</span>)
        <span class="hljs-keyword">else</span>
            <span class="hljs-keyword">return</span> (<span class="hljs-number">0.0</span>, <span class="hljs-number">0.0</span>)
        <span class="hljs-keyword">end</span>
    <span class="hljs-keyword">end</span>

    <span class="hljs-comment"># Simulate the bSSFP scans.</span>
    T1_true = first.(object)
    T2_true = last.(object)
    noise_level = <span class="hljs-number">0.001</span>
    signal = map(T1_true, T2_true) <span class="hljs-keyword">do</span> T1, T2
        <span class="hljs-comment"># Ignore the background of the object.</span>
        (T1, T2) == (<span class="hljs-number">0.0</span>, <span class="hljs-number">0.0</span>) &amp;&amp; <span class="hljs-keyword">return</span> <span class="hljs-number">0.0</span>
        bssfp.(TR, flip, T1, T2) .+ noise_level .* randn.()
    <span class="hljs-keyword">end</span>

    <span class="hljs-comment"># Estimate the tissue parameters.</span>
    T1_est = zeros(nx, ny)
    T2_est = zeros(nx, ny)
    <span class="hljs-keyword">for</span> i <span class="hljs-keyword">in</span> eachindex(signal, T1_est, T2_est)
        <span class="hljs-comment"># Don't try to estimate tissue parameters for the background.</span>
        signal[i] == <span class="hljs-number">0.0</span> &amp;&amp; <span class="hljs-keyword">continue</span>
        (T1_est[i], T2_est[i]) = gridsearch(signal[i], TR, flip)
    <span class="hljs-keyword">end</span>

    <span class="hljs-comment"># Compute the root mean squared error.</span>
    m = T1_true .!= <span class="hljs-number">0.0</span>
    T1_rmse = norm(T1_true[m] - T1_est[m]) / sqrt(count(m))
    T2_rmse = norm(T2_true[m] - T2_est[m]) / sqrt(count(m))

    <span class="hljs-comment"># Plot the results.</span>
    p_T1 = plot_true_vs_est(T1_true, T1_est, <span class="hljs-string">"True vs Estimated T1"</span>, T1_rmse, (<span class="hljs-number">0</span>, <span class="hljs-number">2.5</span>))
    p_T2 = plot_true_vs_est(T2_true, T2_est, <span class="hljs-string">"True vs Estimated T2"</span>, T2_rmse, (<span class="hljs-number">0</span>, <span class="hljs-number">0.25</span>))

    <span class="hljs-keyword">return</span> (p_T1, p_T2)

<span class="hljs-keyword">end</span>
</code></pre>
<p>First, let's see the results with no optimization:</p>
<pre><code class="lang-julia">TR_init = [<span class="hljs-number">0.005</span>, <span class="hljs-number">0.005</span>, <span class="hljs-number">0.005</span>]
flip_init = [<span class="hljs-number">30</span>, <span class="hljs-number">60</span>, <span class="hljs-number">80</span>]
(p_T1_init, p_T2_init) = run(TR_init, flip_init)
</code></pre>
<pre><code class="lang-julia">display(p_T1_init)
</code></pre>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1706651408110/k-_Gc3yle.png?auto=format" alt="T1 estimates using initial scan parameters" /></p>
<pre><code class="lang-julia">display(p_T2_init)
</code></pre>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1706651424879/rYnGUWq8J.png?auto=format" alt="T2 estimates using initial scan parameters" /></p>
<p>Now let's optimize the scan parameters
and then see how the tissue parameter estimates look:</p>
<pre><code class="lang-julia">(TR_opt, flip_opt) = scan_optimization(TR_init, flip_init)
(p_T1_opt, p_T2_opt) = run(TR_opt, flip_opt)
</code></pre>
<pre><code class="lang-julia">display(p_T1_opt)
</code></pre>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1706651435538/8gMdXZ8px.png?auto=format" alt="T1 estimates using optimized scan parameters" /></p>
<pre><code class="lang-julia">display(p_T2_opt)
</code></pre>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1706651445926/GfihhniP2.png?auto=format" alt="T2 estimates using optimized scan parameters" /></p>
<p>We can see that the optimized scans
result in much better tissue parameter estimates!</p>
<h1 id="heading-summary">Summary</h1>
<p>In this post,
we saw how Julia can be used
to optimize MRI scan parameters
to improve tissue parameter estimates.
Even though we discussed MRI specifically,
the concepts presented here
easily extend to other applications
where signal models are known
and optimization is required.</p>
<p>Note that most of the code for this post
was taken from a <a target="_blank" href="https://mri-scan-optim.dash-examples.juliahub.app/">Dash app</a>
I helped create for <a target="_blank" href="https://juliahub.com/company/resources/dash/">JuliaHub</a>.
Feel free to check it out
if you want to see this code in action!</p>
<p>Are you facing slow optimization or simulation runs in MATLAB<sup>®</sup>?
Want to leverage Julia for enhanced performance
but don't know where to start?
Check out our <a target="_blank" href="https://glcs.io/julia-matlab/">webpage</a>
and watch our <a target="_blank" href="https://www.youtube.com/watch?v=vvnfyVMwu_Y">JuliaCon talk</a>
for solutions tailored just for you!</p>
<h1 id="heading-additional-links">Additional Links</h1>
<ul>
<li><a target="_blank" href="https://mri-scan-optim.dash-examples.juliahub.app/">Optimizing MRI Scans Dash App</a><ul>
<li>Web app showcasing the capabilities discussed in this post.</li>
</ul>
</li>
<li><a target="_blank" href="https://juliahub.com/ui/Notebooks/deep-datta2/Dash_Examples/mri_scan_parameter_optimization.jl">Optimizing MRI Scans Pluto Notebook</a><ul>
<li>Pluto notebook accompanying the Dash app mentioned above.
(Unfortunately, the data used in the notebook didn't get set up properly,
so most of the code can't run correctly.)</li>
</ul>
</li>
<li>Links to other blog posts discussing how to do MRI in Julia:<ul>
<li><a target="_blank" href="https://blog.glcs.io/simulating-mri-physics-with-the-bloch-equations">Simulating MRI Physics with the Bloch Equations</a></li>
<li><a target="_blank" href="https://blog.glcs.io/mastering-mri-bloch-simulations-with-blochsimjl-in-julia">Mastering MRI Bloch Simulations with BlochSim.jl in Julia</a></li>
<li><a target="_blank" href="https://blog.glcs.io/mri-scan-simulation-made-easy-with-blochsimjl">MRI Scan Simulation Made Easy with BlochSim.jl</a></li>
</ul>
</li>
<li><a target="_blank" href="https://docs.sciml.ai/Optimization/stable/">Optimization.jl Docs</a><ul>
<li>Official documentation for Optimization.jl.</li>
</ul>
</li>
<li>Julia-MATLAB integration:<ul>
<li><a target="_blank" href="https://glcs.io/julia-matlab/">Service webpage</a></li>
<li><a target="_blank" href="https://www.youtube.com/watch?v=vvnfyVMwu_Y">JuliaCon 2025 talk recording</a></li>
<li><a target="_blank" href="https://blog.glcs.io/juliacon-2025">JuliaCon 2025 talk transcript</a></li>
<li><a target="_blank" href="https://blog.glcs.io/juliacon-2025-preview">JuliaCon 2025 preview blog post</a></li>
</ul>
</li>
</ul>
<p>MATLAB is a registered trademark of The MathWorks, Inc.</p>
]]></content:encoded></item><item><title><![CDATA[Delving into Advanced Types within the Julia Type System]]></title><description><![CDATA[In a previous post we covered the building blocks of the Julia type system and discussed just how powerful it can be.
What if I were to tell you that Julia's type system has even more to offer?
Well, welcome back to class, my friends! Today, we will ...]]></description><link>https://blog.glcs.io/julia-types-advanced</link><guid isPermaLink="true">https://blog.glcs.io/julia-types-advanced</guid><category><![CDATA[Julia]]></category><category><![CDATA[General Programming]]></category><dc:creator><![CDATA[Justyn Nissly]]></dc:creator><pubDate>Mon, 22 Jan 2024 15:03:58 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1705531784771/e5q0vD-v7.png?auto=format" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>In a <a target="_blank" href="https://blog.glcs.io/julia-type-system">previous post</a> we covered the building blocks of the Julia type system and discussed just how powerful it can be.</p>
<p>What if I were to tell you that Julia's type system has even more to offer?
Well, welcome back to class, my friends! Today, we will see what else the Julia type system offers us. Let's dive right in and look at our first topic: Type Unions.</p>
<h1 id="heading-type-unions">Type Unions</h1>
<p>Type unions are exactly what they sound like: a union of two or more types that forms a new abstract type.
When you annotate a variable with a type built using the <code>Union</code> constructor, that variable can hold a value of any of the types listed in the union.
Let's look at an example:</p>
<pre><code class="lang-julia">    julia&gt; IntOrString = <span class="hljs-built_in">Union</span>{<span class="hljs-built_in">Int</span>,<span class="hljs-built_in">AbstractString</span>}
    <span class="hljs-built_in">Union</span>{<span class="hljs-built_in">Int64</span>, <span class="hljs-built_in">AbstractString</span>}


    julia&gt; <span class="hljs-keyword">function</span> testUnionTypes(aType::IntOrString)
            print(<span class="hljs-string">"The type of aType is <span class="hljs-subst">$(typeof(aType)</span>)"</span>)
        <span class="hljs-keyword">end</span>
    testUnionTypes (generic <span class="hljs-keyword">function</span> with <span class="hljs-number">1</span> method)


    julia&gt; testUnionTypes(<span class="hljs-number">1</span>)
    The type of aType is Int64


    julia&gt; testUnionTypes(<span class="hljs-string">"Testing Union Types"</span>)
    The type of aType is String


    julia&gt; testUnionTypes(<span class="hljs-number">3.14</span>)
    ERROR: <span class="hljs-built_in">MethodError</span>: no method matching testUnionTypes(::<span class="hljs-built_in">Float64</span>)


    Closest candidates are:
    testUnionTypes(::<span class="hljs-built_in">Union</span>{<span class="hljs-built_in">Int64</span>, <span class="hljs-built_in">AbstractString</span>})
    @ Main REPL[<span class="hljs-number">8</span>]:<span class="hljs-number">1</span>


    Stacktrace:
    [<span class="hljs-number">1</span>] top-level scope
    @ REPL[<span class="hljs-number">11</span>]:<span class="hljs-number">1</span>
</code></pre>
<p>You'll notice that when we test our new <code>Union</code> type, it rejects the float value we gave it, but it accepts both the integer and the string! Just like the <code>Any</code> type we discussed in a <a target="_blank" href="https://blog.glcs.io/julia-type-system">previous article</a> on types, this allows us to develop generic code that works with several types. Julia itself actually uses this concept to allow for nullable types. Julia uses <code>Union{T, Nothing}</code>, where <code>T</code> can be any type you want. For example, <code>Union{AbstractString, Nothing}</code> would allow you to assign a string or a null (written with the symbol <code>nothing</code> in Julia).</p>
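<p>That nullable pattern can be sketched in a few lines. Note that the <code>find_user</code> function below is a hypothetical helper invented for illustration:</p>

```julia
# A minimal sketch of the Union{T, Nothing} "nullable" pattern.
# `find_user` is a made-up helper: it may or may not find a result.
function find_user(id::Int)::Union{String, Nothing}
    users = Dict(1 => "Ada", 2 => "Grace")
    return get(users, id, nothing)  # `nothing` signals "no result"
end

name = find_user(3)
if name === nothing
    println("User not found")
else
    println("Found $name")
end
```

<p>Comparing against <code>nothing</code> with <code>===</code> is the idiomatic way to test whether such a value is present.</p>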
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1705525557500/rb-zAg3UM.png?auto=format" alt="A generic triangle" /></p>
<h1 id="heading-parametric-types">Parametric Types</h1>
<p>Parametric types are a very interesting part of Julia, and yes, they are exactly what they sound like. They are types that can take parameters. Let's look at an example of the syntax and then discuss its use.</p>
<pre><code class="lang-julia">julia&gt; <span class="hljs-keyword">struct</span> Triangle{T}
           a::T
           b::T
       <span class="hljs-keyword">end</span>


julia&gt; myFloatTriangle = Triangle{<span class="hljs-built_in">AbstractFloat</span>}(<span class="hljs-number">1.0</span>,<span class="hljs-number">2.0</span>)
Triangle{<span class="hljs-built_in">AbstractFloat</span>}(<span class="hljs-number">1.0</span>, <span class="hljs-number">2.0</span>)


julia&gt; myFloatTriangle.a
<span class="hljs-number">1.0</span>


julia&gt; myFloatTriangle.b
<span class="hljs-number">2.0</span>
</code></pre>
<p>Now that we have created a parametric type, let's see how we can use it.</p>
<pre><code class="lang-julia">julia&gt; <span class="hljs-keyword">function</span> calculate_hypotenuse(myShape::Triangle)
           println(<span class="hljs-string">"The hypotenuse is: <span class="hljs-subst">$(sqrt(myShape.a^<span class="hljs-number">2</span> + myShape.b^<span class="hljs-number">2</span>)</span>)"</span>)
       <span class="hljs-keyword">end</span>


julia&gt; calculate_hypotenuse(myFloatTriangle)
The hypotenuse is: <span class="hljs-number">2.23606797749979</span>


julia&gt; myIntTriangle = Triangle(<span class="hljs-number">1</span>,<span class="hljs-number">2</span>)
Triangle{<span class="hljs-built_in">Int64</span>}(<span class="hljs-number">1</span>, <span class="hljs-number">2</span>)


julia&gt; calculate_hypotenuse(myIntTriangle)
The hypotenuse is: <span class="hljs-number">2.23606797749979</span>
</code></pre>
<p>In our example above, we defined our "Triangle" variable in two ways.
In one, we specified the type; in the other, we simply gave the values for the parameters.
Julia was able to figure out the rest. This is all possible because <code>T</code> can be any type we give it.
We don't have to specify the type upfront, so we can make our code much more flexible and reusable. In fact, not specifying the type (as we did with <code>myIntTriangle</code>) is the preferred way to do it in Julia.</p>
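<p>One more detail worth a quick sketch: a type parameter can also be constrained with a subtype relation. The <code>Point</code> type below is a made-up example, not part of the Triangle code above:</p>

```julia
# Sketch: a parametric type whose parameter must be a subtype of Real.
struct Point{T<:Real}
    x::T
    y::T
end

p_int = Point(1, 2)        # Point{Int64}; T is inferred from the arguments
p_float = Point(1.0, 2.0)  # Point{Float64}
# Point("a", "b") would throw a MethodError, since String is not a subtype of Real
```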
<hr />
<h1 id="heading-tuple-types">Tuple Types</h1>
<p>You can think of Tuples as boxes that hold items.
After you put the items into the box, the box is "sealed," and those items are set in that order with those values...<strong><em>FOREVER</em></strong>...
Okay, maybe that is a bit dramatic, but I think you get the point.
Tuple types are immutable containers with any number or combination of types.
Let's look at the syntax of Tuples:</p>
<pre><code class="lang-julia">julia&gt; typeof((<span class="hljs-number">42</span>,<span class="hljs-string">"Don't panic"</span>,<span class="hljs-number">10.9</span>))
<span class="hljs-built_in">Tuple</span>{<span class="hljs-built_in">Int64</span>, <span class="hljs-built_in">String</span>, <span class="hljs-built_in">Float64</span>}
</code></pre>
<p>One primary use for this is returning multiple values from a single function. Because Tuples are immutable, they ensure that the data stays structured in the order you specify and remains constant. You can see several more examples of how Tuples are used in the <a target="_blank" href="https://docs.julialang.org/en/v1/manual/types/#Tuple-Types">Julia documentation</a>, or, for a great explanation of how Tuples work, check out <a target="_blank" href="https://www.youtube.com/watch?v=J-9bGo-nNyE">this video</a> by DoggoDotJl. He does a fantastic job explaining several different parts of the Julia language, and I highly recommend checking out <a target="_blank" href="https://www.youtube.com/@doggodotjl">his channel</a> on YouTube. <em>#notsponsored</em></p>
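<p>For instance, returning multiple values as a tuple and unpacking them at the call site might look like this (the <code>extrema_pair</code> function is made up for illustration; Julia's built-in <code>extrema</code> does the same job):</p>

```julia
# Sketch: a function returning two values bundled in a tuple.
function extrema_pair(xs)
    return (minimum(xs), maximum(xs))
end

# The returned tuple can be unpacked into separate variables.
(lo, hi) = extrema_pair([3, 1, 4, 1, 5])
```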
<h2 id="heading-named-tuples">Named Tuples</h2>
<p>Named Tuples are just Tuples...but with names. I bet you didn't see that coming.
Well, okay... Named Tuples are a little more than just Tuples with names.
Let's look at the syntax and then talk a little more about it:</p>
<pre><code class="lang-julia">julia&gt; tupleWithNames = (
           language = <span class="hljs-string">"Julia"</span>,
           isTheBest = <span class="hljs-literal">true</span>,
           <span class="hljs-keyword">type</span> = <span class="hljs-string">"NamedTuple"</span>,
       )
(language = <span class="hljs-string">"Julia"</span>, isTheBest = <span class="hljs-literal">true</span>, <span class="hljs-keyword">type</span> = <span class="hljs-string">"NamedTuple"</span>)


julia&gt; typeof(tupleWithNames)
<span class="hljs-meta">@NamedTuple</span>{language::<span class="hljs-built_in">String</span>, isTheBest::<span class="hljs-built_in">Bool</span>, <span class="hljs-keyword">type</span>::<span class="hljs-built_in">String</span>}
</code></pre>
<p>A NamedTuple functions like a JSON object in that you have key-value pairs. They are a fantastic data structure to use when you want to organize your code but don't need the flexibility of an array. When you want to access the elements of a NamedTuple you can do it in two different ways.</p>
<pre><code class="lang-julia">julia&gt; <span class="hljs-keyword">function</span> testingTuples(ourTuple)
            println(<span class="hljs-string">"We can use <span class="hljs-subst">$(ourTuple.<span class="hljs-keyword">type</span>)</span> to check if <span class="hljs-subst">$(ourTuple.language)</span> is better than Python"</span>)
            println(<span class="hljs-string">"Is <span class="hljs-subst">$(ourTuple.language)</span> the best? <span class="hljs-subst">$(ourTuple[<span class="hljs-number">2</span>] ? <span class="hljs-string">"Yes it is! .jl &gt; .py"</span> : <span class="hljs-string">"No, use Python noob"</span>)</span>"</span>)<span class="hljs-keyword">end</span>
testingTuples (generic <span class="hljs-keyword">function</span> with <span class="hljs-number">1</span> method)


julia&gt; testingTuples(tupleWithNames)
We can use NamedTuple to check <span class="hljs-keyword">if</span> Julia is better than Python
Is Julia the best? Yes it is! .jl &gt; .py
</code></pre>
<p>You can see in the above example that Julia is indeed better than Python! Okay, maybe this example proves nothing about Julia's superiority and may need to be the topic of another post. But this example does show how you can access elements inside NamedTuples. You can access elements in tuples using the tuple name, the <code>.</code> operator, and then the name of the element you want to access. You can also access it by indexing. Indexing is also how to access data in Tuples that weren't special enough for you to give names to. Now, with access to Tuples and NamedTuples, you can organize data in the same way you would with JSON objects.</p>
<blockquote>
<p><strong>NOTE:</strong> Just like Tuples, NamedTuples are immutable. Once the values are set, they cannot be changed later.</p>
</blockquote>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1705525321104/RAuLJWIbS.jpg?auto=format" alt="Picture of a cool lizard" /></p>
<h1 id="heading-unionall-types">UnionAll Types</h1>
<p>Now I know what you are probably thinking: What is up with the lizard?
That can be explained with two simple points:</p>
<ul>
<li>Lizards are cool.</li>
<li>I like to compare UnionAll types to chameleons.</li>
</ul>
<p>Now, let me elaborate on that a bit. Just like chameleons change color, UnionAll types can change form and adapt to any data type. Chameleons don't change their color to yellow and suddenly become bananas. They are still chameleons. Similarly, you may have a UnionAll type that is a float in one instance and an integer the next, but it still has the same structure. Enough with the lizard talk; let's look at a more practical example to make sense of this reptilian elucidation.</p>
<pre><code class="lang-julia">julia&gt; <span class="hljs-keyword">abstract type</span> Shape{T} <span class="hljs-keyword">end</span>


julia&gt; <span class="hljs-keyword">struct</span> Circle{T} &lt;: Shape{T}
           radius::T
       <span class="hljs-keyword">end</span>


julia&gt; <span class="hljs-keyword">struct</span> Square{T} &lt;: Shape{T}
           side::T
       <span class="hljs-keyword">end</span>


julia&gt; circle_float = Circle(<span class="hljs-number">2.5</span>)
Circle{<span class="hljs-built_in">Float64</span>}(<span class="hljs-number">2.5</span>)



julia&gt; square_int = Square(<span class="hljs-number">4</span>)
Square{<span class="hljs-built_in">Int64</span>}(<span class="hljs-number">4</span>)



julia&gt; square_int <span class="hljs-keyword">isa</span> Shape
<span class="hljs-literal">true</span>


julia&gt; square_int <span class="hljs-keyword">isa</span> Square
<span class="hljs-literal">true</span>


julia&gt; square_int <span class="hljs-keyword">isa</span> Circle
<span class="hljs-literal">false</span>
</code></pre>
<p>You can see that we created an abstract type called Shape, and then we created two other types called Square and Circle. When you write a parametric type like <code>Circle</code> without specifying its parameter, you are using a UnionAll type: it represents the union of <code>Circle{T}</code> over all possible values of <code>T</code>. By doing this, we have created a universal way to handle shapes without tying ourselves to any specific numeric type. We aren't limited to just integers or just floats. UnionAll types, similar to other topics we covered, allow you to build very flexible code that is type-stable, generic, and optimal for machine code generation.</p>
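<p>To make the "UnionAll" idea concrete, here is a minimal sketch (using a hypothetical <code>Box</code> type rather than the shapes above) showing that a parametric type name with its parameter left free is itself a UnionAll type:</p>

```julia
struct Box{T}
    item::T
end

Box isa UnionAll           # true: `Box` with `T` left free is a UnionAll type
Box{Float64} isa DataType  # true: fixing `T` gives a concrete type
Box{Float64} <: Box        # true: every `Box{T}` is a subtype of the UnionAll
```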
<h1 id="heading-type-aliases">Type Aliases</h1>
<p>The last thing we will cover is type aliases.
Julia allows type aliases to give a new name to an existing type.
The idea here is the same as many other concepts we discussed: generic and reusable code.
Take <code>Int</code>, for example. <code>Int</code> is an alias for either <code>Int32</code> or <code>Int64</code>,
depending on your system. The <code>Int</code> alias is a convenience, removing the abstraction of 32-bit versus 64-bit values: you declare a variable as <code>Int</code>, and Julia takes care of the rest.
We should note that Julia does not have <code>Float</code> as an alias for a specific-size floating-point type (such as <code>Float64</code>).
<code>Int</code> reflects the size of a native pointer on the machine running your code, while
floating-point formats, on the other hand, are specified by the IEEE 754 standard, so there is no machine-dependent size to abstract over. So, just like in real school, you have some homework to do later...or not. I won't judge.</p>
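<p>You can also define aliases for your own purposes with <code>const</code>. A minimal sketch, with the names <code>Velocity</code> and <code>StringMap</code> made up for illustration:</p>

```julia
# Sketch: user-defined type aliases via `const`.
const Velocity = Float64
const StringMap = Dict{String, String}

v = Velocity(9.8)                 # just a Float64 with a friendlier name
m = StringMap("lang" => "Julia")  # a Dict{String, String}
Velocity === Float64              # true: an alias, not a new type
```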
<h1 id="heading-summary">Summary</h1>
<p>To recap, we have covered Type Unions, Parametric Types, Tuple Types, UnionAll Types, and Type Aliases.
Hopefully, you have a better understanding of the Julia type system and the power it wields.
If you want to learn more about Julia or some other interesting topics,
be sure to check out our other blog posts on <a target="_blank" href="https://blog.glcs.io/">blog.glcs.io</a>!</p>
<h1 id="heading-additional-links">Additional Links</h1>
<ul>
<li><a target="_blank" href="https://docs.julialang.org/en/v1/manual/types">Types Documentation</a><ul>
<li>Official Julia documentation on all types</li>
</ul>
</li>
<li><a target="_blank" href="https://blog.glcs.io/julia-type-system">Understanding the Julia Type System</a><ul>
<li>Our previous post on the Julia type system</li>
</ul>
</li>
</ul>
Background for article cover image by <a href="https://www.freepik.com/free-photo/halloween-party_9164228.htm#page=2&amp;query=math%20chalkboard&amp;position=38&amp;from_view=search&amp;track=ais&amp;uuid=5e4f51c6-8947-4535-83fb-b6c75136fd3c">Freepik</a>


]]></content:encoded></item><item><title><![CDATA[Julia's Parallel Processing]]></title><description><![CDATA[Julia is a relatively new,
free, and open-source programming language.
It has a syntax
similar to that of other popular programming languages
such as MATLAB® and Python,
but it boasts being able to achieve C-like speeds.
While serial Julia code can b...]]></description><link>https://blog.glcs.io/parallel-processing</link><guid isPermaLink="true">https://blog.glcs.io/parallel-processing</guid><category><![CDATA[Julia]]></category><category><![CDATA[General Programming]]></category><dc:creator><![CDATA[Steven Whitaker]]></dc:creator><pubDate>Mon, 15 Jan 2024 15:03:47 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1703185018201/AwXX-mjkf.png?auto=format" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p><a target="_blank" href="https://julialang.org">Julia</a> is a relatively new,
free, and open-source programming language.
It has a syntax
similar to that of other popular programming languages
such as MATLAB<sup>®</sup> and Python,
but it boasts being able to achieve C-like speeds.</p>
<p>While serial Julia code can be fast,
sometimes even more speed is desired.
In many cases,
writing parallel code
can further reduce run time.
Parallel code takes advantage
of the multiple CPU cores
included in modern computers,
allowing multiple computations
to run at the same time,
or in parallel.</p>
<p>Julia provides two methods
for writing parallel CPU code:
multi-threading and distributed computing.
This post will cover
the basics of
how to use these two methods
of parallel processing.</p>
<p>This post assumes you already have Julia installed.
If you haven't yet,
check out our earlier
<a target="_blank" href="https://blog.glcs.io/install-julia-and-vscode">post on how to install Julia</a>.</p>
<h1 id="heading-multi-threading">Multi-Threading</h1>
<p>First, let's learn about multi-threading.</p>
<p>To enable multi-threading,
you must start Julia in one of two ways:</p>
<ol>
<li>Set the environment variable <code>JULIA_NUM_THREADS</code>
to the number of threads Julia should use,
and then start Julia.
For example, <code>JULIA_NUM_THREADS=4</code>.</li>
<li>Run Julia with the <code>--threads</code> (or <code>-t</code>) command line argument.
For example, <code>julia --threads 4</code> or <code>julia -t 4</code>.</li>
</ol>
<p>After starting Julia
(either with or without specifying the number of threads),
the <code>Threads</code> module will be loaded.
We can check the number of threads Julia has available:</p>
<pre><code class="lang-julia">julia&gt; Threads.nthreads()
<span class="hljs-number">4</span>
</code></pre>
<p>The simplest way
to start writing parallel code
is just to use the <code>Threads.@threads</code> macro.
Inserting this macro before a <code>for</code> loop
will cause the iterations of the loop
to be split across the available threads,
which will then operate in parallel.
For example:</p>
<pre><code class="lang-julia">Threads.<span class="hljs-meta">@threads</span> <span class="hljs-keyword">for</span> i = <span class="hljs-number">1</span>:<span class="hljs-number">10</span>
    func(i)
<span class="hljs-keyword">end</span>
</code></pre>
<p>Without <code>Threads.@threads</code>,
first <code>func(1)</code> will run,
then <code>func(2)</code>, and so on.
With the macro,
and assuming we started Julia with four threads,
first <code>func(1)</code>, <code>func(4)</code>, <code>func(7)</code>, and <code>func(9)</code>
will run in parallel.
Then,
when a thread's iteration finishes,
it will start another iteration
(assuming the loop is not done yet),
regardless of whether the other threads
have finished their iterations yet.
Therefore,
this loop will theoretically finish 10 iterations
in the time it takes a single thread to do 3.</p>
<p>Note that <code>Threads.@threads</code> is blocking,
meaning code after the threaded <code>for</code> loop
will not run until the loop has finished.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1704743130296/OXidmRWz_.png?auto=format" alt="Image of threaded for loop" /></p>
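<p>Here is a small, self-contained sketch of such a loop; the <code>work</code> function is a stand-in for a real computation. Each iteration writes to its own slot of <code>results</code>, so the loop is safe to run with any number of threads:</p>

```julia
# Sketch: a threaded loop where each iteration owns its own output slot.
function work(i)
    sleep(0.01)  # stand-in for a heavy computation
    return i^2
end

results = zeros(Int, 10)
Threads.@threads for i = 1:10
    results[i] = work(i)  # no two iterations touch the same memory
end
```

<p>With four threads, this loop takes roughly the time of three serial iterations, as described above.</p>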
<p>Julia also provides another macro for multi-threading:
<code>Threads.@spawn</code>.
This macro is more flexible than <code>Threads.@threads</code>
because it can be used to run any code
on a thread,
not just <code>for</code> loops.
But let's illustrate how to use <code>Threads.@spawn</code>
by implementing the behavior of <code>Threads.@threads</code>:</p>
<pre><code class="lang-julia"><span class="hljs-comment"># Function for splitting up `x` as evenly as possible</span>
<span class="hljs-comment"># across `np` partitions.</span>
<span class="hljs-keyword">function</span> partition(x, np)
    (len, rem) = divrem(length(x), np)
    Base.Generator(<span class="hljs-number">1</span>:np) <span class="hljs-keyword">do</span> p
        i1 = firstindex(x) + (p - <span class="hljs-number">1</span>) * len
        i2 = i1 + len - <span class="hljs-number">1</span>
        <span class="hljs-keyword">if</span> p &lt;= rem
            i1 += p - <span class="hljs-number">1</span>
            i2 += p
        <span class="hljs-keyword">else</span>
            i1 += rem
            i2 += rem
        <span class="hljs-keyword">end</span>
        chunk = x[i1:i2]
    <span class="hljs-keyword">end</span>
<span class="hljs-keyword">end</span>
N = <span class="hljs-number">10</span>
chunks = partition(<span class="hljs-number">1</span>:N, Threads.nthreads())
tasks = map(chunks) <span class="hljs-keyword">do</span> chunk
    Threads.<span class="hljs-meta">@spawn</span> <span class="hljs-keyword">for</span> i <span class="hljs-keyword">in</span> chunk
        func(i)
    <span class="hljs-keyword">end</span>
<span class="hljs-keyword">end</span>
wait.(tasks)
</code></pre>
<p>Let's walk through this code,
assuming <code>Threads.nthreads() == 4</code>:</p>
<ul>
<li>First, we split the 10 iterations
evenly across the 4 threads
using <code>partition</code>.
So, <code>chunks</code> ends up being
<code>[1:3, 4:6, 7:8, 9:10]</code>.
(We could have hard-coded the partitioning,
but now you have a nice <code>partition</code> function
that can work with more complicated partitionings!)</li>
<li>Then, for each chunk,
we create a <code>Task</code> via <code>Threads.@spawn</code>
that will call <code>func</code>
on each element of the chunk.
This <code>Task</code> will be scheduled
to run on an available thread.
<code>tasks</code> contains a reference
to each of these spawned <code>Task</code>s.</li>
<li>Finally, we wait for the <code>Task</code>s to finish
with the <code>wait</code> function.</li>
</ul>
<p>To reemphasize, note that <code>Threads.@spawn</code> creates a <code>Task</code>;
it does not wait for the task to run.
As such, it is non-blocking,
and program execution continues
as soon as the <code>Task</code> is returned.
The code wrapped in the task
will also run, but in parallel, on a separate thread.
This behavior is illustrated below:</p>
<pre><code class="lang-julia">julia&gt; Threads.<span class="hljs-meta">@spawn</span> (sleep(<span class="hljs-number">2</span>); println(<span class="hljs-string">"Spawned task finished"</span>))
<span class="hljs-built_in">Task</span> (runnable) @<span class="hljs-number">0x00007fdd4b10dc30</span>

julia&gt; <span class="hljs-number">1</span> + <span class="hljs-number">1</span> <span class="hljs-comment"># This code executes without waiting for the above task to finish</span>
<span class="hljs-number">2</span>

julia&gt; Spawned task finished <span class="hljs-comment"># Prints 2 seconds after spawning the above task</span>
julia&gt;
</code></pre>
<p>Spawned tasks can also return data.
While <code>wait</code> just waits for a task to finish,
<code>fetch</code> waits for a task
and then obtains the result:</p>
<pre><code class="lang-julia">julia&gt; task = Threads.<span class="hljs-meta">@spawn</span> (sleep(<span class="hljs-number">2</span>); <span class="hljs-number">1</span> + <span class="hljs-number">1</span>)
<span class="hljs-built_in">Task</span> (runnable) @<span class="hljs-number">0x00007fdd4a5e28b0</span>

julia&gt; fetch(task)
<span class="hljs-number">2</span>
</code></pre>
<h2 id="heading-thread-safety">Thread Safety</h2>
<p>When using multi-threading,
memory is shared across threads.
If a thread writes to a memory location
that is written to or read from another thread,
that will lead to a race condition
with unpredictable results.
To illustrate:</p>
<pre><code class="lang-julia">julia&gt; s = <span class="hljs-number">0</span>;

julia&gt; Threads.<span class="hljs-meta">@threads</span> <span class="hljs-keyword">for</span> i = <span class="hljs-number">1</span>:<span class="hljs-number">1000000</span>
           <span class="hljs-keyword">global</span> s += i
       <span class="hljs-keyword">end</span>

julia&gt; s
<span class="hljs-number">19566554653</span> <span class="hljs-comment"># Should be 500000500000</span>
</code></pre>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1704746329461/BN_ShsuHs.png?auto=format" alt="Race condition" /></p>
<p>There are two methods we can use
to avoid the race condition.
The first involves using a lock:</p>
<pre><code class="lang-julia">julia&gt; s = <span class="hljs-number">0</span>; l = <span class="hljs-built_in">ReentrantLock</span>();

julia&gt; Threads.<span class="hljs-meta">@threads</span> <span class="hljs-keyword">for</span> i = <span class="hljs-number">1</span>:<span class="hljs-number">1000000</span>
           lock(l) <span class="hljs-keyword">do</span>
               <span class="hljs-keyword">global</span> s += i
           <span class="hljs-keyword">end</span>
       <span class="hljs-keyword">end</span>

julia&gt; s
<span class="hljs-number">500000500000</span>
</code></pre>
<p>In this case,
the addition can only occur
on a given thread
once that thread holds the lock.
If a thread does not hold the lock,
it must wait for whatever thread controls it
to release the lock
before it can run the code
within the <code>lock</code> block.</p>
<p>Using a lock in this example
is suboptimal, however,
as it eliminates all parallelism
because only one thread can hold the lock
at any given moment.
(In other examples, however,
using a lock works great,
particularly when only a small portion
of the code depends on the lock.)</p>
<p>The other way to eliminate the race condition
is to use task-local buffers:</p>
<pre><code class="lang-julia">julia&gt; s = <span class="hljs-number">0</span>; chunks = partition(<span class="hljs-number">1</span>:<span class="hljs-number">1000000</span>, Threads.nthreads());

julia&gt; tasks = map(chunks) <span class="hljs-keyword">do</span> chunk
           Threads.<span class="hljs-meta">@spawn</span> <span class="hljs-keyword">begin</span>
               x = <span class="hljs-number">0</span>
               <span class="hljs-keyword">for</span> i <span class="hljs-keyword">in</span> chunk
                   x += i
               <span class="hljs-keyword">end</span>
               x
           <span class="hljs-keyword">end</span>
       <span class="hljs-keyword">end</span>;

julia&gt; thread_sums = fetch.(tasks);

julia&gt; <span class="hljs-keyword">for</span> i <span class="hljs-keyword">in</span> thread_sums
           s += i
       <span class="hljs-keyword">end</span>

julia&gt; s
<span class="hljs-number">500000500000</span>
</code></pre>
<p>In this example,
each spawned task has its own <code>x</code>
that stores the sum
of the values just in the task's chunk of data.
In particular,
none of the tasks modify <code>s</code>.
Then, once each task has computed its sum,
the intermediate values are summed
and stored in <code>s</code>
in a single-threaded manner.</p>
<p>Using task-local buffers
works better for this example
than using a lock
because most of the parallelism is preserved.</p>
<p>(Note that it used to be advised
to manage task-local buffers
using the <code>threadid</code> function.
However, doing so does not guarantee
each task uses its own buffer.
Therefore, the method demonstrated in the above example
<a target="_blank" href="https://julialang.org/blog/2023/07/PSA-dont-use-threadid/">is now advised</a>.)</p>
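<p>For completeness, a third option for this particular summation (not covered above) is an atomic counter, which makes each addition indivisible without needing an explicit lock:</p>

```julia
# Sketch: avoiding the race condition with an atomic counter.
s = Threads.Atomic{Int}(0)
Threads.@threads for i = 1:1000000
    Threads.atomic_add!(s, i)  # each addition happens indivisibly
end
```

<p>Atomics suit simple scalar updates like this; for heavier per-iteration work, the task-local buffers shown above generally scale better.</p>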
<h2 id="heading-packages-for-quickly-utilizing-multi-threading">Packages for Quickly Utilizing Multi-Threading</h2>
<p>In addition to writing your own multi-threaded code,
there exist packages that utilize multi-threading.
Two such examples are <a target="_blank" href="https://github.com/tkf/ThreadsX.jl">ThreadsX.jl</a> and <a target="_blank" href="https://github.com/baggepinnen/ThreadTools.jl">ThreadTools.jl</a>.</p>
<p>ThreadsX.jl provides multi-threaded implementations
of several common functions
such as <code>sum</code> and <code>sort</code>,
while ThreadTools.jl provides <code>tmap</code>,
a multi-threaded version of <code>map</code>.</p>
<p>These packages can be great
for quickly boosting performance
without having to figure out multi-threading
on your own.</p>
<h1 id="heading-distributed-computing">Distributed Computing</h1>
<p>Besides multi-threading,
Julia also provides for distributed computing,
or splitting work across multiple Julia processes.</p>
<p>There are two ways to start multiple Julia processes:</p>
<ol>
<li>Load the Distributed standard library package
with <code>using Distributed</code>
and then use <code>addprocs</code>.
For example, <code>addprocs(2)</code>
to add two additional Julia processes
(for a total of three).</li>
<li>Run Julia with the <code>-p</code> command line argument.
For example, <code>julia -p 2</code>
to start Julia with three total Julia processes.
(Note that running Julia with <code>-p</code>
will implicitly load <code>Distributed</code>.)</li>
</ol>
<p>Added processes are known as worker processes,
while the original process is the main process.
Each process has an id:
the main process has id <code>1</code>,
and worker processes have id <code>2</code>, <code>3</code>, etc.</p>
<p>By default,
code runs on the main process.
To run code on a worker,
we need to explicitly give code to that worker.
We can do so with <code>remotecall_fetch</code>,
which takes as inputs
a function to run,
the process id to run the function on,
and the input arguments and keyword arguments
the function needs.
Here are some examples:</p>
<pre><code class="lang-julia"><span class="hljs-comment"># Create a zero-argument anonymous function to run on worker 2.</span>
julia&gt; remotecall_fetch(<span class="hljs-number">2</span>) <span class="hljs-keyword">do</span>
           println(<span class="hljs-string">"Done"</span>)
       <span class="hljs-keyword">end</span>
      From worker <span class="hljs-number">2</span>:    Done

<span class="hljs-comment"># Create a two-argument anonymous function to run on worker 2.</span>
julia&gt; remotecall_fetch((a, b) -&gt; a + b, <span class="hljs-number">2</span>, <span class="hljs-number">1</span>, <span class="hljs-number">2</span>)
<span class="hljs-number">3</span>

<span class="hljs-comment"># Run `sum([1 3; 2 4]; dims = 1)` on worker 3.</span>
julia&gt; remotecall_fetch(sum, <span class="hljs-number">3</span>, [<span class="hljs-number">1</span> <span class="hljs-number">3</span>; <span class="hljs-number">2</span> <span class="hljs-number">4</span>]; dims = <span class="hljs-number">1</span>)
<span class="hljs-number">1</span>×2 <span class="hljs-built_in">Matrix</span>{<span class="hljs-built_in">Int64</span>}:
 <span class="hljs-number">3</span>  <span class="hljs-number">7</span>
</code></pre>
<p>If you don't need to wait for the result immediately,
use <code>remotecall</code> instead of <code>remotecall_fetch</code>.
This will create a <code>Future</code>
that you can later <code>wait</code> on or <code>fetch</code>
(similarly to a <code>Task</code> spawned with <code>Threads.@spawn</code>).</p>
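<p>For example (a minimal sketch; the worker id is taken from <code>workers()</code> rather than hard-coded):</p>

```julia
using Distributed
addprocs(1)

w = first(workers())

# `remotecall` returns a `Future` immediately, without blocking...
f = remotecall(sum, w, 1:1_000_000)

# ...so other work can happen here; `fetch` blocks until the result is ready.
fetch(f)  # 500000500000
```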
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1704747080118/LwFuPX-wG.jpg?auto=format&amp;height=600" alt="Super computer" /></p>
<h2 id="heading-separate-memory-spaces">Separate Memory Spaces</h2>
<p>One significant difference
between multi-threading and distributed processing
is that memory is shared in multi-threading,
while each distributed process
has its own separate memory space.
This has several important implications:</p>
<ul>
<li><p>To use a package on a given worker,
it must be loaded on that worker,
not just on the main process.
To illustrate:</p>
<pre><code class="lang-julia">julia&gt; <span class="hljs-keyword">using</span> LinearAlgebra

julia&gt; <span class="hljs-literal">I</span>
<span class="hljs-built_in">UniformScaling</span>{<span class="hljs-built_in">Bool</span>}
<span class="hljs-literal">true</span>*<span class="hljs-literal">I</span>

julia&gt; remotecall_fetch(() -&gt; <span class="hljs-literal">I</span>, <span class="hljs-number">2</span>)
ERROR: On worker <span class="hljs-number">2</span>:
<span class="hljs-built_in">UndefVarError</span>: <span class="hljs-string">`I`</span> not defined
</code></pre>
<p>To avoid the error,
we could use <code>@everywhere using LinearAlgebra</code>
to load <code>LinearAlgebra</code> on all processes.</p>
</li>
<li><p>Similarly to the previous point,
functions defined on one process
are not available on other processes.
Prepend a function definition with <code>@everywhere</code>
to allow using the function on all processes:</p>
<pre><code class="lang-julia">julia&gt; <span class="hljs-meta">@everywhere</span> <span class="hljs-keyword">function</span> myadd(a, b)
           a + b
       <span class="hljs-keyword">end</span>;

julia&gt; myadd(<span class="hljs-number">1</span>, <span class="hljs-number">2</span>)
<span class="hljs-number">3</span>

<span class="hljs-comment"># This would error without `@everywhere` above.</span>
julia&gt; remotecall_fetch(myadd, <span class="hljs-number">2</span>, <span class="hljs-number">3</span>, <span class="hljs-number">4</span>)
<span class="hljs-number">7</span>
</code></pre>
</li>
<li><p>Global variables are not shared,
even if defined everywhere with <code>@everywhere</code>:</p>
<pre><code class="lang-julia">julia&gt; <span class="hljs-meta">@everywhere</span> x = [<span class="hljs-number">0</span>];

julia&gt; remotecall_fetch(<span class="hljs-number">2</span>) <span class="hljs-keyword">do</span>
           x[<span class="hljs-number">1</span>] = <span class="hljs-number">2</span>
       <span class="hljs-keyword">end</span>;

<span class="hljs-comment"># `x` was modified on worker 2.</span>
julia&gt; remotecall_fetch(() -&gt; x, <span class="hljs-number">2</span>)
<span class="hljs-number">1</span>-element <span class="hljs-built_in">Vector</span>{<span class="hljs-built_in">Int64</span>}:
 <span class="hljs-number">2</span>

<span class="hljs-comment"># `x` was not modified on worker 3.</span>
julia&gt; remotecall_fetch(() -&gt; x, <span class="hljs-number">3</span>)
<span class="hljs-number">1</span>-element <span class="hljs-built_in">Vector</span>{<span class="hljs-built_in">Int64</span>}:
 <span class="hljs-number">0</span>
</code></pre>
<p>If needed,
an array of data can be shared
across processes
by using a <code>SharedArray</code>,
provided by the SharedArrays standard library package:</p>
<pre><code class="lang-julia">julia&gt; <span class="hljs-meta">@everywhere</span> <span class="hljs-keyword">using</span> SharedArrays

<span class="hljs-comment"># We don't need `@everywhere` when defining a `SharedArray`.</span>
julia&gt; x = <span class="hljs-built_in">SharedArray</span>{<span class="hljs-built_in">Int</span>,<span class="hljs-number">1</span>}(<span class="hljs-number">1</span>)
<span class="hljs-number">1</span>-element <span class="hljs-built_in">SharedVector</span>{<span class="hljs-built_in">Int64</span>}:
 <span class="hljs-number">0</span>

julia&gt; remotecall_fetch(<span class="hljs-number">2</span>) <span class="hljs-keyword">do</span>
           x[<span class="hljs-number">1</span>] = <span class="hljs-number">2</span>
       <span class="hljs-keyword">end</span>;

julia&gt; remotecall_fetch(() -&gt; x, <span class="hljs-number">2</span>)
<span class="hljs-number">1</span>-element <span class="hljs-built_in">SharedVector</span>{<span class="hljs-built_in">Int64</span>}:
 <span class="hljs-number">2</span>

julia&gt; remotecall_fetch(() -&gt; x, <span class="hljs-number">3</span>)
<span class="hljs-number">1</span>-element <span class="hljs-built_in">SharedVector</span>{<span class="hljs-built_in">Int64</span>}:
 <span class="hljs-number">2</span>
</code></pre>
</li>
</ul>
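<p>To illustrate why a <code>SharedArray</code> is useful, here is a sketch in which each worker fills its own half of a shared vector (this assumes all processes run on the same machine, which <code>SharedArray</code> requires):</p>

```julia
using Distributed, SharedArrays
addprocs(2)
@everywhere using SharedArrays

s = SharedVector{Int}(10)   # shared across local processes, zero-initialized

# Each worker writes its own half; the writes are visible on every process.
for (w, r) in zip(workers(), [1:5, 6:10])
    remotecall_wait(w, s, r) do arr, range
        for i in range
            arr[i] = i^2
        end
    end
end

s == [i^2 for i in 1:10]  # true on the main process
```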
<p>Now, a note about command line arguments.
When adding worker processes with <code>-p</code>,
those processes are spawned
with the same command line arguments
as the main Julia process.
With <code>addprocs</code>, however,
each added process
is started with no command line arguments.
Below is an example of where this behavior
might cause some confusion:</p>
<pre><code>$ JULIA_NUM_THREADS=<span class="hljs-number">4</span> julia --banner=no -t <span class="hljs-number">1</span>
julia&gt; Threads.nthreads()
<span class="hljs-number">1</span>

julia&gt; using Distributed

julia&gt; addprocs(<span class="hljs-number">1</span>);

julia&gt; remotecall_fetch(Threads.nthreads, <span class="hljs-number">2</span>)
<span class="hljs-number">4</span>
</code></pre><p>In this situation, the environment variable <code>JULIA_NUM_THREADS</code> is set
(for example, because we normally run Julia with four threads).
But in this particular case
we want to run Julia with just one thread,
so we set <code>-t 1</code>.
Then we add a process,
but it turns out that process
has four threads, not one!
This is because the environment variable was set,
but no command line arguments were given
to the added process.
To use just one thread
for the added process,
we would need to use the <code>exeflags</code> keyword argument
to <code>addprocs</code>:</p>
<pre><code class="lang-julia">addprocs(<span class="hljs-number">1</span>; exeflags = [<span class="hljs-string">"-t"</span>, <span class="hljs-string">"1"</span>])
</code></pre>
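<p>We can then confirm the flag took effect by asking the new worker directly:</p>

```julia
using Distributed

# Pass `-t` and its value as separate flags to the worker process.
addprocs(1; exeflags = ["-t", "1"])

remotecall_fetch(Threads.nthreads, last(workers()))  # 1
```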
<p>As a final note, if needed,
processes can be removed
with <code>rmprocs</code>,
which removes the processes
associated with the provided worker ids.</p>
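<p>For example, removing all current workers returns us to a single process:</p>

```julia
using Distributed
addprocs(2)
nprocs()             # 3

rmprocs(workers())   # blocks until the listed workers are removed
nprocs()             # 1
```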
<h1 id="heading-summary">Summary</h1>
<p>In this post,
we have provided an introduction
to parallel processing in Julia.
We discussed the basics
of both multi-threading and distributed computing,
how to use them in Julia,
and some things to watch out for.</p>
<p>As a parting piece of advice,
when choosing whether to use multi-threading or distributed processing,
choose multi-threading
unless you have a specific need
for multiple processes with distinct memory spaces.
Multi-threading has lower overhead
and generally is easier to use.</p>
<p>How do you use parallel processing in your code?
Let us know in the comments below!</p>
<h1 id="heading-additional-links">Additional Links</h1>
<ul>
<li><a target="_blank" href="https://docs.julialang.org/en/v1/manual/multi-threading/">Multi-Threading</a><ul>
<li>Official Julia documentation on multi-threading.</li>
</ul>
</li>
<li><a target="_blank" href="https://docs.julialang.org/en/v1/manual/distributed-computing/">Multi-processing and Distributed Computing</a><ul>
<li>Official Julia documentation on distributed computing.</li>
</ul>
</li>
</ul>
<p>MATLAB is a registered trademark
of The MathWorks, Inc.</p>
]]></content:encoded></item></channel></rss>