Callisto.jl

Introduction

Callisto.jl is a simulator for bittide, a system architecture for synchronous distributed computing. A detailed description of the bittide mechanism is available in the papers:

S. Lall, C. Cascaval, M. Izzard and T. Spalink. Modeling and Control of bittide Synchronization. Proceedings of the American Control Conference, 2022. arXiv:2109.14111
S. Lall, C. Cascaval, M. Izzard and T. Spalink. Resistance Distance and Control Performance for bittide Synchronization. Proceedings of the European Control Conference, 2022. arXiv:2111.05296

The model and algorithm used by Callisto closely follow those described in the first of these papers.

Installation

Start Julia, and at the REPL prompt install Callisto as follows.

julia> using Pkg
julia> Pkg.add(url="https://github.com/bittide/Callisto.jl")

Quickstart

Now you can run a simple simulation:

julia> using Callisto
julia> c = CalOpts()
julia> x = callisto(c)

You can view the output:

julia> xc = parse_callisto_logx(c, x...)
julia> using PlotKit
julia> pzip(a) = Point.(zip(a.x, a.y))
julia> save(drawplot(pzip.(xc.freq)), "frequency.png")

This will save a plot of the frequency against time in the file frequency.png in the Julia working directory.

Common options

julia> c = CalOpts(; kwargs...)

CalOpts takes a number of optional keyword arguments. These include:

topology: a tuple whose first entry is the name of the topology, as a string, and whose remaining entries are numerical parameters. For example:
- ("triangle",)
- ("mesh", 3, 4)
- ("line", 6)
kp: proportional gain
ki: integral gain
latency: either a number, in which case it is the latency of every link, or a list, whose ith entry is the latency of the ith link
tmax: simulation end time
poll_period: the delay, in localticks, between successive control updates
control_delay: the delay, in localticks, at a node between making a measurement and the frequency update
errors: a list of the uncontrolled frequencies of the oscillators
base_freq: used to scale the integrator in the controller. See the example below.
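Two of the option formats above can be illustrated in plain Julia. The sketch below does not call Callisto. The helper expand_latency and the link count of 4 are hypothetical, introduced only to show the "one number for every link, or one entry per link" convention; Callisto's internal handling may differ. The first two lines show a general Julia fact: parentheses alone do not create a tuple, so a one-element tuple needs a trailing comma.

```julia
# Parentheses around a single value do not make a tuple in Julia.
is_tuple_with_comma = ("triangle",) isa Tuple      # one-element tuple
is_tuple_without    = ("triangle") isa Tuple       # this is just a String

# Hypothetical helper, not part of Callisto: expand a latency option
# to one value per link.
expand_latency(latency::Real, num_links::Integer) = fill(float(latency), num_links)

function expand_latency(latency::AbstractVector{<:Real}, num_links::Integer)
    length(latency) == num_links ||
        throw(ArgumentError("expected $num_links latencies, got $(length(latency))"))
    return float.(latency)
end

uniform  = expand_latency(200, 4)                   # same latency on every link
per_link = expand_latency([200, 250, 180, 220], 4)  # ith entry belongs to the ith link
```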

Example

The following is a more complete example. Here we set the uncontrolled frequencies of the system to be approximately 200 MHz, with a maximum error of 100 ppm.

We also run the simulation in units of nanoseconds, so 200 MHz corresponds to a frequency of 0.2. The sampling period of the controller is given by poll_period, in units of localticks; because sampling is driven by the local clock, it can only be set in localticks. The variable typical_freq holds the approximate frequency of the oscillators. Here we scale poll_period by typical_freq so that the sampling period is about 1e5 nanoseconds.
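The unit bookkeeping can be checked with a few lines of arithmetic. The sketch below is plain Julia and does not call Callisto; the names target_period_ns, poll_period_ticks and shortest_period_ns are introduced here only for illustration.

```julia
# Time is measured in nanoseconds, so a frequency in GHz is "localticks per ns".
typical_freq = 0.2            # 200 MHz = 0.2 GHz = 0.2 localticks per ns
target_period_ns = 1e5        # desired time between control updates, in ns

# poll_period must be given in localticks: ticks = (ticks per ns) * ns
poll_period_ticks = target_period_ns * typical_freq

# Converting back: ns = ticks / (ticks per ns)
period_ns = poll_period_ticks / typical_freq

# An oscillator running 100 ppm fast reaches the same tick count slightly sooner,
# so its real sampling period is shorter by roughly 100 ppm.
max_ppm = 100
fastest_freq = typical_freq * (1 + max_ppm / 1e6)
shortest_period_ns = poll_period_ticks / fastest_freq
```

The same tick count therefore corresponds to slightly different wall-clock periods at different nodes, which is why the period is described as "about" 1e5 nanoseconds.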

The base_freq option to CalOpts is used by the controller. Because we are using a PI controller, setting the base_freq is important. In the PI controller, the integral is approximated by a running sum scaled by 1/base_freq. We set base_freq equal to typical_freq for this approximation.
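To make the role of base_freq concrete, here is a minimal sketch of a discrete-time PI update. It is not Callisto's implementation: the type SketchPI, the function update!, the time step poll_period / base_freq, and the order of the sum and output steps are all assumptions chosen to match the description above, where the integral is replaced by a running sum scaled by 1/base_freq.

```julia
# Illustrative discrete-time PI controller; NOT Callisto's actual code.
mutable struct SketchPI
    kp::Float64           # proportional gain
    ki::Float64           # integral gain
    poll_period::Float64  # localticks between control updates
    base_freq::Float64    # nominal localticks per unit time
    xi::Float64           # running approximation of the integral
end

# Start with an empty integrator.
SketchPI(kp, ki, poll_period, base_freq) = SketchPI(kp, ki, poll_period, base_freq, 0.0)

function update!(ctrl::SketchPI, measurement)
    # Assumed time step: localticks divided by (localticks per unit time).
    dt = ctrl.poll_period / ctrl.base_freq
    ctrl.xi += measurement * dt                        # running sum approximates the integral
    return ctrl.kp * measurement + ctrl.ki * ctrl.xi   # PI control output
end

ctrl = SketchPI(2e-8, 1e-15, 2e4, 0.2)
first_output  = update!(ctrl, 3.0)
second_output = update!(ctrl, 3.0)   # larger, because the integral term has grown
```

Under these assumptions, a base_freq that is far from the real oscillator frequency rescales the effective integral gain, which is why the example sets base_freq equal to typical_freq.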

module Run

using Callisto
using PlotKit
using Random

plotpath(x) = joinpath(homedir(), "plots", x)
pzip(a) = Point.(a)
tzip(a::Series) = pzip(tuples(a))
tzip(a::Array)  = pzip.(tuples.(a))
plot(d, f; kw...) = save(drawplot(tzip(d); kw...), plotpath(f), 4)

function main()
    tmax = 2e8
    num_nodes = 6
    typical_freq = 0.2 # GHz
    max_ppm = 100
    Random.seed!(1)
    freqs = typical_freq * (1 .+ rand(-max_ppm:max_ppm, num_nodes) / 1e6)
    errors = [Error(a) for a in freqs]
    c = CalOpts(; topology=("mesh", 3, 2),
                ki = 1e-15,
                kp = 2e-8,
                latency = 200,
                control_delay = 10,
                poll_period = 1e5 * typical_freq,
                base_freq = typical_freq,
                tmax,
                errors)
    x = callisto(c)
    make_plots(c, x)
end


function make_plots(c, x)
    xc = parse_callisto_logx(c, x...)
    
    plot(xc.freq, "ex_frequency.png")
    plot(xc.freq, "ex_frequency_detail.png"; xmin = 0, xmax = 1e7, xidealnumlabels = 6)
    plot(xc.mocc, "ex_measured_occupancy.png")

    fc = focused_callisto_info(c, xc, 180e6, 180e6 + 50)
    plot(fc.occ, "ex_occupancy_detail.png"; xidealnumlabels = 6)
end

end

Output

Calling callisto returns two values: a Log object, and theta, a list of PiecewiseLinear objects, one for each node. There are two convenience functions for parsing this data. The first is parse_callisto_logx, used as follows.

julia> c = CalOpts()
julia> simlog, theta = callisto(c)
julia> xc = parse_callisto_logx(c, simlog, theta)
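The Quickstart passes the result as x... while the lines above name the two return values first. The two forms are equivalent, as the sketch below shows. The functions fake_sim and fake_parse are stand-ins invented for this illustration; they are not Callisto functions.

```julia
# Stand-ins for illustration only; these are not Callisto functions.
fake_sim(opts) = ("log for $(opts)", [1.0, 2.0])   # returns a 2-tuple, like (simlog, theta)
fake_parse(opts, simlog, theta) = (; simlog, theta, n = length(theta))

opts = "demo"

x = fake_sim(opts)
a = fake_parse(opts, x...)          # splat the tuple into two arguments

simlog, theta = fake_sim(opts)      # or destructure the tuple first
b = fake_parse(opts, simlog, theta)
```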

The return value is a named tuple, containing fields:

simlog: the Log object
theta: the phase of each node
freq: the frequency of each node
meas: the measurement at each node
afreq: the adjusted frequency at each node
xi: the first controller state at each node
mocc: the measured buffer occupancy at each edge

Each of these is a list of quantities, which are either PiecewiseLinear, PiecewiseConstant or Samples objects. One important point is that mocc contains only samples of the occupancy at the sample times, that is, every poll_period localticks. The buffer occupancy itself also changes between these sample instants. To see the buffer occupancy at all times, use focused_callisto_info, as follows:

julia> tmin = 180e6
julia> tmax = tmin + 50
julia> fc = focused_callisto_info(c, xc, tmin, tmax)

The return value here is a named tuple, with two fields:

occ: the occupancy of each elastic buffer
adj_theta: the adjusted theta values

The field occ is PiecewiseConstant, containing the exact buffer occupancy of the elastic buffer, including every frame arrival and departure. As a result, it can be very large if tmax - tmin is large.
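The difference between sampled and exact occupancy can be seen in a toy example. The sketch below is plain Julia with made-up arrival and departure times; it does not use Callisto's data structures, and the names occ_at and estimated_changes are hypothetical. The size estimate assumes roughly one frame arrival and one departure per localtick.

```julia
# Illustrative only: a toy elastic buffer, not Callisto's data structures.
# Frames arrive (+1) and depart (-1) at the given times, in ns.
arrivals   = [1.0, 6.0, 11.0, 16.0, 21.0]
departures = [3.0, 8.0, 13.0, 18.0, 23.0]
initial_occupancy = 2

events = sort(vcat([(t, +1) for t in arrivals], [(t, -1) for t in departures]))
times = first.(events)
# occupancy[k] holds from times[k] until the next event
occupancy = initial_occupancy .+ cumsum(last.(events))

# Piecewise-constant lookup: value set by the most recent event at or before t
function occ_at(t)
    k = searchsortedlast(times, t)
    return k == 0 ? initial_occupancy : occupancy[k]
end

# Sampling every 10 ns sees a constant value, although the exact occupancy moves
sampled = occ_at.(0.0:10.0:20.0)
exact_range = extrema(occupancy)

# Rough number of occupancy changes in a window, per buffer (assumption: one
# arrival and one departure per localtick)
estimated_changes(freq, window) = 2 * freq * window
small_window = estimated_changes(0.2, 50)     # the 50 ns window used above
full_run     = estimated_changes(0.2, 2e8)    # the whole example run
```

Here every sample happens to land at the same point of the arrival and departure cycle, so the sampled values are all equal while the exact occupancy alternates between two levels. The estimate also suggests why a short window is used for the detailed plot.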

Plotting

The PiecewiseLinear, PiecewiseConstant and Samples objects each have a method tuples, which converts the object into a list of tuples (x,y) suitable for passing to your favorite plotting package. For example:

plottable_data = tuples.(xc.freq)

where we have used broadcast . notation to apply the function tuples to each entry of xc.freq.
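The broadcast pattern can be tried without Callisto installed. In the sketch below, DemoSeries and demo_tuples are hypothetical stand-ins for Callisto's series types and its tuples method; they assume only that a series carries x and y vectors, as the pzip helper in the Quickstart suggests.

```julia
# Hypothetical stand-in type with x and y fields; Callisto's own types are not used here.
struct DemoSeries
    x::Vector{Float64}
    y::Vector{Float64}
end

# Stand-in for the tuples method: pair up x and y values
demo_tuples(s::DemoSeries) = collect(zip(s.x, s.y))

# One series per node, as in xc.freq
series = [DemoSeries([0.0, 1.0], [0.2, 0.21]),
          DemoSeries([0.0, 1.0], [0.2, 0.19])]

# The dot applies demo_tuples to each entry, giving one list of (x, y) tuples per node
plottable = demo_tuples.(series)
```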