Systolic Gauss-Jordan
This folder contains the current RTL for the systolic Gauss-Jordan path in
rtl/systolic_gauss_jordan/.
Role
This subsystem implements the available GF(2) elimination kernel introduced on
the project homepage. It streams rows of the augmented system [A | B] through
a lifted trapezoidal mesh whose processing elements perform pivot detection,
row updates, and reduction using only local state and nearest-neighbor
communication.
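As a point of comparison for the mesh, the same reduction can be written as a small software model. The sketch below is not taken from the RTL or the cocotb suite; the function name gf2_gauss_jordan and its arguments are hypothetical, and it uses explicit row swaps, which the systolic array may well handle differently. It only shows the three steps named above (pivot detection, row updates, reduction) on a row-major [A | B] over GF(2), where addition is XOR.

```python
def gf2_gauss_jordan(aug, n_cols_a):
    """Reduce [A | B] over GF(2) to reduced row echelon form.

    aug is a list of rows of 0/1 ints; the first n_cols_a entries of each
    row belong to A and the rest to B. Returns (reduced_rows, pivot_columns).
    Hypothetical software model, not the RTL dataflow.
    """
    rows = [row[:] for row in aug]  # work on a copy
    pivots = []
    pivot_row = 0
    for col in range(n_cols_a):
        # Pivot detection: first row at or below pivot_row with a 1 in this column.
        src = next((r for r in range(pivot_row, len(rows)) if rows[r][col]), None)
        if src is None:
            continue  # no pivot in this column
        rows[pivot_row], rows[src] = rows[src], rows[pivot_row]
        # Row update and reduction: XOR the pivot row into every other row
        # that still has a 1 in this column (above and below the pivot).
        for r in range(len(rows)):
            if r != pivot_row and rows[r][col]:
                rows[r] = [a ^ b for a, b in zip(rows[r], rows[pivot_row])]
        pivots.append(col)
        pivot_row += 1
        if pivot_row == len(rows):
            break
    return rows, pivots


# A x = B with x = [1, 0, 1]; the last entry of each row is B.
reduced, pivots = gf2_gauss_jordan(
    [[1, 1, 0, 1],
     [0, 1, 1, 1],
     [0, 0, 1, 1]],
    n_cols_a=3,
)
solution = [row[-1] for row in reduced]  # [1, 0, 1]
```

A model like this can serve as a golden reference when checking a trace from the array, provided the trace is first mapped back to row-major form.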
Available surface
The reference top solves A x = B by streaming rows of [A | B] into the
mesh. The only architectural output exposed by the reference top is the bottom
boundary trace data_bottom_o.
Why systolic
The design uses a systolic organization because the target computation is dominated by regular GF(2) row operations with predictable communication patterns. This makes the array suitable for FPGA-oriented studies in which throughput, locality, and control simplicity matter as much as arithmetic count.
RTL overview
Row r begins at global column r. The first active cell on that row is a
pe_diag instance, and every active cell to its right is a pe_col instance.
Data enters from the top edge and moves downward within a global column.
Opcodes enter each row at the diagonal cell and move rightward across the active region.
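The placement rule above can be sketched in a few lines of Python. This is an illustration only: the names mesh_layout, inputs_of, and render and the parameters n_rows and n_cols are hypothetical, and the sketch assumes every row extends to the same right-hand column, which the text does not state explicitly.

```python
def mesh_layout(n_rows, n_cols):
    """Map (row, col) to the cell type of each active cell.

    Row r starts at global column r: the first active cell is pe_diag,
    every active cell to its right is pe_col. Assumes all rows end at
    column n_cols - 1 (an assumption, not confirmed by the RTL).
    """
    layout = {}
    for r in range(n_rows):
        for c in range(r, n_cols):
            layout[(r, c)] = "pe_diag" if c == r else "pe_col"
    return layout


def inputs_of(r, c):
    """Return (data_source, opcode_source) for the active cell at (r, c)."""
    # Data moves downward within a global column.
    data_from = (r - 1, c) if r > 0 else "top edge"
    # Opcodes enter at the diagonal cell and move rightward.
    opcode_from = (r, c - 1) if c > r else None
    return data_from, opcode_from


def render(n_rows, n_cols):
    """Draw the mesh: D = pe_diag, C = pe_col, . = no cell."""
    layout = mesh_layout(n_rows, n_cols)
    symbol = {"pe_diag": "D", "pe_col": "C", None: "."}
    return "\n".join(
        " ".join(symbol[layout.get((r, c))] for c in range(n_cols))
        for r in range(n_rows)
    )


print(render(3, 5))
# D C C C C
# . D C C C
# . . D C C
```

Each row is one cell shorter than the row above it, which gives the mesh its trapezoidal outline.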
The available path is intentionally small:
- input.sv reads A/B rows, applies the stagger schedule, and feeds the mesh
- controller.sv owns the run window and reduce pulse
- pe_diag.sv, pe_col.sv, and trapeziod_mesh.sv implement the array
- mem.sv, delay_line.sv, and gj_pkg.sv provide the supporting primitives
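The exact stagger schedule applied by input.sv is not spelled out here. A common choice in systolic designs is a one-cycle skew per column, so that column c of a row enters the mesh c cycles after column 0. The sketch below illustrates that skew under that assumption; the function name stagger and the use of None for idle slots are hypothetical and may not match the RTL.

```python
def stagger(rows, fill=None):
    """Skew rows so column c of each row is presented c cycles late.

    rows is a list of equal-length rows of [A | B]. Returns one list per
    cycle holding the value presented to each column on that cycle, with
    `fill` marking idle slots. Assumed schedule, not taken from input.sv.
    """
    n_cols = len(rows[0])
    n_cycles = len(rows) + n_cols - 1
    schedule = [[fill] * n_cols for _ in range(n_cycles)]
    for t, row in enumerate(rows):
        for c, bit in enumerate(row):
            # Row t starts at cycle t in column 0 and slips one cycle per column.
            schedule[t + c][c] = bit
    return schedule


for cycle, values in enumerate(stagger([[1, 0, 1], [0, 1, 1]])):
    print(cycle, values)
# 0 [1, None, None]
# 1 [0, 0, None]
# 2 [None, 1, 1]
# 3 [None, None, 1]
```

Under this assumed schedule, streaming R rows of width W takes R + W - 1 ingress cycles.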
Resource scaling
Because the architecture is regular and parameterized, its resource cost can be studied as a function of matrix dimensions, mesh shape, and scheduling strategy. This makes the module useful both as an implementation vehicle and as a platform for exploring hardware-performance tradeoffs in decoder design.
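As a first-order illustration of that scaling, the number of cells can be counted directly from the mesh shape. The sketch below assumes a mesh with n_rows rows in which row r spans columns r through n_cols - 1; the function name cell_counts and both parameters are hypothetical. It counts cell instances only and says nothing about LUT, flip-flop, or routing cost per cell.

```python
def cell_counts(n_rows, n_cols):
    """Count pe_diag and pe_col instances in an assumed trapezoidal mesh.

    Assumes row r occupies columns r .. n_cols - 1, so n_cols must be at
    least n_rows. Hypothetical helper, not part of the RTL.
    """
    if n_cols < n_rows:
        raise ValueError("n_cols must be >= n_rows for this mesh shape")
    diag = n_rows  # one pe_diag per row
    col = sum(n_cols - 1 - r for r in range(n_rows))  # cells right of each diagonal
    return {"pe_diag": diag, "pe_col": col, "total": diag + col}


# Square A with a single right-hand-side column: n rows, n + 1 columns.
for n in (4, 8, 16):
    print(n, cell_counts(n, n + 1))
```

Under these assumptions the total equals n_rows * n_cols - n_rows * (n_rows - 1) / 2, so for a square A the cell count grows quadratically with the matrix dimension.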
Module map
| Source file | Documentation page | Role in the subsystem |
|---|---|---|
| | | Canonical home for opcode definitions and shared data types |
| | | Small reusable timing primitive used by feeders and wrappers |
| | | Simple synchronous storage block for experiments and examples |
| | | Collapsed reference feeder from A/B RAMs to the staggered mesh ingress |
| | | Minimal reference top that exposes only the bottom architectural trace |
| | | Left-edge pivot cell that generates or forwards row control |
| | | Interior/right processing cell that applies the diagonal opcode |
| | | Structural mesh tying the lifted array together |
How to test this now
The matching cocotb suite is documented in Test: Systolic Gauss-Jordan. The fastest useful commands are:
make -C test/systolic_gauss_jordan TEST=pe_diag
make -C test/systolic_gauss_jordan TEST=pe_col
make -C test/systolic_gauss_jordan TEST=trapeziod_mesh
make -C test/systolic_gauss_jordan TEST=trapeziod_full_trace_reduce