HAL: AXI-Lite MMIO Layer
The uca_hal_* function set is the AXI-Lite MMIO layer that sits
between the public C API (uca_*, see :doc:api) and the NPU hardware.
No code above this layer accesses physical addresses or register offsets
directly.
The implementation lives in
codes/v002/sw/driver/uCA_v1_hal.c / uCA_v1_hal.h.
HAL Position
The driver stack is organized into two layers.
| Layer | Symbol prefix | Role |
|---|---|---|
| Public API | uca_* | Compute and memory primitives. Assembles 64-bit VLIW instructions and passes them through the HAL. See :doc:api. |
| HAL | uca_hal_* | AXI-Lite register reads and writes, 64-bit instruction latching, status polling. Depends directly on KV260 bare-metal pointer MMIO. |
The HAL stores all state in a single file-scope singleton,
g_mmio_base (volatile uint32_t *). No context pointer is used;
a single process is expected to communicate with one NPU instance.
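As a rough illustration of the singleton design, the sketch below keeps one file-scope base pointer and indexes it by byte offset. The demo_* names and the bind helper are hypothetical and exist only so the example can run on a host against an ordinary array; the actual accessors are uca_hal_write32 and uca_hal_read32, whose exact signatures are not shown on this page.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* File-scope singleton: the only state the HAL keeps (name from the text). */
static volatile uint32_t *g_mmio_base = NULL;

/* Hypothetical helper for this sketch: point the HAL at a register window.
 * On the target this is the physical base; on a host it can be an array. */
static void demo_hal_bind(volatile uint32_t *base)
{
    g_mmio_base = base;
}

/* Write one 32-bit register; `offset` is a byte offset from the base. */
static void demo_hal_write32(uint32_t offset, uint32_t value)
{
    assert(g_mmio_base != NULL && (offset % 4u) == 0u);
    g_mmio_base[offset / 4u] = value;
}

/* Read one 32-bit register; `offset` is a byte offset from the base. */
static uint32_t demo_hal_read32(uint32_t offset)
{
    assert(g_mmio_base != NULL && (offset % 4u) == 0u);
    return g_mmio_base[offset / 4u];
}
```

The volatile qualifier keeps the compiler from caching or eliding register accesses, which matters as soon as the window is real hardware rather than an array.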
The KV260 bare-metal HAL source files (sw/driver/uCA_v1_hal.{c,h})
remain in the board integration repository,
pccxai/pccx-FPGA-NPU-LLM-kv260,
because they encode KV260 board-side bare-metal pointer MMIO. The
source files in that repository are authoritative; this page no
longer embeds the listings inline.
Register Map
The MMIO base address is UCA_MMIO_BASE_ADDR = 0xA0000000. This value
must match the AXI-Lite slave address assigned in the Vivado block
design.
(see the linked KV260 board integration repo above for the source listing)
| Name | Offset | Access | Description |
|---|---|---|---|
| | | Write | Lower 32 bits of the 64-bit VLIW instruction. Written first. |
| | | Write | Upper 32 bits of the 64-bit VLIW instruction. Writing this register triggers the NPU instruction latch. |
| UCA_REG_STATUS | | Read | NPU status register (read-only). Contains the UCA_STAT_BUSY and UCA_STAT_DONE bits. |
A 64-bit instruction is written as a pair, LO first, HI second. The HI write triggers the controller’s instruction latch.
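The LO-then-HI ordering can be sketched as follows. This is a host-runnable approximation, not the driver's code: the DEMO_IDX_* word indices are hypothetical stand-ins for the real register offsets (which live in the driver header), and regs is a plain array playing the role of the MMIO window.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical word indices for the LO/HI instruction registers; the real
 * offsets are defined in the driver header and are not reproduced here. */
enum { DEMO_IDX_INSTR_LO = 0, DEMO_IDX_INSTR_HI = 1 };

/* Split a 64-bit VLIW word and write it LO first, HI second.
 * `regs` stands in for the MMIO window. */
static void demo_issue_instr(volatile uint32_t *regs, uint64_t instr)
{
    assert(regs != NULL);
    regs[DEMO_IDX_INSTR_LO] = (uint32_t)(instr & 0xFFFFFFFFu);
    /* On hardware, this second write is the one that latches the pair. */
    regs[DEMO_IDX_INSTR_HI] = (uint32_t)(instr >> 32);
}
```

Writing HI last means the controller never latches a half-updated instruction, which is why the order must not be swapped.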
CMD_IN / STAT_OUT Mechanics
uca_hal_issue_instr submits a 64-bit instruction to the NPU’s CMD_IN
path by writing the register pair. The call returns immediately; the
NPU controller executes the instruction independently inside its
pipeline.
Status register UCA_REG_STATUS bit fields:
- UCA_STAT_BUSY (bit 0): NPU is executing an instruction. Do not issue a new instruction while this bit is set.
- UCA_STAT_DONE (bit 1): Last operation completed successfully.
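A minimal sketch of how these two bits might be decoded is given below. The mask values follow the bit positions stated above; the demo_* helper names are hypothetical and are not part of the HAL.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit positions as stated in the text: BUSY is bit 0, DONE is bit 1. */
#define UCA_STAT_BUSY (1u << 0)
#define UCA_STAT_DONE (1u << 1)

static_assert((UCA_STAT_BUSY & UCA_STAT_DONE) == 0u,
              "status bits must not overlap");

/* Hypothetical helper: true when a new instruction may be issued. */
static bool demo_can_issue(uint32_t status)
{
    return (status & UCA_STAT_BUSY) == 0u;
}

/* Hypothetical helper: true when the last operation completed successfully. */
static bool demo_last_op_done(uint32_t status)
{
    return (status & UCA_STAT_DONE) != 0u;
}
```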
Polling is performed by uca_hal_wait_idle. Because no hardware timer
driver is available yet on the bare-metal KV260, the current
implementation uses a busy-wait loop whose iteration count is
estimated from the 400 MHz core clock.
If timeout_us counts down to zero before the NPU goes idle,
uca_hal_wait_idle returns -1. The NPU is not forcibly reset on
timeout; the caller is responsible for error recovery.
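The polling behaviour described above can be approximated with the sketch below. It is not the driver's loop: the status read goes through a hypothetical callback so the code can run against a simulated NPU, and DEMO_POLLS_PER_US is a placeholder calibration constant rather than a measured value for the 400 MHz core.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define UCA_STAT_BUSY (1u << 0)

/* Assumption: how many status polls fit into one microsecond. This is a
 * placeholder and would need calibration on the real target. */
#define DEMO_POLLS_PER_US 4u

/* Hypothetical callback type so the loop can read a simulated register. */
typedef uint32_t (*demo_read_status_fn)(void *ctx);

/* Poll until BUSY clears. Returns 0 on idle, -1 once the budget runs out. */
static int demo_wait_idle(demo_read_status_fn read_status, void *ctx,
                          uint32_t timeout_us)
{
    assert(read_status != NULL);
    uint64_t budget = (uint64_t)timeout_us * DEMO_POLLS_PER_US;
    while (budget-- > 0u) {
        if ((read_status(ctx) & UCA_STAT_BUSY) == 0u) {
            return 0;
        }
    }
    return -1; /* NPU state is left untouched; recovery is the caller's job */
}

/* Simulated NPU: reports BUSY for *ctx more reads, then reports idle. */
static uint32_t demo_fake_status(void *ctx)
{
    uint32_t *remaining = (uint32_t *)ctx;
    if (*remaining > 0u) {
        (*remaining)--;
        return UCA_STAT_BUSY;
    }
    return 0u;
}
```

Because the budget is an iteration count rather than a timer reading, the effective timeout drifts with bus latency and compiler optimization level.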
uca_init Flow
uca_hal_init performs three operations in sequence.
1. Sets g_mmio_base to (volatile uint32_t *)UCA_MMIO_BASE_ADDR. Physical addresses are directly accessible in the KV260 bare-metal environment.
2. Calls uca_hal_read32(UCA_REG_STATUS) to read the status register.
3. If the return value is 0xFFFFFFFF, the AXI bus is not responding; returns -1. Otherwise returns 0.
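The three steps can be sketched as below, under some loudly stated assumptions: the base pointer is passed in as a parameter only so the example runs on a host (the text describes the real HAL casting UCA_MMIO_BASE_ADDR instead), and DEMO_IDX_STATUS is a hypothetical word index, not the actual UCA_REG_STATUS offset.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical word index of the status register within the window. */
enum { DEMO_IDX_STATUS = 2 };

static volatile uint32_t *g_mmio_base = NULL;

/* Mirrors the three steps: bind the base, read status, check for all-ones. */
static int demo_hal_init(volatile uint32_t *base)
{
    assert(base != NULL);
    g_mmio_base = base;                              /* step 1 */
    uint32_t status = g_mmio_base[DEMO_IDX_STATUS];  /* step 2 */
    return (status == 0xFFFFFFFFu) ? -1 : 0;         /* step 3 */
}
```

The all-ones check is only a liveness probe: it detects a read that returned 0xFFFFFFFF, not a misconfigured base address that happens to land on some other responsive slave.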
uca_hal_deinit sets g_mmio_base to NULL. Any subsequent
uca_hal_write32 or uca_hal_read32 call will dereference a null
pointer; the caller must ensure no HAL calls follow uca_hal_deinit.
See also
- Public API primitives: :doc:api
- AXI-Lite command path architecture: :doc:../Architecture/top_level
- ISA instruction encoding: :doc:../ISA/encoding
Last verified against
Commit 8c09e5e @ pccxai/pccx-FPGA-NPU-LLM-kv260 (2026-04-29)