# libmodal-flow

`libmodal-flow` is the visual frontend used by `open-vins-server` on VOXL for feature-point detection and sparse optical-flow tracking. It ingests grayscale camera frames (commonly from `tracking_*_misp_norm_ion` pipes), builds image pyramids on the GPU, detects keypoints, and tracks points frame-to-frame.
## Table of contents

- Public Header APIs
- Field Reference (Used in Detect + Track Examples)
  - `modal_flow::Camera`
  - `modal_flow::ImageDesc`, `modal_flow::ImageView`, and `modal_flow::Frame`
  - `modal_flow::DetectOptions`
  - `modal_flow::DetectInput`
  - `modal_flow::DetectResult`
  - `modal_flow::TrackInput`
  - `modal_flow::TrackOptions`
  - `modal_flow::TrackResult`
  - Point layout (`modal_flow::Keypoint`) and indexing
  - `ion_buf_release_msg_t` (manual ION release)
- Example Workflow: Detect + Track from ION pipes
  - 1) Create the flow manager, detector/tracker, and register cameras
  - 2) Subscribe to camera ION pipes
  - 3) In the ION callback: claim pyramid buffers and upload frames
  - 4) Run feature detection
  - 5) Run tracking between previous and current buffers
  - 6) Release pyramid buffers (`release_pyramid`) after detect/track use
  - 7) Release ION buffers when images are no longer needed
- Notes
## Public Header APIs

The current user-facing headers are in:

- `library/include/modal_flow/Types.hpp`
- `library/include/modal_flow/Manager.hpp`
- `library/include/modal_flow/Detector.hpp`
- `library/include/modal_flow/Tracker.hpp`
In the current examples, users instantiate the OpenCL backend directly via:

- `library/include/modal_flow/ocl/OclDevice.hpp`
- `library/include/modal_flow/ocl/ManagerCL.hpp`
- `library/include/modal_flow/ocl/DetectorCL.hpp`
- `library/include/modal_flow/ocl/TrackerCL.hpp`
### Core types

`Types.hpp` defines `Camera`, `ImageView`, `Frame`, `BufferId`, `Keypoint`, and related enums (`PixelFormat`, `ExternalType`, `Backend`).
### Manager operations

`Manager.hpp` exposes the camera and buffer lifecycle plus batched detection/tracking:

- `add_camera()` / `remove_camera()`
- `acquire_pyramid_buf()` / `release_pyramid()`
- `upload_frame_to_buf()`
- `detect_many()` and `track_many()`
### Detect and track IO

`Detector.hpp` defines:

- `DetectOptions`
- `DetectInput`
- `DetectResult`

`Tracker.hpp` defines:

- `TrackInput`
- `TrackResult`
## Field Reference (Used in Detect + Track Examples)

This section maps the commonly initialized structs to the field values used by the test apps.
### `modal_flow::Camera`

| Field | Meaning | Example value(s) |
|---|---|---|
| `id` | Logical camera identifier used throughout manager/detect/track calls | `0` (front), `1` (down) |
| `width` | Frame width in pixels | `1280` |
| `height` | Frame height in pixels | `800` |
| `format` | Camera pixel format | `modal_flow::PixelFormat::R8` |

`add_camera(cam, num_bufs)` also sets how many internal pyramid buffers are reserved per camera. The examples use `num_bufs = 3` (detect) and `num_bufs = 6` (track).
### `modal_flow::ImageDesc`, `modal_flow::ImageView`, and `modal_flow::Frame`

These wrap each incoming ION frame before upload.

| Type | Field | Meaning | Example value |
|---|---|---|---|
| `ImageDesc` | `width` | Image width | `data->img_meta.width` |
| `ImageDesc` | `height` | Image height | `data->img_meta.height` |
| `ImageDesc` | `format` | Pixel format | `modal_flow::PixelFormat::R8` |
| `ImageDesc` | `strideBytes` | Row stride in bytes | `data->img_meta.stride` |
| `ImageView` | `data` | CPU pointer input (unused when an external handle is used) | `nullptr` |
| `ImageView` | `handle_type` | External buffer handle type | `modal_flow::ExternalType::ClMem` |
| `ImageView` | `external_handle` | Opaque `cl_mem` handle value | cast of `cl_mem_out[ch]` |
| `Frame` | `cam` | Camera id associated with this frame | `(modal_flow::CameraId)ch` |
| `Frame` | `t` | Timestamp field (ns) | `0` in these tests |
| `Frame` | `img` | Image payload wrapper | `iv` |
### `modal_flow::DetectOptions`

| Field | Meaning | Value used in `modal-flow-test-detect` |
|---|---|---|
| `border_x` | Left/right image border ignored by detector | `3` |
| `border_y` | Top/bottom image border ignored by detector | `3` |
| `threshold` | FAST-like detection threshold | `35.f` |
| `use_grid_detect` | Enable grid-partitioned detection | `true` |
| `horizontal_grid_cells` | Number of grid columns | `5` |
| `vertical_grid_cells` | Number of grid rows | `5` |
| `grid_cells_to_search` | Explicit grid cells to search | all (row, col) pairs from (0,0) to (4,4) |
| `use_nms` | Enable non-max suppression | `true` |
### `modal_flow::DetectInput`

| Field | Meaning | Value set in example |
|---|---|---|
| `cam_id` | Camera id to run detect on | `i` |
| `img_buf` | Claimed pyramid buffer id for that camera | `g_img_buf[i]` |
| `opts` | Detect options for this camera | shared `opts` instance |
### `modal_flow::DetectResult`

`detect_many()` returns:

- `std::vector<modal_flow::DetectResult>` (one result per input camera)
- each `DetectResult` contains `keypoints` (`std::vector<modal_flow::Keypoint>`)
Minimal usage pattern:

```cpp
std::vector<modal_flow::DetectResult> results = mgr->detect_many(in);
const auto& cam0 = results[0];
for (const auto& kp : cam0.keypoints) {
    float x = kp.x;
    float y = kp.y;
    float score = kp.score;
    // use or filter keypoints here
}
```
Example shape of one camera result:

```cpp
// Pseudodata for illustration
modal_flow::DetectResult cam0_result;
cam0_result.keypoints.push_back(modal_flow::Keypoint{102.3f, 87.1f, 34.0f});
cam0_result.keypoints.push_back(modal_flow::Keypoint{245.8f, 121.6f, 41.0f});
cam0_result.keypoints.push_back(modal_flow::Keypoint{512.0f, 300.4f, 29.5f});
```
### `modal_flow::TrackInput`

| Field | Meaning | Value set in `modal-flow-test-track` |
|---|---|---|
| `prev_cam_id` | Camera id for previous frame | `i` |
| `next_cam_id` | Camera id for current frame | `i` |
| `prev_img_buf` | Buffer id containing prior frame pyramid | `prev_img_buf[i]` |
| `next_img_buf` | Buffer id containing current frame pyramid | `next_img_buf[i]` |
| `prev_points` | Input keypoints to propagate | `seed_pts` / current tracked points |
### `modal_flow::TrackOptions`

`TrackOptions` exists in the API and provides LK tuning parameters. The current track example uses the library defaults (it does not override them in `TrackInput`).

| Field | Meaning | Default value in header |
|---|---|---|
| `win` | LK tracking window size | `21` |
| `max_iters` | Max LK iterations per level | `30` |
| `epsilon` | Convergence tolerance | `0.01f` |
### `modal_flow::TrackResult`

| Field | Meaning | How example uses it |
|---|---|---|
| `next_points` | Tracked keypoints for next frame | updates `g_current_points` |
| `status` | Per-point validity flag (1 success, 0 lost) | success keeps trajectory; lost resets to origin |
| `error` | Per-point tracking error metric | returned but not thresholded in this test |
### Point layout (`modal_flow::Keypoint`) and indexing

The point type used by detect and track is:

| Field | Type | Meaning |
|---|---|---|
| `x` | `float` | X pixel coordinate |
| `y` | `float` | Y pixel coordinate |
| `score` | `float` | Detector/tracker score or confidence-like value |
Practical construction example:

```cpp
std::vector<modal_flow::Keypoint> seed_pts;
seed_pts.push_back(modal_flow::Keypoint{100.f, 100.f, 0.f});
seed_pts.push_back(modal_flow::Keypoint{320.f, 240.f, 0.f});
seed_pts.push_back(modal_flow::Keypoint{640.f, 400.f, 0.f});
```
Detect output shape:

- `DetectResult.keypoints` is `std::vector<modal_flow::Keypoint>`.
- Each entry is one detected feature point with `x`/`y`/`score`.

Track input/output shape:

- `TrackInput.prev_points` is the input point vector for a frame-to-frame track step.
- `TrackResult.next_points`, `TrackResult.status`, and `TrackResult.error` are aligned by index with `prev_points`.
Index alignment example:

```cpp
for (size_t i = 0; i < res.status.size(); ++i) {
    const auto& prev = seed_pts[i];        // input point i
    const auto& next = res.next_points[i]; // tracked output for point i
    uint8_t ok = res.status[i];            // 1=tracked, 0=lost
    float err = res.error[i];              // tracking error for point i
}
```
### `ion_buf_release_msg_t` (manual ION release)

When `CLIENT_FLAG_MANUAL_ION_BUF_RELEASE` is set, the detect test releases each consumed ION buffer by filling:

- `client_id = 0`
- `buffer_id = data->buffer_id`
- `generation = data->generation`

and sending that struct via `pipe_client_send_control_cmd_bytes(...)`.
## Example Workflow: Detect + Track from ION pipes

The code below is adapted directly from the `modal-flow-test-detect` and `modal-flow-test-track` test apps.
### 1) Create the flow manager, detector/tracker, and register cameras

```cpp
auto& dev = modal_flow::ocl::OclDevice::Instance();
mgr = new modal_flow::ocl::ManagerCL(dev);

auto det = std::make_unique<modal_flow::ocl::DetectorCL>(dev, /*nms_radius*/ 3);
mgr->set_detector(std::move(det));

auto trk = std::make_unique<modal_flow::ocl::TrackerCL>(dev);
mgr->set_tracker(std::move(trk));

modal_flow::Camera cam_front{.id = 0, .width = 1280, .height = 800,
                             .format = modal_flow::PixelFormat::R8};
int num_bufs = 3;
mgr->add_camera(cam_front, num_bufs);
```
### 2) Subscribe to camera ION pipes

```cpp
for (int ch = 0; ch < kNumCh; ++ch) {
    pipe_client_set_ion_buf_helper_cb(ch, _ion_buf_cb, nullptr);
    pipe_client_set_disconnect_cb(ch, _disconnect_cb, nullptr);
    pipe_client_open(
        ch,
        pipes[ch].c_str(),
        CLIENT_NAME,
        CLIENT_FLAG_EN_ION_BUF_HELPER | CLIENT_FLAG_MANUAL_ION_BUF_RELEASE,
        0);
}
```

Use `CLIENT_FLAG_MANUAL_ION_BUF_RELEASE` when your app will explicitly release buffers back to the producer.
### 3) In the ION callback: claim pyramid buffers and upload frames

```cpp
modal_flow::ImageDesc desc{data->img_meta.width, data->img_meta.height,
                           modal_flow::PixelFormat::R8, data->img_meta.stride};
modal_flow::ImageView iv{desc,
                         nullptr,
                         modal_flow::ExternalType::ClMem,
                         static_cast<uint64_t>(reinterpret_cast<std::uintptr_t>(cl_mem_out[ch]))};
modal_flow::Frame f{(modal_flow::CameraId)ch, 0, iv};

if (g_img_buf[ch]) {
    mgr->release_pyramid((modal_flow::CameraId)ch, g_img_buf[ch]);
    g_img_buf[ch] = 0;
}
g_img_buf[ch] = mgr->acquire_pyramid_buf((modal_flow::CameraId)ch);
mgr->upload_frame_to_buf(f, g_img_buf[ch]);
```

This is the claim/use/release cycle for internal pyramid buffers.
### 4) Run feature detection

```cpp
std::vector<modal_flow::DetectInput> in(kNumCh);

modal_flow::DetectOptions opts;
opts.border_x = 3;
opts.border_y = 3;
opts.threshold = 35.f;
opts.use_grid_detect = true;
opts.horizontal_grid_cells = 5;
opts.vertical_grid_cells = 5;
opts.use_nms = true;

for (int i = 0; i < kNumCh; i++) {
    in[i].cam_id = i;
    in[i].img_buf = g_img_buf[i];
    in[i].opts = opts;
}
auto res = mgr->detect_many(in);
```
### 5) Run tracking between previous and current buffers

```cpp
std::vector<modal_flow::TrackInput> inputs(kNumCh);
for (int i = 0; i < kNumCh; ++i) {
    inputs[i].prev_cam_id = i;
    inputs[i].next_cam_id = i;
    inputs[i].prev_img_buf = prev_img_buf[i];
    inputs[i].next_img_buf = next_img_buf[i];
    inputs[i].prev_points = seed_pts;
}
std::vector<modal_flow::TrackResult> results = mgr->track_many(inputs);
```
### 6) Release pyramid buffers (`release_pyramid`) after detect/track use

The API method is `release_pyramid(CameraId, BufferId)` (there is no `release_pyramid_buf` symbol in the header).

Recommended lifecycle:

- claim: `BufferId id = mgr->acquire_pyramid_buf(cam_id);`
- use: `mgr->upload_frame_to_buf(frame, id);` and run detect/track using that `id`
- release: `mgr->release_pyramid(cam_id, id);` when the buffer is no longer needed

Common patterns used in the examples:

- detect loop: release the previous frame's `g_img_buf[ch]` before acquiring the next one
- track loop: when `next_img_buf[ch]` becomes `prev_img_buf[ch]`, release the old previous buffer first
- shutdown: release any remaining non-zero `BufferId` values before exit
Minimal cleanup pattern:

```cpp
for (int ch = 0; ch < kNumCh; ++ch) {
    if (g_img_buf[ch]) {
        mgr->release_pyramid((modal_flow::CameraId)ch, g_img_buf[ch]);
        g_img_buf[ch] = 0;
    }
}
```
### 7) Release ION buffers when images are no longer needed

If manual release is enabled, return each consumed buffer to the producer:

```cpp
ion_buf_release_msg_t msg;
msg.client_id = 0;
msg.buffer_id = data->buffer_id;
msg.generation = data->generation;
int ret = pipe_client_send_control_cmd_bytes(ch, &msg, sizeof(ion_buf_release_msg_t));
```

Without this release step, camera producers can stall due to unreleased buffers.
## Notes

- `modal-flow-test-detect` demonstrates a detection-oriented pipeline and explicit ION buffer release.
- `modal-flow-test-track` demonstrates temporal tracking with `prev_img_buf`/`next_img_buf` and `TrackResult.status` handling.
- For clean shutdown, close all subscriptions with `pipe_client_close_all()`.