# libmodal-flow

`libmodal-flow` is the visual frontend used by `open-vins-server` on VOXL for feature-point detection and sparse optical-flow tracking. It ingests grayscale camera frames (commonly from `tracking_*_misp_norm_ion` pipes), builds image pyramids on the GPU, detects keypoints, and tracks points frame-to-frame.
## Table of contents

- Public Header APIs
- Field Reference (Used in Detect + Track Examples)
  - `modal_flow::Camera`
  - `modal_flow::ImageDesc`, `modal_flow::ImageView`, and `modal_flow::Frame`
  - `modal_flow::DetectOptions`
  - `modal_flow::DetectInput`
  - `modal_flow::DetectResult`
  - `modal_flow::TrackInput`
  - `modal_flow::TrackOptions`
  - `modal_flow::TrackResult`
  - Point layout (`modal_flow::Keypoint`) and indexing
  - `ion_buf_release_msg_t` (manual ION release)
- Example Workflow: Detect + Track from ION pipes
  - 1) Create the flow manager, detector/tracker, and register cameras
  - 2) Subscribe to camera ION pipes
  - 3) In the ION callback: claim pyramid buffers and upload frames
  - 4) Run feature detection
  - 5) Run tracking between previous and current buffers
  - 6) Release pyramid buffers (`release_pyramid`) after detect/track use
  - 7) Release ION buffers when images are no longer needed
- Notes
## Public Header APIs

The current user-facing headers are in:

- `library/include/modal_flow/Types.hpp`
- `library/include/modal_flow/Manager.hpp`
- `library/include/modal_flow/Detector.hpp`
- `library/include/modal_flow/Tracker.hpp`
In the current examples, users instantiate the OpenCL backend directly via:

- `library/include/modal_flow/ocl/OclDevice.hpp`
- `library/include/modal_flow/ocl/ManagerCL.hpp`
- `library/include/modal_flow/ocl/DetectorCL.hpp`
- `library/include/modal_flow/ocl/TrackerCL.hpp`
### Core types

`Types.hpp` defines `Camera`, `ImageView`, `Frame`, `BufferId`, `Keypoint`, and related enums (`PixelFormat`, `ExternalType`, `Backend`).
### Manager operations

`Manager.hpp` exposes the camera and buffer lifecycle plus batched detection/tracking:

- `add_camera()` / `remove_camera()`
- `acquire_pyramid_buf()` / `release_pyramid()`
- `upload_frame_to_buf()`
- `detect_many()` and `track_many()`
### Detect and track IO

`Detector.hpp` defines:

- `DetectOptions`
- `DetectInput`
- `DetectResult`

`Tracker.hpp` defines:

- `TrackInput`
- `TrackResult`
## Field Reference (Used in Detect + Track Examples)

This section maps the commonly initialized structs to the field values used by the test apps.
### `modal_flow::Camera`

| Field | Meaning | Example value(s) |
|---|---|---|
| `id` | Logical camera identifier used throughout manager/detect/track calls | `0` (front), `1` (down) |
| `width` | Frame width in pixels | `1280` |
| `height` | Frame height in pixels | `800` |
| `format` | Camera pixel format | `modal_flow::PixelFormat::R8` |

`add_camera(cam, num_bufs)` also sets how many internal pyramid buffers are reserved per camera. The examples use `num_bufs = 3` (detect) and `num_bufs = 6` (track).
### `modal_flow::ImageDesc`, `modal_flow::ImageView`, and `modal_flow::Frame`

These wrap each incoming ION frame before upload.

| Type | Field | Meaning | Example value |
|---|---|---|---|
| `ImageDesc` | `width` | Image width | `data->img_meta.width` |
| `ImageDesc` | `height` | Image height | `data->img_meta.height` |
| `ImageDesc` | `format` | Pixel format | `modal_flow::PixelFormat::R8` |
| `ImageDesc` | `strideBytes` | Row stride in bytes | `data->img_meta.stride` |
| `ImageView` | `data` | CPU pointer input (unused when an external handle is used) | `nullptr` |
| `ImageView` | `handle_type` | External buffer handle type | `modal_flow::ExternalType::ClMem` |
| `ImageView` | `external_handle` | Opaque `cl_mem` handle value | cast of `cl_mem_out[ch]` |
| `Frame` | `cam` | Camera id associated with this frame | `(modal_flow::CameraId)ch` |
| `Frame` | `t` | Timestamp field (ns) | `0` in these tests |
| `Frame` | `img` | Image payload wrapper | `iv` |
### `modal_flow::DetectOptions`

| Field | Meaning | Value used in `modal-flow-test-detect` |
|---|---|---|
| `border_x` | Left/right image border ignored by detector | `3` |
| `border_y` | Top/bottom image border ignored by detector | `3` |
| `threshold` | FAST-like detection threshold | `35.f` |
| `use_grid_detect` | Enable grid-partitioned detection | `true` |
| `horizontal_grid_cells` | Number of grid columns | `5` |
| `vertical_grid_cells` | Number of grid rows | `5` |
| `grid_cells_to_search` | Explicit grid cells to search | all (row, col) pairs from (0,0) to (4,4) |
| `use_nms` | Enable non-max suppression | `true` |
### `modal_flow::DetectInput`

| Field | Meaning | Value set in example |
|---|---|---|
| `cam_id` | Camera id to run detect on | `i` |
| `img_buf` | Claimed pyramid buffer id for that camera | `g_img_buf[i]` |
| `opts` | Detect options for this camera | shared `opts` instance |
### `modal_flow::DetectResult`

`detect_many()` returns:

- `std::vector<modal_flow::DetectResult>` (one result per input camera)
- each `DetectResult` contains `keypoints` (`std::vector<modal_flow::Keypoint>`)
Minimal usage pattern:

```cpp
std::vector<modal_flow::DetectResult> results = mgr->detect_many(in);
const auto& cam0 = results[0];
for (const auto& kp : cam0.keypoints) {
    float x = kp.x;
    float y = kp.y;
    float score = kp.score;
    // use or filter keypoints here
}
```
Example shape of one camera result:

```cpp
// Pseudodata for illustration
modal_flow::DetectResult cam0_result;
cam0_result.keypoints.push_back(modal_flow::Keypoint{102.3f, 87.1f, 34.0f});
cam0_result.keypoints.push_back(modal_flow::Keypoint{245.8f, 121.6f, 41.0f});
cam0_result.keypoints.push_back(modal_flow::Keypoint{512.0f, 300.4f, 29.5f});
```
### `modal_flow::TrackInput`

| Field | Meaning | Value set in `modal-flow-test-track` |
|---|---|---|
| `prev_cam_id` | Camera id for previous frame | `i` |
| `next_cam_id` | Camera id for current frame | `i` |
| `prev_img_buf` | Buffer id containing prior frame pyramid | `prev_img_buf[i]` |
| `next_img_buf` | Buffer id containing current frame pyramid | `next_img_buf[i]` |
| `prev_points` | Input keypoints to propagate | `seed_pts` / current tracked points |
### `modal_flow::TrackOptions`

`TrackOptions` exists in the API and provides LK tuning parameters. The current track example uses the library defaults (it does not override them in `TrackInput`).

| Field | Meaning | Default value in header |
|---|---|---|
| `win` | LK tracking window size | `21` |
| `max_iters` | Max LK iterations per level | `30` |
| `epsilon` | Convergence tolerance | `0.01f` |
### `modal_flow::TrackResult`

| Field | Meaning | How example uses it |
|---|---|---|
| `next_points` | Tracked keypoints for next frame | updates `g_current_points` |
| `status` | Per-point validity flag (1 success, 0 lost) | success keeps trajectory; lost resets to origin |
| `error` | Per-point tracking error metric | returned but not thresholded in this test |
### Point layout (`modal_flow::Keypoint`) and indexing

The point type used by detect and track is:

| Field | Type | Meaning |
|---|---|---|
| `x` | `float` | X pixel coordinate |
| `y` | `float` | Y pixel coordinate |
| `score` | `float` | Detector/tracker score or confidence-like value |
Practical construction example:

```cpp
std::vector<modal_flow::Keypoint> seed_pts;
seed_pts.push_back(modal_flow::Keypoint{100.f, 100.f, 0.f});
seed_pts.push_back(modal_flow::Keypoint{320.f, 240.f, 0.f});
seed_pts.push_back(modal_flow::Keypoint{640.f, 400.f, 0.f});
```
Detect output shape:

- `DetectResult.keypoints` is `std::vector<modal_flow::Keypoint>`.
- Each entry is one detected feature point with `x`/`y`/`score`.

Track input/output shape:

- `TrackInput.prev_points` is the input point vector for a frame-to-frame track step.
- `TrackResult.next_points`, `TrackResult.status`, and `TrackResult.error` are aligned by index with `prev_points`.
Index alignment example:

```cpp
for (size_t i = 0; i < res.status.size(); ++i) {
    const auto& prev = seed_pts[i];        // input point i
    const auto& next = res.next_points[i]; // tracked output for point i
    uint8_t ok = res.status[i];            // 1=tracked, 0=lost
    float err = res.error[i];              // tracking error for point i
}
```
### `ion_buf_release_msg_t` (manual ION release)

When `CLIENT_FLAG_MANUAL_ION_BUF_RELEASE` is set, the detect test releases each consumed ION buffer by filling:

- `client_id = 0`
- `buffer_id = data->buffer_id`
- `generation = data->generation`

and sending that struct via `pipe_client_send_control_cmd_bytes(...)`.
## Example Workflow: Detect + Track from ION pipes

The code below is adapted directly from the `modal-flow-test-detect` and `modal-flow-test-track` test apps.
### 1) Create the flow manager, detector/tracker, and register cameras

```cpp
auto& dev = modal_flow::ocl::OclDevice::Instance();
mgr = new modal_flow::ocl::ManagerCL(dev);

auto det = std::make_unique<modal_flow::ocl::DetectorCL>(dev, /*nms_radius*/ 3);
mgr->set_detector(std::move(det));

auto trk = std::make_unique<modal_flow::ocl::TrackerCL>(dev);
mgr->set_tracker(std::move(trk));

modal_flow::Camera cam_front{.id = 0, .width = 1280, .height = 800,
                             .format = modal_flow::PixelFormat::R8};
int num_bufs = 3;
mgr->add_camera(cam_front, num_bufs);
```
### 2) Subscribe to camera ION pipes

```cpp
for (int ch = 0; ch < kNumCh; ++ch) {
    pipe_client_set_ion_buf_helper_cb(ch, _ion_buf_cb, nullptr);
    pipe_client_set_disconnect_cb(ch, _disconnect_cb, nullptr);
    pipe_client_open(
        ch,
        pipes[ch].c_str(),
        CLIENT_NAME,
        CLIENT_FLAG_EN_ION_BUF_HELPER | CLIENT_FLAG_MANUAL_ION_BUF_RELEASE,
        0);
}
```

Use `CLIENT_FLAG_MANUAL_ION_BUF_RELEASE` when your app will explicitly release buffers back to the producer.
### 3) In the ION callback: claim pyramid buffers and upload frames

```cpp
modal_flow::ImageDesc desc{data->img_meta.width, data->img_meta.height,
                           modal_flow::PixelFormat::R8, data->img_meta.stride};
modal_flow::ImageView iv{desc,
                         nullptr,
                         modal_flow::ExternalType::ClMem,
                         static_cast<uint64_t>(reinterpret_cast<std::uintptr_t>(cl_mem_out[ch]))};
modal_flow::Frame f{(modal_flow::CameraId)ch, 0, iv};

if (g_img_buf[ch]) {
    mgr->release_pyramid((modal_flow::CameraId)ch, g_img_buf[ch]);
    g_img_buf[ch] = 0;
}
g_img_buf[ch] = mgr->acquire_pyramid_buf((modal_flow::CameraId)ch);
mgr->upload_frame_to_buf(f, g_img_buf[ch]);
```

This is the claim/use/release cycle for internal pyramid buffers.
### 4) Run feature detection

```cpp
std::vector<modal_flow::DetectInput> in(kNumCh);

modal_flow::DetectOptions opts;
opts.border_x = 3;
opts.border_y = 3;
opts.threshold = 35.f;
opts.use_grid_detect = true;
opts.horizontal_grid_cells = 5;
opts.vertical_grid_cells = 5;
opts.use_nms = true;

for (int i = 0; i < kNumCh; i++) {
    in[i].cam_id = i;
    in[i].img_buf = g_img_buf[i];
    in[i].opts = opts;
}
auto res = mgr->detect_many(in);
```
### 5) Run tracking between previous and current buffers

```cpp
std::vector<modal_flow::TrackInput> inputs(kNumCh);
for (int i = 0; i < kNumCh; ++i) {
    inputs[i].prev_cam_id = i;
    inputs[i].next_cam_id = i;
    inputs[i].prev_img_buf = prev_img_buf[i];
    inputs[i].next_img_buf = next_img_buf[i];
    inputs[i].prev_points = seed_pts;
}
std::vector<modal_flow::TrackResult> results = mgr->track_many(inputs);
```
### 6) Release pyramid buffers (`release_pyramid`) after detect/track use

The API method is `release_pyramid(CameraId, BufferId)` (there is no `release_pyramid_buf` symbol in the header).

Recommended lifecycle:

- claim: `BufferId id = mgr->acquire_pyramid_buf(cam_id);`
- use: `mgr->upload_frame_to_buf(frame, id);` and run detect/track using that `id`
- release: `mgr->release_pyramid(cam_id, id);` when the buffer is no longer needed

Common patterns used in the examples:

- detect loop: release the previous frame's `g_img_buf[ch]` before acquiring the next one
- track loop: when `next_img_buf[ch]` becomes `prev_img_buf[ch]`, release the old previous buffer first
- shutdown: release any remaining non-zero `BufferId` values before exit
Minimal cleanup pattern:

```cpp
for (int ch = 0; ch < kNumCh; ++ch) {
    if (g_img_buf[ch]) {
        mgr->release_pyramid((modal_flow::CameraId)ch, g_img_buf[ch]);
        g_img_buf[ch] = 0;
    }
}
```
### 7) Release ION buffers when images are no longer needed

If manual release is enabled, return each consumed buffer to the producer:

```cpp
ion_buf_release_msg_t msg;
msg.client_id = 0;
msg.buffer_id = data->buffer_id;
msg.generation = data->generation;
int ret = pipe_client_send_control_cmd_bytes(ch, &msg, sizeof(ion_buf_release_msg_t));
```

Without this release step, camera producers can stall due to unreleased buffers.
## Notes

- `modal-flow-test-detect` demonstrates a detection-oriented pipeline and explicit ION buffer release.
- `modal-flow-test-track` demonstrates temporal tracking with `prev_img_buf`/`next_img_buf` and `TrackResult.status` handling.
- For clean shutdown, close all subscriptions with `pipe_client_close_all()`.