
VOXL Camera Server



VOXL Camera Server publishes MIPI Camera images and video to MPA pipes to allow multiple clients to utilize each camera. It is one of many MPA services that publish image data. See the previous page to read more about the MPA camera interface.

Example services that are clients for voxl-camera-server include voxl-qvio-server, voxl-portal, ROS, and voxl-streamer.

VOXL Camera Server uses Google’s HAL3 API to access the Qualcomm ISP pipeline and the OMX API for real-time video compression.

Hardware Configuration

MIPI camera interfaces and drivers are very complicated, requiring careful schematic and PCB design to connect. Driver bringup requires deep knowledge of the Qualcomm camera stack and kernel. These are not USB webcams and cannot be plugged in willy-nilly to whatever port you fancy.

ModalAI provides a set of supported camera configurations to cover a range of use cases for the VOXL 1, VOXL 2, and VOXL 2 Mini.

QRB5165-based platforms (VOXL 2 and VOXL 2 Mini) are more flexible, but it would be impossible for us to support the thousands of possible camera configurations. Therefore we suggest starting with a supported camera configuration, seeing how the qrb5165-configure-cameras tool sets up the drivers and voxl-camera-server config file, then modifying it to suit your use case from there.

Only connect and disconnect cameras while VOXL is powered off!!

The following sensors are supported:

“ov7251”, “ov9782”, “imx214”, “imx412”, “imx678”, “pmd-tof”

Software configuration

voxl-configure-cameras is a top-level helper script used to set up camera sensor drivers and the /etc/modalai/voxl-camera-server.conf config file. It writes default settings for the selected camera configuration as a starting point for further customization if necessary. If you want to change some aspect of the camera server behavior, such as disabling a camera or changing video encoding settings, you can do so by modifying that file and restarting the camera server with systemctl restart voxl-camera-server.
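As a minimal sketch of that edit-and-restart workflow, the Python below flips a camera's enabled flag in an already-parsed config. The helper name set_camera_enabled is hypothetical and not part of any ModalAI tool. It assumes the file parses as plain JSON with a top-level "cameras" array; if your file starts with a comment header, strip that before parsing.

```python
import json


def set_camera_enabled(config: dict, name: str, enabled: bool) -> dict:
    """Return a copy of the config with the named camera enabled or disabled."""
    # Round-trip through JSON text as a simple deep copy of JSON-only data.
    updated = json.loads(json.dumps(config))
    for cam in updated.get("cameras", []):
        if cam.get("name") == name:
            cam["enabled"] = enabled
            return updated
    raise KeyError(f"no camera named {name!r} in config")


# Typical use on the VOXL itself (assumption: run with write access to the file):
#   with open("/etc/modalai/voxl-camera-server.conf") as f:
#       cfg = json.load(f)
#   cfg = set_camera_enabled(cfg, "tof", False)
#   with open("/etc/modalai/voxl-camera-server.conf", "w") as f:
#       json.dump(cfg, f, indent=4)
#   then restart: systemctl restart voxl-camera-server
```

Returning a copy rather than editing in place makes it easy to compare the old and new settings before writing anything back to disk.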

HAL3 Streams

HAL3 can provide multiple simultaneous image streams with different formats and resolutions. On APQ8096 (VOXL1) the upper limit is 4 simultaneous streams. On QRB5165 (VOXL2) you can only have 3 simultaneous streams enabled.

voxl-camera-server’s config file is set up to configure 4 different streams with specific intended use cases. The streams are not activated until a client is connected.

preview: Low latency uncompressed images for computer vision algorithms. Format: NV12 or RAW8
small_encoded: Compressed video for wireless stream (see voxl-streamer). Format: H264 or H265
large_encoded: Compressed video for saving to disk. Format: H264 or H265
snapshot: Save full-quality ISP processed JPG to disk or to pipe. Format: JPG
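
To sanity-check a camera entry against the stream limits mentioned above, something like the following could be used. This is only a sketch: it assumes the limit is counted per camera entry, and it treats a missing en_* flag as disabled, whereas the real server may apply sensor-specific defaults for hidden parameters. The function names and the platform keys are made up for illustration.

```python
# The four stream-enable flags used in voxl-camera-server.conf.
STREAM_FLAGS = ("en_preview", "en_small_video", "en_large_video", "en_snapshot")

# Simultaneous stream limits quoted in this page, keyed by hypothetical labels.
MAX_STREAMS = {"apq8096": 4, "qrb5165": 3}


def enabled_streams(camera: dict) -> list:
    """List the stream flags that are explicitly set to true for one camera."""
    return [flag for flag in STREAM_FLAGS if camera.get(flag, False)]


def within_stream_limit(camera: dict, platform: str) -> bool:
    """True if the camera enables no more streams than the platform limit."""
    return len(enabled_streams(camera)) <= MAX_STREAMS[platform]
```

For example, a hires camera with the two video streams and the snapshot stream enabled sits exactly at the QRB5165 limit, so enabling its preview stream as well would mean disabling one of the others.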

The primary stream for computer vision sensors such as the OV7251 tracking camera and stereo pairs is the preview stream. For example, voxl-qvio-server takes a preview stream in RAW8 format as its input for the tracking sensor.

voxl-camera-server V1.4.5 and above has the ability to use hardware acceleration to compress H264 and H265 video. This is provided by the small_encoded and large_encoded streams. voxl-streamer is configured to stream the hires_small_encoded stream by default. Note that if you want to change the resolution of the RTSP stream provided by voxl-streamer you need to set it in voxl-camera-server.conf since it’s voxl-camera-server that’s encoding the video!

Finally the snapshot stream is unique in that it’s not a steady stream at constant framerate but instead takes a snapshot when it receives the “snapshot” command through any of that camera’s control pipes. This triggers the Qualcomm ISP to do the same pipeline as clicking the shutter button in your smartphone’s camera app!

Camera Server Config File

All of these parameters are valid for every camera or stereo pair. However many of them will be hidden in the config file to reduce clutter where they are not applicable for a particular camera or configuration. For example, the preview stream params are hidden on IMX214 hires sensors since out of the box only the two video streams and snapshot streams are enabled. But, you could opt to disable one of the default streams and enable the preview stream.

The Camera Server config file is in JSON format and contains an array of cameras, each of which may contain the following values:

General Settings

type: “ov7251”, “ov9782”, “imx214”, “imx412”, “imx678”, or “pmd-tof”
name: name used as prefix for pipes published by this camera
enabled: true by default. Set to false to disable the camera.
camera_id: the ID of the camera as enumerated by HAL3
camera_id2: the ID of the secondary camera when running a stereo pair. Omit or leave as -1 for monocular.
independent_exposure: true or false to enable independent auto exposure for a stereo pair. Default: false
fps: framerate
ae_mode: Auto exposure mode “off”, “isp”, or “lme_msv”
standby_enabled: enable decimated framerate when CPU reports standby mode, only for TOF (default false)
decimator: TOF framerate decimator when in standby mode

Preview Stream Settings

en_preview: Enable the preview stream (true or false)
preview_width: Width of the preview stream image (default 640)
preview_height: Height of the preview stream image (default 640)
pre_format: Format, raw8 for greyscale sensors, nv12 for color sensors

Small Video Stream Settings

en_small_video: true or false to enable the small video stream
small_video_width: default 1024
small_video_height: default 768
small_venc_mode: “h264” or “h265”
small_venc_br_ctrl: “cqp” for constant quantization or “cbr” for constant bitrate
small_venc_Qfixed: Quantization to use for cqp mode
small_venc_Qmin: Minimum quantization to allow in cbr mode
small_venc_Qmax: Max quantization to allow in cbr mode
small_venc_nPframes: number of P frames to use between I frames (default 9)
small_venc_mbps: target bitrate for cbr mode (default 2)

Large Video Stream Settings

en_large_video: true or false to enable the large video stream
large_video_width: default dependent on sensor, typically full resolution
large_video_height: default dependent on sensor, typically full resolution
large_venc_mode: “h264” or “h265”
large_venc_br_ctrl: “cqp” for constant quantization or “cbr” for constant bitrate
large_venc_Qfixed: Quantization to use for cqp mode
large_venc_Qmin: Minimum quantization to allow in cbr mode
large_venc_Qmax: Max quantization to allow in cbr mode
large_venc_nPframes: number of P frames to use between I frames (default 29)
large_venc_mbps: target bitrate for cbr mode (default 30)
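
The nPframes settings determine how often a full I frame appears in the encoded video. As a rough worked example, assuming each group of pictures is one I frame followed by nPframes P frames and the camera runs at a constant fps (the function name is just for illustration):

```python
def keyframe_interval_s(n_p_frames: int, fps: float) -> float:
    """Seconds between consecutive I frames for a given nPframes and framerate."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    # One I frame plus n_p_frames P frames make up each group of pictures.
    return (n_p_frames + 1) / fps


# Defaults at 30 fps: small stream (9 P frames) and large stream (29 P frames).
small_interval = keyframe_interval_s(9, 30)   # 10 frames per group
large_interval = keyframe_interval_s(29, 30)  # 30 frames per group
```

Under those assumptions the small stream gets an I frame about three times per second, which helps a wireless viewer recover quickly after dropped packets, while the large stream gets one per second.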

Snapshot Stream Settings

en_snapshot: true or false
snapshot_width: default dependent on sensor, typically full resolution
snapshot_height: default dependent on sensor, typically full resolution
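
Several of the settings above only accept a fixed set of strings. Before restarting the server after a hand edit, a quick check along these lines can catch typos. It is a sketch that only knows the values listed on this page; check_camera is a hypothetical helper, not something shipped with voxl-camera-server.

```python
# Sensor types listed as supported on this page.
SENSOR_TYPES = {"ov7251", "ov9782", "imx214", "imx412", "imx678", "pmd-tof"}

# Allowed string values for the enumerated settings documented on this page.
ALLOWED = {
    "ae_mode": {"off", "isp", "lme_msv"},
    "small_venc_mode": {"h264", "h265"},
    "large_venc_mode": {"h264", "h265"},
    "small_venc_br_ctrl": {"cqp", "cbr"},
    "large_venc_br_ctrl": {"cqp", "cbr"},
}


def check_camera(camera: dict) -> list:
    """Return human-readable problems found in one camera entry."""
    problems = []
    if camera.get("type") not in SENSOR_TYPES:
        problems.append(f"unknown type: {camera.get('type')!r}")
    for key, allowed in ALLOWED.items():
        # Keys that are absent are skipped, since hidden params are legal.
        if key in camera and camera[key] not in allowed:
            problems.append(f"{key}: {camera[key]!r} not in {sorted(allowed)}")
    return problems
```

An empty list means nothing on this short checklist looked wrong; it does not guarantee that the resolution or framerate is valid for the sensor.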

libmodal exposure settings

Using libmodal exposure (lme_msv) instead of the ISP’s auto exposure gives you more control, which is exposed through the following parameters.

ae_desired_msv: the desired mean sample value, a.k.a. the average value of pixels that the auto exposure algorithm should try to achieve in frame
ae_k_p_ns: the desired k_p for the exposure algorithm
ae_k_i_ns: the desired k_i for the exposure algorithm
ae_max_i: the desired max_i for the exposure algorithm
ae_filter_alpha: a low-pass filter constant that filters the calculated MSV to slow down responses - the filter used is an IIR filter
ae_ignore_fraction: maximum fraction of saturated (255) pixels that will be used in the calculation of MSV. If saturated_pixels / total_pixels exceeds this value, the additional saturated pixels are not used to calculate MSV. This helps prevent the image from getting too dark if there are large blobs of very bright light
ae_slope: ratio that specifies how much gain vs exposure should be changed when trying to achieve desired MSV. Both gain and exposure linearly affect the pixel brightness, but gain and exposure have different effects on the image quality - mostly in the sense that gain affects granularity and that exposure affects motion blur.
ae_exposure_period: controls the duration where the cells of the camera sensor are exposed to light
ae_gain_period: the gain period associated to the auto exposure (think of this as an amplification factor of the pixels)
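
To make ae_ignore_fraction and ae_filter_alpha more concrete, here is an illustrative sketch of the two ideas as described above. It is not the libmodal-exposure implementation. The function names are invented, the frame is a flat list of 8-bit values, and the weighting convention of the IIR filter (alpha applied to the previous value) is an assumption.

```python
def mean_sample_value(pixels, ignore_fraction):
    """Average pixel value, counting at most ignore_fraction of the frame as saturated."""
    total = len(pixels)
    if total == 0:
        return 0.0
    unsaturated = [p for p in pixels if p < 255]
    saturated = total - len(unsaturated)
    # Cap how many saturated pixels are allowed to contribute to the average.
    used_saturated = min(saturated, int(ignore_fraction * total))
    used = len(unsaturated) + used_saturated
    if used == 0:
        # Every pixel is saturated and none may be counted: report full brightness.
        return 255.0
    return (sum(unsaturated) + 255 * used_saturated) / used


def low_pass(previous, new, alpha):
    """First-order IIR filter; a larger alpha keeps more of the previous value."""
    return alpha * previous + (1.0 - alpha) * new
```

With a frame that is 80% saturated and ae_ignore_fraction set to 0.2, only a quarter of the saturated pixels are counted, so the computed MSV is lower than the plain average and the algorithm darkens the image less aggressively.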

Constant Quantization vs Constant Bitrate Mode

Constant bitrate video compression works but there is a known issue with the hardware accelerated video compressor where a scale factor is applied to the desired bitrate that is dependent on resolution. This scale factor is roughly 13x for 1024x768 video and roughly 30x for 4K video. If you wish to use constant bitrate mode you will need to set a much smaller target bitrate in the config file than desired. Then use voxl-inspect-cam to measure the output bitrate during motion and while still until you find the config file bitrate value that achieves the desired result.
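
As a starting point for that search, you can divide the bitrate you want by the approximate scale factor for your resolution. The helper below is a hypothetical illustration of that arithmetic, assuming the measured output is roughly the configured value multiplied by the scale factor; the result is only a first guess to refine with voxl-inspect-cam.

```python
def cbr_config_mbps(desired_mbps: float, scale_factor: float) -> float:
    """First-guess value for small_venc_mbps / large_venc_mbps in cbr mode."""
    if scale_factor <= 0:
        raise ValueError("scale_factor must be positive")
    # The encoder output is roughly scale_factor times the configured value.
    return desired_mbps / scale_factor


# Roughly 13x at 1024x768: aim for about 2 Mbps of actual output.
small_guess = cbr_config_mbps(2.0, 13.0)
# Roughly 30x at 4K: aim for about 30 Mbps of actual output.
large_guess = cbr_config_mbps(30.0, 30.0)
```

Here small_guess comes out near 0.15 and large_guess at 1.0, both far below the bitrate you actually expect to see on the pipe.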

The default encoding scheme is “cqp” or Constant Quantization Parameter which is more useful in most scenarios as it allows the bitrate to drop automatically when there is little motion in the video. Simply tune the small_venc_Qfixed and large_venc_Qfixed quantization parameters to achieve the desired bitrate and quality while in motion. This will save bandwidth and disk space when the drone is still.

Config File Example

Whenever you run voxl-configure-cameras a new default camera server config file is created for that particular camera setup. Here is an example default config file for camera config #6 used on Starling.

{
    "version":  0.1,
    "cameras":  [{
            "type": "pmd-tof",
            "name": "tof",
            "enabled":  true,
            "camera_id":    0,
            "fps":  5,
            "en_preview":   true,
            "preview_width":    224,
            "preview_height":   1557,
            "pre_format":   "tof",
            "ae_mode":  "off",
            "standby_enabled":  false,
            "decimator":    5
        }, {
            "type": "imx214",
            "name": "hires",
            "enabled":  true,
            "camera_id":    1,
            "fps":  30,
            "en_preview":   false,
            "preview_width":    640,
            "preview_height":   480,
            "pre_format":   "nv21",
            "en_small_video":   true,
            "small_video_width":    1024,
            "small_video_height":   768,
            "small_venc_mode":  "h265",
            "small_venc_br_ctrl":   "cqp",
            "small_venc_Qfixed":    30,
            "small_venc_Qmin":  15,
            "small_venc_Qmax":  40,
            "small_venc_nPframes":  9,
            "small_venc_mbps":  2,
            "en_large_video":   true,
            "large_video_width":    4096,
            "large_video_height":   2160,
            "large_venc_mode":  "h265",
            "large_venc_br_ctrl":   "cqp",
            "large_venc_Qfixed":    38,
            "large_venc_Qmin":  15,
            "large_venc_Qmax":  50,
            "large_venc_nPframes":  29,
            "large_venc_mbps":  30,
            "en_snapshot":  true,
            "en_snapshot_width":    4160,
            "en_snapshot_height":   3120,
            "ae_mode":  "isp"
        }, {
            "type": "ov7251",
            "name": "tracking",
            "enabled":  true,
            "camera_id":    2,
            "fps":  30,
            "en_preview":   true,
            "preview_width":    640,
            "preview_height":   480,
            "pre_format":   "raw8",
            "ae_mode":  "lme_msv",
            "ae_desired_msv":   60,
            "ae_filter_alpha":  0.600000023841858,
            "ae_ignore_fraction":   0.20000000298023224,
            "ae_slope": 0.05000000074505806,
            "ae_exposure_period":   1,
            "ae_gain_period":   1
        }]
}

Camera sensor options and capabilities

For the available camera configurations as well as the accepted FPS/Dimensions associated to each camera sensor, please follow the link here.

Supported Resolutions

From voxl-camera-server 1.7.0 onwards, all resolutions a sensor supports can be compressed real-time using OMX.

Geo-tag Snapshots with GPS

Note: this feature is only available on VOXL 2

When GPS is being published via voxl-mavlink-server, voxl-camera-server will geo-tag each captured snapshot with the latest GPS coordinates. The GPS data (latitude, longitude, altitude) is saved in the JPEG snapshot's EXIF metadata. For this to work, a valid GPS must be connected and actively publishing data from a MAVLink-compatible flight controller, such as PX4 or ArduPilot.

You can verify GPS is being published with the voxl-inspect-gps tool.

To view the EXIF metadata of a photo, pull it from the VOXL 2 (snapshots live in /data/snapshots), then either open the image properties on your computer or upload the image to a website that displays EXIF metadata.
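
EXIF stores latitude and longitude as degrees, minutes, and seconds plus a hemisphere reference (N/S or E/W), and many viewers display them that way. If you want decimal degrees to paste into a map, the conversion looks like this. The function name is just for illustration, and the coordinates in the usage note are made up rather than taken from a real snapshot.

```python
def dms_to_decimal(degrees: float, minutes: float, seconds: float, ref: str) -> float:
    """Convert EXIF-style degrees/minutes/seconds and N/S/E/W ref to decimal degrees."""
    ref = ref.strip().upper()
    if ref not in ("N", "S", "E", "W"):
        raise ValueError(f"unexpected hemisphere reference: {ref!r}")
    value = degrees + minutes / 60.0 + seconds / 3600.0
    # South latitudes and west longitudes are negative in decimal notation.
    return -value if ref in ("S", "W") else value
```

For instance, dms_to_decimal(32, 54, 0, "N") and dms_to_decimal(117, 15, 0, "W") give a latitude close to 32.9 and a longitude close to -117.25.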


From a command line, run voxl-camera-server -d0 to view the output of the camera-server initialization process.

Source Code

The source code for VOXL Camera Server can be found here