VOXL Camera Server

Overview

VOXL Camera Server publishes MIPI Camera images and video to MPA pipes to allow multiple clients to utilize each camera. It is one of many MPA services that publish image data. See the previous page to read more about the MPA camera interface.

Example services that are clients for voxl-camera-server include voxl-qvio-server, voxl-portal, ROS, and voxl-streamer.

VOXL Camera Server uses Google’s HAL3 API to access the Qualcomm ISP pipeline and the OMX API for real-time video compression.

Hardware Configuration

MIPI camera interfaces and drivers are very complicated, requiring careful schematic and PCB design to connect. Driver bringup requires deep knowledge of the Qualcomm camera stack and kernel. These are not USB webcams and cannot be plugged in willy-nilly to whatever port you fancy.

ModalAI provides a set of supported camera configurations to cover a range of use cases for the VOXL 1, VOXL 2, and VOXL 2 Mini.

QRB5165 based platforms (VOXL 2 and VOXL 2 Mini) are more flexible, but it would be impossible for us to support the thousands of possible camera configurations. Therefore we suggest starting with a supported camera configuration, seeing how the qrb5165-configure-cameras tool sets up the drivers and voxl-camera-server config file, then modifying to suit your use case from there.

Only connect and disconnect cameras while VOXL is powered off!!

The following sensors are supported (see the code reference):

ov7251
ov9782
ov64b
ar0144
ar0144-12bit
imx214
imx412
imx664
imx678
pmd-tof
pmd-tof-liow2
boson

Software configuration

voxl-configure-cameras is a top-level helper script used to set up camera sensor drivers and the /etc/modalai/voxl-camera-server.conf config file. It writes default settings for the selected camera configuration as a starting point for further customization if necessary. If you want to change some aspect of the camera server behavior, such as disabling a camera or changing video encoding settings, you can do so by modifying that file and restarting camera server with systemctl restart voxl-camera-server.
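As a rough illustration of that edit-and-restart workflow, the sketch below disables one camera by name and offers a helper to restart the service. It assumes the config file is plain JSON with a top-level "cameras" array, as in the example later on this page; if the file on your VOXL carries a comment header, json.load will reject it. The helper names set_camera_enabled and restart_camera_server are hypothetical and not part of any ModalAI tool.

```python
import json
import subprocess

CONF_PATH = "/etc/modalai/voxl-camera-server.conf"


def set_camera_enabled(conf_path, camera_name, enabled):
    """Set the "enabled" flag of the named camera; return True if it was found."""
    with open(conf_path) as f:
        conf = json.load(f)

    found = False
    for cam in conf.get("cameras", []):
        if cam.get("name") == camera_name:
            cam["enabled"] = enabled
            found = True

    # Only rewrite the file when something actually changed hands.
    if found:
        with open(conf_path, "w") as f:
            json.dump(conf, f, indent=4)
    return found


def restart_camera_server():
    """Restart the service so it re-reads the config (run on the VOXL itself)."""
    subprocess.run(["systemctl", "restart", "voxl-camera-server"], check=True)
```

On the VOXL you might call set_camera_enabled(CONF_PATH, "tof", False) followed by restart_camera_server(); keeping a backup copy of the original file first is a sensible precaution.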

HAL3 Streams

HAL3 can provide multiple simultaneous image streams with different formats and resolutions. On APQ8096 (VOXL1) the upper limit is 4 simultaneous streams. On QRB5165 (VOXL2) you can only have 3 simultaneous streams enabled.

voxl-camera-server’s config file is set up to configure 4 different streams with specific intended use cases. The streams are not activated until a client is connected.

Mode	Purpose	Format
preview	Low latency uncompressed images for computer vision algorithms	NV12 or RAW8
small_encoded	Compressed video for wireless stream (See voxl-streamer)	H264 or H265
large_encoded	Compressed video for saving to disk	H264 or H265
snapshot	Save full-quality ISP processed JPG to disk or to pipe	JPG

The primary stream for computer vision sensors such as the OV7251 Tracking camera and stereo pairs is the preview stream. For example, voxl-qvio-server takes a preview stream in RAW8 format as its input for the tracking sensor.

voxl-camera-server V1.4.5 and above has the ability to use hardware acceleration to compress H264 and H265 video. This is provided by the small_encoded and large_encoded streams. voxl-streamer is configured to stream the hires_small_encoded stream by default. Note that if you want to change the resolution of the RTSP stream provided by voxl-streamer you need to set it in voxl-camera-server.conf since it’s voxl-camera-server that’s encoding the video!

Finally the snapshot stream is unique in that it’s not a steady stream at constant framerate but instead takes a snapshot when it receives the “snapshot” command through any of that camera’s control pipes. This triggers the Qualcomm ISP to do the same pipeline as clicking the shutter button in your smartphone’s camera app!

Camera Server Config File

All of these parameters are valid for every camera or stereo pair. However, many of them will be hidden in the config file to reduce clutter where they are not applicable for a particular camera or configuration. For example, the preview stream params are hidden on IMX214 hires sensors since out of the box only the two video streams and the snapshot stream are enabled. However, you could opt to disable one of the default streams and enable the preview stream instead.
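Before swapping streams around, it can help to count how many are enabled for a camera. The sketch below does that against the limits quoted in the HAL3 Streams section. It treats the limit as applying per camera and treats a missing en_* flag as disabled; both are assumptions made for illustration, and the function names are hypothetical.

```python
STREAM_FLAGS = ("en_preview", "en_small_video", "en_large_video", "en_snapshot")

# Simultaneous-stream limits quoted in the HAL3 Streams section.
MAX_STREAMS = {"apq8096": 4, "qrb5165": 3}


def enabled_streams(camera):
    """Return the stream flags set to true in one camera entry of the config."""
    return [flag for flag in STREAM_FLAGS if camera.get(flag, False)]


def exceeds_stream_limit(camera, platform):
    """True if this camera enables more streams than the platform allows."""
    return len(enabled_streams(camera)) > MAX_STREAMS[platform]


# Default-style hires entry: two video streams plus snapshot.
hires = {
    "name": "hires",
    "en_preview": False,
    "en_small_video": True,
    "en_large_video": True,
    "en_snapshot": True,
}
print(enabled_streams(hires))
print(exceeds_stream_limit(hires, "qrb5165"))

# Enabling preview without disabling another stream goes over the QRB5165 limit.
hires["en_preview"] = True
print(exceeds_stream_limit(hires, "qrb5165"))
```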

The Camera Server config file is in JSON format and contains an array of cameras, each of which may contain the following values:

General Settings

type: “ov7251”, “ov9782”, “imx214”, “imx412”, “imx678”, or “pmd-tof”
name: name used as prefix for pipes published by this camera
enabled: true by default. Set to false to disable the camera.
camera_id: This is the id of the camera as enumerated by HAL3
camera_id2: This is the id of the secondary camera when running a stereo pair. Omit or leave as -1 for monocular.
independent_exposure: true or false to enable independent auto exposure for a stereo pair. Default: false
fps: framerate
ae_mode: Auto exposure mode “off”, “isp”, or “lme_msv”
standby_enabled: enable decimated framerate when CPU reports standby mode, only for TOF (default false)
decimator: TOF framerate decimator when in standby mode

Preview Stream Settings

en_preview: Enable the preview stream (true or false)
preview_width: Width of the preview stream image (default 640)
preview_height: Height of the preview stream image (default 640)
pre_format: Format, raw8 for greyscale sensors, nv12 for color sensors

Small Video Stream Settings

en_small_video: true or false to enable the small video stream
small_video_width: default 1024
small_video_height: default 768
small_venc_mode: “h264” or “h265”
small_venc_br_ctrl: “cqp” for constant quantization or “cbr” for constant bitrate
small_venc_Qfixed: Quantization to use for cqp mode
small_venc_Qmin: Minimum quantization to allow in cbr mode
small_venc_Qmax: Max quantization to allow in cbr mode
small_venc_nPframes: number of P frames to use between I frames (default 9)
small_venc_mbps: target bitrate for cbr mode (default 2)

Large Video Stream Settings

en_large_video: true or false to enable the large video stream
large_video_width: default dependent on sensor, typically full resolution
large_video_height: default dependent on sensor, typically full resolution
large_venc_mode: “h264” or “h265”
large_venc_br_ctrl: “cqp” for constant quantization or “cbr” for constant bitrate
large_venc_Qfixed: Quantization to use for cqp mode
large_venc_Qmin: Minimum quantization to allow in cbr mode
large_venc_Qmax: Max quantization to allow in cbr mode
large_venc_nPframes: number of P frames to use between I frames (default 29)
large_venc_mbps: target bitrate for cbr mode (default 30)
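For a sense of what the nPframes defaults mean in practice: with N P frames between I frames, one I frame arrives every N + 1 frames. A small arithmetic sketch, with an illustrative function name:

```python
def iframe_interval_s(n_pframes, fps):
    """Seconds between consecutive I frames for a given P-frame count and framerate."""
    return (n_pframes + 1) / fps


print(iframe_interval_s(29, 30))  # large stream default at 30 fps
print(iframe_interval_s(9, 30))   # small stream default at 30 fps
```

At 30 fps the large stream default of 29 works out to one I frame per second, while the small stream default of 9 gives three per second, which lets a client joining a wireless stream mid-way start decoding sooner.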

Snapshot Stream Settings

en_snapshot: true or false
snapshot_width: default dependent on sensor, typically full resolution
snapshot_height: default dependent on sensor, typically full resolution

libmodal exposure settings

When using libmodal exposure (lme_msv) instead of the ISP’s autoexposure you have more control, which is exposed through the following parameters.

ae_desired_msv: the desired mean sample value, a.k.a. the average value of pixels that the auto exposure algorithm should try to achieve in frame
ae_k_p_ns: the desired k_p for the exposure algorithm
ae_k_i_ns: the desired k_i for the exposure algorithm
ae_max_i: the desired max_i for the exposure algorithm
ae_filter_alpha: a low-pass filter constant that filters the calculated MSV to slow down responses - the filter used is an IIR filter
ae_ignore_fraction: maximum fraction of saturated (255) pixels that will be used in the calculation of MSV. If saturated_pixels / total_pixels exceeds this fraction, the additional saturated pixels are not used to calculate MSV. This helps prevent the image from getting too dark when there are large blobs of very bright light
ae_slope: ratio that specifies how much gain vs exposure should be changed when trying to achieve desired MSV. Both gain and exposure linearly affect the pixel brightness, but gain and exposure have different effects on the image quality - mostly in the sense that gain affects granularity and that exposure affects motion blur.
ae_exposure_period: controls the duration where the cells of the camera sensor are exposed to light
ae_gain_period: the gain period associated to the auto exposure (think of this as an amplification factor of the pixels)
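To make the MSV-related parameters more concrete, here is a minimal sketch of how a mean sample value with a saturated-pixel cap and a low-pass filter could be computed. This is not the libmodal-exposure implementation: the capping logic is one reading of the ae_ignore_fraction description above, and the exact filter form (alpha weighting the previous value) is an assumption. The controller that then adjusts gain and exposure toward ae_desired_msv is not shown.

```python
def mean_sample_value(pixels, ignore_fraction):
    """Mean of 8-bit pixel values, counting at most ignore_fraction of saturated pixels."""
    total = len(pixels)
    if total == 0:
        return 0.0
    saturated = sum(1 for p in pixels if p >= 255)
    # Cap how many saturated pixels may contribute to the mean.
    used_saturated = min(saturated, int(ignore_fraction * total))
    unsaturated_sum = sum(p for p in pixels if p < 255)
    used = (total - saturated) + used_saturated
    if used == 0:
        # Every pixel is saturated and none may be counted.
        return 255.0
    return (unsaturated_sum + 255 * used_saturated) / used


def low_pass(previous, new, alpha):
    """First-order IIR filter; larger alpha weights the previous value more (assumed form)."""
    return alpha * previous + (1.0 - alpha) * new


# A frame dominated by a bright blob: 80% saturated, 20% dark.
frame = [255] * 8 + [50] * 2
msv = mean_sample_value(frame, 0.2)
print(msv)
print(low_pass(100.0, msv, 0.6))
```

With the cap in place the bright blob pulls the MSV up far less than a plain average would, so the exposure controller is less inclined to darken the rest of the image.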

Constant Quantization vs Constant Bitrate Mode

❗As of SDK 1.3.5, CBR mode for h264 and h265 is working properly with some (bitrate) limitations

SDK 1.3.5+

CBR (Constant Bit Rate)

*_venc_mbps parameters are used to specify the desired video bitrate in Mbit/s
CBR has the following limitations on VOXL2:
Encoder Mode	FPS	Min Bitrate (mbps)	Max Bitrate (mbps)
H264	30	1.5	5.0
H264	60	1.5	10.0
H265	30	3.0	10.0
H265	60	3.0	20.0
- if the desired mbps is lower than the min bitrate limit above, camera server will override the value and set it to the minimum
  - if camera server did not enforce the minimum, the encoder would use the MAXIMUM possible Mbps
- if the desired mbps is higher than the max bitrate limit above, the encoder will automatically use that maximum value
- *_venc_Qmin should be set to a low value (default 15), allowing the encoder to adjust QP to achieve the desired bitrate
- *_venc_Qmax should be set to a high value (in the 40-50 range, maximum 51), allowing the encoder to adjust QP to achieve the desired bitrate
- note that this limitation is not a hardware limit; it is due to a misconfiguration of the encoder, which still needs to be addressed
- lower bitrates for h264 / h265 are supported by the voxl-streamer application, which can encode raw frames to video and serve an RTSP stream
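The clamping behavior described above can be sketched as a lookup against the VOXL2 limits table. This is only an illustration of the documented outcome, not camera server code; the function name is hypothetical, and framerates other than 30 and 60 are not covered by the table, so the sketch raises KeyError for them.

```python
# (min, max) bitrate in Mbit/s from the VOXL2 CBR limits table.
CBR_LIMITS_MBPS = {
    ("h264", 30): (1.5, 5.0),
    ("h264", 60): (1.5, 10.0),
    ("h265", 30): (3.0, 10.0),
    ("h265", 60): (3.0, 20.0),
}


def effective_cbr_mbps(venc_mode, fps, requested_mbps):
    """Bitrate expected after the min/max limits for this mode and framerate are applied."""
    lowest, highest = CBR_LIMITS_MBPS[(venc_mode, fps)]
    return min(max(requested_mbps, lowest), highest)


print(effective_cbr_mbps("h264", 30, 1.0))   # below the minimum
print(effective_cbr_mbps("h265", 60, 25.0))  # above the maximum
print(effective_cbr_mbps("h264", 60, 4.0))   # within range
```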

CBR + CQP

This is a hybrid mode that uses Constant Quantization when there is little motion and Constant Bitrate to limit the maximum bitrate
In order to use this mode, select cbr as the bitrate control and also adjust the Qmin parameter, as described below
*_venc_Qmin can be used to lower the bitrate when video is mostly still (the value needs to be tuned for the use case)
As the amount of motion in the image increases, the encoder’s bitrate controller will increase QP (reduce quality) if needed, up to Qmax, in order to keep the bitrate capped at the desired value
- Qmin (lowest quantization bound) specifies the lowest quantization that the encoder can use, meaning the highest image quality
- if the encoder would need a QP lower than the configured Qmin in order to reach the target bitrate, the bitrate controller is limited by Qmin and the resulting bitrate is smaller than desired
  - Sometimes this can be desired in order to avoid wasting bandwidth in CBR mode
  - When there is little motion, the actual bitrate may be smaller than specified in the mbps parameter
  - However, when there is sufficient motion, the bitrate will increase and will be capped by the desired mbps value
  - ❗This results in a MAX bitrate instead of CBR mode, which sometimes is desired (low bitrate when there is no motion, capped bitrate during motion)
- *_venc_Qmax should be set to a high value (in the 40-50 range, maximum 51)
  - this value specifies the largest Quantization Parameter used in CBR mode when quantization must be increased to keep the bitrate at the desired value (this typically happens during lots of motion in the image)
  - if Qmax is not large enough, the bitrate controller may not be allowed to increase the quantization parameter enough to bring the bitrate down to the desired value
the following parameters are ignored: *_venc_Qfixed

CQP (Constant Quantization Parameter)

small_venc_Qfixed / large_venc_Qfixed parameter is used to set the constant Quantization Parameter
constant QP does not mean constant bit rate. QP is used to quantize the pixel values but motion estimation will drive bandwidth usage
actual bitrate will also depend on video resolution and FPS (more pixels = more bandwidth)
the following parameters are ignored: *_venc_Qmin, *_venc_Qmax, *_venc_mbps
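As a quick reference for which settings take effect in each mode, the sketch below lists the parameters that are ignored for a given bitrate control, following the "ignored" notes in the CBR + CQP and CQP descriptions. The helper name is hypothetical.

```python
# Parameter suffixes ignored under each bitrate control mode, per the notes above.
IGNORED_SUFFIXES = {
    "cqp": ("venc_Qmin", "venc_Qmax", "venc_mbps"),
    "cbr": ("venc_Qfixed",),
}


def ignored_params(stream, br_ctrl):
    """List ignored config keys; stream is "small" or "large", br_ctrl is "cqp" or "cbr"."""
    return [f"{stream}_{suffix}" for suffix in IGNORED_SUFFIXES[br_ctrl]]


print(ignored_params("small", "cqp"))
print(ignored_params("large", "cbr"))
```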

Pre SDK 1.3.5

Constant bitrate video compression works but there is a known issue with the hardware accelerated video compressor where a scale factor is applied to the desired bitrate that is dependent on resolution. This scale factor is roughly 13x for 1024x768 video and roughly 30x for 4K video. If you wish to use constant bitrate mode you will need to set a much smaller target bitrate in the config file than desired. Then use voxl-inspect-cam to measure the output bitrate during motion and while still until you find the config file bitrate value that achieves the desired result.
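A starting value for that search can be estimated by dividing the desired bitrate by the approximate scale factor. The sketch assumes the output bitrate is roughly the config value multiplied by the scale factor, which is how the description above reads; the result is only a first guess to refine with voxl-inspect-cam. The function name is illustrative.

```python
def config_mbps_for(desired_mbps, scale_factor):
    """Config-file bitrate expected to yield roughly desired_mbps at the output."""
    return desired_mbps / scale_factor


print(round(config_mbps_for(2.0, 13.0), 3))   # 1024x768, scale factor roughly 13x
print(round(config_mbps_for(30.0, 30.0), 3))  # 4K, scale factor roughly 30x
```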

The default encoding scheme is “cqp” or Constant Quantization Parameter which is more useful in most scenarios as it allows the bitrate to drop automatically when there is little motion in the video. Simply tune the small_venc_Qfixed and large_venc_Qfixed quantization parameters to achieve the desired bitrate and quality while in motion. This will save bandwidth and disk space when the drone is still.

Config File Example

Whenever you run voxl-configure-cameras a new default camera server config file is created for that particular camera setup. Here is an example default config file for camera config #6 used on Starling.

{
    "version":  0.1,
    "cameras":  [{
            "type": "pmd-tof",
            "name": "tof",
            "enabled":  true,
            "camera_id":    0,
            "fps":  5,
            "en_preview":   true,
            "preview_width":    224,
            "preview_height":   1557,
            "pre_format":   "tof",
            "ae_mode":  "off",
            "standby_enabled":  false,
            "decimator":    5
        }, {
            "type": "imx214",
            "name": "hires",
            "enabled":  true,
            "camera_id":    1,
            "fps":  30,
            "en_preview":   false,
            "preview_width":    640,
            "preview_height":   480,
            "pre_format":   "nv21",
            "en_small_video":   true,
            "small_video_width":    1024,
            "small_video_height":   768,
            "small_venc_mode":  "h265",
            "small_venc_br_ctrl":   "cqp",
            "small_venc_Qfixed":    30,
            "small_venc_Qmin":  15,
            "small_venc_Qmax":  40,
            "small_venc_nPframes":  9,
            "small_venc_mbps":  2,
            "en_large_video":   true,
            "large_video_width":    4096,
            "large_video_height":   2160,
            "large_venc_mode":  "h265",
            "large_venc_br_ctrl":   "cqp",
            "large_venc_Qfixed":    38,
            "large_venc_Qmin":  15,
            "large_venc_Qmax":  50,
            "large_venc_nPframes":  29,
            "large_venc_mbps":  30,
            "en_snapshot":  true,
            "en_snapshot_width":    4160,
            "en_snapshot_height":   3120,
            "ae_mode":  "isp"
        }, {
            "type": "ov7251",
            "name": "tracking",
            "enabled":  true,
            "camera_id":    2,
            "fps":  30,
            "en_preview":   true,
            "preview_width":    640,
            "preview_height":   480,
            "pre_format":   "raw8",
            "ae_mode":  "lme_msv",
            "ae_desired_msv":   60,
            "ae_filter_alpha":  0.600000023841858,
            "ae_ignore_fraction":   0.20000000298023224,
            "ae_slope": 0.05000000074505806,
            "ae_exposure_period":   1,
            "ae_gain_period":   1
        }]
}

Camera sensor options and capabilities

For the available camera configurations as well as the accepted FPS/Dimensions associated to each camera sensor, please follow the link here.

Supported Resolutions

From voxl-camera-server 1.7.0 onwards, all resolutions a sensor supports can be compressed real-time using OMX.

Geo-tag Snapshots with GPS

Note: this feature is only available on VOXL 2

When GPS is being published via voxl-mavlink-server, voxl-camera-server will geo-tag each captured snapshot with the latest GPS coordinates. The GPS data (latitude, longitude, altitude) is saved in the JPEG snapshot's EXIF metadata. For this to occur, the user must have a valid GPS connected and actively publishing data from a MAVLink compatible flight controller, such as PX4 or ArduPilot.

You can verify GPS is being published with the voxl-inspect-gps tool.

To view the EXIF metadata of a photo, pull it from the VOXL 2 (snapshots live in /data/snapshots), then either open the image's properties on your computer or upload the image to a website that displays EXIF metadata.

Troubleshooting

From a command line, run voxl-camera-server -d0 to view the output of the camera-server initialization process.

Debugging Exposure Settings

Send the following command to set the fixed exposure and gain:

voxl-send-command <name-of-output-stream> set_exp_gain <exposure> <gain>

for example: voxl-send-command tracking set_exp_gain 20.0 100

The name-of-output-stream should match the pipe/stream name that you are using. Since there can be multiple streams for the same camera, you can use any stream to set this param.
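If you want to script this while sweeping exposure values, the command can be assembled from Python. The sketch mirrors the argument order of the tracking example above and nothing more; the helper names are hypothetical, and the units and valid ranges of the exposure and gain values are not specified here.

```python
import subprocess


def set_exp_gain_argv(pipe_name, exposure, gain):
    """Build the voxl-send-command argument list in the order used by the example above."""
    return ["voxl-send-command", pipe_name, "set_exp_gain", str(exposure), str(gain)]


def send_set_exp_gain(pipe_name, exposure, gain):
    """Run the command; intended to be called on the VOXL itself."""
    subprocess.run(set_exp_gain_argv(pipe_name, exposure, gain), check=True)


print(" ".join(set_exp_gain_argv("tracking", 20.0, 100)))
```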

Source Code

The source code for VOXL Camera Server can be found here