VOXL Vision PX4 Apriltag Relocalization

One inherent shortcoming of navigating with VIO is that VIO algorithms are generally unaware of their absolution location unless they employ a mapping and relocalization function. Most of them only know their position relative to where the algorithm started.

One reliable and configurable method for relocalization is using fiducial markers. We incorporate this functionality into voxl-vision-px4 specifically using Apriltags. In this demonstration video you can see the drone start up crooked so that it thinks forward is offset and yawed to the right. It’s told to fly forward and back in a straight line before landing at the 0,0,0 point. On its return to the landing pad it sees the Apriltag and immediately corrects the error, allowing it to land squarely on its landing pad.

Table of contents

  1. Video Walkthrough
  2. Configuration File
  3. Frames of Reference
  4. Debugging
  5. Fixed Frame Pipe for user-provided localization
  6. Filter Configuration
  7. Correcting Setpoints with the Fixed Frame Offset
  8. Next

Video Walkthrough

Configuration File

To use this feature, you must enable VIO and either the “en_apriltag_fixed_frame” feature or the “en_fixed_frame_pipe” feature which will be discussed later in this page.

When using the “en_apriltag_fixed_frame” feature you need to specify the position and rotation of each Apriltag relative to fixed frame. You can specify from 1 up to 64 different apriltags. Each must have a unique ID and position. Here is a sample config file with 2 tags specified.

    "en_vio": true,
    "en_apriltag_fixed_frame": true,
    "fixed_apriltags": [{
            "tag_id": 0,
            "R_tag_to_fixed": [
                [0, -1, 0],
                [1, 0, 0],
                [0, 0, 1]
            "T_tag_wrt_fixed": [0, 0, 0]
            "tag_id": 1,
            "R_tag_to_fixed": [
                [0, -1, 0],
                [1, 0, 0],
                [0, 0, 1]
            "T_tag_wrt_fixed": [0, 5, 0]
    "en_transform_mavlink_pos_setpoints_from_fixed_frame": true,
    "en_fixed_frame_pipe": false

You will also need to specify the size of the tag in /etc/snav/app_params.snav_dft_vio_apriltag_app.xml which is the width/height of the black square in meters.

    <param name="tag_size" value="0.267"/>

Note that apriltags have a coordinate frame similar to cameras such that X points out the right of the tag, Y points towards the bottom edge of the paper, and Z points into the paper. The flight demonstration video at the beginning of this page is using the default config file with the same rotation matrix as above. This rotation matrix corresponds to a tag sitting on the floor with the top of the picture pointing forward along the X axis.

We recommend you try to remain consistent with your Apriltag orientations to avoid confusion and error. Positions are easier to visualize and debug.

Frames of Reference

The geometry module inside voxl-vision-px4 keeps track of the following frames of reference.


Calculation of the translation and rotation between fixed and local frame is done using the relation between the following frames. voxl-vision-px4 keeps a small buffer of odometry data relating the IMU to VIO frames as the drone moves. Using the accurate camera frame timestamps, voxl-vision-px4 then goes back in time and interpolates to best estimate the position and orientation of the drone when the Apriltag was detected.



Before first flight, we highly suggest running voxl-vision-px4 from the command line with each of the following options one at a time. This will let you inspect the behavior of the relocalization algorithm and Apriltag detection. A small typo when configuring the Apriltag positions and orientations in the config file can be disastrous.

-g, --debug_fixed_frame     print debug info regarding the calculation of fixed frame
                              relative to local frame as set by apriltags.
-l, --debug_tag_local       print location and rotation of each tag in local frame
                              note that apriltag frame of reference is not the same
                              as local frame so interpreting roll/pitch/yaw requires
                              some thought
-m, --debug_tag_cam         print location and rotation of each tag detection wrt cam
-o, --debug_odometry        print the odometry of body in local frame each time
                              a VIO packet is sent to PX4
-p, --debug_odometry_fixed  print the odometry of body in fixed frame. Note this
                              is not the data being sent to px4. This is to help
                              ensure fixed apriltags are set correctly in config file

Fixed Frame Pipe for user-provided localization

The user may wish to implement their own method of localization instead of using the integrated Apriltag feature. For example: multi-fiducial-marker detection, or landing pad detection. In this case, the user may disable the Apriltag localization feature and enable the fixed frame pipe feature.

"en_apriltag_fixed_frame":  false,
"en_fixed_frame_pipe": true,

This will open up a named pipe at the following location to which the user can write data in the form of a pose_4dof_t struct relating the location of the body to fixed frame. This path and struct are defined in /usr/include/voxl-vision-px4.h

#define FIXED_FRAME_PIPE_PATH "/dev/fixed_frame_pipe"

typedef struct pose_4dof_t{
    int64_t timestamp_ns; // timestamp in apps-proc clock-monotonic
    double p[3];          // meters
    double yaw;           // radians, between +- PI
} pose_4dof_t;

An example of how to write to this pipe can be found here: https://gitlab.com/voxl-public/voxl-vision-px4/-/blob/master/misc/fixed_frame_pipe_example.c

Passing data in through the fixed_frame_pipe uses the same basic method as Apriltag detection and the low pass filter described below is also used. Accurate timestamping is critical when passing data in this way since the geometry module will look back in time and use interpolation to find the VIO odometry data corresponding to the provided timestamp.

Filter Configuration

As with all sensors, there is inherent noise and uncertainty with Apriltag detections. For this reason we incorporate a moving average filter to smooth out the estimate of the rotation and translation between local and fixed frame. This filter is configured such that if only one detection is made, the filter will output exactly that value to prevent the rise time that would be incurred if the filter had to wait to be completely filled before converging.

The filter length is user-configurable. It can be increased if the use-case involves the drone hovering still over a target in order to better estimate its position. If the user wishes to ensure the drone very quickly localizes given only a few camera frames looking at a tag then the filter length can be reduced.

It is possible to set the filter length to 1 which will result in the local-to-fixed frame relation being completely determined by the most recent tag detection.

"fixed_frame_filter_len": 5

Correcting Setpoints with the Fixed Frame Offset

So far, we have only discussed how the offset between local and fixed frame is computed. This information is never actually sent to PX4. PX4 always receives odometry data as the position and rotation of the body with respect to local frame. If PX4 was provided with the Body with respect to Fixed frame then the sudden jump that occurs upon relocalization would cause PX4’s EKF2 to misbehave and likely fall out of position-hold mode.

Instead, the user can provide setpoints in fixed frame and voxl-vision-px4 will automatically translate those setpoints into local frame so PX4 will fly where you want it to go relative to apriltags.

"en_transform_mavlink_pos_setpoints_from_fixed_frame": true,

When enabled, any SET_POSITION_TARGET_LOCAL_NED mavlink messages received through UDP will have its position, velocity, acceleration, and yaw values updated with the fixed frame offset before being sent to PX4 over UART. This is very useful for MAVROS and MAVSDK processes running locally or remotely.

This feature is completely optional if you wish to have more control yourself.


Next page: Collision Prevention