Link Search Menu Expand Document

Apriltag Detection

voxl-tag-detector is a background systemd service that subscribes to up to 3 camera image feeds and outputs AprilTag detections via pipe at /run/mpa/tag_detection. Currently only 36h11 tags are supported.

For information on how this can be used, see the AprilTag relocalization page.

geometry_overview


Table of contents

  1. Tag Location Types
  2. Output Format
  3. voxl-configure-tag-detector
  4. Tag Detector Configuration File
    1. Undistort Ratio
    2. Fast Mode
    3. multithreading
    4. Skipping Frames
  5. Tag Location Configuration File
  6. Tag Coordinate Frame
  7. Tag Detection Debug Overlay
  8. Troubleshooting
  9. Source

Tag Location Types

A tag can be flagged by the user as fixed, static, or dynamic. These types are used in the tag_locations.conf config file and are specified in the output data struct too.

  • “fixed”: The tag is at a known location in space as described by the T_tag_wrt_fixed vector and R_tag_to_fixed rotation matrix. These fixed tags are used by voxl-vision-hub for AprilTag relocalization.

  • “static”: A static tag can be trusted to be static (not moving) but does not have a known location. For example, a landing pad.

  • “dynamic”: A dynamic tag can be expected to be in motion such as an object to be tracked or followed dynamically.

If a tag is detected that has not been listed by the user as a known tag in /etc/modalai/tag_locations.conf then it will be listed as unknown.

Output Format

The data format is standardized in libmodal_pipe in modal_pipe_interfaces.h as follows:

#define TAG_LOCATION_TYPE_STRINGS {"unknown", "fixed","static","dynamic"}
#define N_TAG_LOCATION_TYPES    4
#define TAG_LOCATION_UNKNOWN    0
#define TAG_LOCATION_FIXED      1
#define TAG_LOCATION_STATIC     2
#define TAG_LOCATION_DYNAMIC    3
// max name length of a tag name
#define TAG_NAME_LEN 64

/**
 * describes an apriltag, aruco, or similar detection. Provides the tag's position and rotation
 * relative to the camera that detected it.
 *
 * This packet is 252 bytes long.
 */
typedef struct tag_detection_t{
    uint32_t magic_number;                  ///< Unique 32-bit number used to signal the beginning of a VIO packet while parsing a data stream.
    int32_t  id;                            ///< id number of the tag
    float    size_m;                        ///< size of the tag in meters
    int64_t  timestamp_ns;                  ///< timestamp at the middle of the frame exposure in monotonic time
    char     name[TAG_NAME_LEN];            ///< optional name of the tag
    int      loc_type;                      ///< location type
    float    T_tag_wrt_cam[3];              ///< location of the tag with respect to camera frame in meters.
    float    R_tag_to_cam[3][3];            ///< rotation matrix from tag frame to camera frame
    float    T_tag_wrt_fixed[3];            ///< only set if location type is LOCATION_FIXED
    float    R_tag_to_fixed[3][3];          ///< only set if location type is LOCATION_FIXED
    char     cam[MODAL_PIPE_MAX_DIR_LEN];   ///< camera pipe where the detection was made
    int      reserved;                      ///< reserved field
} __attribute__((packed)) tag_detection_t;

For an example of how to read this data, see voxl-inspect-tags.

voxl-configure-tag-detector

Like all MPA services, voxl-tag-detector has a basic configure script to enable/disable the service as well as to reset the config file back to default for factory resets.

yocto:/$ voxl-configure-tag-detector -h

Start wizard with prompts:
voxl-configure-tag-detector

Shortcut configuration arguments for scripted setup.
factory_enable will reset the config file to factory defaults
before enabling the service.

voxl-configure-tag-detector disable
voxl-configure-tag-detector factory_disable
voxl-configure-tag-detector factory_enable
voxl-configure-tag-detector enable

show this help message:
voxl-configure-tag-detector help
yocto:/$ voxl-configure-tag-detector
Starting Wizard

Do you want to reset the config file to factory defaults?
1) yes
2) no
#? 1
wiping old config file
Creating new config file: /etc/modalai/voxl-tag-detector.conf
Created new empty json file: /etc/modalai/tag_locations.conf
The JSON tag location data was modified during parsing, saving the changes to disk

do you want to enable or disable voxl-tag-detector
1) enable
2) disable
#? 1
enabling  voxl-tag-detector systemd service
Created symlink from /etc/systemd/system/multi-user.target.wants/voxl-tag-detector.service to /etc/systemd/system/voxl-tag-detector.service.
starting  voxl-tag-detector systemd service
Done configuring voxl-tag-detector

Tag Detector Configuration File

Up to 3 parallel detection threads can be configured independently. They all output detection data on the same pipe, but have separate debug overlay streams which can be configured here. The default configuration file has recommended settings for the tracking and stereo cameras, with only the tracking camera enabled.

yocto:/$ cat /etc/modalai/voxl-tag-detector.conf
/**
 * This file contains configuration parameters for voxl-tag-detector.
 * You can specify up to 3 cameras to do detection on simultaneously.
 * For the stereo camera pair, only the left camera is used.
 */
{
    "detector_0":     {
        "enable":           true,
        "input_pipe":       "tracking",
        "en_fast_mode":     true,
        "n_threads":        1,
        "en_undistortion":  true,
        "undistort_scale":  0.600000023841858,
        "overlay_name":     "tracking_tag_overlay",
        "lens_cal_file":    "/data/modalai/opencv_tracking_intrinsics.yml",
        "skip_n_frames":    5
    },
    "detector_1":    {
        "enable":           false,
        "input_pipe":       "stereo",
        "en_fast_mode":     true,
        "n_threads":        1,
        "en_undistortion":  true,
        "undistort_scale":  0.899999976158142,
        "overlay_name":     "stereo_tag_overlay",
        "lens_cal_file":    "/data/modalai/opencv_stereo_intrinsics.yml",
        "skip_n_frames":    5
    },
    "detector_2":    {
        "enable":           false,
        "input_pipe":       "extra",
        "en_fast_mode":     true,
        "n_threads":        1,
        "en_undistortion":  false,
        "undistort_scale":  1,
        "overlay_name":     "extra_tag_overlay",
        "lens_cal_file":    "/data/modalai/opencv_extra_intrinsics.yml",
        "skip_n_frames":    5
    }
}

Undistort Ratio

This is mostly relavent to undistorting the fisheye tracking camera images. When undistorting an image, you can choose to either shrink the edges of the image in towards or the center to keep all image data but leave black areas, or stretch the corners out which discards image data at the corners. We find that somewhere in the middle (0.6) is ideal for the tracking camera. For the less distorted stereo cameras we elect to preserve more of the image and set the scale to 0.9.

The scale number is a multiplier on the focal length of the lens, thus a value of 1.0 will preserve the same focal length and result in no black areas on the image, but parts of the image will be lost during the undistortion process.

Fast Mode

Fast mode is enabled by default and scales the image down by a factor of two to perform the quadrangle detection which is one of the most computationally intensive steps. This does reduce detection rates slightly for small or distant tags and can be turned off if detection range is a high priority.

multithreading

For small VGA images, there is little to no benefit to enabling more than one core for processing images. It is generally faster to keep the image in one core’s cache and let one CPU work on it without interruptions or context switches. There are generally other things to keep the other CPU cores occupied.

Skipping Frames

voxl-tag-detector will always drop backed-up frames if it cannot keep up with the input frame rate. However, it is rarely needed to run tag detection on every frame so you can configure voxl-tag-detector to automatically skip frames to reduce CPU load. The default is to skip 5 frames, resulting in every 6th frame being processed. This results in a reasonable 5hz detection rate for a 30fps camera stream.

Tag Location Configuration File

The location, size, type, and ID of each tag you expect to see needs to be listed in the tag_locations.conf configuration file. This data is forwarded through to the output data pipe so subscribers have access to this information without needing to read it manually.

The reference PDFs for the 36h11 family can be found here. Print them to scale for the localization to work properly. The size_m parameter must align with the actual size of the physically printed tag for the localization to be accurate.

yocto:/$ cat /etc/modalai/tag_locations.conf
/**
 * Apriltag Location Configuration File
 * This file is used by voxl-tag-detector
 *
 * A tag can be flagged by the user as fixed, static, or dynamic.
 *
 *  - fixed: The tag is at a known location in space as described by the
 *    T_tag_wrt_fixed vector and R_tag_to_fixed rotation matrix. These fixed
 *    tags are used by voxl-vision-hub for apriltag relocalization.
 *
 *  - static: A static tag can be trusted to be static (not moving) but does not
 *    have a known location. For example, a landing pad.
 *
 *  - dynamic: A dynamic tag can be expected to be in motion such as an object
 *    to be tracked or followed dynamically.
 *
 *
 * If the tag is fixed then you must specify its location and rotation relative
 * to the fixed reference frame. Otherwise the translation and rotation
 * parameters can be ignored.
 *
 * The name parameter is currently only used for info prints but can be helpful
 * for the user to keep track of what tag is which.
 *
 * Each tag must have its size listed. Tags not listed will be assumed to have
 * size equal to the default_size.
 *
 * Currently only 36h11 apriltags are supported
 */
{
    "locations":    [{
            "id":          0,
            "name":        "default_name",
            "loc_type":    "unknown",
            "size_m":      0.40000000596046448,
            "T_tag_wrt_fixed":    [0, 0, 0],
            "R_tag_to_fixed":    [[0, -1, 0], [1, 0, 0], [0, 0, 1]]
        }],
    "default_size_m":      0.40000000596046448
}

Tag Coordinate Frame

Apriltags have a coordinate frame similar to cameras such that X points out the right of the tag, Y points towards the bottom edge of the paper, and Z points into the paper. The flight demonstration video at the beginning of this page is using the default config file. This rotation matrix corresponds to a tag sitting on the floor with the top of the picture pointing forward along the X axis.

We recommend you try to remain consistent with your Apriltag orientations to avoid confusion and error. Positions are easier to visualize and debug.

Tag Detection Debug Overlay

Each detector can output a debug overlay image stream which is another normal camera pipe. As such, it can be viewed with voxl-portal, converted to ROS with voxl_mpa_to_ros](/voxl-mpa-to-ros/), and logged to disk with voxl-logger.

The overlay highlights the 4 detected corners of the tag and writes the ID number of the tag in the middle.

See the image at the top of this page for an example.

Troubleshooting

Accurate tag detection requires calibrated cameras. The voxl-tag-detector service won’t even start if image undistortion is enabled and the specified camera calibration file cannot be found.

We recommend having a white border around your AprilTags at least twice as wide as the individual squares in the black portion of the tag. This greatly improves detection reliability at longer distances by improving the contrast around the tag, making the black square easier to distinguish. This is particularly true when using fast mode.

If the distances seem incorrect, make sure you have specified the correct tag size in the tag_locations.conf file.

Outdoors, shiny Apriltags are particularly susceptible to glare. Try to use a matte surface whenever possible.

Source

Source code is available on Gitlab