Software Platform

Ever wondered how a robot “decides” to turn left or right? Like most autonomous systems, Auri’s software stack is a distributed network of processes and algorithms. The real world is a complex environment, but reacting autonomously to it can be broken down into a few sub-problems.

  1. Interpreting the world. We need to know where we are, and what’s around us.
  2. Decision-making. What will we try to do next? We want to output commands as simple as “turn right” but also as subtle as “search for the buoy”.
  3. Acting. Translating our commands into real actions.

It’s this pipeline of information from interpreting to acting in the real world that makes robots work. This is the goal of ARVP’s software system, and is the basis for its design. It is composed of several distinct modules:

  • Control System
  • Computer Vision
  • Mission Planning
  • Simulation

These components are all connected through the Robot Operating System (ROS), an open-source communications library. ROS was chosen as a software framework because it supports a highly distributed system, which lets Auri maximize the use of its two onboard computers. In addition, ROS nodes empower the software team to create modular code and consistent I/O endpoints, which have been invaluable to the development process. Because each component can be leveraged separately, different techniques can be evaluated and iterated on quickly; for example, computer vision algorithms can be swapped at runtime to determine the best-performing one under current conditions.

Control System

This year, Auri's control system was completely redesigned. In previous years, the team had used a PID control system, which presented numerous challenges. Most notably:

  • A significant amount of time was dedicated to recalibrating the controllers every time the hull of the robot was modified; and
  • The control system could only effectively control one axis of movement at a time.

Upon researching what types of control systems had historically performed well at RoboSub, ARVP decided to develop a Linear Quadratic Regulator (LQR) control system. The main focus of an LQR control system is to achieve certain criteria for a given system as efficiently as possible using an optimal control law. The optimal control law is a set of differential equations that minimizes a cost function, which typically depends on the robot state and control variables.
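As a sketch of the underlying idea only (a toy 1-D double integrator, not Auri's actual dynamics model), the snippet below computes a discrete-time LQR gain by iterating the Riccati equation; the resulting law u = -Kx minimizes a quadratic cost over the state and control variables, as described above.

```python
import numpy as np

# Toy plant: a 1-D double integrator (position, velocity) discretized
# at dt = 0.1 s. These matrices are illustrative, not Auri's model.
dt = 0.1
A = np.array([[1.0, dt], [0.0, 1.0]])
B = np.array([[0.5 * dt ** 2], [dt]])
Q = np.eye(2)          # weight on state error in the cost function
R = np.array([[0.1]])  # weight on control effort

def lqr_gain(A, B, Q, R, iters=500):
    """Compute the discrete-time LQR gain K by iterating the Riccati
    equation; u = -K x then minimizes sum(x'Qx + u'Ru)."""
    P = Q.copy()
    for _ in range(iters):
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        P = Q + A.T @ P @ (A - B @ K)
    return K

K = lqr_gain(A, B, Q, R)
# The closed-loop system x_next = (A - B K) x should be stable,
# i.e. all eigenvalues inside the unit circle.
closed_loop_eigs = np.linalg.eigvals(A - B @ K)
```

Unlike a hand-tuned PID loop, the gain K here falls out of the model and the weights Q and R, which is what makes recalibration after hull changes less painful.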

Computer Vision

Auri’s vision algorithms are designed with color invariance in mind. This is because underwater image processing is affected by light attenuation and scattering, which results in poor contrast and non-uniform colors. Instead, Auri’s new vision algorithms use the shapes of the competition objects (e.g., buoy, path, gate), which are more reliable indicators. Auri relies on two techniques to isolate objects without explicitly declaring tight color ranges:

  • A 2D histogram to isolate colors with high contrast to an overall image, and
  • Image segmentation using superpixels.

These techniques are used to extract contours from images, which are then used for object identification through a shape matching process.

2D Histogram Technique

To isolate contours of interest in an image, we first perform background subtraction on our images using a 2D hue and value histogram. From this histogram, we extract a list of disconnected clusters, representing the different color regions in the image. By removing the largest clusters, we can create a binary mask on the pixels in the image that stand out most. Contours are then extracted from the binary mask using OpenCV.

This technique works under the assumption that the received image’s background will be roughly one color, and that the foreground will have some minimum amount of contrast. Its primary advantage is its simplicity and speed.
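A minimal sketch of the idea, assuming normalized hue/value channels and a single dominant background color. This simplified version marks any heavily populated histogram bin as background rather than extracting and removing clusters as the real pipeline does:

```python
import numpy as np

def foreground_mask(hue, value, bins=16):
    """Background subtraction via a 2D hue/value histogram.

    Pixels that fall in the most-populated histogram bins are
    treated as background; everything else is kept as foreground.
    """
    h_idx = np.clip((hue * bins).astype(int), 0, bins - 1)
    v_idx = np.clip((value * bins).astype(int), 0, bins - 1)
    hist = np.zeros((bins, bins), dtype=int)
    np.add.at(hist, (h_idx.ravel(), v_idx.ravel()), 1)
    # Bins holding over 20% of all pixels count as background regions.
    background = hist > 0.2 * hue.size
    return ~background[h_idx, v_idx]

# Mostly-uniform background with one small high-contrast patch.
hue = np.full((64, 64), 0.6)
value = np.full((64, 64), 0.4)
hue[10:20, 10:20] = 0.1
value[10:20, 10:20] = 0.9
mask = foreground_mask(hue, value)
```

The resulting binary mask is what a contour-extraction call (e.g. OpenCV's findContours) would consume.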

Image Segmentation with Superpixels

Whereas the histogram technique looks at an image as a whole, image segmentation was used to account for color locality. The basic idea of this technique is to divide an image into a grid of roughly equal-sized chunks called “superpixels”, each consisting of a single average color. This is accomplished using a third-party library called gSLICr. Once an image has been segmented, good contours can be obtained by thresholding.
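The grid-averaging idea can be illustrated as follows. This toy stand-in simply averages fixed blocks and thresholds the result; the actual system uses gSLICr, which computes proper SLIC superpixels on the GPU:

```python
import numpy as np

def grid_superpixels(img, cell=8):
    """Toy stand-in for a superpixel pass: replace each cell x cell
    block of the image with its average color."""
    out = img.astype(float).copy()
    h, w = img.shape[:2]
    for y in range(0, h, cell):
        for x in range(0, w, cell):
            out[y:y + cell, x:x + cell] = img[y:y + cell, x:x + cell].mean(axis=(0, 1))
    return out

# Dark scene with one reddish object occupying a block.
img = np.zeros((32, 32, 3))
img[8:16, 8:16] = [1.0, 0.2, 0.2]
sp = grid_superpixels(img, cell=8)
mask = sp[..., 0] > 0.5   # threshold the averaged red channel
```

Because each chunk collapses to one color, thresholding the segmented image yields much cleaner region boundaries than thresholding raw, noisy pixels.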

Shape Matching

Shape matching is the concept of classifying contours by matching them to pre-generated contours of a particular object. This technique is generic, and allows new detection models to be trained efficiently. For each target object, we render its 3D model at incremental orientations using OpenSceneGraph. Using our image segmentation strategy, we then extract the contours and store them in a translation- and scale-invariant format in a database. The process is left rotation-variant in order to reduce match collisions and to allow 3D pose estimation. For speed, CUDA is used to GPU-accelerate comparisons between perceived and stored contours.
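To illustrate the invariances involved, here is a toy matcher (hypothetical helper names, not the database pipeline described above) that removes translation and scale but deliberately keeps rotation, assuming contours have been resampled to the same number of corresponding points:

```python
import numpy as np

def normalize_contour(pts):
    """Map contour points to a translation- and scale-invariant form.
    Rotation is deliberately NOT removed, mirroring the
    rotation-variant matching described above."""
    pts = pts - pts.mean(axis=0)                    # remove translation
    scale = np.sqrt((pts ** 2).sum(axis=1)).mean()  # mean distance to centroid
    return pts / scale                              # remove scale

def match_score(a, b):
    """Lower is better; 0 means identical up to translation/scale."""
    return np.abs(normalize_contour(a) - normalize_contour(b)).mean()

square = np.array([[0, 0], [1, 0], [1, 1], [0, 1]], dtype=float)
moved = square * 3.0 + 5.0  # same shape, scaled and shifted
rect = np.array([[0, 0], [2, 0], [2, 1], [0, 1]], dtype=float)
```

A perceived contour is then classified by the database entry with the lowest score; keeping rotation in the descriptor is what lets the matched orientation feed pose estimation.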

Mission Planning

High-level control of robot operations is handled by a single hierarchical state machine. This design emerged early in preparation, when the software team decided to attempt more tasks this year. To ensure an extensible structure and robust error handling, SMACH, an open-source Python library, is used.

The top-level plan is first conceived and drawn out as a flow chart covering the flow of the main process, data keys, and concurrent side processes. Based on this plan, every state is coded as an individual Python class, each with a function to run on activation, a list of exit paths, and the in/out flow of data keys. In addition to custom states, states for ROS action and service clients are created directly to interface more easily with other ROS nodes. When this cannot be done, custom interfaces are made to manage the sending and receiving of messages on ROS topics, allowing the Mission Planner to control other nodes in the system.

With all the interfaces and states coded, the web of transitions between states is defined, as well as the flow of data between them. In our case, a separate state machine was built for every task, such as the buoy task, where the robot has to touch three different buoys in a certain order. That state machine contains states such as detect, track, and reposition, and the transitions between them are based on the success or failure of each state; for instance, when the tracker loses the buoy, it makes the appropriate transition to re-detect it. Once a state machine for a task is ready, it is itself added as a state to the top-level state machine. It is this hierarchical capability of SMACH that allows handling both the completion of individual tasks and the flow of all tasks in the competition.
Overall, the goal of the ARVP Mission Planner is a combination of properly interfacing with all other components, and applying a plan to make every component work together seamlessly.
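The hierarchical pattern can be sketched in plain Python as follows (hypothetical state and outcome names, not the team's actual SMACH code; SMACH containers work similarly in that a state machine is itself a state):

```python
class State:
    """A state runs on activation and returns one of its exit outcomes."""
    def execute(self, userdata):
        raise NotImplementedError

class Detect(State):
    def execute(self, userdata):
        userdata["buoy_seen"] = True  # pretend the detector succeeded
        return "found"

class Track(State):
    def execute(self, userdata):
        return "touched" if userdata.get("buoy_seen") else "lost"

class StateMachine(State):
    """A state machine is itself a State, so a whole task can nest
    as a single state inside the top-level machine."""
    def __init__(self, transitions, start, terminal):
        self.transitions = transitions  # name -> (state, outcome -> next name)
        self.start, self.terminal = start, terminal

    def execute(self, userdata):
        name = self.start
        while name not in self.terminal:
            state, outcomes = self.transitions[name]
            name = outcomes[state.execute(userdata)]
        return name

buoy_task = StateMachine(
    transitions={
        "DETECT": (Detect(), {"found": "TRACK", "none": "failed"}),
        "TRACK": (Track(), {"touched": "done", "lost": "DETECT"}),
    },
    start="DETECT",
    terminal={"done", "failed"},
)
result = buoy_task.execute({})
```

The lost-to-DETECT transition mirrors the re-detection behaviour described above, and `buoy_task` could itself be registered as one state of a top-level competition machine.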

Simulation

Developed late last year, ARVP’s simulator continues to play a critical role in testing the software system end-to-end. Built on the UWSim marine-research project, the simulator has significantly sped up development for many projects, including the control system, various detection algorithms, and the mission planning stack.


This year, the simulator dynamics were significantly improved in parallel to the development of the LQR dynamics model. With the addition of realistic drag and thrust coefficients, the robot’s behaviour is much closer to reality than in previous versions.

Hydrophones

Three hydrophones (underwater microphones that detect sound waves) will be used to capture the pinger signal. The hydrophones will be arranged in an L-shaped configuration, with the middle hydrophone acting as a reference for the other two. The signal will be amplified and DC-biased using a simple op-amp circuit. A dual-channel ADC will sample one pair of hydrophones at a time (the reference plus one other). A software-defined radio approach is used for processing: the signal is filtered and normalized, and the phase shift between the two signals is used to determine the angle to the pinger along one axis. The other pair is then sampled and used to localize the pinger along the other axis. Combining this information gives a heading to the pinger.
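A sketch of the phase-to-angle step for one hydrophone pair, using assumed example values for the pinger frequency, hydrophone spacing, and speed of sound (the actual hardware parameters may differ):

```python
import numpy as np

# Assumed example values, not the final hardware specification.
SPEED_OF_SOUND = 1480.0  # m/s in water
PINGER_FREQ = 30e3       # Hz
SPACING = 0.02           # m between the hydrophone pair

def bearing_from_phase(phase_shift):
    """Angle (rad) to the pinger from the phase shift between one
    hydrophone pair. Unambiguous only while the spacing is less than
    half a wavelength."""
    wavelength = SPEED_OF_SOUND / PINGER_FREQ
    path_diff = phase_shift / (2 * np.pi) * wavelength
    return np.arcsin(np.clip(path_diff / SPACING, -1.0, 1.0))

# Simulate the phase shift a pinger at 30 degrees would produce,
# then recover the angle from it.
true_angle = np.deg2rad(30.0)
phase = 2 * np.pi * SPACING * np.sin(true_angle) * PINGER_FREQ / SPEED_OF_SOUND
recovered = bearing_from_phase(phase)
```

Running the same computation on the second pair gives the angle in the other axis, and the two together fix the heading.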