The software team is responsible for the design and implementation of the software that allows the robot to operate autonomously. The major software components include Mission Planning, Control Systems, Computer Vision, and Sonar Navigation.
Every year, RoboSub releases a document that describes all of the tasks that will be present at competition. Many of these tasks carry over from previous years, but sometimes new twists are added. When this document is released, the software team holds a meeting to discuss which tasks are easy, which are challenging but doable, and which are unrealistic. We then create a plan for which tasks we should focus development on and which should be designated as stretch goals. Some tasks include going through a gate, dropping an object onto a platform, shooting a torpedo through a target, or locating a pinger and surfacing above it.
One of the oldest tasks at competition is the buoy challenge. For this task, Arctos must find a buoy in the pool and bump into it. This sounds simple, but that does not mean it is easy. To illustrate why this task is challenging, let us consider an example.
Let us say we wish to start from scratch and design a new robot focused solely on tackling this one challenge. Let us call this robot Teddy.
The first aspect we must consider is how Teddy moves. To move underwater, we use thrusters to apply a force to the robot in a specific direction. Let us say the mechanical team designs Teddy with a thruster configuration that can theoretically move or turn it in any direction. But how do we control the thrusters?
Competition requires all robots to be positively buoyant. This means that if a robot were to lose power it should float to the surface. This is a challenge. We can’t let Teddy float on the surface of the water. If we did, Teddy would be a boat and not an underwater robot.
To solve this problem, we must use our thrusters to exert a downward force on Teddy that counteracts buoyancy. This force must be present at all times so our thrusters must always be active.
Now that Teddy can stay still, we can move on to maneuvering Teddy around the pool. Thankfully, we can apply the same solution as before: we can use the thrusters to apply a net force and net moment in any direction. This means we can completely control Teddy's orientation and movement.
The second aspect we must consider is how Teddy observes its environment. Without sensor data, Teddy has no way of knowing where it is, if it is moving, or where anything else is. Let us say the mechanical and electrical teams install a sensor suite on Teddy. Unfortunately, Teddy is underwater, and that seriously limits the amount of sensor data that can be recorded. For example, we cannot use GPS. The best we can hope to measure is: depth underwater with a depth sensor, acceleration and orientation with an Inertial Measurement Unit, and velocity with a Doppler Velocity Log. How Teddy uses this sensor data can become very complicated, so let us assume we have a perfect control and localization system such that Teddy always knows where it is and how it is moving.
The third aspect we must consider is how Teddy detects objects, such as a buoy. To do this, we must implement a computer vision system. In other words, we must use a camera and analyze the images it records to find a buoy. We know what the buoys look like, so we may be able to train Teddy to detect the buoys it sees.
Now that we have addressed these three aspects, we can plan a mission for Teddy to bump into a buoy. Every step of that mission flows through the same pipeline described below: perception, planning, and control.
In the perception phase of the pipeline, the robot takes in information about its surroundings to answer the question “Where am I?” It draws this information from a number of sources.
A ZED stereo camera is installed on the robot. With it, the robot records camera footage and analyzes the images. We use a deep-learning-based approach to find props in those images: specifically, YOLO v3 for rapid object detection and YOLACT for more detailed image segmentation.
In the left image you can see YOLO v3 drawing bounding boxes around areas of importance for a torpedo target. On the right you can see a gif of YOLACT actually highlighting the entire detected object.
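To make this concrete, here is a minimal sketch of running a YOLOv3 detector on a single frame with OpenCV's DNN module. The cfg/weights file names and the class list are assumptions for illustration; ARVP trains its own network on competition props.

```python
# Minimal sketch of YOLOv3 inference via OpenCV's DNN module.
import cv2
import numpy as np

net = cv2.dnn.readNetFromDarknet("yolov3.cfg", "yolov3.weights")
classes = ["gate", "buoy", "torpedo_target"]  # hypothetical prop classes

frame = cv2.imread("frame.png")
h, w = frame.shape[:2]

# YOLO expects a square, normalized input blob.
blob = cv2.dnn.blobFromImage(frame, 1 / 255.0, (416, 416), swapRB=True, crop=False)
net.setInput(blob)
outputs = net.forward(net.getUnconnectedOutLayersNames())

# Each detection row: [cx, cy, bw, bh, objectness, class scores...]
for output in outputs:
    for det in output:
        scores = det[5:]
        class_id = int(np.argmax(scores))
        if scores[class_id] > 0.5:
            cx, cy, bw, bh = det[0] * w, det[1] * h, det[2] * w, det[3] * h
            print(classes[class_id], (cx - bw / 2, cy - bh / 2, bw, bh))
```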
By using a stereo camera, the robot can extract additional information from its environment. A stereo camera is a device with two cameras offset from each other by a known distance. The two cameras record independently of each other, and by comparing the images they produce, the distance from an object to the camera can be triangulated. Below is an image of what this looks like in the simulator. The darker a pixel is, the closer it is to the camera.
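The triangulation itself reduces to a simple relation: depth = focal length × baseline / disparity. Below is a minimal sketch using OpenCV's block matcher; the focal length and baseline are made-up values, not the ZED's actual calibration.

```python
# Minimal sketch of depth-from-disparity triangulation with a stereo pair.
import cv2
import numpy as np

focal_px = 700.0    # focal length in pixels (hypothetical)
baseline_m = 0.12   # distance between the two cameras in meters (hypothetical)

left = cv2.imread("left.png", cv2.IMREAD_GRAYSCALE)
right = cv2.imread("right.png", cv2.IMREAD_GRAYSCALE)

# Block matching finds, for each left pixel, its horizontal shift in the right image.
matcher = cv2.StereoBM_create(numDisparities=64, blockSize=15)
disparity = matcher.compute(left, right).astype(np.float32) / 16.0  # fixed-point -> pixels

# Triangulation: the larger the disparity, the closer the point.
with np.errstate(divide="ignore"):
    depth_m = focal_px * baseline_m / disparity
```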
The simplest sensor on the robot is the depth sensor. It measures water pressure and from that calculates the robot’s depth underwater. Despite being so simple, it is the only sensor that can directly measure position along the z-axis.
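The pressure-to-depth conversion is just the hydrostatic relation P = P_atm + ρ·g·depth, rearranged. A minimal sketch, assuming fresh water:

```python
# Minimal sketch of turning a pressure reading into a depth estimate.
RHO_WATER = 997.0   # kg/m^3, fresh water (assumed; pool water varies slightly)
G = 9.81            # m/s^2
P_ATM = 101325.0    # Pa, pressure at the surface

def depth_from_pressure(pressure_pa):
    """Depth underwater (m, positive down) from absolute pressure (Pa)."""
    return (pressure_pa - P_ATM) / (RHO_WATER * G)

print(depth_from_pressure(121325.0))  # ~2.04 m for 20 kPa of water pressure
```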
The next sensor is an Inertial Measurement Unit (IMU). IMUs are highly sophisticated pieces of technology. ARVP uses a LORD MicroStrain 3DM-GX5-25 AHRS. This model contains an accelerometer, gyroscope, and magnetometer. With it, the robot can directly measure acceleration, magnetic field, angular rate, and ambient pressure, and derive quantities such as an up vector and a north vector. The IMU not only makes measurements but also calculates an attitude estimate and a gravity vector. The most important measurements the robot records are acceleration, angular rate, and attitude. With these measurements the robot can tell what direction it is facing and what direction it is accelerating.
The last and most expensive sensor the robot uses is a Doppler Velocity Log (DVL). ARVP uses a Nortek DVL1000. A DVL is an acoustic sensor specially designed for subsea applications where GPS is unavailable. It exploits the Doppler effect: it emits three acoustic beams and measures the frequency shift of their returns to calculate the sensor's velocity. When ARVP obtained a DVL, it completely changed the software architecture, as it allowed the team to estimate position with some degree of accuracy.
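The underlying physics is the two-way Doppler relation: a beam transmitted at frequency f0 returns shifted by Δf = 2·v·f0/c when the sensor moves at velocity v along that beam. Here is a minimal sketch with illustrative numbers; combining the three beam velocities into a 3D velocity is left out.

```python
# Minimal sketch of the Doppler relation a DVL is built on.
SPEED_OF_SOUND = 1500.0  # m/s in water (approximate)

def beam_velocity(f0_hz, df_hz):
    """Velocity along one acoustic beam from its measured frequency shift."""
    return SPEED_OF_SOUND * df_hz / (2.0 * f0_hz)

# A 1 MHz ping returning 400 Hz higher means ~0.3 m/s along that beam.
print(beam_velocity(1e6, 400.0))  # 0.3
```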
At competition, one of the tasks includes an acoustic pinger. The robot is supposed to navigate on top of this pinger and then surface. This is by far the most difficult task to complete; detecting and calculating the direction of a ping has taken years of development to reach a reasonable level. To detect a ping, we use three hydrophones held in an ‘L’ formation by a custom frame. Each hydrophone measures a ping at a slightly different time. By measuring the time differences between the hydrophone signals, the direction to the pinger can be calculated. This is, at best, very noisy, so pinger estimates need to be thoroughly evaluated. For this, the robot uses a Particle Filter. A test run can be seen in the adjacent gif.
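To sketch the core idea, consider the far-field approximation: for two hydrophones a known distance apart, the arrival-time difference fixes the direction cosine toward the pinger. The spacing and delays below are made-up numbers, and the real signals are far noisier than this suggests.

```python
# Minimal sketch of time-difference-of-arrival (TDOA) bearing estimation with
# an 'L' of three hydrophones: one reference, one offset along x, one along y.
import math

SPEED_OF_SOUND = 1500.0  # m/s in water
SPACING = 0.2            # m between the reference and each other hydrophone (assumed)

def pinger_bearing(dt_x, dt_y):
    """Horizontal bearing (rad) from the arrival-time differences (s) of the
    x- and y-offset hydrophones relative to the reference hydrophone."""
    # Far-field approximation: delay / (spacing / c) = direction cosine.
    ux = SPEED_OF_SOUND * dt_x / SPACING
    uy = SPEED_OF_SOUND * dt_y / SPACING
    return math.atan2(uy, ux)

print(math.degrees(pinger_bearing(9.4e-5, 9.4e-5)))  # ~45 degrees
```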
One of the benefits of using a DVL and having good localization is the ability to create a map of the robot and its surroundings. With a map, the robot can keep track of its own position and estimate its position with respect to the props in the pool. The mapper takes in robot position estimates and prop position estimates and filters them to reduce error. In the adjacent image, an example of the mapper can be seen. On the left is the robot, and to the right are a gate, a buoy, and a torpedo target. Around each prop is an ellipsoid which visualizes the confidence the mapper has in that prop's position. In other words, the mapper believes there is a 95% chance that the prop is inside that ellipsoid. Over time, the ellipsoids shrink as the mapper's confidence increases.
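The filtering idea behind those shrinking ellipsoids can be sketched in one dimension: each new, noisy prop detection is fused with the current estimate, and the variance drops with every fusion. This toy Kalman-style update is a stand-in, not ARVP's actual mapper.

```python
# Minimal sketch of fusing repeated noisy estimates of a prop's position.
def fuse(mean, var, meas, meas_var):
    """Combine a current estimate with a new measurement (Kalman-style update)."""
    k = var / (var + meas_var)  # how much to trust the new measurement
    return mean + k * (meas - mean), (1.0 - k) * var

mean, var = 0.0, 4.0                      # initial guess of a prop's x position
for measurement in [5.2, 4.8, 5.1, 4.9]:  # noisy detections, variance ~0.5 each
    mean, var = fuse(mean, var, measurement, 0.5)
    print(f"estimate {mean:.2f} m, variance {var:.3f}")  # variance shrinks each step
```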
Once the robot has taken in sensor information and formed a position estimate, it must answer the question “What should I do?” This is where the mission planner comes in.
The mission planner monitors the state of the robot and of the props and sets goals to drive the robot. It does not directly control the thrusters; rather, it sets targets for the control system to reach. For example, the mission planner may see that the robot is at position (1, 1, 5) (we will write positions as (x, y, z), where z increases as the robot goes deeper underwater) while the prop for the next task is at (5, 3, 5). The mission planner might then set a goal somewhere near the prop, like (4, 3, 5). Once a goal is set, the mission planner is done. It does not determine the path the robot will take to a goal; that is handled by the robot's control systems.
Setting goals is not the only function of the mission planner; it also handles time management. At competition, the robot has half an hour to complete as many tasks as possible. That may sound like a fair amount of time, but in actuality it is very limited. Moving from task to task can take ~4 minutes and tasks themselves can take an additional ~2 minutes, so a five-task mission (four transits plus five tasks) already takes ~26 minutes. With this in mind, the robot cannot be allowed to stall at a single task for too long. To account for the time limitation, each goal set by the mission planner has a predetermined timeout. If the timeout is reached and a goal has not been met, the mission planner will move on to the next goal.
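A minimal sketch of this goal-and-timeout loop is below. The goal positions, timeout values, and the goal_reached() helper are hypothetical stand-ins for illustration.

```python
# Minimal sketch of the mission planner's time management: each goal carries a
# timeout, and the planner moves on whether or not the goal was reached.
import time

goals = [
    ((4, 3, 5), 240),  # approach the buoy, give up after 4 minutes
    ((8, 1, 3), 120),  # line up on the torpedo target, give up after 2 minutes
]

def goal_reached(goal):
    """Placeholder: would compare the mapper's position estimate to the goal."""
    return False

for position, timeout_s in goals:
    deadline = time.monotonic() + timeout_s
    print("new target for the control system:", position)
    while time.monotonic() < deadline:
        if goal_reached(position):
            break
        time.sleep(1.0)  # the control system is doing the actual driving
    # Timeout or success: either way, cut losses and continue the mission.
```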
One of the biggest challenges with autonomous robotics is that once you start the robot on a mission you completely lose control. As per the rules of competition, we are not allowed, in any way, to communicate with the robot once it starts. That is why it is important to have a robust mission planner. If there is some sort of failure, the mission planner is there to cut losses and continue on with the mission.
The controllers on the robot are the processes that convert positional targets provided by the mission planner into thrust efforts over time. They answer the question, “How do I reach my goal?”
To illustrate what controls do, let us consider an everyday example. Imagine you are driving a car on a highway and you wish to maintain your speed with the traffic around you. If your car moves too quickly, you could hit the car in front of you. If your car moves too slowly, you could be hit by the car behind you. So, let us say you try to keep two car-lengths between you and the car ahead of you. But how can you do this?
The only way you can control your position relative to the other car (let us refer to this as your relative position) is by manipulating your car's accelerator. The accelerator does not directly control your relative position, but you, as an intelligent human being, understand that by accelerating or decelerating you can change the speed of your car, and by changing your speed you can increase or decrease your relative position. In this way, you have an intuitive understanding of the car and how the accelerator affects your relative distance.
Let us say you are 3 car-lengths away from the car ahead of you. Since you want to maintain a distance of 2 car-lengths, you should accelerate. But once you reach 2 car-lengths, you realize you are going to overshoot your goal by 0.5 car-lengths. So you decelerate, but then undershoot, this time by 0.25 car-lengths. By continuing to accelerate and decelerate, you eventually reach your goal of 2 car-lengths.
This example may be extremely simple but let us analyze it in terms of a control system: you are the controller; the target was a relative distance of 2 car-lengths; the state was your acceleration, speed, and relative distance; the system was the two cars moving at varying speeds; the dynamics of the system followed simple Newtonian Physics; and the control input was the accelerator. You, as the controller, were able to convert a target into a series of control inputs over time.
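We can sketch this intuition as an actual controller: a simple proportional-derivative loop that converts the gap error into an accelerator input. The gains and dynamics here are made up for illustration; this is not the robot's controller.

```python
# Minimal sketch of the car example as a feedback loop.
TARGET = 2.0       # desired gap, in car-lengths
KP, KD = 0.8, 1.5  # how strongly to react to the error and to the closing speed

gap, rel_speed = 3.0, 0.0  # start 3 car-lengths back, matching speed
dt = 0.1
for _ in range(200):
    error = gap - TARGET
    accel = KP * error + KD * rel_speed  # the "accelerator" control input
    rel_speed -= accel * dt              # accelerating closes the gap
    gap += rel_speed * dt

print(round(gap, 2))  # settles near 2.0 without a human in the loop
```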
Controlling an underwater robot is far more difficult than controlling a car. On Arctos, there are 8 thrusters: 4 pointed along the z-axis, 2 along the x-axis, and 2 along the y-axis. They must be controlled in such a way that the robot never becomes unstable. An example of a fragile control system would be one that assumes the robot never rolls or pitches, so that the thrusters along the z-axis always point straight up and down. If this robot were to tilt even slightly, the result would be complete failure. As can be seen in the diagram, a small deviation cannot be corrected, so the position of the robot degenerates into failure.
So the robot’s control system must be able to recalculate thrust efforts dynamically such that there is always a net downward force to counteract buoyancy for any possible orientation. In other words, the control system must be able to control all axes of movement simultaneously.
The next challenge for the control system is the weight distribution of the robot. At competition, components are moved around constantly. This has a pronounced effect on how well the robot is able to control itself. For this reason, it is important that the control system is flexible enough to handle changes to the robot very quickly.
At ARVP, we use a Linear Quadratic Regulator (LQR) control system.
The main focus of an LQR control system is to achieve and hold a target configuration for a given linear system using an optimal control law. The optimal control law is a set of differential equations that minimizes a cost function which typically depends on the state and control variables. An LQR control system generates the control law using four matrices: the A, B, Q, and R matrices, which model the physical dynamics, control dynamics, state cost, and control cost, respectively. In mathematical terms, the system dynamics are described with this equation:

ẋ = Ax + Bu
Put simply, the purpose of an LQR control system is to reach and maintain a target state using a dynamic model of the system (A), a model of the control dynamics (B), and two cost matrices (Q and R). In the equation, x is the robot's current state and ẋ is the robot's state in the next instant. The A matrix describes how the robot behaves in an underwater environment. It includes buoyancy, frictional forces, gravity, and the plethora of other physical constraints that affect the robot in an underwater system. This matrix represents how the control system believes the robot's state will change given its current state (x). The B matrix describes how the control system believes its control input (u) affects the robot's state. u can be changed at any time by the controller, but every other variable must be calculated or measured.
To relate this mathematical representation to the previous car example, we may draw the following comparisons: the state x is your speed and your relative distance to the car ahead; the control input u is the accelerator; the A matrix is the Newtonian physics governing the two cars; and the B matrix is how pressing the accelerator changes your speed.
The big question now becomes: how do we determine u? The A matrix can be determined through analysis and experiments, the B matrix can be determined by analyzing the robot's thruster configuration, and x comes from sensor data and filtering. In the car example, you determine u by intuition (a little more, a little less), but the robot cannot rely on human input in the pool.
This is where LQR comes in. Put very simply, LQR is a mathematical procedure that calculates a u with minimal ‘cost’ for a particular time instant. To determine cost, we define two new matrices, Q and R.
The Q matrix describes the cost for error in the state of the system (error in state refers to the difference between the current state and the target state). In other words, it is a weighting factor that prioritizes certain components of the state. If there is a large error but a low cost, LQR will not produce very much control input to reduce the error. If there is a very high cost then LQR will exert much more effort to reduce the error.
The R matrix is very similar, but it assigns cost to the control inputs. If a control input has a low cost, LQR will use it freely. If it has a high cost, LQR will favor other control inputs to reduce error. For example, let us say the robot is very slow when it moves left and right. We can assign a high cost to lateral thrust so that LQR does not rely on it to reach a target.
With these two cost matrices, LQR attempts to find the best u that costs the least. In doing so, it finds the optimal control input for that situation.
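As a concrete sketch, here is how the optimal gain can be computed with SciPy for a toy one-axis system (a double integrator). The A, B, Q, and R values are illustrative stand-ins, not the robot's real models.

```python
# Minimal sketch of computing the LQR control law with SciPy.
import numpy as np
from scipy.linalg import solve_continuous_are

A = np.array([[0.0, 1.0],   # position changes with velocity
              [0.0, 0.0]])  # velocity changes with thrust (via B)
B = np.array([[0.0],
              [1.0]])
Q = np.diag([10.0, 1.0])    # state cost: position error matters most
R = np.array([[0.5]])       # control cost: thrust is fairly cheap

# Solve the Riccati equation, then build the optimal gain K, with u = -K x.
P = solve_continuous_are(A, B, Q, R)
K = np.linalg.inv(R) @ B.T @ P

x = np.array([[1.0], [0.0]])  # 1 m from the target, at rest
u = -K @ x                    # the least-cost control input for this instant
print(K, u)
```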
We have had tremendous success with our LQR control system. Some of the benefits include simultaneous control of all axes of movement, optimal use of thrust effort, and quick retuning when the robot's weight distribution changes.
To demonstrate how powerful our LQR control system can be, see this gif of AURI completing a barrel roll (aileron roll). This may seem simple, but the control system must maintain three targets simultaneously: a forward velocity, a rotational velocity, and a constant depth.
Apart from being mathematically intensive, LQR has one major flaw: it only calculates a control input for a given instant in time. There is no control over the path the robot takes when reaching its target.
To solve this problem, we use a ‘motion_planner’ which, given a target position, generates a path to that position and then feeds a series of velocity targets to the control system over time.
This process can be described in the following flow chart:
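In code, the same loop might be sketched as follows; the speed limit, the stopping tolerance, and the assumption that the controller roughly tracks each velocity target are all illustrative.

```python
# Minimal sketch of the motion_planner idea: turn one position target into a
# stream of modest velocity targets for the control system.
import numpy as np

MAX_SPEED = 0.5  # m/s, an assumed comfortable cruising speed

def velocity_targets(current, goal, dt=0.1):
    """Yield velocity targets that walk the robot from current to goal."""
    current = np.asarray(current, dtype=float)
    goal = np.asarray(goal, dtype=float)
    while np.linalg.norm(goal - current) > 0.05:
        direction = goal - current
        step = direction / np.linalg.norm(direction) * MAX_SPEED
        yield step            # handed to the LQR controller as its target
        current += step * dt  # assume the controller roughly tracks it

for v in velocity_targets((1, 1, 5), (4, 3, 5)):
    pass  # in the real system, each v would be sent to the control system
```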
Our most essential development tool is our simulation system, built on Gazebo. Designing an underwater robot requires huge amounts of testing. At best, we can only test our robot in the pool once a week for 2-3 hours. This cripples development, so ARVP has created a simulator with realistic underwater physics.
By having a simulator, every developer can run tests in the comfort of their own dev machine. They can test their vision, mapping, control, and localization changes, and they can test missions they have created. Of course, a simulator will never replace an in-person physical test, but it allows the team to scale development with the number of developers.
Here we can see a screenshot of Arctos in our simulator. The environment is a representation of what the competition pool looks like.
ARVP’s code is mostly written in C++, C, and Python. Developers generally work in an Ubuntu 18.04 environment. The team makes good use of the Robot Operating System (ROS) framework, which allows us to modularize our code into nodes that communicate in a publisher/subscriber model. Functionality is divided into processes called nodes: the mission planner has its own node, every sensor has its own node, and additional functionality can be added as a node. A node may publish messages (data) to a topic, and any node subscribed to that topic will receive those messages. This allows nodes to distribute information on an as-needed basis, which reduces overhead.
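As a minimal sketch of the publisher/subscriber model, here is a single rospy node that publishes a depth reading and subscribes to the same topic. The topic name, message type, and values are illustrative, not ARVP's actual interfaces.

```python
#!/usr/bin/env python
# Minimal sketch of ROS's publisher/subscriber model with rospy.
import rospy
from std_msgs.msg import Float32

def on_depth(msg):
    # Any node subscribed to the topic receives every published message.
    rospy.loginfo("depth: %.2f m", msg.data)

rospy.init_node("depth_logger")
pub = rospy.Publisher("/sensors/depth", Float32, queue_size=10)
rospy.Subscriber("/sensors/depth", Float32, on_depth)

rate = rospy.Rate(10)  # publish at 10 Hz
while not rospy.is_shutdown():
    pub.publish(Float32(data=3.5))  # a pretend depth reading
    rate.sleep()
```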