Technically this was an academic project for a class called CS401, but the concept and implementation are entirely my own. The lectures focused solely on presenting the problem, and the students were free to pursue any solution they wished.
The problem to solve here is the coordination of movements of a swarm of simple robots. The idea is to survey a large land area by deploying a quantity of mobile sensors with modest individual capabilities. The initial positions of the sensors are expected to be very loosely controlled, e.g. as if they were deployed en masse from the air into the general target area. The robots would then need to communicate and coordinate their movements in situ in order to survey the predetermined target area. Redundant coverage may also be desired, for example surveying each point of the target area with at least n independent sensors.
The approach I chose to pursue is a kind of distributed evolutionary algorithm. Each sensor optimizes its own movement plan through random mutations which are evaluated in the context of what is known about nearby sensors’ plans. This technique hinges on the assumption that any two sensors which are close enough to be able to survey a common area are close enough to communicate.
The simulation I developed was able to produce swarm movement plans in just a few seconds, suggesting that the algorithm is viable for deployment on low power processors such as those a simple robotic platform would include.
The following three images illustrate a simple scenario with two sensors, in random initial state, partially optimized, and fully optimized. The pink region is the designated target area, and the green regions are the coverage of the sensors as they (plan to) move.
The simulation code is available for download. It may include several incomplete features.
The remainder of this page is taken from a draft paper and goes into much more detail.
Sensor and Coverage Model
Sensors are modeled as mobile robots which have continuous movement over a limited distance and are able to sense effectively within a fixed distance of themselves. Sensor positions and movements are expressed with arbitrary precision, as real 2-dimensional vectors.
Coverage is estimated by a dynamically-sized grid of points, with each point equidistant from its six nearest neighbors. In effect this divides land area by a regular hexagonal tiling. Each hexagonal cell is considered covered by a sensor if its centroid (a grid point) falls within the sensor’s effective radius. The density of this grid is chosen in a way that balances accuracy of estimation with computational complexity. An appropriate spacing of grid points is approximately one-fifth of the expected sensing radius.
Each sensor’s movement plan is expressed as a sequence of vectors, with each movement step equal in length to the sensor’s effective radius. This allows the sensor’s contiguous band of coverage to be accurately estimated as a series of circles centered on the points connected by the vectors of the movement plan, as if each sensor were executing one movement step, stopping to collect data, and then continuing with the next step.
The optimization algorithm runs independently on each sensor, optimizing the sensor’s contribution to overall coverage. Sensors communicate their plans to each other, but each sensor optimizes only its own plan. Sensors maintain a record of movement plans received from other sensors, and are able to store and forward this information to each other. Thus, a group of sensors forming a connected network will have access to up-to-date information about each sensor in this network. Further, a sensor that leaves this network and then makes contact with another independent network of sensors will be able to update every sensor in the new network with information about the previous group’s last-known plans. Assuming that sensor communication range is greater than sensing range, this should provide sufficiently correct swarm movement information for each sensor, and each sensor should be able to properly optimize its own behavior for contribution to the swarm’s overall coverage. In other words, if two sensors are unable to communicate, then these sensors must be sufficiently distant that their plans are irrelevant to each other.
Target regions of arbitrary size and shape may be defined by flagging grid points as targets. Coverage of target areas is prioritized over coverage of non-target areas. All sensors are to be given this target information a priori.
The algorithm could be classified as a (1+1) Evolutionary Strategy. This means that a single solution is kept in memory with only a transitory second copy. A solution consists of a sequence of vectors which determine how the sensor will move. Iteratively, the algorithm makes a random change (mutation) to the movement plan and then reverts that change if it degrades overall coverage. This arrangement resembles a type of evolutionary algorithm having only one individual in its population and generating one offspring per generation.
When a sensor mutates its movement plan, one of several types of change may be applied:
- With some small probability, the sensor will set its entire movement plan as a straight line in a random direction. This helps sensors deployed outside of target areas to locate targets within their reach.
- With some probability (e.g. 20%), two randomly chosen vectors in the movement plan will be swapped. This causes a random segment of the overall path to be shifted while the rest of the path remains unchanged.
- With high probability (e.g. 80%), a randomly chosen vector in the movement plan is rotated by a random angle taken from a normal distribution. Experiments suggest that a variance of about (2 * pi / 5) radians is optimal. Further, with some probability (e.g. 20%), this same rotation is also applied to all subsequent vectors in the plan, causing the entire path beyond the randomly chosen vector to be rotated.
Having applied a mutation to the solution in memory, the algorithm must decide whether the change has improved coverage. Comparison of the two solutions relies on several pieces of information about the results of executing the tentative plan as compared to the current plan. The comparison is based on two values: Target k-Coverage (TC) and Non-Target k-Coverage (NTC).
Target k-Coverage reveals how many target cells received how many visits from unique sensors (true k-sweep coverage), and similarly Non-Target k-Coverage for non-target cells. Both are histogram-type breakdowns of the properties named. They are arrays of cell tallies indexed by coverage. For example, if TC = 23, then 23 target cells are visited by 5 unique sensors, meaning 23 cells have 5-sweep coverage.
Several metrics are calculated from these values, and then each is evaluated for improvement, in a particular order of priority. If a metric shows no change positive or negative, it is ignored and the next metric is checked, and so on.
- First priority is TCOVER. This is a tally of target cells having at least 1-sweep coverage. If this metric changes, immediately judge in favor of the higher value. This metric assures that no target areas are abandoned after they are “discovered” by a sensor.
- Second priority is TSCORE. This is a weighted sum of the values TC. Over all valid k, TSCORE = sum( (1.0 – 0.05 * ( k – 1 ) ) * k * TC[k] ) If this metric changes, immediately judge in favor of the higher value.
- Third priority is NTSCORE. This is calculated and compared exactly as TSCORE, except that it uses non-target data NTC.
- Finally, if no metrics show any change, the new solution is kept anyway. This helps to promote exploration of the solution space.
The current approach could be described as a 1-phase sweep: sensors optimize their entire movement plan from their initial positions and then execute that plan in its entirety. Coverage performance and fault tolerance could benefit greatly from re-optimization of movement plans mid-sweep. For example, all sensors could determine complete, optimal movement plans from their initial positions and then proceed with only a small portion of those plans. From these new positions the sensors establish new communication networks, run their optimizations again, and proceed with the next phase of the sweep, and so on. Fault tolerance would be much improved with this method. If some sensors fail to reach their destinations, whether by movement inaccuracy or by complete movement failure, the swarm can compensate for this in the next phase of movement. General performance could also benefit from this approach as sensors move, establish new communications cliques, and gather additional swarm information from phase to phase.
The behavior of this algorithm may be altered simply by modifying the acceptance criteria for movement plan improvement. Additional metrics unrelated to sweep coverage may be included, for example, growing and maintaining communications groups, or maintaining chains of communication between a base station and all sensors. It should also be possible to enforce a destination for each sensor, so for example, sensors may be instructed to survey the immediate area of a base station and then return, or to survey an area while setting up static coverage of certain areas or boundaries.