Command Correction for Teleoperation using Reinforcement Learning

Felipe Cadar Chamone

In 2001, Fong and Thorpe [1] defined teleoperation as the action "to operate a vehicle or a system over a distance". It has the potential to improve a wide variety of tasks: an operator can control a bulldozer without actually being in the mine, and a pilot can fly a drone to explore hostile areas without risking their life, both receiving live information about the environment their devices operate in. Nevertheless, with great power comes great responsibility. When teleoperating a heavy machine, for example, the operator has to know the state of the remote environment as if they were there, and the commands they send to the machine must be executed instantly. This is where our greatest challenge emerges: the time delay in communication.
Generally, a teleoperation system consists of a master and a slave. The master sends commands through the communication channel to be executed by the slave, and the slave sends back feedback for the master to act on. When the communication channel is not ideal, every message takes some time to be delivered: commands from master to slave take a time t1, and feedback from slave to master takes a time t2. Thus, at every instant, the master is responding to feedback that is already t2 old, and its response will reach the slave delayed by a full round trip t1 + t2 relative to the state that produced that feedback, as shown in Figure 1.
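To make the timing concrete, the sketch below simulates the two one-way delays and shows that a command reacting to a piece of feedback only reaches the slave a full round trip t1 + t2 after the state that produced it. This is a minimal illustration, not part of the original work; the class name DelayedChannel and the delay values are assumptions.

```python
from collections import deque

class DelayedChannel:
    """One-directional channel that delivers messages after a fixed delay."""

    def __init__(self, delay):
        self.delay = delay
        self.queue = deque()  # (delivery_time, message) pairs, in send order

    def send(self, t, message):
        self.queue.append((t + self.delay, message))

    def receive(self, t):
        """Return every message whose delivery time has already passed."""
        delivered = []
        while self.queue and self.queue[0][0] <= t:
            delivered.append(self.queue.popleft()[1])
        return delivered

# Master -> slave commands take t1; slave -> master feedback takes t2.
t1, t2 = 0.3, 0.2
command_channel = DelayedChannel(delay=t1)
feedback_channel = DelayedChannel(delay=t2)

# At time 0.0 the slave reports its state; the master only sees it at t2,
# and its reaction only reaches the slave at t2 + t1.
feedback_channel.send(0.0, "state@0.0")
for msg in feedback_channel.receive(t2):
    command_channel.send(t2, f"command reacting to {msg}")
print(command_channel.receive(t1 + t2))  # arrives a full round trip late
```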
Usually, a human operator can perceive and adapt to a constant communication delay, either by slowing down their movements or by executing small actions and waiting for the outcome. These strategies, however, also decrease productivity and the feeling of immersion. To deal with these problems, we propose to include an intelligent agent on the remote device that adjusts the operator's commands given the delay, the state at the time the command was issued, and the present state of the remote environment.
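The sketch below illustrates one way such a corrector could be wired up. It is only an assumed interface, not the author's implementation: the class CommandCorrectionAgent, the feature layout, and the placeholder policy are all hypothetical, and in the actual proposal the policy would be learned with reinforcement learning.

```python
import numpy as np

class CommandCorrectionAgent:
    """Hypothetical corrector: given the operator's command, the measured
    delay, the state the operator was reacting to, and the current remote
    state, output an adjusted command to execute on the remote device."""

    def __init__(self, policy):
        # policy: maps a feature vector to an additive command adjustment
        self.policy = policy

    def correct(self, command, delay, state_at_issue, current_state):
        features = np.concatenate([
            np.atleast_1d(command),
            [delay],
            np.atleast_1d(state_at_issue),
            np.atleast_1d(current_state),
        ])
        return np.atleast_1d(command) + self.policy(features)

# Toy usage: a zero policy leaves the operator's command untouched.
agent = CommandCorrectionAgent(policy=lambda f: np.zeros(2))
adjusted = agent.correct(command=np.array([0.5, 0.1]),      # e.g. linear/angular velocity
                         delay=0.5,                         # measured round-trip delay (s)
                         state_at_issue=np.array([1.0, 0.0]),
                         current_state=np.array([1.2, 0.1]))
print(adjusted)
```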


2019/2 - POC1

Advisor: Erickson Rangel do Nascimento

PDF available