Q-Learning with Double Progressive Widening: Application to Robotics


Discretization of the state and action spaces is a critical issue in Q-Learning. We propose adapting the discretization in real time with the progressive widening technique, which has already been used in bandit-based methods. The results converge consistently to the optimum of the problem, without the parametrization having to be changed for each new problem.
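To make the idea concrete, here is a minimal sketch of progressive widening applied to tabular Q-Learning. The abstract does not state the widening rule, so the sketch assumes the form common in bandit-based methods: after n visits, at most ceil(c * n^alpha) discretization points are allowed. The names `allowed_count` and `WideningQLearner`, the constants `c` and `alpha`, the uniform sampling of new actions, and the epsilon-greedy policy are all illustrative assumptions, not the authors' implementation. Only the action space is widened here; a "double" variant would apply the same counting rule to the states.

```python
import math
import random
from collections import defaultdict


def allowed_count(visits, c=1.0, alpha=0.5):
    """Discretization points allowed after `visits` visits (assumed rule)."""
    return max(1, math.ceil(c * visits ** alpha))


class WideningQLearner:
    """Tabular Q-learning over a 1-D continuous action range [low, high].

    Hypothetical sketch: the set of candidate actions for each state grows
    with the number of visits to that state instead of being fixed upfront.
    """

    def __init__(self, low, high, lr=0.1, gamma=0.95, epsilon=0.1, seed=0):
        self.low, self.high = low, high
        self.lr, self.gamma, self.epsilon = lr, gamma, epsilon
        self.rng = random.Random(seed)
        self.visits = defaultdict(int)    # state -> visit count
        self.actions = defaultdict(list)  # state -> actions sampled so far
        self.q = defaultdict(float)       # (state, action) -> value estimate

    def select_action(self, state):
        self.visits[state] += 1
        # Widen: sample new actions until the allowed count is reached.
        while len(self.actions[state]) < allowed_count(self.visits[state]):
            self.actions[state].append(self.rng.uniform(self.low, self.high))
        candidates = self.actions[state]
        # Epsilon-greedy choice among the actions discretized so far.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(candidates)
        return max(candidates, key=lambda a: self.q[(state, a)])

    def update(self, state, action, reward, next_state):
        # Standard Q-learning target, maximizing over the known actions of
        # the next state (0.0 if that state has no actions yet).
        next_values = (self.q[(next_state, a)] for a in self.actions[next_state])
        target = reward + self.gamma * max(next_values, default=0.0)
        self.q[(state, action)] += self.lr * (target - self.q[(state, action)])
```

With these assumed defaults (`c=1.0`, `alpha=0.5`), a state visited 9 times holds 3 candidate actions and a state visited 100 times holds 10, so the discretization refines itself only where the agent actually spends time.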

Lu BL., Zhang L., Kwok J. (eds) Neural Information Processing. ICONIP