Design and Optimization of Multi-Rendezvous Maneuvres based on Reinforcement Learning and Convex Optimization
- Paper ID
87909
- DOI
- author
- company
Delft University of Technology; SENER Aeroespacial
- country
The Netherlands
- year
2024
- abstract
The multi-target rendezvous problem, or Space Traveling Salesman Problem (STSP), is a mixed-integer nonlinear optimization problem whose goal is to compute the most efficient trajectory visiting a set of targets in space; it is of critical importance for the booming commercial and sustainability space industries. Traditionally, the problem is decomposed into a higher-level integer optimization, solved using heuristics, and a nonlinear optimization, solved using analytical, semi-analytical or numerical trajectory design approaches. Conventional approaches incur long computation times and struggle to account for complex perturbations in spacecraft dynamics, leading to suboptimal tours. This work presents an STSP missionization framework leveraging Reinforcement Learning (RL) and Sequential Convex Programming (SCP) to address these challenges accurately and effectively. Deep RL is used to enable a spacecraft to learn optimal decision-making policies through interaction with the integer optimization environment (the space of all STSP tours), where state transitions are modelled using state-of-the-art semi-analytical trajectory design methods. A Graph Convolutional Network learns through trial and error to predict the likelihood of a path being optimal; this knowledge is then leveraged to produce near-optimal sequences of unseen targets, greatly improving on the computational efficiency of conventional integer optimization methods. Convex optimization provides a rigorous mathematical framework for optimizing spacecraft trajectories subject to vehicle constraints. SCP extends convex optimization to non-convex problems, provided an initial guess for the optimal trajectory is available. The SCP is solved with interior-point methods implemented in the SENER Sequential Optimization Toolbox (SOTB). The present approach offers several advantages.
Firstly, it enables the discovery of near-optimal trajectories that minimize fuel consumption and optimize mission duration, resulting in cost savings and improved efficiency. Secondly, it allows for the optimization of diverse spacecraft trajectories while accounting for uncertainties and disturbances, improving the accuracy of the final trajectory and thus minimizing corrective control effort. Lastly, the framework supports multiple spacecraft propulsion profiles, yielding optimal tours for pseudo-impulsive, pseudo-low-thrust and low-thrust spacecraft. The resulting framework is a powerful end-to-end tool for multi-rendezvous mission design and defines an innovative approach to solving orbit transfer problems in the aerospace sector. The tool will be part of the OSSIE mission implementation, enabling the discovery of optimal tours while adapting to evolving mission requirements and environmental conditions, and advancing European spacecraft deployment, space debris removal and space exploration capabilities.
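
The two-level decomposition described in the abstract (an outer integer search over target sequences, an inner trajectory cost per leg) can be illustrated with a minimal Python sketch. It is not the paper's method: the exhaustive permutation search stands in for the heuristic or RL-based sequence selection, and `leg_delta_v` is a hypothetical placeholder that approximates each leg as a Hohmann transfer between circular orbits plus a plane change at the target radius, ignoring perturbations, phasing and timing. All function names and the (radius, inclination) target representation are assumptions made for illustration.

```python
import itertools
import math

MU_EARTH = 398600.4418  # Earth's gravitational parameter, km^3/s^2


def leg_delta_v(origin, target):
    """Crude delta-v (km/s) for one leg between two circular orbits.

    Each orbit is a (radius_km, inclination_deg) tuple. The estimate is a
    Hohmann transfer plus a plane change performed at the target radius.
    This is a placeholder for a real trajectory design method.
    """
    r1, inc1 = origin
    r2, inc2 = target
    a_transfer = 0.5 * (r1 + r2)  # semi-major axis of the transfer ellipse

    v1 = math.sqrt(MU_EARTH / r1)  # circular speed at origin
    v2 = math.sqrt(MU_EARTH / r2)  # circular speed at target
    # Speeds on the transfer ellipse at each end (vis-viva equation).
    v_t1 = math.sqrt(MU_EARTH * (2.0 / r1 - 1.0 / a_transfer))
    v_t2 = math.sqrt(MU_EARTH * (2.0 / r2 - 1.0 / a_transfer))

    hohmann = abs(v_t1 - v1) + abs(v2 - v_t2)
    plane_change = 2.0 * v2 * math.sin(math.radians(abs(inc2 - inc1)) / 2.0)
    return hohmann + plane_change


def tour_cost(start, sequence):
    """Inner level: total delta-v of visiting `sequence` in order from `start`."""
    total = 0.0
    current = start
    for target in sequence:
        total += leg_delta_v(current, target)
        current = target
    return total


def best_tour(start, targets):
    """Outer level: exhaustive search over visiting orders.

    Only feasible for a handful of targets, since the number of orders
    grows factorially; heuristics or learned policies replace this step
    in practice.
    """
    best = min(itertools.permutations(targets),
               key=lambda seq: tour_cost(start, seq))
    return list(best), tour_cost(start, best)
```

As a usage example, calling `best_tour((7000.0, 51.6), [(7200.0, 52.0), (7050.0, 51.7), (7400.0, 53.0)])` returns the visiting order with the lowest total delta-v under this toy cost model, together with that cost in km/s. With n targets the search evaluates n! sequences, which is the scaling issue that motivates faster sequence-selection methods.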