Reinforcement Learning-Driven Framework for High-Precision Target Tracking in Radio Astronomy

Tanawit Sahavisit; Popphon Laon; Supavee Pourbunthidkul; Pattharin Wichittrakarn; Pattarapong Phasukkit; Nongluck Houngkamhang

doi:10.3390/galaxies13060124

Date

2025-10-31

Authors

Tanawit Sahavisit

Popphon Laon

Supavee Pourbunthidkul

Pattharin Wichittrakarn

Pattarapong Phasukkit

Nongluck Houngkamhang

Publisher

Galaxies

Abstract

Radio astronomy requires precise target localization and tracking to ensure accurate observations. Conventional control methods, including PID controllers, often struggle with pointing errors caused by mechanical limitations, environmental fluctuations, and electromagnetic interference. To address these challenges, this study presents a reinforcement learning (RL)-based framework for high-precision tracking in radio telescopes. The proposed system integrates a localization control module, a receiver, and an RL tracking agent that operates in scanning and tracking stages. The agent optimizes its policy by maximizing the signal-to-noise ratio (SNR), a critical factor in astronomical measurements. The framework employs a refurbished 12-m radio telescope at King Mongkut’s Institute of Technology Ladkrabang (KMITL), which was originally constructed as a satellite earth station antenna for telecommunications and subsequently adapted for radio astronomy research. The telescope incorporates dual-axis servo control and high-resolution encoders. Real-time SNR data and streaming are supported by a HamGeek ZedBoard with an AD9361 software-defined radio (SDR). The RL agent uses the Proximal Policy Optimization (PPO) algorithm with a self-attention actor–critic model, and its hyperparameters are tuned with Optuna. Experimental results indicate strong performance: the system maintained stable tracking of randomly moving, non-patterned targets for over 4 continuous hours without any external tracking assistance, and achieved an SNR improvement of up to 23.5% over programmed TLE-based tracking during live satellite experiments with Thaicom-4.
The simplicity of the framework, combined with its adaptability and ability to learn directly from environmental feedback, highlights its suitability for next-generation astronomical techniques in radio telescope surveys, radio line observations, and time-domain astronomy. These findings underscore RL’s potential to enhance telescope tracking accuracy and scalability while reducing control system complexity for dynamic astronomical applications.
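
The abstract states that the agent learns by maximizing SNR but does not give the reward formulation. The following is a minimal sketch of one way such a signal could be computed, assuming the reward is the change in SNR (in dB) after a pointing step. The function names `snr_db`, `snr_reward`, and `relative_improvement`, as well as the numeric inputs, are hypothetical illustrations and are not taken from the paper.

```python
import math


def snr_db(signal_power: float, noise_power: float) -> float:
    """Convert linear signal and noise power to an SNR in decibels."""
    if signal_power <= 0 or noise_power <= 0:
        raise ValueError("signal_power and noise_power must be positive")
    return 10.0 * math.log10(signal_power / noise_power)


def snr_reward(prev_snr_db: float, curr_snr_db: float) -> float:
    """Assumed reward shape: positive when a pointing step raises the SNR."""
    return curr_snr_db - prev_snr_db


def relative_improvement(baseline: float, improved: float) -> float:
    """Percent improvement of one SNR figure over a nonzero baseline."""
    if baseline == 0:
        raise ValueError("baseline must be nonzero")
    return 100.0 * (improved - baseline) / baseline


# Illustrative values only (not measurements from the paper).
before = snr_db(signal_power=50.0, noise_power=1.0)
after = snr_db(signal_power=100.0, noise_power=1.0)
step_reward = snr_reward(before, after)  # positive: the step improved SNR
```

A difference-based reward of this kind rewards each step that moves the antenna toward stronger signal, which is one plausible fit for the scanning and tracking stages described above; the actual reward used by the authors may differ.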