Look ma, no ground truth! On building supervised anomaly detection from OPS-SAT telemetry

Paper ID

78668

author

Jakub Nalepa
Bogdan Ruszczak
Krzysztof Kotowski
Jacek Andrzejewski
Alicja Musial
David Evans
Vladimir Zelenevskiy
Sam Bammens
Rodrigo Laurinovics

company

KP Labs; European Space Agency (ESA); Telespazio; European Space Agency (ESA-ESOC); IrbGS ltd.

country

Poland

year

2023

abstract

Detecting anomalies in satellite telemetry data is pivotal in ensuring safe satellite operations. Analyzing such data is challenging due to its intrinsic characteristics, as telemetry may be noisy and affected by incorrect acquisition, e.g., resulting in missing parts of the signal. Although there exist data-driven algorithms for detecting abnormal events, spanning classic techniques exploiting expert systems, unsupervised approaches and deep learning models, they are virtually never validated over real-life telemetry data. Also, they often require long time-series data to build a model reflecting the nominal operation of the satellite. Capturing such data on board is tedious and time-consuming, thus data-level digital twins have been blooming to simulate the correct telemetry. Benchmark datasets commonly exploited to validate detection algorithms contain time-series data, where each time series is split into training and test parts presenting similar characteristics. Such data is not affected by the practical challenges commonly observed in on-board telemetry, such as noise or missing data caused by, e.g., inappropriate signal acquisition. Therefore, the estimated anomaly detection capabilities of data-driven techniques may easily become over-optimistic, and the experimental scenarios are often flawed by methodological issues. In this paper, we present our approach toward building a supervised machine learning model for detecting anomalies in real-life OPS-SAT telemetry data.
We will discuss our procedure for building a labeled dataset of telemetry examples from a large dataset of unlabeled telemetry, and will show the importance of following a rigorous procedure for this task, as the quality of the ground-truth annotations (elaborated by a data scientist, with and without label validation performed by the OPS-SAT Operations Team) affects not only the training process, but also the final validation of the machine learning model. We will thoroughly discuss the quantitative and qualitative experimental analysis that allowed us to objectively quantify the detection capabilities of the models, which benefit from hand-crafted feature extractors and classic supervised learners, even with extremely limited ground-truth data. Finally, we will discuss the approach that we followed to prepare the resulting machine learning model for deployment on board OPS-SAT.
