Cluster-Based Symbolic Compression of Time Series for Scalable Forecasting and Analysis

  • Yaser Jararweh
  • Mustafa Daraghmeh
  • Anjali Agarwal
  • Kuljeet Kaur

Research output: Chapter in book, report, or conference proceedings; Conference contribution; Peer-reviewed

Abstract

The increasing volume and velocity of time series data make effective, scalable predictive modeling challenging. This paper presents a novel time series compression and transformation pipeline to address this challenge. The pipeline combines three components: sliding window segmentation to identify local patterns, centroid-based clustering to convert continuous values into discrete symbols, and run-length encoding to compactly represent runs of repeated symbols. Together, these steps convert raw time series data into interpretable numerical symbolic representations that capture essential temporal patterns while significantly reducing the volume of data. We evaluated the method on real-world metrics from Azure Functions system-wide traces (the number of applications, functions, and invocations, plus average execution time), collected every 5 minutes over a 14-day period. Experimental results show high compression ratios with lightweight computation while preserving pattern interpretability, which reduces the learning cost for regression and forecasting tasks. The resulting model-agnostic representation can be applied to a variety of machine learning architectures, including recurrent and transformer-based models, offering a solution for scalable time series analytics in settings such as cloud and edge computing.
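The three stages named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the record does not state the window size, stride, number of clusters, or which features are clustered, so this example assumes each window is summarized by its mean, uses a plain one-dimensional k-means as the centroid-based clustering step, and runs on made-up values. All function names are hypothetical.

```python
from itertools import groupby
from statistics import mean


def sliding_windows(series, size, stride):
    """Split a series into fixed-size windows taken every `stride` points."""
    return [series[i:i + size] for i in range(0, len(series) - size + 1, stride)]


def kmeans_1d(values, k, iterations=50):
    """Plain Lloyd's algorithm on scalars with deterministic initial centroids."""
    ordered = sorted(values)
    # Assumption: seed centroids at evenly spaced positions of the sorted values.
    centroids = [ordered[(2 * j + 1) * len(ordered) // (2 * k)] for j in range(k)]
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda j: abs(v - centroids[j]))
            groups[nearest].append(v)
        # Keep the old centroid when a cluster ends up empty.
        updated = [mean(g) if g else c for g, c in zip(groups, centroids)]
        if updated == centroids:
            break
        centroids = updated
    return sorted(centroids)


def symbolize(values, centroids):
    """Map each value to the index of its nearest centroid."""
    return [min(range(len(centroids)), key=lambda j: abs(v - centroids[j]))
            for v in values]


def run_length_encode(symbols):
    """Collapse consecutive repeats into (symbol, run_length) pairs."""
    return [(s, len(list(run))) for s, run in groupby(symbols)]


def compress(series, size=3, stride=3, k=3):
    """Window the series, cluster the window means, and run-length encode."""
    features = [mean(w) for w in sliding_windows(series, size, stride)]
    centroids = kmeans_1d(features, k)
    return centroids, run_length_encode(symbolize(features, centroids))


# Made-up values for illustration only, not data from the paper.
series = [10, 11, 9, 10, 10, 11, 50, 52, 51, 49, 50, 51, 10, 9, 11, 90, 91, 89]
centroids, encoded = compress(series)
print(encoded)  # [(0, 2), (1, 2), (0, 1), (2, 1)]
```

Here 18 raw points reduce to four (symbol, run length) pairs plus three centroids. The centroids are kept so that each symbol can be mapped back to an approximate value, which is what would make the representation readable by a downstream model.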

Original language: English
Title of host publication: 2025 2nd International Generative AI and Computational Language Modelling Conference, GACLM 2025
Editors: Jaime Lloret, Yaser Jararweh
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 256-261
Number of pages: 6
ISBN (Electronic): 9798331594060
Publication status: Published - 2025
Event: 2nd International Generative AI and Computational Language Modelling Conference, GACLM 2025 - Valencia, Spain
Duration: 18 Aug 2025 to 21 Aug 2025

Publication series

Name: 2025 2nd International Generative AI and Computational Language Modelling Conference, GACLM 2025

Conference

Conference: 2nd International Generative AI and Computational Language Modelling Conference, GACLM 2025
Country/Territory: Spain
City: Valencia
Period: 18/08/25 to 21/08/25
