As part of the ACROSS project, researchers from Universidad Politécnica de Madrid (UPM) and Telefónica Innovación Digital (TID) have demonstrated that a network digital twin (NDT) can generate high-quality synthetic data to train machine learning models for heavy-hitter traffic classification. Heavy hitters are large network flows—such as video streams or DDoS floods—that consume a disproportionate share of bandwidth. Accurately distinguishing between benign and malicious traffic is critical to ensuring network performance and security.
Using Mouseworld, Telefónica’s network digital twin, the researchers emulated realistic network conditions and automatically labeled the resulting traffic to produce high-quality synthetic datasets. A machine learning model trained on these datasets achieves over 99% precision in identifying DDoS traffic while maintaining high performance on benign flows, and it handles up to 3.6 million inferences per second on standard CPUs.
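For readers less familiar with these two figures, the short Python sketch below shows how precision is defined and what a rate of 3.6 million inferences per second implies as an average time budget per inference. It is only an illustration: the function names and the confusion-matrix counts are hypothetical and are not taken from the article, and the time budget is an aggregate average, not a measured per-flow latency.

```python
def precision(true_positives: int, false_positives: int) -> float:
    """Fraction of flows flagged as DDoS that really are DDoS."""
    flagged = true_positives + false_positives
    if flagged == 0:
        raise ValueError("no flows were flagged")
    return true_positives / flagged


def per_inference_ns(inferences_per_second: float) -> float:
    """Average time available per inference, in nanoseconds."""
    return 1e9 / inferences_per_second


# Hypothetical counts, chosen only to illustrate the metric.
example_precision = precision(true_positives=995, false_positives=5)

# 3.6 million inferences per second is the throughput reported above.
budget_ns = per_inference_ns(3.6e6)

print(f"precision: {example_precision:.3f}")
print(f"average budget per inference: {budget_ns:.0f} ns")
```

At the reported throughput, the average budget works out to roughly 278 nanoseconds per inference.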
These findings are presented in the article “On the Applicability of Network Digital Twins in Generating Synthetic Data for Heavy Hitter Discrimination”, published in IEEE Communications Magazine, one of the most prestigious and influential journals in both academic and industrial circles. With a 5-year average Journal Impact Factor of 9.3 and ranked 12th out of 120 in the Telecommunications category (Web of Science), it is widely recognized as a leading platform for cutting-edge research and industrial innovation in the field.
The full article is available at: https://doi.org/10.1109/MCOM.003.2400648
To support reproducibility and encourage further research, the authors have also published a new synthetic dataset containing normal, benign heavy-hitter, and malicious traffic samples. The dataset is available at: https://doi.org/10.5281/zenodo.14134645
Congratulations to the authors on their contribution to advancing AI-driven network intelligence within the ACROSS framework!