Data Lake for roaming data monitoring

Challenge

The project set out to build a system for managing and monitoring roaming traffic (data, text messages, voice) for an international telecommunications operator. The business objective was to monitor the continuity of telecommunications services and to detect fraud and anomalies in wholesale traffic.

Solution

Design complexity:

  • cluster size: ~ 2 PB
  • over 100 servers
  • maintenance of several dozen services

Technologies used:

  • HDFS, Hive, Spark, Ranger
  • Elasticsearch (ELK)
  • Apache Kafka

Result

  • Implementation of the Elasticsearch cluster and the Apache Hadoop cluster
  • Design of data flows (metadata, flow control, etc.)
  • Implementation of Active Directory (AD) integration and Kerberos authentication
  • Ensuring continuity of system operation
  • Design and implementation of a disaster recovery (DR) site
  • Enabling supervision of the continuity of telecommunications services
  • Detection of fraud in wholesale traffic
  • Detection of anomalies in wholesale traffic
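
The case study does not describe how anomalies were detected, so the following is only an illustrative sketch of one common approach: flagging hourly traffic counts whose modified z-score (based on the median and the median absolute deviation) exceeds a cutoff. The function name `flag_anomalies`, the sample counts, and the threshold of 3.5 are assumptions made for the example and are not taken from the project.

```python
from statistics import median


def flag_anomalies(counts, threshold=3.5):
    """Return indices of values whose modified z-score exceeds `threshold`.

    Hypothetical helper for illustration; not the method used in the project.
    """
    med = median(counts)
    # Median absolute deviation: a spread estimate that outliers barely affect.
    mad = median(abs(c - med) for c in counts)
    if mad == 0:
        # No measurable spread: treat any value that differs from the median as anomalous.
        return [i for i, c in enumerate(counts) if c != med]
    # 0.6745 rescales the MAD so the score is comparable to a standard z-score.
    return [
        i for i, c in enumerate(counts)
        if 0.6745 * abs(c - med) / mad > threshold
    ]


# Made-up hourly record counts for one roaming partner; the last hour spikes.
hourly_counts = [100, 102, 98, 101, 99, 103, 97, 500]
print(flag_anomalies(hourly_counts))
```

A median-based score is used here instead of a mean-based one because a single large spike would otherwise inflate the mean and standard deviation and partly hide itself. In a deployment like the one described, a check of this kind would more plausibly run inside Spark or as an Elasticsearch query than in plain Python.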