What AI-RAN Means
AI-RAN refers to the application of machine learning and artificial intelligence directly within the Radio Access Network for real-time decision making. Unlike traditional RAN optimization -- which relies on rule-based algorithms, static thresholds, and periodic manual tuning -- AI-RAN uses trained models that adapt continuously to traffic patterns, propagation conditions, and user behavior.
The concept spans three layers:
- Model training: Offline or near-real-time training using historical RAN data (KPIs, traces, channel measurements). This happens at the SMO (Service Management and Orchestration) or dedicated ML platforms.
- Model inference: Real-time or near-real-time inference at the gNB-CU, gNB-DU, or O-RAN RIC (Near-RT and Non-RT). Inference latency determines which use cases are feasible.
- Model lifecycle management: Versioning, A/B testing, performance monitoring, and retraining. 3GPP and O-RAN Alliance define frameworks for this.
AI-RAN Use Case Matrix
| Use Case | ML Technique | Data Input | Expected Gain | Inference Location | 3GPP / O-RAN Ref |
|---|---|---|---|---|---|
| Beam prediction | Supervised learning (CNN, transformer) | UL SRS, position, velocity | 30--50% reduction in beam sweep overhead | gNB-DU (< 1 ms) | TR 38.843 (Rel-18), TR 38.901 |
| Traffic steering (LB) | Reinforcement learning (Q-learning, PPO) | Cell load, throughput, UE measurements | 15--25% throughput gain at cell edge | Near-RT RIC (10--100 ms) | O-RAN WG2 A1 policy |
| Energy saving (cell sleep) | Time-series prediction (LSTM, GBM) | Traffic volume, PRB utilization, time | 15--30% energy reduction | Non-RT RIC (> 1 s) | TS 28.310, O-RAN WG2 |
| Anomaly detection | Unsupervised (autoencoder, isolation forest) | KPI counters, alarms, PM data | 40--60% faster fault detection | SMO / Non-RT RIC | TS 28.552, O-RAN WG2 |
| Link adaptation | Online learning (contextual bandit) | CQI, BLER, SINR, velocity | 5--10% spectral efficiency gain | gNB-DU (< 1 ms) | TS 38.214 Sec 5.2.2 |
| Mobility optimization | Deep RL (A3C, SAC) | Handover events, RSRP/RSRQ traces | 30--50% fewer handover failures | Near-RT RIC (10--100 ms) | TS 38.331, O-RAN WG3 |
| QoS prediction | Regression (XGBoost, neural network) | Flow-level throughput, delay, jitter | Proactive QoS enforcement | gNB-CU (1--10 ms) | TS 23.503, TR 38.843 |
3GPP AI/ML Standardization Timeline
| Release | Period | Scope | Key Deliverable |
|---|---|---|---|
| Release 17 | 2020--2022 | Study phase: AI/ML for NR air interface | TR 38.843 -- studied beam prediction, CSI compression, positioning |
| Release 18 | 2022--2024 | Normative phase 1: CSI feedback, beam management, positioning | TS 38.214 amendments for ML-based CSI, functional framework for LCM |
| Release 19 | 2024--2026 | Normative phase 2: Two-sided models, data collection enhancements | Support for UE-side inference, model transfer, performance monitoring |
| Release 20+ | 2027+ | AI-native air interface studies for 6G | End-to-end learned transceivers, semantic communication |
Release 18 AI/ML Framework (TR 38.843)
The 3GPP AI/ML functional framework defines three model deployment scenarios:
- Case 1 -- Network-side model: Model runs at gNB. UE reports standard measurements (CSI, beam reports). No UE changes required.
- Case 2 -- UE-side model: Model runs at UE. Network provides training data or pre-trained model. Requires new UE capabilities.
- Case 3 -- Two-sided model: Split inference between UE (encoder) and network (decoder). Used for CSI compression where UE encodes CSI into a low-dimensional representation and network decodes it.
For beam prediction specifically, Release 18 specifies that the gNB can predict the best beam based on partial SSB measurements. Instead of the UE measuring all 8 (FR1) or 64 (FR2) SSB beams, the ML model predicts the optimal beam from a subset -- reducing measurement overhead by 50--75%.
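As a toy illustration of this subset-to-full-set prediction: the UE measures only a subset of beams and a trained model estimates the RSRP of every beam, from which the best index is selected. The weights below are random placeholders standing in for a trained regressor, and all names are illustrative -- this is a sketch of the idea, not the specified algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)

N_FULL, N_SUBSET = 64, 16                  # FR2 beam grid vs. measured subset
W = rng.normal(size=(N_FULL, N_SUBSET))    # placeholder "trained" weights
b = np.zeros(N_FULL)

def predict_best_beam(subset_rsrp_dbm: np.ndarray) -> int:
    """Predict the strongest beam index from partial SSB measurements."""
    est_rsrp = W @ subset_rsrp_dbm + b     # estimated RSRP for all 64 beams
    return int(np.argmax(est_rsrp))

measured = rng.uniform(-110, -70, size=N_SUBSET)  # RSRP of 16 measured beams
best = predict_best_beam(measured)
print(best)  # a beam index in [0, 63]
```

In a real system the linear map would be replaced by the trained CNN/transformer listed in the use case matrix; the measurement-overhead saving comes entirely from shrinking `N_SUBSET` relative to `N_FULL`.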
RL-Based Scheduling: Detailed Example
Problem Formulation
Traditional NR scheduling uses proportional fair (PF) or weighted fair queuing, computed per-slot from CQI and buffer status. An RL-based scheduler can learn policies that optimize long-term objectives (e.g., minimize 5th percentile latency while maintaining throughput).
State-Action-Reward Design
State vector (observed per TTI):

```
s_t = [PRB_utilization, num_active_UEs, avg_CQI, avg_buffer_size,
       per_UE_RSRP (top 10), per_UE_BLER (top 10),
       QoS_class_distribution, time_of_day_encoding]
```
Dimension: approximately 40--60 features.
Action space:

```
a_t = [PRB_allocation_per_UE (discretized into 5 levels),
       MCS_override (0 = auto, 1-5 = forced MCS range),
       power_boost_flag (0/1 per UE)]
```
For 10 active UEs, this is a multi-dimensional discrete action space. Practical implementations use action decomposition -- the RL agent outputs a priority vector, and a conventional scheduler maps priorities to PRB allocation.
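The action-decomposition step can be sketched as follows: the agent emits a per-UE priority vector, and a conventional scheduler maps it proportionally to an integer PRB split. Function and parameter names are illustrative (273 PRBs corresponds to a 100 MHz carrier at 30 kHz SCS).

```python
import numpy as np

def priorities_to_prbs(priorities: np.ndarray, total_prbs: int = 273) -> np.ndarray:
    """Map agent priorities to an integer PRB allocation summing to total_prbs."""
    weights = np.maximum(priorities, 0.0)
    if weights.sum() == 0:
        weights = np.ones_like(weights)            # fall back to equal split
    shares = weights / weights.sum()
    prbs = np.floor(shares * total_prbs).astype(int)
    prbs[np.argmax(shares)] += total_prbs - prbs.sum()  # rounding remainder to top UE
    return prbs

alloc = priorities_to_prbs(np.array([0.5, 0.3, 0.2]))
print(alloc, alloc.sum())  # [138  81  54] 273
```

Keeping the combinatorial PRB mapping outside the learned policy is what makes the agent's action space tractable: it only has to rank UEs, not enumerate allocations.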
Reward function:

```
r_t = w1 * log(cell_throughput) + w2 * (-max_latency_violation)
    + w3 * fairness_index + w4 * (-BLER_penalty)
```
Where w1 = 0.4, w2 = 0.3, w3 = 0.2, w4 = 0.1 (tunable per deployment).
Training Pipeline
- Data collection: 2 weeks of per-TTI scheduling logs from a live cell (~1.2 billion TTIs at 1 ms granularity, subsampled to 1 per 100 ms for training = ~12 million samples).
- Offline training: Train a PPO (Proximal Policy Optimization) agent in a digital twin simulator that replays the collected traces with stochastic channel variations.
- Validation: A/B test against the default PF scheduler on 10 cells for 1 week. Compare 5th percentile throughput, median latency, and BLER.
- Deployment: Export the trained model to ONNX format, deploy to the gNB-DU via O-RAN Near-RT RIC xApp.
Worked Example -- Reward Calculation
Consider a single TTI with 8 active UEs:
- Cell throughput: 450 Mbps
- Maximum latency violation: 2 ms over QoS target for 1 UE
- Jain's fairness index: 0.85
- Average BLER: 3% (penalty threshold: 2%)
```
r_t = 0.4 * log(450) + 0.3 * (-2) + 0.2 * 0.85 + 0.1 * (-(3-2)/2)
    = 0.4 * 6.11 + 0.3 * (-2) + 0.2 * 0.85 + 0.1 * (-0.5)
    = 2.444 - 0.6 + 0.17 - 0.05
    = 1.964
```
(Here log is the natural logarithm: ln 450 ≈ 6.11.)
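The same calculation can be written as a reward function directly usable in training. This is a minimal sketch; the BLER penalty form `(BLER - threshold) / threshold`, applied only above threshold, is an assumption inferred from the worked numbers.

```python
import math

W1, W2, W3, W4 = 0.4, 0.3, 0.2, 0.1  # weights stated in the formulation

def reward(throughput_mbps: float, latency_violation_ms: float,
           fairness: float, bler: float, bler_thr: float = 0.02) -> float:
    """Per-TTI scheduler reward: log-throughput, latency, fairness, BLER terms."""
    bler_penalty = max(0.0, (bler - bler_thr) / bler_thr)  # assumed penalty form
    return (W1 * math.log(throughput_mbps)     # natural log, as in the worked example
            + W2 * (-latency_violation_ms)
            + W3 * fairness
            + W4 * (-bler_penalty))

r = reward(450, 2, 0.85, 0.03)
print(round(r, 3))  # 1.964
```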
The agent learns to maximize cumulative discounted reward over episodes (typically 1,000 TTIs per episode).
Real Deployment Results
Samsung AI-Based Energy Saving
Samsung deployed an AI-based cell sleep solution across SK Telecom's n78 network in Seoul (2024). The system uses a gradient-boosted decision tree (LightGBM) to predict per-cell traffic load 15 minutes ahead. When predicted PRB utilization drops below 10%, the system activates symbol-level sleep (shutting down TX for unused OFDM symbols) or carrier-level sleep (deactivating an entire component carrier).
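The decision logic on top of the load forecast can be sketched as a simple threshold cascade. The 10% PRB threshold is from the description above; the carrier-sleep cutoff and all names are assumptions for illustration -- the actual Samsung/LightGBM pipeline is not public.

```python
from enum import Enum

class SleepMode(Enum):
    NONE = "none"
    SYMBOL = "symbol-level"      # mute TX on unused OFDM symbols
    CARRIER = "carrier-level"    # deactivate the component carrier

def select_sleep_mode(predicted_prb_util: float,
                      carrier_sleep_thr: float = 0.02) -> SleepMode:
    """Choose a sleep mode from the 15-minute-ahead PRB utilization forecast."""
    if predicted_prb_util >= 0.10:
        return SleepMode.NONE            # load too high to sleep
    if predicted_prb_util < carrier_sleep_thr:
        return SleepMode.CARRIER         # near-idle: drop the whole carrier
    return SleepMode.SYMBOL              # light load: symbol-level sleep

print(select_sleep_mode(0.25).value)  # none
print(select_sleep_mode(0.05).value)  # symbol-level
print(select_sleep_mode(0.01).value)  # carrier-level
```

The forecasting horizon matters here: predicting 15 minutes ahead gives the system enough margin to wake the cell before load returns, which is what keeps the latency impact below the reported 0.3%.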
Results across 8,000 cells over 6 months:
- Energy reduction: 22% during off-peak hours (midnight -- 6 AM)
- Annual energy reduction: 14.5% averaged across all hours
- User experience impact: < 0.3% increase in average latency
- Wake-up latency: 8 ms (within 5G QoS requirements)
- Estimated savings: KRW 18 billion/year (approximately USD 14 million)
The model retrains weekly using the previous 4 weeks of PM data. False positive rate (unnecessary wake-ups) decreased from 8% to 2.1% over 3 months as the model adapted to seasonal traffic patterns.
Ericsson Cognitive Network Optimization
Ericsson's Cognitive Software suite, deployed at STC (Saudi Arabia) and Swisscom, uses ML for automated mobility optimization. The system:
- Continuously ingests handover event logs and UE measurement reports.
- Identifies cells with high handover failure rates (> 1%) using anomaly detection.
- Recommends A3 offset, TTT, and CIO adjustments using a policy gradient RL agent.
- Applies changes autonomously (closed-loop) or with operator approval (open-loop).
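The detection step in this loop reduces to flagging cells above the 1% handover failure threshold. A minimal sketch (cell IDs and counts are invented for illustration):

```python
def flag_high_hof_cells(ho_stats: dict[str, tuple[int, int]],
                        threshold: float = 0.01) -> list[str]:
    """ho_stats maps cell_id -> (ho_attempts, ho_failures); return cells over threshold."""
    flagged = []
    for cell_id, (attempts, failures) in ho_stats.items():
        if attempts > 0 and failures / attempts > threshold:
            flagged.append(cell_id)
    return flagged

stats = {"n78-001": (5000, 30),   # 0.6%  -> ok
         "n78-002": (4200, 95),   # 2.26% -> flagged
         "n41-007": (3800, 12)}   # 0.32% -> ok
print(flag_high_hof_cells(stats))  # ['n78-002']
```

In the deployed system this thresholding is combined with anomaly detection so that cells with unusual failure patterns (not just high rates) also surface; only the flagged cells are handed to the RL agent for A3/TTT/CIO recommendations.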
STC reported:
- Handover success rate: 98.1% -> 99.6% (+1.5 pp)
- Ping-pong handover rate: 4.2% -> 1.8% (-57%)
- Mean time to optimize a new cell: 3 days -> 4 hours
- Coverage of optimized cells: 12,000 n78 + 8,000 n41 cells
Swisscom observed similar results with a 15% reduction in optimization team workload, redirecting engineers from repetitive tuning to strategic network planning.
Digital Twin for RAN Simulation
The digital twin workflow for AI-RAN follows five stages:
- Data ingestion: Collect 3D building models (LiDAR or OpenStreetMap), antenna configurations (height, tilt, azimuth, pattern), propagation measurements (drive test, MDT), and real-time KPIs from OSS.
- Model calibration: Calibrate ray-tracing propagation models (e.g., Volcano by SIRADEL, Ranplan, or Altair WinProp) against drive test data. Target: < 6 dB RMSE between predicted and measured SS-RSRP.
- Scenario simulation: Run what-if scenarios: tilt changes, new site additions, traffic growth, spectrum refarming. Each scenario generates synthetic KPIs (RSRP, SINR, throughput, handover success).
- ML training: Train RL agents in the digital twin environment. The twin provides the environment dynamics (state transitions) while the agent learns the optimal policy. Training in the twin avoids impacting the live network.
- Deployment and feedback: Deploy the trained model to the live network (via O-RAN RIC or vendor OSS). Live KPI data feeds back to the twin for continuous model calibration -- closing the loop.
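The calibration stage's acceptance check (stage 2, < 6 dB RMSE against drive-test SS-RSRP) can be sketched as follows; the sample values are synthetic stand-ins for predicted and measured RSRP.

```python
import numpy as np

def rsrp_rmse_db(predicted_dbm: np.ndarray, measured_dbm: np.ndarray) -> float:
    """RMSE between ray-tracing predictions and drive-test SS-RSRP, in dB."""
    return float(np.sqrt(np.mean((predicted_dbm - measured_dbm) ** 2)))

rng = np.random.default_rng(42)
measured = rng.uniform(-120, -80, size=500)              # drive-test samples (dBm)
predicted = measured + rng.normal(0, 4.0, size=500)      # model with ~4 dB error

rmse = rsrp_rmse_db(predicted, measured)
print(f"RMSE = {rmse:.1f} dB, calibrated: {rmse < 6.0}")
```

Computing RMSE in the dB domain (rather than on linear power) is the convention used by propagation-model calibration targets like the one above.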
The O-RAN Alliance published the Digital Twin Framework in 2024 (O-RAN.WG1.DTF-v01.00), defining interfaces between the digital twin and the Non-RT RIC / SMO. Nokia's Digital Twin for Network Planning and Ericsson's Network Digital Twin are commercial implementations currently deployed at multiple Tier-1 operators.
Performance Comparison: Rule-Based vs AI-RAN
| Metric | Rule-Based (PF scheduler) | AI-RAN (RL scheduler) | Improvement |
|---|---|---|---|
| Median cell throughput | 380 Mbps | 435 Mbps | +14.5% |
| 5th percentile UE throughput | 12 Mbps | 18 Mbps | +50% |
| 95th percentile latency | 15 ms | 9 ms | -40% |
| Handover failure rate | 1.2% | 0.5% | -58% |
| Energy consumption (per cell/day) | 8.4 kWh | 6.8 kWh | -19% |
| Optimization cycle time | 2--4 weeks (manual) | 4--8 hours (automated) | 60x faster |
These figures aggregate results from multiple published operator deployments and Samsung/Ericsson white papers (2023--2025).
Challenges and Limitations
- Inference latency: Beam prediction and link adaptation require < 1 ms inference. Current GPU/NPU inference for small models (< 1M parameters) achieves 0.1--0.5 ms. Larger models may require model distillation or quantization (INT8).
- Training data distribution: Models trained on data from one geographic area may not generalize. Transfer learning and federated learning (studied in 3GPP TR 38.843 Section 6) are active research areas.
- Explainability: Operators require understanding of why the AI made a decision, especially for regulatory compliance. SHAP (SHapley Additive exPlanations) values are used in some implementations to explain feature importance.
- Interference between AI decisions: Multiple xApps in the O-RAN RIC may make conflicting decisions (e.g., energy saving xApp reduces power while coverage xApp increases it). The O-RAN Conflict Mitigation framework (WG2) addresses this through A1 policy coordination.
- Model drift: Network conditions change due to new buildings, traffic pattern shifts, and seasonal variations. Continuous monitoring of model performance metrics (accuracy, reward) with automatic retraining triggers is essential.
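The model-drift point above can be made concrete with a rolling-window monitor: track a daily model quality metric (accuracy or average reward) and trigger retraining when its window average degrades past a tolerance. All names and thresholds here are illustrative assumptions.

```python
from collections import deque

class DriftMonitor:
    """Trigger retraining when a rolling average drops below baseline * (1 - tol)."""

    def __init__(self, baseline: float, window: int = 7, tolerance: float = 0.05):
        self.baseline = baseline
        self.tolerance = tolerance
        self.history = deque(maxlen=window)

    def observe(self, metric: float) -> bool:
        """Record a new observation; return True if retraining should trigger."""
        self.history.append(metric)
        if len(self.history) < self.history.maxlen:
            return False                     # not enough evidence yet
        avg = sum(self.history) / len(self.history)
        return avg < self.baseline * (1 - self.tolerance)

monitor = DriftMonitor(baseline=0.90)
readings = [0.91, 0.89, 0.86, 0.84, 0.82, 0.80, 0.78]  # gradual degradation
triggers = [monitor.observe(m) for m in readings]
print(triggers[-1])  # True: 7-day average fell > 5% below baseline
```

A windowed average rather than a single-sample check avoids retraining on transient dips (e.g., one unusual traffic day), which matters when retraining itself consumes SMO resources.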
Key Takeaway: AI-RAN transforms the RAN from a rule-based system to an adaptive, data-driven network. 3GPP Release 18 provides the normative foundation for ML-based beam prediction and CSI compression, while O-RAN enables deployment via Near-RT RIC xApps. Real deployments by Samsung (22% energy saving at SK Telecom) and Ericsson (99.6% handover success at STC) demonstrate measurable ROI. The key to successful AI-RAN is the digital twin -- it provides the safe training environment and continuous calibration loop that production ML models require.