What AI-RAN Means
AI-RAN refers to the application of machine learning and artificial intelligence directly within the Radio Access Network for real-time decision making. Unlike traditional RAN optimization -- which relies on rule-based algorithms, static thresholds, and periodic manual tuning -- AI-RAN uses trained models that adapt continuously to traffic patterns, propagation conditions, and user behavior.
The concept spans three layers:
- Model training: Offline or near-real-time training using historical RAN data (KPIs, traces, channel measurements). This happens at the SMO (Service Management and Orchestration) or dedicated ML platforms.
- Model inference: Real-time or near-real-time inference at the gNB-CU, gNB-DU, or O-RAN RIC (Near-RT and Non-RT). Inference latency determines which use cases are feasible.
- Model lifecycle management: Versioning, A/B testing, performance monitoring, and retraining. 3GPP and O-RAN Alliance define frameworks for this.
AI-RAN Use Case Matrix
| Use Case | ML Technique | Data Input | Expected Gain | Inference Location | 3GPP / O-RAN Ref |
|---|---|---|---|---|---|
| Beam prediction | Supervised learning (CNN, transformer) | UL SRS, position, velocity | 30--50% reduction in beam sweep overhead | gNB-DU (< 1 ms) | TR 38.843 (Rel-18), TR 38.901 |
| Traffic steering (LB) | Reinforcement learning (Q-learning, PPO) | Cell load, throughput, UE measurements | 15--25% throughput gain at cell edge | Near-RT RIC (10--100 ms) | O-RAN WG2 A1 policy |
| Energy saving (cell sleep) | Time-series prediction (LSTM, GBM) | Traffic volume, PRB utilization, time | 15--30% energy reduction | Non-RT RIC (> 1 s) | TS 28.310, O-RAN WG2 |
| Anomaly detection | Unsupervised (autoencoder, isolation forest) | KPI counters, alarms, PM data | 40--60% faster fault detection | SMO / Non-RT RIC | TS 28.552, O-RAN WG2 |
| Link adaptation | Online learning (contextual bandit) | CQI, BLER, SINR, velocity | 5--10% spectral efficiency gain | gNB-DU (< 1 ms) | TS 38.214 Sec 5.2.2 |
| Mobility optimization | Deep RL (A3C, SAC) | Handover events, RSRP/RSRQ traces | 30--50% fewer handover failures | Near-RT RIC (10--100 ms) | TS 38.331, O-RAN WG3 |
| QoS prediction | Regression (XGBoost, neural network) | Flow-level throughput, delay, jitter | Proactive QoS enforcement | gNB-CU (1--10 ms) | TS 23.503, TR 38.843 |
3GPP AI/ML Standardization Timeline
| Release | Period | Scope | Key Deliverable |
|---|---|---|---|
| Release 17 | 2020--2022 | Study phase: AI/ML for NR air interface | TR 38.843 -- studied beam prediction, CSI compression, positioning |
| Release 18 | 2022--2024 | Normative phase 1: CSI feedback, beam management, positioning | TS 38.214 amendments for ML-based CSI, functional framework for LCM |
| Release 19 | 2024--2026 | Normative phase 2: Two-sided models, data collection enhancements | Support for UE-side inference, model transfer, performance monitoring |
| Release 20+ | 2027+ | AI-native air interface studies for 6G | End-to-end learned transceivers, semantic communication |
Release 18 AI/ML Framework (TR 38.843)
The 3GPP AI/ML functional framework defines three model deployment scenarios:
- Case 1 -- Network-side model: Model runs at gNB. UE reports standard measurements (CSI, beam reports). No UE changes required.
- Case 2 -- UE-side model: Model runs at UE. Network provides training data or pre-trained model. Requires new UE capabilities.
- Case 3 -- Two-sided model: Split inference between UE (encoder) and network (decoder). Used for CSI compression where UE encodes CSI into a low-dimensional representation and network decodes it.
For beam prediction specifically, Release 18 specifies that the gNB can predict the best beam based on partial SSB measurements. Instead of the UE measuring all 8 (FR1) or 64 (FR2) SSB beams, the ML model predicts the optimal beam from a subset -- reducing measurement overhead by 50--75%.
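As a toy illustration of this subset-to-full-set prediction: the UE measures only a subset of beams and a trained model estimates the RSRP of every beam, from which the best index is selected. The weights below are random placeholders standing in for a trained regressor, and all names are illustrative -- this is a sketch of the idea, not the specified algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)

N_FULL, N_SUBSET = 64, 16                  # FR2 beam grid vs. measured subset
W = rng.normal(size=(N_FULL, N_SUBSET))    # placeholder "trained" weights
b = np.zeros(N_FULL)

def predict_best_beam(subset_rsrp_dbm: np.ndarray) -> int:
    """Predict the strongest beam index from partial SSB measurements."""
    est_rsrp = W @ subset_rsrp_dbm + b     # estimated RSRP for all 64 beams
    return int(np.argmax(est_rsrp))

measured = rng.uniform(-110, -70, size=N_SUBSET)  # RSRP of 16 measured beams
best = predict_best_beam(measured)
print(best)  # a beam index in [0, 63]
```

In a real system the linear map would be replaced by the trained CNN/transformer listed in the use case matrix; the measurement-overhead saving comes entirely from shrinking `N_SUBSET` relative to `N_FULL`.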
RL-Based Scheduling: Detailed Example
Problem Formulation
Traditional NR scheduling uses proportional fair (PF) or weighted fair queuing, computed per-slot from CQI and buffer status. An RL-based scheduler can learn policies that optimize long-term objectives (e.g., minimize 5th percentile latency while maintaining throughput).
State-Action-Reward Design
State vector (observed per TTI):

```
s_t = [PRB_utilization, num_active_UEs, avg_CQI, avg_buffer_size,
       per_UE_RSRP (top 10), per_UE_BLER (top 10),
       QoS_class_distribution, time_of_day_encoding]
```
Dimension: approximately 40--60 features.
Action space:

```
a_t = [PRB_allocation_per_UE (discretized into 5 levels),
       MCS_override (0 = auto, 1-5 = forced MCS range),
       power_boost_flag (0/1 per UE)]
```
For 10 active UEs, this is a multi-dimensional discrete action space. Practical implementations use action decomposition -- the RL agent outputs a priority vector, and a conventional scheduler maps priorities to PRB allocation.
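The action-decomposition step can be sketched as follows: the agent emits a per-UE priority vector, and a conventional scheduler maps it proportionally to an integer PRB split. Function and parameter names are illustrative (273 PRBs corresponds to a 100 MHz carrier at 30 kHz SCS).

```python
import numpy as np

def priorities_to_prbs(priorities: np.ndarray, total_prbs: int = 273) -> np.ndarray:
    """Map agent priorities to an integer PRB allocation summing to total_prbs."""
    weights = np.maximum(priorities, 0.0)
    if weights.sum() == 0:
        weights = np.ones_like(weights)            # fall back to equal split
    shares = weights / weights.sum()
    prbs = np.floor(shares * total_prbs).astype(int)
    prbs[np.argmax(shares)] += total_prbs - prbs.sum()  # rounding remainder to top UE
    return prbs

alloc = priorities_to_prbs(np.array([0.5, 0.3, 0.2]))
print(alloc, alloc.sum())  # [138  81  54] 273
```

Keeping the combinatorial PRB mapping outside the learned policy is what makes the agent's action space tractable: it only has to rank UEs, not enumerate allocations.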
Reward function:

```
r_t = w1 * log(cell_throughput) + w2 * (-max_latency_violation)
    + w3 * fairness_index + w4 * (-BLER_penalty)
```
Where w1 = 0.4, w2 = 0.3, w3 = 0.2, w4 = 0.1 (tunable per deployment).
Training Pipeline
- Data collection: 2 weeks of per-TTI scheduling logs from a live cell (~1.2 billion TTIs at 1 ms granularity, subsampled to 1 per 100 ms for training = ~12 million samples).
- Offline training: Train a PPO (Proximal Policy Optimization) agent in a digital twin simulator that replays the collected traces with stochastic channel variations.
- Validation: A/B test against the default PF scheduler on 10 cells for 1 week. Compare 5th percentile throughput, median latency, and BLER.
- Deployment: Export the trained model to ONNX format, deploy to the gNB-DU via O-RAN Near-RT RIC xApp.
Worked Example -- Reward Calculation
Consider a single TTI with 8 active UEs:
- Cell throughput: 450 Mbps
- Maximum latency violation: 2 ms over QoS target for 1 UE
- Jain's fairness index: 0.85
- Average BLER: 3% (penalty threshold: 2%)
```
r_t = 0.4 * log(450) + 0.3 * (-2) + 0.2 * 0.85 + 0.1 * (-(3-2)/2)
    = 0.4 * 6.11 + 0.3 * (-2) + 0.2 * 0.85 + 0.1 * (-0.5)
    = 2.444 - 0.6 + 0.17 - 0.05
    = 1.964
```
(Here log is the natural logarithm: ln 450 ≈ 6.11.)
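The same calculation can be written as a reward function directly usable in training. This is a minimal sketch; the BLER penalty form `(BLER - threshold) / threshold`, applied only above threshold, is an assumption inferred from the worked numbers.

```python
import math

W1, W2, W3, W4 = 0.4, 0.3, 0.2, 0.1  # weights stated in the formulation

def reward(throughput_mbps: float, latency_violation_ms: float,
           fairness: float, bler: float, bler_thr: float = 0.02) -> float:
    """Per-TTI scheduler reward: log-throughput, latency, fairness, BLER terms."""
    bler_penalty = max(0.0, (bler - bler_thr) / bler_thr)  # assumed penalty form
    return (W1 * math.log(throughput_mbps)     # natural log, as in the worked example
            + W2 * (-latency_violation_ms)
            + W3 * fairness
            + W4 * (-bler_penalty))

r = reward(450, 2, 0.85, 0.03)
print(round(r, 3))  # 1.964
```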
The agent learns to maximize cumulative discounted reward over episodes (typically 1,000 TTIs per episode).
Real Deployment Results
Samsung AI-Based Energy Saving
Samsung deployed an AI-based cell sleep solution across SK Telecom's n78 network in Seoul (2024). The system uses a gradient-boosted decision tree (LightGBM) to predict per-cell traffic load 15 minutes ahead. When predicted PRB utilization drops below 10%, the system activates symbol-level sleep (shutting down TX for unused OFDM symbols) or carrier-level sleep (deactivating an entire component carrier).
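The decision logic on top of the load forecast can be sketched as a simple threshold cascade. The 10% PRB threshold is from the description above; the carrier-sleep cutoff and all names are assumptions for illustration -- the actual Samsung/LightGBM pipeline is not public.

```python
from enum import Enum

class SleepMode(Enum):
    NONE = "none"
    SYMBOL = "symbol-level"      # mute TX on unused OFDM symbols
    CARRIER = "carrier-level"    # deactivate the component carrier

def select_sleep_mode(predicted_prb_util: float,
                      carrier_sleep_thr: float = 0.02) -> SleepMode:
    """Choose a sleep mode from the 15-minute-ahead PRB utilization forecast."""
    if predicted_prb_util >= 0.10:
        return SleepMode.NONE            # load too high to sleep
    if predicted_prb_util < carrier_sleep_thr:
        return SleepMode.CARRIER         # near-idle: drop the whole carrier
    return SleepMode.SYMBOL              # light load: symbol-level sleep

print(select_sleep_mode(0.25).value)  # none
print(select_sleep_mode(0.05).value)  # symbol-level
print(select_sleep_mode(0.01).value)  # carrier-level
```

The forecasting horizon matters here: predicting 15 minutes ahead gives the system enough margin to wake the cell before load returns, which is what keeps the latency impact below the reported 0.3%.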
Results across 8,000 cells over 6 months:
- Energy reduction: 22% during off-peak hours (midnight -- 6 AM)
- Annual energy reduction: 14.5% averaged across all hours
- User experience impact: < 0.3% increase in average latency
- Wake-up latency: 8 ms (within 5G QoS requirements)
- Estimated savings: KRW 18 billion/year (approximately USD 14 million)
The model retrains weekly using the previous 4 weeks of PM data. False positive rate (unnecessary wake-ups) decreased from 8% to 2.1% over 3 months as the model adapted to seasonal traffic patterns.
Ericsson Cognitive Network Optimization
Ericsson's Cognitive Software suite, deployed at STC (Saudi Arabia) and Swisscom, uses ML for automated mobility optimization. The system:
- Continuously ingests handover event logs and UE measurement reports.
- Identifies cells with high handover failure rates (> 1%) using anomaly detection.
- Recommends A3 offset, TTT, and CIO adjustments using a policy gradient RL agent.
- Applies changes autonomously (closed-loop) or with operator approval (open-loop).
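The detection step in this loop reduces to flagging cells above the 1% handover failure threshold. A minimal sketch (cell IDs and counts are invented for illustration):

```python
def flag_high_hof_cells(ho_stats: dict[str, tuple[int, int]],
                        threshold: float = 0.01) -> list[str]:
    """ho_stats maps cell_id -> (ho_attempts, ho_failures); return cells over threshold."""
    flagged = []
    for cell_id, (attempts, failures) in ho_stats.items():
        if attempts > 0 and failures / attempts > threshold:
            flagged.append(cell_id)
    return flagged

stats = {"n78-001": (5000, 30),   # 0.6%  -> ok
         "n78-002": (4200, 95),   # 2.26% -> flagged
         "n41-007": (3800, 12)}   # 0.32% -> ok
print(flag_high_hof_cells(stats))  # ['n78-002']
```

In the deployed system this thresholding is combined with anomaly detection so that cells with unusual failure patterns (not just high rates) also surface; only the flagged cells are handed to the RL agent for A3/TTT/CIO recommendations.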
STC reported:
- Handover success rate: 98.1% -> 99.6% (+1.5 pp)
- Ping-pong handover rate: 4.2% -> 1.8% (-57%)
- Mean time to optimize a new cell: 3 days -> 4 hours
- Coverage of optimized cells: 12,000 n78 + 8,000 n41 cells
Swisscom observed similar results with a 15% reduction in optimization team workload, redirecting engineers from repetitive tuning to strategic network planning.
Digital Twin for RAN Simulation
The digital twin workflow for AI-RAN follows five stages:
- Data ingestion: Collect 3D building models (LiDAR or OpenStreetMap), antenna configurations (height, tilt, azimuth, pattern), propagation measurements (drive test, MDT), and real-time KPIs from OSS.
- Model calibration: Calibrate ray-tracing propagation models (e.g., Volcano by SIRADEL, Ranplan, or Altair WinProp) against drive test data. Target: < 6 dB RMSE between predicted and measured SS-RSRP.
- Scenario simulation: Run what-if scenarios: tilt changes, new site additions, traffic growth, spectrum refarming. Each scenario generates synthetic KPIs (RSRP, SINR, throughput, handover success).
- ML training: Train RL agents in the digital twin environment. The twin provides the environment dynamics (state transitions) while the agent learns the optimal policy. Training in the twin avoids impacting the live network.
- Deployment and feedback: Deploy the trained model to the live network (via O-RAN RIC or vendor OSS). Live KPI data feeds back to the twin for continuous model calibration -- closing the loop.
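The calibration stage's acceptance check (stage 2, < 6 dB RMSE against drive-test SS-RSRP) can be sketched as follows; the sample values are synthetic stand-ins for predicted and measured RSRP.

```python
import numpy as np

def rsrp_rmse_db(predicted_dbm: np.ndarray, measured_dbm: np.ndarray) -> float:
    """RMSE between ray-tracing predictions and drive-test SS-RSRP, in dB."""
    return float(np.sqrt(np.mean((predicted_dbm - measured_dbm) ** 2)))

rng = np.random.default_rng(42)
measured = rng.uniform(-120, -80, size=500)              # drive-test samples (dBm)
predicted = measured + rng.normal(0, 4.0, size=500)      # model with ~4 dB error

rmse = rsrp_rmse_db(predicted, measured)
print(f"RMSE = {rmse:.1f} dB, calibrated: {rmse < 6.0}")
```

Computing RMSE in the dB domain (rather than on linear power) is the convention used by propagation-model calibration targets like the one above.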
The O-RAN Alliance published the Digital Twin Framework in 2024 (O-RAN.WG1.DTF-v01.00), defining interfaces between the digital twin and the Non-RT RIC / SMO. Nokia's Digital Twin for Network Planning and Ericsson's Network Digital Twin are commercial implementations currently deployed at multiple Tier-1 operators.
Performance Comparison: Rule-Based vs AI-RAN
| Metric | Rule-Based (PF scheduler) | AI-RAN (RL scheduler) | Improvement |
|---|---|---|---|
| Median cell throughput | 380 Mbps | 435 Mbps | +14.5% |
| 5th percentile UE throughput | 12 Mbps | 18 Mbps | +50% |
| 95th percentile latency | 15 ms | 9 ms | -40% |
| Handover failure rate | 1.2% | 0.5% | -58% |
| Energy consumption (per cell/day) | 8.4 kWh | 6.8 kWh | -19% |
| Optimization cycle time | 2--4 weeks (manual) | 4--8 hours (automated) | 60x faster |
These figures aggregate results from multiple published operator deployments and Samsung/Ericsson white papers (2023--2025).
Challenges and Limitations
- Inference latency: Beam prediction and link adaptation require < 1 ms inference. Current GPU/NPU inference for small models (< 1M parameters) achieves 0.1--0.5 ms. Larger models may require model distillation or quantization (INT8).
- Training data distribution: Models trained on data from one geographic area may not generalize. Transfer learning and federated learning (studied in 3GPP TR 38.843 Section 6) are active research areas.
- Explainability: Operators require understanding of why the AI made a decision, especially for regulatory compliance. SHAP (SHapley Additive exPlanations) values are used in some implementations to explain feature importance.
- Interference between AI decisions: Multiple xApps in the O-RAN RIC may make conflicting decisions (e.g., energy saving xApp reduces power while coverage xApp increases it). The O-RAN Conflict Mitigation framework (WG2) addresses this through A1 policy coordination.
- Model drift: Network conditions change due to new buildings, traffic pattern shifts, and seasonal variations. Continuous monitoring of model performance metrics (accuracy, reward) with automatic retraining triggers is essential.
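The model-drift point above can be made concrete with a rolling-window monitor: track a daily model quality metric (accuracy or average reward) and trigger retraining when its window average degrades past a tolerance. All names and thresholds here are illustrative assumptions.

```python
from collections import deque

class DriftMonitor:
    """Trigger retraining when a rolling average drops below baseline * (1 - tol)."""

    def __init__(self, baseline: float, window: int = 7, tolerance: float = 0.05):
        self.baseline = baseline
        self.tolerance = tolerance
        self.history = deque(maxlen=window)

    def observe(self, metric: float) -> bool:
        """Record a new observation; return True if retraining should trigger."""
        self.history.append(metric)
        if len(self.history) < self.history.maxlen:
            return False                     # not enough evidence yet
        avg = sum(self.history) / len(self.history)
        return avg < self.baseline * (1 - self.tolerance)

monitor = DriftMonitor(baseline=0.90)
readings = [0.91, 0.89, 0.86, 0.84, 0.82, 0.80, 0.78]  # gradual degradation
triggers = [monitor.observe(m) for m in readings]
print(triggers[-1])  # True: 7-day average fell > 5% below baseline
```

A windowed average rather than a single-sample check avoids retraining on transient dips (e.g., one unusual traffic day), which matters when retraining itself consumes SMO resources.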
Key Takeaway: AI-RAN transforms the RAN from a rule-based system to an adaptive, data-driven network. 3GPP Release 18 provides the normative foundation for ML-based beam prediction and CSI compression, while O-RAN enables deployment via Near-RT RIC xApps. Real deployments by Samsung (22% energy saving at SK Telecom) and Ericsson (99.6% handover success at STC) demonstrate measurable ROI. The key to successful AI-RAN is the digital twin -- it provides the safe training environment and continuous calibration loop that production ML models require.