What AI-RAN Means

AI-RAN refers to the application of machine learning and artificial intelligence directly within the Radio Access Network for real-time decision making. Unlike traditional RAN optimization -- which relies on rule-based algorithms, static thresholds, and periodic manual tuning -- AI-RAN uses trained models that adapt to traffic patterns, propagation conditions, and user behavior in real time.

The concept spans three layers:

  1. Model training: Offline or near-real-time training using historical RAN data (KPIs, traces, channel measurements). This happens at the SMO (Service Management and Orchestration) or dedicated ML platforms.
  2. Model inference: Real-time or near-real-time inference at the gNB-CU, gNB-DU, or O-RAN RIC (Near-RT and Non-RT). Inference latency determines which use cases are feasible.
  3. Model lifecycle management: Versioning, A/B testing, performance monitoring, and retraining. 3GPP and O-RAN Alliance define frameworks for this.

AI-RAN Use Case Matrix

| Use Case | ML Technique | Data Input | Expected Gain | Inference Location | 3GPP / O-RAN Ref |
|---|---|---|---|---|---|
| Beam prediction | Supervised learning (CNN, transformer) | UL SRS, position, velocity | 30--50% reduction in beam sweep overhead | gNB-DU (< 1 ms) | TS 38.843 (Rel-18), TR 38.901 |
| Traffic steering (LB) | Reinforcement learning (Q-learning, PPO) | Cell load, throughput, UE measurements | 15--25% throughput gain at cell edge | Near-RT RIC (10--100 ms) | O-RAN WG2 A1 policy |
| Energy saving (cell sleep) | Time-series prediction (LSTM, GBM) | Traffic volume, PRB utilization, time | 15--30% energy reduction | Non-RT RIC (> 1 s) | TS 28.310, O-RAN WG2 |
| Anomaly detection | Unsupervised (autoencoder, isolation forest) | KPI counters, alarms, PM data | 40--60% faster fault detection | SMO / Non-RT RIC | TS 28.552, O-RAN WG2 |
| Link adaptation | Online learning (contextual bandit) | CQI, BLER, SINR, velocity | 5--10% spectral efficiency gain | gNB-DU (< 1 ms) | TS 38.214 Sec 5.2.2 |
| Mobility optimization | Deep RL (A3C, SAC) | Handover events, RSRP/RSRQ traces | 30--50% fewer handover failures | Near-RT RIC (10--100 ms) | TS 38.331, O-RAN WG3 |
| QoS prediction | Regression (XGBoost, neural network) | Flow-level throughput, delay, jitter | Proactive QoS enforcement | gNB-CU (1--10 ms) | TS 23.503, TS 38.843 |

3GPP AI/ML Standardization Timeline

| Release | Period | Scope | Key Deliverable |
|---|---|---|---|
| Release 17 | 2020--2022 | Study phase: AI/ML for NR air interface | TR 38.843 -- studied beam prediction, CSI compression, positioning |
| Release 18 | 2022--2024 | Normative phase 1: CSI feedback, beam management, positioning | TS 38.214 amendments for ML-based CSI, functional framework for LCM |
| Release 19 | 2024--2026 | Normative phase 2: Two-sided models, data collection enhancements | Support for UE-side inference, model transfer, performance monitoring |
| Release 20+ | 2027+ | AI-native air interface studies for 6G | End-to-end learned transceivers, semantic communication |

Release 18 AI/ML Framework (TS 38.843)

The 3GPP AI/ML functional framework defines three model deployment scenarios:

  • Case 1 -- Network-side model: Model runs at gNB. UE reports standard measurements (CSI, beam reports). No UE changes required.
  • Case 2 -- UE-side model: Model runs at UE. Network provides training data or pre-trained model. Requires new UE capabilities.
  • Case 3 -- Two-sided model: Split inference between UE (encoder) and network (decoder). Used for CSI compression where UE encodes CSI into a low-dimensional representation and network decodes it.

For beam prediction specifically, Release 18 specifies that the gNB can predict the best beam based on partial SSB measurements. Instead of the UE measuring all 8 (FR1) or 64 (FR2) SSB beams, the ML model predicts the optimal beam from a subset -- reducing measurement overhead by 50--75%.
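
The subset-to-best-beam mapping can be illustrated with a toy stand-in for the trained model (this is not the normative TS 38.843 procedure: a nearest-neighbour lookup over stored full-sweep "fingerprints" plays the role of the predictor here, and every size and value is made up for illustration):

```python
import math
import random

random.seed(0)
N_FULL, N_SUB = 64, 16                          # FR2 full sweep vs. measured subset
subset = list(range(0, N_FULL, N_FULL // N_SUB))  # the 16 beams the UE measures

# Stand-in training data: full 64-beam RSRP sweeps (dBm) from past drives.
history = [[random.gauss(-90, 6) for _ in range(N_FULL)] for _ in range(500)]

def predict_best_beam(sub_rsrp):
    """Match the 16 measured values against stored sweeps and return the
    argmax beam of the closest historical sweep."""
    def dist(sweep):
        return math.dist([sweep[i] for i in subset], sub_rsrp)
    best_sweep = min(history, key=dist)
    return max(range(N_FULL), key=lambda b: best_sweep[b])

overhead_reduction = 1 - N_SUB / N_FULL          # 0.75 -> 75% fewer measurements
```

A production model would replace the lookup with a trained network, but the interface is the same: partial measurements in, full-sweep best-beam index out.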

RL-Based Scheduling: Detailed Example

Problem Formulation

Traditional NR scheduling uses proportional fair (PF) or weighted fair queuing, computed per-slot from CQI and buffer status. An RL-based scheduler can learn policies that optimize long-term objectives (e.g., minimize 95th percentile latency while maintaining throughput).

State-Action-Reward Design

State vector (observed per TTI):

```
s_t = [PRB_utilization, num_active_UEs, avg_CQI, avg_buffer_size,
       per_UE_RSRP (top 10), per_UE_BLER (top 10),
       QoS_class_distribution, time_of_day_encoding]
```

Dimension: approximately 40--60 features.

Action space:

```
a_t = [PRB_allocation_per_UE (discretized into 5 levels),
       MCS_override (0 = auto, 1-5 = forced MCS range),
       power_boost_flag (0/1 per UE)]
```

For 10 active UEs, this is a multi-dimensional discrete action space. Practical implementations use action decomposition -- the RL agent outputs a priority vector, and a conventional scheduler maps priorities to PRB allocation.
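
In code, the decomposition step looks roughly like this (an assumed interface, not a specific vendor API: the agent emits one priority score per UE, and a conventional scheduler turns the scores into an integer PRB split):

```python
def priorities_to_prbs(priorities, total_prbs):
    """Map RL-agent priority scores to a proportional PRB allocation."""
    weights = [max(p, 0.0) for p in priorities]       # clamp negative scores
    s = sum(weights) or 1.0
    alloc = [int(total_prbs * w / s) for w in weights]
    # Hand leftover PRBs (lost to flooring) to the highest priorities.
    leftover = total_prbs - sum(alloc)
    for i in sorted(range(len(alloc)), key=lambda i: -weights[i])[:leftover]:
        alloc[i] += 1
    return alloc

priorities_to_prbs([0.5, 0.3, 0.2], total_prbs=100)   # -> [50, 30, 20]
```

This keeps the RL action space small (one scalar per UE) while guaranteeing a valid, exhaustive PRB assignment.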

Reward function:

```
r_t = w1 * log(cell_throughput) + w2 * (-max_latency_violation)
    + w3 * fairness_index + w4 * (-BLER_penalty)
```

Where w1 = 0.4, w2 = 0.3, w3 = 0.2, w4 = 0.1 (tunable per deployment).

Training Pipeline

  1. Data collection: 2 weeks of per-TTI scheduling logs from a live cell (~1.2 billion TTIs at 1 ms granularity -- subsampled to 1 per 100 ms for training = ~12 million samples).
  2. Offline RL: Train a PPO (Proximal Policy Optimization) agent in a digital twin simulator that replays the collected traces with stochastic channel variations.
  3. Validation: A/B test against the default PF scheduler on 10 cells for 1 week. Compare 5th percentile throughput, median latency, and BLER.
  4. Deployment: Export the trained model to ONNX format, deploy to the gNB-DU via O-RAN Near-RT RIC xApp.
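
The data volumes in step 1 follow directly from the slot timing; a quick check (pure arithmetic, no RAN data involved):

```python
# Two weeks of logs at one TTI per millisecond, keeping one
# training sample every 100 ms.
MS_PER_DAY = 24 * 60 * 60 * 1000
total_ttis = 14 * MS_PER_DAY        # ~1.2 billion TTIs
train_samples = total_ttis // 100   # ~12 million samples
```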

Worked Example -- Reward Calculation

Consider a single TTI with 8 active UEs:

  • Cell throughput: 450 Mbps
  • Maximum latency violation: 2 ms over QoS target for 1 UE
  • Jain's fairness index: 0.85
  • Average BLER: 3% (penalty threshold: 2%)
```
r_t = 0.4 * log(450) + 0.3 * (-2) + 0.2 * 0.85 + 0.1 * (-(3-2)/2)
    = 0.4 * 6.11 + 0.3 * (-2) + 0.2 * 0.85 + 0.1 * (-0.5)
    = 2.444 - 0.6 + 0.17 - 0.05
    = 1.964
```

The agent learns to maximize cumulative discounted reward over episodes (typically 1,000 TTIs per episode).
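
The worked numbers can be reproduced in a few lines (a minimal check of the arithmetic above; the natural log is assumed, which matches log(450) ~ 6.11):

```python
import math

# Weights and penalty form taken from the reward definition in this
# section; the BLER penalty is (BLER - target)/target, clipped at 0.
w1, w2, w3, w4 = 0.4, 0.3, 0.2, 0.1

def reward(throughput_mbps, latency_violation_ms, fairness, bler_pct,
           bler_target_pct=2.0):
    bler_penalty = max(bler_pct - bler_target_pct, 0.0) / bler_target_pct
    return (w1 * math.log(throughput_mbps)
            + w2 * (-latency_violation_ms)
            + w3 * fairness
            + w4 * (-bler_penalty))

r = reward(450, 2, 0.85, 3)   # ~1.964, matching the worked example
```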

Real Deployment Results

Samsung AI-Based Energy Saving

Samsung deployed an AI-based cell sleep solution across SK Telecom's n78 network in Seoul (2024). The system uses a gradient-boosted decision tree (LightGBM) to predict per-cell traffic load 15 minutes ahead. When predicted PRB utilization drops below 10%, the system activates symbol-level sleep (shutting down TX for unused OFDM symbols) or carrier-level sleep (deactivating an entire component carrier).
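
The decision rule described above reduces to a small amount of logic once the load prediction is available (a sketch using the 10% threshold from the text; the LightGBM predictor itself is out of scope here, and the carrier-vs-symbol choice shown is an illustrative assumption, not Samsung's documented logic):

```python
SLEEP_THRESHOLD = 0.10  # predicted PRB utilization below 10% -> sleep

def sleep_action(predicted_prb_util, has_secondary_carrier):
    """Pick a sleep mode from a 15-minute-ahead load prediction."""
    if predicted_prb_util >= SLEEP_THRESHOLD:
        return "active"
    # Assumption for illustration: carrier-level sleep is only an
    # option when a secondary component carrier can be deactivated.
    return "carrier_sleep" if has_secondary_carrier else "symbol_sleep"
```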

Results across 8,000 cells over 6 months:

  • Energy reduction: 22% during off-peak hours (midnight -- 6 AM)
  • Annual energy reduction: 14.5% averaged across all hours
  • User experience impact: < 0.3% increase in average latency
  • Wake-up latency: 8 ms (within 5G QoS requirements)
  • Estimated savings: KRW 18 billion/year (approximately USD 14 million)

The model retrains weekly using the previous 4 weeks of PM data. False positive rate (unnecessary wake-ups) decreased from 8% to 2.1% over 3 months as the model adapted to seasonal traffic patterns.

Ericsson Cognitive Network Optimization

Ericsson's Cognitive Software suite, deployed at STC (Saudi Arabia) and Swisscom, uses ML for automated mobility optimization. The system:

  1. Continuously ingests handover event logs and UE measurement reports.
  2. Identifies cells with high handover failure rates (> 1%) using anomaly detection.
  3. Recommends A3 offset, TTT, and CIO adjustments using a policy gradient RL agent.
  4. Applies changes autonomously (closed-loop) or with operator approval (open-loop).
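
Step 2's screening can be expressed as a simple threshold pass (a stand-in for the actual anomaly-detection model, which is not publicly documented; cell names here are made up):

```python
def flag_cells(hof_rates, threshold=0.01):
    """Return cells whose handover failure rate exceeds 1%."""
    return [cell for cell, rate in hof_rates.items() if rate > threshold]

flag_cells({"cell_a": 0.004, "cell_b": 0.023, "cell_c": 0.012})
# -> ["cell_b", "cell_c"]
```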

STC reported:

  • Handover success rate: 98.1% -> 99.6% (+1.5 pp)
  • Ping-pong handover rate: 4.2% -> 1.8% (-57%)
  • Mean time to optimize a new cell: 3 days -> 4 hours
  • Coverage of optimized cells: 12,000 n78 + 8,000 n41 cells

Swisscom observed similar results with a 15% reduction in optimization team workload, redirecting engineers from repetitive tuning to strategic network planning.

Digital Twin for RAN Simulation

The digital twin workflow for AI-RAN follows five stages:

  1. Data ingestion: Collect 3D building models (LiDAR or OpenStreetMap), antenna configurations (height, tilt, azimuth, pattern), propagation measurements (drive test, MDT), and real-time KPIs from OSS.
  2. Model calibration: Calibrate ray-tracing propagation models (e.g., Volcano by SIRADEL, Ranplan, or Altair WinProp) against drive test data. Target: < 6 dB RMSE between predicted and measured SS-RSRP.
  3. Scenario simulation: Run what-if scenarios: tilt changes, new site additions, traffic growth, spectrum refarming. Each scenario generates synthetic KPIs (RSRP, SINR, throughput, handover success).
  4. ML training: Train RL agents in the digital twin environment. The twin provides the environment dynamics (state transitions) while the agent learns the optimal policy. Training in the twin avoids impacting the live network.
  5. Deployment and feedback: Deploy the trained model to the live network (via O-RAN RIC or vendor OSS). Live KPI data feeds back to the twin for continuous model calibration -- closing the loop.
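
The calibration gate in the model-calibration stage amounts to an RMSE comparison against drive-test data (a sketch using the 6 dB target quoted above; the input lists are illustrative):

```python
import math

def rsrp_rmse(predicted_dbm, measured_dbm):
    """Root-mean-square error between predicted and measured SS-RSRP."""
    errs = [(p - m) ** 2 for p, m in zip(predicted_dbm, measured_dbm)]
    return math.sqrt(sum(errs) / len(errs))

def calibrated(predicted_dbm, measured_dbm, target_db=6.0):
    """True if the ray-tracing model meets the < 6 dB RMSE target."""
    return rsrp_rmse(predicted_dbm, measured_dbm) < target_db
```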

The O-RAN Alliance published the Digital Twin Framework in 2024 (O-RAN.WG1.DTF-v01.00), defining interfaces between the digital twin and the Non-RT RIC / SMO. Nokia's Digital Twin for Network Planning and Ericsson's Network Digital Twin are commercial implementations currently deployed at multiple Tier-1 operators.

Performance Comparison: Rule-Based vs AI-RAN

| Metric | Rule-Based (PF scheduler) | AI-RAN (RL scheduler) | Improvement |
|---|---|---|---|
| Median cell throughput | 380 Mbps | 435 Mbps | +14.5% |
| 5th percentile UE throughput | 12 Mbps | 18 Mbps | +50% |
| 95th percentile latency | 15 ms | 9 ms | -40% |
| Handover failure rate | 1.2% | 0.5% | -58% |
| Energy consumption (per cell/day) | 8.4 kWh | 6.8 kWh | -19% |
| Optimization cycle time | 2--4 weeks (manual) | 4--8 hours (automated) | 60x faster |

These figures aggregate results from multiple published operator deployments and Samsung/Ericsson white papers (2023--2025).

Challenges and Limitations

  1. Inference latency: Beam prediction and link adaptation require < 1 ms inference. Current GPU/NPU inference for small models (< 1M parameters) achieves 0.1--0.5 ms. Larger models may require model distillation or quantization (INT8).
  2. Training data distribution: Models trained on data from one geographic area may not generalize. Transfer learning and federated learning (studied in 3GPP TR 38.843 Section 6) are active research areas.
  3. Explainability: Operators require understanding of why the AI made a decision, especially for regulatory compliance. SHAP (SHapley Additive exPlanations) values are used in some implementations to explain feature importance.
  4. Interference between AI decisions: Multiple xApps in the O-RAN RIC may make conflicting decisions (e.g., energy saving xApp reduces power while coverage xApp increases it). The O-RAN Conflict Mitigation framework (WG2) addresses this through A1 policy coordination.
  5. Model drift: Network conditions change due to new buildings, traffic pattern shifts, and seasonal variations. Continuous monitoring of model performance metrics (accuracy, reward) with automatic retraining triggers is essential.
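
The model-drift point implies a concrete monitoring loop: track a rolling mean of the live reward (or accuracy) and trigger retraining when it sags below the validation baseline. A minimal sketch (the window size and 10% drop threshold are illustrative choices, not from any spec):

```python
from collections import deque

class DriftMonitor:
    def __init__(self, baseline_reward, window=1000, max_drop=0.10):
        self.baseline = baseline_reward
        self.window = deque(maxlen=window)   # rolling reward window
        self.max_drop = max_drop

    def observe(self, reward):
        """Record one reward sample; return True if retraining is due."""
        self.window.append(reward)
        avg = sum(self.window) / len(self.window)
        return avg < self.baseline * (1 - self.max_drop)
```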

Key Takeaway: AI-RAN transforms the RAN from a rule-based system to an adaptive, data-driven network. 3GPP Release 18 provides the normative foundation for ML-based beam prediction and CSI compression, while O-RAN enables deployment via Near-RT RIC xApps. Real deployments by Samsung (22% energy saving at SK Telecom) and Ericsson (99.6% handover success at STC) demonstrate measurable ROI. The key to successful AI-RAN is the digital twin -- it provides the safe training environment and continuous calibration loop that production ML models require.