From VNF to CNF: Why Cloud-Native Matters
The 5G core was designed from the ground up as a Service-Based Architecture (SBA), as defined in TS 23.501 Section 4.2. This microservices design maps naturally to cloud-native principles: each Network Function (NF) runs as an independent, stateless, horizontally scalable service. The shift from monolithic VNFs (Virtual Network Functions) to CNFs (Cloud-Native Network Functions) is the most significant infrastructure transformation in telecom history.
A VNF is essentially a physical appliance ported to a virtual machine. It retains monolithic design, vertical scaling patterns, and vendor-specific lifecycle management. A CNF decomposes the same functionality into microservices packaged as containers, orchestrated by Kubernetes, and managed through CI/CD pipelines.
VNF vs CNF Comparison
| Dimension | VNF | CNF |
|---|---|---|
| Packaging | VM image (QCOW2, VMDK) | Container image (OCI) |
| Orchestration | VIM (OpenStack, VMware) | Kubernetes (K8s) |
| Scaling unit | Entire VM | Individual microservice pod |
| Scale-out time | 5--15 minutes | 5--30 seconds |
| State management | Stateful, local storage | Stateless, external state store (Redis, etcd) |
| Resource efficiency | 30--40% overhead (hypervisor + guest OS) | 5--10% overhead (shared kernel) |
| Update mechanism | VM snapshot, rolling upgrade | Rolling update, canary, blue-green |
| Failure recovery | VM restart (2--5 min) | Pod restart (1--5 sec) |
| Networking | SR-IOV, DPDK, OVS | CNI plugins (Multus, Calico, Cilium) |
| Service discovery | Static config, DNS | K8s Service, service mesh (Istio/Linkerd) |
| Lifecycle management | VNFM (vendor-specific) | Helm charts, Operators, GitOps |
| Observability | Proprietary NMS | Prometheus, Grafana, Jaeger, OpenTelemetry |
| Multi-tenancy | VM-level isolation | Namespace + NetworkPolicy isolation |
The resource efficiency gain alone is compelling: a CNF-based 5G core requires roughly 40--60% fewer compute cores than the equivalent VNF deployment for the same subscriber capacity.
Kubernetes Components for Telco
Standard Kubernetes requires several enhancements for telco-grade workloads. The following table maps K8s components to their telco-specific roles.
| K8s Component | Telco Role | Configuration | Notes |
|---|---|---|---|
| kubelet | Node agent managing NF pods | CPU pinning, NUMA-aware topology manager | Critical for UPF performance |
| kube-scheduler | NF placement and anti-affinity | Custom scheduling policies for NF redundancy | Spread AMF replicas across failure domains |
| Multus CNI | Multiple network interfaces per pod | N2/N3/N4/N6 interface separation | Required for 3GPP interface isolation |
| SR-IOV Device Plugin | Hardware-accelerated dataplane | NIC VF allocation to UPF pods | Enables line-rate UPF forwarding |
| Topology Manager | NUMA-aware resource allocation | single-numa-node policy for UPF | Prevents cross-NUMA memory access latency |
| Node Feature Discovery | Hardware capability labeling | GPU, FPGA, NIC feature labels | UPF pods scheduled to SR-IOV capable nodes |
| PersistentVolume | Session state backup | Ceph RBD or local NVMe | Used for UDR and UDSF |
| Horizontal Pod Autoscaler | NF auto-scaling | CPU and custom metrics (sessions, TPS) | AMF scales on registration TPS |
| cert-manager | mTLS certificate lifecycle | Automated rotation for SBI TLS | Per TS 33.501 Section 13.1 |
| CoreDNS | NF service discovery | SBA NF registration and discovery | Supplements NRF-based discovery |
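To make the Multus row concrete, a NetworkAttachmentDefinition can expose a dedicated N2 interface to AMF pods. This is an illustrative sketch, not a vendor configuration: the names (`n2-net`, `eth1`), the `5gc` namespace, and the address range are assumptions, and high-throughput interfaces such as N3 would typically use SR-IOV rather than macvlan.

```yaml
# Illustrative Multus NetworkAttachmentDefinition for a dedicated N2 network.
# Host interface (eth1), namespace, and IP range are assumptions.
apiVersion: k8s.cni.cncf.io/v1
kind: NetworkAttachmentDefinition
metadata:
  name: n2-net
  namespace: 5gc
spec:
  config: |
    {
      "cniVersion": "0.3.1",
      "type": "macvlan",
      "master": "eth1",
      "mode": "bridge",
      "ipam": {
        "type": "whereabouts",
        "range": "10.20.2.0/24"
      }
    }
```

A pod requests this network via the annotation `k8s.v1.cni.cncf.io/networks: n2-net`; Multus then attaches it as a secondary interface (`net1`) alongside the default cluster network.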
Helm Chart Structure: AMF Example
Helm is the standard package manager for deploying CNFs on Kubernetes. A production AMF Helm chart follows this structure:
```
amf-chart/
├── Chart.yaml                 # name: amf, version: 3.2.1, appVersion: R16.8
├── values.yaml                # Default configuration
└── templates/
    ├── deployment.yaml        # AMF pod spec with 3 replicas
    ├── service.yaml           # ClusterIP for SBI, NodePort for N2
    ├── configmap.yaml         # AMF config (PLMN, TAC, NSSAI, NRF URI)
    ├── hpa.yaml               # Scale on CPU > 70% or registrations > 5000/s
    ├── networkattachment.yaml # Multus annotation for N2 interface
    ├── pdb.yaml               # PodDisruptionBudget: minAvailable=2
    ├── serviceaccount.yaml    # RBAC for NRF registration
    └── servicemonitor.yaml    # Prometheus scrape config
```
Key values.yaml parameters:
```yaml
replicaCount: 3
image:
  repository: registry.vendor.com/5gc/amf
  tag: "R16.8.2"
resources:
  requests:
    cpu: "4"
    memory: "8Gi"
  limits:
    cpu: "8"
    memory: "16Gi"
amf:
  plmnId:
    mcc: "310"
    mnc: "260"
  supportedNssai:
    - sst: 1
      sd: "000001"
    - sst: 2
      sd: "000002"
  n2:
    interface: net1   # Multus network attachment
    port: 38412       # SCTP port per TS 38.412
  sbi:
    scheme: https
    port: 8443
    nrfUri: "https://nrf-svc.5gc:8443"
```
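These values are consumed by the chart templates. A minimal sketch of how `templates/deployment.yaml` might wire together the replica count, image, and Multus N2 attachment (label and annotation choices here are illustrative, not from a specific vendor chart):

```yaml
# Sketch of templates/deployment.yaml; labels and structure are illustrative.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: {{ .Release.Name }}-amf
spec:
  replicas: {{ .Values.replicaCount }}
  selector:
    matchLabels:
      app: amf
  template:
    metadata:
      labels:
        app: amf
      annotations:
        # Multus resolves this to a NetworkAttachmentDefinition and adds
        # the N2 network as a secondary pod interface.
        k8s.v1.cni.cncf.io/networks: {{ .Values.amf.n2.interface }}
    spec:
      containers:
      - name: amf
        image: "{{ .Values.image.repository }}:{{ .Values.image.tag }}"
        ports:
        - name: sbi
          containerPort: {{ .Values.amf.sbi.port }}
        resources:
          {{- toYaml .Values.resources | nindent 10 }}
```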
The PodDisruptionBudget (PDB) ensures that during rolling updates or node maintenance, at least 2 AMF replicas remain available, preventing service disruption during upgrades.
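The corresponding manifest is small; a minimal sketch, assuming the AMF pods carry an `app: amf` label:

```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: amf-pdb
  namespace: 5gc
spec:
  minAvailable: 2        # never voluntarily evict below 2 AMF replicas
  selector:
    matchLabels:
      app: amf           # assumed label on the AMF Deployment's pods
```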
Real Deployment: Rakuten Symphony
Rakuten Mobile launched the world's first fully cloud-native mobile network in 2020 and has since commercialized the platform as Rakuten Symphony. Their architecture runs on bare-metal Kubernetes across 15 regional data centers in Japan.
Key metrics from Rakuten's production deployment:
- Core NFs: All 5GC functions (AMF, SMF, UPF, NRF, UDM, AUSF, PCF) run as CNFs on K8s
- Subscriber scale: 5+ million subscribers on cloud-native core as of 2025
- Infrastructure: Custom platform based on upstream Kubernetes with Wind River StarlingX for edge
- OpEx reduction: Rakuten claims 40% lower OpEx versus traditional VNF-based architectures
- Scaling: AMF scales from 3 to 12 pods in under 60 seconds during registration storms
- Update cadence: Bi-weekly rolling updates to core NFs with zero-downtime deployments
Real Deployment: Dish Network
Dish Network built a greenfield 5G network in the US using cloud-native, O-RAN-compliant architecture from day one. Their 5G core runs on AWS Outposts (on-premise AWS infrastructure) in 5 regional data centers.
- Core vendor: Multiple CNF vendors including Mavenir and Oracle
- Orchestration: Amazon EKS (Elastic Kubernetes Service) on Outposts
- Coverage target: 70% US population by 2025 (FCC build-out commitment)
- UPF placement: Distributed UPFs at 100+ edge locations for sub-20 ms latency
- Automation: Full GitOps pipeline with ArgoCD for NF deployment and configuration
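A GitOps pipeline of the kind Dish describes can be sketched as an Argo CD Application that syncs an NF's Helm chart from Git. The repository URL, path, and namespace below are placeholders, not Dish's actual configuration:

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: amf
  namespace: argocd
spec:
  project: 5gc
  source:
    repoURL: https://git.example.com/5gc/nf-charts.git   # placeholder repo
    targetRevision: main
    path: charts/amf
    helm:
      valueFiles:
      - values-prod.yaml
  destination:
    server: https://kubernetes.default.svc
    namespace: 5gc
  syncPolicy:
    automated:
      prune: true      # remove resources deleted from Git
      selfHeal: true   # revert manual drift to the Git-declared state
```

With `automated.selfHeal` enabled, any out-of-band change to the deployed NF is reverted to the state declared in Git, which is the core GitOps guarantee.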
Real Deployment: AT&T
AT&T's 5G core runs on their Network Cloud platform, built on OpenStack (for VMs) and Kubernetes (for containers) across 100+ data centers. AT&T has been progressively migrating VNFs to CNFs:
- Platform: Airship (bare-metal K8s provisioning) + StarlingX for edge sites
- Migration path: Ericsson dual-mode core running VNF and CNF modes simultaneously
- Scale: Core serving 100+ million subscribers across 4G/5G
- UPF: Ericsson UPF running as CNF with DPDK-accelerated user plane at 200+ Gbps per node
Observability Stack Comparison
Telco-grade observability requires metrics, logs, traces, and alerting across thousands of NF instances.
| Capability | Open Source Stack | Commercial Alternative | Telco Consideration |
|---|---|---|---|
| Metrics | Prometheus + Thanos | Datadog, Dynatrace | Prometheus at telco scale needs Thanos/Cortex for long-term storage |
| Visualization | Grafana | Splunk, Kibana | Grafana dashboards for per-NF KPIs (registrations/s, sessions, latency) |
| Logging | Fluentd + Elasticsearch | Splunk Enterprise | Log volume at 10+ TB/day requires index lifecycle management |
| Tracing | Jaeger / OpenTelemetry | Dynatrace, New Relic | SBI call tracing across AMF-SMF-UPF for end-to-end latency analysis |
| Alerting | Alertmanager | PagerDuty, OpsGenie | 3GPP fault management (TS 28.532) integration needed |
| Service mesh observability | Istio + Kiali | Tetrate, Solo.io | mTLS enforcement and traffic visualization for SBI |
Most Tier-1 operators run a hybrid approach: open-source Prometheus/Grafana for real-time metrics and a commercial platform (Splunk or Dynatrace) for log analytics, root cause analysis, and compliance reporting.
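As an example of wiring Alertmanager into NF KPIs, a Prometheus alerting rule might flag a rising AMF registration failure ratio. The metric names here are assumptions, since exporters differ by vendor:

```yaml
groups:
- name: amf-kpis
  rules:
  - alert: AmfRegistrationFailureRateHigh
    # Metric names are illustrative; substitute the vendor exporter's series.
    expr: |
      rate(amf_registration_failures_total[5m])
        / rate(amf_registration_attempts_total[5m]) > 0.01
    for: 5m
    labels:
      severity: critical
    annotations:
      summary: "AMF registration failure ratio above 1% for 5 minutes"
```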
Worked Example: Prometheus Scaling
For a 5GC serving 10 million subscribers with 20 NF types, averaging 10 replicas per type and 500 metrics per pod, scraped at 15-second intervals:
```
Total time series: 20 NF types × 10 replicas × 500 metrics = 100,000
Ingestion rate: 100,000 / 15 s ≈ 6,667 samples/second
Storage (30 days): 6,667 × 86,400 × 30 × 2 bytes ≈ 34 GB compressed
```
A single Prometheus instance handles this comfortably. At 50+ million subscribers, Thanos or Cortex becomes necessary for horizontal scaling and long-term storage.
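The 15-second interval in the sizing above corresponds to the global scrape interval. A minimal sketch of the Prometheus configuration using Kubernetes pod discovery, assuming the NFs live in a `5gc` namespace and opt in via the common `prometheus.io/scrape` annotation:

```yaml
global:
  scrape_interval: 15s       # matches the sizing assumption above
  scrape_timeout: 10s
scrape_configs:
- job_name: 5gc-nfs
  kubernetes_sd_configs:
  - role: pod
    namespaces:
      names: [5gc]           # assumed NF namespace
  relabel_configs:
  # Keep only pods that opt in via the prometheus.io/scrape annotation
  - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
    regex: "true"
    action: keep
```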
Worked Example: CNF Scaling Calculation
Calculate the number of AMF pods needed during a Monday morning registration storm:
```
Peak registrations: 50,000/second (morning attach storm)
AMF capacity per pod: 8,000 registrations/second (benchmarked)
Target utilization: 70%
Effective capacity per pod: 8,000 × 0.70 = 5,600/s
Required pods: 50,000 / 5,600 = 8.93 -> 9 pods
Add 1 for redundancy (PDB minAvailable): 10 pods
HPA config: minReplicas=3, maxReplicas=12
Scale-up trigger: CPU > 70% OR custom metric registrations_per_second > 5,000
```
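The HPA settings above can be expressed as an `autoscaling/v2` manifest combining the CPU target with a custom pods metric. Exposing `registrations_per_second` to the HPA requires a metrics adapter (e.g. prometheus-adapter), and the metric name is an assumption:

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: amf-hpa
  namespace: 5gc
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: amf
  minReplicas: 3
  maxReplicas: 12
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70
  - type: Pods
    pods:
      metric:
        name: registrations_per_second   # assumed custom metric via adapter
      target:
        type: AverageValue
        averageValue: "5000"
```

The HPA scales out when either metric exceeds its target, so a registration storm triggers scale-up even before CPU saturates.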
Migration Strategy
Operators migrating from VNF to CNF follow a phased approach:
- Phase 1 --- Dual-mode: Run VNF and CNF side by side, routing new subscribers to CNF
- Phase 2 --- Drain: Gradually migrate existing subscribers via controlled re-registration
- Phase 3 --- Decommission: Shut down VNF instances after full migration
- Phase 4 --- Optimize: Implement advanced K8s features (service mesh, GitOps, FinOps)
The migration typically spans 18--24 months for a Tier-1 operator. The critical dependency is ensuring CNF feature parity with the incumbent VNF, particularly for regulatory features (lawful intercept, emergency calling) specified in TS 33.127 and TS 23.167.
Key Takeaway: Cloud-native 5G core on Kubernetes delivers 40--60% better resource efficiency, sub-minute scaling, and zero-downtime upgrades compared to VNF architectures. Multus CNI, SR-IOV, and NUMA-aware scheduling are essential K8s enhancements for telco workloads. Rakuten, Dish, and AT&T prove the model works at scale.