Network Slicing and Automation for 5G (Rel-15+) – A RAN Episode

Figure 1. 5G network slices running on a common underlying multi-vendor and multi-access network. Each slice is independently managed and addresses a particular use case.
Courtesy of IEEE

Network slicing is a powerful concept, and it has long been an attractive buzzword for vendors who bundle it with the products and solutions they sell. With the arrival of 3GPP Release 16 and the products that follow it, things are starting to change. With so many solutions and requirements, finding one slicing architecture that fits all is technically complex and, from a business point of view, hard to justify on ROI. Today we will analyze the latest progress and the directions that may resolve this dilemma.

Slicing top challenges

Based on our recent work in GSMA and 3GPP, we believe the following are the top questions to answer in order to evolve and proliferate slicing solutions:

  • Can a public slicing solution fulfill vertical industry requirements?
  • How do we convince vertical industries that a slicing solution can meet their needs, such as data sovereignty, SLA, security and performance?
  • Automation and intelligence: is a public slicing solution flexible enough to provide the intelligence each industry needs?
  • Slicing in cases of 5G infrastructure sharing

Solution baseline principles

When we view slicing, or any tenant provisioning solution, it is very important that all layers end to end (E2E), including business fulfillment, network abstraction and infrastructure including wireless, adhere to the same set of principles.

A good description can be found in 3GPP TS28.553 on management and orchestration for network slicing and 3GPP TS28.554 on KPIs for 5G solutions and slicing. Once we take a systems view of network slicing, the principles can be summarized as follows:

  • Slice demarcation: a way to isolate each tenant, with the possibility of offering different slicing feature bundles to different tenants. For example, a large enterprise may need 10 features and 20 SLAs, while for a small business 5 features and 5 SLAs will do.
  • Performance: a way to build a highly performant system. The question is whether, once we engineer and orchestrate it, it will work E2E.
  • Observability: with 4B+ devices added every year, and with the industry setting a futuristic target of a million private networks by 2025, how to observe and handle such networks in real time is a pressing issue.

When we talk about slicing, we mostly speak about key technology enablers like NFV, Cloud, MEC and SDN, which is obviously great, since softwarization of the network and infrastructure is vital. However, not speaking about RAN #wireless and WNV (Wireless Network Virtualization) is not just. In this paper I want to shed some light from the RAN perspective. Consider the fact that still today around 65% of customers' CAPEX/OPEX goes into RAN and transport, so this view is vital for a solution that is both conformant and realistic. If NFV/SDN/Cloud demands sharing among a few hundred tenants, the RAN demands sharing among millions, so resource sharing, utilization and optimization are vital.

RAN Architecture

From an E2E perspective, the RAN part of a slice is selected based on the GST and NSSAI, with selection done by the UE or the core network. However, this is easier said than done. When we view E2E slicing, the following should be considered in order to build a scalable slicing solution.
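As a rough illustration of what "selected based on NSSAI" means in data terms: each entry in an NSSAI is an S-NSSAI, made of an 8-bit Slice/Service Type (SST) and an optional 24-bit Slice Differentiator (SD). The sketch below is a simplified model, not the full NAS or NGAP information element; the class and function names (`SNSSAI`, `select_ran_slice`) and the cell-support list are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

# Standardised SST values (3GPP TS 23.501, Rel-15/16).
SST_NAMES = {1: "eMBB", 2: "URLLC", 3: "MIoT", 4: "V2X"}


@dataclass(frozen=True)
class SNSSAI:
    sst: int                   # Slice/Service Type, 8 bits
    sd: Optional[int] = None   # Slice Differentiator, 24 bits, optional

    def __post_init__(self):
        if not 0 <= self.sst <= 0xFF:
            raise ValueError("SST must fit in 8 bits")
        if self.sd is not None and not 0 <= self.sd <= 0xFFFFFF:
            raise ValueError("SD must fit in 24 bits")

    def encode(self) -> bytes:
        """Simplified packing: 1 octet SST, then 3 octets SD when present."""
        out = bytes([self.sst])
        if self.sd is not None:
            out += self.sd.to_bytes(3, "big")
        return out


def select_ran_slice(requested, supported_by_cell):
    """Return the requested S-NSSAIs this cell can serve, in request order."""
    supported = set(supported_by_cell)
    return [s for s in requested if s in supported]
```

In this toy model, a UE requesting a URLLC slice from a cell that only supports eMBB simply gets an empty match for that entry, which is where the "easier said than done" part starts.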

RAN#1: Spectrum and Power resources

The massive business requirements for services and slices call for highly efficient radio resources. Luckily, low, mid and high bands combined with massive MIMO are handling this part. However, it is not just about spectrum: how to utilize it efficiently in terms of form factor and power is vital.

When we take the RAN view of slicing, it is not just the spectrum itself or the RF signal but also spectrum management across macro, femto and HetNets, including open cellular. In summary, this part is still not well understood, as it requires novel algorithms such as MINLP (mixed-integer nonlinear programming), which aims to optimize cost while increasing resource usage at the same time. As per the latest trend, a tiered RAN architecture combined with new algorithms such as game-theoretic matching driven by ML/AI is the answer to standardize this.
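To make the "optimize cost while increasing resource usage" trade-off concrete, here is a toy integer nonlinear allocation in the spirit of MINLP, solved by brute force. The utility function, weights and cost per PRB are assumptions chosen only for illustration; a real RAN scheduler would use a proper solver and far richer constraints.

```python
import math
from itertools import product


def best_allocation(weights, total_prbs, cost_per_prb):
    """Exhaustively search integer PRB splits across slices.

    Objective (assumed for illustration):
        sum_i w_i * log(1 + x_i) - cost_per_prb * sum_i x_i
    subject to sum_i x_i <= total_prbs and x_i a non-negative integer.
    """
    best, best_value = None, -math.inf
    for alloc in product(range(total_prbs + 1), repeat=len(weights)):
        if sum(alloc) > total_prbs:
            continue  # violates the capacity constraint
        value = sum(w * math.log1p(x) for w, x in zip(weights, alloc))
        value -= cost_per_prb * sum(alloc)
        if value > best_value:
            best, best_value = alloc, value
    return best, best_value
```

With weights `[4, 1]`, 10 PRBs and a cost of 0.5 per PRB, the search stops handing out PRBs once the marginal utility drops below the cost, so not all capacity is used. Exhaustive search grows exponentially with the number of slices, which is exactly why heuristic and matching-based approaches are attractive here.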

RAN#2: RAN Dis-aggregation

Just as NFV/SDN and orchestration did for the core, Open RAN and the RIC (RAN Intelligent Controller) will do for the RAN. If you want to know more, you may want to check the author's writeup on RAN evolution.

RAN#3: RAN resource optimization

Based on our field trials, we find that the use of Edge and MEC with RAN, especially for CDN, saves around 25% of resources. RAN caching is vital and, combined with LBO (local breakout), will help Telcos fulfill the very pressing requirements from verticals. Again, this is not just a cloudlet and software issue, as different RAN architectures such as D2D RAN, HetNet and macro require different approaches.

RAN#4: Midhaul optimization

Midhaul and backhaul capacity optimization is vital for slicing delivery, and today this domain is still in the R&D funnel. A TIP project, CANDI (Converged Architectures for Network Disaggregation & Integration), is evolving to address this requirement.

RAN#5: Edge cost model

The edge solution for slicing in the context of RAN is a cost model problem: how many MEC servers are needed, where they are located, and how far they can relieve RAN and RF layer processing is the key. Our latest work on Telco edge cloud, with different models for different site configurations, is the answer.

RAN#6: Isolation, elasticity and resource limitation

This is the most important issue for RAN slicing, primarily because these are conflicting dimensions. Too much resource isolation may make it impossible to share resources and will limit services during peak and critical times; similarly, too much elasticity will make isolation and separation practically impossible. Matching algorithms are the answer, as they help build a RAN system that is not only less complex but also highly conformant. This is a make or break for RAN architecture for slicing.
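One simple way to picture the isolation versus elasticity tension is a guaranteed floor per slice plus a shared pool built from whatever is left unused. The sketch below is an assumed policy for illustration only, not a matching algorithm and not any vendor's scheduler; the function name `share_prbs` and the slice names are hypothetical.

```python
def share_prbs(total, guaranteed, demand):
    """Split `total` PRBs across slices.

    guaranteed: dict slice -> PRBs reserved for it (isolation floor)
    demand:     dict slice -> PRBs it currently wants
    PRBs not claimed under the guarantees form a common pool that slices
    with unmet demand borrow from, in proportion to that unmet demand
    (elasticity).
    """
    if sum(guaranteed.values()) > total:
        raise ValueError("guarantees exceed cell capacity")
    # Isolation: each slice first gets up to its guaranteed share.
    grant = {s: min(demand.get(s, 0), g) for s, g in guaranteed.items()}
    pool = total - sum(grant.values())
    unmet = {s: demand.get(s, 0) - grant[s] for s in grant}
    total_unmet = sum(unmet.values())
    # Elasticity: lend the leftover pool proportionally, rounding down.
    if total_unmet > 0 and pool > 0:
        scale = min(1.0, pool / total_unmet)
        for s in grant:
            grant[s] += int(unmet[s] * scale)
    return grant
```

Raising the guarantees shrinks the pool (more isolation, less sharing); lowering them does the opposite, which is the trade-off described above in miniature.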

RAN#7: RAN infrastructure sharing for 5G

Infrastructure sharing has already started between big players in Europe. The question that arises is what happens when a user purchases a slice and service from a tenant. Consider a wholesale view in which the infrastructure is provided by sharing and bundling resources from all national carriers, because the 5G infrastructure of a single operator is not sufficient from either a coverage or a capacity perspective.

RAN#8: RAN resource RAGF problem

In case of service mobility or congestion, how can the UE quickly access resources, possibly in another sector or site?

RAN#9: Slice SLA

Slice SLAs and their real-time monitoring are key business requirements. However, imagine a situation where a shortage in the shared resource pool makes it impossible to deliver the SLA.
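A minimal sketch of what per-slice SLA monitoring could look like, assuming the SLA is expressed as "at least a given fraction of latency samples must stay under a bound". The metric, thresholds and the `sla_report` name are illustrative assumptions, not taken from any 3GPP KPI definition.

```python
def sla_report(samples_ms, targets):
    """Check latency samples against per-slice SLA targets.

    samples_ms: dict slice -> list of measured latencies in ms
    targets:    dict slice -> (max_latency_ms, min_compliance_ratio)
    Returns dict slice -> (observed_ratio, compliant) or None when
    no samples are available for that slice.
    """
    report = {}
    for slice_id, (max_ms, min_ratio) in targets.items():
        values = samples_ms.get(slice_id, [])
        if not values:
            report[slice_id] = None  # no data, cannot judge
            continue
        within = sum(1 for v in values if v <= max_ms)
        ratio = within / len(values)
        report[slice_id] = (ratio, ratio >= min_ratio)
    return report
```

When the shared pool runs short, a report like this shows which slice breaches first, but it cannot by itself say which tenant should give resources back; that remains a policy decision.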

RAN#10: Slice operations

Slice operations are not just a matter for BSS and operations teams, as real-time RAN resource usage and optimization is necessary. Have you ever thought about how a perfectly managed slice can coexist with normal Telco services, especially during a key event when many users will use the service? I think this is a dimension that is still not well addressed. I have no hesitation in saying that when CXOs of many enterprises are convinced they should build their own 5G private network, this is exactly the problem they fear.

Summary

In today's writeup I have tried to explain the current progress, the challenges and the steps to build a successful slicing solution while wearing the hat of a RAN architect. I believe it is very important to see the radio viewpoint, which I firmly believe has not received its due respect and attention from either standards bodies or vendors. In my coming blog I shall summarize some key gaps and how we can approach them, as slicing products and solutions are still not carrier grade and need further tuning to ensure E2E slicing and service fulfillment.

Why Cloud and 5G CNF architects must analyze Docker deprecation after Kubernetes 1.20

Kubernetes is deprecating Docker as a container runtime (through the dockershim CRI shim) from v1.20 onwards, which opens the way for all applications to converge on a single image standard, OCI. Consider this in light of what 5G Telco CNFs need from secondary networking and from running:

  • Service-aware protocols like SIP
  • Connection-aware protocols like SCTP multi-homing
  • Ensuring regulatory requirements, especially on traffic separation
  • Load balancing
  • Network isolation
  • Network acceleration

#CNF suppliers today already prefer OCI over #docker #ship. In the long run it will obviously support portability of all applications across cloud platforms. On the negative side, it will impact our toolchains, especially where we run Docker inside Docker for image builds; tools such as #kaniko, #img and, most importantly, #buildah are the alternatives to look at.
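A first practical step is to find out which nodes still run Docker as their runtime. Each Node object reports this in `status.nodeInfo.containerRuntimeVersion` (for example `docker://...` or `containerd://...`). The sketch below parses the output of `kubectl get nodes -o json`; the helper name `nodes_using_docker` is hypothetical.

```python
import json


def nodes_using_docker(nodes_json: str):
    """Return names of nodes whose kubelet reports a docker:// runtime.

    `nodes_json` is the text produced by `kubectl get nodes -o json`.
    """
    doc = json.loads(nodes_json)
    hits = []
    for node in doc.get("items", []):
        node_info = node.get("status", {}).get("nodeInfo", {})
        runtime = node_info.get("containerRuntimeVersion", "")
        if runtime.startswith("docker://"):
            hits.append(node["metadata"]["name"])
    return hits
```

Nodes returned by this check are the ones whose CNF workloads and build pipelines should be reviewed before the dockershim is removed.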

If you are an architect who wants to solve this challenge, or a developer who is a little anxious about application #LCM, kindly refer to the community blog post below.

https://kubernetes.io/blog/2020/12/08/kubernetes-1-20-release-announcement/

Here is the detailed writeup from the community for quick reference.

Kubernetes 1.20: The Raddest Release

Tuesday, December 08, 2020

Authors: Kubernetes 1.20 Release Team

We’re pleased to announce the release of Kubernetes 1.20, our third and final release of 2020! This release consists of 42 enhancements: 11 enhancements have graduated to stable, 15 enhancements are moving to beta, and 16 enhancements are entering alpha.

The 1.20 release cycle returned to its normal cadence of 11 weeks following the previous extended release cycle. This is one of the most feature dense releases in a while: the Kubernetes innovation cycle is still trending upward. This release has more alpha than stable enhancements, showing that there is still much to explore in the cloud native ecosystem.

Major Themes

Volume Snapshot Operations Goes Stable

This feature provides a standard way to trigger volume snapshot operations and allows users to incorporate snapshot operations in a portable manner on any Kubernetes environment and supported storage providers.

Additionally, these Kubernetes snapshot primitives act as basic building blocks that unlock the ability to develop advanced, enterprise-grade, storage administration features for Kubernetes, including application or cluster level backup solutions.

Note that snapshot support requires Kubernetes distributors to bundle the Snapshot controller, Snapshot CRDs, and validation webhook. A CSI driver supporting the snapshot functionality must also be deployed on the cluster.

Kubectl Debug Graduates to Beta

The kubectl alpha debug feature graduates to beta in 1.20, becoming kubectl debug. The feature provides support for common debugging workflows directly from kubectl. Troubleshooting scenarios supported in this release of kubectl include:

  • Troubleshoot workloads that crash on startup by creating a copy of the pod that uses a different container image or command.
  • Troubleshoot distroless containers by adding a new container with debugging tools, either in a new copy of the pod or using an ephemeral container. (Ephemeral containers are an alpha feature that are not enabled by default.)
  • Troubleshoot on a node by creating a container running in the host namespaces and with access to the host’s filesystem.

Note that as a new built-in command, kubectl debug takes priority over any kubectl plugin named “debug”. You must rename the affected plugin.

Invocations using kubectl alpha debug are now deprecated and will be removed in a subsequent release. Update your scripts to use kubectl debug. For more information about kubectl debug, see Debugging Running Pods.

Beta: API Priority and Fairness

Introduced in 1.18, Kubernetes 1.20 now enables API Priority and Fairness (APF) by default. This allows kube-apiserver to categorize incoming requests by priority levels.

Alpha with updates: IPV4/IPV6

The IPv4/IPv6 dual stack has been reimplemented to support dual stack services based on user and community feedback. This allows both IPv4 and IPv6 service cluster IP addresses to be assigned to a single service, and also enables a service to be transitioned from single to dual IP stack and vice versa.

GA: Process PID Limiting for Stability

Process IDs (pids) are a fundamental resource on Linux hosts. It is trivial to hit the task limit without hitting any other resource limits and cause instability to a host machine.

Administrators require mechanisms to ensure that user pods cannot induce pid exhaustion that prevents host daemons (runtime, kubelet, etc) from running. In addition, it is important to ensure that pids are limited among pods in order to ensure they have limited impact to other workloads on the node. After being enabled-by-default for a year, SIG Node graduates PID Limits to GA on both SupportNodePidsLimit (node-to-pod PID isolation) and SupportPodPidsLimit (ability to limit PIDs per pod).

Alpha: Graceful node shutdown

Users and cluster administrators expect that pods will adhere to expected pod lifecycle including pod termination. Currently, when a node shuts down, pods do not follow the expected pod termination lifecycle and are not terminated gracefully which can cause issues for some workloads. The GracefulNodeShutdown feature is now in Alpha. GracefulNodeShutdown makes the kubelet aware of node system shutdowns, enabling graceful termination of pods during a system shutdown.

Major Changes

Dockershim Deprecation

Dockershim, the container runtime interface (CRI) shim for Docker is being deprecated. Support for Docker is deprecated and will be removed in a future release. Docker-produced images will continue to work in your cluster with all CRI compliant runtimes as Docker images follow the Open Container Initiative (OCI) image specification. The Kubernetes community has written a detailed blog post about deprecation with a dedicated FAQ page for it.

Exec Probe Timeout Handling

A longstanding bug regarding exec probe timeouts that may impact existing pod definitions has been fixed. Prior to this fix, the field timeoutSeconds was not respected for exec probes. Instead, probes would run indefinitely, even past their configured deadline, until a result was returned. With this change, the default value of 1 second will be applied if a value is not specified and existing pod definitions may no longer be sufficient if a probe takes longer than one second. A feature gate, called ExecProbeTimeout, has been added with this fix that enables cluster operators to revert to the previous behavior, but this will be locked and removed in subsequent releases. In order to revert to the previous behavior, cluster operators should set this feature gate to false.
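Since the 1-second default now actually applies, it can help to scan pod manifests for exec probes that never set `timeoutSeconds`. The following is a minimal sketch that works on a manifest already loaded into a Python dict (for example from YAML, before the API server fills in defaults); the function name and the tuple it returns are illustrative choices.

```python
PROBE_FIELDS = ("livenessProbe", "readinessProbe", "startupProbe")


def exec_probes_without_timeout(pod):
    """List (container name, probe field) pairs for exec probes that
    rely on the default timeout because timeoutSeconds is not set."""
    findings = []
    for container in pod.get("spec", {}).get("containers", []):
        for field in PROBE_FIELDS:
            probe = container.get(field)
            if probe and "exec" in probe and "timeoutSeconds" not in probe:
                findings.append((container.get("name"), field))
    return findings
```

Any probe reported here should either be confirmed to finish within one second or be given an explicit `timeoutSeconds`.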

Please review the updated documentation regarding configuring probes for more details.

Other Updates

Graduated to Stable

Notable Feature Updates

Release notes

You can check out the full details of the 1.20 release in the release notes.

Availability of release

Kubernetes 1.20 is available for download on GitHub. There are some great resources out there for getting started with Kubernetes. You can check out some interactive tutorials on the main Kubernetes site, or run a local cluster on your machine using Docker containers with kind. If you’d like to try building a cluster from scratch, check out the Kubernetes the Hard Way tutorial by Kelsey Hightower.

Release Team

This release was made possible by a very dedicated group of individuals, who came together as a team in the midst of a lot of things happening out in the world. A huge thank you to the release lead Jeremy Rickard, and to everyone else on the release team for supporting each other, and working so hard to deliver the 1.20 release for the community.

Release Logo

Kubernetes 1.20 Release Logo

raddest: adjective, Slang. excellent; wonderful; cool:

The Kubernetes 1.20 Release has been the raddest release yet.

2020 has been a challenging year for many of us, but Kubernetes contributors have delivered a record-breaking number of enhancements in this release. That is a great accomplishment, so the release lead wanted to end the year with a little bit of levity and pay homage to Kubernetes 1.14 – Caturnetes with a “rad” cat named Humphrey.

Humphrey is the release lead’s cat and has a permanent blep. Rad was pretty common slang in the 1990s in the United States, and so were laser backgrounds. Humphrey in a 1990s style school picture felt like a fun way to end the year. Hopefully, Humphrey and his blep bring you a little joy at the end of 2020!

The release logo was created by Henry Hsu – @robotdancebattle.

User Highlights

Project Velocity

The CNCF K8s DevStats project aggregates a number of interesting data points related to the velocity of Kubernetes and various sub-projects. This includes everything from individual contributions to the number of companies that are contributing, and is a neat illustration of the depth and breadth of effort that goes into evolving this ecosystem.

In the v1.20 release cycle, which ran for 11 weeks (September 25 to December 9), we saw contributions from 967 companies and 1335 individuals (44 of whom made their first Kubernetes contribution) from 26 countries.

Ecosystem Updates

  • KubeCon North America just wrapped up three weeks ago, the second such event to be virtual! All talks are now available to all on-demand for anyone still needing to catch up!
  • In June, the Kubernetes community formed a new working group as a direct response to the Black Lives Matter protests occurring across America. WG Naming’s goal is to remove harmful and unclear language in the Kubernetes project as completely as possible and to do so in a way that is portable to other CNCF projects. A great introductory talk on this important work and how it is conducted was given at KubeCon 2020 North America, and the initial impact of this labor can actually be seen in the v1.20 release.
  • Previously announced this summer, The Certified Kubernetes Security Specialist (CKS) Certification was released during Kubecon NA for immediate scheduling! Following the model of CKA and CKAD, the CKS is a performance-based exam, focused on security-themed competencies and domains. This exam is targeted at current CKA holders, particularly those who want to round out their baseline knowledge in securing cloud workloads (which is all of us, right?).

Event Updates

KubeCon + CloudNativeCon Europe 2021 will take place May 4 – 7, 2021! Registration will open on January 11. You can find more information about the conference here. Remember that the CFP closes on Sunday, December 13, 11:59pm PST!

Upcoming release webinar

Stay tuned for the upcoming release webinar happening this January.

Get Involved

If you’re interested in contributing to the Kubernetes community, Special Interest Groups (SIGs) are a great starting point. Many of them may align with your interests! If there are things you’d like to share with the community, you can join the weekly community meeting, or use any of the following channels:

Developing Edge Solutions for Telcos and Enterprise

According to the latest market research, most 5G Edge use cases will be realized in the next 12-24 months. The time to act is now if Telcos want to leave themselves a chance. The reason is clear: this is enough time for hyperscalers to cannibalize the market, something we already witnessed with OTTs in 3G and with VoD and content streaming in 4G.

Below are my thoughts on

  • What is the definition of Edge
  • What is Edge differentiation
  • Why Telcos should care about it
  • Why software architecture is so vital for Telco Edge success

5G Site Solutions Disaggregation using Open RAN

According to the latest market insights, RAN innovation in telecom lags behind other initiatives by 7 years, which calls for more innovative and disruptive delivery models for site solutions, especially for the next wave of 5G solutions.

However, to reach the goal of a fully distributed and Open RAN, we need to build a pragmatic view of brownfields and find the sweet spot for its introduction and wide adoption.

Here are my latest thoughts on this and on how telecom operators should adopt it. There is still time before industry-wide adoption of Open RAN, but as you will find, the time to act is now.

What you will learn

  • Building Delivery Models for Open RAN in a brownfield
  • Understand the what, when and how of Open RAN
  • What is Open RAN and its relation with 5G
  • Current Industry solutions
  • Define phases of Open RAN delivery
  • Present and Next Steps
  • Architecture and State of Play

Australia continues to innovate 5G with Applications and Industry

How 5G will change the geo maps and the economy in the post Covid-19 world: some key initiatives we can learn from Australia's 5G ramp-up.

1. The Australian Federal Government will invest more than $21M to boost industrial use of 5G across Australia.

2. The key industries the Government wants to harness through this grant are agriculture, mining, logistics and manufacturing.

3. Attracting top development talent is a key target as 5G comes to life, with 1/3 of Aussies already covered by 5G.

4. More than $8.1M will go to improving spectrum, including adoption of DSS and launch of the 26 GHz mmWave band.

5. The Government plans at least 1 GHz of spectrum availability for URLLC (above the 279 MHz in the US and 300 MHz in Western Europe).

6. Government smart rollout initiatives plan to increase 5G throughput from 300 Mbps to above 1 Gbps by 2021+.

With the 5G iPhone launching this month, I am optimistic in every sense that Australia will remain one of the leading markets for 5G.

#MyAustralia

https://www.canstarblue.com.au/about-canstar-blue/

Top Considerations to build a Future Transport for 5G Networks

source: NEC Future Networks

According to the latest reports from leading transport infrastructure vendors Ericsson and NEC, it is vital to build a next-era transport for 5G that can serve MPLS and SRv6 end to end. When we think of future business cases like end-to-end network slicing, it will become even more important.

These are

source: Ericsson

Transport for Open/C-RAN

As 5G will be deployed in different scenarios, and cloud will be the most lucrative one in the long term, transport must align with it by offering packet- and Ethernet-based transport end to end from device to core, preferably using eCPRI models that have significantly lower latency.

Outdoor Dense Urban solutions

Outdoors in city centers, laying next-generation fiber may not be a feasible solution, so offering connectivity using PON and microwave (MW) is needed. That is why it is important not to lock in to fiber-only solutions.

Automation of Transport

Just like other domains such as RAN, core and edge, the transport also needs to be orchestrated and optimized over time using ML and AI. The key is to build a transport that can be automated using T-SDN, such as ACTN workflows or Open ROADM, and that helps collect all the data that can then be optimized with AI. Cool stuff.

Future DC architectures

Transport should be flexible enough to offer transport based not only on the 3GPP service model but also on DC architectures, such as solutions for Clos architectures, disaggregated scenarios, etc.

Distributed architectures

Just like other solutions in 5G, the transport should be distributed, supporting hub-and-spoke rather than full mesh, for reliability, performance and scale.
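The scale part of this argument is simple arithmetic: a full mesh of n sites needs n(n-1)/2 links, while hub-and-spoke needs only n-1. A small sketch, with illustrative function names, counting sites with the hub included:

```python
def full_mesh_links(sites: int) -> int:
    """Point-to-point links needed to connect every site to every other."""
    return sites * (sites - 1) // 2


def hub_and_spoke_links(sites: int) -> int:
    """Links needed when every non-hub site connects only to the hub."""
    return max(sites - 1, 0)
```

For 10 sites this is 45 links against 9, and for 100 sites 4950 against 99, so the gap widens quadratically as 5G sites densify.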

Secure architectures

Security is nevertheless of paramount importance, especially as we open the transport to third parties and wholesale connectivity. Although this domain is well addressed in orchestration and end-to-end slicing, as the adage below reminds us, we must not leave the software boundary unattended.

“Protecting a software with another software in a cloud is not scalable.”

–Saad Sheikh

Using Cloud and AI to Differentiate your 5G Investment

Source: Disney

In a recent webinar about how to build successful 5G networks, a question that stuck in my mind was:

“How successful can we be in addressing a fundamentally new problem with a new technology if we still use old principles to build our telecom networks, without disrupting the supply chains?”

I think the answer to this type of question, in the context of 5G, fundamentally depends on the following two key initiatives.

  1. How to use Radio spectrum to gain strategic advantage over competitors
  2. How to use Cloud to gain advantage for 5G

Radio spectrum is a complex topic driven by many factors, such as regulation and existing use of spectrum, which make real 5G slightly different from what is possible with the spectrum available today. This alone is not enough, as small cells versus Wi-Fi 6 will again depend on how 5G uses spectrum. I will leave these details for a future discussion and focus here on Cloud and how it will make your 5G successful.

During our recent work within the ETSI NFV Release 4 SOL WG, GSMA and LFN CNTT, we have discussed and agreed on a number of ways Cloud can help you differentiate your 5G network. Knowing these can be a real game changer for Opcos that are investing in 5G and future networks.

Homogeneity

A homogeneous infrastructure platform for 5G that can be used by all applications: traditional 5G CNFs, MEC, developer applications and any legacy IT/OTT applications that need to be offered to users. Examples are OpenShift or VMware edge and last-mile solutions using technologies like CNV or VCF7.0/NSXT3.0, which build the edge clouds in an automated manner and enable day 2 operations through standard tools, whether you use VMs, containers or bare metal (BM) as the baseline architecture.

A uniform IPI that can be deployed using standard Redfish solutions, such as the one from HPE, makes it possible to build 5G using the clone technology used across much of the automotive industry today, which enabled that industry to produce with minimum toil.

Scalability

Scalability in the last mile is the most important criterion for 5G success. For example, a compute solution that can scale and provide the power to process all sorts of workloads at the edge is certainly a make or break for 5G. When it comes to data, one example is storage and disk: solutions like RedHat Ceph3.0, which supports compression from Q3 2020 through its BlueStore offering and can integrate CephFS with NFS support, make real convergence possible.

Convergence vs Automation

IT SRE and DevOps have gained a lot of traction recently, and not without reason. They have certainly reduced the CFO's bills, which is why Telcos want to achieve the same. However, the requirements of Telco workloads are unique, which leads us to understand that real automation is never possible without standard modeling.

On the Cloud side, using TOSCA models together with solutions like Automation Hub and a secure catalog and registry means we can both model varying workload requirements and automate them in the same fashion. Further, we can do advanced testing like the kind we have been doing with pyATS.

Registries and Repositories

The concept of a 5G factory, which we have been rigorously trying to achieve in Middle East Telco projects, is made possible by secure registries like Quay for containers and Docker Hub, and their integration with Jenkins and CI/CD tools for Telco.

It should be no surprise when I tell you these are the most important differentiators as we introduce public clouds for 5G.

Operators

The programmability of immutable infrastructure is the biblical principle for 5G networks. Service Mesh, NSM and serverless are all deployed as operators, which are practically CNI programs that make your infrastructure follow software YAML instead of tightly coupled instructions. Furthermore, the Operator supports full automation of both day 0 and day 2 infrastructure tasks.

For K8s it is supported today, while for VMs it will be fully available in Dec 2020.

OpenShift Service Mesh for 5G CP CNFs is possible today with:

  • Istio
  • Grafana
  • Prometheus
  • Kiali
  • Jaeger

Further to that, we have faced a number of issues taking Docker to Telco, and the use of CRI-O and Podman will certainly help advance 5G.

“Podman is more lightweight than CRI-O, so you should expect it to perform better on the 5G edge.”

5G Integration

Red Hat Fuse Online is one of the solutions that abstracts the infrastructure and makes it possible to bring the developer, integrator and tester together in one tool. Besides containers, it also standardizes your VMs, for example a VM in OpenShift running an FTP service, which makes it possible to run it alongside native containers. Fuse Online provides a data mapper to help you do this: in a flow, at each point where you need to map data fields, add a data mapper step.

Red Hat® Fuse is a distributed integration platform with standalone, cloud, and iPaaS deployment options. Using Fuse, integration experts, application developers, and business users can independently develop connected solutions in the environment of their choice. This unified platform lets users collaborate, access self-service capabilities, and enforce governance.

An SDK is definitely helpful for a 5G platform, especially when it comes to opening your networks to developers who need .NET or Java. Quarkus from Red Hat is a Kubernetes-native full-stack Java framework that aims to optimize work with Java virtual machines.

Quarkus provides tools for Quarkus applications developers, helping them reduce the size of Java application and container image footprint, eliminate programming baggage, and reduce the amount of memory required.

Advanced Cluster Management

With a huge number of 5G sites and the future scenario of site sharing between operators, there will be a real need to deploy apps and manage them in a hybrid cloud scenario, and nothing explains it better than Burr Sutter's demo at the Red Hat Summit. A cool video from the Red Hat team is available if you want to learn more.

In summary, you can manage:

  • 5K+ pods
  • Cluster creation in hybrid clouds such as AWS, GCP, Azure, bare metal and on-prem
  • Policy management
  • Secure deployment by validating YAML and images using Quay/Clair, sorted by labels
  • The possibility for developers to create and deploy policies using a GUI

Above all, RHACM makes it possible to measure the SLA of clusters and optimize workloads, for example by shifting them to other clusters in an automated manner. Certainly a cool thing for 5G when serving heavy-lift and content-driven applications.

Heavy Lifting of Workloads

The proponents of silo vendor solutions often tell us that 5G baseband processing and eCPRI heavy lifting with parallel processing make x86 an impractical choice for adopting the classical cloud way.

However, the latest Intel Atom series with FPGAs and NVIDIA GPUs means we can not only solve radio issues, such as the ones we are trying to solve in Open RAN, but also introduce the latest technologies like AI and ML in 5G-era networks. Those who are more interested in this domain can refer to the latest work in ITU.

For ML/AI, many use cases in 5G are possible today in both Telco and vertical industries such as automobiles and warehouse monitoring, using the GPU Operator and Topology Manager, which give visibility into GPU, NIC, bandwidth, performance, etc.

Open Policy Pipeline can optimize the ML model itself using analytics functions of the Cloud.

When it comes to the value of Cloud to data scientists in 5G, platforms like OCP or HPE BlueData offer the following:

  • Anaconda tool sets for programming
  • Jupyter notebooks
  • CUDA and other similar libraries
  • Report on both Log and Policy compliance
  • Tekton Pipeline in OCP for CI/CD of ML/AI use cases
  • Models are built in Jupyter by data scientists and triggered in the Tekton pipeline

Finally, using OCP Open Model Manager we can register, deploy and monitor open source models in one central environment, uniting data scientists and IT/DevOps.

Summary

The most important takeaway is that to take full advantage of 5G we need to follow not only 3GPP and traditional Telecom SQIs but also the advantages offered by Cloud. This is the only way both to manage a TCO-attractive 5G and to introduce the niche players and new services that will be required to build and drive a post-COVID-19 world economy.

Disrupting Telco Access rollouts through Open and Dis-aggregated RAN Networks

“Open, Intelligent, virtualized and fully interoperable”

O-RAN alliance

Source: O-RAN

During the last decade the disaggregation movement, which initially started to address only a handful of business requirements, has come far and delivered results. This early success has enabled many verticals to build robust architectures through which we can solve both current and future business requirements and define a new momentum in a global post-COVID-19 economy.

For the CSP and TMT industry especially, we have seen the Cloud movement gain full momentum, and it is becoming more evident that we can only reap the real advantages of this transformation if the technology building blocks (the Legos) are aligned with new principles for building infrastructure and networks. The list of such principles can be big and daunting; however, an architect can start by breaking this list down into the top initiatives, viz.

  • New business models (to expand from traditional B2C to B2B and future B2B2X (SaaS) offerings)
  • NE coding principles towards Cloud Native
  • Focus towards Enterprise offerings
  • Automation and efficiency
  • Any Access
  • Converged Transport and Finally
  • Dis-aggregated and Open  RAN

All of these initiatives are intermingled and have to be achieved along the transformation journey. However, some initiatives have a profound impact on a CSP's investment and CAPEX/OPEX spending portfolio, and RAN comes at the top of that list.

The RAN (Radio Access Network), which directly connects the subscriber to the large Telco infrastructure, covers more than 5 billion users and defines 1 trillion $ of annual global revenue, so it is certainly the most important infrastructure that we need to modernize and cloudify in an efficient manner. RAN has historically proved to be a big cash cow for the Telco industry. In fact, as per our analysis, this piece swallows at least 70% of operator CAPEX. So the traditional way of building RAN networks, by building a complete radio/TXN network using proprietary solutions, is neither attractive to the CFO nor scalable enough to deliver future requirements for new services in the 5G world. In this paper I will try to share my viewpoint on how to start building disaggregated RAN networks.

How to build the Remote and Edge Clouds

We believe that in future every new 5G site will be an edge site, which makes it necessary to build both edge and RAN infrastructure on common, shared cloud infrastructure. When it comes to defining the characteristics of this shared infrastructure, the most important piece will be the operations and manageability of this large and scalable infrastructure.

Next, the most important piece will be the networking, primarily because we CSPs have traditionally been doing this for decades: wiring and connecting all solutions to offer them as a service to our customers. We have already seen some vendors bring commercial offerings to address these points, for example VMware NSX-T 3.0 and Juniper 2.0 with unified SDN and analytics for all core, metro, edge and public clouds. Red Hat is another such vendor, whose massively scalable AMQP-based OpenStack and KubeVirt-based containerized platform support the vision of a completely automated infrastructure that is provisioned and managed by a central controller.

Future Disaggregated Hardware

Proponents of closed RAN (at least from RU to DU) often claim that open x86 hardware cannot compete with the high processing and performance required for a real-time RAN. However, the arrival of O-RAN (Open RAN Alliance) open interfaces, advancements in silicon, and collaborative community-driven efforts such as the TIP-driven Open RAN have made it possible to open up the entire network and to build it on common principles using OEM offerings from diverse suppliers. Many hardware advances need to be carefully tailored to make Open RAN successful, such as ASICs (especially for networking), GPUs, FPGAs and NPUs, and decisions should weigh priorities for time-to-market, cost-performance and reconfigurability, as well as scalability and the SW ecosystem. Parallel-processing-based accelerators can easily handle the performance of OFDMA and massive MIMO. In addition, the use of AI in edge computing is fueling the emergence of new AI processors beyond the GPU, such as the TPU or spatial processors.

One such recent advancement is the Intel P-Series Atom processor, which is best suited to building Open RAN solutions. Similarly, to converge hardware at the sites, Intel has announced the C-series Atom processors, which are primarily used to deploy edge and SD-WAN CPEs. Such convergence is vital for the industry to achieve the vision of truly open and converged networks at the edge and in the last mile.

One key point when we view hardware modernization for Open RAN is to look at the advantages of software, e.g. DPDK itself. For example, the recent DPDK 19.05 release only requires 1-2 CPU cores per server, so the value coming from Smart NICs is declining as the industry reaches a new efficiency in software. Again, the latest offerings from the Intel chipset are a good example: the latest Cascade Lake Refresh adds 30~40% more power at the same cost, which will make it possible not only to virtualize the RAN but also to run AI and data science processing at the edge and in the RAN itself.

Having said this, hardware profile standardization of Open RAN components is already in full momentum, with the first Beta release coming in February 2020, and we expect a GA offer to be ready by the end of 2020. So we can expect commercial-ready GA Open RAN solutions by Q1 2021.

Cost Efficiency of Open RAN and 5G Transport Networks

Although operators have gained early experience with 5G, the cost of deploying the networks is obviously quite high. This is primarily because RAN networks are still not fully virtualized, which forces split options that carry a lot of transmission overhead. For example, based on our trials in the Middle East, most vendors come with C-RAN delivery models supporting only Options 2-4, which waste more than 20% of transmission resources. Open RAN will make it possible to apply Option 7 to split the resources at the RF and lower physical layers. Open RAN-led network trials will enable cross-layer network modeling to assess various scenarios.

Understanding O-RAN , Open RAN , 3GPP

In recent workshops with customers I have seen even senior engineers sometimes confuse O-RAN, Open RAN and 3GPP itself. One common example is when some RAN teams claim that O-RAN is not 3GPP compliant and vice versa, which obviously is a wrong comparison, so it is important to share some thoughts on how exactly it works.

O-RAN:

As a matter of principle, O-RAN divides the whole RAN architecture into the RT-RIC (real-time RAN Intelligent Controller), O-CU-CP (Open RAN CU control plane), O-CU-UP (Open RAN CU user plane), O-DU (Open RAN DU) and O-RU (Open RAN RU). The whole system view, interconnected by interfaces, is as follows

Source: O-RAN alliance

A1: The interface between RIC elements, primarily for policy control and for implementing AI/ML at the RAN side

O1: The O&M and monitoring interface

O2: The interface through which the orchestration layer controls and provisions the RAN cloud components

E2 interface: The interface that connects all the E2 nodes to the RT RIC. These E2 nodes can be CU-CP, CU-UP, DU or even a 4G O-eNB. As CUs are placed between RIC and DU, the location of the CU will be left to the decision of RAN vendors: either co-located with the RIC or co-located with the DU. This choice will be important for both telcos and vendors, along with the choice of open fronthaul. What is your choice?

Open Fronthaul interface: The interface between DU and RU
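The interface list above can be captured as a small lookup, which is handy when reasoning about which interface a given pair of nodes talks over. This is only a sketch built from the descriptions in this section: the node labels and the `interface_between` helper are illustrative names, and A1 and O1 are left out because their endpoints are not spelled out here.

```python
from typing import Optional

# E2 nodes as listed above: they all connect to the RT RIC over E2
E2_NODES = {"CU-CP", "CU-UP", "DU", "O-eNB"}


def interface_between(a: str, b: str) -> Optional[str]:
    """Return the O-RAN interface connecting two nodes, or None if unknown."""
    pair = {a, b}
    if pair == {"DU", "RU"}:
        return "Open Fronthaul"
    if pair == {"Orchestration", "O-Cloud"}:
        return "O2"
    # E2 links the RT RIC to exactly one E2 node
    if "RT RIC" in pair and len(pair) == 2 and (pair - {"RT RIC"}) <= E2_NODES:
        return "E2"
    return None


print(interface_between("DU", "RU"))         # Open Fronthaul
print(interface_between("RT RIC", "CU-UP"))  # E2
print(interface_between("RU", "RT RIC"))     # None
```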

3GPP and its relation with O-RAN:

3GPP is the primary body that defines the system architecture and interfaces for horizontal service integration and specifications. 3GPP also defines a number of the interfaces used in O-RAN, starting from 3GPP Release 15; a nice description of the interfaces can be found in 3GPP TS 31.905

The following interfaces are defined and managed by 3GPP, and O-RAN is fully aligned with them as a user or consumer of these interfaces:

  1. E1
  2. F1
  3. NG
  4. X2
  5. Xn
  6. Uu

Open RAN:

Open RAN, which is a TIP-led initiative, is the complete E2E validation of the O-RAN-defined RAN architecture and also includes other components like cloud, orchestration, testing and validation of the full stack. The details can be found here: https://telecominfraproject.com/openran/

“Open RAN Project Group is an initiative to define and build RAN solutions using a general purpose vendor neutral hardware and software defined technology “.

The flexibility of multivendor solutions enables a diverse ecosystem for operators to choose best-of-breed options for their 2G/3G/4G and 5G deployments. Solutions can be implemented on bare metal as well as on virtualized or containerized platforms.

One of the most important directions for Open RAN is to cover all 2G, 3G, 4G and 5G NR solutions. Looking at the market in the Middle East, where many operators will shut down 2G by 2022, it is especially important to look for Open RAN solutions for 5G and 4G, and in some cases for 3G networks. Such pilots and network blueprinting will be vital when we need to deploy such solutions in brownfields.

In addition, it will make possible co-deployment and re-use of scarce spectrum resources between all access technologies.

RIC and Deployment Blue Prints for Open RAN Deployments

RAN intelligent controller (RIC) provides an interface between the RAN controller at the 5G core and the access network, enabling policy-driven closed-loop automation. The RIC is the interface piece that concatenates the RU, DU, and CU into the Open RAN (Full Functional Solution) through software programming and agile service delivery.

The introduction of many horizontal as well as vertical interfaces means more complexity in deploying and managing Open RAN networks. This requires a standard blueprint for deploying such networks, as provided by the Linux Foundation Akraino stack, which abstracts the infrastructure behind standard APIs that can be consumed by the upper layers. The RIC blueprint, as depicted below, provides an easy and flexible way to manage Open RAN and has the following characteristics

  • Built from reusable components of the “Telco Appliance” blueprint family
  • Automated Continuous Deployment pipeline testing the full software stack (bottom to top, from firmware up to and including application) simultaneously on chassis based extended environmental range servers and commodity datacenter servers
  • Integrated with Regional Controller (Akraino Feature Project) for “zero touch” deployment of REC to edge sites

Orchestration of Open RAN

The O-RAN physical NFs and the O-Cloud in the O-RAN architecture depict a future RAN network comprising many NEs (as VNFs or CNFs) that will be part of a standard O-Cloud deployed on x86 hardware. This hardware will have standard acceleration support as required by each of the RAN NFs, and the O-Cloud is responsible for de-coupling each layer of the hardware from the software. However, such a solution requires not only a standard deployment like Akraino but also complete orchestration for both Day 1 and Day 2, as supported by ICN (Integrated Cloud Native network). Standard orchestration is key, as it will orchestrate not only RAN functions but also other functions like the 5G CN, along with provisioning of E2E network slices and services. As future disaggregated networks split the network into layers, from both a business perspective and a technology viewpoint we need a common set of tools to orchestrate the whole solution, ensuring we can provision any 5G and future business case on the fly, ranging from remote factory, e-health and video streaming to VGS broadcast. The latest Guilin release already considers how ONAP will power Open RAN networks, while enhancements like REC blueprint automation and validation, testing, FCAPS and running ML/AI streams based on radio network information are features coming in the next Honolulu release.

Summary

In summary, O-RAN and related Open RAN solutions offer new architecture options for the Telco industry that will make full disaggregation of networks possible. In addition, their alignment with Cloud and other related SDOs means future networks can all be defined, managed and operated using the same principles. For example, disaggregation means real value coming from deployment of a single transport infrastructure like IPv6 SRv6+, where we can divide the single physical network into many logical networks that can serve both existing and new verticals. Similarly, Akraino/StarlingX and Airship installers will offer the opportunity to deploy the whole stack using the same principles and blueprints. Finally, the use of common orchestration platforms and futuristic network slicing capabilities will open new business opportunities for Telcos that were not possible before.

As we understand it, the overall mobility market is changing, and the coexistence of many access technologies including Wi-Fi means disaggregated, resource-sharing and validated design is the need of the hour. It will not only help us unify the RAN but also enable us to really monetize the data from the RAN, for example through AI and ML processing. It will also benefit the industry as a whole, as fully open interfaces mean we can expand different components independently and validate them using open tools.

This is therefore an important era to start the Open RAN journey and to take the early mover advantage in Disrupting the Networks  

Source: Linux Foundation

References:

  1. https://www.5gtechnologyworld.com/5g-breaks-from-proprietary-systems-embraces-open-source-rans/
  2. ONAP Guilin Release
  3. Thought leadership recipes from Alex Jing
  4. Akraino integrated cloud-native icn blueprint
  5. https://docs.o-ran-sc.org/en/latest/architecture/architecture.html
  6. Core analysis blogs

Enterprise and 5G Software Application packaging using Helm

Always great to start as a programmer

As most prolific developers consider Kubernetes the future platform for application development, obviously against the odds of Project Pacific (https://blogs.vmware.com/vsphere/2019/08/introducing-project-pacific.html), it is certainly worthwhile to get to know a platform that holds the future by learning how best to use it.

An investigation into the Kubernetes platform will reveal that although Kubernetes is a kitchen with a recipe for all sorts of applications in any vertical, things can become very complex as we implement H/A, load balancers and other complex scenarios, each of which requires its own YAML definition and instance creation. In addition, as we apply more and more complex concepts like node affinity and taints, it becomes more difficult to remember parameter definitions and to build configurations. On top of this there are so many tools, both in the community and provided by partner distros, followed by geeks who are always willing to build their own tools, so the question is how to unify them and address the challenges in the most efficient manner.

Can I use a collection of tools like Ansible + Mesos + Vagrant + Helm to use the best of all solve the Infra provisioning and monitoring issues?

Obviously, no one tool can satisfy all needs, but how do we unify the pipeline and packaging, and where do we start? Let us discuss some details to solve these very vital issues of future infrastructure. Most of these tools, like Helm, are available in the community to accelerate development and to find and fix bugs. Users of these tools also share deployment playbooks, manifests, recipes, etc., distributing them via repos like GitHub and build platforms like Jenkins. The community and partners have mostly hardened this knowledge and also share it on secure and trusted repos and libraries like Ansible Galaxy; for more details the reader can refer to https://galaxy.ansible.com/

Source: RedHat

All of this requires a novel approach to managing containerized infrastructure. Helm®, which is a project within CNCF®, is a packaging solution that can address most of the challenges defined above. Just like Kubernetes, it also supports operators, through which vendors and ISVs can publish their software artefacts and packages for onboarding. This is also the way 5G CNFs will be onboarded to the infrastructure through the NFVO (NFV Orchestrator). It is an exciting way to manage applications through easy-to-use charts, template files and easy-to-manage dependencies.

So let us try to understand some key concepts on Helm charts and Helm Operators.

Source: RedHat

Helm Charts:

A Helm chart is a single artefact that packages all objects such as deployments, services, policies, routes, PVs etc. into a single .tgz (compressed tar) file that can be instantiated on the fly. Helm also supports an aggregation concept, which means you can deploy either each microservice individually or a collection of them together through one deployment process. The latter is especially important for Telecom CNFs. A good collection of Helm charts is available at https://github.com/helm/charts, which we can customize and also integrate with a CI pipeline like Jenkins to do all operations on the fly.

When it comes to telecom and 5G CNF’s it is important to understand following terms before understanding contents of the package

Source: K8S and ETSI NFV Community

Source: Kodecloud and Project experience

Chart: A collection of resources which are packaged as one and used to run an application, tool, service etc.

Repo: A collection, like an RPM repository, used to manage and distribute resources as packages. Satellite can be used to integrate both VIM and CIM repos in a 5G world

Release: An instance of a chart running in the cluster. Each time a chart is instantiated, including each incremental change, it is considered a Release. Helm also supports running a canary release in a Telco environment

The latest major version of Helm is 3.0, released in November 2019. It includes a major change, the removal of Tiller (a major security bottleneck), which was a major impediment to using Helm on more secure clusters.

Just as VNFD and NSD follow ETSI® SOL001 and SOL004, which define VNF packages and their structure using TOSCA, in Kubernetes we follow the Helm chart standard, with YAML descriptors and a structure that can be instantiated using helm create <chart-name>. It can then be enriched and customized as needed. The manifests in the package are:

  • values.yaml contains details like IPs and networks
  • the templates (template.yaml) consume the values
  • Chart.yaml is the master file to manage the chart
  • NOTES.txt is a comment file
  • Test.yaml is used to conduct chart testing once deployed
  • requirements.yaml lists the dependencies (in Helm 3 these are declared in Chart.yaml instead)
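To make the chart layout concrete, here is a minimal sketch that scaffolds a bare-bones chart directory in a temporary folder and checks that the core entries are present. It only imitates the shape of what helm create produces; the chart name `upf`, the file contents and both helper functions are illustrative assumptions, not output from Helm itself.

```python
import tempfile
from pathlib import Path
from typing import List

# Entries every chart in this sketch is expected to carry
REQUIRED = ["Chart.yaml", "values.yaml", "templates"]


def scaffold_chart(root: Path, name: str) -> Path:
    """Create a minimal chart directory (hypothetical contents)."""
    chart = root / name
    (chart / "templates").mkdir(parents=True)
    # Chart.yaml: apiVersion v2 is the chart format used by Helm 3
    (chart / "Chart.yaml").write_text(
        f"apiVersion: v2\nname: {name}\nversion: 0.1.0\n"
    )
    # values.yaml: defaults that the templates consume
    (chart / "values.yaml").write_text("replicaCount: 1\n")
    # NOTES.txt: text shown to the user after install
    (chart / "templates" / "NOTES.txt").write_text("Thanks for installing.\n")
    return chart


def missing_entries(chart: Path) -> List[str]:
    """List the required entries that are absent from a chart directory."""
    return [entry for entry in REQUIRED if not (chart / entry).exists()]


with tempfile.TemporaryDirectory() as tmp:
    chart_dir = scaffold_chart(Path(tmp), "upf")
    missing = missing_entries(chart_dir)
    print("missing entries:", missing)
```

A check like this can sit in a CI job ahead of packaging, so that an incomplete CNF chart is rejected before it reaches the repo.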

Happy and ready to apply your own Helm charts? Then try this out: https://hub.helm.sh/charts?q=ericsson . Although Helm charts provide an easy way to manage applications, not all changes can be applied directly, especially for stateful CNFs, which are very relevant to the Telecom use case. In this case we need the Helm Operator, whose first version 1.0 is now GA; let us discuss its key points below. Just as a Kubernetes operator needs to be installed first via CRDs, the Helm Operator behaves in the same manner, with the difference that applications are installed using charts provided by the software developer.

 Helm Operators:

A Helm chart and its packaging can be compared to the functions of a Kubernetes operator, which makes it easy to deploy and manage an application across its life cycle using CRDs (custom resource definitions).

The Helm Operator takes the next step beyond what Kubernetes does by enabling complete GitOps for Helm. It focuses on defining a custom resource for the Helm release itself, thereby making it simple to manage the complete set of artefacts as they are deployed and managed.

As of April 2020, the following major features are already available in Helm Operator 1.0

  • Declaratively installs, upgrades, and deletes Helm releases
  • Pulls charts from any chart source: public or private Helm repositories over HTTP/S, public or private Git repositories over HTTPS or SSH, or any other public or private chart source using one of the available Helm downloader plugins
  • Allows Helm values to be specified in-line in the HelmRelease resource, or in (external) sources, e.g. ConfigMap and Secret resources, or a (local) URL
  • Automated purging on release install failures
  • Automated (optional) rollback on upgrade failures
  • Automated image upgrades using Flux
  • Automated (configurable) chart dependency updates for Helm charts from Git sources on install or upgrade
  • Detection and recovery from Helm storage mutations (e.g. a manual Helm release that was made but conflicts with the declared configuration for the release)
  • Parallel and scalable processing of different HelmRelease resources using workers

Source: http://www.weave.works
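To show what "declarative" means here, the sketch below builds a HelmRelease custom resource as a Python dictionary and prints it as JSON, which kubectl accepts alongside YAML. The field names follow the Helm Operator 1.0 HelmRelease resource as I recall it (helm.fluxcd.io/v1), so verify them against the CRD installed in your cluster. The release name, namespace, repository URL and values are placeholders.

```python
import json
from typing import Any, Dict


def helm_release(name: str, namespace: str, repo_url: str, chart: str,
                 version: str, values: Dict[str, Any]) -> Dict[str, Any]:
    """Build a HelmRelease manifest (field names assumed from Helm Operator 1.0)."""
    return {
        "apiVersion": "helm.fluxcd.io/v1",
        "kind": "HelmRelease",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "releaseName": name,
            # Chart pulled from a Helm repository over HTTP/S
            "chart": {"repository": repo_url, "name": chart, "version": version},
            # Optional automated rollback on upgrade failures
            "rollback": {"enable": True},
            # Values specified in-line in the HelmRelease resource
            "values": values,
        },
    }


manifest = helm_release(
    name="upf",
    namespace="cnf",
    repo_url="https://charts.example.com",  # placeholder repository
    chart="upf",
    version="1.0.0",
    values={"replicaCount": 2},
)
print(json.dumps(manifest, indent=2))
```

Committing the resulting manifest to Git and letting the operator reconcile it is what turns a chart install into a GitOps workflow.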

Helm Operator can also work with other Kubernetes operators and address any dependency constraints; in fact, all of those can be expressed as part of the chart itself. This is certainly needed in CNFs and Telco use cases, where there are a lot of dependencies between API versions and cluster components for all rolling updates, and each of these requires testing and validation. Traditional Helm obviously cannot address this, and it is almost impossible for a user to address all such changes in an ever-changing and complex world of network meshes. Helm Operator ensures these requirements are fulfilled within the GitOps framework.

Helm basic commands:

Below is a good jump start to some of the basic helm commands.

  • helm repo add

Command to add a Helm chart repository

  • helm create chart-name

Command to create a Helm chart, a directory with the basic files

  • helm install mychart ./mychart --dry-run --debug

Dry run of the install that shows debug output. Helm 3 expects a release name before the chart path

  • helm package ./mychart

Prepares the .tgz package, from which a user can install the application

  • helm get all upf -n cnf

Retrieves the details of an application deployed via Helm in a given namespace. Release and namespace names must be lowercase

  • helm --help

Want to know it all? Just try it out
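When these commands are driven from a pipeline rather than typed by hand, it helps to build them as argument lists. The sketch below does only that and prints the resulting command lines; it does not call Helm. The helper names, the release name `upf` and the namespace `cnf` are illustrative assumptions.

```python
import shlex
from typing import List


def helm_dry_run(release: str, chart_path: str) -> List[str]:
    """Argument list for a dry-run install with debug output."""
    return ["helm", "install", release, chart_path, "--dry-run", "--debug"]


def helm_package(chart_path: str) -> List[str]:
    """Argument list for packaging a chart directory into a .tgz."""
    return ["helm", "package", chart_path]


def helm_get_all(release: str, namespace: str) -> List[str]:
    """Argument list for retrieving everything about a deployed release."""
    return ["helm", "get", "all", release, "-n", namespace]


commands = [
    helm_dry_run("upf", "./mychart"),
    helm_package("./mychart"),
    helm_get_all("upf", "cnf"),
]
for argv in commands:
    # shlex.join renders the list as a copy-pasteable shell command
    print(shlex.join(argv))
```

To actually execute one of them on a machine with Helm installed, pass the list to subprocess.run(argv, check=True); keeping the arguments as a list avoids shell quoting problems with release names or paths.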

Conclusion:

Although I have explained Helm and Kubernetes in a way that might suggest a Helm chart is the replacement for an Operator, that is not the case. In fact Helm is mainly aimed at deploying and managing Day 1 tasks, while along the LCM of an application you still rely on CRDs and Operators. One caveat that I do not like is that each time there is a new CRD we have to install and manage it. This will definitely change over time, as the Helm Operator aims to resolve most Day 2 issues, and that is why I encourage you to get involved in the Kubernetes SIG community.

Finally, as we standardize the dev pipeline for Telcos as well, which is still largely invisible to us, it will enable us to build hybrid cloud environments that will solve many fundamental architecture and business challenges. As an example, in the COVID-19 scenario many businesses faced the challenge of expanding their networks to cater for increased load. If Telcos had already figured out this pipeline, it would have been both economical and responsible to share load between on-prem and public cloud to address the need of the hour. This is why the journey to hybrid cloud and to software package standardization and delivery is so vital for both the growth and sustainability of the Telco industry and for national growth.

References:

ETSI NFV IFA29

O'Reilly Kubernetes book sponsored by Red Hat

https://medium.com/

https://www.weave.works/blog/introducing-helm-operator-1-0

https://www.digitalocean.com/

The comments in this paper do not reflect any views of my employer and are solely my analysis based on my individual participation in industry, partners and business at large. I believe sharing this information with the larger community is the only way to share, improve and grow. The author can be reached at snasrullah@swedtel.com