Improving Telco’s Enterprise reach through Slicing and RAN control

Growing business in 5G era largely depends on ecosystem enablement and on idea that Telco’s can build a future Infrastructure that can deliver business outcomes for not only traditional Telco customers but also for broader verticals may it be Manufacturing ,Mining, Retail , Finance , Public safety , Tourism or eventually anything .

This means “Programmable” and “Automated” infrastructure is the base to achieve any such business outcome . Applying this to Telco’s 5G and Transformation journey will mean both “Network Slicing” and the “Private Networks” and although i totally agree with idea that both will co-exist and proliferate but to be fair it is a fact that although Network slicing has delivered outcomes in Labs and Demos it has still number of challenges when apply to practice .

Recently i have heard many views from many reputable names including my friend Dean Bubley and Karim so i thought to share my views on this topic highlighting some work we have been doing with our partners and customers in APAC as well in the GSMA and to allude to some improvements we have achieved in last year since i shared my views with industry on Network Slicing and its Delivery .

Today Network slicing has been live in a number of customers including Singtel that has achieved substantial outcomes with Slicing however the bigger challenge still remain un-answered

How will Network slicing address RAN resources ?

How will Network Slicing can help to monetize low hanging fruits of Edge together with Telco domain slicing

In 3GPP Release17 we have been doing some exciting progress on later with a new architecture and API exposure for co-deployment of Edge with RAN but again prior to this we need to extend the Slicing towards the Access Networks both from Technology and Business Architecture Perspective and this is what i will share in this paper

Experience Learnt from 5G Networks rollout

As many Telco’s in 2021+ accelerated 5G rollout and built 5G SA Core Networks one think proved more than before that limitation on

  • RAN resources are always scarce
  • AI need to be enabled to intelligently modify slicing in real time
  • Spectrum and RAN layers will be a top pressings time for Telco’s to deliver value
  • RAN resource isolation must follow performance: cost baseline
  • How to handle resources in peak time and pre-empt some over others is vital
  • Regulatory and GDPR is vital to achieve anything big in this domain

Orchestration must precede Network Slicing

From Above experience we can infer that it is really not about Network slicing but rather “Open” , “Control” and “API” to enable End to end Network slicing LCM and Orchestration all the way from UE to RAN to Edge to Core to Cloud

  • Dynamic control of resources with Telco level visibility in Key
  • RAN automation is first step to be done before Slicing change RAN resources
  • Cloud operations model is vital to support Network slicing because although there are many business verticals the Telco’s really have to build an efficient and Multi tenant operational model to win it

Cloud Operations Model that is secure and Multi tenant must be enabled across all Telecom Infrastructure

RAN SLA’s for vertical industry

The notion of Network slicing still lies in selling a SLA vs Selling a Network .

First of all RAN resources are always limited and secondly each vertical industry has its own traffic profile and trajectory which can never be planned using old Telecom simulation tools it means dynamic learning and resource adjustment is key . This all alludes to the fact that changing network while ensuring network KQI remain intact is something that require Full visibility and programmable Control

It leads us to consider following architecture first before Slicing is full enabled for the RAN

  • Slice LCM must be supported by automatic Infrastructure that is elastic and Telco grade at the same time
  • Scale out architecture must be enabled in RAN
  • RF and Spectrum resource scheduling is the most expensive and intricate resources for services and we must enable their dynamic control

Intelligent Networking ML/AI must be enabled first

Automation can deliver a myriad of outcomes including better control , real time changes , optimization , compliance and FCAPS for each tenant however it is not sufficient

Intent driven networks that uses power of data , ML and AI to orchestrate and adapt network is a capability that should be enabled on a network scale before network slicing can deliver a business outcome

It leads us to consider following architecture first before Slicing is full enabled for the RAN

  • Slice LCM must be supported by automatic Infrastructure that is elastic and Telco grade at the same time
  • Scale out architecture must be enabled in RAN
  • RF and Spectrum resource scheduling is the most expensive and intricate resources for services and we must enable their dynamic control

Components of a RAN Slice

Although the Core Slicing capability still exists on OSS and SMO layers that are outside the RAN still the real power of Slicing will come as we address the RT capability of RAN slicing which enables us to deliver following for a business tenant

  • RRM
  • Connection management
  • MM
  • Spectrum layers

All of this must be available to package as a NSSF functional instance as alluded below

Partnerships and Ecosystem

According to the latest GSMA report one use case enablement for any vertical will require at least 7 Players to work together , so RAN slicing or in other words Slicing Business outcomes is not a matter of one body or business to solve . Today to incrementally deliver the business outcomes following are key organizations collaborating to adress those challenges

  • ETSI NFV
  • GSMA
  • MEF
  • IETF
  • O-RAN and TIP
  • ONAP
  • 5GAA , ZVEI etc

We are also taking an aggregation approach where we are summing all the knowledge from these bodies and deliver as a outcome for our customers . you can reach out for more details .

How APAC Telco’s are applying Data, Cloud and Automation to define Future Mode of Operations

As CSPs continue to evolve to be a digital player they are facing some new challenges like size and traffic requirements are increasing at an exponential pace, the networks which were previously only serving telco workloads are now required to be open for a range of business, industrial and services verticals.  These factors necessitate the CSPs to revamp their operations model that is digital, automated, efficient and above all services driven. Similarly, the future operations should support innovation rather than relying on offerings from existing vendor operations models, tool and capabilities.

As CSPs will require to operate and manage both the legacy and new digital platforms during the migrations phase hence it is also imperative that operations have a clear transition strategy and processes that can meet both PNF and VNF service requirements with optimum synergy where possible.

In work done by our team with our customers specifically in APAC , future network should address the following challenges for its operations transformations.

  • Fault Management: Fault management in the digital era is more complex as there are no dedicated infrastructure for the applications. The question therefore arises how to demarcate the fault and corelate cross layer faults to improve O&M troubleshooting of ICT services.
  • Service Assurance: The future operations model requires being digital in nature with minimum manual intervention, fully aligned with ZTM (Zero Touch Management) and data driven using principles of closed loop feedback control.
  • Competency: To match operation requirements of future digital networks the skills of engineers and designers will play a pivotal role in defining and evolving to future networks. The new roles will require network engineers to be more technologist rather than technicians
  • IT Technology: IT technology including skills in data centers, cloud and shared resources will be vital to operate the network. Operation teams need to understand impacts of scaling elasticity and network healing on operational services

Apply ODA (Open Digital Architecture) for Future Operations Framework (FMO)

TM Forum ODA A.K.A Open Digital Architecture is a perfect place to start but since it is just an architecture and can lead to different implementation and application architecture so below i will try to share how in real brown field networks it is being applied . I will cover all modules except for AI and intelligent management which i shall be discussed in a separate paper .

Lack of automation in legacy telco networks is an important pain point that needs to be addressed promptly in the future networks. It will not only enable CSPs to avoid the toil of repetitive tasks but also allow them to reduce risks of man-made mistakes.

In order to address the challenges highlighted above it is vital to  develop an agile operations models that improves customer experience , optimize CAPEX , AI operations and Business process transformation

Such a strategic vision will be built on an agile operations model that can fulfill the following:

  • Efficiency and Intelligent Operation:  Telecom efficiency is based on data driven architectures using AI and providing actionable information and automation, Self-Healing Network capability and automation of network and as follows
    •  Task Automation & Established foundation
    •  Proactive Operation & Advanced Operation
    •  Machine managed & intelligent Operation.
  • Service Assurance: Building a service assurance framework to achieve an automated Network Surveillance, Service Quality Monitoring, Fault Management, Preventive Management and Performance Management to ensure close loop feedback control for delivery of zero touch incident handling.
  • Operations Support: Building a support framework to achieve automated operation acceptance, change & configuration management.

Phased approach to achieve Operational transformation

Based on the field experience we achieved with our partners and customers through Telecom transformation we can summarize the learning as follows

  • People transformation: Transforming teams and workforce that matches the DevOps concept to streamline organization and hence to deliver services in an agile and efficient manner. This is vital because 5G , Cloud and DevOps is a journey of experience not deployment of solutions , start quickly to embark the digital journey
  • Business Process transformation: Working together with its partners for unification, simplification and digitization of end-to-end processes. The new process will enable Telco’s  to quickly adapt the network to offer new products and to reduce time for troubleshooting.
  • Infrastructure transformation: Running services on digital platforms and cloud, matching a clear vision to swap the legacy infrastructure.

If PNF to VNF/CNF migration is vital the Hybrid Network management is critical

  • Automation and tools: Operations automation using tools like  workforce management, ticket management etc is vital but not support vision of full automation . The services migration to cloud will enable automated delivery of services across the whole life cycle. Programming teams should   join operations to start a journey where the network can be managed through power of software like Python, JAVA, GOLANG and YANG models. It will also enable test automation, a vision which will enable operations teams to validate any change before applying it to live network.

Having said this i hope it shall serve as a high level guide for architects adressing operational transformation , as we can see AI and Intelligent Managmeent is vital piece of it and i shall write on this soon .

Building Smart and Green Telecom Infrastructure using AI and Data

source: Total Telecom green summit

During last year industry has witnessed Telco’s increased spend and maturity in Cloud and Automation Platforms . During Pandemic it is proven that Digital and Cloud is the answer our customers require to design , build and Operate Future Telecom Networks .

The Second key Pillar forcing Telecom industry towards Autonomous networks is to deliver business outcomes while doing business responsibly .

Getting Business outcomes and doing a sustainable business that supports Green Vision has been a not related discussion in Telecom Industry

But now infusion of Data and Cloud is really enabling it , it is expected that we as industry can cutdown at least 50% of Power emissions in coming decade but how it will become possible . According to Pareto’s law the last 20% will be most difficult .

This is where my team main focus has been to build robust AI and Automation use cases that are intelligent enough and that solves broader issues . Today the biggest focus for ML/AI for Telco’s that can really put them lead such outcomes are

  1. Smart Capacity management
  2. O&M of networks that reduces emissions and improves availability
  3. Service assurance based on data

The biggest Challenge in Transformation is Fragmentation

The biggest bottleneck is making such outcomes is related to data . Intricately “Data” is both the problem and the Solution because of so many sources of truth and different ingestion mechanisms . Do check details on #Dell Streaming data platforms and how we are solving this problem

https://www.delltechnologies.com/en-au/storage/streaming-data-platform.htm

Today under the umbrella of Anuket , 3GPP , TMF and ITU are all collaborating to come a validated and composite solution to deliver those use cases . So in a nutshell it is vital to build a holistic and unified view to deliver data driven AI use cases

Scope and Scale of Intelligent automation

The biggest bottleneck is coming from the fact that in real world Telco Apps can never be fully cloud native , at some level both the state and resiliency requirements and App requirements has to be kept and to come with intelligent work load driven decisions . The decade long journey of Telecom Transformation has revealed that just building everything as a code and expecting it to work and Telco’s can rollback their NOC sizes simply not works .

This is where intelligence from layers above the Orchestration and SDN will be of help like google does in the Internet era .

The second biggest issue is in the Scalable Telco solutions itself , it is proven that Telco’s face unique challenges as they move from hundred’s to thousand of nodes . So imagine running AI for heterogenous environments each coming with different outcomes can never deliver power and scale Telco’s need in the new era .

Telco grade AIOPS models

It is true that with 5G and Business digital transformation the industry really want to ramp up to build an improved user experience and unified model to expand portfolio towards vertical markets as well , this is only possible if we can have a coordinated system , workflow management and data sharing and exposure with strict TSR security measures . Similarly this model should cover full LCM including FCAPS model .

Building Intelligent Telco’s

Although using AI and ML is an exciting ambition for a Telco still the bottom line is how to build these platforms on top of NFVI and Existing Orchestration and Automation frameworks . In other words really business case to build an intelligent networks starts with using Data and ML to automate the entire network . Although in this aspect the scope can extend not just to service domain but also to business domains i.e automate business process including event correlation , anomaly and RCA

Building a Unified AI Platform

Although this intention or target is clear however in context of networks this is complex as we need to solve challenge of data security , regulation as well as what it really means to do the certification of an AI platform because focus should be given that allow this layer to be built from solution from many vendors so a loose coupling with more focus on Network service and AI algorithms is a key to build this platform

Instead of focusing on network element certification focus of AI platform is service level compatibility , data models and AI algorithms

However lack of unified standard specially on trusted data normalization , sharing and exposure is certainly forcing operators to adopt a Be-Spoke solutions to build AI platforms and that itself is a big impediment to wide scale adoption of AI and ML in the Networks

To move forward more close collaboration between different standard bodies and governance by more Telco centric organization like TMF is the answer with immediate focus to be given to Data standardization , labs integration and to enable shared data sets and algorithms to evolve and support wide deployments of ML and AI in Telecom Networks

Latest Industry progress and standardization

Although this is the early time of AI platforms standardization still we need to aggregate the progress between different bodies lest we can only expect the plethora of silo solutions each with a different specifications

  • ONAP as baseline of automation platform has components like DCAE and AI engine that makes sense to make it the defacto baseline standard
  • Anuket is the Cloud Infrastructure reference and it has recently launched a new project “Thoth” to look in to AI network standardization
  • ETSI ZSM is E2E automation platform across full LCM of a Telecom network and certainly an important direction
  • ETSI ENI or enhanced network intelligence is another body that closely defines AI specifications in the context of Telecom
  • TMF as a broader Telecom body is defining architectures including ODA and AIOPS that really breaks down on how a Telco can take a phased approach to build these platforms

Above all early involvement and support from Telecom operators and partners is very important to realize this goal . I hope in this year we will see more success and standardization on these initiatives so lets work together and stay tuned .

How to Migrate applications to the cloud

Building #Cloud and #Applications on #Cloud is only half a journey , biggest issues surface as we traverse the last 10% which is #Migration and #Handoff to #Operations ?

Questions like #Coexistence ,#NewOnM models and #Processes reengineering required to address following challenges:
a. Theoretically zero downtime by building E2E Architecture e.g UPF pool
b. Delivery pipeline for migration through unified tools solving touching every NE one by one an improve TTM
c. Service consistency automatic verification between Cloud and Legacy
d. 5 9’s reliability and TSR gold standard security

Customers expect #partners who offer #Operational tools and services optimized for
1) E2E whole solution support ownership in MVI
2) SPOC support services
3) Tools and Platform services specially on PaaS and NFVO to co-develop
4) Cloud assurance services specially for #business readiness , #Transition , #TaaS ,#Solution emergency support and CSR

And Partner must work with #Telco to #remodel processes starting with
1) Incident handling for L1
2) Config management supporting multi layers
3) Change management with focus on scaling , SDN and Policy enforcement
4) Release management supporting #devops viz staging to prod

#Cloud#Migration#Operations#Services#Applications#Networking#5G#Edge

Why Cloud and 5G CNF architects must analyze docker depreciation after kubernetes 1.20

Kubernetes is deprecating Docker as a CRI after v1.20 which makes possible future all applications coverage over a single image standard which is OCI , consider the fact due to 5G Telco CNF secondary networking and running

  • Service aware protocols like SIP
  • connection aware like SCTP Multi homing most
  • Ensuring regulatory requirement specially on traffic separation
  • Load balancing
  • Nw isolation
  • Network acceleration

#CNF suppliers today already prefer OCI over #docker #ship . In the long road obviously it will support portability of all applications across cloud platforms .However a negative side it will impact our tool chains specially in cases where we use docker inside docker like tools such as #kaniko, #img, and most importantly  #buildah

If you an Architect who want to solve this challenge or a developer who is little naggy about applications #LCM can kindly refer to community blog post below

https://kubernetes.io/blog/2020/12/08/kubernetes-1-20-release-announcement/

here is the detailed writeup from community for quick reference .

Kubernetes 1.20: The Raddest Release

Tuesday, December 08, 2020

Authors: Kubernetes 1.20 Release Team

We’re pleased to announce the release of Kubernetes 1.20, our third and final release of 2020! This release consists of 42 enhancements: 11 enhancements have graduated to stable, 15 enhancements are moving to beta, and 16 enhancements are entering alpha.

The 1.20 release cycle returned to its normal cadence of 11 weeks following the previous extended release cycle. This is one of the most feature dense releases in a while: the Kubernetes innovation cycle is still trending upward. This release has more alpha than stable enhancements, showing that there is still much to explore in the cloud native ecosystem.

Major Themes

Volume Snapshot Operations Goes Stable

This feature provides a standard way to trigger volume snapshot operations and allows users to incorporate snapshot operations in a portable manner on any Kubernetes environment and supported storage providers.

Additionally, these Kubernetes snapshot primitives act as basic building blocks that unlock the ability to develop advanced, enterprise-grade, storage administration features for Kubernetes, including application or cluster level backup solutions.

Note that snapshot support requires Kubernetes distributors to bundle the Snapshot controller, Snapshot CRDs, and validation webhook. A CSI driver supporting the snapshot functionality must also be deployed on the cluster.

Kubectl Debug Graduates to Beta

The kubectl alpha debug features graduates to beta in 1.20, becoming kubectl debug. The feature provides support for common debugging workflows directly from kubectl. Troubleshooting scenarios supported in this release of kubectl include:

  • Troubleshoot workloads that crash on startup by creating a copy of the pod that uses a different container image or command.
  • Troubleshoot distroless containers by adding a new container with debugging tools, either in a new copy of the pod or using an ephemeral container. (Ephemeral containers are an alpha feature that are not enabled by default.)
  • Troubleshoot on a node by creating a container running in the host namespaces and with access to the host’s filesystem.

Note that as a new built-in command, kubectl debug takes priority over any kubectl plugin named “debug”. You must rename the affected plugin.

Invocations using kubectl alpha debug are now deprecated and will be removed in a subsequent release. Update your scripts to use kubectl debug. For more information about kubectl debug, see Debugging Running Pods.

Beta: API Priority and Fairness

Introduced in 1.18, Kubernetes 1.20 now enables API Priority and Fairness (APF) by default. This allows kube-apiserver to categorize incoming requests by priority levels.

Alpha with updates: IPV4/IPV6

The IPv4/IPv6 dual stack has been reimplemented to support dual stack services based on user and community feedback. This allows both IPv4 and IPv6 service cluster IP addresses to be assigned to a single service, and also enables a service to be transitioned from single to dual IP stack and vice versa.

GA: Process PID Limiting for Stability

Process IDs (pids) are a fundamental resource on Linux hosts. It is trivial to hit the task limit without hitting any other resource limits and cause instability to a host machine.

Administrators require mechanisms to ensure that user pods cannot induce pid exhaustion that prevents host daemons (runtime, kubelet, etc) from running. In addition, it is important to ensure that pids are limited among pods in order to ensure they have limited impact to other workloads on the node. After being enabled-by-default for a year, SIG Node graduates PID Limits to GA on both SupportNodePidsLimit (node-to-pod PID isolation) and SupportPodPidsLimit (ability to limit PIDs per pod).

Alpha: Graceful node shutdown

Users and cluster administrators expect that pods will adhere to expected pod lifecycle including pod termination. Currently, when a node shuts down, pods do not follow the expected pod termination lifecycle and are not terminated gracefully which can cause issues for some workloads. The GracefulNodeShutdown feature is now in Alpha. GracefulNodeShutdown makes the kubelet aware of node system shutdowns, enabling graceful termination of pods during a system shutdown.

Major Changes

Dockershim Deprecation

Dockershim, the container runtime interface (CRI) shim for Docker is being deprecated. Support for Docker is deprecated and will be removed in a future release. Docker-produced images will continue to work in your cluster with all CRI compliant runtimes as Docker images follow the Open Container Initiative (OCI) image specification. The Kubernetes community has written a detailed blog post about deprecation with a dedicated FAQ page for it.

Exec Probe Timeout Handling

A longstanding bug regarding exec probe timeouts that may impact existing pod definitions has been fixed. Prior to this fix, the field timeoutSeconds was not respected for exec probes. Instead, probes would run indefinitely, even past their configured deadline, until a result was returned. With this change, the default value of 1 second will be applied if a value is not specified and existing pod definitions may no longer be sufficient if a probe takes longer than one second. A feature gate, called ExecProbeTimeout, has been added with this fix that enables cluster operators to revert to the previous behavior, but this will be locked and removed in subsequent releases. In order to revert to the previous behavior, cluster operators should set this feature gate to false.

Please review the updated documentation regarding configuring probes for more details.

Other Updates

Graduated to Stable

Notable Feature Updates

Release notes

You can check out the full details of the 1.20 release in the release notes.

Availability of release

Kubernetes 1.20 is available for download on GitHub. There are some great resources out there for getting started with Kubernetes. You can check out some interactive tutorials on the main Kubernetes site, or run a local cluster on your machine using Docker containers with kind. If you’d like to try building a cluster from scratch, check out the Kubernetes the Hard Way tutorial by Kelsey Hightower.

Release Team

This release was made possible by a very dedicated group of individuals, who came together as a team in the midst of a lot of things happening out in the world. A huge thank you to the release lead Jeremy Rickard, and to everyone else on the release team for supporting each other, and working so hard to deliver the 1.20 release for the community.

Release Logo

Kubernetes 1.20 Release Logo

raddestadjective, Slang. excellent; wonderful; cool:

The Kubernetes 1.20 Release has been the raddest release yet.

2020 has been a challenging year for many of us, but Kubernetes contributors have delivered a record-breaking number of enhancements in this release. That is a great accomplishment, so the release lead wanted to end the year with a little bit of levity and pay homage to Kubernetes 1.14 – Caturnetes with a “rad” cat named Humphrey.

Humphrey is the release lead’s cat and has a permanent blepRad was pretty common slang in the 1990s in the United States, and so were laser backgrounds. Humphrey in a 1990s style school picture felt like a fun way to end the year. Hopefully, Humphrey and his blep bring you a little joy at the end of 2020!

The release logo was created by Henry Hsu – @robotdancebattle.

User Highlights

Project Velocity

The CNCF K8s DevStats project aggregates a number of interesting data points related to the velocity of Kubernetes and various sub-projects. This includes everything from individual contributions to the number of companies that are contributing, and is a neat illustration of the depth and breadth of effort that goes into evolving this ecosystem.

In the v1.20 release cycle, which ran for 11 weeks (September 25 to December 9), we saw contributions from 967 companies and 1335 individuals (44 of whom made their first Kubernetes contribution) from 26 countries.

Ecosystem Updates

  • KubeCon North America just wrapped up three weeks ago, the second such event to be virtual! All talks are now available to all on-demand for anyone still needing to catch up!
  • In June, the Kubernetes community formed a new working group as a direct response to the Black Lives Matter protests occurring across America. WG Naming’s goal is to remove harmful and unclear language in the Kubernetes project as completely as possible and to do so in a way that is portable to other CNCF projects. A great introductory talk on this important work and how it is conducted was given at KubeCon 2020 North America, and the initial impact of this labor can actually be seen in the v1.20 release.
  • Previously announced this summer, The Certified Kubernetes Security Specialist (CKS) Certification was released during Kubecon NA for immediate scheduling! Following the model of CKA and CKAD, the CKS is a performance-based exam, focused on security-themed competencies and domains. This exam is targeted at current CKA holders, particularly those who want to round out their baseline knowledge in securing cloud workloads (which is all of us, right?).

Event Updates

KubeCon + CloudNativeCon Europe 2021 will take place May 4 – 7, 2021! Registration will open on January 11. You can find more information about the conference here. Remember that the CFP closes on Sunday, December 13, 11:59pm PST!

Upcoming release webinar

Stay tuned for the upcoming release webinar happening this January.

Get Involved

If you’re interested in contributing to the Kubernetes community, Special Interest Groups (SIGs) are a great starting point. Many of them may align with your interests! If there are things you’d like to share with the community, you can join the weekly community meeting, or use any of the following channels:

5G Site Solutions Disaggregation using Open RAN

According to Latest Market insights the RAN innovation for Telecom Lags behind others initiative by 7years which means call for more innovative and Disruptive delivery models for the Site solutions specially for next Wave of 5G Solutions .

However to reach the goal of fully distribute and Open RAN there needs to build a pragmatic view of brown fields and finding the Sweet Spot for its introduction and wide adoption .

Here are my latest thoughts on this and how Telecom Operators should adopt it . There is a still a time for industry wide adoption of Open RAN but as yo will find time to act is now .

What you will

What you will learn

  • Building Delivery Models for Open RAN in a brownfield
  • Understand what,when and how of Open RAN
  • What is Open RAN and its relation with 5G
  • Current Industry solutions
  • Define phases of Open RAN delivery
  • Present and Next Steps 5. Architecture and State of Play

Security Framework to Secure Online Education and Remote Workers

The Recent report by Cisco reveals 250% increase in security attacks since Covid-19 Sets in . It required a new paradigm of how to secure our online presence

A part from increasing attacks the factors like Government encourage to not using VPN and presence of public Wifi make the whole story more horrible .

Cisco Latest Umbrella Offering is the answer to these challenges with subscription SaaS Model and options of Delivery whether onprem or on the Cloud it is best suited for both home and Enterprise customers .

Logically Umbrella only requires DNS re-routing

Finally the Provision of policy manager to manage and subscribe to policies like Parental control and URL filtering for business and personal use is something important both from security and improve efficiency on line .

SaaS offerings for Security is the answer and for sure Cisco Umbrella is a good solution to adress that .For Details please refer to umbrealla.cisco.com

Using Cloud and AI to Differentiate your 5G Investment

Source: Disney

In a recent Webinar about how to build a successful 5G networks a question that took my mind was .

“How successful we can be if we address a fundamentally new Problem using a new Technology if we still use old principles to build our Telecom Networks and with out disrupting the supply chains”

I think the answer for these type of questions in the context of 5G fundamentally will depends on following two key initiatives.

  1. How to use Radio spectrum to gain strategic advantage over competitors
  2. How to use Cloud to gain advantage for 5G

The Radio Spectrum is a complex topic primarily driven by many factors like regulatory and existing use of Spectrum making real 5G a slight different than what is really possible with Spectrum of today . This alone is not enough as Smart cells vs Wifi6 will be again something that will really depend on Spectrum use of 5G .These details i will leave it for now for future discussion and want to focus on Cloud and how really it will make your 5G successful.

During our recent work with in ETSI NFV Release4 Sol WG , GSMA and LFN CNTT we have discussed and agreed on a number of ways really Cloud can support you to differentiate your 5G network . Knowing this can be a real game changer for Opcos who are investing in 5G and Future Networks

Homogenity

A homogeneous Infrastructure Platform on 5G that can be used by all applications like traditional 5G CNF’s , MEC , Developer applications and any legacy IT /OTT applications that are required to be offered to users . One such example is OpenShift or VMware Edge and Last mile solutions using technologies like CNV or VCF7.0/NSXT3.0 that will build the edge clouds in an automated manners and enable day 2 through standard tools whether use VM or containers or BM’s as a baseline architecture

A uniform IPI that can be deployed using standard Red Fish solutions such as the one from HPE really will make is possible to build 5G using the Clone technology as used in most automotive industry today and that really enabled them to produce with minimum toil

Scalability

Scalability in the last mile is the most important criteria for 5G Success . For example a compute solution that can scale and can provide power to process really all sort of workloads at the Edge is certainly a make or break for 5G . When it comes to Data one such example is storage and Disk , with solutions like RedHat Ceph3.0 that supports compression from Q3 2020 using its blue store offering and can integrate CephFS with NFS support makes the real convergence possible .

Convergence vs Automation

IT SRE and DevOps has gained lot of traction recently and this is not without a reason . It has certainly reduced the CFO bills and that is why the Telco’s want to achieve the same . However the requirements of workloads are really unique and that makes us to understand that real automation with out standard modeling is never possible .

On the Cloud side we can make use of TOSCA models together with solutions like automation hub together with secure catalog and registry means we can do both modeling for varying workload requirements and to automate it in the same fashion . Further we can do some advanced testing like the one we have been doing in PyATS

Registries and Repositories

The concept of 5G factory that we have been rigorously trying to achieve in Middle East Telco projects are really made possible using secure registries like Quay for containers , Dockerhub and its integration with Jenkins and CI/CD tools for Telco.

It is no surprise if i tell you these are most important differentiators as we introduce public clouds for 5G

Operators

The programmability of Immutable infrastructure is the biblical principle for 5G Networks . Both Service Mesh , NSM and Server less are deployed as operators which a practically CNI programs that makes your infra follow software YAML instead of tight and coupled instructions .Further to that the Operator supports full automation of both day0 and day2 Infrastructure tasks .

For K8S it is currently supported while for VM’s it will be available fully in Dec 2020

Openshift service mesh for 5G CP CNF’s is possible today with

  • Istio
  • Grafana
  • Prometheus
  • Kiali
  • Jaeger

Further to that today we faced a number of issues in Docker to Telco and use of CRI-O and PodMan will certainly support to advance the 5G .

“Podman is more light weight compared to CRI-O so you should expect it better performing on 5G Edge compared to PoDman .

5G Integration

Redhat Fuse online is one of solutions which abstracts Infrastructure and make it possible to integrate developer , integrator and tester using one tool . Except of container it also standardized your VM’s . E.g VM in Openshift running FTP service and that make it possible to run on native containers itself .Fuse Online provides a data mapper to help you do this. In a flow, at each point where you need to map data fields, add a data mapper step. Details for mapping etc

Red Hat® Fuse is a distributed integration platform with standalone, cloud, and iPaaS deployment options. Using Fuse, integration experts, application developers, and business users can independently develop connected solutions in the environment of their choice. This unified platform lets users collaborate, access self-service capabilities, and enforce governance.

An SDK is definitely helpful for 5G platform specially when it comes to open your networks for the developer who need .NET or JAVA . Quarkus from RedHat is a Kubernetes-Native full-stack Java framework aimed to optimize work with Java virtual machines.

Quarkus provides tools for Quarkus applications developers, helping them reduce the size of Java application and container image footprint, eliminate programming baggage, and reduce the amount of memory required.

Advanced Cluster Management

With huge number of 5G sites and future scnerio of site sharing between operators . It will be a real need to deploy Apps and manage them in a hybrid Cloud scnerio and nothing explains it better than burr sutter demo at the RedHat summit . A cool video from RedHat team is available there if you want to learn it more

In a summary you can mange

  • 5K+ poD’s
  • Create clusters in hybrid cloud like AWS,GCP,Azure, Bare metal and On prem
  • Policy management
  • Secure deployment by validating YAML and images using Quay/clair sorted by Labels
  • Possibility for developer to create and deploy policy using GUI

Above all RHACM makes is possible to measure SLA of Clusters and Optimize workloads e.g shift to other clusters in an automated manner .Certainly a Cool thing for 5G to serve heavy lift and Content driven applications

Heavy Lifting of Workloads

The proponents of silo vendor solutions often tell us that 5G Base band processing and e-CPRI heavy lifting with parallel processing will make X-86 a non practical choice to adopt classical cloud way .

However the latest Intel atomic series with FPGA’s and NVIDIA GPU’s means we can not only solve the Radio issues such as the ones we are trying to solve in Open-RAN but will enable to introduce latest technologies like AI and ML in 5G era networks . Those who are more interested in this domain can refer to latest work in ITU here

For ML/AI use cases in 5G there are many made possible in both Telco and vertical industry like Automobiles, warehouse monitoring etc today using GPU operator , Topology manager like shows visibility in to GPU ,NIC,BW,Performance etc.

Open Policy Pipeline can optimize the ML model itself using analytics functions of the Cloud

When it comes to Cloud value to data scientist in 5G using platforms like OCP or HPE Blue Data as follows

  • Anaconda tool sets for programming
  • Jupyter notebooks
  • CUDA and other similar libraries
  • Report on both Log and Policy compliance
  • Tekton Pipeline in OCP for CI/CD of ML/AI use cases
  • Models are made in Jupyter by scientists while it is triggered in the Tektron pipeline

Finally using OCP Open Model Manager we can Register, deploy and monitor open source models in one central environment, uniting data scientists and IT/DevOps.

Summary

The most important takeaway is that if we have to take full advantage from 5G we not only need to follow 3GPP and traditional Telecom SQI’s but also those advantages offered by Cloud . This is only way to not only manage a TCO attractive 5G but also will enable to introduce both Niche players and new services that will be required to build and drive a Post COVID-19 world economy .

Enterprise and 5G Software Application packaging using Helm

Enterprise and 5G Software Application packaging using Helm

Always great to start as a programmer

1

As most prolific developers consider Kubernetes as the future platform for application development , obviously against odds of Project Pacific https://blogs.vmware.com/vsphere/2019/08/introducing-project-pacific.html) . It is certainly worthy to know a platform that holds the future by learning how to best use it .

An investigation in to Kubernetes platform will reveal that although Kubernetes as platform is a kitchen with Recipe for all sort of applications in any vertical  however things can become very complex as we implement H/A , Load balancers  and other complex scnerio each of which require its own YAML definition and instance creation. In addition, as we apply more and more complex concepts like node affinity and taints it becomes more difficult to remember parameter definitions and to build configurations. Then in addition to this there are so many tools both in community and provided by partner distros followed by Geeks who are always willing to build their own tools so question is how to unify and address the challenges in the most efficient manner.

Can I use a collection of tools like Ansible + Mesos + Vagrant + Helm to use the best of all solve the Infra provisioning and monitoring issues?

 Obviously, no one tool can satisfy all but how to unify the pipeline and packaging and where to start, let us discuss some details to solve these very vital issues of future infrastructure. Most of these tools like HELM are available in community to accelerate development and find and fix bugs. Users of these tools also share deployment playbooks, manifests, recipes, etc  distributing via repos like GitHub and build platforms like Jenkins , mostly community and partners hardened this knowledge and also share it on secure and trusted repos and libraries like Ansible Galaxy  to which reader can refer to following to get more details https://galaxy.ansible.com/

2

Source: RedHat

All of this require a novel approach to manage the containerized infrastructure , HELM® which is a seed project with in CNCF® is a packaging solution that can address most of the challenges defined above . Just like Kubernetes it also supprots operators through which vendors and ISG can publish their software artefacts and packages to onboard it . This is also a way through which 5G CNF will be onboarded through NFVO (NFV Orchestrator) to the Infrastructure. This is exciting way to manage applications through easy to play charts , template files and easy to manage and control dependencies .

So let us try to understand some key concepts on Helm charts and Helm Operators.

4

Source: RedHat

Helm Charts:

A Helm chart is a single repository or artefact that contain all objects like deployment , services , policy , routes ,PV’s etc into a single .tgz (ZIP) file that can be instantiated on the fly . Helm also supprots aggregation concept which means you can either deploy each micro service or a collection of them altogether through one deployment process . The later is important specially in Telecom CNF’s . A good collection of helm charts are available at https://github.com/helm/charts which we can customize and also integrate with CI pipeline like Jenkins to do all operations on the fly .

When it comes to telecom and 5G CNF’s it is important to understand following terms before understanding contents of the package

5

Source: K8S and ETSI NFV Community

3

Source: Kodecloud and Project experience

Chart: A collection of resources which are packaged as one and will be used to run an application, too or service etc

Repo: A collection like an RPM used to manage and distribute resources as packages. Satellite can be used to integrate both VIM and CIM Repos in a 5G world

Release: A helm supprots to run a Canary release in a Telco environment, each time a chart is instantiated obviously including incremental changes each time will be considered a Release

Helm latest version is 3.0 release in ONS North Americas In Nov 2019 and includes a major change like removal of Tiller (Major security bottleneck) which was major impediment to use helm on more secure clusters.

Just like VNFD and NSD which follows ETSI ® SOL1 and SOL4 which defines VNF packages and its structure using TOSCA in Kubernetes we follow helm chart standard which YAML descriptors and structure that can be instantiates using helm create chart name , further it can be enriched and customized as per need , the mandatory manifests are values.yaml contains details like IP’s ,networks , template.yaml consumes the values ,chart.yaml the master file to manage charts , NOTES.txt  a comment files and Test.yaml to conduct chart testing once deployed . requirements.yaml is a file that list the dependencies

Happy and ready to apply your own helm charts , then try this out https://hub.helm.sh/charts?q=ericsson .  Although helm charts provide an easy way to manage applications however not all the changes are acceptable directly specially for the case of stateful CNF’s which are very relevant to the Telecom use case. In this case we need to use the Helm operator which first version 1.0 is GA now and let us discuss its key points below. Similarly Kubernetes operator need to be installed first via CRD’s , Helm charts behave in the same manner with a difference it is installed using Software developer provided charts .

 Helm Operators:

A helm chart and its packaging can be compared to Functions of Kubernetes operator which makes it easy to deploy and manage application across its life cycle using CRD and customer defined definition .

The helm operator is doing the next step of what Kubernetes is by enabling complete GitOps  for helm .It focused on defining a custom resource for the helm release itself thereby making it simple to manage complete artefacts as it is being deployed and managed .

As of April 2020 following are major features already added in Helm1.0 Operator

  • Declaratively installs, upgrades, and deletes Helm releases
  • Pull charts from anychart source;
  • Public or private Helm repositories over HTTP/S
  • Public or private Git repositories over HTTPS or SSH
  • Any other public or private chart source using one of the availableHelm downloader plugins
  • Allows Helm values to be specified;
  • In-line in the HelmRelease resource
  • In (external) sources, e.g. ConfigMap and Secret resources, or a (local) URL
  • Automated purging on release install failures
  • Automated (optional) rollback on upgrade failures
  • Automated image upgradesusing Flux
  • Automated (configurable) chart dependency updates for Helm charts from Git sources on install or upgrade
  • Detection and recovery from Helm storage mutations (e.g. a manual Helm release that was made but conflicts with the declared configuration for the release)
  • Parallel and scalable processing of different Helm Release resources using workers

Source: http://www.weave.works

Helm Operator can also work with other Kubernetes operators and address any dependency constraints infact all those can be expressed as part of the Chart itself. This is certainly needed in CNF’s and Telco use cases where there are lot of dependencies between API versions and cluster components for all rolling updates and each of this will require testing and validation. Traditional Helm obviously can not address it and it is almost impossible for user to address all such changes in an ever changing and complex world of network meshes, Helm operator ensures these requirements are fulfilled with in the GitOps frameworks.

Helm basic commands:

Below is a good jump start to some of basic helm commands .

  • helm repo add

command to add a helm chart from a repo

  • helm create chart-name

command to add a helm chart , a directory with basic files

  • helm install –dry-rundebug ./mychart

Run dry run to install and show debug instructuctions

  • helm package ./mychart

Will prepare the .tgz package and a user can install the application from this package.

  • helm get all UPF -n CNF

Will retrieve the details of application deployed via helm in a give NS

  • helm –help

Want to know all just try it out

Conclusion:

Although I have explained the Helm and Kubernetes in a way that one can believe that Helm chart is the replacement of Operator which is not the case. Infact the Helm is mainly aimed to deploy and manage Day1 tasks while still along the LCM of application you rely on CRD’s and Operators with one caveat why I do not like is that each time a new CRD we have to install and manage them. It will definitely change over time as Helm operator will target to resolve for most of day2 issues and that’s why I will encourage to get involved in Kubernetes SIG community.

Finally, as we will standardize the Dev Pipeline for Telco’s as well which is still too much invisible to us it will enable us to build hybrid cloud environment that will certainly solve so many fundamental architecture and business challenges. As an example, in the COVID-19 scnerio so many of the business face challenge to expand their networks to cater for increased load. If Telco’s already have figured out this pipeline it would have been both economical and responsible to share load between Onprem and Public cloud to address the hour need. This is why the journey to Hybrid cloud and software package standardization and delivery is too vital for both growth and sustainability of the Telco industry and national growth.

References:

ETSI NFV IFA29

@Oreily Kubernetes book sponsored by RedHat

https://medium.com/

https://www.weave.works/blog/introducing-helm-operator-1-0

https://www.digitalocean.com/

The comments in this paper do not reflect any views of my employer and sole analysis based on my individual participation in industry, partners and business at large. I hope sharing of this information with the larger community is the only way to share, improve and grow. Author can be reached at snasrullah@swedtel.com