How APAC Telcos Are Applying Data, Cloud and Automation to Define the Future Mode of Operations

As CSPs evolve into digital players they face new challenges: network size and traffic requirements are increasing at an exponential pace, and networks that previously served only telco workloads must now be open to a range of business, industrial and services verticals. These factors require CSPs to revamp their operations model into one that is digital, automated, efficient and, above all, services driven. Similarly, future operations should support innovation rather than relying on offerings from existing vendor operations models, tools and capabilities.

Because CSPs will have to operate and manage both legacy and new digital platforms during the migration phase, it is imperative that operations have a clear transition strategy and processes that can meet both PNF and VNF service requirements, with optimum synergy where possible.

In work done by our team with our customers, specifically in APAC, the future network should address the following operations-transformation challenges.

  • Fault Management: Fault management in the digital era is more complex because applications no longer run on dedicated infrastructure. The question therefore arises of how to demarcate faults and correlate them across layers to improve O&M troubleshooting of ICT services.
  • Service Assurance: The future operations model must be digital in nature with minimal manual intervention, fully aligned with ZTM (Zero Touch Management), and data driven using the principles of closed-loop feedback control.
  • Competency: To match the operational requirements of future digital networks, the skills of engineers and designers will play a pivotal role in defining and evolving towards future networks. The new roles will require network engineers to be technologists rather than technicians.
  • IT Technology: IT skills, including data centers, cloud and shared resources, will be vital to operating the network. Operations teams need to understand the impact of scaling, elasticity and network healing on operational services.

Applying ODA (Open Digital Architecture) to the Future Mode of Operations (FMO) Framework

TM Forum's ODA (Open Digital Architecture) is a perfect place to start, but since it is only an architecture it can lead to different implementation and application architectures, so below I will share how it is being applied in real brownfield networks. I will cover all modules except AI and intelligent management, which will be discussed in a separate paper.

The lack of automation in legacy telco networks is an important pain point that needs to be addressed promptly in future networks. Automation will not only enable CSPs to avoid the toil of repetitive tasks but also reduce the risk of human error.

In order to address the challenges highlighted above, it is vital to develop an agile operations model that improves customer experience, optimizes CAPEX, and drives AI operations and business process transformation.

Such a strategic vision will be built on an agile operations model that can fulfill the following:

  • Efficiency and Intelligent Operation: Telecom efficiency is based on data-driven architectures that use AI to provide actionable information and automation, self-healing network capability and network automation, maturing through the following stages:
    •  Task automation & established foundation
    •  Proactive operation & advanced operation
    •  Machine-managed & intelligent operation
  • Service Assurance: Building a service assurance framework that automates network surveillance, service quality monitoring, fault management, preventive management and performance management, ensuring closed-loop feedback control for the delivery of zero-touch incident handling.
  • Operations Support: Building a support framework to automate operations acceptance and change & configuration management.
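The closed-loop idea behind zero-touch incident handling can be sketched in a few lines of Python: detect, correlate across layers, then act. The alarm fields, correlation key and remediation actions below are illustrative placeholders, not any vendor's API:

```python
# Sketch of a closed-loop (zero-touch) incident flow: detect -> correlate -> act.
from dataclasses import dataclass

@dataclass
class Alarm:
    source: str      # e.g. "vnf-amf-01" (hypothetical NF name)
    layer: str       # "infra", "vnf" or "service"
    severity: str    # "critical", "major", "minor"

def correlate(alarms):
    """Group alarms by source so one incident covers a cross-layer fault."""
    groups = {}
    for a in alarms:
        groups.setdefault(a.source, []).append(a)
    return groups

def remediate(source, alarms):
    """Closed-loop action: auto-heal criticals, ticket the rest."""
    if any(a.severity == "critical" for a in alarms):
        return f"auto-heal triggered on {source}"
    return f"ticket opened for {source}"

alarms = [Alarm("vnf-amf-01", "infra", "critical"),
          Alarm("vnf-amf-01", "service", "major"),
          Alarm("vnf-smf-02", "vnf", "minor")]
actions = [remediate(s, g) for s, g in correlate(alarms).items()]
```

In a real network the correlation key would come from topology data rather than the alarm source alone, but the detect-correlate-act loop is the same.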

A phased approach to operational transformation

Based on field experience gained with our partners and customers through telecom transformations, we can summarize the learnings as follows.

  • People transformation: Transforming teams and the workforce to match the DevOps concept, streamlining the organization so that services are delivered in an agile and efficient manner. This is vital because 5G, cloud and DevOps are a journey of experience, not a deployment of solutions; start quickly to embark on the digital journey.
  • Business process transformation: Working together with partners on the unification, simplification and digitization of end-to-end processes. The new processes will enable telcos to adapt the network quickly to offer new products and to reduce troubleshooting time.
  • Infrastructure transformation: Running services on digital platforms and cloud, matched to a clear vision for swapping out the legacy infrastructure.

If PNF-to-VNF/CNF migration is vital, then hybrid network management is critical.

  • Automation and tools: Operations automation using tools such as workforce management and ticket management is vital, but it does not by itself support a vision of full automation. Migrating services to the cloud will enable automated delivery of services across the whole life cycle. Programming teams should join operations to start a journey where the network is managed through the power of software such as Python, Java, Golang and YANG models. This will also enable test automation, a vision that lets operations teams validate any change before applying it to the live network.
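The "validate before you apply" idea can be sketched as a pre-change gate that a change request must pass before it touches the live network. The check names and config keys below are hypothetical; a real pipeline would run vendor- and model-specific validations (e.g. against YANG models):

```python
# Hedged sketch of pre-change validation: run checks against a candidate
# config before it is applied. Keys and thresholds are illustrative only.
def validate_change(candidate: dict) -> list:
    """Return a list of violations; an empty list means safe to apply."""
    violations = []
    if candidate.get("mtu", 1500) > 9000:
        violations.append("MTU exceeds 9000")
    if not candidate.get("rollback_plan"):
        violations.append("no rollback plan attached")
    return violations

change = {"device": "core-router-1", "mtu": 9216, "rollback_plan": None}
issues = validate_change(change)
# apply the change only if `issues` is empty
```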

Having said this, I hope this serves as a high-level guide for architects addressing operational transformation. As we can see, AI and intelligent management is a vital piece of it, and I shall write on this soon.

Telco Cloud and Application Characteristics for the 5G and Edge-Native World

Cloud computing has been increasingly adopted in the telecom industry over the last decade, and it has changed shape and architecture many times since the first phase of NFV began back in 2012. In today's data-hungry world there is growing demand to move cloud architectures from central clouds to loosely coupled distributed clouds: it makes sense from a cost perspective, by slashing the transport cost of anchoring all user traffic back to central data centers, and certainly from a security perspective, since major customers prefer to keep data on premises. Similarly, with hyperscalers and public cloud providers targeting the telco industry, it is evident that the future cloud will be fully distributed and multi-cloud, constituted by many on-premise and public cloud offerings.

Since 5G is by design based on cloud concepts like:

  • Service-based architectures
  • Microservices
  • Scalability
  • Automation

it is evident that many operators are embarking on a journey to build open and scalable 5G clouds capable of handling future business requirements from both telco and industry verticals. The purpose of this paper is to highlight the key characteristics of such clouds and how we must collaborate with a rich ecosystem to make 5G a success and achieve Industry 4.0 targets.

Cloud Native Infrastructure for 5G Core and Edge

Cloud native does not refer to a particular technology but to a set of principles ensuring that future applications are fully decoupled from the infrastructure; at the atomic level that can be a VM, a container, or perhaps futuristic serverless functions and unikernels. As of today, the only community-accepted cloud-native standard for 5G and cloud is an OCI-compliant infrastructure. In general, cloud native for telco means a telecom application per 3GPP, IETF and related standards that meets the cloud-native principles shared in this paper, supports the vision of immutable infrastructure, and is declarative with native DevSecOps across the whole infrastructure.

Cloud native is the industry de facto standard for developing and delivering applications in the cloud, and since 5G is by design service based and microservice enabled, the basic principle for 5G infrastructure is cloud native, which supports scalability, portability, openness and, most importantly, the flexibility to onboard a wide variety of applications.

As per the latest industry studies, data in the 5G era will quadruple every year. This makes cloud native a necessity for provisioning infrastructures that are fully automated, support common SDKs and, above all, enable CI/CD across the full application life cycle.

Scalability to deploy services in many PoPs is another key requirement for 5G, along with the ability to build or tear down a service on the fly. As 5G deployments scale, so do cloud instances, and it is a necessity that future cloud infrastructure can be scaled and managed automatically.
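For container-based workloads this kind of automatic scaling is already codified: the Kubernetes Horizontal Pod Autoscaler, for example, computes the desired replica count as roughly ceil(current × currentMetric / targetMetric). A minimal sketch of that formula:

```python
# Core scaling formula used by the Kubernetes Horizontal Pod Autoscaler:
#   desiredReplicas = ceil(currentReplicas * currentMetric / targetMetric)
import math

def desired_replicas(current: int, current_metric: float, target_metric: float) -> int:
    return math.ceil(current * current_metric / target_metric)

# 4 replicas running at 90% CPU against a 60% target -> scale out to 6
```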

Application portability is another key characteristic of the 5G cloud. As 5G use cases mature, there is an increasing requirement to deploy different applications in different clouds and to connect them in loosely coupled meshes. In addition, as network capacities and usage increase, applications must be able to move across clouds.

What Cloud means for Telco 5G

Telco operators, through their mission-critical infrastructure, hold a seminal place in the post-COVID-19 digital economy. Telecom network usage directly impacts the economy, society, commerce and law and order, which is why telecom networks are designed for higher availability, reliability and performance.

The biggest challenges for cloud-native infrastructure in telco lie in:

  • Granularity of Telco App decomposition
  • Networking
  • Performance acceleration
  • O&M and Operational frameworks

Because telco 5G applications must fulfill special SLA-based performance and functions that are not fully possible in today's containerized, Kubernetes-based cloud platforms, we must define a telco definition of cloud. Similarly, how we connect workloads east-west is very important, and the questions become more pressing as we move towards the edge. The downside is that any deviation from standard cloud native means we cannot achieve the promises of scaling, performance and distribution, the very purpose for which we built these platforms.

Any tweak to cloud principles means we cannot provision and manage a truly automated cloud infrastructure following DevSecOps, which is vital for delivering continuous updates and new software to the 5G infrastructure. Lacking such functions means we cannot meet the fast-paced innovation requirements of new 5G use cases, especially in vertical markets.

The last and most important factor is leveraging advances from hyperscalers for cloud and 5G deployments. Today we already see a movement in the market where carrier-grade clouds from well-known distros such as IBM can be deployed on top of public clouds, but the top question is whether abstraction will impact performance. One key reason the first wave of NFV was not so disruptive is that we defined too many models, using one model to define another, which obviously added complexity and deployment issues. Cloud native for 5G telco needs to address and harmonize this as well.

Applications for 5G

The application economy is vital for the success of 5G and edge. However, Tier-1 operators' deployments of open 5G platforms have revealed that just deploying open infrastructure is not enough: adherence to cloud principles varies by application vendor, and to truly take advantage of the cloud it is vital to define principles for infrastructure-led delivery, devising frameworks and tools to test and benchmark 5G applications and classify them as Gold, Silver or Bronze, with a common direction towards fully Gold-standard applications in the 5G era. Although cloud native in principle supports the vision of a common, shared and automated infrastructure, it is easier said than done in practice, as achieving telco-grade conformance for telco services is complex and requires rigorous validation and testing. Based on real open 5G cloud deployments and corresponding CNF benchmarking, there are still certain gaps in standards that need both standardization and testing:

  • Application resource overcommitting
  • Application networking dependencies that slow scaling
  • Use of SDN in the 5G cloud
  • Lack of open telemetry, which makes customized EMSs mandatory
  • Hybrid management of VNFs and CNFs

Luckily there are a number of industry initiatives, such as CNCF Conformance, CNTT RI-2, NFV NOC and OPNFV, which fundamentally address these very issues, and we have already seen results. It is vital that 5G cloud infrastructures support easy-to-use SDKs and tools that vendors and developers can use flexibly to offer and deploy different applications in the 5G era.
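The Gold/Silver/Bronze classification mentioned above can be sketched as a simple pass-rate score over conformance checks. The check names and tier thresholds below are illustrative, not taken from CNTT or CNCF specifications:

```python
# Illustrative CNF benchmark tiering: pass-rate over conformance checks.
def classify_cnf(results: dict) -> str:
    """Map a dict of check-name -> passed? to a tier. Thresholds are illustrative."""
    passed = sum(1 for ok in results.values() if ok)
    rate = passed / len(results)
    if rate >= 0.9:
        return "Gold"
    if rate >= 0.7:
        return "Silver"
    return "Bronze"

checks = {"no_resource_overcommit": True, "horizontal_scaling": True,
          "open_telemetry": False, "declarative_config": True,
          "immutable_image": True}
tier = classify_cnf(checks)   # 4 of 5 checks pass
```

A real framework would weight checks and verify them against a running workload, but the tiering logic is this simple at heart.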

In the next part I shall elaborate on how open telemetry and automation are driving the next era of growth using ML- and AI-driven solutions.

P4 programming with Intel 3rd Gen can help build standard Telco Edge Infrastructure Models

Recent PoCs of P4 programming models with multiple registers, alongside Intel 3rd-gen processors and Agilex (10 nm), gave critical indications about the future delivery model of 5G edge infrastructure and networks by addressing the following telco requirements.

Telco Key Edge Infrastructure Requirements

  1. Workload placement at the edge, with the possibility of making workloads portable based on real-time use
  2. Enhancing infrastructure models and form factors for the edge
  3. Evaluating P4 as a baseline for building a unified model to program compute/network and VNF/CNF for real-time use cases like traffic steering

Results with Smart NICs and the Latest x86 Chips

Below are the key findings.

  1. The hardware can deliver consistent throughput (required: 10 Gbps per core) up to 10 register pipelines, and degrades exponentially after that, at roughly 12% for each additional pair of registers
  2. The impact on latency becomes prominent as registers increase, with 40-50% variation beyond 10 registers
  3. The P4 architecture with Intel 3rd-generation processors can help solve and optimize these issues dramatically
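The throughput finding above can be expressed as a back-of-envelope model: a flat 10 Gbps per core up to 10 register pipelines, then roughly 12% degradation for each extra pair of registers. The sketch below mirrors only the numbers quoted here; it is not a vendor datasheet:

```python
# Back-of-envelope model of the reported PoC finding:
# ~10 Gbps/core up to 10 register pipelines, then ~12% loss per extra pair.
def throughput_gbps(registers: int, base: float = 10.0) -> float:
    if registers <= 10:
        return base
    extra_pairs = (registers - 10) // 2
    return base * (0.88 ** extra_pairs)

# e.g. 14 registers -> two extra pairs -> 10 * 0.88^2 ≈ 7.74 Gbps
```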

Key achievements with P4 in the last year

  1. A unified programming model from core to cloud to edge demonstrated
  2. Vendors like Xilinx, Intel, Dell, Barefoot etc. fully supporting acceleration
  3. 12 Tbps programmable chips available for PoCs, at least in labs
  4. P4 can build CI/CD for the hardware infrastructure to ensure infra resilience

Current Challenges and Targets in 2021

Below are some gaps that are expected to be addressed to ensure a telco-grade edge.

  1. P4 is good in the core, but at the edge it needs to be improved, especially around common API models
  2. Latency performance is key to building edge infrastructure
  3. P4 is not a modeling language but a switching-model language; how to abstract it at the service level is an open issue
  4. The VNF partner ecosystem, especially drivers on the cloud and VNF side
  5. Whether GPUs can help solve multiple register pipelines
  6. How P4 can work with IPv4 to build use cases like slicing

Finally, the most important need, one requiring a more cohesive community effort, is telemetry and topology; so far we have only a few references, such as DeepInsight on P4. Refer to the references below.

References

  • Advanced Information Networking and Applications: Proceedings , Volume 1
  • Toyota Edge Proceedings
  • IEEE

Delivering Edge Architecture Standardization in the Edge User Group

Edge deployments are gaining momentum in Australia, APAC and other markets. Due to the sheer size of the edge there are new challenges and opportunities for all players in the ecosystem, including:

  • Hardware infrastructure suppliers, e.g. Dell, HPE, Lenovo
  • On-prem cloud vendors like Red Hat and VMware
  • Hybrid cloud companies like IBM and Mirantis
  • Public cloud providers and hyperscalers like AWS, Azure and Google
  • SIs like Southtel and Tech Mahindra

However, one thing the telco community needs to do is create a standard architecture and specifications for the edge that will not only support a thriving ecosystem but also achieve the promises of global scale and developer experience. Within the Open Infrastructure community we have been working in the OpenInfra Edge Computing Group to achieve exactly this.

Focus Areas

The following is the scope and the areas we are enabling today:

  • Defining reference architectures for the edge delivery model, in the form of reference architectures, a reference model and a certification process, working together with #GSMA and #Anuket in the Linux Foundation
  • Defining use cases based on real RFx and telco customer requirements
  • Prioritizing requirements for each half-year
  • Enabling the edge ecosystem
  • Publishing white papers, especially on implementation and testing frameworks

Edge Architectures

Alongside Linux Foundation Akraino blueprints, we are enabling blueprints and best practices in the Edge user group. However, we emphasize that the architecture should remain as vendor agnostic as possible, with different flavors and vendors solving the following challenges (see Edge Computing Group – OpenStack):

  • Life-cycle Management. A virtual-machine/container/bare-metal manager in charge of managing machine/container lifecycle (configuration, scheduling, deployment, suspend/resume, and shutdown). (Current Projects: TK)
  • Image Management. An image manager in charge of template files (a.k.a. virtual-machine/container images). (Current Projects: TK)
  • Network Management. A network manager in charge of providing connectivity to the infrastructure: virtual networks and external access for users. (Current Projects: TK)
  • Storage Management. A storage manager, providing storage services to edge applications. (Current Projects: TK)
  • Administrative. Administrative tools, providing user interfaces to operate and use the dispersed infrastructure. (Current Projects: TK)
  • Storage latency. Addressing storage latency over WAN connections.
  • Reinforced security at the edge. Monitoring the physical and application integrity of each site, with the ability to autonomously enable corrective actions when necessary.
  • Resource utilization monitoring. Monitor resource utilization across all nodes simultaneously.
  • Orchestration tools. Manage and coordinate many edge sites and workloads, potentially leading toward a peering control plane or “self-organizing edge.”
  • Federation of edge platforms orchestration (or cloud-of-clouds). Must be explored and introduced to the IaaS core services.
  • Automated edge commission/decommission operations. Includes initial software deployment and upgrades of the resource management system’s components.
  • Automated data and workload relocations. Load balancing across geographically distributed hardware.
  • Synchronization of abstract state propagation. Needed at the “core” of the infrastructure to cope with discontinuous network links.
  • Network partitioning with limited connectivity. New ways to deal with network partitioning issues due to limited connectivity, coping with short and long disconnections alike.
  • Manage application latency requirements. The definition of advanced placement constraints in order to cope with latency requirements of application components.
  • Application provisioning and scheduling. In order to satisfy placement requirements (initial placement).
  • Data and workload relocations. According to internal/external events (mobility use-cases, failures, performance considerations, and so forth).
  • Integration location awareness. Not all edge deployments will require the same application at the same moment. Location and demand awareness are a likely need.
  • Dynamic rebalancing of resources from remote sites. Discrete hardware with limited resources and limited ability to expand at the remote site needs to be taken into consideration when designing both the overall architecture at the macro level and the administrative tools. The concept of being able to grab remote resources on demand from other sites, either neighbors over a mesh network or from core elements in a hierarchical network, means that fluctuations in local demand can be met without inefficiency in hardware deployments.

Edge Standards under Review

Although, owing to carrier-grade telco service requirements at the edge, the preference has always been StarlingX, and this is what we are maturing to GA, there are many other standards we are reviewing at the edge, as follows.

StarlingX

A complete cloud infrastructure solution for edge and IoT:
  • Fusion between Kubernetes and OpenStack
  • Integrated stack
  • Installation package for the whole stack
  • Distributed cloud support

K3S and Minimal Kubernetes

  • Lightweight Kubernetes distribution
  • Single binary
  • Basic features added, like local storage provider, service load balancer, Traefik ingress controller
  • Tunnel Proxy

KubeEdge, especially for IoT

  • Kubernetes distribution tailored for IoT
  • Has orchestration and device management features
  • Basic features added, like a storage provider, service load balancer and ingress controller
  • Cloud Core and EdgeCore

Submariner

  • Cross Kubernetes cluster L3 connectivity over VPN tunnels
  • Service discovery across clusters
  • Connects clusters with overlapping CIDRs
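Detecting whether two clusters' CIDRs actually overlap, the situation Submariner is designed to handle, takes only Python's standard ipaddress module:

```python
# Check for overlapping cluster CIDRs using the stdlib ipaddress module.
import ipaddress

def cidrs_overlap(a: str, b: str) -> bool:
    return ipaddress.ip_network(a).overlaps(ipaddress.ip_network(b))

# Two clusters both carving ranges out of 10.42.0.0/16 collide;
# 10.42.0.0/16 and 10.43.0.0/16 do not.
```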

Call for Action

  • Weekly meeting on Mondays at 6am PDT / 1300 UTC: https://wiki.openstack.org/wiki/Edge_Computing_Group#Meetings
  • Join our mailing list for more edge discussions: http://lists.openstack.org/cgi-bin/mailman/listinfo/edge-computing
  • Join the #edge-computing-group channel on Freenode

Procedure to Join the Channel in mIRC

The following are the steps to join. Many people reported issues with mIRC versions after 7.5, so I wanted to give a short summary here.

Step 1: Registration and Nickname Settings

You may see a notice from NickServ that the nick you use is already taken by someone else. In that case you need to choose another nickname, which you can do easily by typing:

/nick nick_of_your_choice

/nick john_doe

NickServ will keep showing this notice until you find a nick that is not registered by someone else. If you want to use the same nick every time you connect, you may register it. The service called NickServ handles the nicks of all registered users of the network. Nick registration is free; you just need an email address to confirm that you are a real person. To register the nick you currently use, type:

/nickserv register password email

/nickserv register supersecret myemail@address.net

Note: your email address will be kept confidential. The network will never send you spam or mails requesting private data (like passwords or banking accounts). After this you will see a notice from NickServ telling you this:

– NickServ – A passcode has been sent to myemail@address.net, please type /msg NickServ confirm <passcode> to complete registration

Check your email account for new mail. Some email providers, like Hotmail, may drop mail sent by the services into your spam folder. Open the mail and you will find text like this:

Hi, You have requested to register the following nickname some_nickname. Please type ” /msg NickServ confirm JpayrtZSx ” to complete registration. If you don’t know why this mail is sent to you, please ignore it silently. PLEASE DON’T ANSWER TO THIS MAIL! irchighway.net administrators.

Just copy and paste the part /msg NickServ confirm JpayrtZSx into the status window of your mIRC, then press the Enter key. Text like the following:

– *NickServ* confirm JpayrtZSx – 
– NickServ – Nickname some_nickname registered under your account: *q@*.1413884c.some.isp.net –
– NickServ – Your password is supersecret – remember this for later use.
– * some_nickname sets mode: +r

should appear after this. This means you have finished your registration, and the nick can only be used by you; you can also force anyone else using your nick to give it back to you. If you disconnect, you need to tell NickServ that the nick is yours. You can do that with:

/nickserv identify password e.g. /nickserv identify supersecret

If the password is correct, it should look like this:

* some_nickname sets mode: +r – 
– NickServ – Password accepted – you are now recognized.

In mIRC you can automate the identification process so you no longer have to care about this. Open the mIRC Options by pressing the key combination Alt + O, select the category Options, and click Perform.

Check Enable perform on connect and add if ($network == irchighway) { /nickserv identify password } in the edit box called Perform commands. Close the options by clicking OK. Now mIRC will automatically identify you every time you connect to IRCHighway.

Step 2: Setting up SASL (/CAP) Authentication

mIRC added built-in SASL support in version 7.48, released April 2017. The instructions below were written for version 7.51, released September 2017. Earlier versions of mIRC have unofficial third-party support for SASL, which is not documented here. Freenode strongly recommends using the latest available version of your IRC client so that you are up to date with security fixes.

  1. In the File menu, click Select Server…
  2. In the Connect -> Servers section of the mIRC Options window, select the correct server inside the Freenode folder, then click Edit
  3. In the Login Method dropdown, select SASL (/CAP)
  4. In the second Password box at the bottom of the window, enter your NickServ username, then a colon, then your NickServ password. For example, dax:hunter2
  5. Click the OK button

Step 3: Joining the Channel

Use the following commands to join the channel. Best of luck!

/connect chat.freenode.net 6667 SID_SAAD:XYZPASSWORD
/join #edge-computing-group
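Behind these mIRC steps, joining a channel is just a handful of protocol lines per RFC 1459. Below is a minimal sketch that builds the registration and join commands a client sends over the TCP connection; the nick and channel are the examples from this post, and a real client would also answer server PINGs:

```python
# Build the raw IRC protocol lines (RFC 1459) behind the mIRC steps above.
def irc_handshake(nick, channel, password=None):
    """Return the CRLF-terminated lines a client sends on connect."""
    lines = []
    if password:
        lines.append(f"PASS {password}")       # server password, if any
    lines.append(f"NICK {nick}")               # pick the nickname
    lines.append(f"USER {nick} 0 * :{nick}")   # register the connection
    lines.append(f"JOIN {channel}")            # join the channel
    return [l + "\r\n" for l in lines]

commands = irc_handshake("SID_SAAD", "#edge-computing-group")
```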

References

  1. https://gist.github.com/xero/2d6e4b061b4ecbeb9f99
  2. https://irchighway.net/14-blog/gaming/14-i-m-new-to-irc
  3. https://freenode.net/kb/answer/mirc
  4. https://www.delltechnologies.com/en-au/solutions/edge-computing/index.htm
  5. https://www.redhat.com/en/topics/edge-computing/approach
  6. https://aws.amazon.com/edge/
  7. KubeCon Europe April 2021 session by Ildikó Váncsa (Open Infrastructure Foundation) – ildiko@openinfra.dev and colleague Gergely Csatári (Nokia) – gergely.csatari@nokia.com

Understanding Openshift-4 installation for Developer and Lab Environments

Just as Linux is the de facto OS for innovation in the data center, OpenShift is proving to be a catalyst for both enterprise and telco cloud transformation. In this blog I would like to share my experience with two environments: one is Minishift, a home-brew environment for developers, and the other is based on pre-existing infrastructure.

As you know, OpenShift is a cool platform; alongside these two modes it supports a wide variety of deployment options, including hosted platforms on:

  • AWS
  • Google
  • Azure
  • IBM

However, for hosted platforms we use the full installers without any customization, so this is simply not complex, provided you follow only the Red Hat guide for deployment.

Avoid common Mistakes

  • As a prerequisite you must have a bastion host to be used as the bootstrap node
  • Linux manifests (NTP, registry, keys) should be available, while for a full installation the DNS must be prepared before the cloud installer kicks in
  • Avoid making Ignition files on your own (always generate manifests from the installers)
  • For pre-existing infrastructure the control plane is based on CoreOS while workers can be RHEL or CoreOS; for full stack everything, including workers, must be based on CoreOS
  • Once installation has started, the whole cluster must be spun up within 24 hours; otherwise you need to generate new keys before proceeding, as the controller will stop responding because the keys have a 24-hour validity
  • In my experience most manifests for a full-stack installation are created by the installers, viz. cluster node instances, cluster networks and bootstrap nodes

Pain points in OpenShift 3 installation

Since most OpenShift 3 installation revolved around complex Ansible playbooks, roles and detailed Linux file configuration, all the way from DNS to CSRs, there was a dire need to make it simple and easy for customers. That is what Red Hat has done by moving to an opinionated installation, which makes it simple to install with only high-level information; later, based on each environment, the enterprise can scale for Day-2 requirements as needed. Such a mode solves three fundamental issues:

  • Installer customization needs (at least this was my experience in OCP3)
  • Full automation of the environment
  • Implementing CI/CD

Components of installation

There are two pieces you should know about for an OCP4 installation.

Installer

The installer is a binary that comes directly from Red Hat and needs very little tuning or customization.

Ignition Files

Ignition files are the first-boot configs needed to configure the bootstrap, control and compute nodes. If you have managed an OpenStack platform before, you know it needs separate Kickstart and cloud-init files; with the Ignition process Red Hat simplifies both steps. For details on the Ignition process and cluster installation, refer to the material below.
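As a feel for what an Ignition file contains, the snippet below builds a minimal config as a Python dict and dumps it to JSON. The top-level field names follow the Ignition v3 spec; the user and SSH key are placeholders, and real ignition files for a cluster should always come from openshift-install, as noted above:

```python
# Minimal Ignition-style config built in Python (field names per Ignition v3;
# user and key are placeholders - generate real files with openshift-install).
import json

ignition = {
    "ignition": {"version": "3.1.0"},
    "passwd": {
        "users": [
            {"name": "core", "sshAuthorizedKeys": ["ssh-ed25519 AAAA...example"]}
        ]
    },
}
print(json.dumps(ignition, indent=2))
```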

Minishift installation:

Prerequisites:

Download the CDK (Red Hat Container Development Kit) from:
https://developers.redhat.com/products/cdk/hello-world/#fndtn-windows

  1. Copy the CDK into the directory C:/users/Saad.Sheikh/minishift and in CMD go into that directory
  2. minishift setup-cdk
  3. It will create .minishift in your path C:/users/Saad.Sheikh
  4. set MINISHIFT_USERNAME=snasrullah.c
  5. minishift start --vm-driver virtualbox
  6. Add the directory containing oc.exe to your PATH:
    1. FOR /f "tokens=*" %i IN ('minishift oc-env') DO @call %i
  7. minishift stop
  8. minishift start
  9. The message below may appear; just ignore it and enjoy:
    error: dial tcp 192.168.99.100:8443: connectex: A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond. - verify you have provided the correct host and port and that the server is currently running.
    Could not set oc CLI context for 'minishift' profile: Error during setting 'minishift' as active profile: Unable to login to cluster
  10. oc login -u system:admin

The server is accessible via web console at:
https://192.168.99.100:8443/console

You are logged in as:
User: developer
Password:

To login as administrator:
oc login -u system:admin

OpenShift installation based on on-prem hosting

This mode is also known as UPI (user-provisioned infrastructure), and a full OCP installation follows these key steps:

Step 1: Run the Red Hat installer.

Step 2: From the manifests, build the Ignition files for the bootstrap node.

Step 3: The control nodes boot and fetch their configuration from the bootstrap server.

Step 4: etcd, provisioned on the control nodes, scales out to build a 3-node HA control plane.

Finally, the bootstrap node is decommissioned and removed.

The following is the script I used to spin up my OCP cluster:

#1 Reboot the bootstrap machine; during reboot it goes to PXE and installs CoreOS

#2 openshift-install --dir=./ocp4upi

#3 Remove the bootstrap IP entries from /etc/haproxy/haproxy.cfg
#4 systemctl reload haproxy

#5 Set the KUBECONFIG environment variable
#6 export KUBECONFIG=~/ocp4upi/auth/kubeconfig

#7 Verify the installation
#8 oc get pv
#9 oc get nodes
#10 oc get clusteroperator

#11 Approve any pending CSRs and certificates
#12 oc get csr -o go-template='{{range .items}}{{if not .status}}{{.metadata.name}}{{"\n"}}{{end}}{{end}}' | xargs oc adm certificate approve

#13 Log in to the OCP cluster GUI at
https://localhost:8080

Do try it out and share your experience and thoughts about the OCP 4.6 installation.

Disclaimer: I validated all commands and processes in my home lab environment; check and tune them for your own environment before applying them, as some adjustments may be needed.

Early lessons from Open RAN deployment in brownfield, a must-have delivery model to untap 5G scale, complexity and economics in 2021+

With most Tier-1 operators rolling the ball on early experience with 5G Standalone based on 3GPP Release-16, which can offer very new and unique 5G experiences around uRLLC and mIoT along with improved QoS for broadband, the whole industry is looking forward to accelerating 5G adoption in 2021.

This is an ideal time for the industry to find new ways to improve human life and the economy in the post-COVID world. However, the biggest problem with the 5G experience so far has been the delivery model, which must offer both cost and scale advantages. With the hyperscalers eyeing the telcos' billion-dollar market, it is necessary for all vendors, and for telcos themselves, to analyze and find ways towards new business models that are based on

  • Coherence
  • Data Driven
  • Service Quality

By coherence I mean: when we go to a 5G radio site, what components will be residing there?

  • A 5G Radio site components
  • Edge Applications
  • Network connectivity

By data I mean that, despite delivering thousands of 5G sites so far, we cannot yet offer DaaS (Data as a Service) to verticals; the only visibility we have is on the horizontal interfaces, in the 3GPP way.

The third and most important piece is the RF and RAN service. Talk to any RAN engineer, and he is not interested in either coherence or data unless we can offer him a RAN service that is at least the same as, or better than, legacy.

This makes the story of Open RAN very exciting to analyze, and this is the story of my experience leading such projects over the last few years in both my company and the industry. It is my humble opinion that Open RAN and other such innovative solutions must not be analyzed through technology alone, but from an end-to-end view, in which some use cases specifically require such solutions to be successful.

Why Open RAN

For me, Open RAN is not about cloud but about finding a new and disruptive delivery model to offer RaaS (RAN as a Service). Have you ever imagined what would happen if a hyperscaler like AWS or Azure acquired a couple of RAN companies that can build cloud-native RAN applications? The telco could order the RAN online, and it could be spun up in the data center or on an edge device in a PnP (plug-and-play) manner.

If you think I am exaggerating, there are already discussions and PoCs happening around this. So Open RAN is not about cloud, but about the telco industry doing something similar in an open and carrier-grade fashion.

This is where Open RAN is gaining momentum: bringing the power of open, data-driven and cloud-based disaggregated solutions to the RAN site. The future of telco is a well-designed software stack that extends all the way from the core to the last-mile site of the network. Crucially, it also allows for placing more compute, network and storage closer to the source of the unrelenting volume of data: devices, applications, and end-users.

There is another aspect that is often overlooked: transport cost. As per field trial results, use of the Open RAN eCPRI 7-2 fronthaul interface increased fronthaul capacity utilization by at least 30-40%, primarily because there are a lot of proprietary overheads in the CPRI world.

“For a CXO it means at least 30-40% direct savings on metro and transport costs”

What is Open RAN

5G RAN has huge capacity requirements, and naturally, to scale this network, disaggregation and a layered network architecture are key. The Open RAN architecture can be depicted as follows.

RU:

The Radio Unit handles the digital front end (DFE), parts of the PHY layer, and the digital beamforming functionality. It is almost the same architecture as the DBS (Distributed Base Station) architecture offered by many legacy vendors. Conceptually, the AAU (Active Antenna Unit) is considered together with the radio unit.

DU:

The Distributed Unit handles the real-time L1 and L2 scheduling functions, mainly the MAC split. It is the real-time part of the BBU (Baseband Unit).

CU:

The Centralized Unit is responsible for non-real-time, higher L2 and L3 functions; it covers the non-real-time functions of the BBU, such as resource pooling and optimization.

RIC:

The RAN Intelligent Controller is the intelligence component of Open RAN; it collects all data and offers insights and innovation through xApps. The Near-RT RIC cooperates with the Non-RT RIC (non-real-time RIC), which is part of ONAP/MANO, to offer end-to-end SMO (Service Management and Orchestration) functions.

Interfaces of Open RAN

When it comes to understanding Open RAN, we need to understand both the interfaces defined by O-RAN and those defined by 3GPP. This is important because it requires cross-SDO liaison work.

O-RAN interfaces

  • A1 is the interface between the Non-RT RIC (in the SMO/orchestrator) and the Near-RT RIC
  • E2 is the interface between the Near-RT RIC (RAN Intelligent Controller) and the CU/DU
  • Open Fronthaul is the interface between RU and DU; the focus is mostly on eCPRI 7-2 to standardize it
  • O2 is the interface between the NFVI/CISM and the orchestrator

3GPP interfaces

  • E1 is the interface between CU-CP (control plane) and CU-UP (user plane)
  • F1 is the interface between CU and DU
  • NG-C is the interface between the gNB CU-CP and the AMF in the 5G core network

To address all interface and use-case issues, the O-RAN Alliance is working in a number of streams:

  • WG1: Use Cases and Overall Architecture
  • WG2: The Non-real-time RAN Intelligent Controller and A1 Interface
  • WG3: The Near-real-time RIC and E2 Interface
  • WG4: The Open Fronthaul Interfaces
  • WG5: The Open F1/W1/E1/X2/Xn Interface
  • WG6: The Cloudification and Orchestration
  • WG7: The White-box Hardware Workgroup
  • WG8: Stack Reference Design
  • WG9: Open X-haul Transport.
  • Standard Development Focus Group (SDFG): Strategizes standardization effort. Coordinates and liaises with other standard organizations.
  • Test & Integration Focus Group (TIGF): Defines test and integration specifications across workgroups.
  • Open Source Focus Group (OSFG): Successfully established O-RAN SC to bring developer in the Open RAN ecosystem

Early Playground of Open RAN

Changes in the world economy driven by geopolitical factors, such as the drive to replace Chinese vendors in networks (as in Australia, for national security reasons), have naturally shifted momentum towards finding systems that are both less costly and high-performance. Naturally, these are among the prime networks where Open RAN will first be introduced. It is true that there are still some gaps in Open RAN performance, mainly in baseband processing and fronthaul, but there are use cases in which Open RAN has already proved successful, as shown below. The key point is that, although some issues remain, with some use cases ready it is important to introduce Open RAN now and evolve it in a pragmatic way, ensuring it can coexist with traditional RAN solutions:

  • Private 5G Networks
  • Rural Deployment e.g 8T8R FDD
  • In building solutions
  • Macro sites

TIP Open RAN Project Progress

TIP, a.k.a. the Telecom Infra Project, is an open-source project looking after a number of disruptive solutions that can make 5G networks both cost-efficient and innovative. Below are some highlights of our progress in the community up to 2021.

A1: Built the Reference architecture for Multi vendor validations

Through the support of operators, vendors and partners, we built a complete reference architecture to test and validate the complete stack end to end, including SI integration.

A2: Defined the RFX requirements for Open RAN

Worked to define the complete RFX requirements for Open RAN; they can be retrieved below:

TIP OpenRAN_OpenRAN Technical Requirements_FINAL

A3: Use cases of Open RAN success

In 2020 and the better part of 2021, the community worked tirelessly to couple Open RAN with some exciting use cases to capitalize on the low-hanging fruits of Open RAN, as follows:

  1. “Context Based Dynamic Handover Management for V2X”
  2. “QoE Optimization”
  3. “Flight Path Based Dynamic UAV Resource Allocation”
  4. “Traffic Steering”
  5. “Massive MIMO Optimization”
  6. “Radio Resource Allocation for UAV Applications”
  7. “QoS Based Resource Optimization”

O-RAN+Use+Cases+and+Deployment+Scenarios+Whitepaper+February+2020

A4: Success through the RIC (RAN Intelligent Controller)

There are two-fold advantages to RIC introduction in Open RAN architectures: first, RAN automation for both management and KPI optimization; and second, bringing new and disruptive use cases through xApps and data-driven operations, including

  1. Smart troubleshooting of RAN
  2. RAN parameter optimization using AI
  3. Capacity prediction
  4. New use cases by exposing APIs to 3rd-party developers

A5: RAN configuration standardizations

We benchmarked a number of RAN configurations to be deployed in the field, including

  1. Small cells mainly for SME and in building
  2. Low-capacity Macro (Rural)
  3. High-capacity Macro (Highways)
  4. RAN parameter optimization using AI

I highly encourage you to join TIP to learn more:

https://member.telecominfraproject.com/

A6: Chip advancements brought prime time for Open RAN with 64T64R trial in 2021

Customers rely on 5G massive multiple-input, multiple-output (MIMO) to increase capacity and throughput. With Intel's newest Xeon processors, Intel Ethernet 800 series adapters and Intel vRAN dedicated accelerators, customers can double massive MIMO throughput in a similar power envelope for a best-in-class 3x100 MHz 64T64R vRAN configuration.

Challenges and Solutions

In the last two years we have come far in innovating and experimenting with many solutions around Open RAN. The list of issues we have solved is huge, so let us focus only on today's top challenges and how we are solving them. As a 2021 target: by the time the O-RAN Alliance freezes the coming releases, Dawn (June 2021) and the E release (December 2021), we as a community should be able to fix the top issues below. I apologize for leaving out some less complex topics, such as how to deploy Open RAN, the focus groups, and the status of the key interface specifications, especially the momentum around 7-2. I fear I have run out of time, pages and energy, and I would test your patience with a bigger tome.

P#1: Ecosystem issues in Open RAN

Based on our field trial, we found that for 8T8R we can achieve around 40% cost reduction with Open RAN, and with future transport architectures like “Open ROADM” we can build the whole RAN network in an open manner and achieve great cost and efficiency advantages. However, when components come from different vendors, the non-technical factors needed to make it successful are really challenging, e.g.

  • How to convince all teams, including the RAN guys, that what we are doing is right and that we should get rid of the black boxes
  • How to get partners who are historically competitors to work together in one team
  • Making software integration teams

P#2: Radio Issues in Open RAN

The site swap scenarios in most brownfield environments require efficient antennas and radios that offer

  • 2G + 3G + 4G + 5G support
  • Massive MIMO and beamforming support
  • Low cost

It is a fact that until now most of the effort has gone into the DU/CU part, but we now need more attention on solving the radio and antenna issues.

The lesson we learnt in 2020-2021 is that not everything can be solved by software, as ultimately every piece of software needs to run on hardware. An increased focus on radios, antennas and COTS hardware is a must to accelerate any software innovation.

P#3: Improve RAN KPI

No disruptive solution like Open RAN is acceptable unless it can deliver performance comparable to legacy systems in coverage, speed and KPIs.

To make Open RAN mainstream, DT, tools and RAN benchmarking all need to be addressed, and not only the cloud and automation part.

P#4: SI and certification process

We have already witnessed a number of SI firms and capabilities during NFV PSI; however, a disruptive solution like Open RAN needs a different approach, and the SI should possess the following:

  • Complete vertical stack validation

It is not just the cloud or the hardware but the end-to-end working solution that is required.

  • Stack should include Radios and Hardware

Certification should consider RF/radio and hardware validation

  • Software capability and automation

To make it successful it is very important that the SI is rich in both tools and capabilities across automation, data and AI.

Source: mavenir

P#5: Impacts of Telco PaaS and ONAP

To make Open RAN a real success, it is very important to consider it while building the capabilities and specifications of other reference architectures, most notably Telco PaaS and ONAP. If I were to explain this part in full, I fear the paper would become too long and might skew towards something that is not necessarily a RAN issue.

However, just to summarize, the ONAP community has been working closely with Open RAN to bring the reference architecture into an upcoming release; see some of the agreed WIs below:

https://wiki.onap.org/display/DW/5G+Use+Case+Meeting+notes+for+May+to+Dec+2020

Finally, for Telco PaaS we are also working to include telemetry, packaging and test requirements for the Open RAN stack. Those interested in these details can check my earlier paper below.

Open RAN a necessity for 5G era

Early experience with 5G proved that it is about scale and agility, with cost factors driving operators towards an efficient delivery model that is agile, innovative and that can unleash the true potential of network through Data and AI.

In addition, as time passes and more and more use cases require integration of the RAN with 3rd-party xApps, there will be a definite need to evolve to an architecture like Open RAN that not only supports coexistence and integration with legacy systems but also supports fast innovation and flexibility over time. With early successful deployments of Open RAN already happening in APAC and the US, it is important for the whole industry to catch the momentum.

Those who are proponents of closed RAN systems often say that an Open system can never compare with monolithic and ASIC based systems performance, similarly they claim the SI and integration challenge to stitch those systems far outweigh any other advantage.

The recent advances in silicon, like the Intel FlexRAN architecture with ready libraries such as OpenNESS and OpenVINO, have really made it possible to achieve almost the same performance as monolithic RAN systems.

Above all, the latest launch of Intel's 3rd Generation Xeon processors is proving to be a game changer in bringing COTS to the last-mile sites.

Above all, the involvement of SIs in the ecosystem means the industry is approaching a phase where all integration can be done through open APIs, in time making the true vision of a Level-4 autonomous network possible.

Term: Description
DU: Distributed Unit
CU: Centralized Unit
CP & UP: Control Plane and User Plane
A&AI: Active and Available Inventory
CLAMP: Control Loop Automation Management Platform
NFVI: Network Function Virtualization Infrastructure
SDN: Software Defined Networking
VLAN: Virtual LAN
L2TP: Layer 2 Tunneling Protocol
SBI: Service Based Interface
NRF: Network Repository Function
NEF: Network Exposure Function
NAT: Network Address Translation
LB: Load Balancer
HA: High Availability
PaaS: Platform as a Service
ENI: Experiential Networked Intelligence
ZSM: Zero-touch Service Management
EFK: Elasticsearch, Fluentd and Kibana
API: Application Programming Interface

Evaluating Gaps and Solutions to build Open 5G Platforms and Capabilities

Since the release of the much-awaited 3GPP Release-16 in June last year, many vendors have proliferated their products and brought their 5G SA (a.k.a. Standalone) products to market. With promises like support for slicing, massive IoT, uRLLC, improved edge capability, NPN and IAB backhauling, it is only natural that all the big telcos in APAC and globally have already started their journey towards a 5G Standalone core. However, most commercial deployments are based on a single vendor's end-to-end stack, which is a good way to start the journey and offer services quickly. But given the type of services and the versatility of solutions required and expected from 3GPP Release-16 and the SA core, especially for industry verticals, it is only a matter of time before one vendor cannot fulfill all the solutions, and that is when a telco-grade cloud platform will become a necessity.

During the last two years we have done a lot of work and made progress in better understanding what the cloud platforms for the 5G era will be. It is true that, as of now, the 5G core container platform is not fully ready from an open-cloud perspective, but we are not too far from making it happen. The community's Anuket Kali release, which we are targeting in June, is expected to fill many gaps, and our release cycle for XGVela will try to close many more. So, in a nutshell, 2021 is the year in which we expect production-ready open cloud platforms that avoid all sorts of vendor lock-in.

Let us try to understand the top issues, enlisted from 5G SA deployments in the core and the edge. Vendors are mostly leveraging the existing NFVI to evolve to CaaS by using a middle layer, shown as CaaS-on-IaaS. The biggest challenge is that this interface is not open, which means there are many out-of-the-box enhancements done by each vendor; this is one classic case of “when open became closed”.

https://cntt-n.github.io/CNTT/doc/ref_model/chapters/chapter04.html

The most common enhancements done on the adaptors for container platforms are as follows:

  • Provides container orchestration, deployment, and scheduling capabilities.
  • Provides container telco enhancement capabilities: hugepage memory, shared memory, DPDK, CPU core binding, and isolation.
  • Supports container network capabilities: SR-IOV+DPDK and multiple network planes.
  • Supports the IP SAN storage capability of VM-based containers.
  1. The migration path from CaaS-on-IaaS towards BM CaaS is not smooth and will involve a complete service redeployment. It is true that, with most operators having invested heavily in the last few years to productionize the NFVI, nobody is really considering emptying their pockets again to build a purely new, stand-alone CaaS platform; however, a smooth migration path must be considered.
  2. We are still in the early phase of the 5G SA core, and eMBB is the only use case, so we have not yet tested the scaling of the 5G core on NFVI-based platforms.
  3. The ETSI specs for the CISM are not as mature as expected, and again there are a lot of out-of-the-box customizations done by each vendor's VNFM to cater for this.
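The hugepage and CPU-pinning enhancements listed above map fairly directly onto upstream Kubernetes primitives. A minimal sketch follows, assuming the node has 1 GiB hugepages pre-allocated and the kubelet runs the static CPU manager policy; the pod and image names are illustrative:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: dpdk-fastpath                        # hypothetical workload
spec:
  containers:
  - name: fast-path
    image: registry.example.com/dpdk-app:1.0 # placeholder image
    resources:
      requests:
        cpu: "4"               # integer CPUs + Guaranteed QoS => exclusive cores
        memory: 2Gi
        hugepages-1Gi: 2Gi     # requires 1 GiB hugepages pre-allocated on the node
      limits:
        cpu: "4"
        memory: 2Gi
        hugepages-1Gi: 2Gi
    volumeMounts:
    - name: hugepages
      mountPath: /dev/hugepages
  volumes:
  - name: hugepages
    emptyDir:
      medium: HugePages
```

Because requests equal limits and the CPU count is an integer, the pod lands in the Guaranteed QoS class, and the static CPU manager pins it to dedicated cores.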

Now let us come to the points where the open platforms are lacking and how we intend to fix them.

Experience #1: 5G Outgoing traffic from PoD

Traditional Kubernetes and CaaS platforms today handle and scale well with an ingress controller; however, the outgoing traffic of 5G pods and containers is not well addressed, as both N-S and E-W traffic follow the same path, and this ultimately becomes a scaling issue.

We know some vendors, like Ericsson, already bring products like ECFE and LB into their architectures to address these requirements.

Experience#2: Support for non-IP protocols

A pod natively comes with an IP, and all external communication is done via cluster IPs; this means the architecture is not designed for non-IP protocols and L2 constructs like VLANs, L2TP and VLAN trunking.

Experience#3: High performance workloads

Today all high data throughputs are supported by CNI plugins that are natively pass-through, like SR-IOV. An operator framework to enhance real-time processing is required, something like what we have done with DPDK in the OpenStack world.
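For reference, a secondary SR-IOV interface is typically wired up through a NetworkAttachmentDefinition handled by Multus and the sriov-cni. The sketch below is illustrative only; the resource name depends on how the SR-IOV device plugin is configured in your cluster:

```yaml
apiVersion: k8s.cni.cncf.io/v1
kind: NetworkAttachmentDefinition
metadata:
  name: sriov-net
  annotations:
    # Ties the attachment to VFs advertised by the SR-IOV device plugin;
    # the resource name below is an assumption, taken from a typical config.
    k8s.v1.cni.cncf.io/resourceName: intel.com/sriov_netdevice
spec:
  config: '{
      "cniVersion": "0.3.1",
      "type": "sriov",
      "vlan": 100,
      "ipam": { "type": "host-local", "subnet": "192.0.2.0/24" }
    }'
```

A pod then requests the attachment by name, and the VF is passed through directly, bypassing the kernel network stack of the primary CNI.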

Experience#4: Integration of 5G SBI interfaces

The newly defined SBI interfaces are more like APIs compared to the horizontal call flows of the past; however, today all HTTP/2 API integration is based on “primary interfaces”.

It becomes a clear issue, as secondary interfaces for inter-functional-module communication are not supported.

Experience#5: Multihoming for SCTP and SIP is not supported

For hybrid node connectivity, at least towards egress and external networks, we still require SCTP links and/or SIP endpoints, which are not well supported.

Experience#6: Secondary interfaces for CNF’s

Secondary interfaces raise concerns for interoperability, monitoring and O&M. The secondary interface is a very important concept in K8s for 5G CNFs, as it is needed

  • For all telecom protocols, e.g. BGP
  • To support operator frameworks (CRDs)
  • For performance scenarios, like CNIs for SR-IOV

Today the only viable solution is NSM (Network Service Mesh), which solves both the management and monitoring issues.
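The pod-level mechanics of these secondary interfaces can be sketched as follows, assuming Multus is deployed and NetworkAttachmentDefinitions named `sriov-net` and `vlan-net` already exist (both names are illustrative):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: upf-example                         # hypothetical CNF pod
  annotations:
    # eth0 stays on the primary CNI (e.g. Calico); Multus adds the
    # listed attachments as extra interfaces inside the pod.
    k8s.v1.cni.cncf.io/networks: sriov-net, vlan-net
spec:
  containers:
  - name: app
    image: registry.example.com/upf:1.0     # placeholder image
```

Inside the pod, the extra attachments appear as additional interfaces (net1, net2), which is where BGP sessions or SR-IOV fast-path traffic would terminate.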

Experience#7: Platform Networking Issues in 5G

Today in commercial networks, most products use Multus+VLAN for internal networking and Multus+VxLAN for external networking. This requires separate planning for both underlay and overlay, and that becomes an issue for a large-scale 5G SA core network.

Similarly, top requirements for service in 5G Networks are

  • Network separation on each logical interface (e.g. VRF) and each physical sub-interface
  • Outgoing traffic from PoD
  • NAT and reverse proxy

Experience#8: Service Networking Issues in 5G

For the primary network we rely on Calico+IPIP, while for secondary networks we rely on Multus.
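For context, the Calico+IPIP primary networking mentioned above is typically configured through an IPPool resource; a minimal illustrative sketch, with an example CIDR, might look like this:

```yaml
apiVersion: projectcalico.org/v3
kind: IPPool
metadata:
  name: default-ipv4-ippool
spec:
  cidr: 10.244.0.0/16    # example pod CIDR
  ipipMode: Always       # encapsulate pod-to-pod traffic in IP-in-IP
  natOutgoing: true      # SNAT pod egress towards external networks
```

The `natOutgoing` flag is also what makes pod egress traffic leave the cluster with a node address, which ties back to the outgoing-traffic concerns raised earlier.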

Experience#9: ETSI specs specially for BM CaaS

I still believe the ETSI specs for CNFs are lacking compared to others like 3GPP, and that is enough to make an open solution move towards a closed one through adaptors and plugins, something we already experienced during SDN introduction in cloud networks. Today, rigorous updates are expected on

  • IFA038, which covers container integration in MANO
  • IFA011, which covers the VNFD with container support
  • The SOL-3 specs, updated for CIR (Container Image Registry) support

Experience#10: Duplication of features between NEF/NRF and cloud platforms

In the new 5G API ecosystem, operators look at their network as a platform, opening it to application developers. API exposure is fundamental to 5G, as it is built into the architecture natively: applications can talk back to the network and command the network to provide a better experience. However, the NEF, and similarly the NRF service registry, are functions that are also available on platforms. Today, a way is needed to share responsibility for such integrations to avoid duplication.

Reference Architectures for the Standard 5G Platform and Capabilities

Cap#1: Solving Data Integration issues   

Real AI is the next most important thing for telcos as they evolve in their automation journey from conditional automation to partial autonomy. However, making any fully functional use case will first require solving the data integration architecture, as any real product to be successful with AI in telco will require graph databases and process mining, and both rest on the assumption that all the required, valid data is there.

Cap#2: AI profiles for processing in Cloud Infra Hardware profiles    

With 5G networks relying more on robust mechanisms to ingest and use data for AI, it is very important to agree on hardware profiles powerful enough to deliver AI use cases and complete AI pipelines, all the way from flash-based storage to TensorFlow, along with analytics.

Cap#3: OSS evolution that support data integration pipeline    

To evolve to the future ENI architecture for the use of AI in telco, and the ZSM architecture for closed loops, operations should be based on a standard data integration pipeline like the one proposed in ENI-0017 (Data Integration Mechanisms).

Cap#4: Network characteristics      

A mature way to handle outgoing traffic and load balancing needs to be included in the Telco PaaS.

Cap#5: Telco PaaS     

Based on experience with NFV, it is clear that IaaS is not the telco service delivery model, and hence use cases like NFV PaaS have been under consideration since the early days of NFV. With CNF introduction, which requires much faster release cycles, it is imperative, not optional, to build a stable Telco PaaS that meets telco requirements. As of today, the direction is to divide the platform between a general PaaS, which will be part of the standard cloud platform over release iterations, and the telco-specific requirements, which will be part of the Telco PaaS.

The beauty of this architecture is that it ensures multi-vendor component selection between the two. The key characteristics to be addressed are:

Paas#1: Telco PaaS Tools

Agreement on the PaaS tools over the complete LCM; there is currently a survey running in the community to agree on this, and it is an ongoing study:

https://wiki.anuket.io/display/HOME/Joint+Anuket+and+XGVELA+PaaS+Survey

Paas#2: Telco PaaS Lawful interception

During recent integrations for NFV and CNF we still rely on application-layer LI characteristics as defined by ETSI. With the open cloud layer ensuring that the necessary LI requirements are available, it is important that the PaaS exposes this part through APIs.

Paas#3: Telco PaaS Charging Characteristics

The consumption and real-time reporting of resources is very important, as with 5G and edge we will evolve towards the hybrid cloud.

Paas#4: Telco PaaS Topology management and service discovery

A single API endpoint exposing both the topology and the services towards the application is the key requirement of the Telco PaaS.

Paas#5: Telco PaaS Security Hardening

With 5G and critical services, security hardening has become more and more important; the use of tools like Falco and service mesh is important in this platform.

Paas#6: Telco PaaS Tracing and Logging

Although monitoring is quite mature in Kubernetes and its distros, tracing and logging still need to be addressed. Today, tools like Jaeger and Kafka/EFK need to be included in the Telco PaaS.

Paas#7: Telco PaaS E2E DevOps

For IT workloads, DevOps capability is already provided by PaaS in a mature manner through both cloud and application tools, but with the enhancements required by telco workloads it is important that the end-to-end DevOps capability is ensured. Today, tools like Argo need to be considered and integrated with both the general PaaS and the Telco PaaS.

Paas#8: Packaging

Standard packages, like the VNFD, which cover both the application and PaaS layers.

Paas#9: Standardization of APIs

API standardization in the ETSI fashion is the key requirement of the NFV and telco journey, and it needs to be ensured at the PaaS layer as well. For the Telco PaaS it should cover VES, TM Forum, 3GPP, ETSI MANO etc. The community has done the following work to standardize this:

  • TMF 641/640
  • 3GPP TS28.532 /531/ 541
  • IFA029 containers in NFV
  • ETSI FEAT17 which is Telco DevOps
  • ETSI TST10 /13 for API testing and verification  

Based on these features there is an ongoing effort within the LFN XGVela community, and I hope more users, partners and vendors will join to define the future open 5G platform:

https://github.com/XGVela/XGVela/wiki/XGVela-Meeting-Logistics

Why cloud and 5G CNF architects must analyze the Docker deprecation after Kubernetes 1.20

Kubernetes is deprecating Docker as a container runtime after v1.20, which makes possible a future where all applications converge on a single image standard, OCI. Consider that, due to secondary networking, 5G telco CNFs additionally require running:

  • Service-aware protocols like SIP
  • Connection-aware protocols like SCTP multihoming
  • Regulatory requirements, especially on traffic separation
  • Load balancing
  • Network isolation
  • Network acceleration

#CNF suppliers today already prefer OCI over the #docker image format. In the long run it will obviously support portability of all applications across cloud platforms. On the negative side, the deprecation will impact our tool chains, especially in cases where we build images with Docker inside Docker; daemonless alternatives such as #kaniko, #img and, most importantly, #buildah become the way forward.

If you are an architect who wants to solve this challenge, or a developer who is a little nervous about application #LCM, kindly refer to the community blog post below:

https://kubernetes.io/blog/2020/12/08/kubernetes-1-20-release-announcement/

Here is the detailed write-up from the community for quick reference.

Kubernetes 1.20: The Raddest Release

Tuesday, December 08, 2020

Authors: Kubernetes 1.20 Release Team

We’re pleased to announce the release of Kubernetes 1.20, our third and final release of 2020! This release consists of 42 enhancements: 11 enhancements have graduated to stable, 15 enhancements are moving to beta, and 16 enhancements are entering alpha.

The 1.20 release cycle returned to its normal cadence of 11 weeks following the previous extended release cycle. This is one of the most feature dense releases in a while: the Kubernetes innovation cycle is still trending upward. This release has more alpha than stable enhancements, showing that there is still much to explore in the cloud native ecosystem.

Major Themes

Volume Snapshot Operations Goes Stable

This feature provides a standard way to trigger volume snapshot operations and allows users to incorporate snapshot operations in a portable manner on any Kubernetes environment and supported storage providers.

Additionally, these Kubernetes snapshot primitives act as basic building blocks that unlock the ability to develop advanced, enterprise-grade, storage administration features for Kubernetes, including application or cluster level backup solutions.

Note that snapshot support requires Kubernetes distributors to bundle the Snapshot controller, Snapshot CRDs, and validation webhook. A CSI driver supporting the snapshot functionality must also be deployed on the cluster.

Kubectl Debug Graduates to Beta

The kubectl alpha debug features graduates to beta in 1.20, becoming kubectl debug. The feature provides support for common debugging workflows directly from kubectl. Troubleshooting scenarios supported in this release of kubectl include:

  • Troubleshoot workloads that crash on startup by creating a copy of the pod that uses a different container image or command.
  • Troubleshoot distroless containers by adding a new container with debugging tools, either in a new copy of the pod or using an ephemeral container. (Ephemeral containers are an alpha feature that are not enabled by default.)
  • Troubleshoot on a node by creating a container running in the host namespaces and with access to the host’s filesystem.

Note that as a new built-in command, kubectl debug takes priority over any kubectl plugin named “debug”. You must rename the affected plugin.

Invocations using kubectl alpha debug are now deprecated and will be removed in a subsequent release. Update your scripts to use kubectl debug. For more information about kubectl debug, see Debugging Running Pods.

Beta: API Priority and Fairness

Introduced in 1.18, Kubernetes 1.20 now enables API Priority and Fairness (APF) by default. This allows kube-apiserver to categorize incoming requests by priority levels.

Alpha with updates: IPV4/IPV6

The IPv4/IPv6 dual stack has been reimplemented to support dual stack services based on user and community feedback. This allows both IPv4 and IPv6 service cluster IP addresses to be assigned to a single service, and also enables a service to be transitioned from single to dual IP stack and vice versa.

GA: Process PID Limiting for Stability

Process IDs (pids) are a fundamental resource on Linux hosts. It is trivial to hit the task limit without hitting any other resource limits and cause instability to a host machine.

Administrators require mechanisms to ensure that user pods cannot induce pid exhaustion that prevents host daemons (runtime, kubelet, etc) from running. In addition, it is important to ensure that pids are limited among pods in order to ensure they have limited impact to other workloads on the node. After being enabled-by-default for a year, SIG Node graduates PID Limits to GA on both SupportNodePidsLimit (node-to-pod PID isolation) and SupportPodPidsLimit (ability to limit PIDs per pod).

Alpha: Graceful node shutdown

Users and cluster administrators expect that pods will adhere to expected pod lifecycle including pod termination. Currently, when a node shuts down, pods do not follow the expected pod termination lifecycle and are not terminated gracefully which can cause issues for some workloads. The GracefulNodeShutdown feature is now in Alpha. GracefulNodeShutdown makes the kubelet aware of node system shutdowns, enabling graceful termination of pods during a system shutdown.
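Since the feature is alpha and off by default, trying it requires enabling the gate and setting the shutdown grace periods in the kubelet configuration; the durations below are illustrative:

```yaml
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
featureGates:
  GracefulNodeShutdown: true       # alpha in 1.20, disabled by default
shutdownGracePeriod: 30s           # total time the node delays shutdown
shutdownGracePeriodCriticalPods: 10s  # portion reserved for critical pods
```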

Major Changes

Dockershim Deprecation

Dockershim, the container runtime interface (CRI) shim for Docker, is being deprecated. Support for Docker is deprecated and will be removed in a future release. Docker-produced images will continue to work in your cluster with all CRI-compliant runtimes, as Docker images follow the Open Container Initiative (OCI) image specification. The Kubernetes community has written a detailed blog post about the deprecation with a dedicated FAQ page.

Exec Probe Timeout Handling

A longstanding bug regarding exec probe timeouts that may impact existing pod definitions has been fixed. Prior to this fix, the field timeoutSeconds was not respected for exec probes: probes ran indefinitely, even past their configured deadline, until a result was returned. With this change, the default value of 1 second is applied if no value is specified, and existing pod definitions may no longer be sufficient if a probe takes longer than one second. A feature gate called ExecProbeTimeout has been added with this fix so that cluster operators can revert to the previous behavior by setting it to false, but the gate will be locked and removed in subsequent releases.
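To make this concrete, a pod whose health check can take longer than the one-second default should now set timeoutSeconds explicitly; the image and command below are placeholders:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: probe-demo
spec:
  containers:
    - name: app
      image: busybox            # placeholder image
      livenessProbe:
        exec:
          command: ["sh", "-c", "my-health-check"]  # placeholder check
        timeoutSeconds: 5       # now enforced; defaults to 1 if unset
```

Operators who need the old behavior can temporarily set `--feature-gates=ExecProbeTimeout=false`, with the caveat noted above that the gate will eventually be locked and removed.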

Please review the updated documentation regarding configuring probes for more details.

Other Updates

Graduated to Stable

Notable Feature Updates

Release notes

You can check out the full details of the 1.20 release in the release notes.

Availability of release

Kubernetes 1.20 is available for download on GitHub. There are some great resources out there for getting started with Kubernetes. You can check out some interactive tutorials on the main Kubernetes site, or run a local cluster on your machine using Docker containers with kind. If you’d like to try building a cluster from scratch, check out the Kubernetes the Hard Way tutorial by Kelsey Hightower.

Release Team

This release was made possible by a very dedicated group of individuals, who came together as a team in the midst of a lot of things happening out in the world. A huge thank you to the release lead Jeremy Rickard, and to everyone else on the release team for supporting each other, and working so hard to deliver the 1.20 release for the community.

Release Logo

Kubernetes 1.20 Release Logo

raddest: adjective, Slang. excellent; wonderful; cool:

The Kubernetes 1.20 Release has been the raddest release yet.

2020 has been a challenging year for many of us, but Kubernetes contributors have delivered a record-breaking number of enhancements in this release. That is a great accomplishment, so the release lead wanted to end the year with a little bit of levity and pay homage to Kubernetes 1.14 – Caturnetes with a “rad” cat named Humphrey.

Humphrey is the release lead’s cat and has a permanent blep. Rad was pretty common slang in the 1990s in the United States, and so were laser backgrounds. Humphrey in a 1990s-style school picture felt like a fun way to end the year. Hopefully, Humphrey and his blep bring you a little joy at the end of 2020!

The release logo was created by Henry Hsu – @robotdancebattle.

User Highlights

Project Velocity

The CNCF K8s DevStats project aggregates a number of interesting data points related to the velocity of Kubernetes and various sub-projects. This includes everything from individual contributions to the number of companies that are contributing, and is a neat illustration of the depth and breadth of effort that goes into evolving this ecosystem.

In the v1.20 release cycle, which ran for 11 weeks (September 25 to December 9), we saw contributions from 967 companies and 1335 individuals (44 of whom made their first Kubernetes contribution) from 26 countries.

Ecosystem Updates

  • KubeCon North America just wrapped up three weeks ago, the second such event to be virtual! All talks are now available on-demand for anyone still needing to catch up!
  • In June, the Kubernetes community formed a new working group as a direct response to the Black Lives Matter protests occurring across America. WG Naming’s goal is to remove harmful and unclear language in the Kubernetes project as completely as possible and to do so in a way that is portable to other CNCF projects. A great introductory talk on this important work and how it is conducted was given at KubeCon 2020 North America, and the initial impact of this labor can actually be seen in the v1.20 release.
  • Previously announced this summer, the Certified Kubernetes Security Specialist (CKS) certification was released during KubeCon NA for immediate scheduling! Following the model of CKA and CKAD, the CKS is a performance-based exam focused on security-themed competencies and domains. This exam is targeted at current CKA holders, particularly those who want to round out their baseline knowledge in securing cloud workloads (which is all of us, right?).

Event Updates

KubeCon + CloudNativeCon Europe 2021 will take place May 4 – 7, 2021! Registration will open on January 11. You can find more information about the conference here. Remember that the CFP closes on Sunday, December 13, 11:59pm PST!

Upcoming release webinar

Stay tuned for the upcoming release webinar happening this January.

Get Involved

If you’re interested in contributing to the Kubernetes community, Special Interest Groups (SIGs) are a great starting point. Many of them may align with your interests! If there are things you’d like to share with the community, you can join the weekly community meeting, or use any of the following channels:

Developing Edge Solutions for Telcos and Enterprise

According to the latest market research, most 5G edge use cases will be realized in the next 12–24 months; however, the time for telcos to act is now if they are to stand a chance. The reason is clear: this is enough time for hyperscalers to cannibalize the market, something we already witnessed with OTTs in 3G and with VoD and content streaming in 4G.

Below are my thoughts on:

  • How the Edge is defined
  • What differentiates the Edge
  • Why telcos should care about it
  • Why software architecture is so vital to Telco Edge success