Industry and experience-based knowledge sharing about Cloud, 5G, Edge, O-RAN, Network Disaggregation and Industry 4.0 Transformation ~ A Cloud Architect's Journey
As of Q1 2022, 5G adoption has just passed the 600M connections mark and is expected to hit 2.5B connections by 2025, which is almost 500M new connections every year. If we combine the IoT and device ecosystem, the scale gets enormously big. One of the biggest advantages, and also challenges, that comes with this once-in-a-lifetime opportunity is "scale".
Simply putting it into context, "automation" and the use of ML/AI are a must to achieve network SLAs, efficiency and an optimized network TCO.
AI has the potential to create value in terms of enhanced workload availability and improved performance and efficiency for 5G and the Telco Cloud. However, the biggest problem when it comes to using AI and machine learning in Telcos is "data" and "data models", because there is simply no standardization or model definition for how Telco systems, including the infrastructure, expose "information" to the upper layers. Since the data sets in this domain are huge, with n permutations, the first step is use-case-driven normalization of data so that it can be consumed by both the network and data-science domains. This will enable a future Telco network that can detect problems and also heal itself.
Understanding Data Integration Architecture
Considering that the 5G architecture is based on Open APIs and a horizontal services design, a.k.a. SBA, data integration and the use of AI should be a tractable problem that can be divided into the following parts to define a pipeline (see the sketch after the list):
Telemetry and data
Data exposure of each layer as an API, starting from bare metal and extending upwards towards Cloud, SDN, NFVO, Assurance, etc.
Data models and engines to disseminate information
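To make this concrete, below is a minimal Python sketch of such a pipeline, assuming hypothetical per-layer payloads and field mappings; the schema and the normalizers are illustrations of the idea, not a standard.

from dataclasses import dataclass
from typing import Callable

@dataclass
class Metric:
    layer: str        # e.g. "baremetal", "cloud", "sdn"
    resource: str     # the object the metric describes
    name: str         # normalized metric name
    value: float
    timestamp: float

def normalize_baremetal(raw: dict) -> Metric:
    # Hypothetical mapping from a vendor-specific payload.
    return Metric("baremetal", raw["node_id"], "cpu_util", raw["cpu"], raw["ts"])

def normalize_cloud(raw: dict) -> Metric:
    # A different source shape, mapped onto the same schema.
    return Metric("cloud", raw["vm"], "cpu_util", raw["usage"] * 100, raw["time"])

NORMALIZERS: dict[str, Callable[[dict], Metric]] = {
    "baremetal": normalize_baremetal,
    "cloud": normalize_cloud,
}

def ingest(layer: str, raw: dict) -> Metric:
    """Single entry point of the pipeline: whatever comes in, one schema comes out."""
    return NORMALIZERS[layer](raw)

if __name__ == "__main__":
    print(ingest("baremetal", {"node_id": "n1", "cpu": 71.0, "ts": 1650000000.0}))
    print(ingest("cloud", {"vm": "upf-0", "usage": 0.42, "time": 1650000000.0}))

The design point is that both the network and the data-science teams consume the Metric schema, while each layer keeps its own native API underneath.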
However, it is easier said than done for many reasons, including:
What will be the key data sets?
How the FCAPS of each layer can be disaggregated, i.e. dropping one layer's data without confirming dependencies is a killer
Business Architecture
In order to address this we need to understand and gain experience from other industries and SDOs, and to see how it can be agreed upon and integrated in Telco networks. This led us to a use-case-driven approach: select the domains and business challenges that can deliver quick results.
"Follow the money" to deliver use cases that can monetize 5G
We have analyzed a lot of use cases, from both academia and industry, and compiled a complete list here.
From this we infer that there are just too many ways Telcos are solving the same problems, and this makes clear that "data models" and use cases should be defined as the first step.
The most important of these are:
1. Using Machine Learning to Detect Noisy Neighbors in 5G Networks
2. Towards Black-Box Anomaly Detection in Virtual Network Functions
3. Causality Inference for Failure in NFV
4. Self Adaptive Deep Learning Based System for Anomaly Detection in 5G
5. Correlating multiple Events and Data in an Ethernet Network
This leads us to define the following as the first steps for AI and intelligence as applied to Telcos:
Source: LFN acumos
Analysis
Data lakes, log analysis and correlation
Detection
Anomaly detection, including pattern detection, trends and multi-layer correlation (see the sketch after this list)
Prediction
Intelligent prediction, including capacity, SLA, scaling and Cloud KPIs
Generation
Measure data and synthesize it using frameworks like eBPF
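As one concrete example of the Detection step, here is a minimal Python sketch that flags anomalies in a KPI time series with a rolling z-score; the window size and threshold are illustrative assumptions, and a production system would learn them from data.

from collections import deque
from statistics import mean, stdev

def detect_anomalies(samples, window=12, threshold=3.0):
    # Keep a rolling window of recent KPI values and flag points that
    # deviate from the window mean by more than `threshold` std devs.
    history = deque(maxlen=window)
    anomalies = []
    for i, value in enumerate(samples):
        if len(history) == window:
            mu, sigma = mean(history), stdev(history)
            if sigma > 0 and abs(value - mu) / sigma > threshold:
                anomalies.append((i, value))
        history.append(value)
    return anomalies

# Example: a throughput KPI with one obvious spike.
kpi = [100, 101, 99, 102, 100, 98, 101, 100, 99, 102, 101, 100, 250, 100]
print(detect_anomalies(kpi))   # -> [(12, 250)]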
Data monetization comes first to make 5G profitable
Addressing both the data architecture and the business architecture is vital, as different Telcos, and in some cases different BUs within the same customer, take it differently. What makes it worse is that they manipulate and store data lakes in different forms, i.e. infrastructure metrics, agents, databases, which makes it hard to work across the different data sets. This is therefore the biggest obstacle to monetizing one of the key assets of 5G, which is "data", and to defining a pipeline that can be shared among all tenants, including vertical industry.
As said before, we are defining a few key use cases in the LFN project "Thoth" to learn and elaborate from there, applying the concepts of Events, Anomaly and Prediction across layers; the first-phase use cases are:
Growing business in the 5G era largely depends on ecosystem enablement and on the idea that Telcos can build a future infrastructure that delivers business outcomes not only for traditional Telco customers but also for broader verticals, be it Manufacturing, Mining, Retail, Finance, Public Safety, Tourism or eventually anything.
This means a "programmable" and "automated" infrastructure is the base for achieving any such business outcome. Applied to Telcos' 5G and transformation journey, this means both "Network Slicing" and "Private Networks", and although I totally agree with the idea that both will co-exist and proliferate, it is fair to say that while network slicing has delivered outcomes in labs and demos, it still has a number of challenges when applied in practice.
Recently I have heard many views from reputable names, including my friends Dean Bubley and Karim, so I thought to share my views on this topic, highlighting some work we have been doing with our partners and customers in APAC as well as in the GSMA, and to allude to some improvements we have achieved in the last year since I shared my views with the industry on network slicing and its delivery.
Today network slicing is live at a number of customers, including Singtel, which has achieved substantial outcomes with slicing; however, the bigger challenges still remain unanswered:
How will network slicing address RAN resources?
How can network slicing help monetize the low-hanging fruit of Edge together with Telco-domain slicing?
In 3GPP Release 17 we have been making exciting progress on the latter, with a new architecture and API exposure for co-deployment of Edge with RAN. But prior to this we need to extend slicing towards the access networks, from both a technology and a business architecture perspective, and that is what I will share in this paper.
Experience learnt from 5G network rollouts
As many Telcos accelerated their 5G rollouts in 2021 and beyond and built 5G SA Core networks, a few things were proven more clearly than before:
RAN resources are always scarce
AI needs to be enabled to intelligently modify slicing in real time
Spectrum and RAN layers will be the top pressing areas for Telcos to deliver value
RAN resource isolation must follow a performance-to-cost baseline
How to handle resources at peak times and pre-empt some tenants over others is vital
Regulatory compliance and GDPR are vital to achieve anything big in this domain
Orchestration must precede network slicing
From the above experience we can infer that it is really not about network slicing per se, but rather about "Open", "Control" and "API" to enable end-to-end network-slicing LCM and orchestration all the way from UE to RAN to Edge to Core to Cloud
Dynamic control of resources with Telco-level visibility is key
RAN automation is the first step, to be done before slicing changes RAN resources
A cloud operations model is vital to support network slicing: although there are many business verticals, Telcos really have to build an efficient and multi-tenant operational model to win them
A cloud operations model that is secure and multi-tenant must be enabled across all telecom infrastructure
RAN SLAs for vertical industry
The notion of network slicing still lies in selling an SLA vs. selling a network.
First of all, RAN resources are always limited; secondly, each vertical industry has its own traffic profile and trajectory, which can never be planned using old telecom simulation tools. This means dynamic learning and resource adjustment are key. It all alludes to the fact that changing the network while ensuring network KQIs remain intact requires full visibility and programmable control.
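To illustrate, here is a minimal sketch of such a closed loop in Python; the KQI getter and the PRB-share setter stand in for real RIC/SMO control interfaces and are assumptions for illustration, not a defined API.

def closed_loop_step(slice_id, get_kqi, set_prb_share, current_share,
                     sla_target=50.0, step=0.05, max_share=0.6):
    # Observe a KQI (here: latency in ms), compare against the tenant SLA,
    # and nudge the slice's radio resource share via a programmable API.
    latency_ms = get_kqi(slice_id)
    if latency_ms > sla_target:               # SLA at risk: grow the share
        current_share = min(max_share, current_share + step)
    elif latency_ms < 0.7 * sla_target:       # comfortably inside SLA: shrink
        current_share = max(step, current_share - step)
    set_prb_share(slice_id, current_share)    # programmable control point
    return current_share

# Toy run with stubbed telemetry and control.
share = 0.2
for reading in [62.0, 58.0, 40.0, 30.0]:
    share = closed_loop_step("slice-ent-1", lambda s: reading,
                             lambda s, v: None, share)
    print(f"KQI={reading}ms -> PRB share={share:.2f}")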
This leads us to consider the following architecture before slicing is fully enabled for the RAN:
Slice LCM must be supported by automated infrastructure that is elastic and Telco grade at the same time
A scale-out architecture must be enabled in the RAN
RF and spectrum resource scheduling is the most expensive and intricate resource for services, and we must enable its dynamic control
Intelligent Networking ML/AI must be enabled first
Automation can deliver a myriad of outcomes, including better control, real-time changes, optimization, compliance and FCAPS for each tenant; however, it is not sufficient.
Intent-driven networks that use the power of data, ML and AI to orchestrate and adapt the network are a capability that should be enabled at network scale before network slicing can deliver a business outcome.
Components of a RAN Slice
Although the core slicing capability still lives in the OSS and SMO layers outside the RAN, the real power of slicing will come as we address the real-time capability of RAN slicing, which enables us to deliver the following for a business tenant:
RRM (Radio Resource Management)
Connection management
MM (Mobility Management)
Spectrum layers
All of this must be available to be packaged as an NSSF functional instance, as alluded to below.
Partnerships and Ecosystem
According to the latest GSMA report, enabling a single use case for any vertical requires at least 7 players to work together, so RAN slicing, or in other words slicing business outcomes, is not a matter for one body or business to solve. Today the following key organizations are collaborating to incrementally deliver the business outcomes and address those challenges:
ETSI NFV
GSMA
MEF
IETF
O-RAN and TIP
ONAP
5GAA , ZVEI etc
We are also taking an aggregation approach where we sum up all the knowledge from these bodies and deliver it as an outcome for our customers. You can reach out for more details.
As CSPs continue to evolve into digital players, they face new challenges: size and traffic requirements are increasing at an exponential pace, and networks that previously served only telco workloads are now required to be open to a range of business, industrial and services verticals. These factors necessitate that CSPs revamp their operations model to be digital, automated, efficient and, above all, services driven. Similarly, future operations should support innovation rather than relying on offerings from existing vendor operations models, tools and capabilities.
As CSPs will be required to operate and manage both the legacy and the new digital platforms during the migration phase, it is also imperative that operations have a clear transition strategy and processes that can meet both PNF and VNF service requirements with optimum synergy where possible.
Based on work done by our team with our customers, specifically in APAC, a future network should address the following challenges in its operations transformation:
Fault Management: Fault management in the digital era is more complex, as there is no dedicated infrastructure per application. The question therefore arises how to demarcate faults and correlate cross-layer faults to improve O&M troubleshooting of ICT services (see the correlation sketch after this list).
Service Assurance: The future operations model needs to be digital in nature, with minimum manual intervention, fully aligned with ZTM (Zero Touch Management) and data driven using principles of closed-loop feedback control.
Competency: To match the operational requirements of future digital networks, the skills of engineers and designers will play a pivotal role in defining and evolving to future networks. The new roles will require network engineers to be technologists rather than technicians.
IT Technology: IT skills, including data centers, cloud and shared resources, will be vital to operate the network. Operations teams need to understand the impacts of scaling, elasticity and network healing on operational services.
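As a sketch of the cross-layer correlation idea from the Fault Management item above, the following Python groups alarms that hit the same site within a short time window, so one infrastructure fault is not worked as many independent tickets; the field names and the 30-second window are assumptions for illustration.

from itertools import groupby

alarms = [
    {"ts": 100.0, "layer": "baremetal", "site": "SITE-A", "text": "fan failure"},
    {"ts": 101.5, "layer": "cloud",     "site": "SITE-A", "text": "host down"},
    {"ts": 103.0, "layer": "vnf",       "site": "SITE-A", "text": "UPF pod restart"},
    {"ts": 400.0, "layer": "sdn",       "site": "SITE-B", "text": "link flap"},
]

def correlate(alarms, window=30.0):
    """Return incident groups: alarms at one site within `window` seconds."""
    incidents = []
    ordered = sorted(alarms, key=lambda a: (a["site"], a["ts"]))
    for site, group in groupby(ordered, key=lambda a: a["site"]):
        current = []
        for alarm in group:
            # Start a new incident when the time gap exceeds the window.
            if current and alarm["ts"] - current[-1]["ts"] > window:
                incidents.append(current)
                current = []
            current.append(alarm)
        if current:
            incidents.append(current)
    return incidents

for incident in correlate(alarms):
    root = incident[0]   # earliest alarm taken as the probable root here
    print(f"{root['site']}: probable root '{root['text']}', {len(incident)} alarms")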
TM Forum ODA, a.k.a. Open Digital Architecture, is a perfect place to start, but since it is just an architecture it can lead to different implementation and application architectures, so below I will try to share how it is being applied in real brown-field networks. I will cover all modules except AI and intelligent management, which I shall discuss in a separate paper.
Lack of automation in legacy telco networks is an important pain point that needs to be addressed promptly in future networks. Addressing it will not only enable CSPs to avoid the toil of repetitive tasks but also allow them to reduce the risk of man-made mistakes.
In order to address the challenges highlighted above, it is vital to develop an agile operations model that improves customer experience, optimizes CAPEX, enables AI operations and drives business process transformation.
Such a strategic vision will be built on an agile operations model that can fulfill the following:
Efficiency and Intelligent Operation: Telecom efficiency is based on data-driven architectures that use AI to provide actionable information and automation, self-healing network capability and network automation, along the following maturity steps:
Task automation & established foundation
Proactive operation & advanced operation
Machine-managed & intelligent operation
Service Assurance: Building a service assurance framework to achieve automated Network Surveillance, Service Quality Monitoring, Fault Management, Preventive Management and Performance Management, ensuring closed-loop feedback control for the delivery of zero-touch incident handling.
Operations Support: Building a support framework to achieve automated operation acceptance, change & configuration management.
Based on the field experience we gained with our partners and customers through telecom transformation, we can summarize the learnings as follows:
People transformation: Transforming teams and the workforce to match the DevOps concept, streamlining the organization and hence delivering services in an agile and efficient manner. This is vital because 5G, Cloud and DevOps are a journey of experience, not a deployment of solutions; start quickly to embark on the digital journey.
Business process transformation: Working together with partners on the unification, simplification and digitization of end-to-end processes. The new processes will enable Telcos to quickly adapt the network to offer new products and to reduce troubleshooting time.
Infrastructure transformation: Running services on digital platforms and cloud, matched to a clear vision to swap out the legacy infrastructure.
If PNF to VNF/CNF migration is vital, then hybrid network management is critical.
Automation and tools: Operations automation using tools like workforce management, ticket management, etc. is vital but does not by itself fulfill the vision of full automation. The migration of services to cloud will enable automated delivery of services across the whole life cycle. Programming teams should join operations to start a journey where the network can be managed through the power of software, like Python, Java, Golang and YANG models. It will also enable test automation, a vision that will let operations teams validate any change before applying it to the live network; a sketch follows.
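As an example of validating a change before it touches the live network, here is a minimal sketch using the Python ncclient library and the standard NETCONF candidate/validate/commit flow; the host, credentials and config snippet are placeholders, and the device must support the candidate datastore.

from ncclient import manager

CONFIG = """
<config xmlns="urn:ietf:params:xml:ns:netconf:base:1.0">
  <interfaces xmlns="urn:ietf:params:xml:ns:yang:ietf-interfaces">
    <interface>
      <name>eth0</name>
      <description>managed-by-automation</description>
    </interface>
  </interfaces>
</config>
"""

with manager.connect(host="192.0.2.10", port=830, username="automation",
                     password="secret", hostkey_verify=False) as m:
    m.edit_config(target="candidate", config=CONFIG)   # stage the change
    try:
        m.validate(source="candidate")                 # device-side validation
        m.commit()                                     # only now touch live
    except Exception:
        m.discard_changes()                            # roll back the candidate
        raise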
Having said this, I hope this serves as a high-level guide for architects addressing operational transformation. As we can see, AI and intelligent management is a vital piece of it, and I shall write on this soon.
During the last year the industry has witnessed Telcos' increased spend and maturity in cloud and automation platforms. During the pandemic it was proven that Digital and Cloud are the answer our customers require to design, build and operate future telecom networks.
The second key pillar pushing the telecom industry towards autonomous networks is the need to deliver business outcomes while doing business responsibly.
Achieving business outcomes and running a sustainable business that supports the green vision used to be unrelated discussions in the telecom industry.
But now the infusion of data and cloud is really enabling it. It is expected that we as an industry can cut down at least 50% of power emissions in the coming decade, but how will that become possible? According to the Pareto principle, the last 20% will be the most difficult.
This is where my team's main focus has been: building robust AI and automation use cases that are intelligent enough to solve the broader issues. Today the biggest ML/AI focus areas that can really put Telcos in the lead on such outcomes are:
Smart Capacity management
O&M of networks that reduces emissions and improves availability
Service assurance based on data
The biggest Challenge in Transformation is Fragmentation
The biggest bottleneck in achieving such outcomes is related to data. Intricately, "data" is both the problem and the solution, because of the many sources of truth and the different ingestion mechanisms. Do check the details on the #Dell Streaming Data Platform and how we are solving this problem.
Today, under the umbrella of Anuket, 3GPP, TMF and ITU are all collaborating to come up with a validated and composite solution to deliver those use cases. So, in a nutshell, it is vital to build a holistic and unified view to deliver data-driven AI use cases.
Scope and Scale of Intelligent automation
The biggest bottleneck comes from the fact that in the real world Telco apps can never be fully cloud native; at some level both the state and resiliency requirements and the app requirements have to be kept, and intelligent, workload-driven decisions have to be made. The decade-long journey of telecom transformation has revealed that just building everything as code, expecting it to work and expecting Telcos to shrink their NOC sizes simply does not work.
This is where intelligence in the layers above orchestration and SDN will help, like Google has done in the Internet era.
The second biggest issue is scalable Telco solutions themselves; it is proven that Telcos face unique challenges as they move from hundreds to thousands of nodes. Imagine running AI across heterogeneous environments, each yielding different outcomes: that can never deliver the power and scale Telcos need in the new era.
Telco grade AIOPS models
It is true that with 5G and business digital transformation the industry really wants to ramp up to build an improved user experience and a unified model to expand the portfolio towards vertical markets as well. This is only possible if we have a coordinated system, workflow management, and data sharing and exposure with strict TSR security measures. Similarly, this model should cover the full LCM, including the FCAPS model.
Building Intelligent Telco’s
Although using AI and ML is an exciting ambition for a Telco, the bottom line is still how to build these platforms on top of the NFVI and the existing orchestration and automation frameworks. In other words, the real business case for building an intelligent network starts with using data and ML to automate the entire network. The scope can extend not just to the service domain but also to the business domains, i.e. automating business processes including event correlation, anomaly detection and RCA.
Building a Unified AI Platform
Although this intention or target is clear, in the context of networks it is complex: we need to solve the challenges of data security and regulation, as well as what it really means to certify an AI platform. Focus should be given to allowing this layer to be built from solutions from many vendors, so loose coupling, with more focus on the network service and the AI algorithms, is key to building this platform.
Instead of focusing on network element certification, the focus of the AI platform is service-level compatibility, data models and AI algorithms.
However, the lack of unified standards, especially on trusted data normalization, sharing and exposure, is certainly forcing operators to adopt bespoke solutions to build AI platforms, and that itself is a big impediment to the wide-scale adoption of AI and ML in networks.
To move forward, closer collaboration between the different standards bodies, and governance by a more Telco-centric organization like TMF, is the answer, with immediate focus given to data standardization, labs integration, and enabling shared data sets and algorithms to evolve and support wide deployments of ML and AI in telecom networks.
Latest Industry progress and standardization
Although it is early days for AI platform standardization, we need to aggregate the progress across the different bodies, lest we end up with a plethora of silo solutions, each with different specifications:
ONAP, as the baseline automation platform, has components like DCAE and an AI engine, which makes it sensible to treat it as the de facto baseline standard
Anuket is the cloud infrastructure reference, and it has recently launched a new project, "Thoth", to look into AI network standardization
ETSI ZSM is an E2E automation platform across the full LCM of a telecom network and certainly an important direction
ETSI ENI, or Experiential Networked Intelligence, is another body that closely defines AI specifications in the context of telecom
TMF, as a broader telecom body, is defining architectures including ODA and AIOps that really break down how a Telco can take a phased approach to building these platforms
Above all, early involvement and support from telecom operators and partners is very important to realize this goal. I hope this year we will see more success and standardization on these initiatives, so let's work together and stay tuned.
Cloud computing has been increasingly adopted in the telecom industry during the last decade. It has taken many shapes and architectures since the first phase of NFV, which started back in 2012. In today's data-hungry world there is an increasing demand to move cloud architectures from central clouds to loosely coupled distributed clouds, both to make sense from a cost perspective, by slashing the transport cost of anchoring all user traffic back to central data centers, and certainly from a security perspective, where major customers prefer to keep data on premises. Similarly, with the hyperscalers and public cloud providers targeting the telco industry, it is evident that the future cloud will be fully distributed and multi-cloud, constituted by many on-premise and public cloud offerings.
Since 5G is by design based on cloud concepts like:
Service-based architectures
Microservices
Scalability
Automation
it is evident that many operators are embarking on a journey to build open and scalable 5G clouds that are capable of handling the future business requirements from both Telco and industry verticals. The purpose of this paper is to highlight the key characteristics of such clouds and how we must collaborate with the rich ecosystem to make 5G a success and achieve Industry 4.0 targets.
Cloud Native Infrastructure for 5G Core and Edge
Cloud native does not refer to a particular technology but to a set of principles that ensure future applications are fully de-coupled from the infrastructure; at the atomic level this can be a VM or a container, or maybe futuristic serverless and unikernels. As of today, the only community-accepted cloud-native standard for 5G and Cloud is an OCI-compliant infrastructure. In general, cloud native for Telco means a telecom application, as per 3GPP, IETF and related standards, that meets the criteria of the cloud-native principles shared in this paper, supports the vision of immutable infrastructure, and is declarative with native DevSecOps for the whole infrastructure.
Cloud native is the industry de facto way to develop and deliver applications in the cloud, and since 5G is by design service based and microservice enabled, the basic principle for 5G infrastructure is cloud native, which supports scalability, portability, openness and, most importantly, the flexibility to on-board a wide variety of applications.
As per the latest industry studies, the data in the 5G era will quadruple every year. This will make cloud native a necessity to provision infrastructures that are fully automated, support common SDKs and, above all, enable CI/CD across the full application life cycle.
Scalability to deploy services in many PoPs is the other key requirement for 5G, along with the possibility to build or tear down a service on the fly. As 5G deployments scale, so do cloud instances, and it is a necessity that the future cloud infrastructure can be scaled and managed automatically.
Application portability is the other key characteristic of the 5G cloud. As 5G use cases mature, there is an increasing requirement to deploy different applications in different clouds and to connect them in loosely coupled meshes. In addition, as network capacity and usage increase, applications must be capable of moving across the different clouds.
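A toy Python sketch of such placement logic is below; the candidate clouds and their latency and cost figures are made-up examples, and a real placer would consume live measurements rather than static numbers.

# Choose, per workload, the cloud that meets the latency bound at the
# lowest cost, and re-evaluate as measurements change.
clouds = [
    {"name": "onprem-edge",  "latency_ms": 5,  "cost_per_hour": 1.20},
    {"name": "public-west",  "latency_ms": 35, "cost_per_hour": 0.40},
    {"name": "public-east",  "latency_ms": 80, "cost_per_hour": 0.30},
]

def place(workload_latency_bound_ms):
    eligible = [c for c in clouds if c["latency_ms"] <= workload_latency_bound_ms]
    if not eligible:
        raise ValueError("no cloud meets the latency bound")
    return min(eligible, key=lambda c: c["cost_per_hour"])

print(place(10)["name"])    # UPF-like workload  -> onprem-edge
print(place(100)["name"])   # analytics workload -> public-east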
What Cloud means for Telco 5G
Telco operators, through their mission-critical infrastructure, hold a seminal place in the post-COVID-19 digital economy. The use of telecom networks directly impacts the economy, society, commerce, and law and order; this is why telecom networks are designed for higher availability, reliability and performance.
The biggest challenges for cloud-native infrastructure for Telco lie in:
Granularity of Telco App decomposition
Networking
Performance acceleration
O&M and Operational frameworks
Because Telco 5G applications need to fulfill special SLA-based performance and functions, which somehow are not possible in today's containerized, Kubernetes-based cloud platforms, we must define a Telco definition of Cloud. Similarly, how we connect workloads east-west is very important, and the questions become more pressing as we move towards the edge. The downside is that any deviation from standard cloud native means we cannot achieve the promises of scaling, performance and distribution, the very purposes for which we built these platforms.
Any tweaks to the cloud principles mean we cannot provision and manage a truly automated cloud infrastructure following DevSecOps, which is so vital to delivering continuous updates and new software code in the 5G infrastructure. Lacking such functions means we cannot meet the fast-paced innovation requirements that are necessary for the new 5G use cases, especially for the vertical markets.
The last and most important factor is leveraging advances from hyperscalers to achieve Cloud and 5G deployments. Today we already see a movement in the market where carrier-grade clouds from famous distros like IBM can be deployed on top of public clouds, but here the top question is whether abstraction will impact performance. One top reason the first NFV wave was not so disruptive is that we defined so many models, and used one model to define another model, which obviously added complexity and deployment issues. Cloud native for 5G Telco needs to address and harmonize this as well.
Applications for 5G
The application economy is vital for the success of 5G and Edge. However, T1 operators' deployments of open 5G platforms have revealed that just deploying an open infrastructure is not enough, as adherence to cloud principles by application vendors will vary. To truly take advantage of the cloud, it is vital to define principles for an infrastructure-led delivery, by devising frameworks and tools to test and benchmark 5G applications and classify them as Gold, Silver or Bronze, with the common direction of achieving fully gold-standard applications in the 5G era (a toy classification rubric follows the gap list below). Although cloud native by its principles supports the vision of a common, shared and automated infrastructure, it is easier said than done in real practice, as achieving Telco-grade conformance for Telco services is complex and requires rigorous validation and testing. Based on real open 5G cloud deployments and the corresponding CNF benchmarking, there are still certain gaps in standards that need both standardization and testing:
Application resource overcommitment
Application networking dependencies that slow scaling
Use of SDN in the 5G cloud
Lack of open telemetry, which makes a customized EMS mandatory
Hybrid management of VNFs and CNFs
Luckily there are a number of industry initiatives, like CNCF Conformance, CNTT RI2, NFV NOC and OPNFV, which fundamentally address these very issues, and we have already seen results. It is vital that 5G cloud infrastructures support easy-to-use SDKs and tools that vendors and developers can use flexibly to offer and deploy different applications in the 5G era.
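To make the Gold/Silver/Bronze idea concrete, here is a hypothetical scoring rubric in Python; the criteria echo the gap list above, but the scoring itself is an illustration, not an industry framework.

def classify_cnf(report: dict) -> str:
    # Each criterion mirrors one gap from the list above; a benchmark
    # harness would fill in this report from real test runs.
    score = sum([
        not report["overcommits_resources"],   # respects declared requests
        report["exposes_open_telemetry"],      # no bespoke EMS required
        not report["pins_networking"],         # can scale without re-plumbing
        report["supports_hybrid_mgmt"],        # manageable next to VNFs
    ])
    return {4: "Gold", 3: "Silver"}.get(score, "Bronze")

print(classify_cnf({"overcommits_resources": False,
                    "exposes_open_telemetry": True,
                    "pins_networking": False,
                    "supports_hybrid_mgmt": True}))   # -> Gold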
In the next part I shall try to elaborate how open telemetry and automation are driving the next era of growth using ML- and AI-driven solutions.
Questions like #Coexistence, #NewOnM models and #Processes reengineering are required to address the following challenges:
a. Theoretically zero downtime by building an E2E architecture, e.g. UPF pools
b. A delivery pipeline for migration through unified tools, removing the need to touch every NE one by one and improving TTM
c. Automatic service-consistency verification between Cloud and Legacy
d. Five-nines reliability and TSR gold-standard security
Customers expect #partners who offer #Operational tools and services optimized for:
1) E2E whole-solution support ownership in MVI
2) SPOC support services
3) Tools and platform services, especially on PaaS and NFVO, to co-develop
4) Cloud assurance services, especially for #business readiness, #Transition, #TaaS, #Solution emergency support and CSR
And partners must work with the #Telco to #remodel processes, starting with:
1) Incident handling for L1
2) Config management supporting multiple layers
3) Change management with focus on scaling, SDN and policy enforcement
4) Release management supporting #devops, viz. staging to prod
Recent PoCs of P4 programming models with multiple registers, alongside Intel 3rd-gen processors and Agilex (10 nm) FPGAs, gave critical indications about the future delivery model of 5G Edge infrastructure and networks by addressing the following Telco requirements.
Telco Key Edge Infrastructure Requirements
Workload placement at the edge, with the possibility to make workloads portable based on real-time use
Enhancing infrastructure models and form factors for the Edge
Evaluating P4 as a baseline to build a unified model to program compute/network and VNF/CNF for real-time use cases like traffic steering
Results with SmartNICs and Latest x86 Chips
Below are the key findings:
The hardware can deliver consistent throughput (the required 10 Gbps per core) up to 10 register pipelines, and it degrades exponentially after that, at roughly 12% for each additional pair of registers
The impact on latency is prominent as we increase registers, with 40-50% variation beyond 10 registers
The P4 architecture on Intel 3rd-generation processors can help us solve and optimize these issues dramatically
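Reading those findings as a simple model, the Python sketch below estimates per-core throughput as a function of register pipelines, assuming the reported 12% drop per extra pair of registers beyond 10; it is a planning aid based on our PoC numbers only, not a vendor formula.

def throughput_gbps(registers: int, base: float = 10.0, drop: float = 0.12) -> float:
    # Flat at the required 10 Gbps per core up to 10 register pipelines,
    # then a ~12% multiplicative drop per additional pair of registers.
    if registers <= 10:
        return base
    extra_pairs = (registers - 10) // 2
    return base * (1 - drop) ** extra_pairs

for r in (8, 10, 12, 14, 16):
    print(f"{r} registers -> {throughput_gbps(r):.2f} Gbps per core")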
Key Achievements with P4 in last year
A unified programming model from Core to Cloud to Edge was demonstrated
12 Tbps programmable chips are available for PoCs, at least in labs
P4 can build CI/CD for hardware infrastructure to ensure infra resilience
Current Challenges and Targets in 2021
Below are some gaps still expected to be addressed to ensure a Telco-grade Edge:
P4 is good in the Core, but at the Edge it needs to be improved, especially around common API models
Performance on latency is key to building Edge infrastructure
P4 is not a modeling language but a switching-model language; how to abstract it at the service level is an issue
The VNF partner ecosystem, especially drivers on the Cloud and VNF side
Can GPUs help solve multiple register pipelines?
How can P4 work with IPv4 to build use cases like slicing?
Finally, the most important need, which requires a more cohesive community effort, is telemetry and topology; until now we have only a few references, such as the one from deepinsights on P4, referred to below.
Edge deployments are gaining momentum in Australia, APAC and the rest of the markets. Due to the sheer size of the Edge, there are new challenges and opportunities for all players in the ecosystem, including:
Public cloud and hyperscalers like AWS, Azure, Google, etc.
SIs like Southtel, Tech Mahindra, etc.
However, one thing the Telco community needs to do is create a standard architecture and specifications for the Edge that will not only support building a thriving ecosystem but also achieve the promises of global scale and developer experience. Within the Open Infrastructure community we have been working in the OpenInfra Edge Computing Group to achieve exactly this.
Focus Areas
Following are the scope and the areas we are enabling today:
Defining reference architectures for the Edge delivery model, in the form of Reference Architectures, a Reference Model and a certification process, working together with #GSMA and #Anuket in the Linux Foundation
Defining use cases based on real RFx and Telco customer requirements
Prioritizing requirements for each half year
Enabling the Edge ecosystem
Publishing white papers, especially on implementation and testing frameworks
Edge Architectures
Alongside the Linux Foundation Akraino blueprints, we are enabling blueprints and best practices in the Edge user group; however, we are emphasizing that the architecture remain as vendor agnostic as possible, with different flavors and vendors solving the following challenges (Edge Computing Group – OpenStack):
Life-cycle Management. A virtual-machine/container/bare-metal manager in charge of managing machine/container lifecycle (configuration, scheduling, deployment, suspend/resume, and shutdown). (Current Projects: TK)
Image Management. An image manager in charge of template files (a.k.a. virtual-machine/container images). (Current Projects: TK)
Network Management. A network manager in charge of providing connectivity to the infrastructure: virtual networks and external access for users. (Current Projects: TK)
Storage Management. A storage manager, providing storage services to edge applications. (Current Projects: TK)
Administrative. Administrative tools, providing user interfaces to operate and use the dispersed infrastructure. (Current Projects: TK)
Storage latency. Addressing storage latency over WAN connections.
Reinforced security at the edge. Monitoring the physical and application integrity of each site, with the ability to autonomously enable corrective actions when necessary.
Resource utilization monitoring. Monitor resource utilization across all nodes simultaneously.
Orchestration tools. Manage and coordinate many edge sites and workloads, potentially leading toward a peering control plane or "self-organizing edge."
Federation of edge platforms orchestration (or cloud-of-clouds). Must be explored and introduced to the IaaS core services.
Automated edge commission/decommission operations. Includes initial software deployment and upgrades of the resource management system’s components.
Automated data and workload relocations. Load balancing across geographically distributed hardware.
Synchronization of abstract state propagation. Needed at the "core" of the infrastructure to cope with discontinuous network links.
Network partitioning with limited connectivity. New ways to deal with network partitioning issues due to limited connectivity, coping with short disconnections and long disconnections alike.
Manage application latency requirements. The definition of advanced placement constraints in order to cope with latency requirements of application components.
Application provisioning and scheduling. In order to satisfy placement requirements (initial placement).
Data and workload relocations. According to internal/external events (mobility use-cases, failures, performance considerations, and so forth).
Integration location awareness. Not all edge deployments will require the same application at the same moment. Location and demand awareness are a likely need.
Dynamic rebalancing of resources from remote sites. Discrete hardware with limited resources and limited ability to expand at the remote site needs to be taken into consideration when designing both the overall architecture at the macro level and the administrative tools. The concept of being able to grab remote resources on demand from other sites, either neighbors over a mesh network or from core elements in a hierarchical network, means that fluctuations in local demand can be met without inefficiency in hardware deployments.
Edge Standards under Review
Although, owing to carrier-grade Telco service requirements at the Edge, the preference has always been StarlingX, and this is what we are maturing to GA, there are many other stacks we are standardizing at the Edge, as follows.
StarlingX
Complete cloud infrastructure solution for edge and IoT
Fusion between Kubernetes and OpenStack
Integrated stack
Installation package for the whole stack
Distributed cloud support
K3s and Minimal Kubernetes
Lightweight Kubernetes distribution
Single binary
Basic features added, like local storage provider, service load balancer, Traefik ingress controller
Tunnel Proxy
KubeEdge, especially for IoT
Kubernetes distribution tailored for IoT
Has orchestration and device management features
Basic features added, like a storage provider, service load balancer and ingress controller
CloudCore and EdgeCore components
Submariner
Cross-Kubernetes-cluster L3 connectivity over VPN tunnels
Following are the steps to join; many folks reported that they found issues in mIRC versions later than 7.5, so I wanted to give a summary here.
Step 1: Registration and nickname settings
You may see notices from NickServ that the nick you use is already taken by someone else. The notice looks like this: NickServ notice. In this case you need to choose another nickname. You can do this easily by typing:
/nick nick_of_your_choice
/nick john_doe
NickServ will keep sending you this notice until you find a nick that is not registered by someone else. If you want to use the same nick every time you connect, you may register it. The service called NickServ handles the nicks of all registered users of the network. Nick registration is free and you just need an email address to confirm that you are a real person. To register the nick you currently use, type (standard NickServ syntax; substitute your own password and email):
/msg NickServ register your_password your@email.address
Note: Your email address will be kept confidential. We will never send you spam mails or mails requesting private data (like passwords, banking accounts, etc.). After this you will see a notice from NickServ telling you this:
– NickServ – A passcode has been sent to myemail@address.net, please type /msg NickServ confirm <passcode> to complete registration
Check your email account for new mail. Some email providers, like Hotmail, may drop the mail sent by our services into your spam folder. Open the mail and you will find a text like this:
Hi, You have requested to register the following nickname some_nickname. Please type ” /msg NickServ confirm JpayrtZSx ” to complete registration. If you don’t know why this mail is sent to you, please ignore it silently. PLEASE DON’T ANSWER TO THIS MAIL! irchighway.net administrators.
Just copy and paste the part /msg NickServ confirm JpayrtZSx into the status window of your mIRC, then press the enter key. A text like:
– *NickServ* confirm JpayrtZSx – – NickServ – Nickname some_nickname registered under your account: *q@*.1413884c.some.isp.net – – NickServ – Your password is supersecret – remember this for later use. – * some_nickname sets mode: +r
should appear after this. This means you have finished your registration, and the nick can only be used by you, or you can force someone else who uses your nick to give it back to you. If you disconnect, you need to tell NickServ that the nick is yours. You can do that by:
/nickserv identify password
e.g. /nickserv identify supersecret
If the password is correct, it should look like this:
* some_nickname sets mode: +r – – NickServ – Password accepted – you are now recognized.
In mIRC you can do the identification process automatically so you don't have to care about this anymore. Open the mIRC Options by pressing the key combination Alt + O, then select the category Options and click on Perform; you will see this dialog: Perform window
Check "Enable perform on connect" and add: if ($network == irchighway) { /nickserv identify password } in the edit box called "Perform commands". Close the options by clicking OK. Now your mIRC will automatically identify you every time you connect to IRCHighway.
Step 2: Setting up SASL/CAP authentication
mIRC added built-in SASL support in version 7.48, released April 2017. The below instructions were written for version 7.51, released September 2017. Earlier versions of mIRC have unofficial third-party support for SASL, which is not documented here. freenode strongly recommends using the latest available version of your IRC client so that you are up-to-date with security fixes.
In the File menu, click Select Server...
In the Connect -> Servers section of the mIRC Options window, select the correct server inside the Freenode folder, then click Edit
In the Login Method dropdown, select SASL (/CAP)
In the second Password box at the bottom of the window, enter your NickServ username, then a colon, then your NickServ password. For example, dax:hunter2
Click the OK button
Step 3: Joining the channel
Use the following command to join the channel (substitute the channel you want); best of luck:
/join #channel_name
KubeCon Europe April 2021 session by Ildikó Váncsa (Open Infrastructure Foundation) – ildiko@openinfra.dev and colleague Gergely Csatári (Nokia) – gergely.csatari@nokia.com
As Linux is the de facto OS for innovation in the datacenter, in the same way OpenShift is proving to be a catalyst for both Enterprise and Telco cloud transformation. In this blog I would like to share my experience with two environments: one is Minishift, a home-brew environment for developers, and the other is based on pre-existing infrastructure.
As you know, OpenShift is a cool platform; across these two modes it supports a wide variety of deployment options, including hosted platforms on:
AWS
Google
Azure
IBM
However, for hosted platforms we will use the full installers without any customization, so this is simply not complex, provided you use only the Red Hat guide for deployment.
Avoid common Mistakes
As a prerequisite you must have a bastion host to be used as the bootstrap node
Linux manifests, NTP, registry and keys should be available, while for the full installation the DNS must be prepared before the cloud installer kicks in
Avoid making ignition files on your own (always use and generate manifests from the installers)
For pre-existing infrastructure the control plane is based on CoreOS, while workers can be RHEL or CoreOS; for full stack everything, including workers, must be based on CoreOS
Once installation has started, the whole cluster must be spun up within 24 hours; otherwise you need to generate new keys before proceeding, as the controller will stop responding because the ignition certificates have a 24-hour validity
As per my experience, most manifests for the full-stack installation are created by the installers, viz. cluster node instances, cluster networks and bootstrap nodes
Pain points in OpenShift 3 installation
Since most OpenShift 3 installation revolved around complex Ansible playbooks, roles and detailed Linux file configuration, all the way from DNS to CSRs, there was a dire need to make it simple and easy for customers. This is what Red Hat has done by moving to an opinionated installation, which makes it simple to install with only high-level information; later, based on each environment, the enterprise can scale as needed for Day-2 requirements. Such a mode solves three fundamental issues:
Installer customization needs (At least this was my experience in OCP3)
Full automation of environment
Implement CI/CD
Components of installation
There are two pieces you should know for OCP4 installation
Installer
The installer is a Linux binary that comes directly from Red Hat and needs very little tuning and customization
Ignition Files
Ignition files are the first bootstrap configs needed to configure the bootstrap, control and compute nodes. If you have managed an OpenStack platform before, you know we need separate Kickstart and cloud-init files; with the Ignition process Red Hat makes both steps simple. For details on the Ignition process and cluster installation refer to the nice material below.
Copy the CDK into the directory C:/users/Saad.Sheikh/minishift and in CMD go into that directory
minishift setup-cdk
It will create .minishift in your path C:/users/Saad.Sheikh
set MINISHIFT_USERNAME=snasrullah.c
minishift start --vm-driver virtualbox
Add the directory containing oc.exe to your PATH
FOR /f "tokens=*" %i IN ('minishift oc-env') DO @call %i
minishift stop
minishift start
The below message may come; just ignore it and enjoy: error: dial tcp 192.168.99.100:8443: connectex: A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond. – verify you have provided the correct host and port and that the server is currently running. Could not set oc CLI context for 'minishift' profile: Error during setting 'minishift' as active profile: Unable to login to cluster
To login as administrator: oc login -u system:admin
OpenShift installation based on on-prem hosting
This mode is also known as UPI (User Provisioned Infrastructure) and it has the following key steps for an OCP full installation:
Step 1: Run the Red Hat installer
Step 2: Based on the manifests, build the ignition files for the bootstrap nodes
Step 3: The control nodes boot and fetch information from the bootstrap server
Step 4: The etcd provisioned on the control nodes scales to 3 nodes to build a 3-control-node HA cluster
Step 5: Finally, the bootstrap node is decommissioned and removed
Following is the script I used to spin up my OCP cluster:
# 1. Reboot the bootstrap machine; during reboot it goes to PXE and installs CoreOS
# 2. Wait for the bootstrap to complete (UPI flow)
openshift-install wait-for bootstrap-complete --dir=./ocp4upi
# 3. Remove the bootstrap IP entries from /etc/haproxy/haproxy.cfg
# 4. Reload haproxy
systemctl reload haproxy
# 5. Set the KUBECONFIG environment variable
export KUBECONFIG=~/ocp4upi/auth/kubeconfig
# 6. Verify the installation
oc get pv
oc get nodes
oc get clusteroperator
# 7. Approve any pending CSRs and certificates
oc get csr -o go-template='{{range .items}}{{if not .status}}{{.metadata.name}}{{"\n"}}{{end}}{{end}}' | xargs oc adm certificate approve
# 8. Log in to the OCP cluster GUI using
https://localhost:8080
Do try it out and share your experience and what you think about the OCP 4.6 installation.
Disclaimer: I validated all commands and processes in my home lab environment; you need to tune and check your environment before applying them, as some tuning may be needed.