“Industry and experience-based knowledge sharing about Cloud, 5G, Edge, O-RAN, Network Disaggregation and Industry 4.0 Transformation” ~ A Cloud Architect's Journey
Expand, Migrate, Retire using App migration tools
In solving any organization's challenges and creating new business opportunities, “Cloud” is at center stage.
Cloud-first is the top priority for all major Telcos and enterprises as a way to unfold the future.
The right Cloud strategy is not just about saving cost but about creating value, by untethering IT from low-value mundane tasks and steering it towards innovation through new capabilities, tools and services.
Since there are so many solutions and offerings, it is important to select an architecture that lets the customer pick the best of both worlds and that also solves App migration and co-existence issues. Finally, as an architect, it is about understanding the pros and cons of the options and making the right architecture choices.
Where to begin?
Understanding your starting point is essential to planning and executing a successful application migration strategy. Take a comprehensive approach, including not only technical requirements, but also consideration of your business goals (both present and future), any critical timelines, and your own internal capabilities. Depending on your situation you might fall in any of the below categories as it relates to time-to-value. There is no one size fits all approach to migration but the key here is to know that whichever path you choose, there is always a way to build on top of that and continue to take more advantages of the cloud in an incremental fashion.
How to Evaluate Different Clouds
There are hundreds of services across a number of big players, including the following:
AWS (Amazon Web Services) from Amazon
Azure from Microsoft
GCP from Google
IBM Cloud from IBM
Alibaba Cloud from Alibaba
OCI (Oracle Cloud Infrastructure) from Oracle
It is always a difficult choice to decide which cloud is best for which use case, and to remember all their services. Below is a quick cheat sheet that may be of help.
According to the latest GSMA report, 5G connections will grow at a CAGR of 100% every year until 2025, taking the 500M+ connections of today past 2.9B+ in 2025.
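Growth figures like these are easy to sanity-check by computing the compound annual growth rate they imply. The sketch below assumes the 500M figure refers to 2021 and the 2.9B figure to 2025, i.e. a four-year span; that span is my assumption, not something stated in the report, and the function name is illustrative.

```python
def implied_cagr(start: float, end: float, years: int) -> float:
    """Compound annual growth rate that takes `start` to `end` in `years` years."""
    if start <= 0 or years <= 0:
        raise ValueError("start and years must be positive")
    return (end / start) ** (1 / years) - 1


# Assumed inputs: 500M connections in 2021 and 2.9B in 2025 (four years apart).
rate = implied_cagr(500e6, 2.9e9, 4)
print(f"Implied CAGR: {rate:.1%}")

# For comparison: what a true doubling every year would give after four years.
print(f"500M doubling yearly for 4 years: {500e6 * 2 ** 4 / 1e9:.1f}B connections")
```

Under these assumptions the two end points imply growth of roughly 55% per year, so the "100% every year" figure is best read as an approximation of the early years rather than of the whole period.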
Even in terms of global spend, we saw in 2020 that 50% of carrier spend across the more than 140 big Telcos that deployed 5G went to 5G-related deployments. By 2025 we expect 80% of Telco spend to go directly or indirectly into 5G.
This data shows that a big gap still exists between what technology has enabled us to achieve in 5G and the business models (sell-out, sell-in, B2B, aggregation, roaming, etc.) that will enable us to make money from 5G.
The way to build a strong business in 5G is to monetize vertical markets by minimizing the use of dedicated NPNs (Non-Public Networks) and supporting wider use cases through “Network Slicing”, using as much of the Telcos' public networks as possible.
In this two-part series, I will focus in Part 1 on key industry progress and what is possible today, followed in Part 2 by a holistic analysis of the gaps, especially on the RAN and Transport side, and how we can solve them in the 2022-2023 era.
Commercial Network Slicing Journey Cart on the Go
We as an industry waited for 3GPP Release 16 (ASN.1 ready in June 2020) to really experiment with 5G slicing, although the early specs were already available in Release 15. This was primarily to make slicing appealing with:
Support for new use cases, especially uRLLC, mIoT and eMBB QoS. As we all know, the advances in V2X, QoS and vertical use cases are key for 5G and so for slicing.
SBA and software control, which require not only 3GPP specs but also more modular approaches to building NFs, something that was not possible in the VNF era.
Having said this, there is no real commercial deployment of network slicing yet; those that are ready are limited in scope and variety and do not serve a wide range of industries. In order to speed up deployment with “ready architectures”, it is vital to keep pace with the following:
Alignment with terminals and chipsets to support all slicing profiles and NSTs
Definition of business models including roaming
Upgrades on OSS/BSS and E2E SO to support slicing functions like CSMF, NSMF and assurance systems
Test a number of use cases with different characteristics
Test agility at scale
Focus on RAN and Transport slicing
Test with additional partners, especially ISVs
This is the general state of industry progress, and we expect that solving the above challenges sequentially is the way forward to ensure wide industry adoption by 2023.
Focus on Private Networks vs Slicing
There is an ongoing discussion between industry players favoring one approach over the other: Telcos, whose main target is obviously to capitalize on public networks, versus hyperscalers and private businesses, whose focus is on building dedicated private networks.
However, as with most facets of life, there is always a tradeoff, and it makes sense to offer different use cases via different solutions, i.e. private networks or slicing.
According to the latest research published by Ericsson, 90% of the addressable revenue in Telco-oriented key use cases will come to Telcos through slices; these industries are Healthcare, Transport, Government, Utilities and Manufacturing. This is why I believe that in 2021-2022 this is where public networks need to focus, while other key use cases, especially industrial ones, will use dedicated private networks.
The use cases can be divided between Telco-led and industry-led (dedicated) as follows.
Telco Led
Wide-area use, for example health, safety and government, using SNPN (Standalone Non-Public Networks)
Industry Led
Wide-area use, for example health, safety and government, using PNI-NPN (Public Network Integrated NPN)
If you want to learn more about this, I suggest you keep watching my blog; I will soon share my latest analysis based on the ANZ market and industry trends.
Monitoring and AEF (API Exposure Functions)
According to the latest research, 5G connections will reach 580M+ by the end of 2021, and with architectures like slicing the variety of use cases and services to be supported on the Telco network will be huge. It will require new network paradigms, e.g. NRF, NWDAF and API exposure to third parties, to build, manage and sell slices.
The other domain is observability and monitoring, which needs to be solved not only on the Telco infrastructure side, using solutions like Prometheus, Kibana and ELK, but also on the UE side in the form of built-in or separate agents like PicoNeo3, Immetrix and the AVEQ Surfmeter app, and then to use the real-time data together with API exposure to build services.
QOE and SLA as key criteria for PNI Private Networks
The definition of QoE and SLA is vital both for business reasons and for technical architecture.
As per my analysis, the slicing orchestration and ordering systems must be capable of building slices with the following characteristics.
Latency -> E2E packet traversal in real time
Reliability -> five 9s for Telecom services
High Precision -> characteristics for TSN and high-quality networks, especially for Release 16 based networks
Security -> ability to sell security as a service for enterprises
Isolation -> what type of slice isolation is a must, with minimum impact on existing pre-emption based Telecom services
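To make the reliability target concrete, a "number of nines" can be translated into an allowed downtime budget per year. This is a minimal sketch of that arithmetic; the function name and the choice of a 365-day year are my own assumptions.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600  # assumed non-leap year


def downtime_budget_seconds(nines: int, period_seconds: float = SECONDS_PER_YEAR) -> float:
    """Allowed downtime for an availability target of `nines` nines (5 means 99.999%)."""
    if nines < 1:
        raise ValueError("nines must be at least 1")
    unavailability = 10 ** (-nines)
    return period_seconds * unavailability


for n in (3, 4, 5):
    budget = downtime_budget_seconds(n)
    print(f"{n} nines: {budget / 60:.2f} minutes of downtime per year")
```

Five 9s leaves a budget of a little over five minutes per year, which is why slice assurance has to be designed in rather than added later.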
Traffic Patterns:
Traffic patterns are a new and evolving domain, because most Telcos do not understand the traffic patterns and architectures of enterprises, which can be strictly East-West instead of the Telco-style North-South.
The ability to draw this topology, give visibility into it, and adjust it through a policy framework in real time is a must-have in the design of slicing solutions.
“Can a Telco become a managed service provider as well as an infrastructure service provider, from RF to applications?”, offering a solution that has the characteristics below:
• Flexibility to add new devices or solutions for a use case (e.g. Tetra)
• Efficiency, e.g. resource optimization over time using ML/AI when in cohesion with the public network
• Data governance and security isolation
• Cost effective
• Optimize with capacity (cost/bit)
I believe the above model is key to success because a unified model with different values will fulfil most use case requirements; an example is below.
Model | Cloud Gaming | AR/VR
Latency | <100ms | <50ms
Throughput | >10Mbps (less is also OK as long as it can be guaranteed) | >30Mbps
6FoV | No need | Wide FoV with six degrees of freedom
Motion | No need | Tactile-level intelligence
Image quality and real-time updates are key characteristics of any Xbox-style VR service. Bad quality that cannot be corrected by FEC will lead to bad CEM, and unstable latency will lead to motion sickness and complaints.
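The latency and throughput thresholds from the comparison above can be turned into a simple check of which services a measured slice could carry. This is only a sketch: the class and function names are hypothetical, and only the two numeric rows of the comparison are modelled.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class UseCaseRequirement:
    name: str
    max_latency_ms: float       # measured latency must stay below this bound
    min_throughput_mbps: float  # guaranteed throughput must exceed this bound


# Thresholds taken from the comparison above.
REQUIREMENTS = {
    "cloud_gaming": UseCaseRequirement("Cloud Gaming", 100.0, 10.0),
    "ar_vr": UseCaseRequirement("AR/VR", 50.0, 30.0),
}


def supported_use_cases(latency_ms: float, throughput_mbps: float) -> List[str]:
    """Return the use cases whose latency and throughput bounds are both met."""
    return [
        req.name
        for req in REQUIREMENTS.values()
        if latency_ms < req.max_latency_ms and throughput_mbps > req.min_throughput_mbps
    ]


# Example: a slice measured at 80 ms and 20 Mbps guaranteed throughput.
print(supported_use_cases(latency_ms=80, throughput_mbps=20))
```

A real ordering system would also need to cover the qualitative rows (field of view, motion) and latency stability, which a single threshold does not capture.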
E2E Slicing Architectures: Biggest Gaps
The delivery of private networks using PNI-NPN is still not mature, and it would take a full paper to summarize each domain. However, I firmly believe that during the last 18 months there has been a huge industry push on the Core and OSS/BSS side alone, leaving the biggest challenges on RAN and Transport, as summarized below.
Transport Slicing Architectures
Early commercial slicing use cases have shown that slicing implementations in the form of VLAN QoS, DSCP or ToS all require IPv6 and novel means to re-design the network.
I think that as we start to deploy slices, it will be harder and harder to imagine deploying any service, even on the latest 5G Core networks, without it passing through brownfield transport networks.
Handoff of QOS and Slice SLA between MW , IP and multi domains
Handoff of SLA between Metro and Back bone networks
IPV6 traffic engineering as it traverses old legacy MPLS networks
Multi domain architectures and to deliver a multi layer
Real time routing updates based on slice real time KPI’s
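As an illustration of the DSCP/ToS style of transport marking mentioned above, the sketch below maps a slice type to a DSCP code point and sets it on a socket. The slice-to-DSCP mapping is purely hypothetical (real values come from the operator's QoS design); the DSCP code points themselves (EF = 46, AF41 = 34) and the position of DSCP in the upper six bits of the ToS byte are standard.

```python
import socket

# Hypothetical slice-to-DSCP mapping, for illustration only.
SLICE_DSCP = {
    "urllc": 46,  # EF (Expedited Forwarding)
    "embb": 34,   # AF41
    "miot": 0,    # best effort
}


def dscp_to_tos(dscp: int) -> int:
    """Place a 6-bit DSCP value in the upper bits of the ToS / Traffic Class byte."""
    if not 0 <= dscp <= 63:
        raise ValueError("DSCP must be in the range 0..63")
    return dscp << 2


def open_marked_udp_socket(slice_type: str) -> socket.socket:
    """Return an IPv4 UDP socket whose outgoing packets carry the slice's DSCP."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, dscp_to_tos(SLICE_DSCP[slice_type]))
    return sock


for name, dscp in SLICE_DSCP.items():
    print(f"{name}: DSCP {dscp} -> ToS byte 0x{dscp_to_tos(dscp):02x}")
```

Marking at the end point is the easy part; the handoff items in the list above are about whether every domain along the path honours, re-maps or strips that marking.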
RAN Slicing Architectures
RAN is the biggest constraint in a Telecom network. A further challenge arises when overlay networks, i.e. slices, infringe on native customer services that are protected by regulation and law, such as transparency and fair usage.
The following are the top domains to be addressed for PNI private networks.
Resource Blocks
Those who work on RAN know that radio resource blocks (RRB) are the biggest bottleneck in balancing service availability against performance. For example, in a multi-service network, what percentage can you reserve for each service, how much will be shared, and what mechanism do you use, especially when resources are scarce?
Will RRB resource use conflict with the fair use principle?
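The reservation question can be made tangible with a small partitioning sketch: each slice gets a dedicated percentage of the carrier's resource blocks and the remainder stays in a shared pool. The slice names and percentages are illustrative assumptions; the 273 resource blocks correspond to a 100 MHz NR carrier at 30 kHz subcarrier spacing.

```python
from typing import Dict


def partition_prbs(total_prbs: int, reserved_pct: Dict[str, float]) -> Dict[str, int]:
    """Split a carrier's resource blocks into per-slice reservations plus a shared pool."""
    if sum(reserved_pct.values()) > 100:
        raise ValueError("reserved shares exceed 100% of the carrier")
    # Round each reservation down so the total can never exceed the carrier.
    allocation = {name: int(total_prbs * pct / 100) for name, pct in reserved_pct.items()}
    allocation["shared"] = total_prbs - sum(allocation.values())
    return allocation


# Assumed example: 273 PRBs and illustrative reservation percentages.
plan = partition_prbs(273, {"enterprise_slice": 20, "urllc_slice": 10})
print(plan)
```

A static split like this is the simplest possible mechanism; the fair-use question is exactly whether the reserved share may sit idle while the shared pool used by ordinary subscribers is congested.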
TSN and uRLLC Uses
RAN performance and resource control are more difficult when we combine slices and logical overlays with the latest Release 16 features like TSN, non-slot-based scheduling and UL pre-scheduling.
How pre-scheduling will impact resource usage is vital for SLA delivery.
Admission Control
How do we ensure that high-priority cases are not blocked at admission in a constrained environment? Connection mode, establishment and the steps to allow CAC are all critical.
Bypassing complex RBAC for resource access is vital for delivering RAN use cases in constrained environments.
Pre-emption
Define the architecture, testing and performance for slice pre-emption.
APP Based QOS
URSP, a.k.a. UE Route Selection Policy, is a key feature in 3GPP Release 16 that makes it possible to control slices at the App level rather than the UE level, and to control the relationship between flows and routing.
It can deliver the following controls:
APP ID
IP
DNN
NSSAI
SSC mode
Location
Time Window
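A simplified way to picture these controls is as an ordered rule table on the UE: each rule matches traffic (here only by App ID) and points to a route made of S-NSSAI, DNN and SSC mode. The sketch below is a toy model under my own assumptions, not the 3GPP encoding; identifiers and values are illustrative, rules with a lower precedence value are evaluated first, and location and time-window criteria are left out for brevity.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class RouteSelection:
    s_nssai: str   # slice identifier (illustrative string format)
    dnn: str       # data network name
    ssc_mode: int  # session and service continuity mode


@dataclass(frozen=True)
class UrspRule:
    precedence: int        # lower value is evaluated first (assumption of this sketch)
    app_id: Optional[str]  # None acts as a match-all traffic descriptor
    route: RouteSelection


def select_route(rules: List[UrspRule], app_id: str) -> Optional[RouteSelection]:
    """Return the route of the first rule, in precedence order, that matches the app."""
    for rule in sorted(rules, key=lambda r: r.precedence):
        if rule.app_id is None or rule.app_id == app_id:
            return rule.route
    return None


RULES = [
    UrspRule(255, None, RouteSelection("1-000000", "internet", 1)),
    UrspRule(1, "com.example.cloudgame", RouteSelection("1-000001", "gaming", 1)),
]

print(select_route(RULES, "com.example.cloudgame"))
print(select_route(RULES, "com.example.browser"))
```

The gaming app is steered to its own slice and DNN while everything else falls through to the match-all rule, which is the App-level control that UE-level slice selection cannot provide.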
With the rise of App-based services, it is very important to investigate these novel features and how they can interwork with both RAN and Transport to deliver end-to-end services and resilience for 5G Telecom networks.
Standardization gaps
Having said this, slicing involves a myriad of services and requirements that are not all covered by Telecom SDOs, and even within telecom there are many organizations involved, like ETSI, IETF, GSMA, BBF, etc.
In a later paper that I plan to write towards the end of 2021, I will give a summary of the gaps and the next steps to ensure we can build a unified architecture that is agreed not only by Telcos but also by vertical industries, and that can help us build PNI-NPN business models that are global in scale, with the promise of roaming as well. Thanks, and keep learning, keep growing.
2021+ is a special year, with so many announcements for Open RAN. Instead of taking the marketing boat and saying everything is cute out there, the purpose of this set of posts and all my notes is to share my views on current Open RAN features and status from a RAN and radio perspective, and on how a phased approach is best, from both a business and a technology feasibility perspective, to successfully evolve the Telcos' RAN towards future quantum-ready communication networks for 5G and beyond.
There are many domains in Open RAN that must be carefully thought through to build an open and carrier-grade RAN system. The ones I would like to talk about in this post are mainly:
Market drive
Integration scope and needs
Customer requirements
Architectures
Security
Server Models
Site Configurations
Relation of RAN with Telco’s Edge Future
Based on the latest chipset and hardware acceleration progress, along with radio innovation supported by the latest automation suites, it is great to share solution updates along the following dimensions.
1. Market
Open RAN is the only architecture that can address future use cases and scaling requirements, where one supplier simply cannot meet all requirements, including alternatives to silicon.
Open RAN is a must-have architecture to bring the necessary resilience to a distributed architecture and to bring intelligence on top.
Open RAN will be beautiful as Cloud offerings and capabilities extend to include the RIC and intelligence together with Cloud scale.
This, though, is a dimension for which the industry has to wait a few years.
2. Integration
Horizontal interface integration, including E2, is not easy in a multi-vendor setup, so it is not just cloud but also CUPS and related 3GPP specs that need to support more heterogeneity.
Second is how to make sure that different vendors will cooperate. One technical dilemma is how to harmonize vendors that have different understandings of the specs.
Whether the operator should take on this new responsibility or bring in a trusted partner to solve the ecosystem issue is a key question.
3. Telco Needs 2021+
The following are the top needs Telcos should focus on to bring Open RAN into brownfields.
Energy: building an energy-efficient solution is a must to meet our targets on “Progress Made Real 2030”, and so it is for the business.
Orchestration: RAN orchestration has unique requirements, and we need to focus on it more to mature it.
Performance comparison with physical brownfield: building a transparent narrative to compare the performance of the two worlds is key.
I firmly believe that hybrid is the reality for the present and future of Telcos, and so it is for RAN. A way for the two architectures to co-exist and expand is key.
There will be use cases where Open RAN is better than legacy, and the reverse is and will be true as well.
Focus on private networks: building indoor capacity and capacity for enterprises will be very attractive if we can deliver this solution ready on optimized servers.
4. Architectures
From the architecture and solution packaging side, the following are key.
Solution packages: we must get field integration on RAN ready to work in a ZTP way, the way we made SD-WAN successful. So we must take the complete solution in an E2E manner.
SMO and RIC innovation: RIC and SMO RFX need more attention than other pieces, as they will lead to better performance, believe it or not.
Procurement risk: analyze software supply chain risk the same way you analyze hardware and chip shortages.
Software is not only about supply but also about its continuity, stability and bug control.
5. Security
“I am a black box and I have decades of experience, so trust me, I am secure. I swear it.” Can you trust that?
A decade in virtualization and cloud native architectures has proved that the security of a distributed system is always better when it is ready.
A distributed and open system has more vulnerability points, and it needs new dimensions and standards of security to be analyzed. On the other hand, open systems are always transparent and modular and hence trusted to be more secure.
Do not rely only on Cloud or 3GPP security standards, but also on ETSI NFV-SEC and CNCF, along with the latest inputs from the security working groups.
Open RAN increases the threat surface, especially for interfaces that lie outside the data centers, such as E2 between the near-RT RIC and the CU/DU, F1 between CU and DU, and the open fronthaul between O-DU and O-RU. For such cases we need new ways of mutual authentication with zero trust.
Another domain is trusted security certification of xApps coming from trusted partners, using signed certificates.
One domain that is often mismanaged is defining a standard onboarding process between vendors, e.g. between a VMware platform and a RedHat RAN platform. In my experience, deviation here always means open spaces for malicious actors.
Solving multi-layer dependence in a standard manner is key to bringing secure Open RAN solutions to the market, and something better than legacy.
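One concrete reading of "mutual authentication with zero trust" is mutual TLS, where the server side of an interface refuses any peer that cannot present a certificate signed by a trusted CA. The sketch below uses only Python's standard ssl module; the function name and the optional file-path parameters are illustrative, and whether a given O-RAN interface is protected with TLS, IPsec or something else is a deployment decision that this sketch does not settle.

```python
import ssl
from typing import Optional


def build_mutual_tls_server_context(
    certfile: Optional[str] = None,
    keyfile: Optional[str] = None,
    trusted_ca_file: Optional[str] = None,
) -> ssl.SSLContext:
    """Server-side TLS context that rejects peers without a valid client certificate."""
    context = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    # Mutual authentication: the connecting peer must present a certificate.
    context.verify_mode = ssl.CERT_REQUIRED
    if certfile:
        # Our own identity (certificate chain and private key).
        context.load_cert_chain(certfile=certfile, keyfile=keyfile)
    if trusted_ca_file:
        # The CA that is allowed to sign peer certificates.
        context.load_verify_locations(cafile=trusted_ca_file)
    return context


# Without file paths this only demonstrates the policy settings.
context = build_mutual_tls_server_context()
print(context.verify_mode, context.minimum_version)
```

In a real deployment the three file paths would come from the operator's PKI, and certificate rotation and revocation matter at least as much as the handshake itself.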
6. Server Models
Finally, life is not about dreams but about actions, and any software vision needs to be packaged in boxes that customers will buy. This is not only a technical limitation but a market reality as well, so focusing on which vendor will help you build what meets your requirements is key.
Server models: making server models that can be deployed on mountains, in trenches, in deserts and in parks is vital.
Site configurations: flexibility to align the server with the site configuration, e.g. DU+CU co-deployment on site, distributed DU and CU, and total DU centralization, are all dimensions that are important.
Our latest "Evenstar" program in TIP is analyzing and solving all these business scenarios and aligning solutions for them.
This post already looks heavy on words and topics, so I need to conclude it now as Part 1 of the analysis of 2021+ RAN disaggregation.
I plan to share Part 2 soon, covering the domains below.
How to access the ecosystem of use cases
Service Validation and Integration
Software supply chain
Onboarding standardization
Testing framework
Solution benchmarking and the role of Evenstar
Comparison with current systems and how Telcos can build RAN roadmaps
What the sweet spot for Open RAN is and what it is ready for now
Last week a session at DT Telecom grounds by Prof. Fitzek raised the bar on how to build future networks. Seeing the bigger picture is what an architect must do, and here is why I feel RAN dis-aggregation is key to the success of communication networks.
Industry Dilemma at large
For centuries the way we have built communication networks has been based on set principles on how to transfer and retrieve information, and those principles are now believed not to hold in the context of DetNets (Deterministic Networks) and randomness theory.
Information
From the time the world made its first call in 1876, it has been about information and not data, but the beliefs we set were based on the assumption that data will be limited, or at least within limits, as postulated by Shannon and Turing.
In other words, Einstein's relativity theory and Shannon's symbol limits are all true, but they are not optimized for a world of communication networks full of randomness and noise.
Randomness
As per the analysis by Dr. Fitzek, all communication based on Shannon's channel theory assumes a certain determinism, let's say 95% deterministic, and all algorithms for encoding and transmission between transmitter and receiver follow this. However, reducing determinism, or alternatively increasing randomness, means we can build a more robust way to handle information in a world that needs huge data and needs more ways to increase determinism to deliver many latency use cases.
Improved information
As the data flux increases, more intelligence will be needed to process it and bring forward meaningful data, or we can see the evolution of communication in a different paradigm.
Consider that the progression will be from networks that were entirely mesh and focused on information, to the internet, which is resilient but less informative, to an era where we will again improve network information.
All of this will be possible because, instead of focusing purely on the message, we will focus on what the intent was.
The new 5G and 6G context will focus on why to transfer this information instead of what exactly the message from the sender was.
Open-RAN relevance
As we can see from all the architectures, the two domains driving the rise of quantum communication networks are Edge and Open RAN.
Edge and private mobility, as information which is central will become de-centralized and will be processed at the edge, bringing only the intent to the layer above.
Open RAN, as it will re-frame the sender by bringing and encoding only the rightful information and continuously tuning it. That is how communication networks will challenge the speed of light and approach the quantum limits.
Future Reference
Following are some books and updates to my library to navigate the future . Best of luck
Opening up the Telecom last-mile 5G site (“Transport risk”) to many infrastructure players (“Public, Silo, Open”), consumed by many developers (“some hackers also”), within a rugged environment approaching freezing cold in blue mountains (“Hardware standards risk”), is too risky an endeavor, and something with zero trust on security which no Telco wished to consider a decade ago.
This will become more exciting as we move from #staticedge to #movingEdge (tractors for agriculture and autonomous cars with V2X are that dimension). Further, the proliferation of many evolving use cases in verticals poses varying #security challenges, further complicated by the fact that no standard explains TSR requirements in detail for the Edge.
As per a recent study by ETSI, supported by #Cyber #GSMA and #IEC, a #Telco carrier-grade Telecom Edge must solve the following Edge security challenges with proven roadmaps to be deployment ready, especially considering futuristic #5G deployments.
Challenges:
1. TSR and government security standards need global harmony, e.g. EUCSA, German C5 and NIS, EAL5, etc.
2. Traditional Telecom encryption like 3GPP SA3 snow5 256-bit is challenged by the rise of quantum processors
3. Confidential computing solutions provided by different vendors have no harmony, e.g. Cisco is too different from Juniper
4. Platform attestation only protects the central cloud; topology attestation is needed for Edge Clouds
5. A “whole system” security monitoring and management framework
6. LI architecture and data protection
7. Security assurance systems with data protection using OAuth 2.0, JWT, etc.
Solutions going forward:
1. Use of AI, like Intel bfloat16 for running AI for security, to build accurate AI/ML models that reflect a wider dataset while retaining the privacy and locality of private and sensitive data
2. GPU processing with use of the RIC for possible security management and control via 3rd-party security xApps
3. Use of ledgers in both hardware and Cloud is the future
4. LI per ETSI GR NFV-SEC 011 and retained data protection per ETSI GS NFV-SEC 010
5. Security assurance systems per ETSI SEC 021-027 to prohibit a single point of entry
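Since JWT is named among the data protection mechanisms, here is a minimal sketch of what issuing and verifying a signed token involves, using only the Python standard library. It assumes the HS256 algorithm with a shared secret purely for brevity; the claim names and the secret are illustrative, and a production edge deployment would more likely use a maintained JWT library with asymmetric keys issued through an OAuth 2.0 authorization server.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional


def _b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def issue_token(claims: dict, secret: bytes) -> str:
    """Build header.payload.signature, signed with HMAC-SHA256."""
    header = {"alg": "HS256", "typ": "JWT"}
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, claims)
    )
    signature = hmac.new(secret, signing_input.encode("ascii"), hashlib.sha256).digest()
    return f"{signing_input}.{_b64url(signature)}"


def verify_token(token: str, secret: bytes, now: Optional[float] = None) -> Optional[dict]:
    """Return the claims if signature and expiry check out, otherwise None."""
    try:
        header_b64, payload_b64, signature_b64 = token.split(".")
        header = json.loads(_b64url_decode(header_b64))
        claims = json.loads(_b64url_decode(payload_b64))
        signature = _b64url_decode(signature_b64)
    except ValueError:
        return None
    if not isinstance(header, dict) or not isinstance(claims, dict):
        return None
    if header.get("alg") != "HS256":
        return None  # refuse unexpected algorithms, including "none"
    expected = hmac.new(
        secret, f"{header_b64}.{payload_b64}".encode("ascii"), hashlib.sha256
    ).digest()
    if not hmac.compare_digest(signature, expected):
        return None
    if claims.get("exp", float("inf")) < (time.time() if now is None else now):
        return None
    return claims


secret = b"demo-secret-for-illustration-only"
token = issue_token({"sub": "edge-app-1", "exp": int(time.time()) + 300}, secret)
print(verify_token(token, secret) is not None)
```

The points that matter for an assurance system are visible even in this toy version: the algorithm is pinned, the signature comparison is constant-time, and expiry is enforced on every call.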
With 5G deployments based on 3GPP Release 16 accelerating, the appetite to bring value through slicing is increasing. This is natural, as many Telcos were waiting for 3GPP Rel-16 to bring life to some of the new use cases that were requested by customers but were not viable from either a technology or a business point of view.
Now this is going to change, with 5G standards around uRLLC, private networks and mIoT hardened and GA for the market.
5G slicing will spur a new wave of industry innovation through the logical split of infrastructure and applications as slices and NaaS offerings, as #Telcos continue to differentiate around throughput, reliability, control and QoS with slicing while ensuring enterprise needs of isolation and security.
But when we talk about slices, we must not consider a slice a piece of cake, but something that will have legal business value and huge revenue potential. In addition, it will have a life that needs to be managed: create, modify and delete a network slice; define and update the set of services and capabilities for a network slice; identify a UE and its service requirements and associate it to a network slice.
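The lifecycle operations listed above can be sketched as a small in-memory manager. This is a toy model under my own assumptions: the class and method names are hypothetical, a slice is reduced to a set of service names, and a UE is associated to the first slice that offers everything it requires. A real slicing manager would drive these operations through the CSMF/NSMF and the underlying domains.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class NetworkSlice:
    slice_id: str
    services: Set[str] = field(default_factory=set)
    ues: Set[str] = field(default_factory=set)


class SliceManager:
    """In-memory stand-in for create / modify / delete / associate."""

    def __init__(self) -> None:
        self._slices: Dict[str, NetworkSlice] = {}

    def create(self, slice_id: str, services: Set[str]) -> NetworkSlice:
        if slice_id in self._slices:
            raise ValueError(f"slice {slice_id} already exists")
        self._slices[slice_id] = NetworkSlice(slice_id, set(services))
        return self._slices[slice_id]

    def modify(self, slice_id: str, services: Set[str]) -> None:
        # Update the set of services and capabilities offered by the slice.
        self._slices[slice_id].services = set(services)

    def delete(self, slice_id: str) -> None:
        del self._slices[slice_id]

    def associate_ue(self, ue_id: str, required_services: Set[str]) -> str:
        """Attach the UE to the first slice that offers every service it requires."""
        for network_slice in self._slices.values():
            if required_services <= network_slice.services:
                network_slice.ues.add(ue_id)
                return network_slice.slice_id
        raise LookupError(f"no slice satisfies {sorted(required_services)}")


manager = SliceManager()
manager.create("factory-urllc", {"low_latency", "isolation"})
manager.create("campus-embb", {"high_throughput"})
print(manager.associate_ue("robot-arm-7", {"low_latency"}))
```

Even at this level the business questions show up: what happens to attached UEs when a slice is modified or deleted is a contractual matter as much as a technical one.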
Slice as a Service Components
For slicing to work, the Slicing Manager must have the following components:
Slice Design
Slice Automation
Slice modeling and Orchestration
Slice O&M
Each of these is a big topic, especially when we consider a model to automate both design time and run time. Similarly, the integration of the slicing manager with the SBI, to make sure it can enforce slices and monitor and manage them in real time, is very important; it requires a detailed discussion and is not the topic of this writeup. Let's summarize how to approach the slicing business and what the key focus areas and use cases are for a Telco.
Slicing as an Ecosystem
It is true that slicing is brilliant; having said that, it is not simple, and if we are to make it successful we need to make it simple, through simplification like Open APIs for developers and by promoting system-level interaction using platform-level abstraction.
Delivering an open ecosystem is more important for promoting such solutions than relying on closed-ecosystem vendors and solutions.
At GSMA, after the early version of the network slice requirements in NG.116, we have already worked to capture key gaps and how to enable slicing in live networks; this will be summarized in NG.127, E2E Network Slicing Architecture, and will be shared with the community soon. Here I will just summarize the key points to solve ecosystem issues.
Slice Potential
According to the latest industry research report shared by #Ericsson and #ArthurD.Little, the top 10 industries will drive more than 90% of #Networkslicing requirements. It is very important for #Telcos to approach each use case and industry by priority and customer needs, starting from the low-hanging fruit.
The following are the top 5 considerations and top 5 use cases Telcos must focus on.
Telcos' Key Targets to Enable Slicing Business Potential
Start from a brownfield mindset, with customers who already have some established business with #Telcos, like connectivity, infrastructure, etc.
Telcos need a strong strategy, such as which industries they will work closely with and which key use cases
Understand enterprise needs and the current operating model, all the way from Applications O.B to managed services
How a #Telco can lead and control the whole ecosystem for the enterprise in 5G using slicing
Making new use cases work will need the #Telco to take more responsibility, like evolving from #connectivity provider to service creator
Big enterprises are willing to partner with Telcos if they can lead the ecosystem and offer E2E services with industry-grade SLAs
Slicing Top Use Cases
The following are the top use cases to be considered for 5G in 2021 as low-hanging fruit.
Top 1: Automotive. Main use cases are teleoperated driving, coordinated groups of platooning vehicles, automated lane change and real-time situational awareness.
Top 2: Healthcare. Main use cases are remote procedures in emergencies, precision medicine and rehabilitation robotics.
Top 3: Manufacturing. AR devices will enable improved quality inspection and diagnosis for maintenance workers, technicians and operators throughout a plant, as well as remote-controlled robots and 3D video-driven interaction between collaborative robots and humans.
Top 4: Broadcasting and streaming. Typical use cases involve UHD (8K+), VR and 360-degree video.
Top 5: Energy. Typical use cases involve voltage monitoring, virtual power plants, video surveillance and connected remote wind farms.
Storage solution selection for Cloud Native infrastructure is quite complex, primarily because, whether VM or container, whether PNF, VNF or CNF, “state” has been the most important characteristic of a Telco service. One ideal scenario in the 5G era is UPF resiliency, where we want not only real-time failure detection but also to secure end-point connections after the new instance is spun up. This requires a strong “Storage” solution that is secure, standard and manages state while keeping the infrastructure immutable.
Although block storage is an important choice, the fact is that with 5G use cases and data proliferation the most important decision for Telcos and enterprises is how to manage unstructured data at scale.
Requirements for a Telco grade Storage solution
The following are the top requirements from a Telco perspective for architecting storage solutions.
A unified solution that is ready for the “Hybrid Cloud Infrastructure” era all the way from Cloud to Core to Edge
Scale-out expansion model to ensure workloads are not impacted during Day 2
H/A architectures: considering that not every node has H/A at the Edge, how can storage serve those cost- and footprint-constrained environments
Disaster recovery
Day 2 operations, especially changes, upgrades and B&R processes
Software upgrades and hardware refresh
How to ensure Telco-grade performance, especially as the scale of data grows
Dell Power Scale Storage Solutions
PowerScale is a new unstructured data storage family based on the new PowerScale OneFS 9.0. The new OneFS is optimized to run on PowerEdge-based x86 servers and will accelerate our time to innovation, and your agility to keep up with your customers' ever-changing needs. It can offer simplicity at any scale, handle any data, anywhere, and search within your data to help you unlock its potential.
With a scale-out architecture, capacity and performance are provisioned only as needed without having to over-provision storage or resort to fork-lift upgrades. With a single namespace, single file system environment and Enterprise-class data services customers get simplicity, flexibility and performance with increased efficiency and new automation capabilities.
Benefits from our accelerated innovation include these new features and models:
Simplicity at Any Scale: OneFS increased efficiency and automation capabilities – from 7TB to petabytes scale , with 16TB to 61TB storage per Node
Any data. Anywhere: We now support S3 object access, and offer new PowerEdge-based all flash and NVMe nodes & more cloud options.
Intelligent Insights: CloudIQ for datacenter insights. DataIQ for data insights.
It’s a complete solution for unlocking the potential within your data.
If you want to know more, do check out the PowerScale intro and demo.
What is Dell OneFS
As the unified OS for the whole DellEMC storage portfolio, including Isilon, ECS and PowerScale, the OneFS file system is based on the UNIX file system (UFS). Each cluster creates a single namespace and file system, without partitions. The file system is distributed across all nodes in the cluster and is accessible by clients connecting to any node in the cluster.
OneFS controls access to free space and to non-authorized files via share and file permissions, and SmartQuotas, which provides directory-level quota management.
Because all information is shared among nodes across the internal network, data can be written to or read from any node, thus optimizing concurrent performance.
PowerScale at the Edge
For the edge we are optimized for data-intensive applications and workloads in the field. For example, you can use a PowerScale F200 at the Edge, let it process the data locally and, if needed, replicate back to an F600 at the Core.
Source: RedHat Ceph guide: Public Reference Architecture
During the last several months there are lot of advancements on CPU Cores and specially with launch for Cloud Native and 5G https://www.intel.com/content/www/us/en/newsroom/news/processors-accelerate-5g-network-transformation.html#gs.2fscqr it is strongly believed that Scale out Architectures required for Servers are well in place however the missing piece still is Storage and necessary acceleration that is needed to make sure customer Cloud infrastructures and 5G requirements are addressed .
Why Storage is so difficult
Historically storage requirements from workloads are too variant leaving customers only choice to plan Storage architectures based on worst I/O and that is not TCO efficient . Similarly for Telco Infrastructure as whole industry agreed on X86 as reference there was never an agreement on Storage after Openstack initial releases that puts CEPH as heart of Architecture . It has lead to following issues
Many cloud workloads can never be satisfied with SDS. One example is vDPI, which as per field experience needs 3X more storage nodes on SDS than on a physical SAN
vSAN architectures are tied to hardware selection, e.g. scaling storage alone without expanding CPU cores, as needed in many IT and data-centric applications, is not possible
There are many driver issues between vSAN and physical SAN, so a complete compatibility check is needed for optimized utilization; the blog referenced below may help clarify this in a bit more detail
There is at least 30% capacity waste when integrating vSAN-type solutions with a physical SAN
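To illustrate why planning for worst-case I/O hurts TCO, here is a minimal sketch that compares the number of storage nodes needed for average load against the number needed for the sum of all peaks. The workload names, IOPS figures and per-node capacity are hypothetical numbers chosen for illustration, not field measurements.

```python
import math

# Hypothetical workloads: (name, average IOPS, peak IOPS). Illustrative only.
WORKLOADS = [
    ("vDPI", 20_000, 60_000),
    ("vUDC", 8_000, 15_000),
    ("logging", 2_000, 4_000),
]

# Assumed sustained IOPS a single storage node can deliver.
IOPS_PER_NODE = 10_000


def nodes_needed(total_iops, iops_per_node=IOPS_PER_NODE):
    """Round the required IOPS up to whole storage nodes."""
    return math.ceil(total_iops / iops_per_node)


def sizing(workloads):
    """Compare sizing for average load with sizing for the sum of all peaks."""
    avg_total = sum(avg for _, avg, _ in workloads)
    peak_total = sum(peak for _, _, peak in workloads)
    return {
        "nodes_for_average": nodes_needed(avg_total),
        "nodes_for_worst_case": nodes_needed(peak_total),
    }


result = sizing(WORKLOADS)
print(result)
```

With these assumed figures the worst-case plan needs well over twice as many nodes as the average-load plan, and that extra hardware sits idle most of the time.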
Early industry adoption of storage in Telco infrastructure
From day one Telcos wanted to build a software-defined and programmable storage infrastructure, which a SAN is not really meant to offer. A SAN also requires both FC and FC switches, while an SDS solution can integrate using FC over IP on Ethernet, so the whole network follows the same TCP/IP suite.
That is why in NFV 1.0 all our analysis pointed to an SDS-like solution on Dell R730xd or R740xd type servers with 2,2TB storage, or on MDS2X00 cabinets or DSS7000, where the latter is only used for data- and I/O-sensitive workloads like
+vUDC
+vDRA
+SPS etc
Non Realtime Performance vs Realtime Performances
SDS solutions like Ceph and BlueData certainly solve storage issues for scale-out performance by adding OSDs as and when needed, while for a SAN the controller is the bottleneck. Field test results show that Ceph using cloning at image level can improve performance by at least 20% on scale-out architectures.
However, the issue lies in real-time performance, which is about reads rather than writes, for example when a VNF or CNF starts at boot time. No SDS storage architecture solves this in a cost-efficient manner, and that is why adoption of SDS for such special workloads will always be a question.
Scale and Replication
The most important argument in favor of SDS is scale and replication. Three-copy replication and VM image replication such as RBD mirroring, which replicates to another cluster with zero impact, compared with the limited options of SAN and physical storage, is certainly a win for SDS like vSAN and Ceph.
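Replication has a capacity cost that is worth working out when sizing a cluster. The sketch below assumes that the 3-copy scheme mentioned above means three full replicas of every object. The cluster size used in the example is hypothetical.

```python
def usable_capacity_tb(raw_tb, replicas=3):
    """Usable capacity when every object is stored `replicas` times."""
    if replicas < 1:
        raise ValueError("replicas must be at least 1")
    return raw_tb / replicas


def raw_capacity_needed_tb(usable_tb, replicas=3):
    """Raw capacity to buy for a given usable target."""
    if replicas < 1:
        raise ValueError("replicas must be at least 1")
    return usable_tb * replicas


# Hypothetical cluster: 12 nodes with 24 TB of raw disk each.
raw_tb = 12 * 24
print(usable_capacity_tb(raw_tb))      # usable TB with 3 replicas
print(raw_capacity_needed_tb(100))     # raw TB needed for 100 TB usable
```

Under this assumption only a third of the raw disk is usable, which is the price paid for the replication and resilience advantages described above.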
Data Plane limitations
With pass-through architectures like SR-IOV, the following will be limited in SDS:
vMotion
DRS
Data Layer H/A
By limited I mean not the function itself but its performance for a commercial cloud.
Storage SKU optimizations
By far the most difficult issue in cloud infrastructure is finding the most optimized storage solution in a rugged environment. The prevalence of rack servers over blade servers is also due to the fact that SDS cannot map to a blade server as efficiently as it does to a rack server. Similarly, the BOSS is the embedded RAID1 controller with M.2 storage for the operating system. Clearly, the HBA330 will carry a far higher I/O load than the BOSS, but there is no need to add unnecessary I/O load where it can be avoided.
As an architect it is your job to look out for such caveats and find the optimum architecture for your infrastructure.
Container Storage Solutions
Storage is currently a critical piece of cloud native infrastructure, especially its acceleration part, which is not well standardized, so the main functions are realized through CNS CSI (Container Storage Interface) drivers, which are mature and stable.
The Kubernetes vSphere CSI driver is implemented under an architecture called vSphere CNS CSI, which is comprised of two key components:
The CNS in the vCenter Server
The vSphere volume driver in a Kubernetes cluster
In a Kubernetes cluster, CNS provides a volume driver that has two subcomponents—the CSI driver and the syncer. The CSI driver is responsible for volume provisioning; attaching and detaching the volume to VMs; mounting, formatting, and unmounting volumes from the pod within the node VM; and so on. The CSI driver is built as an out-of-tree CSI plugin for Kubernetes. The syncer is responsible for pushing PV, PVC, and pod metadata to CNS
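As a rough illustration of how a workload consumes this driver, the sketch below builds a StorageClass and a PersistentVolumeClaim as plain Python dictionaries and prints them as JSON, which kubectl accepts alongside YAML. The provisioner name `csi.vsphere.vmware.com` and the `storagepolicyname` parameter are the ones the vSphere CSI driver uses. The object names, the policy name `gold-policy` and the 10Gi size are hypothetical.

```python
import json


def storage_class(name, policy):
    """StorageClass that delegates provisioning to the vSphere CSI driver."""
    return {
        "apiVersion": "storage.k8s.io/v1",
        "kind": "StorageClass",
        "metadata": {"name": name},
        "provisioner": "csi.vsphere.vmware.com",
        # Hypothetical vCenter storage policy; must exist in your environment.
        "parameters": {"storagepolicyname": policy},
    }


def volume_claim(name, storage_class_name, size):
    """PVC that asks the StorageClass above for a volume of the given size."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": {
            "accessModes": ["ReadWriteOnce"],
            "storageClassName": storage_class_name,
            "resources": {"requests": {"storage": size}},
        },
    }


sc = storage_class("vsphere-sc", "gold-policy")
pvc = volume_claim("vnf-data", "vsphere-sc", "10Gi")
print(json.dumps([sc, pvc], indent=2))
```

When a claim like this is created, the CSI driver provisions and attaches the volume, and the syncer pushes the PV and PVC metadata to CNS as described above.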
Since the 2019 World Bank report on Africa connectivity called for action to increase the region's connectivity and drive transformation by digital means, there has been a lot of progress. We are lucky to work on and witness the change on the ground together with big operators and connectivity providers in the region, including some of our trusted customers: MTN, Liquid Telecom, South African Post, Telkom and many others.
The purpose of this writeup is to share our unique perspective on the region's requirements and why there is a great need to build future platforms with those requirements in mind. It is important that we build technology not only for the privileged but for all, if we want technology to remain a way to drive human progress.
Top requirements from African customers
Connectivity in even the more prosperous economies like South Africa, Nigeria and Botswana is no more than 60%, leaving a large part of the population cut off from the world. Where there is connectivity, it is often not fit for purpose; for example, Covid-19 e-learning platforms cannot deliver services over remote African 2.5G #Edge networks at 50 Kbps.
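A quick back-of-the-envelope calculation shows why 50 Kbps is not fit for e-learning. The sketch below assumes an ideal link with no protocol overhead or retransmissions, and the 100 MB lesson size is a hypothetical figure.

```python
def transfer_time_seconds(size_megabytes, link_kbps):
    """Time to move a file over an ideal link (no overhead, no retransmits)."""
    bits = size_megabytes * 8 * 1_000_000
    return bits / (link_kbps * 1_000)


# Hypothetical 100 MB recorded lesson over a 50 Kbps link.
seconds = transfer_time_seconds(100, 50)
print(f"{seconds:.0f} s, about {seconds / 3600:.1f} hours")
```

Even in this best case a single lesson takes more than four hours to download, so live video is out of reach at these speeds.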
Following are the lessons we learnt from Africa during last year's stride to establish our strong team at Southtel, which covers 20+ markets in Africa. The lessons are many, but I will summarize the top 5 to make sense of the main focus areas for the next 12-18 months and to build a solid foundation on which we can scale and build a strong, open and united Africa.
Top1: 5G is an Innovation Platform
5G is not like earlier generations, where globally there was one type of delivery model and one set of services, including OTT, on offer.
5G will require an innovation mindset because it will be something different for different operators. The frequency allocated to 5G alone makes a huge difference to delivery models, e.g. mmWave above 26 GHz is very different from an 800 MHz FDD deployment.
There is a new initiative pushed by leading vendors like Ericsson in the form of an ISV program, which we believe can bring local developers and startups onto platforms to build something unique for the market, as per the promise below
There is a strong desire, driven by hype rather than by customers, to show how Telcos use AI.
In 2020 at GSMA I was deeply involved in working with some great teams to materialize a few of these, and the top use cases that came up were
5G Rollout and Site Optimization
Operations automation like cross layer RCA , Infrastructure optimization
However, using silo stacks to deliver a use case led to a dead end. What the projects in Africa finally taught me is that we must build One Datalake and One API Lake, primarily focused on building
Data Ingestion Platform that is real time , robust and scalable
Use of graphs as a way to build and consume data for use cases
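To make the graph idea concrete for a use case like cross-layer RCA, here is a minimal sketch. It models the infrastructure as a dependency graph and reports the alarmed elements that have no alarmed dependency beneath them as likely root causes. The topology, the element names and the rule itself are hypothetical simplifications, not a description of any production RCA engine.

```python
# Hypothetical dependency graph: each element maps to what it depends on.
DEPENDS_ON = {
    "5g-service": ["vnf-upf"],
    "vnf-upf": ["vm-12"],
    "vm-12": ["host-3"],
    "host-3": ["switch-1"],
    "switch-1": [],
}


def root_causes(alarmed, depends_on):
    """Return alarmed elements with no alarmed element anywhere beneath them."""

    def has_alarmed_dependency(node, seen):
        for dep in depends_on.get(node, []):
            if dep in seen:
                continue  # already visited, also guards against cycles
            seen.add(dep)
            if dep in alarmed or has_alarmed_dependency(dep, seen):
                return True
        return False

    return {n for n in alarmed if not has_alarmed_dependency(n, set())}


alarms = {"5g-service", "vnf-upf", "host-3"}
print(root_causes(alarms, DEPENDS_ON))
```

Here the service and VNF alarms are explained by the failing host, which is why a single shared graph across layers is more useful than per-silo alarm lists.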
Top3: Edge and NaaS connectivity
Edge is a continuum between developer and user, and hence each region and consumer needs to focus on its own definition to make it real. The rollout of South Africa 5G revealed that even big enterprises are not big enough to build an Edge of their own, and the same goes for NPN networks. There is no CBRS-like 3.5 GHz band; the only option is the currently offered 3.7 GHz band, where the minimum 280 MHz is simply not available. This makes it vital to build real use cases for Africa around
Industrial IoT, especially for mining and worker safety
Private 5G mostly for Airports and University campus
Top4: Disaggregation on RAN and TXN
It is hard to believe but true that big operators in Africa are among the main participants in and users of disaggregation on RAN and Transport. In fact, large global operators like Airtel and Vodafone are already rolling out disaggregated solutions in remote African markets, and if we combine the right skills on top of this it will mean a true transformation for Africa.
At Southtel we have been engaging with a number of standards bodies and ecosystem providers to build a local innovation center and test and validation facilities in Johannesburg that can also support open labs for the community. Through these partnerships we hope to build and innovate Africa; read more here: http://southtel.co.za/
On the RAN side there is still a lag in the local expertise needed to build large RAN disaggregation rollouts for commercial 5G or 4G services. Having said this, there is a real desire to introduce them for greenfield spots inside brownfield environments, mainly focusing on
End to end solutions with use cases using Open RAN and 5G Open Magma release
Building macro and, mostly, in-building coverage using RAN disaggregation
I think we are currently in the RFX and PoC phase, and the next 12-18 months are crucial to build something meaningful.
Top5: Desire of Open Cloud
Due to logistics and a lack of local expertise, adoption of open source in Africa is still in a frenzied state, and I think big vendors really take advantage of this situation to push closed solutions with open labels on them. I think all of us share the responsibility to solve this issue, to increase platform adoption and to control global cyber issues in the coming years.
However, the build of the pilot MTN Open Cloud using open cloud solutions clearly means a lot for future growth, as most future NFs will run on the cloud. The top questions we need to solve are
How to improve Africa Opensource adoption strategy
How to build partner and developer ecosystem
How to accelerate adoption of hybrid cloud
It seems clear that in Africa all clouds, especially vendor-provided solutions like Ericsson CCD, Nokia CloudBand, VMware VIO etc., will be dominant, and their coexistence with open cloud solutions like Red Hat is a must. We need to find a balance that lets workloads exist on all platforms and supports portability across them. This is a must to realize the African dream.
I am expecting strong support from vendors, partners and the ecosystem to reduce the technology barriers that currently hold back new technology rollouts in Africa. There should be something that lies above $ interests, and building open platforms for the future is most important to build a stronger and connected Africa.