Application Aware Infrastructure Architecture of Future Enterprise and Telecom Networks
An architect’s perspective in the 2020+ era
The recent global situation, and the reliance it placed on critical telecom infrastructure and security solutions in the cloud, has shown even the critics that seemingly esoteric terms like hybrid cloud, AI, analytics and modern applications are vital to moving society and the economy forward.
Having followed the latest developments, and having actively joined the community in both infrastructure and application evolution across the Telco and Enterprise worlds, I can safely conclude that the days when infrastructure was engineered or built to serve application requirements are over. On the contrary, with the wide adoption of application containerization and Kubernetes as the platform of choice, the future direction is to design or craft applications that take best advantage of a standard cloud infrastructure.
Understanding this relationship is the key difference between businesses that will fly and those that will crawl to serve the ever-moving parts of the ecosystem, which are the applications.
(Figure source: Intel public)
In this paper, let us investigate some key innovations in infrastructure, both physical and cloud, that are shifting the industry's Pareto share from applications to infrastructure, thereby enabling developers to code and innovate faster.
Industry readiness of containerized solutions
The adoption of microservices and application standardization around the 12-factor app, introduced by cloud pioneer Heroku in 2012, gave birth to an entirely new industry that has matured far more quickly than virtualization did. A brief description of how it is impacting the market and industry can be found in Scott Kelly's paper on the Dynatrace blog. This innovation is based on the standardization of cloud native infrastructure and CNCF principles around Kubernetes platforms, aimed at the following key points:
Covid-19 has proved that if there is a single capability a modern business needs to survive, it is scalability. In recent weeks we have seen millions of downloads of video conferencing applications like Zoom, Webex and BlueJeans, and a similar surge in demand for services in the cloud. Obviously, it would have been an altogether different story if we were still living in the legacy Telco or IT world.
Immutable but Programmable
On every new deployment across the lifecycle management (LCM) of applications, services are deployed on new infrastructure components, and all of this should be managed via an automated framework. Containers in the Telco space do require stateful and somewhat mutable infrastructure; however, the beauty of this approach is that the infrastructure keeps state out of its core, managed instead at the application and 3rd-party level, which ensures easy management of the overall stack.
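The idea can be sketched in a few lines of Python. This is only an illustrative model under my own assumptions: the `Deployment` and `ExternalStateStore` classes, the `roll_out` helper and the `upf` image names are hypothetical and do not come from any real platform API. It shows a rollout that creates a new deployment object instead of modifying the running one, while state lives in a store outside the infrastructure.

```python
from dataclasses import dataclass, replace
from typing import Dict, Tuple


@dataclass(frozen=True)
class Deployment:
    """Immutable description of one running release (hypothetical model)."""
    app: str
    image: str       # container image tag
    generation: int  # increments with every rollout


class ExternalStateStore:
    """Stand-in for an application-level or 3rd-party store that holds state."""

    def __init__(self) -> None:
        self._data: Dict[str, str] = {}

    def put(self, key: str, value: str) -> None:
        self._data[key] = value

    def get(self, key: str) -> str:
        return self._data[key]


def roll_out(current: Deployment, new_image: str) -> Tuple[Deployment, Deployment]:
    """Return (retired, fresh); the old deployment is never changed in place."""
    fresh = replace(current, image=new_image, generation=current.generation + 1)
    return current, fresh


# State is written to the external store, not to the deployment itself.
store = ExternalStateStore()
store.put("sessions/subscriber-42", "attached")

v1 = Deployment(app="upf", image="upf:1.0", generation=1)
retired, v2 = roll_out(v1, "upf:1.1")
```

Because `Deployment` is frozen, any attempt to patch a running release in place fails, which is the programmatic equivalent of "replace, do not repair". The session entry survives the rollout because it never lived inside the deployment.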
Portable with minimum Toil
Portability and ease of migration across infrastructure PoPs is the most important benefit of lifting applications to containers. In fact, the evolution of hybrid clouds is the by-product that businesses can reap by ensuring application portability in
Easy Monitoring and Observability of Infra
There is a great deal of innovation happening in chipsets, network chips (ASICs) and network operating systems (NOS), for example P4; however, the current state of infrastructure does not allow applications and services to fully capitalize on these advantages. This is why there are so many workarounds and so much complexity around both application assessment and application onboarding in current network and enterprise deployments.
One good example of how container platforms are changing the business experience of observability is Dynatrace, which provides code-level visibility, layer mapping and digital experience monitoring across all hybrid clouds.
There already appears to be a link from platform to infrastructure that will support delivery of all workloads, with their different requirements, over a shared infrastructure. Kubernetes as a platform is already being architected to fulfill this promise; however, it requires further enhancements in hardware. The first phase of this enhancement is the use of HCI: our recent study shows that using HCI in a central DC will save 20% of CAPEX annually. The further introduction of open hardware, and the consolidation of open hardware and open networking explained in a later section of this paper, will mean services can be built, managed and disposed of on the fly.
From automated Infrastructure to Orchestrated delivery
However, all those who work on IT and Telco application design and delivery will agree on how cumbersome both application assessment/onboarding and application management are with little infrastructure visibility. This is because the mapping between application and infrastructure is not automated. The global initiatives of both the open source communities (OSC) and the standards development organizations (SDOs) prevalent in the TMT industry have focused primarily on orchestration solutions that leverage the advantages of the infrastructure, especially chipsets driven by AI/ML, and use this relationship to solve business issues by ensuring true decoupling between application and infrastructure.
The reader may say that platforms like Kubernetes have played a vital part in this move; however, it simply would not have been possible without taking advantage of the physical infrastructure. For example, orchestration on the IT side, primarily driven by K8S, and on the Telco side, primarily driven by initiatives like OSM and ONAP, both rely on the infrastructure to execute all the pass-through and acceleration required by applications to fulfill business requirements.
In fact, the Nirvana state of automated networks is a more cohesive and coordinated interaction between application and infrastructure, under the closed-loop instructions of the orchestrator, to enable delivery of Industry 4.0 targets.
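A closed loop of this kind can be sketched as "observe, compare against intent, act". The Python below is a minimal illustration under my own assumptions, not how K8S, OSM or ONAP implement it: the `Intent` fields, the latency thresholds and the sample measurements are all hypothetical placeholders.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Intent:
    """What the application asks for (hypothetical fields)."""
    max_latency_ms: float
    min_replicas: int
    max_replicas: int


def reconcile(intent: Intent, replicas: int, observed_latency_ms: float) -> int:
    """One closed-loop step: return the replica count the orchestrator should request."""
    if observed_latency_ms > intent.max_latency_ms:
        desired = replicas + 1  # intent violated: scale out
    elif observed_latency_ms < 0.5 * intent.max_latency_ms:
        desired = replicas - 1  # ample headroom: scale in
    else:
        desired = replicas      # within the band: leave as is
    # Never leave the bounds the intent allows.
    return max(intent.min_replicas, min(intent.max_replicas, desired))


intent = Intent(max_latency_ms=20.0, min_replicas=1, max_replicas=5)
replicas = 2
history: List[int] = []

# Placeholder latency samples standing in for infrastructure telemetry.
for latency_ms in [30.0, 25.0, 12.0, 8.0, 4.0]:
    replicas = reconcile(intent, replicas, latency_ms)
    history.append(replicas)
```

The point of the sketch is the shape of the loop: the application states an intent once, and the orchestrator keeps comparing telemetry from the infrastructure against it without a human in between.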
Benefiting from the Advantages of the Silicon
The advantages of silicon were, are and will be the source of innovation in the cloud and 5G era. When it comes to the role of hardware infrastructure in the whole ecosystem, we must look to capitalize on the following.
The changing role of Silicon Chips and Architectures (x86 vs ARM)
The choice between Intel and AMD is familiar to many data center teams. In data centers where performance is the deciding factor, the Intel XEON family still outperforms AMD, whose advantages of a smaller footprint (7nm) and a better core/price ratio have not built a rationale for selecting it. Another major factor supporting Intel is its supremacy in 5G, edge and AI chips, for which AMD has failed to bring a comparable alternative. The most important drawback, in the author's view, is sourcing and global presence, which leads the big OEMs/ODMs to prefer Intel over AMD.
However, the high-tech industry's fight to dominate the market with multiple supply options, especially during the recent US-China trade conflict, has left the TMT industry with the tough choice of considering non-x86 architectures. Obviously no one likes this, as that ecosystem is not mature, and the author believes an irrational selection will mean that future business may not be able to capture the advantages coming from disruptors and open industry initiatives like ONF, TIP, O-RAN, etc.
The following points should be considered in the evaluation:
- Ecosystem support
- Use cases (the architecture that supports the most use cases should win)
- Business case analysis to evaluate performance versus high density. Except for edge and C-RAN, Intel clearly beats ARM.
- Aggregate throughput per server
- NIC support, especially FPGA and SmartNIC. Intel clearly has the advantage here.
- Cache and RAM. Over the years Intel has focused more on RAM and RDIMM innovation, so on the cache side ARM has an edge that should be evaluated. However, since not all use cases require it, this is a less distinct advantage.
- Storage and cores. This will be the key differentiator; however, we find that neither vendor is good at both. In addition, their ready-made configurations mean we have to compromise one for the other. This will be the killer point for future silicon architecture selection.
- Finally, the use of in-built switching modules in ARM, which bypasses the TOR/SPINE architecture in data centers entirely, may win over proponents of the pre-data-center architecture era; however, the promise of in-built switching in scaled architectures is not well tested. It is a good architecture for dense edge deployments, but in my view it is not recommended for large central data centers.
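One simple way to turn the criteria above into a comparable number is a weighted scoring matrix. The sketch below is only an illustration: the criterion keys are my shorthand for the list above, and every weight and score is a made-up placeholder for two anonymous candidates, not a measurement of any real Intel or ARM part.

```python
from typing import Dict


def weighted_score(weights: Dict[str, float], scores: Dict[str, float]) -> float:
    """Weighted average of per-criterion scores; weights need not sum to 1."""
    missing = set(weights) - set(scores)
    if missing:
        raise ValueError(f"no score given for: {sorted(missing)}")
    total_weight = sum(weights.values())
    return sum(weights[c] * scores[c] for c in weights) / total_weight


# Placeholder weights (importance 1-5) for the criteria listed above.
WEIGHTS = {
    "ecosystem": 5, "use_cases": 4, "business_case": 4, "throughput": 3,
    "nic_support": 3, "cache_ram": 2, "storage_cores": 5, "inbuilt_switching": 1,
}

# Placeholder scores (1-5) for two hypothetical candidates; replace with real data.
CANDIDATES = {
    "candidate_a": {
        "ecosystem": 5, "use_cases": 4, "business_case": 3, "throughput": 4,
        "nic_support": 4, "cache_ram": 3, "storage_cores": 3, "inbuilt_switching": 1,
    },
    "candidate_b": {
        "ecosystem": 2, "use_cases": 3, "business_case": 4, "throughput": 3,
        "nic_support": 2, "cache_ram": 4, "storage_cores": 3, "inbuilt_switching": 4,
    },
}

# Highest weighted score first.
ranking = sorted(
    CANDIDATES,
    key=lambda name: weighted_score(WEIGHTS, CANDIDATES[name]),
    reverse=True,
)
```

The weights are where the business case enters: an edge-heavy operator would raise the weight of in-built switching and density, while a central data center team would not.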
However, quantitative judgement alone is not enough. Intel's excessive dominance meant it did not deliver the design cadence expected by business, which obviously opened the gates for others. It is my humble belief that in the 5G and cloud era, at least outside the data centers, both Intel and ARM will have deployments and will need to prove their success in commercial deployments, so you should expect to see both XEON® and Exynos silicon soon.
FPGA, SmartNICs and vGPUs
Software architecture has recently moved from C/C++/JS/Ruby to more disruptive Python/Go/YAML schemes, primarily driven by business adoption of the cloud. Business is addressing these challenges by requiring more and more x86 compute power; however, improving efficiency is equally important. As an example, we tested the Intel Smart NIC family PAC 3000 over a long period to validate power and performance requirements for throughput-heavy workloads.
Similarly, video will be a vital service in 5G; however, it will require SPs to implement AI and ML in the cloud. The engineered solutions of RedHat OSP and OpenShift with NVIDIA vGPU mean that data processing previously possible only in offline analytics, using static data sources such as CEM and Wirefilters, is no longer limited to that mode.
Envisaging future networks that combine the power of all hardware options, such as silicon chips, FPGAs, SmartNICs and GPUs, is vital to solving the most pressing business challenges we have been facing in the cloud and 5G era.
There is no doubt that networking has been the most important piece of infrastructure. Its importance only increased with virtualization, and has increased a further 10-fold with containers, primarily as data centers fight to deliver the best solutions for East-West traffic. Although there are a number of SDN and automation solutions, their performance at scale has shifted the balance towards infrastructure: more and more vendors are now betting on the advantages of ASICs and NPCs, not only to improve forwarding plane performance but also to make the whole stack, including fabric and overlay, automated and intelligent, fulfilling the IDN dream by using the latest Intel chips that come with inherent AI and ML capabilities.
The story of how hardware innovation is bringing agility to networks and services does not end here. For example, the use of SmartNICs and FPGAs to deploy SRv6 is a successful business reality today, converging compute and networking around a shared, common infrastructure.
Decoupling, pooling and centralized monitoring are the targets to achieve. With so many solutions that are quite different in nature, such as fabric and overlay on the networking side, this means harmonizing them through the concept of single-view visibility. As a result, when an application demands elasticity, hardware does not need to be physically reconfigured. More compute power, for instance, can be pulled from the pool and applied to the application.
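The pooling idea can be sketched as a toy model. Everything here is an assumption made for illustration: the `ComputePool` class, its method names, the 64-core pool and the `video-analytics` application are hypothetical and not tied to any real scheduler. The sketch shows elasticity served as a bookkeeping change against a shared pool, with no physical reconfiguration.

```python
from collections import defaultdict
from typing import DefaultDict


class ComputePool:
    """Shared pool of cores from which applications scale logically."""

    def __init__(self, total_cores: int) -> None:
        self.total_cores = total_cores
        self.allocations: DefaultDict[str, int] = defaultdict(int)

    @property
    def free_cores(self) -> int:
        """Single-view visibility: what is left across the whole pool."""
        return self.total_cores - sum(self.allocations.values())

    def scale_out(self, app: str, cores: int) -> None:
        """Pull more compute from the pool for an application."""
        if cores > self.free_cores:
            raise RuntimeError(f"pool exhausted: {self.free_cores} cores free")
        self.allocations[app] += cores

    def scale_in(self, app: str, cores: int) -> None:
        """Return compute to the pool when demand drops."""
        if cores > self.allocations[app]:
            raise ValueError(f"{app} holds only {self.allocations[app]} cores")
        self.allocations[app] -= cores


pool = ComputePool(total_cores=64)
pool.scale_out("video-analytics", 16)  # initial demand
pool.scale_out("video-analytics", 8)   # traffic surge
pool.scale_in("video-analytics", 12)   # surge is over
```

A real pool would track memory, accelerators and network bandwidth in the same way, but the principle is the same: the application sees capacity, not servers.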
From Hyperscalers to Innovators
The dominance of the hyperscalers in the cloud is well known; however, there have recently been further movements that are disrupting the whole chain. For example, ONF Open EPC can now be deployed on OCP platforms. Similarly, the TIP Open-RAN initiative is changing the whole landscape into something that was not even under discussion a few years ago.
Since the ONF is heavily focused on software and on the advantages brought by NOS and P4 programming, I think it is important to talk about OCP here. New innovations in rack design and open networking will define new compute and storage specifications that best meet unique business requirements. Software for Open Networking in the Cloud (SONiC) was built using the Switch Abstraction Interface (SAI) switch programming API and has, unsurprisingly, been adopted by Microsoft, Alibaba, LinkedIn, Tencent and more. The speed of adoption is phenomenal, and new features are being added to the open source project all the time, such as integration with Kubernetes and configuration management.
Finally, I am seeing a new wave of innovation, and this time it is coming from harmonizing architecture around hardware, thanks to the efforts of the last few years around Cloud, OpenStack and Kubernetes. However, these types of initiatives will need more collaborative effort between OSC and SDOs, for example the TIP and OCP projects, harnessing the best of both worlds.
However, with the proliferation of so many solutions and offerings, standardization and alignment on common definitions of specs for the shared infrastructure is very important.
Similarly, to ensure innovation delivers on its promise, the involvement of the end-user community will be very important. Directions like LFN CNTT, ONAP, ETSI NFV, CNCF and GSMA TEC are some of the streams that require operator-community-wide support and involvement, to move from the clumsy NFV/Cloud picture of the last decade to a truly innovative picture of network and digital transformation. A balanced approach from the Enterprise and Telco industries will allow the businesses of today to become the hyperscalers of tomorrow.
I believe this is why, after a break, this is the topic I chose to write about. I look forward to any comments and reviews that can benefit the community at large.
- OSC: Open Source Community
- LFN: Linux Foundation Networking
- NPC: Network Processing Cards
The comments in this paper do not reflect the views of my employer; they are solely my own analysis, based on my individual participation in the industry, with partners and in business at large. I believe sharing this information with the larger community is the way to share, improve and grow. The author can be reached at email@example.com