In my previous article “5G signaling: Why do telecom operators need an SCP?”, I outlined the importance of the SCP (Service Communication Proxy), the 5G signaling controller, in a 5GC network and why Communication Service Providers should invest in it right from the start of a 5GC deployment. In this post, I will describe the architecture of the SCP in more detail.
The new 5GC architecture is a set of network functions (NFs) that need to interact in order to deliver the required 5G service. To facilitate the communication between NFs (consumers and producers), 3GPP has introduced the SCP. For its architecture, 3GPP outlined three possible scenarios: SCP based on service mesh, SCP based on independent deployment units, and SCP based on name-based routing. In this article, we will discuss these three proposals and look at which one vendors have implemented.
SCP based on service mesh
As already explained in my previous post, the new 5G Core network (5GC) is completely based on what is called a Service-Based Architecture (SBA), which implements IT network principles and a cloud-native design approach. This makes it possible to reuse the existing web-scale technologies and open source software used in IT architectures. One of the technologies that 3GPP has recommended adopting is the “service mesh”.
Off-the-shelf service mesh
In the cloud-native ecosystem, applications are developed using a microservice architecture with a high degree of modularity: applications are composed of loosely coupled services that are easy to manage and deploy across a multitude of infrastructures. With this refactoring, an application can consist of hundreds of services, each of which might have many instances. This approach enhances the flexibility and efficiency of IT solutions, but as the number of microservices grows, service-to-service communication becomes more complex. The service mesh was introduced to overcome this challenge.
A service mesh is a dedicated infrastructure layer for facilitating service-to-service communications. It manages service discovery, controls the delivery of service requests, performs load balancing, encrypts data and adds observability features to microservices.
As illustrated in figure 1, a service mesh consists of two elements, the data plane and the control plane:
· The data plane is a set of proxies (sidecars) deployed alongside the application code. In the case of Kubernetes, a sidecar container can be deployed along with the application service container as part of the same Kubernetes Pod. These proxies handle the communication between microservices: all service calls to and from a service go through them. The proxy also acts as the point at which the service mesh features are introduced. It applies the rules for authentication, authorization, encryption, rate limiting and load balancing, handles service discovery, and implements logging, tracing, etc.
· The control plane is the brain of the mesh that coordinates the behavior of proxies and provides them with the rules needed to manage inter-service communications.
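To make the split between the two planes more concrete, here is a minimal Python sketch. It is only an illustration under simplifying assumptions: the class names (`Policy`, `SidecarProxy`, `ControlPlane`) and the two rules they enforce (an allow-list and a crude request limit) are hypothetical and do not correspond to the API of any real service mesh.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Policy:
    """Rules pushed by the control plane (hypothetical, heavily simplified)."""
    allowed_callers: set
    max_requests: int  # crude rate limit: total requests the proxy accepts


@dataclass
class SidecarProxy:
    """Data-plane element: sits next to one service and enforces its policy."""
    service: str
    policy: Optional[Policy] = None
    accepted: int = 0
    log: list = field(default_factory=list)

    def handle(self, caller: str) -> str:
        # Every call to the service passes through here first.
        if self.policy is None:
            return "rejected: no policy"
        if caller not in self.policy.allowed_callers:
            self.log.append(f"deny {caller}")
            return "rejected: unauthorized"
        if self.accepted >= self.policy.max_requests:
            self.log.append(f"throttle {caller}")
            return "rejected: rate limited"
        self.accepted += 1
        self.log.append(f"allow {caller}")
        return "forwarded"


class ControlPlane:
    """Control-plane element: holds the rules and pushes them to the proxies."""

    def __init__(self) -> None:
        self.proxies = []

    def attach(self, proxy: SidecarProxy) -> None:
        self.proxies.append(proxy)

    def push(self, service: str, policy: Policy) -> None:
        # Configure every proxy that fronts the given service.
        for proxy in self.proxies:
            if proxy.service == service:
                proxy.policy = policy
```

The application code never appears in this sketch, which is the point: the proxy enforces the rules and the control plane decides what the rules are.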
Service Discovery
One of the tasks handled by a service mesh is Service Discovery, a concept that comes with the adoption of the microservice architecture. A microservice application consists of a number of service instances that change dynamically. A client therefore needs a mechanism, such as Service Discovery, to locate a server automatically without a complex configuration process. This is done via a service registry that allows service instances to register and deregister themselves, so that it maintains a database of all available service instances.
The top service mesh solutions (e.g. Istio, Consul, Linkerd) have an internal registry that supplies the proxy sidecars with the information needed for service discovery.
In 5G, service discovery is provided by a 5GC NF called the NRF (Network Repository Function). The NRF works as a centralized repository for all the NFs: it maintains a record of the available NF instances and their supported services. NFs register and deregister themselves and their supported services with the NRF, and other NFs use this information to discover the available instances and their services.
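The registry idea can be sketched in a few lines of Python. This is a toy in-memory version with hypothetical names (`ServiceRegistry`, `pick`); real registries add health checks, expiry and replication, which are left out here.

```python
import random


class ServiceRegistry:
    """Toy in-memory registry: service name -> set of instance addresses."""

    def __init__(self):
        self._instances = {}

    def register(self, service, address):
        # Called by an instance when it starts.
        self._instances.setdefault(service, set()).add(address)

    def deregister(self, service, address):
        # Called by an instance when it stops; unknown entries are ignored.
        self._instances.get(service, set()).discard(address)

    def discover(self, service):
        # Return all known instances of a service, in a stable order.
        return sorted(self._instances.get(service, set()))

    def pick(self, service):
        # Client-side selection: choose one available instance at random.
        candidates = self.discover(service)
        if not candidates:
            raise LookupError(f"no instance registered for {service!r}")
        return random.choice(candidates)
```

A client only needs to know the service name; the addresses come from the registry at call time.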
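As a rough illustration of what NF registration and discovery look like on the wire, the following Python sketch only builds the requests and sends nothing. The resource paths and field names follow my reading of the NRF APIs in 3GPP TS 29.510 and should be checked against the specification. The NF profile is trimmed to three fields, far fewer than a real profile carries, and the host name and instance ID in the usage lines are made up.

```python
from urllib.parse import urlencode


def nrf_register_request(nrf_api_root, nf_instance_id, nf_type):
    """Build (method, url, body) for registering an NF instance in the NRF."""
    url = f"{nrf_api_root}/nnrf-nfm/v1/nf-instances/{nf_instance_id}"
    body = {
        # Trimmed profile: a real NF profile carries many more attributes.
        "nfInstanceId": nf_instance_id,
        "nfType": nf_type,
        "nfStatus": "REGISTERED",
    }
    return "PUT", url, body


def nrf_deregister_request(nrf_api_root, nf_instance_id):
    """Build (method, url) for removing an NF instance from the NRF."""
    return "DELETE", f"{nrf_api_root}/nnrf-nfm/v1/nf-instances/{nf_instance_id}"


def nrf_discovery_url(nrf_api_root, target_nf_type, requester_nf_type):
    """Build the GET URL a consumer uses to discover producer instances."""
    query = urlencode({
        "target-nf-type": target_nf_type,
        "requester-nf-type": requester_nf_type,
    })
    return f"{nrf_api_root}/nnrf-disc/v1/nf-instances?{query}"


# Usage with made-up values: an AMF looking for UDM instances.
NRF = "https://nrf.example.com"
method, url, profile = nrf_register_request(NRF, "udm-0001", "UDM")
discovery = nrf_discovery_url(NRF, "UDM", "AMF")
```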
3GPP proposal
By adopting cloud-native technology, the 5GC is reorganized into individual services called Network Functions (NFs) that dynamically discover each other and use the services that each of them offers. As a result, the new 5G network is confronted with the same challenges that IT solutions have faced. That is why 3GPP added the SCP in Release 16 of 3GPP TS 23.501, to deal with the same issues that the off-the-shelf service mesh addresses. [1]
Because of the similarities between SCP and “Service Mesh”, 3GPP has urged 5GC vendors to design their SCP solutions based on the service mesh architecture.
The design proposed by 3GPP (see figure 2) is based on a distributed model in which SCP service agents are co-located with 5GC network functions (NFs) and act as sidecar proxies. All the logic required for inter-service communication is abstracted out of the NF microservice and put into the sidecar. The service agent is injected as a sidecar container alongside the NF instance (in the same Pod for Kubernetes) without the knowledge of the NF. This means that the SCP is unknown to the NF, so the NF behaves as if there were no SCP in the path. Once a service agent has been deployed, it interacts with the service mesh controller, which is responsible for pushing all traffic management policies to all service agents.
In this deployment, the SCP manages registration and discovery for communication within the service mesh and it interacts with an external NRF for service exposure and communication across service mesh boundaries.
SCP based on independent deployment units
Like the service mesh architecture, this deployment consists of SCP agents and an SCP controller. The SCP agent acts as a proxy that implements the necessary peripheral tasks and provides NFs with indirect communication and delegated discovery. It acts as the HTTP intermediary between service consumers and service producers. The routing and selection policies that an SCP agent applies to a given request are those pushed by the SCP controller. The SCP controller and the SCP agents communicate over an SCP-internal interface (see figure 3).
The term “independent deployment units” indicates that the SCP agent is not co-located with the NF in the same deployment unit (e.g. a Kubernetes pod), which means that the SCP is a component known to the NF. The NF therefore has to address the SCP explicitly in order to use its functionality, which requires additional configuration on both the consumer and producer sides. The routing mechanisms also differ from those of the SCP based on service mesh (see clause 6.10 of 3GPP TS 29.500), and NFs should be configured to route all traffic toward the SCP. [2]
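To show what “explicitly addressing the SCP” can mean in practice, here is a small Python sketch of how a consumer might rewrite a request so that it goes to the SCP while still naming the intended producer. It relies on my understanding of the `3gpp-Sbi-Target-apiRoot` header described in clause 6.10 of TS 29.500. The helper name `via_scp`, the host names and the assumption that the producer's API root has no path prefix are all simplifications of mine.

```python
from urllib.parse import urlsplit


def via_scp(scp_api_root, producer_url):
    """Redirect a request for `producer_url` through the SCP.

    The request is sent to the SCP's authority, and the producer's API root
    (scheme + authority, assuming no path prefix) travels in a header so the
    SCP knows where to forward it.
    """
    parts = urlsplit(producer_url)
    target_api_root = f"{parts.scheme}://{parts.netloc}"
    path = parts.path + (f"?{parts.query}" if parts.query else "")
    return {
        "url": scp_api_root.rstrip("/") + path,
        "headers": {"3gpp-Sbi-Target-apiRoot": target_api_root},
    }


# Usage with made-up hosts: an AMF reading subscription data from a UDM.
request = via_scp(
    "https://scp.example.com",
    "https://udm1.example.com/nudm-sdm/v2/imsi-001010000000001/am-data",
)
```

In the sidecar model none of this is visible to the NF, since the proxy intercepts the traffic transparently. Here the consumer itself has to be configured with the SCP address.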
SCP name-based routing deployment
This scenario implements a name-based routing mechanism that provides IP over ICN (Information-Centric Networking) capabilities [3]. It comprises three essential components (see figure 4):
· Service router: acts as a communication proxy and provides discovery, registration, and routing through a path computation element. It is responsible for mapping IP-based messages onto ICN publications and subscriptions. Once ICN information is translated to an IP message, the task of selecting the optimum path is handled by another component called the PCE.
· Path Computation Element (PCE): is the core part of the SCP, which implements the Name-based Routing mechanism. It calculates a path between the consumer and the producer (e.g. the shortest path between the nodes).
· Registration and discovery service: to perform registration/discovery in the form of an internal service registrar and controller.
As you can see in figure 4, a service router resides as a single unit within a cluster and serves multiple 5G NFs within that cluster. Service router and 5G NFs are separate components but should be co-located within the same Service Deployment Cluster.
Service registrations are forwarded both to the internal registry and to the NRF, which is used to expose NF services outside the depicted SCP.
After selecting the producer, an NF consumer may communicate with it either directly, without any SCP involvement, or via the Service Router when the NF producer resides outside the consumer’s Service Deployment Cluster.
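The two decisions described in this section can be sketched in Python: whether a consumer talks to the producer directly or through the service router, and how a PCE might compute a path between service routers. The sketch uses Dijkstra's algorithm as one plausible way to find “the shortest path”; 3GPP does not prescribe this algorithm, and the topology, node names and link costs below are invented for illustration.

```python
import heapq


def choose_route(consumer_cluster, producer_cluster):
    """Direct call inside a cluster, service router across clusters."""
    if consumer_cluster == producer_cluster:
        return "direct"
    return "via-service-router"


def shortest_path(links, source, target):
    """Dijkstra over `links`: {node: {neighbour: cost}}. Returns a node list or None."""
    dist = {source: 0}
    prev = {}
    queue = [(0, source)]
    while queue:
        d, node = heapq.heappop(queue)
        if node == target:
            break
        if d > dist.get(node, float("inf")):
            continue  # stale queue entry, a shorter route was already found
        for neighbour, cost in links.get(node, {}).items():
            nd = d + cost
            if nd < dist.get(neighbour, float("inf")):
                dist[neighbour] = nd
                prev[neighbour] = node
                heapq.heappush(queue, (nd, neighbour))
    if target not in dist:
        return None  # no path between the two nodes
    # Walk back from the target to the source to rebuild the path.
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return path[::-1]


# Invented topology of three service routers with link costs.
topology = {
    "sr-a": {"sr-b": 1, "sr-c": 4},
    "sr-b": {"sr-c": 1},
    "sr-c": {},
}
path = shortest_path(topology, "sr-a", "sr-c")
```

With these costs, the computed path goes through `sr-b` (total cost 2) instead of taking the direct but more expensive link.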
Which architecture is implemented by SCP vendors?
5GC vendors are leveraging the service mesh technology already used in IT architectures to address SBA challenges. To handle congestion control, traffic prioritisation, overload control, optimized routing and similar concerns within a microservices architecture, many 5GC solution vendors are building their SCP on the service mesh architecture. Here we will discuss two of the leading vendors in the 5GC market, Nokia and Oracle.
Nokia CSD
Nokia’s SCP solution is called Nokia Cloud Signaling Director (CSD). It consists of two main components, a signaling plane and a mesh control plane (see figure 5). All 5G core signaling traverses a signaling plane composed of lightweight service proxies called NFPs (Network Function Proxies). The centralized mesh control plane is tasked with coordinating and applying management, operations, and policies to all NFPs. Nokia uses Istio for the mesh control plane and Envoy as the NFP. [4]
In Kubernetes, each NFP is automatically injected as a sidecar container alongside the instance of an NF service (in the same pod). Once an NFP has been deployed, it interacts with the mesh control plane over its APIs to implement and apply the necessary policies (observability and security, for example) to the 5G core signaling traffic flows.
The architecture described above is in line with the service mesh architecture used in IT solutions, where the proxy is co-located with the NF in the same deployment unit. The SCP is unknown to the NF, so, according to 3GPP, there is no HTTPS between the two elements. This kind of deployment is recommended when the SCP and the NFs are all provided by a single 5GC vendor.
Oracle SCP
Oracle SCP is modeled after the cloud-native service mesh solution and is made up of a control plane and data plane (see figure 6). The control plane is used to transfer routing rules from the controller to the worker, while the data plane is used to transport 5G messages. Oracle SCP is composed of Service Proxy Controllers and Service Proxy Workers:
· Service Proxy Controller — Learns network topology by subscribing to notifications from the NRF. It then derives routing policies and transfers them to the SCP workers. Also hosts the configuration interface for SCP.
· Service Proxy Workers — Use the routing policies to route the 5G signaling traffic between consumer and producer NFs.
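The controller/worker split can be illustrated with a short Python sketch. It is not Oracle's implementation: the class and method names are hypothetical, the notification is reduced to four arguments, and the workers simply rotate through the known producers. The event names mirror the NRF notification event types as I understand them from TS 29.510.

```python
from itertools import cycle


class ServiceProxyWorker:
    """Data plane: routes requests using the rules it was given."""

    def __init__(self):
        self._cursors = {}

    def load_rules(self, routes):
        # routes: {nf_type: [address, ...]}; keep one round-robin cursor per type.
        self._cursors = {t: cycle(list(a)) for t, a in routes.items() if a}

    def route(self, nf_type):
        if nf_type not in self._cursors:
            raise LookupError(f"no producer known for {nf_type}")
        return next(self._cursors[nf_type])


class ServiceProxyController:
    """Control plane: learns the topology from the NRF and pushes routing rules."""

    def __init__(self, workers):
        self._workers = workers
        self._topology = {}  # nf_type -> {instance_id: address}

    def on_nrf_notification(self, event, nf_type, instance_id, address=None):
        instances = self._topology.setdefault(nf_type, {})
        if event == "NF_REGISTERED":
            instances[instance_id] = address
        elif event == "NF_DEREGISTERED":
            instances.pop(instance_id, None)
        # Derive the routing rules and transfer them to every worker.
        routes = {t: list(i.values()) for t, i in self._topology.items()}
        for worker in self._workers:
            worker.load_rules(routes)


# Usage with made-up instances: two UDMs appear, then one disappears.
worker = ServiceProxyWorker()
controller = ServiceProxyController([worker])
controller.on_nrf_notification("NF_REGISTERED", "UDM", "udm-1", "10.0.0.1")
controller.on_nrf_notification("NF_REGISTERED", "UDM", "udm-2", "10.0.0.2")
```

The worker never talks to the NRF itself. It only sees the rules derived by the controller, which keeps the signaling path simple.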
Oracle’s SCP leverages the existing IT service mesh (Istio/Envoy) and adds some capabilities to address many of the challenges caused by the SBA architecture. [6]
Unlike a standard Istio/Envoy deployment, the proxies in the figure above are not deployed as sidecars alongside the instance of a 5G NF but as part of the SCP microservices. This means that the SCP is a standalone component known to the 5GC elements, so all NFs should be configured to route their traffic toward the SCP. This kind of architecture is somewhat similar to the SCP based on independent deployment units. It is a mix of the service mesh architecture and traditional telco architectures, which makes it a good fit for a multivendor 5GC solution.
Conclusion
The best SCP architecture should combine the flexibility of modern IT architectures with the reliability and security of traditional telecom solutions. By adopting cloud-native technology, a 5GC network element should be able to take advantage of IT web-scale models that have proven to be agile, cost-effective and customer-oriented. That’s why telco vendors are using open source technologies like service mesh to build their SCP solutions. They make some modifications to the existing IT service mesh (like Istio/Envoy) to fit the requirements of 3GPP and manage the signaling traffic for the 5G deployments.
References:
[1] 3GPP TS 23.501 V16: Annex G
[2] 3GPP TS 29.500 V16: 6.10 Support of Indirect Communication
[3] George Xylomenos, “IP over ICN goes live”
[4] Nokia Software and SK Telecom, technical whitepaper “Evaluation of a Service-mesh based Service Communication Proxy for Future 5G Core Network”
[5] ABI Research: Importance of signaling in 5G