Kaloom's Startup Journey - Part 2
This is the second of a two-part blog series that tells the story of why we founded Kaloom and where we stand today.
Open Source - Standing on the Shoulders of Giants
When we first came to market, we were dismissed as “mad” by a reputable industry analyst. Two years later, that same analyst came back, apologized, and said, “Sorry, you were right!” Talk about validation.
From the very beginning, we wanted our solution to fulfill the real vision of SDN and NFV by building it on community-based open-source standards such as those from the IETF, ONF, The Broadband Forum, Linux, Kubernetes, and many others. In particular, the P4 programming language is fundamental to us for reasons outlined in our CTO’s blog titled “P4 and After,” mainly because it allows software to flexibly program hardware to add new functions (for example, network slicing, load balancing and firewalling) in a way that was not possible before.
We decided to build a solution with fully standardized management interfaces so that Kaloom’s solution could interface with any controller, including OpenStack and OpenShift. All of Kaloom’s management interfaces are based on IETF NETCONF specifications and associated YANG models.
Relying strictly on standard mechanisms and protocols permits Kaloom’s solution to interoperate with other standards-based solutions. It is therefore possible to integrate networking functions from Kaloom and/or third-party companies, based on the preference of our customers. As we offer additional features in the future, an open standards-based platform enables our clients to choose which solutions they prefer to implement. Kaloom’s vision aligns with the reasoning behind the founding of the Open Compute Project: “to break open the black box of proprietary IT infrastructure to achieve greater choice, customization, and cost savings.”
Consolidating and Offloading the Data Plane
As outlined in a previous Kaloom blog about SDN’s challenges, SDN has brought many benefits to cloud and data center environments. However, it has also added to the already accelerating proliferation of virtual machines (VMs), creating even more code, file and server bloat and increasing overall system latency. Network equipment functions that were formerly embedded in hardware, such as routing, switching, gateways, firewall/security, load balancing and many others, have been converted into VMs. With VMs now handling a variety of network functions that used to be done by ASIC-based hardware, not only has server bloat increased, eating up the cloud’s valuable compute and storage resources, but extra network latency can also be introduced by the circuitous pathways through virtual architectures. The data plane path now often “trombones” back and forth through numerous VMs. In addition to this serpentine network pathway, each VM has its own data plane code, much of it redundant with that of the other VMs, to propagate this meandering packet journey.
Kaloom’s products are based on a clear separation between the control plane and the data plane. Our control plane is written in Golang and runs in containers on a Kubernetes clustering framework. Our data plane is written in P4. Without a doubt, Kaloom has built the most modern networking solution in the industry. The main advantages resulting from our implementation architecture are modularity, scalability, flexibility, and increased productivity compared to traditional C-Linux networking solutions. Moreover, the use of P4-enabled switches results in significant improvements in throughput and latency. Kaloom’s ability to migrate the data plane components of virtual network functions from x86 servers to these Tb/s P4 switches provides substantial CAPEX and OPEX reductions to our customers.
Kaloom has created an edge data center fabric capable of running any L2-L7 network function at scale with unprecedented throughput and low latency. By migrating network functions such as UPF, load balancing and others, Kaloom frees up many servers to run revenue-generating applications rather than internal infrastructure services.
What Happened with P4
Lucky for us, we founded our company a year after some famous networking experts formed the P4 Networking Consortium. Stanford’s Nick McKeown, known as one of the “fathers of SDN,” was among the visionaries who created P4 and would go on to launch Barefoot Networks. As an early adopter, Kaloom started developing in 2015 on P4’s initial specification, P4-14, as it was the only version supported by the high-performance Barefoot (now Intel) Tofino™ Ethernet ASIC. We then migrated our development to the more recent P4-16. In 2016, Barefoot Networks was first to market with its P4-programmable Tofino™ switch chip. At the time, it was the fastest switch chip and enabled the first switches ever to be fully user programmable. Barefoot was acquired by Intel last year, and today we see P4 support and verification coming from nearly the entire networking industry, including Intel, Xilinx and Netronome, among others. For our part, the Kaloom data plane can execute on any of the Tofino-based ODM platforms, including those from Accton/Edgecore, Delta Networks, Inc./DNI, Foxconn/UfiSpace, Interface Masters Technologies, Inventec and STORDIS.
Networking giant Cisco has been designing and building its own chips for its networking gear for more than 25 years. Previously, the company didn’t talk about that, focusing its message instead on the systems that the silicon enabled. Last December, everything changed when Cisco announced a move into the chip market and a challenge to Broadcom. “Silicon One” is its migration strategy for the next four or five years, which will also make all Cisco products P4-capable “for service providers and web-scale providers.” This shows that Kaloom made the right choice many years ago and was even more prescient than many of the world’s networking giants.
NFV/VNFs and the Distributed Edge
Service providers want and need to enable NFV/VNFs. Because containers are far more efficient than VMs, they are looking to migrate their VM-based OpenStack deployments to container-based OpenShift environments to enable container network functions (CNFs). This migration is a huge challenge for them.
For example, let’s say 99 percent of the power in a central office (CO) building is already utilized and an operator wants to deploy CNFs to support 5G. They can’t replace the old and inefficient legacy equipment because of the costs and complexity involved. Perhaps they have 15 kilowatts of power left available, so they can deploy 18 additional servers, which is not very many. With a traditional networking infrastructure, 3 dedicated servers are usually needed to run the network fabric controller and network overlay manager. Moreover, a minimum of 3 dedicated servers is also needed to provide the UPF with a capacity of 500 Gbps. Finally, 3 dedicated servers are needed to run the Kubernetes Masters. Consequently, 9 of the 18 servers, or half, are used for running infrastructure services rather than customers’ revenue-generating applications. Kaloom and Red Hat have jointly introduced the most cost-effective solution for 5G Edge data centers: Kaloom’s Unified Edge Fabric.
The Unified Edge
The image below shows an example of the dramatic difference that Kaloom’s solution enables in a 44 rack unit (RU) rack: a conventional (non-Kaloom) container-based edge deployment is compared with Kaloom and Red Hat’s unified edge running on Red Hat’s OpenShift, shown to the right.
With the Kaloom Unified Edge, we are consolidating the fabric controller, the UPF control plane, and the OpenShift Kubernetes Masters on the embedded Xeon processors available in the network switches. Furthermore, we are running the UPF data plane function on the Intel Tofino chipset. Therefore, all 18 servers can be used to run OpenShift applications rather than network infrastructure services.
If you’ve read my previous blog, you know that’s nine additional servers at $25,000 apiece, so it’s a savings of $225,000 in a single rack! Plus, the servers are now available to run differentiated, revenue-generating apps like connected car, AR/VR and CDNs (Netflix) within the same limited space, power and compute resources. If you multiply this by the 25,000 COs in North America, you get a REALLY big number in terms of megawatts, space and money saved, not to mention the additional revenues that our joint, unified edge solution enables. Kaloom’s solution allows service providers to maximize their limited space, power, compute and storage resources for revenue-generating services that enable them to earn more.
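For readers who want to check the numbers, here is a minimal Python sketch of the arithmetic in the rack example. The server counts and the $25,000 price come from the text above; the variable names are illustrative, and the final figure assumes, purely for illustration, exactly one such rack per central office, which the text does not state.

```python
# Back-of-the-envelope check of the rack example.
# Figures are taken from the text; names are illustrative only.

TOTAL_SERVERS = 18            # servers that fit in the remaining 15 kW
COST_PER_SERVER_USD = 25_000  # price per server quoted in the text
CENTRAL_OFFICES = 25_000      # COs in North America, per the text

# Servers a traditional deployment dedicates to infrastructure services.
infrastructure_servers = {
    "fabric controller and overlay manager": 3,
    "UPF (500 Gbps)": 3,
    "Kubernetes Masters": 3,
}

freed = sum(infrastructure_servers.values())
share = freed / TOTAL_SERVERS
savings_per_rack = freed * COST_PER_SERVER_USD

# ASSUMPTION (not in the text): one such rack per central office.
savings_all_cos = savings_per_rack * CENTRAL_OFFICES

print(f"Infrastructure servers: {freed} of {TOTAL_SERVERS} ({share:.0%})")
print(f"Savings per rack: ${savings_per_rack:,}")
print(f"Savings at one rack per CO: ${savings_all_cos:,}")
```

Under these assumptions the script reports 9 of 18 servers (50%), $225,000 per rack, and $5,625,000,000 across 25,000 COs. The last number is only as good as the one-rack-per-CO assumption.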
Kaloom’s Many Differentiators
Kaloom’s quest to solve the many challenges created by 5G deployments led to containerization, offloading, consolidation and P4-enabled networking functions, which together created real-world differentiators.
Most networking solutions offer the same set of functions but rely upon different price points, sales teams and marketing messages to make them stand out among competitors. It was clear that we couldn’t just containerize the same features as other networking vendors onto white boxes and hope the cost, space and power savings would be enough for service providers to switch from their incumbents — even when our costs, latency and throughput were 10x better.
From the start, Kaloom decided that it would be the first vendor to properly support network slicing. Kaloom’s Cloud Edge Fabric natively supports network slicing, whereby an edge data center can be partitioned into multiple independent virtual data centers (vDC), with each vDC being provided its own virtual fabric called “a vFabric.” 5G-enabled secure, end-to-end, fully isolated network slices even have separate internal IP addresses, and each associated vFabric can be assigned to a different vDC operator. In this regard, slicing permits multiple operators to share a common distributed cloud infrastructure, with each entity enjoying full isolation down to the hardware level for better security and a better quality of experience. vFabric is also superior to VXLAN because users can’t “view” others’ network traffic.
For example, it is possible to create one vDC that occupies 10 physical servers and another that occupies 6, depending on the application requirements. Kaloom supports any combination of vFabric servers. As explained in another of our CTO’s blogs, “5G Drives Cloud-Native Distributed Edge,” different service providers (for example, MVNOs) could share the same physical network resources while maintaining separate SLAs and offering differentiated services on each slice. This could make initial 5G service rollouts financially viable by minimizing costs and risks via shared infrastructure.
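To make the 10-and-6 example concrete, here is a small Python sketch of the bookkeeping behind partitioning a pool of physical servers into vDCs that never share a server. It is only a conceptual model under our own assumptions: the class, method and server names are hypothetical and are not Kaloom’s API, and the hardware-level isolation provided by a vFabric is not modeled.

```python
from dataclasses import dataclass, field


@dataclass
class EdgeDataCenter:
    """Hypothetical model: a pool of servers split into disjoint vDCs."""
    servers: set
    vdcs: dict = field(default_factory=dict)

    def create_vdc(self, name, count):
        # Servers already assigned to any existing vDC are off limits.
        used = set().union(*self.vdcs.values())
        free = sorted(self.servers - used)
        if name in self.vdcs:
            raise ValueError(f"vDC {name!r} already exists")
        if count > len(free):
            raise ValueError(f"only {len(free)} servers are free")
        self.vdcs[name] = set(free[:count])
        return self.vdcs[name]


# A 16-server edge data center split as in the example: 10 + 6.
edge = EdgeDataCenter(servers={f"server-{i:02d}" for i in range(1, 17)})
vdc_a = edge.create_vdc("vdc-a", 10)
vdc_b = edge.create_vdc("vdc-b", 6)

print(len(vdc_a), len(vdc_b), vdc_a.isdisjoint(vdc_b))
```

Each vDC gets its own set of servers, and a request that would overlap an existing vDC or exceed the free pool is rejected.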
Another differentiator involves the Kubernetes Container Network Interface (CNI). We added multi-networking and dynamic run-time capabilities to our CNI plugin, named Kactus. This makes it possible to connect multiple network interfaces to the various pods running within a Kubernetes cluster. Using Kactus, networks can be updated or changed dynamically at run-time, which gives Kubernetes environments a way to discover and adapt to changes in the networking environment as they happen.
Another benefit of our approach is scalability. Whether our clients require small distributed edge or centralized hyperscale deployments, Kaloom’s solution can be deployed in any scenario to meet the scale that is required.
In-band Network Telemetry (INT) is a framework designed to allow the collection and reporting of network state by the data plane, without requiring intervention or work by the control plane. Because Kaloom’s solution uses P4-enabled Tofino, it has INT capabilities for clients that want to monitor what’s really happening on their networks in real time. Real-time visibility and management capabilities are very important requirements for troubleshooting and for guaranteeing the quality-of-service terms defined in service providers’ SLAs. Because the solution is not an overlay and interacts directly with the Linux kernel, it provides direct visibility and control of the network in real time.
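The idea behind INT can be illustrated with a short Python sketch: each switch on the path appends its own metadata to the packet, and the last hop strips it off and reports it, so no control plane has to poll anything. This is a conceptual model only. The field and function names are our own simplifications, loosely inspired by the kind of per-hop metadata INT carries, and this is not P4 code or Kaloom’s implementation.

```python
from dataclasses import dataclass, field


@dataclass
class HopMetadata:
    """Hypothetical per-hop record added by a switch's data plane."""
    switch_id: str
    ingress_ns: int
    egress_ns: int
    queue_depth: int


@dataclass
class Packet:
    payload: bytes
    int_stack: list = field(default_factory=list)


def transit(packet, switch_id, ingress_ns, egress_ns, queue_depth):
    # Each switch appends its metadata as the packet passes through.
    packet.int_stack.append(
        HopMetadata(switch_id, ingress_ns, egress_ns, queue_depth)
    )
    return packet


def sink_report(packet):
    # The last hop removes the telemetry and reports per-hop latency.
    report = [
        (hop.switch_id, hop.egress_ns - hop.ingress_ns, hop.queue_depth)
        for hop in packet.int_stack
    ]
    packet.int_stack = []
    return report


pkt = Packet(payload=b"user data")
transit(pkt, "leaf-1", 1000, 1400, 2)
transit(pkt, "spine-1", 2000, 2900, 7)
transit(pkt, "leaf-2", 3500, 3800, 1)

report = sink_report(pkt)
slowest = max(report, key=lambda hop: hop[1])
print(f"Slowest hop: {slowest[0]} ({slowest[1]} ns)")
```

With the made-up timestamps above, the report singles out spine-1 at 900 ns as the slowest hop, which is the kind of per-hop visibility that helps when troubleshooting against an SLA.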
These are just a few examples of our current differentiators, though we expect to continue building more in the future.
Flexibly Satisfying Business Demands
When it comes to 5G, businesses and service providers alike need much lower-cost solutions that can evolve fluidly so they don’t have to keep buying new hardware every few years. However, lowering space requirements, equipment costs, and costs per Gbps and per kilowatt-hour is not enough on its own. At the end of the day, businesses require solutions that meet very specific technical demands that are unique to them.
Today, technology and trends have come to where we thought the world was headed when we founded Kaloom, with many newer open source organizations increasing awareness of the challenges of containerized 5G environments and delivering new code and standards for the market. With the maturation of P4-programmable switches, many media, analysts and investors now fully understand and support our initial vision.
Our journey isn’t over and our quest continues. Kaloom hasn’t had to change anything regarding its vision; it is the world around us that has changed. We had the right vision all along. We bet strategically on technologies like P4 early on, and now the industry is rallying around P4 as the domain-specific language for networking. That’s a good sign for our service provider clients and their end-user customers. We can’t wait to see what comes next!