Beyond the Specs: AI and the Future of DOCSIS Expertise

DOCSIS 4.0 chatbot: A Use Case for Generative AI in the Cable Industry

For anyone who has worked in the Cable industry for any length of time, and specifically with DOCSIS as a technology delivered over Hybrid Fiber-Coaxial (HFC) networks, the complexities of the DOCSIS communications protocol and its associated management models are well known. Each release of the DOCSIS specification suite takes many years to develop, with vendors spending several more years developing and building new network devices. Once Cable Operators deploy a new technology, they tend to operate this equipment for the next 7-10 years before the next rollout of upgraded equipment begins. Effectively, a DOCSIS release may have a total lifecycle of 15-20 years from inception to decommissioning. On top of that, each new release of DOCSIS adds more complexity to the access network and to the business and operational support systems. This requires more time in the integration stages that occur before and during deployment of the new systems and Customer Premises Equipment (CPE), further extending those timelines.

DOCSIS 4.0

DOCSIS 4.0 is a great example of how DOCSIS has evolved in both capability and complexity. There are two mutually exclusive options for deploying DOCSIS 4.0 technology. Both require some amount of HFC plant upgrade, new hardware (both CPE and head-end), and changes to software, processes and maintenance methodologies that affect the entire supply chain (we won’t get into Distributed Access Architectures in this article, though DOCSIS 4.0 requires a disaggregated approach). Below is a summary of the two versions of DOCSIS 4.0.

Frequency Division Duplexing DOCSIS

Frequency Division Duplexing (FDD), sometimes referred to as Extended Spectrum DOCSIS (ESD), is an option that relies on dedicated spectrum, with the upstream and downstream directions allocated to separate bands. As the ESD name implies, it is similar to previous versions of DOCSIS in most respects, but extends the spectrum for both the upstream and downstream bands to increase capacity. FDD’s extended spectrum supports downstream operation to 1794 MHz while supporting existing upper band edges for backward compatibility. Support for 1794 MHz requires upgrading both the HFC plant’s active components (an amplifier’s diplex filter has a fixed band split, so the filter must be replaced) and its passive components (splitters, taps, faceplates) to support the higher frequencies. In the upstream direction, an FDD Cable Modem (CM) can make use of extended high-split options including 300 MHz, 396 MHz, 492 MHz, or 684 MHz, with corresponding support for the chosen upstream split in the HFC plant (diplex filters in the plant’s amplifiers) and at the CM. A CM is generally expected to support only one or two of the defined upstream bands, using a switchable diplex filter controlled by management software to switch between bands. A certified DOCSIS 4.0 CM will need to comply with the provisions for the FDD version of the standard to operate in an FDD-enabled network. Note that this is independent of the spectrum plan selected by the Operator, which adds a further consideration for customers who like to buy their own modems.
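To make the modem compatibility point concrete, here is a minimal Python sketch that checks which modems could operate on a plant with a given upstream split. The split values come from the paragraph above. The `FddModem` class, the modem names and their supported splits are hypothetical and only illustrate the idea that a CM may support just one or two of the defined bands.

```python
from dataclasses import dataclass

# Upstream split options (MHz) listed in the text for FDD operation.
FDD_UPSTREAM_SPLITS_MHZ = (300, 396, 492, 684)


@dataclass(frozen=True)
class FddModem:
    """Hypothetical model of an FDD cable modem with a switchable diplex filter."""
    model: str
    supported_splits_mhz: frozenset

    def can_operate_on(self, plant_split_mhz: int) -> bool:
        # The CM needs a diplex filter position that matches the plant's split.
        return plant_split_mhz in self.supported_splits_mhz


def usable_modems(modems, plant_split_mhz):
    """Return the models that can operate on a plant with the given upstream split."""
    if plant_split_mhz not in FDD_UPSTREAM_SPLITS_MHZ:
        raise ValueError(f"{plant_split_mhz} MHz is not one of the splits listed in the text")
    return [m.model for m in modems if m.can_operate_on(plant_split_mhz)]


# Illustrative inventory; the model names and capabilities are made up.
modems = [
    FddModem("cm-a", frozenset({300, 396})),
    FddModem("cm-b", frozenset({492, 684})),
    FddModem("cm-c", frozenset({396})),
]
print(usable_modems(modems, 396))  # ['cm-a', 'cm-c']
```

A customer who buys a modem like the hypothetical "cm-b" could not use it on a plant built around the 396 MHz split, even though both are DOCSIS 4.0 FDD.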

Full Duplex DOCSIS

Full Duplex DOCSIS (FDX) relies on advanced echo-cancellation technology to support both upstream and downstream traffic operating in the same spectrum. It offers an upper band edge of 1218 MHz. FDX-specific HFC network upgrades include replacing existing amplifiers with those suitable for use with FDX, or extending the fiber plant deeper into the network and operating the coaxial portion beyond the node as a passive coaxial network. The latter is referred to colloquially as “node + 0.” Because FDX does not require frequencies above 1218 MHz, passive components of the network such as splitters, taps, drops and faceplates do not need to be replaced, provided they already support 1218 MHz. A certified DOCSIS 4.0 CM will need to comply with the specific requirements of the FDX provisions of the standard in order to operate in an FDX-enabled network.
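The rule for passive components can be written as a small check. The sketch below is illustrative only: the 1218 MHz band edge comes from the paragraph above, while the inventory, the component names and their frequency ratings are invented for the example.

```python
FDX_UPPER_EDGE_MHZ = 1218  # upper band edge for FDX, per the text


def passives_to_replace(passives, required_upper_edge_mhz=FDX_UPPER_EDGE_MHZ):
    """Return the names of passive components rated below the required band edge.

    `passives` maps a component name to its rated upper frequency in MHz.
    """
    return sorted(
        name
        for name, rated_mhz in passives.items()
        if rated_mhz < required_upper_edge_mhz
    )


# Hypothetical inventory for one service group; the ratings are illustrative.
inventory = {"tap-12": 1218, "splitter-3": 1002, "faceplate-7": 1218, "tap-4": 860}
print(passives_to_replace(inventory))  # ['splitter-3', 'tap-4']
```

Passing a higher `required_upper_edge_mhz` shows why an extended-spectrum plan touches far more of the plant: every component in this sample inventory would then fall below the requirement.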

Where are the Subject Matter Experts?

One of the challenges our industry faces is the shortage of technical expertise in DOCSIS technologies, along with the difficulty of retaining existing talent and training new talent. Understanding the different layers of the technology stack can take a decade or more. There is the physical layer (L1) that requires Radio Frequency (RF) expertise, the MAC and Upper Layer Protocols Interface (MULPI) layer (L2) that requires communication protocol experience, the Operational Support System (OSS) management and control layer (L3+) and finally the security features that span all layers of the technology. Many of the original DOCSIS Subject Matter Experts (SMEs) who have been involved across the different DOCSIS releases are reaching retirement age. The barrier to entry for learning the technology is fairly high because the standard spans so many layers, and its complexity has increased with each iteration.

A Very Large Unstructured Dataset

As one metric, there are five CableLabs specifications for DOCSIS 4.0 which specify Layers 1-3 including Security. Those specifications, which do not include the management and control data models, encompass over 2600 pages of text. There are an additional 200+ pages of text for the DOCSIS 3.1 Physical Layer, which DOCSIS 4.0 requires as well. If we factor in the Distributed Access Architecture Remote PHY specifications which are required by DOCSIS 4.0, that adds another 7 specifications and 1300 pages of text. Additionally, CableLabs publishes technical reports and guideline documents, and there are many industry white papers and other publications. All in all, this is a significant amount of information in textual and diagram format (excluding data models, which use formal modeling languages). How feasible is it for someone to master this much material by studying text-based documents?
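As a rough back-of-the-envelope illustration, the page counts above can be added up and turned into a reading estimate. The page figures are the approximate ones quoted in the text. The pace of 25 pages per working day is purely an assumption chosen for the example, and a single read-through is of course not the same as mastery.

```python
# Approximate page counts quoted in the text ("over 2600", "200+", "1300").
PAGES = {
    "DOCSIS 4.0 specifications (5 documents)": 2600,
    "DOCSIS 3.1 Physical Layer": 200,
    "Remote PHY specifications (7 documents)": 1300,
}


def reading_days(total_pages, pages_per_day):
    """Working days needed for one pass through the material, rounded up."""
    return -(-total_pages // pages_per_day)  # ceiling division on integers


total = sum(PAGES.values())
print(total)                    # 4100
print(reading_days(total, 25))  # 164 (assumed pace: 25 pages per working day)
```

Even at that assumed pace, one pass through the core specifications alone takes most of a working year, before any technical reports, guidelines or white papers are counted.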

Generative Artificial Intelligence

Enter a new wave of Generative Artificial Intelligence (AI). Generative AI refers to a broad set of AI models that can generate new, creative outputs from a set of inputs. The output could be text (like ChatGPT), images (like Stable Diffusion), music, or any other form of data. Generative models learn the underlying patterns in the data they are trained on, and can then use this learning to create new data that follows the same patterns (keep in mind how many pages of specification text the DOCSIS standard encompasses). For example, a generative model trained on a dataset of paintings could generate a new painting in a similar style. How about an image or painting of a future Mars colony designed in an art deco theme? A Generative AI can do that and more.

Language Models

The current generation of emerging Generative AI applications is based on Language Models (LMs). In this context, an LM is a type of AI model that is trained to understand, generate, and/or manipulate human language. Large Language Models (LLMs) are “large” in the sense that they have a vast number of parameters, which are the parts of the model that are learned from training data. One of the most well-known examples of an LLM is GPT-3 (Generative Pretrained Transformer 3) developed by OpenAI. GPT-3 has 175 billion parameters, making it one of the largest language models until GPT-4 was released a few months ago. GPT-4 is reportedly six times larger than GPT-3, with a trillion parameters. These models are trained on a diverse range of Internet text, but because of the scale of that training data, they cannot point to the specific documents, books, or other sources they were trained on. OpenAI’s ChatGPT4 has a basic understanding of DOCSIS 3.1 from a high-level feature viewpoint (based on a few prompts). However, knowledge of DOCSIS 4.0 in the existing data set is almost non-existent.
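To give a feel for what these parameter counts mean, the short sketch below does the arithmetic. The 175 billion figure and the "six times larger" multiple come from the text. The assumption of 2 bytes per parameter (16-bit weights) is mine, and the result covers raw weight storage only, not the memory needed to train or serve a model.

```python
GPT3_PARAMS = 175_000_000_000  # parameter count quoted in the text


def weights_size_gb(num_params, bytes_per_param=2):
    """Approximate size of the raw weights in gigabytes (1 GB = 1e9 bytes).

    bytes_per_param=2 assumes 16-bit weights; this is an illustrative assumption.
    """
    return num_params * bytes_per_param / 1e9


print(weights_size_gb(GPT3_PARAMS))  # 350.0
print(GPT3_PARAMS * 6 / 1e12)        # 1.05 (trillion parameters at six times GPT-3)
```

The second line shows that the reported "six times larger" and "a trillion parameters" figures are consistent with each other.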

Fine-tuned or Domain-specific Language Models (DLMs) are generally smaller than LLMs and are fine-tuned for specific tasks. Fine-tuning can improve a model’s ability to perform a task and can also bolster a model’s understanding of certain subject matter. These models are good for mature tasks with lots of training data such as machine translation, question answering, and information retrieval. An example of this type of model is OpenAI’s Codex, which is a direct descendant of GPT-3 fine-tuned for programming tasks.

DOCSIS 4.0 Domain-specific LLM

What might a Domain-specific Language Model trained on thousands of pages of CableLabs’ specification text be capable of? There are many viable use cases for how such a DOCSIS 4.0 LLM could be leveraged, once a foundational model is built and trained. This article does not discuss how such an LLM might be developed and trained, but introduces how powerful this new toolset could be.

Meet Jarvis, a Self-Learning DOCSIS 4.0 Chatbot

A chatbot is a type of application that is designed to interact with users using ordinary dialogue. Chatbots are often used in customer service situations to automate repetitive, menial tasks that would otherwise require human interaction. They can be based on a set of predetermined rules, or can be more sophisticated and use machine learning techniques to improve their ability to comprehend and predict the most likely outcome over time. Most Cable Operators today provide Customer Support chatbots (often inside customer web portals) of the rule-based kind. Self-learning chatbots use machine learning algorithms to understand the context of a conversation and generate responses. They can also learn from past interactions and improve their performance over time by pruning out predictions that went wrong.
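For contrast with a self-learning chatbot, a rule-based one can be sketched in a few lines of Python. The rules, patterns and replies below are hypothetical and are not taken from any Operator's system. The point is only that such a bot can answer nothing beyond what its rules anticipate.

```python
import re

# Hypothetical rules: each pairs a regular expression with a canned reply.
RULES = [
    (re.compile(r"\b(outage|down|no internet)\b", re.I),
     "I can check for outages in your area."),
    (re.compile(r"\b(bill|payment|invoice)\b", re.I),
     "I can help with billing questions."),
    (re.compile(r"\b(modem|router)\b", re.I),
     "Let's try restarting your modem first."),
]
FALLBACK = "Sorry, I did not understand. Connecting you to an agent."


def reply(message):
    """Return the reply of the first matching rule, or the fallback."""
    for pattern, answer in RULES:
        if pattern.search(message):
            return answer
    return FALLBACK


print(reply("My internet is down"))
print(reply("Explain upstream channel bonding"))
```

The first message matches the outage rule. The second, a technical DOCSIS question, matches nothing and falls through to the fallback.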

What if we had a self-learning chatbot that understood DOCSIS 4.0 at a detailed technical level? Instead of reading thousands of pages of CableLabs specification text, I might ask the DOCSIS chatbot (Jarvis in this example) a specific question about the protocol, or ask it to summarize a concept based on its understanding of DOCSIS. Chatbots are excellent at extracting, summarizing and synthesizing information in many different modes and tones. For example, I might ask Jarvis to explain upstream channel bonding as if it were a teacher and to use a more conversational tone in its response. I might also ask Jarvis to create a training class for DOCSIS 4.0 that I could deliver to my organization’s operations team. The range of such requests is nearly unlimited, and generating a response takes only a few seconds. As the DOCSIS LLM matures, Jarvis might gain the intelligence to generate new DOCSIS features, solve existing technical challenges, validate existing specifications, or find defects in those specifications and recommend one or more resolutions.
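A request like the ones above usually reaches a language model as a structured prompt. The sketch below assembles one in Python. The `build_prompt` function, its parameters and the prompt wording are hypothetical, the excerpt strings are placeholders rather than real specification text, and no model is actually called, since how Jarvis would be built and served is outside the scope of this article.

```python
def build_prompt(question, persona="DOCSIS 4.0 subject matter expert",
                 tone="neutral", excerpts=()):
    """Assemble a plain-text prompt for a hypothetical DOCSIS chatbot.

    `excerpts` is an optional sequence of specification passages that the
    answer should be grounded in.
    """
    lines = [f"You are Jarvis, a {persona}.", f"Answer in a {tone} tone."]
    if excerpts:
        lines.append("Base your answer only on these specification excerpts:")
        lines.extend(f"[{i}] {text}" for i, text in enumerate(excerpts, start=1))
    lines.append(f"Question: {question}")
    return "\n".join(lines)


prompt = build_prompt(
    "Explain upstream channel bonding.",
    persona="teacher of DOCSIS fundamentals",
    tone="conversational",
    excerpts=["<placeholder: passage from the MULPI specification>"],
)
print(prompt)
```

Changing only `persona` and `tone` turns the same question into a classroom explanation, an executive summary or an outline for a training class.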

Conclusion

The realm of cable technology, specifically DOCSIS, remains a highly intricate, constantly evolving field that demands an extensive planning and deployment lifecycle for every release. Amid this complexity, DOCSIS 4.0, whether applied through Frequency Division Duplexing or Full Duplex DOCSIS, necessitates extensive upgrades, integration and other changes that impact the entire supply chain. Mastery of this expansive and unstructured dataset presents a challenge, compounded by the impending retirement of many DOCSIS subject matter experts.

The advent of Generative AI and Domain-specific Language Models offers promising solutions to these current and near-term challenges. Such AI-based technologies, exemplified by OpenAI’s GPT series and embodied in the conceptual “Jarvis”, a self-learning DOCSIS 4.0 chatbot, could provide invaluable support to the industry. These AI systems could potentially digest and comprehend thousands of pages of specifications and guidelines and then deliver precise information to whoever needs it. They could teach, and even devise new approaches within the DOCSIS sphere, thus addressing the industry’s technical expertise shortage while making such a complex technology easier to understand. By embracing the capabilities of these AI models, the cable industry could allow its workforce to navigate the complexity of DOCSIS more easily, heralding an exciting new era of information management and technological evolution.

Special thanks to Kirk Erichsen and ChatGPT4 with Bing integration for contributing to this article.

1905 Iris Ave
Boulder, CO 80304 US

+1-720-470-7091

8 AM - 5 PM MT Mon-Fri
