TruncateWordsHTMLParser
This class provides specialized HTML parsing functionality for truncating content based on a specific word count. It processes text data by splitting it into individual words, joining a subset determined by the remaining word limit, and escaping the resulting output for safe HTML rendering.
Methods
process()
@classmethod
def process(
    data: string
) -> tuple
Splits the input text into individual words and truncates the content based on the remaining word count allowed for the HTML document.
Parameters
| Name | Type | Description |
|---|---|---|
| data | string | The raw text content extracted from an HTML node to be processed and truncated. |
Returns
| Type | Description |
|---|---|
| tuple | A tuple containing the list of all split words and the HTML-escaped string of the truncated text. |
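
The split, truncate, and escape steps described above can be sketched as a standalone function. This is a minimal illustration and not the class's actual implementation: the name `process_words` and the explicit `remaining` parameter are hypothetical (the real method presumably reads the remaining word count from parser state), and splitting on any whitespace is an assumption.

```python
from html import escape


def process_words(data: str, remaining: int) -> tuple[list[str], str]:
    """Hypothetical stand-in for process(); `remaining` is passed explicitly."""
    # Split the raw node text into words on runs of whitespace (assumed rule).
    words = data.split()
    # Keep only as many words as the remaining budget allows, never a negative count.
    kept = words[: max(remaining, 0)]
    # Escape the truncated text so it is safe to emit as HTML.
    output = escape(" ".join(kept))
    # Return every split word plus the escaped, truncated string.
    return words, output


words, output = process_words("Fish & chips are <b>great</b> today", 3)
print(words)   # ['Fish', '&', 'chips', 'are', '<b>great</b>', 'today']
print(output)  # Fish &amp; chips
```

Returning the full word list alongside the truncated string lets a caller work out how many words the node contained, for example to reduce the remaining budget before processing the next node.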