TruncateCharsHTMLParser

This class provides specialized HTML parsing functionality to truncate content based on a specific character count while preserving HTML structure. It tracks the number of processed characters and handles the insertion of replacement strings when the specified length limit is reached. The parser ensures that truncation occurs accurately across text data while managing character references and escaping requirements.
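The behaviour described above can be illustrated with a minimal, standalone sketch built only on the standard library's `html.parser.HTMLParser`. It is not the documented class: `SketchTruncator`, `Done`, `VOID_TAGS`, `open_tags`, `output`, and `truncate()` are hypothetical names chosen for illustration. It also assumes that the replacement string is not counted against `length`, which may differ from the real implementation.

```python
from collections import deque
from html import escape
from html.parser import HTMLParser

# Hypothetical, abbreviated list of tags that never take a closing tag.
VOID_TAGS = {"br", "hr", "img", "input", "meta", "link"}


class SketchTruncator(HTMLParser):
    """Illustrative stand-in, not the documented TruncateCharsHTMLParser."""

    class Done(Exception):
        """Raised internally once the character budget is spent."""

    def __init__(self, length, replacement, convert_charrefs=True):
        super().__init__(convert_charrefs=convert_charrefs)
        self.length = length            # maximum number of text characters
        self.replacement = replacement  # appended where the text is cut
        self.processed_chars = 0        # text characters seen so far
        self.open_tags = deque()        # open tags, innermost first
        self.output = ""

    def handle_starttag(self, tag, attrs):
        # Copy the tag verbatim; tags do not count against the limit.
        self.output += self.get_starttag_text()
        if tag not in VOID_TAGS:
            self.open_tags.appendleft(tag)

    def handle_endtag(self, tag):
        if tag in self.open_tags:
            self.open_tags.remove(tag)
            self.output += f"</{tag}>"

    def handle_data(self, data):
        remaining = self.length - self.processed_chars
        self.processed_chars += len(data)
        if len(data) > remaining:
            # Cut the text, re-escape it, and stop parsing.
            self.output += escape(data[:remaining]) + self.replacement
            raise self.Done
        self.output += escape(data)

    def truncate(self, html):
        try:
            self.feed(html)
            self.close()
        except self.Done:
            # Close whatever is still open so the markup stays balanced.
            self.output += "".join(f"</{t}>" for t in self.open_tags)
        return self.output


result = SketchTruncator(9, "...").truncate("<p>Hello <b>brave new</b> world</p>")
print(result)  # <p>Hello <b>bra...</b></p>
```

Only text counts toward the limit here: the tags are copied through unchanged, and any element still open at the cut point is closed afterwards.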

Attributes

Attribute	Type	Description
length	`int`	The maximum number of characters allowed in the truncated output.
processed_chars	`int` = 0	A counter tracking the total number of characters encountered during the parsing process to determine when truncation limits are reached.

Constructor

Signature

def TruncateCharsHTMLParser(
    length: int,
    replacement: str,
    convert_charrefs: bool = True
) -> None

Parameters

Name	Type	Description
length	`int`	The maximum number of characters allowed before truncation.
replacement	`str`	The string to append when truncation occurs.
convert_charrefs	`bool` = True	Whether to convert character references during parsing.

Methods

`process()`

def process(
    data: str
) -> tuple

Processes a chunk of text data, adds its length to `processed_chars`, and raises a `TruncationCompleted` exception if the limit is reached.

Parameters

Name	Type	Description
data	`str`	The raw text segment to be processed and counted against the truncation limit.

Returns

Type	Description
`tuple`	A tuple containing the original data chunk and the escaped, potentially truncated output string.
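The shape of the return value can be sketched as follows. This is an assumption-laden illustration rather than the class's actual code: `ChunkCounter` is a hypothetical name, the `TruncationCompleted` class below is a local stand-in for the exception named above, and the exact condition under which the real method raises is not specified here, so the sketch simply raises once the budget was already spent before the chunk arrived.

```python
from html import escape


class TruncationCompleted(Exception):
    """Local stand-in for the exception named in the description."""


class ChunkCounter:
    """Hypothetical helper that mimics the documented counting behaviour."""

    def __init__(self, length):
        self.length = length        # maximum number of characters
        self.processed_chars = 0    # characters seen across all chunks

    def process(self, data):
        remaining = self.length - self.processed_chars
        self.processed_chars += len(data)
        if remaining <= 0:
            # Assumption: nothing is left to emit, so signal completion.
            raise TruncationCompleted
        # Return the untouched chunk plus its escaped, cut-down form.
        return data, escape(data[:remaining])


counter = ChunkCounter(5)
data, output = counter.process("a < b & c")
print(data)    # a < b & c
print(output)  # a &lt; b
```

Returning the original chunk alongside the escaped output lets the caller compare the chunk's true length with the remaining budget, while the escaped string is what would be written to the result.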
