class documentation

class SafeHTMLParser(HTMLParser):

Constructor: SafeHTMLParser()

An HTMLParser that only allows a very limited subset of HTML.

Method             __init__            Initialise the parser.
Method             handle_charref      Append any character references.
Method             handle_data         Append any data inside tags after escaping it.
Method             handle_endtag       Append the end tag only if it is in the allowed list.
Method             handle_entityref    Append any named character.
Method             handle_startendtag  Append a self-closing tag if it is a new line.
Method             handle_starttag     Append a start tag found in the HTML if it is in the allowed list.
Instance Variable  output              Undocumented

def __init__(self):

Initialise the parser.

def handle_charref(self, name: str):

Append any character references.

def handle_data(self, data: str):

Append any data inside tags after escaping it.
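
As a minimal sketch of this escaping step, the stand-in class below (EscapingParser and escape_text are hypothetical names, not part of this module) collects text into an output string and passes it through the standard library's html.escape. Passing convert_charrefs=False is an assumption, made because the class also defines separate handlers for character references.

```python
from html import escape
from html.parser import HTMLParser


class EscapingParser(HTMLParser):
    """Hypothetical stand-in that shows only the handle_data step."""

    def __init__(self):
        # Assumption: references are handled by their own callbacks,
        # so automatic conversion is switched off.
        super().__init__(convert_charrefs=False)
        self.output = ""

    def handle_data(self, data: str):
        # Escape &, <, > and quotes so text can never introduce markup.
        self.output += escape(data)


def escape_text(text: str) -> str:
    """Feed text through the sketch parser and return what it collected."""
    parser = EscapingParser()
    parser.feed(text)
    parser.close()
    return parser.output


print(escape_text('x > y "ok"'))
```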

def handle_endtag(self, tag: str):

Append the end tag only if it is in the allowed list.

def handle_entityref(self, name: str):

Append any named character.
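
One plausible reading of the two reference handlers is that they write each reference back out unchanged instead of decoding it. The sketch below illustrates that reading only; ReferenceParser and keep_references are hypothetical names, and the real implementation may differ. The base class calls these handlers only when convert_charrefs is False.

```python
from html.parser import HTMLParser


class ReferenceParser(HTMLParser):
    """Hypothetical stand-in that shows only the reference handlers."""

    def __init__(self):
        # With convert_charrefs=True the base class would decode references
        # itself and never call the two handlers below.
        super().__init__(convert_charrefs=False)
        self.output = ""

    def handle_data(self, data: str):
        # Pass-through for this sketch only; no escaping is done here.
        self.output += data

    def handle_charref(self, name: str):
        # name is "62" for "&#62;" and "x3C" for "&#x3C;".
        self.output += f"&#{name};"

    def handle_entityref(self, name: str):
        # name is "amp" for "&amp;".
        self.output += f"&{name};"


def keep_references(text: str) -> str:
    """Feed text through the sketch parser and return what it collected."""
    parser = ReferenceParser()
    parser.feed(text)
    parser.close()
    return parser.output


print(keep_references("a &amp; b &#62; c &#x3C; d"))
```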

def handle_startendtag(self, tag: str, _attrs: list[tuple[str, str | None]]):

Append a self-closing tag if it is a new line.

def handle_starttag(self, tag: str, attrs: list[tuple[str, str | None]]):

Append a start tag found in the HTML if it is in the allowed list.

This is called by the base class when a start tag is found. We only append the tag if it is in our allowed list. We rewrite the tag so that no extra attributes can be added.

No tags can have attributes except for a tags, which only allow href. We then append target="_blank" rel="noopener noreferrer" to ensure the link opens in a new window.

Parameters
tag: str                                  the tag name
attrs: list[tuple[str, str | None]]       the attributes for the tag
output: str

Undocumented