Decoder for HTML documents.
Constructor:
Initialize and reset this instance.
Methods:
- add_element(element)
- add_text(text)
- check_for_whole_start_tag(i)
- clear_cdata_mode()
- close(): Handle any buffered data.
- decode(text[, location])
- decode_structured(text, location)
- error(message)
- feed(data): Feed data to the parser.
- get_image(filename)
- get_starttag_text(): Return full source of start tag: ‘<...>’.
- getpos(): Return current line number and offset.
- goahead(end)
- handle_charref(name)
- handle_comment(data)
- handle_data(data)
- handle_decl(decl)
- handle_endtag(tag)
- handle_entityref(name)
- handle_pi(data)
- handle_startendtag(tag, attrs)
- handle_starttag(tag, case_attrs)
- parse_bogus_comment(i[, report])
- parse_comment(i[, report])
- parse_declaration(i)
- parse_endtag(i)
- parse_html_declaration(i)
- parse_marked_section(i[, report])
- parse_pi(i)
- parse_starttag(i)
- pop_style(key)
- prepare_for_data()
- push_style(key, styles)
- reset(): Reset this instance.
- set_cdata_mode(elem)
- unescape(s)
- unknown_decl(data)
- updatepos(i, j)
Attributes:
- CDATA_CONTENT_ELEMENTS: Type: tuple
- default_style: Type: dict
- entitydefs
- font_sizes: Type: dict
- HTMLDecoder.default_style
Default style attributes for unstyled text in the HTML document.
Type: dict
- HTMLDecoder.font_sizes
Map HTML font sizes to actual font sizes, in points.
Type: dict
Methods
- HTMLDecoder.add_element(element)
- HTMLDecoder.add_text(text)
- HTMLDecoder.check_for_whole_start_tag(i)
- HTMLDecoder.clear_cdata_mode()
- HTMLDecoder.close()
Handle any buffered data.
- HTMLDecoder.decode(text, location=None)
- HTMLDecoder.error(message)
- HTMLDecoder.feed(data)
Feed data to the parser.
Call this as often as you want, with as little or as much text as you want (may include ‘\n’).
- HTMLDecoder.get_starttag_text()
Return full source of start tag: ‘<...>’.
- HTMLDecoder.getpos()
Return current line number and offset.
- HTMLDecoder.goahead(end)
- HTMLDecoder.handle_comment(data)
- HTMLDecoder.handle_decl(decl)
- HTMLDecoder.handle_pi(data)
- HTMLDecoder.handle_startendtag(tag, attrs)
- HTMLDecoder.parse_bogus_comment(i, report=1)
- HTMLDecoder.parse_comment(i, report=1)
- HTMLDecoder.parse_declaration(i)
- HTMLDecoder.parse_endtag(i)
- HTMLDecoder.parse_html_declaration(i)
- HTMLDecoder.parse_marked_section(i, report=1)
- HTMLDecoder.parse_pi(i)
- HTMLDecoder.parse_starttag(i)
- HTMLDecoder.pop_style(key)
- HTMLDecoder.push_style(key, styles)
- HTMLDecoder.reset()
Reset this instance. Loses all unprocessed data.
- HTMLDecoder.set_cdata_mode(elem)
- HTMLDecoder.unescape(s)
- HTMLDecoder.unknown_decl(data)
- HTMLDecoder.updatepos(i, j)
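The feed(), close() and getpos() methods listed above share their names and descriptions with Python's standard html.parser.HTMLParser. The sketch below uses that standard-library class directly, not HTMLDecoder, to illustrate feeding a document in pieces. The TagLogger class and its events list are hypothetical names introduced for the example, and whether HTMLDecoder behaves identically is an assumption this page does not confirm.

```python
from html.parser import HTMLParser


class TagLogger(HTMLParser):
    """Hypothetical subclass that records parser callbacks in order."""

    def __init__(self):
        super().__init__()
        self.events = []

    def handle_starttag(self, tag, attrs):
        self.events.append(("start", tag))

    def handle_data(self, data):
        # Text may arrive in several chunks when input is fed piecewise.
        self.events.append(("data", data))

    def handle_endtag(self, tag):
        self.events.append(("end", tag))


logger = TagLogger()
# feed() can be called repeatedly; the input may contain '\n'.
logger.feed("<p>Hello,\n")
logger.feed("world</p>")
# close() handles any data still buffered.
logger.close()

line, offset = logger.getpos()  # current line number and offset
text = "".join(value for kind, value in logger.events if kind == "data")
print(logger.events)
print(line, offset)
```

Joining the data events afterwards, rather than relying on a single handle_data call, keeps the example independent of how the input was split across feed() calls.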
Attributes
- HTMLDecoder.CDATA_CONTENT_ELEMENTS = ('script', 'style')
- HTMLDecoder.entitydefs = None
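The CDATA_CONTENT_ELEMENTS value above matches the one defined on Python's standard html.parser.HTMLParser, where the content of these elements is passed through as raw text instead of being parsed for markup. The following sketch shows that behavior with the standard-library class only. The Collector class and the sample input are hypothetical, and it is an assumption that HTMLDecoder treats these elements the same way.

```python
from html.parser import HTMLParser


class Collector(HTMLParser):
    """Hypothetical subclass that collects start tags and text."""

    def __init__(self):
        super().__init__()
        self.start_tags = []
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        self.start_tags.append(tag)

    def handle_data(self, data):
        self.chunks.append(data)


collector = Collector()
# The "<b>" inside the script body is not reported as a start tag,
# because 'script' is listed in CDATA_CONTENT_ELEMENTS.
collector.feed('<script>var s = "<b>x";</script><p>hi</p>')
collector.close()

print(HTMLParser.CDATA_CONTENT_ELEMENTS)
print(collector.start_tags)
print("".join(collector.chunks))
```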