TagSoup is a SAX2 parser written in Java that, instead of parsing well-formed or valid XML, parses HTML as it is found in the wild: nasty and brutish, though quite often far from short. By providing a SAX interface, it allows standard XML tools to be applied to even the worst HTML. This is a parser, not a whole application; it is not intended to permanently clean up bad HTML, as HTML Tidy does, only to parse it on the fly.

The following command-line options are available:
--files: Output goes into individual files, with html extensions changed to xhtml. Otherwise, all output is sent to standard output.
--html: Output is in clean HTML: the XML declaration is suppressed, as are end-tags for the known empty elements.
--omit-xml-declaration: The XML declaration is suppressed.
--method=html: End-tags for the known empty HTML elements are suppressed.
--pyx: Output is in PYX format.
--pyxin: Input is in PYXoid format (it need not be well-formed).
--nons: Namespaces are suppressed. Normally, all elements are in the XHTML 1.x namespace, and all attributes are in no namespace.
--nobogons: Bogons (unknown elements) are suppressed. Normally, they are treated as empty.
--nodefaults: Default attribute values are suppressed.
--nocolons: Explicit colons in element and attribute names are changed to underscores.
--norestart: Normally restartable elements are not restarted.
--any: Bogons are given a content model of ANY rather than EMPTY.
--lexical: HTML comments are passed through. Has no effect when the output is in PYX format.
--reuse: A single TagSoup parser instance is reused throughout. Normally, a new one is instantiated for each input file.
--nocdata: The content models of the script and style elements are changed so that they are treated as ordinary #PCDATA (text-only) elements, as in XHTML, rather than with the special CDATA content model.
--encoding=encoding: Specify the input encoding. The default is the Java platform default.
--help: Print help.
--version: Print the version number.

Requirements:
· Java 1.4.2 or later

What's New in This Version:
· The chief problem was with HTML comments, which were in very bad shape: a > character would terminate a comment, and comments in CDATA elements did not work properly.
· Everything should now be correct.
· Everyone should upgrade if possible.
· In addition, &#Xnnnn (with a capital X) now works, a bit of debugging code was removed from PYXWriter, a Unicode BOM at the beginning of a document is now ignored, and the new version of Saxon is taken into account as an XSLT processor.
· Documentation was added on the SAX features and properties specific to TagSoup.
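
Because TagSoup exposes a SAX interface, code written against the standard org.xml.sax API should be able to consume its events like those of any other XMLReader. The following is a minimal sketch of that idea, not an excerpt from TagSoup itself. The names LinkCollector, collect and jdkReader are hypothetical, the sample markup is made up, and the JDK's built-in parser is used as a stand-in so the example runs without TagSoup on the classpath. With the TagSoup jar available, the reader would instead be an instance of org.ccil.cowan.tagsoup.Parser, which is what lets the same handler cope with malformed HTML.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper (not part of TagSoup): collects href values of <a> elements via SAX.
class LinkCollector extends DefaultHandler {
    private final List<String> links = new ArrayList<>();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        // With namespace processing on, localName holds the element name;
        // fall back to qName when the reader leaves localName empty.
        String name = localName.isEmpty() ? qName : localName;
        if ("a".equalsIgnoreCase(name)) {
            String href = atts.getValue("href");
            if (href != null) {
                links.add(href);
            }
        }
    }

    List<String> getLinks() {
        return links;
    }

    // Runs any SAX2 XMLReader over the markup and returns the collected hrefs.
    static List<String> collect(XMLReader reader, String markup) throws Exception {
        LinkCollector handler = new LinkCollector();
        reader.setContentHandler(handler);
        reader.parse(new InputSource(new StringReader(markup)));
        return handler.getLinks();
    }

    // Stand-in reader from the JDK; it only accepts well-formed input.
    static XMLReader jdkReader() throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        return factory.newSAXParser().getXMLReader();
    }

    public static void main(String[] args) throws Exception {
        // Assumption: with TagSoup on the classpath, replace jdkReader() with
        //   new org.ccil.cowan.tagsoup.Parser()
        // to parse real-world HTML that is not well-formed.
        String markup = "<html><body><a href=\"one.html\">1</a>"
                + "<p><a href=\"two.html\">2</a></p></body></html>";
        System.out.println(collect(jdkReader(), markup)); // prints [one.html, two.html]
    }
}
```

The handler never refers to a concrete parser class, so swapping the reader is the only change needed to move from well-formed XML to tag soup.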