Class Xhtml5BaseParser
- java.lang.Object
  - org.apache.maven.doxia.parser.AbstractParser
    - org.apache.maven.doxia.parser.AbstractXmlParser
      - org.apache.maven.doxia.parser.Xhtml5BaseParser
- All Implemented Interfaces:
LogEnabled, HtmlMarkup, Markup, XmlMarkup, Parser
- Direct Known Subclasses:
FmlContentParser, Xhtml5Parser
public class Xhtml5BaseParser extends AbstractXmlParser implements HtmlMarkup
Common base parser for xhtml5 events.
-
Nested Class Summary
-
Nested classes/interfaces inherited from class org.apache.maven.doxia.parser.AbstractXmlParser
AbstractXmlParser.CachedFileEntityResolver
-
Field Summary
Fields
Modifier and Type | Field | Description
private java.util.Stack<java.lang.String> | divStack | Used to keep track of closing tags for content events
(package private) boolean | hasDefinitionListItem | Used to wrap the definedTerm with its definition, even when one is omitted
private int | headingLevel | Counts heading level.
private boolean | inVerbatim | Verbatim flag, true whenever we are inside a <pre> tag.
private boolean | isAnchor | Used to distinguish <a href=""> from <a name="">.
private boolean | isLink | Used to distinguish <a href=""> from <a name="">.
private int | orderedListDepth | Used for nested lists.
private boolean | scriptBlock | True if a <script></script> or <style></style> block is read.
private int | sectionLevel | Counts section level.
private java.util.Map<java.lang.String,java.util.Set<java.lang.String>> | warnMessages | Map of warn messages with a String as key to describe the error type and a Set as value.
-
Fields inherited from interface org.apache.maven.doxia.markup.HtmlMarkup
A, ABBR, ACRONYM, ADDRESS, APPLET, AREA, ARTICLE, ASIDE, AUDIO, B, BASE, BASEFONT, BDI, BDO, BIG, BLOCKQUOTE, BODY, BR, BUTTON, CANVAS, CAPTION, CDATA_TYPE, CENTER, CITE, CODE, COL, COLGROUP, COMMAND, DATA, DATALIST, DD, DEL, DETAILS, DFN, DIALOG, DIR, DIV, DL, DT, EM, EMBED, ENTITY_TYPE, FIELDSET, FIGCAPTION, FIGURE, FONT, FOOTER, FORM, FRAME, FRAMESET, H1, H2, H3, H4, H5, H6, HEAD, HEADER, HGROUP, HR, HTML, I, IFRAME, IMG, INPUT, INS, ISINDEX, KBD, LABEL, LEGEND, LI, LINK, MAIN, MAP, MARK, MENU, META, METER, NAV, NOFRAMES, NOSCRIPT, OBJECT, OL, OPTGROUP, OPTION, OUTPUT, P, PARAM, PICTURE, PRE, PROGRESS, Q, RB, RP, RT, RTC, RUBY, S, SAMP, SCRIPT, SECTION, SELECT, SMALL, SOURCE, SPAN, STRIKE, STRONG, STYLE, SUB, SUMMARY, SUP, TABLE, TAG_TYPE_END, TAG_TYPE_SIMPLE, TAG_TYPE_START, TBODY, TD, TEMPLATE, TEXTAREA, TFOOT, TH, THEAD, TIME, TITLE, TR, TRACK, TT, U, UL, VAR, VIDEO, WBR
-
Fields inherited from interface org.apache.maven.doxia.markup.Markup
COLON, EOL, EQUAL, GREATER_THAN, LEFT_CURLY_BRACKET, LEFT_SQUARE_BRACKET, LESS_THAN, MINUS, PLUS, QUOTE, RIGHT_CURLY_BRACKET, RIGHT_SQUARE_BRACKET, SEMICOLON, SLASH, SPACE, STAR
-
Fields inherited from interface org.apache.maven.doxia.parser.Parser
ROLE, TXT_TYPE, UNKNOWN_TYPE, XML_TYPE
-
Fields inherited from interface org.apache.maven.doxia.markup.XmlMarkup
BANG, CDATA, DOCTYPE_START, ENTITY_START, XML_NAMESPACE
-
Constructor Summary
Constructors
Constructor | Description
Xhtml5BaseParser() |
-
Method Summary
Modifier and Type | Method | Description
protected boolean | baseEndTag(org.codehaus.plexus.util.xml.pull.XmlPullParser parser, Sink sink) | Goes through a common list of possible html end tags.
protected boolean | baseStartTag(org.codehaus.plexus.util.xml.pull.XmlPullParser parser, Sink sink) | Goes through a common list of possible html5 start tags.
private void | closeOpenSections(int newLevel, Sink sink) | Close open sections.
protected void | consecutiveSections(int newLevel, Sink sink, SinkEventAttributeSet attribs) | Make sure sections are nested consecutively.
protected int | getSectionLevel() | Return the current section level.
private void | handleAEnd(Sink sink) |
private void | handleAStart(org.codehaus.plexus.util.xml.pull.XmlPullParser parser, Sink sink, SinkEventAttributeSet attribs) |
protected void | handleCdsect(org.codehaus.plexus.util.xml.pull.XmlPullParser parser, Sink sink) | Handles CDATA sections.
protected void | handleComment(org.codehaus.plexus.util.xml.pull.XmlPullParser parser, Sink sink) | Handles comments.
private boolean | handleDivEnd(Sink sink) |
private boolean | handleDivStart(org.codehaus.plexus.util.xml.pull.XmlPullParser parser, SinkEventAttributeSet attribs, Sink sink) |
protected void | handleEndTag(org.codehaus.plexus.util.xml.pull.XmlPullParser parser, Sink sink) | Goes through the possible end tags.
private void | handleHeadingStart(Sink sink, int level, SinkEventAttributeSet attribs) |
private void | handleImgStart(org.codehaus.plexus.util.xml.pull.XmlPullParser parser, Sink sink, SinkEventAttributeSet attribs) |
private void | handleLIStart(Sink sink, SinkEventAttributeSet attribs) |
private void | handleListItemEnd(Sink sink) |
private void | handleOLStart(org.codehaus.plexus.util.xml.pull.XmlPullParser parser, Sink sink, SinkEventAttributeSet attribs) |
private void | handlePreStart(SinkEventAttributeSet attribs, Sink sink) |
private void | handlePStart(Sink sink, SinkEventAttributeSet attribs) |
private void | handleSectionEnd(Sink sink) |
private void | handleSectionStart(Sink sink, SinkEventAttributeSet attribs) |
protected void | handleStartTag(org.codehaus.plexus.util.xml.pull.XmlPullParser parser, Sink sink) | Goes through the possible start tags.
private void | handleTableStart(Sink sink, SinkEventAttributeSet attribs, org.codehaus.plexus.util.xml.pull.XmlPullParser parser) |
protected void | handleText(org.codehaus.plexus.util.xml.pull.XmlPullParser parser, Sink sink) | Handles text events.
protected void | init() | Initialize the parser.
protected void | initXmlParser(org.codehaus.plexus.util.xml.pull.XmlPullParser parser) | Initializes the parser with custom entities or other options.
protected boolean | isScriptBlock() | Checks if we are currently inside a <script> tag.
protected boolean | isVerbatim() | Checks if we are currently inside a <pre> tag.
private void | logMessage(java.lang.String key, java.lang.String msg) | If debug mode is enabled, log the msg as is, otherwise add unique msg in warnMessages.
private void | logWarnings() |
private void | openMissingSections(int newLevel, Sink sink) | Open missing sections.
void | parse(java.io.Reader source, Sink sink) | Parses the given source model and emits Doxia events into the given sink.
protected void | setSectionLevel(int newLevel) | Set the current section level.
protected java.lang.String | validAnchor(java.lang.String id) | Checks if the given id is a valid Doxia id and if not, returns a transformed one.
protected void | verbatim() | Start verbatim mode.
protected void | verbatim_() | Stop verbatim mode.
-
Methods inherited from class org.apache.maven.doxia.parser.AbstractXmlParser
getAttributesFromParser, getLocalEntities, getText, getType, handleEntity, handleUnknown, isCollapsibleWhitespace, isIgnorableWhitespace, isTrimmableWhitespace, isValidate, parse, setCollapsibleWhitespace, setIgnorableWhitespace, setTrimmableWhitespace, setValidate
-
Methods inherited from class org.apache.maven.doxia.parser.AbstractParser
doxiaVersion, enableLogging, executeMacro, getBasedir, getLog, getMacroManager, isEmitComments, isSecondParsing, parse, setEmitComments, setSecondParsing
-
Field Detail
-
scriptBlock
private boolean scriptBlock
True if a <script></script> or <style></style> block is read. CDATA sections within are handled as rawText.
-
isLink
private boolean isLink
Used to distinguish <a href=""> from <a name="">.
-
isAnchor
private boolean isAnchor
Used to distinguish <a href=""> from <a name="">.
-
orderedListDepth
private int orderedListDepth
Used for nested lists.
-
sectionLevel
private int sectionLevel
Counts section level.
-
headingLevel
private int headingLevel
Counts heading level.
-
inVerbatim
private boolean inVerbatim
Verbatim flag, true whenever we are inside a <pre> tag.
-
divStack
private java.util.Stack<java.lang.String> divStack
Used to keep track of closing tags for content events
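A stack like this is a common way to make sure each closing tag emits the event that matches its opening tag, even when elements are nested. The sketch below is a simplified, hypothetical model and not Doxia's code: the class name `DivStackSketch`, the string events, and the rule that `class="content"` maps to a "content" event are all assumptions made for illustration. It uses `ArrayDeque` rather than `java.util.Stack` purely as a matter of style.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Hypothetical model of tracking closing events with a stack; not Doxia's code. */
final class DivStackSketch {
    private final Deque<String> divStack = new ArrayDeque<>();
    private final List<String> events = new ArrayList<>();

    /** Opens a div; the class attribute decides which kind of event is recorded. */
    void divStart(String cssClass) {
        // Assumption: class="content" maps to "content", anything else to "division".
        String kind = "content".equals(cssClass) ? "content" : "division";
        divStack.push(kind);
        events.add(kind + " start");
    }

    /** Closes the innermost open div with the event that matches its start. */
    void divEnd() {
        if (divStack.isEmpty()) {
            return; // unbalanced end tag: nothing to close
        }
        events.add(divStack.pop() + " end");
    }

    List<String> events() {
        return events;
    }
}
```

Because the kind is remembered at start time, the end handler never has to re-inspect attributes, which are not available on an end tag.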
-
hasDefinitionListItem
boolean hasDefinitionListItem
Used to wrap the definedTerm with its definition, even when one is omitted
-
warnMessages
private java.util.Map<java.lang.String,java.util.Set<java.lang.String>> warnMessages
Map of warn messages with a String as key to describe the error type and a Set as value. Used to reduce the number of warn messages.
-
Method Detail
-
parse
public void parse(java.io.Reader source, Sink sink) throws ParseException
Parses the given source model and emits Doxia events into the given sink.
- Specified by:
parse in interface Parser
- Overrides:
parse in class AbstractXmlParser
- Parameters:
source - not null reader that provides the source document. You could use the newReader methods from ReaderFactory.
sink - A sink that consumes the Doxia events.
- Throws:
ParseException - if the model could not be parsed.
-
initXmlParser
protected void initXmlParser(org.codehaus.plexus.util.xml.pull.XmlPullParser parser) throws org.codehaus.plexus.util.xml.pull.XmlPullParserException
Initializes the parser with custom entities or other options. Adds all XHTML (HTML 5.2) entities to the parser so that they can be recognized and resolved without an additional DTD.
- Overrides:
initXmlParser in class AbstractXmlParser
- Parameters:
parser - A parser, not null.
- Throws:
org.codehaus.plexus.util.xml.pull.XmlPullParserException - if there's a problem initializing the parser
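To illustrate why a predefined entity table removes the need for a DTD, here is a minimal, self-contained sketch. It is not Doxia's implementation: the class `EntityTableSketch`, its three-entry table, and the regex-based `resolve` helper are hypothetical and only show the lookup idea.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical model of resolving named entities from a built-in table. */
final class EntityTableSketch {
    // A tiny illustrative subset; a real table would cover every named entity.
    private static final Map<String, String> ENTITIES = Map.of(
            "nbsp", "\u00A0",
            "copy", "\u00A9",
            "euro", "\u20AC");

    private static final Pattern REFERENCE = Pattern.compile("&([A-Za-z][A-Za-z0-9]*);");

    /** Replaces known entity references; unknown ones are left untouched. */
    static String resolve(String text) {
        Matcher matcher = REFERENCE.matcher(text);
        StringBuilder out = new StringBuilder();
        while (matcher.find()) {
            String replacement = ENTITIES.get(matcher.group(1));
            String result = replacement != null ? replacement : matcher.group();
            matcher.appendReplacement(out, Matcher.quoteReplacement(result));
        }
        matcher.appendTail(out);
        return out.toString();
    }
}
```

With the table registered up front, a reference such as `&copy;` resolves even though the document declares no DTD.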
-
baseStartTag
protected boolean baseStartTag(org.codehaus.plexus.util.xml.pull.XmlPullParser parser, Sink sink)
Goes through a common list of possible html5 start tags. These include only tags that can go into the body of an xhtml5 document and so should be re-usable by different xhtml-based parsers.
The currently handled tags are:
<article>, <nav>, <aside>, <section>, <h2>, <h3>, <h4>, <h5>, <h6>, <header>, <main>, <footer>, <em>, <strong>, <small>, <s>, <cite>, <q>, <dfn>, <abbr>, <i>, <b>, <code>, <samp>, <kbd>, <sub>, <sup>, <u>, <mark>, <ruby>, <rb>, <rt>, <rtc>, <rp>, <bdi>, <bdo>, <span>, <ins>, <del>, <p>, <pre>, <ul>, <ol>, <li>, <dl>, <dt>, <dd>, <a>, <table>, <tr>, <th>, <td>, <caption>, <br/>, <wbr/>, <hr/>, <img/>.
- Parameters:
parser - A parser.
sink - the sink to receive the events.
- Returns:
- True if the event has been handled by this method, i.e. the tag was recognized, false otherwise.
-
baseEndTag
protected boolean baseEndTag(org.codehaus.plexus.util.xml.pull.XmlPullParser parser, Sink sink)
Goes through a common list of possible html end tags. These should be re-usable by different xhtml-based parsers. The tags handled here are the same as for baseStartTag(XmlPullParser,Sink), except for the empty elements (<br/>, <hr/>, <img/>).
- Parameters:
parser - A parser.
sink - the sink to receive the events.
- Returns:
- True if the event has been handled by this method, false otherwise.
-
handleStartTag
protected void handleStartTag(org.codehaus.plexus.util.xml.pull.XmlPullParser parser, Sink sink) throws org.codehaus.plexus.util.xml.pull.XmlPullParserException, MacroExecutionException
Goes through the possible start tags. Just calls baseStartTag(XmlPullParser,Sink); implementing parsers should override this to include additional tags.
- Specified by:
handleStartTag in class AbstractXmlParser
- Parameters:
parser - A parser, not null.
sink - the sink to receive the events.
- Throws:
org.codehaus.plexus.util.xml.pull.XmlPullParserException - if there's a problem parsing the model
MacroExecutionException - if there's a problem executing a macro
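The override pattern described here (handle format-specific tags first, then fall back to the common list and check its boolean result) can be sketched with plain Java stand-ins. Everything below is hypothetical: `BaseTagsSketch`, `FaqTagsSketch`, the `faq` tag, and the string events replace the real parser, sink, and tag constants so that the example stays self-contained.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a base parser; events are recorded as strings. */
class BaseTagsSketch {
    final List<String> events = new ArrayList<>();

    /** Returns true if the tag is one of the common tags handled here. */
    protected boolean baseStartTag(String tag) {
        if (tag.equals("p") || tag.equals("em")) {
            events.add("base:" + tag);
            return true;
        }
        return false;
    }

    /** Default behaviour: delegate to the common tags only. */
    protected void handleStartTag(String tag) {
        baseStartTag(tag);
    }
}

/** Hypothetical subclass that adds one format-specific tag. */
class FaqTagsSketch extends BaseTagsSketch {
    @Override
    protected void handleStartTag(String tag) {
        if (tag.equals("faq")) {
            events.add("custom:faq");        // tag known only to this parser
        } else if (!baseStartTag(tag)) {
            events.add("unknown:" + tag);    // neither custom nor common
        }
    }
}
```

The boolean return value is what makes the fallback possible: the subclass can tell whether the base list recognized the tag and decide what to do when it did not.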
-
handleEndTag
protected void handleEndTag(org.codehaus.plexus.util.xml.pull.XmlPullParser parser, Sink sink) throws org.codehaus.plexus.util.xml.pull.XmlPullParserException, MacroExecutionException
Goes through the possible end tags. Just calls baseEndTag(XmlPullParser,Sink); implementing parsers should override this to include additional tags.
- Specified by:
handleEndTag in class AbstractXmlParser
- Parameters:
parser - A parser, not null.
sink - the sink to receive the events.
- Throws:
org.codehaus.plexus.util.xml.pull.XmlPullParserException - if there's a problem parsing the model
MacroExecutionException - if there's a problem executing a macro
-
handleText
protected void handleText(org.codehaus.plexus.util.xml.pull.XmlPullParser parser, Sink sink) throws org.codehaus.plexus.util.xml.pull.XmlPullParserException
Handles text events.
This is a default implementation: if the parser points to a non-empty text element, it is emitted as a text event into the specified sink.
- Overrides:
handleText in class AbstractXmlParser
- Parameters:
parser - A parser, not null.
sink - the sink to receive the events. Not null.
- Throws:
org.codehaus.plexus.util.xml.pull.XmlPullParserException - if there's a problem parsing the model
-
handleComment
protected void handleComment(org.codehaus.plexus.util.xml.pull.XmlPullParser parser, Sink sink) throws org.codehaus.plexus.util.xml.pull.XmlPullParserException
Handles comments.
This is a default implementation: all data are emitted as comment events into the specified sink.
- Overrides:
handleComment in class AbstractXmlParser
- Parameters:
parser - A parser, not null.
sink - the sink to receive the events. Not null.
- Throws:
org.codehaus.plexus.util.xml.pull.XmlPullParserException - if there's a problem parsing the model
-
handleCdsect
protected void handleCdsect(org.codehaus.plexus.util.xml.pull.XmlPullParser parser, Sink sink) throws org.codehaus.plexus.util.xml.pull.XmlPullParserException
Handles CDATA sections.
This is a default implementation: all data are emitted as text events into the specified sink.
- Overrides:
handleCdsect in class AbstractXmlParser
- Parameters:
parser - A parser, not null.
sink - the sink to receive the events. Not null.
- Throws:
org.codehaus.plexus.util.xml.pull.XmlPullParserException - if there's a problem parsing the model
-
consecutiveSections
protected void consecutiveSections(int newLevel, Sink sink, SinkEventAttributeSet attribs)
Make sure sections are nested consecutively.
HTML5 heading tags H1 to H6 imply sections where they are not present, which means we have to open or close any sections that are missing in between.
For instance, if the following sequence is parsed:
<h3></h3> <h6></h6>
we have to insert two section starts before we open the <h6>. In the following sequence
<h6></h6> <h3></h3>
we have to close two sections before we open the <h3>.
The current level is set to newLevel afterwards.
- Parameters:
newLevel - the new section level, all upper levels have to be closed.
sink - the sink to receive the events.
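The close-then-open idea can be modelled in a few lines. This is a simplified sketch under stated assumptions, not Doxia's implementation: the class `SectionNestingSketch` is hypothetical, each heading of level n is assumed to own a section of level n, events are plain strings instead of sink calls, and the exact number of events Doxia emits may differ from this model.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical model of consecutive section nesting; not Doxia's code. */
final class SectionNestingSketch {
    private int level = 0; // 0 means no section is open
    private final List<String> events = new ArrayList<>();

    /** Records the events needed so that a heading of newLevel is properly nested. */
    void heading(int newLevel) {
        // Close every open section at newLevel or deeper, innermost first.
        while (level >= newLevel) {
            events.add("close " + level);
            level--;
        }
        // Open every missing section up to newLevel, outermost first.
        while (level < newLevel) {
            level++;
            events.add("open " + level);
        }
    }

    List<String> events() {
        return events;
    }
}
```

In this model, moving from a level 3 heading to a level 6 heading opens the implied levels 4 and 5 before the section for level 6 itself, so sections never skip a level.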
-
closeOpenSections
private void closeOpenSections(int newLevel, Sink sink)
Close open sections.
- Parameters:
newLevel - the new section level, all upper levels have to be closed.
sink - the sink to receive the events.
-
openMissingSections
private void openMissingSections(int newLevel, Sink sink)
Open missing sections.
- Parameters:
newLevel - the new section level, all lower levels have to be opened.
sink - the sink to receive the events.
-
getSectionLevel
protected int getSectionLevel()
Return the current section level.
- Returns:
- the current section level.
-
setSectionLevel
protected void setSectionLevel(int newLevel)
Set the current section level.
- Parameters:
newLevel - the new section level.
-
verbatim_
protected void verbatim_()
Stop verbatim mode.
-
verbatim
protected void verbatim()
Start verbatim mode.
-
isVerbatim
protected boolean isVerbatim()
Checks if we are currently inside a <pre> tag.
- Returns:
- true if we are currently in verbatim mode.
-
isScriptBlock
protected boolean isScriptBlock()
Checks if we are currently inside a <script> tag.
- Returns:
- true if we are currently inside <script> tags.
- Since:
- 1.1.1.
-
validAnchor
protected java.lang.String validAnchor(java.lang.String id)
Checks if the given id is a valid Doxia id and, if not, returns a transformed one.
- Parameters:
id - The id to validate.
- Returns:
- A transformed id or the original id if it was already valid.
- See Also:
DoxiaUtils.encodeId(String)
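The validate-or-transform contract can be shown with a small sketch. The validity rule used here (a letter followed by letters, digits, '-', '_', '.' or ':') and the replacement strategy are assumptions chosen only for illustration; the actual rules are defined by DoxiaUtils and may differ. The class `AnchorSketch` is hypothetical.

```java
/** Hypothetical model of "return the id if valid, otherwise a transformed one". */
final class AnchorSketch {
    /** Assumed rule: a letter followed by letters, digits, '-', '_', '.' or ':'. */
    static boolean isValid(String id) {
        return id.matches("[A-Za-z][A-Za-z0-9_.:-]*");
    }

    /** Returns the id unchanged when valid, otherwise a cleaned-up variant. */
    static String validAnchor(String id) {
        if (isValid(id)) {
            return id;
        }
        // Replace every disallowed character with an underscore.
        String cleaned = id.trim().replaceAll("[^A-Za-z0-9_.:-]", "_");
        // Make sure the result starts with a letter.
        if (cleaned.isEmpty() || !Character.isLetter(cleaned.charAt(0))) {
            cleaned = "a" + cleaned;
        }
        return cleaned;
    }
}
```

Returning the original id untouched when it is already valid keeps existing links stable; only ids that would otherwise break are rewritten.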
-
init
protected void init()
Initialize the parser. This is called first by Parser.parse(java.io.Reader, org.apache.maven.doxia.sink.Sink) and can be used to set the parser into a clear state so it can be re-used.
- Overrides:
init in class AbstractParser
-
handleAEnd
private void handleAEnd(Sink sink)
-
handleAStart
private void handleAStart(org.codehaus.plexus.util.xml.pull.XmlPullParser parser, Sink sink, SinkEventAttributeSet attribs)
-
handleDivStart
private boolean handleDivStart(org.codehaus.plexus.util.xml.pull.XmlPullParser parser, SinkEventAttributeSet attribs, Sink sink)
-
handleDivEnd
private boolean handleDivEnd(Sink sink)
-
handleImgStart
private void handleImgStart(org.codehaus.plexus.util.xml.pull.XmlPullParser parser, Sink sink, SinkEventAttributeSet attribs)
-
handleLIStart
private void handleLIStart(Sink sink, SinkEventAttributeSet attribs)
-
handleListItemEnd
private void handleListItemEnd(Sink sink)
-
handleOLStart
private void handleOLStart(org.codehaus.plexus.util.xml.pull.XmlPullParser parser, Sink sink, SinkEventAttributeSet attribs)
-
handlePStart
private void handlePStart(Sink sink, SinkEventAttributeSet attribs)
-
handlePreStart
private void handlePreStart(SinkEventAttributeSet attribs, Sink sink)
-
handleSectionStart
private void handleSectionStart(Sink sink, SinkEventAttributeSet attribs)
-
handleHeadingStart
private void handleHeadingStart(Sink sink, int level, SinkEventAttributeSet attribs)
-
handleSectionEnd
private void handleSectionEnd(Sink sink)
-
handleTableStart
private void handleTableStart(Sink sink, SinkEventAttributeSet attribs, org.codehaus.plexus.util.xml.pull.XmlPullParser parser)
-
logMessage
private void logMessage(java.lang.String key, java.lang.String msg)
If debug mode is enabled, log the msg as is, otherwise add unique msg in warnMessages.
- Parameters:
key - not null
msg - not null
- Since:
- 1.1.1
- See Also:
parse(Reader, Sink)
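The deduplication described here (a map from an error-type key to a set of unique messages, flushed later) can be sketched as follows. This is a hypothetical model, not Doxia's code: `WarningCollectorSketch`, its `output` list, and the "[DEBUG]" and "[WARN]" prefixes stand in for a real logger.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical model of collecting unique warnings per error type. */
final class WarningCollectorSketch {
    private final boolean debug;
    private final Map<String, Set<String>> warnMessages = new LinkedHashMap<>();
    private final List<String> output = new ArrayList<>(); // stands in for a logger

    WarningCollectorSketch(boolean debug) {
        this.debug = debug;
    }

    /** In debug mode log immediately; otherwise remember each message once per key. */
    void logMessage(String key, String msg) {
        if (debug) {
            output.add("[DEBUG] " + msg);
            return;
        }
        warnMessages.computeIfAbsent(key, k -> new LinkedHashSet<>()).add(msg);
    }

    /** Emits every collected message once, then resets the collection. */
    void logWarnings() {
        for (Set<String> messages : warnMessages.values()) {
            for (String msg : messages) {
                output.add("[WARN] " + msg);
            }
        }
        warnMessages.clear();
    }

    List<String> output() {
        return output;
    }
}
```

Using a Set as the map value is what collapses repeated messages: a document that triggers the same warning many times produces a single line when the warnings are flushed.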
-
logWarnings
private void logWarnings()
- Since:
- 1.1.1
-