org.htmlparser.tests.lexerTests
Class KitTest

java.lang.Object
  extended by javax.swing.text.html.HTMLEditorKit.ParserCallback
      extended by org.htmlparser.tests.lexerTests.KitTest

public class KitTest
extends javax.swing.text.html.HTMLEditorKit.ParserCallback

Compares output from javax.swing.text.html.HTMLEditorKit with output from Lexer. This test provides a means of comparing the lexemes from the javax.swing.text.html.HTMLEditorKit.Parser class with the lexemes produced by the org.htmlparser.lexer.Lexer class.

The comparison has so far resisted automation, because the HTMLEditorKit parser adds spurious nodes where it thinks elements need closing or where it gets confused. The intent is to eventually incorporate this test into the 'fit test' and run it against many HTML pages, but for now the differences must be analysed by hand.
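
As an illustration of the spurious nodes mentioned above, the following minimal sketch records the callbacks made by the JDK's Swing HTML parser and marks start tags that carry the IMPLIED attribute. The class name LexemeRecorder, the event strings, and the sample input are assumptions made for this example; they are not part of KitTest.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import javax.swing.text.MutableAttributeSet;
import javax.swing.text.html.HTML;
import javax.swing.text.html.HTMLEditorKit;
import javax.swing.text.html.parser.ParserDelegator;

// Hypothetical helper, not part of KitTest: logs every parser callback as a string.
class LexemeRecorder extends HTMLEditorKit.ParserCallback {
    final List<String> events = new ArrayList<>();

    // Tags synthesized by the parser carry the IMPLIED attribute.
    private static String impliedMark(MutableAttributeSet a) {
        return a.isDefined(IMPLIED) ? " (implied)" : "";
    }

    @Override
    public void handleStartTag(HTML.Tag t, MutableAttributeSet a, int pos) {
        events.add("START " + t + impliedMark(a));
    }

    @Override
    public void handleSimpleTag(HTML.Tag t, MutableAttributeSet a, int pos) {
        events.add("SIMPLE " + t + impliedMark(a));
    }

    @Override
    public void handleEndTag(HTML.Tag t, int pos) {
        events.add("END " + t);
    }

    @Override
    public void handleText(char[] data, int pos) {
        events.add("TEXT " + new String(data));
    }

    @Override
    public void handleComment(char[] data, int pos) {
        events.add("COMMENT " + new String(data));
    }

    // Parses the given HTML with the Swing parser and returns the recorded events.
    static List<String> record(String html) {
        LexemeRecorder recorder = new LexemeRecorder();
        try {
            new ParserDelegator().parse(new StringReader(html), recorder, true);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return recorder.events;
    }

    public static void main(String[] args) {
        // The input has no html, head or body tags; any that appear were added by the parser.
        for (String event : record("<p>Hello</p>")) {
            System.out.println(event);
        }
    }
}
```

Running the sketch on a fragment without html, head or body tags makes the parser-added tags visible in the output, which is the kind of difference that currently has to be checked by hand.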


Field Summary
 
Fields inherited from class javax.swing.text.html.HTMLEditorKit.ParserCallback
IMPLIED
 
Constructor Summary
KitTest(java.util.Vector nodes)
          Creates a new instance of KitTest.
 
Method Summary
 void flush()
          Callback for flushing the state, just prior to shutting down the parser.
 org.htmlparser.tests.lexerTests.KitTest.MyKit getKit()
          Return an editor kit.
 void handleComment(char[] data, int pos)
          Callback for a remark lexeme.
 void handleEndOfLineString(java.lang.String eol)
          This is invoked after the stream has been parsed, but before flush.
 void handleEndTag(javax.swing.text.html.HTML.Tag t, int pos)
          Callback for an end tag lexeme.
 void handleError(java.lang.String errorMsg, int pos)
          Callback for an error condition.
 void handleSimpleTag(javax.swing.text.html.HTML.Tag t, javax.swing.text.MutableAttributeSet a, int pos)
          Callback for a non-composite tag.
 void handleStartTag(javax.swing.text.html.HTML.Tag t, javax.swing.text.MutableAttributeSet a, int pos)
          Callback for a start tag lexeme.
 void handleText(char[] data, int pos)
          Callback for a text lexeme.
static void main(java.lang.String[] args)
          Mainline for the test.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

KitTest

public KitTest(java.util.Vector nodes)
Creates a new instance of KitTest.

Parameters:
nodes - The list of lexemes from Lexer to compare with the kit lexemes.
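
A rough sketch of the comparison idea follows, assuming the expected lexemes are plain strings rather than Lexer nodes. The class ComparingCallback, its matching rule, and the string forms of the lexemes are hypothetical and only illustrate how a callback can walk a list supplied at construction time; the actual matching logic of KitTest is not documented here.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import javax.swing.text.MutableAttributeSet;
import javax.swing.text.html.HTML;
import javax.swing.text.html.HTMLEditorKit;
import javax.swing.text.html.parser.ParserDelegator;

// Hypothetical stand-in for KitTest: expected lexemes are strings here,
// whereas the real constructor takes a Vector of nodes from Lexer.
class ComparingCallback extends HTMLEditorKit.ParserCallback {
    private final List<String> expected;
    private int cursor = 0;
    // Kit lexemes that did not line up with the next expected lexeme.
    final List<String> unmatched = new ArrayList<>();

    ComparingCallback(List<String> expected) {
        this.expected = expected;
    }

    // Advance through the expected list on a match, otherwise note the difference.
    private void match(String lexeme) {
        if (cursor < expected.size() && expected.get(cursor).equals(lexeme)) {
            cursor++;
        } else {
            unmatched.add(lexeme);
        }
    }

    int matched() {
        return cursor;
    }

    @Override
    public void handleStartTag(HTML.Tag t, MutableAttributeSet a, int pos) {
        // Skip start tags the kit parser synthesized itself.
        if (!a.isDefined(IMPLIED)) {
            match("<" + t + ">");
        }
    }

    @Override
    public void handleEndTag(HTML.Tag t, int pos) {
        // End tags carry no attribute set, so synthesized ones cannot be skipped here.
        match("</" + t + ">");
    }

    @Override
    public void handleText(char[] data, int pos) {
        match(new String(data));
    }

    static ComparingCallback compare(List<String> expected, String html) {
        ComparingCallback callback = new ComparingCallback(expected);
        try {
            new ParserDelegator().parse(new StringReader(html), callback, true);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return callback;
    }

    public static void main(String[] args) {
        ComparingCallback result =
            compare(Arrays.asList("<p>", "Hello", "</p>"), "<p>Hello</p>");
        System.out.println("matched: " + result.matched());
        System.out.println("unmatched kit lexemes: " + result.unmatched);
    }
}
```

Anything left in the unmatched list is a lexeme the kit produced that the expected list did not contain at that point, which would still need manual inspection.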
Method Detail

handleText

public void handleText(char[] data,
                       int pos)
Callback for a text lexeme.

Parameters:
data - The text extracted from the page.
pos - The position in the page. Note: This differs from the Lexer concept of position, which is an absolute location in the HTML input stream. This position is the character position if the text from the page were displayed in a browser.

handleComment

public void handleComment(char[] data,
                          int pos)
Callback for a remark lexeme.

Parameters:
data - The text extracted from the page.
pos - The position in the page. Note: This differs from the Lexer concept of position, which is an absolute location in the HTML input stream. This position is the character position if the text from the page were displayed in a browser.

handleStartTag

public void handleStartTag(javax.swing.text.html.HTML.Tag t,
                           javax.swing.text.MutableAttributeSet a,
                           int pos)
Callback for a start tag lexeme.

Parameters:
t - The tag extracted from the page.
a - The attributes parsed out of the tag.
pos - The position in the page. Note: This differs from the Lexer concept of position, which is an absolute location in the HTML input stream. This position is the character position if the text from the page were displayed in a browser.

handleEndTag

public void handleEndTag(javax.swing.text.html.HTML.Tag t,
                         int pos)
Callback for an end tag lexeme.

Parameters:
t - The tag extracted from the page.
pos - The position in the page. Note: This differs from the Lexer concept of position, which is an absolute location in the HTML input stream. This position is the character position if the text from the page were displayed in a browser.

handleSimpleTag

public void handleSimpleTag(javax.swing.text.html.HTML.Tag t,
                            javax.swing.text.MutableAttributeSet a,
                            int pos)
Callback for a non-composite tag.

Parameters:
t - The tag extracted from the page.
a - The attributes parsed out of the tag.
pos - The position in the page. Note: This differs from the Lexer concept of position, which is an absolute location in the HTML input stream. This position is the character position if the text from the page were displayed in a browser.

handleError

public void handleError(java.lang.String errorMsg,
                        int pos)
Callback for an error condition.

Parameters:
errorMsg - The error condition as a text message.
pos - The position in the page. Note: This differs from the Lexer concept of position, which is an absolute location in the HTML input stream. This position is the character position if the text from the page were displayed in a browser.

flush

public void flush()
           throws javax.swing.text.BadLocationException
Callback for flushing the state, just prior to shutting down the parser.

Throws:
javax.swing.text.BadLocationException

handleEndOfLineString

public void handleEndOfLineString(java.lang.String eol)
This is invoked after the stream has been parsed, but before flush. eol will be one of \n, \r or \r\n, whichever is encountered most often while parsing the stream.

Since:
1.3
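
The end-of-line callback can be observed with a small probe like the one below. This is only a sketch: the class EolProbe, its helper methods, and the sample input are assumptions for illustration and do not appear in KitTest.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import javax.swing.text.html.HTMLEditorKit;
import javax.swing.text.html.parser.ParserDelegator;

// Hypothetical probe, not part of KitTest: remembers the end-of-line string reported by the parser.
class EolProbe extends HTMLEditorKit.ParserCallback {
    String eol;

    @Override
    public void handleEndOfLineString(String eol) {
        this.eol = eol;
    }

    // Parses the HTML and returns whatever the parser reported as the dominant line ending.
    static String detect(String html) {
        EolProbe probe = new EolProbe();
        try {
            new ParserDelegator().parse(new StringReader(html), probe, true);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return probe.eol;
    }

    // Makes the control characters printable, e.g. a line feed becomes the two characters \n.
    static String visible(String eol) {
        return eol.replace("\r", "\\r").replace("\n", "\\n");
    }

    public static void main(String[] args) {
        String html = "<html><body>\r\n<p>one</p>\r\n<p>two</p>\r\n</body></html>";
        System.out.println("reported eol: " + visible(detect(html)));
    }
}
```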

getKit

public org.htmlparser.tests.lexerTests.KitTest.MyKit getKit()
Return an editor kit.


main

public static void main(java.lang.String[] args)
                 throws ParserException,
                        java.io.IOException
Mainline for the test.

Parameters:
args - the command line arguments. If present the first array element is used as a URL to parse.
Throws:
ParserException
java.io.IOException
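
The argument handling described for main can be pictured roughly as follows. This is a sketch only: the class UrlArgument and the fallback URL are hypothetical, since this page does not state which URL is used when no argument is given.

```java
// Hypothetical helper, not part of KitTest: picks the URL to parse from the command line.
class UrlArgument {
    // Assumed placeholder; the default actually used by KitTest is not documented here.
    static final String DEFAULT_URL = "http://example.com/";

    // Uses the first argument as the URL if present, otherwise the placeholder default.
    static String select(String[] args) {
        return args.length > 0 ? args[0] : DEFAULT_URL;
    }

    public static void main(String[] args) {
        System.out.println("URL to parse: " + select(args));
    }
}
```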