java.lang.Object
  org.htmlparser.tests.utilTests.CharacterTranslationTest.Generate
Create a character reference translation class source file. Usage:
java -classpath .:lib/htmlparser.jar Generate > Translate.java
Derived from HTMLStringFilter.java, provided as an example with the htmlparser.jar file available at htmlparser.sourceforge.net, written by Somik Raha (somik@industriallogic.com, http://industriallogic.com).
Field Summary
protected Parser mParser
The working parser.
protected java.lang.String nl
Constructor Summary
CharacterTranslationTest.Generate()
Create a Generate object.
Method Summary
void extract(java.lang.String string, java.io.PrintWriter out)
Parse the sgml declaration for character entity reference name, equivalent numeric character reference and a comment.
void gather(Node node, java.lang.StringBuffer buffer)
int indexOfWhitespace(java.lang.String string, int index)
Find the lowest index of whitespace (space or newline).
java.lang.String pack(java.lang.String string)
Rewrite the comment string.
java.lang.String pad(java.lang.String string, char character, int length)
Pad a string on the left with the given character to the length specified.
void parse(java.io.PrintWriter out)
Pull out text elements from the HTML.
java.lang.String pretty(java.lang.String string)
Pretty up a comment string.
void sgml(java.lang.String string, java.io.PrintWriter out)
Extract special characters.
java.lang.String translate(java.lang.String string)
Translate character references.
java.lang.String unicode(java.lang.String string)
Convert the textual representation of the numeric character reference to a character.
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Field Detail
protected Parser mParser
The working parser.
protected java.lang.String nl
Constructor Detail
public CharacterTranslationTest.Generate() throws ParserException
Create a Generate object. The working Parser is pointed at http://www.w3.org/TR/REC-html40/sgml/entities.html with the standard scanners registered.
Method Detail
public java.lang.String translate(java.lang.String string)
Translate character references.
string - The raw string.
public void gather(Node node, java.lang.StringBuffer buffer)
public int indexOfWhitespace(java.lang.String string, int index)
Find the lowest index of whitespace (space or newline).
string - The string to look in.
index - Where to start looking.
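The search described for indexOfWhitespace can be sketched as follows. This is a minimal illustration, not the class's actual source: the class name WhitespaceSketch is hypothetical, and returning -1 when no whitespace is found is an assumption borrowed from String.indexOf, since this page does not state the not-found behaviour.

```java
// Hypothetical sketch; not the source of CharacterTranslationTest.Generate.
final class WhitespaceSketch {
    /**
     * Returns the lowest index at or after start that holds a space or a
     * newline. Returning -1 when neither occurs is an assumption.
     */
    static int indexOfWhitespace(String string, int start) {
        int space = string.indexOf(' ', start);
        int newline = string.indexOf('\n', start);
        if (space == -1) {
            return newline; // only a newline (or nothing) was found
        }
        if (newline == -1) {
            return space; // only a space was found
        }
        return Math.min(space, newline); // both found: take the earlier one
    }
}
```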
public java.lang.String pack(java.lang.String string)
Rewrite the comment string. A comment in the source looks like
-- latin capital letter I with diaeresis, U+00CF ISOlat1
so we just want to make a one-liner without the spaces and newlines.
string - The raw comment.
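One way to turn a multi-line comment into a one-liner, as pack is described, is to collapse every run of whitespace into a single space. The sketch below assumes that interpretation; the class name PackSketch is hypothetical, and the real method may treat spaces differently.

```java
// Hypothetical sketch; the real pack() may differ in how it treats spaces.
final class PackSketch {
    /** Trims the comment and collapses runs of spaces and newlines to one space. */
    static String pack(String comment) {
        return comment.trim().replaceAll("\\s+", " ");
    }
}
```

For example, a comment wrapped across three lines with indentation comes back as a single line with single spaces between words.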
public java.lang.String pretty(java.lang.String string)
Pretty up a comment string.
string - The comment to operate on.
public java.lang.String pad(java.lang.String string, char character, int length)
Pad a string on the left with the given character to the length specified.
string - The string to pad.
character - The character to pad with.
length - The size to pad to.
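Left padding as described for pad can be sketched like this. The class name PadSketch is hypothetical, and leaving a string unchanged when it is already at least length characters long is an assumption, since this page does not say what happens in that case.

```java
// Hypothetical sketch of left padding; not the class's actual source.
final class PadSketch {
    /**
     * Prepends copies of character until the result is length characters
     * long. Strings already that long are returned unchanged (assumption).
     */
    static String pad(String string, char character, int length) {
        StringBuilder builder = new StringBuilder();
        for (int i = string.length(); i < length; i++) {
            builder.append(character);
        }
        return builder.append(string).toString();
    }
}
```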
public java.lang.String unicode(java.lang.String string)
Convert the textual representation of the numeric character reference to a character.
string - The numeric character reference (in quotes).
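A conversion of this kind might look like the sketch below. It assumes the input is a quoted decimal reference such as "&#160;" (hexadecimal references are not handled), and it splits the work into two hypothetical helpers, codePoint and javaEscape, because this page does not say whether unicode returns the character itself or an escaped form of it.

```java
// Hypothetical sketch; helper names and the escape format are assumptions.
final class UnicodeSketch {
    /** Parses a quoted decimal reference (quote, ampersand, hash, digits, semicolon, quote). */
    static int codePoint(String quoted) {
        String ref = quoted.trim();
        // Strip the surrounding double quotes if both are present.
        if (ref.length() >= 2 && ref.startsWith("\"") && ref.endsWith("\"")) {
            ref = ref.substring(1, ref.length() - 1);
        }
        if (!ref.startsWith("&#") || !ref.endsWith(";")) {
            throw new IllegalArgumentException("not a numeric character reference: " + quoted);
        }
        return Integer.parseInt(ref.substring(2, ref.length() - 1));
    }

    /** Formats a code point in the basic plane as a four-digit Java source escape. */
    static String javaEscape(int codePoint) {
        return String.format("\\u%04x", codePoint);
    }
}
```

Casting the result of codePoint to char yields the character itself for values in the basic multilingual plane, which covers every entity in the HTML 4 tables.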
public void extract(java.lang.String string, java.io.PrintWriter out)
Parse the sgml declaration for character entity reference name, equivalent numeric character reference and a comment.
string - The contents of the sgml declaration.
out - The sink for output.
public void sgml(java.lang.String string, java.io.PrintWriter out)
Extract special characters. Look for declarations of the form
<!ENTITY nbsp CDATA "&#160;" -- no-break space = non-breaking space, U+00A0 ISOnum -->
and emit a Java definition for each.
string - The raw string from w3.org.
out - The sink for output.
public void parse(java.io.PrintWriter out) throws ParserException
Pull out text elements from the HTML.
out - The sink for output.
Throws: ParserException
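The entity declarations that sgml and extract work on have a regular shape, so the three pieces (name, numeric reference, comment) can be pulled out with a regular expression. The sketch below is one possible approach under that assumption, not the parsing the class actually performs; the class EntitySketch, the method define, and the output format are all hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch; the real class may parse declarations differently.
final class EntitySketch {
    // Captures: 1 = entity name, 2 = decimal code, 3 = comment text.
    private static final Pattern ENTITY = Pattern.compile(
            "<!ENTITY\\s+(\\w+)\\s+CDATA\\s+\"&#(\\d+);\"\\s+--\\s*(.*?)\\s*-->",
            Pattern.DOTALL);

    /**
     * Returns a one-line definition for the first declaration found,
     * or null if the text contains none. The output format is made up.
     */
    static String define(String declaration) {
        Matcher matcher = ENTITY.matcher(declaration);
        if (!matcher.find()) {
            return null;
        }
        String name = matcher.group(1);
        int code = Integer.parseInt(matcher.group(2));
        // Collapse line breaks inside the comment so it fits on one line.
        String comment = matcher.group(3).replaceAll("\\s+", " ");
        return String.format("%s = %d; // %s", name, code, comment);
    }
}
```

Applied to the nbsp declaration, this sketch yields a line carrying the name nbsp, the code 160, and the comment text on a single line.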