org.htmlparser.nodeDecorators
Class EscapeCharacterRemovingNode

java.lang.Object
  extended by org.htmlparser.nodeDecorators.AbstractNodeDecorator
      extended by org.htmlparser.nodeDecorators.EscapeCharacterRemovingNode
All Implemented Interfaces:
java.lang.Cloneable, Node, Text

Deprecated. Use direct subclasses or dynamic proxies instead.

Use either direct subclasses of the appropriate node and set them on the PrototypicalNodeFactory, or use a dynamic proxy implementing the required node type interface.
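
The dynamic-proxy alternative mentioned above can be sketched with the JDK's java.lang.reflect.Proxy. This is a minimal illustration, not the library's own code: TextLike is a hypothetical two-method stand-in for the org.htmlparser Text interface so that the example runs without the library, SimpleText and withoutEscapes are illustrative names, and the assumption that "escape characters" means tab, line feed, and carriage return is ours.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;

public class EscapeRemovingProxyDemo {

    /** Hypothetical stand-in for org.htmlparser.Text, reduced to two methods. */
    public interface TextLike {
        String getText();
        String toPlainTextString();
    }

    /** Trivial delegate used only for this demo. */
    public static final class SimpleText implements TextLike {
        private final String text;
        public SimpleText(String text) { this.text = text; }
        @Override public String getText() { return text; }
        @Override public String toPlainTextString() { return text; }
    }

    /** Assumption: "escape characters" means tab, line feed and carriage return. */
    public static String removeEscapeCharacters(String s) {
        StringBuilder out = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c != '\t' && c != '\n' && c != '\r') {
                out.append(c);
            }
        }
        return out.toString();
    }

    /** Wraps a delegate in a proxy that post-processes toPlainTextString() only. */
    public static TextLike withoutEscapes(TextLike delegate) {
        InvocationHandler handler = (proxy, method, args) -> {
            Object result;
            try {
                // Forward every call to the wrapped node.
                result = method.invoke(delegate, args);
            } catch (InvocationTargetException e) {
                // Rethrow the delegate's own exception, not the reflective wrapper.
                throw e.getCause();
            }
            if ("toPlainTextString".equals(method.getName()) && result instanceof String) {
                return removeEscapeCharacters((String) result);
            }
            return result;
        };
        return (TextLike) Proxy.newProxyInstance(
                TextLike.class.getClassLoader(),
                new Class<?>[] { TextLike.class },
                handler);
    }

    public static void main(String[] args) {
        TextLike node = withoutEscapes(new SimpleText("Hello,\n\tworld\r\n"));
        System.out.println(node.toPlainTextString()); // prints "Hello,world"
    }
}
```

Only toPlainTextString() is altered; every other call, including getText(), passes through to the delegate unchanged, which is the same division of labour the deprecated decorator provided. With the real library, the proxy would implement the actual node interface instead of TextLike.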

public class EscapeCharacterRemovingNode
extends AbstractNodeDecorator

See Also:
AbstractNodeDecorator

Field Summary
 
Fields inherited from class org.htmlparser.nodeDecorators.AbstractNodeDecorator
delegate
 
Constructor Summary
EscapeCharacterRemovingNode(Text newDelegate)
          Deprecated.  
 
Method Summary
 java.lang.String toPlainTextString()
          Deprecated. A string representation of the node.
 
Methods inherited from class org.htmlparser.nodeDecorators.AbstractNodeDecorator
accept, clone, collectInto, doSemanticAction, equals, getChildren, getEndPosition, getFirstChild, getLastChild, getNextSibling, getPage, getParent, getPreviousSibling, getStartPosition, getText, setChildren, setEndPosition, setPage, setParent, setStartPosition, setText, toHtml, toString
 
Methods inherited from class java.lang.Object
finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Constructor Detail

EscapeCharacterRemovingNode

public EscapeCharacterRemovingNode(Text newDelegate)
Deprecated. 

Method Detail

toPlainTextString

public java.lang.String toPlainTextString()
Deprecated. 
Description copied from interface: Node
A string representation of the node. This is an important method: it allows a simple string transformation of a web page, regardless of node type. For a Text node this is the textual content itself. For a Remark node this is the remark contents (sic). For tags this is the text content of its children (if any). Because multiple nodes are combined when presenting a page in a browser, this will not reflect what a user would see. See HTML specification section 9.1, White space: http://www.w3.org/TR/html4/struct/text.html#h-9.1.
Typical application code (for extracting only the text from a web page) would be:
 for (Enumeration e = parser.elements (); e.hasMoreElements ();)
     // or do whatever processing you wish with the plain text string
     System.out.println (((Node)e.nextElement ()).toPlainTextString ());
 

Specified by:
toPlainTextString in interface Node
Overrides:
toPlainTextString in class AbstractNodeDecorator