java.lang.Object
  org.htmlparser.nodeDecorators.AbstractNodeDecorator
Deprecated. Use either direct subclasses of the appropriate node, set as prototypes on the PrototypicalNodeFactory, or a dynamic proxy implementing the required node type interface. The former avoids the wrapping and delegation altogether, while the latter handles the wrapping and delegation without this class.
Here is an example of how to use dynamic proxies to accomplish the same effect as using decorators to wrap Text nodes:
    import java.lang.reflect.InvocationHandler;
    import java.lang.reflect.InvocationTargetException;
    import java.lang.reflect.Method;
    import java.lang.reflect.Proxy;

    import org.htmlparser.Parser;
    import org.htmlparser.PrototypicalNodeFactory;
    import org.htmlparser.Text;
    import org.htmlparser.nodes.TextNode;
    import org.htmlparser.util.ParserException;

    public class TextProxy implements InvocationHandler {
        protected Object mObject;

        public static Object newInstance (Object object) {
            Class cls;

            cls = object.getClass ();
            return (Proxy.newProxyInstance (
                cls.getClassLoader (),
                cls.getInterfaces (),
                new TextProxy (object)));
        }

        private TextProxy (Object object) {
            mObject = object;
        }

        public Object invoke (Object proxy, Method m, Object[] args) throws Throwable {
            Object result;
            String name;

            try {
                result = m.invoke (mObject, args);
                name = m.getName ();
                if (name.equals ("clone"))
                    result = newInstance (result); // wrap the cloned object
                else if (name.equals ("doSemanticAction")) // or other methods
                    System.out.println (mObject); // do the needful on the TextNode
            } catch (InvocationTargetException e) {
                throw e.getTargetException ();
            } catch (Exception e) {
                throw new RuntimeException ("unexpected invocation exception: " + e.getMessage());
            } finally {
            }

            return (result);
        }

        public static void main (String[] args) throws ParserException {
            // create the wrapped text node and set it as the prototype
            Text text = (Text) TextProxy.newInstance (new TextNode (null, 0, 0));
            PrototypicalNodeFactory factory = new PrototypicalNodeFactory ();
            factory.setTextPrototype (text);

            // perform the parse
            Parser parser = new Parser (args[0]);
            parser.setNodeFactory (factory);
            parser.parse (null);
        }
    }
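The other alternative named above, a direct subclass set as the factory prototype, might look roughly like the following. This is only a sketch with stand-in types so that it runs without the library: `BasicTextNode`, `LoggingTextNode` and `PrototypeFactory` are hypothetical names, not htmlparser classes, and the real factory may create nodes differently. The point it illustrates is that cloning a subclass prototype yields subclass instances, so no wrapper is needed.

```java
public class SubclassSketch {
    /** Hypothetical stand-in for a library text node that supports cloning. */
    static class BasicTextNode implements Cloneable {
        private String text = "";

        public String getText() { return text; }
        public void setText(String text) { this.text = text; }
        public void doSemanticAction() { /* nothing by default */ }

        @Override
        public BasicTextNode clone() {
            try {
                return (BasicTextNode) super.clone();
            } catch (CloneNotSupportedException e) {
                throw new AssertionError(e); // cannot happen: the class is Cloneable
            }
        }
    }

    /** Direct subclass: changes behaviour without wrapping or delegation. */
    static class LoggingTextNode extends BasicTextNode {
        @Override
        public void doSemanticAction() {
            System.out.println("semantic action on: " + getText());
        }
    }

    /** Hypothetical stand-in for a prototype-based node factory. */
    static class PrototypeFactory {
        private BasicTextNode prototype = new BasicTextNode();

        void setTextPrototype(BasicTextNode prototype) { this.prototype = prototype; }

        BasicTextNode createText(String text) {
            BasicTextNode node = prototype.clone(); // the clone keeps the runtime subclass
            node.setText(text);
            return node;
        }
    }

    public static void main(String[] args) {
        PrototypeFactory factory = new PrototypeFactory();
        factory.setTextPrototype(new LoggingTextNode());
        BasicTextNode node = factory.createText("hello");
        node.doSemanticAction(); // prints "semantic action on: hello"
    }
}
```

With the real library the same idea would presumably be a subclass of TextNode passed to setTextPrototype, as in the proxy example.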
Node wrapping base class.
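The wrapping and delegation that such a base class provides can be sketched as follows. The types here are simplified stand-ins, not the library's: `SimpleText` is a hypothetical two-method reduction of the Text interface, and `SimpleTextDecorator` only mirrors the general shape of a decorator holding a protected delegate.

```java
import java.util.Locale;

public class DecoratorSketch {
    /** Hypothetical stand-in for the Text interface, reduced to two methods. */
    interface SimpleText {
        String getText();
        void setText(String text);
    }

    /** Plain node that actually stores the text. */
    static class PlainText implements SimpleText {
        private String text;

        PlainText(String text) { this.text = text; }

        public String getText() { return text; }
        public void setText(String text) { this.text = text; }
    }

    /** Base decorator: forwards every call to the wrapped delegate. */
    static abstract class SimpleTextDecorator implements SimpleText {
        protected final SimpleText delegate;

        protected SimpleTextDecorator(SimpleText delegate) { this.delegate = delegate; }

        public String getText() { return delegate.getText(); }
        public void setText(String text) { delegate.setText(text); }
    }

    /** Concrete decorator: overrides only the behaviour it changes. */
    static class UpperCaseText extends SimpleTextDecorator {
        UpperCaseText(SimpleText delegate) { super(delegate); }

        @Override
        public String getText() {
            return delegate.getText().toUpperCase(Locale.ROOT);
        }
    }

    public static void main(String[] args) {
        SimpleText node = new UpperCaseText(new PlainText("hello"));
        System.out.println(node.getText()); // prints "HELLO"
        node.setText("world");              // forwarded to the wrapped node
        System.out.println(node.getText()); // prints "WORLD"
    }
}
```

Every method of the interface has to be forwarded by hand in the base class, which is the bookkeeping that subclassing or a dynamic proxy avoids.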
Field Summary
Modifier and Type | Field | Description
protected Text | delegate | Deprecated.
Constructor Summary
Modifier | Constructor | Description
protected | AbstractNodeDecorator(Text delegate) | Deprecated.
Method Summary
Return Type | Method | Description
void | accept(NodeVisitor visitor) | Deprecated. Apply the visitor to this node.
java.lang.Object | clone() | Deprecated. Clone this object.
void | collectInto(NodeList list, NodeFilter filter) | Deprecated. Collect this node and its child nodes into a list, provided the node satisfies the filtering criteria.
void | doSemanticAction() | Deprecated. Perform the meaning of this tag.
boolean | equals(java.lang.Object arg0) | Deprecated.
NodeList | getChildren() | Deprecated. Get the children of this node.
int | getEndPosition() | Deprecated. Gets the ending position of the node.
Node | getFirstChild() | Deprecated. Get the first child of this node.
Node | getLastChild() | Deprecated. Get the last child of this node.
Node | getNextSibling() | Deprecated. Get the next sibling to this node.
Page | getPage() | Deprecated. Get the page this node came from.
Node | getParent() | Deprecated. Get the parent of this node.
Node | getPreviousSibling() | Deprecated. Get the previous sibling to this node.
int | getStartPosition() | Deprecated. Gets the starting position of the node.
java.lang.String | getText() | Deprecated. Accesses the textual contents of the node.
void | setChildren(NodeList children) | Deprecated. Set the children of this node.
void | setEndPosition(int position) | Deprecated. Sets the ending position of the node.
void | setPage(Page page) | Deprecated. Set the page this node came from.
void | setParent(Node node) | Deprecated. Sets the parent of this node.
void | setStartPosition(int position) | Deprecated. Sets the starting position of the node.
void | setText(java.lang.String text) | Deprecated. Sets the contents of the node.
java.lang.String | toHtml() | Deprecated. Return the HTML for this node.
java.lang.String | toPlainTextString() | Deprecated. A string representation of the node.
java.lang.String | toString() | Deprecated. Return the string representation of the node.
Methods inherited from class java.lang.Object
finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
Field Detail
protected Text delegate

Constructor Detail
protected AbstractNodeDecorator(Text delegate)

Method Detail
public java.lang.Object clone() throws java.lang.CloneNotSupportedException
Clone this object.
Specified by: clone in interface Node
Throws: java.lang.CloneNotSupportedException - This shouldn't be thrown since the Node interface extends Cloneable.

public void accept(NodeVisitor visitor)
Description copied from interface: Node
Apply the visitor to this node.
Specified by: accept in interface Node
Parameters: visitor - The visitor to this node.

public void collectInto(NodeList list, NodeFilter filter)
Description copied from interface: Node
Collect this node and its child nodes into a list, provided the node satisfies the filtering criteria.

This mechanism allows powerful filtering code to be written very easily, without having to collect embedded tags separately. For example, the links on a page cannot all be found at the top level, because many tags (such as form tags) can contain embedded links. The links could be extracted by checking whether the current node is a CompositeTag and going through its children; this method provides a convenient way to do so.

Using collectInto(), programs get a lot shorter. The code to extract all links from a page would look like:

    NodeList list = new NodeList ();
    NodeFilter filter = new TagNameFilter ("A");
    for (NodeIterator e = parser.elements (); e.hasMoreNodes ();)
        e.nextNode ().collectInto (list, filter);

Thus, list will hold all the link nodes, irrespective of how deep the links are embedded.

Another way to accomplish the same objective is:

    NodeList list = new NodeList ();
    NodeFilter filter = new TagClassFilter (LinkTag.class);
    for (NodeIterator e = parser.elements (); e.hasMoreNodes ();)
        e.nextNode ().collectInto (list, filter);

This is slightly less specific because the LinkTag class may be registered for more than one node name, e.g. <LINK> tags too.
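The recursion that makes this work can be sketched with stand-in types. `SimpleNode` and its `collectInto` below are hypothetical, with a `Predicate` standing in for NodeFilter and a `List` for NodeList. The sketch only illustrates the idea of testing each node and then descending into its children, and makes no claim about how the library implements it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

public class CollectIntoSketch {
    /** Hypothetical stand-in for a parsed node: a tag name plus child nodes. */
    static class SimpleNode {
        final String name;
        final List<SimpleNode> children = new ArrayList<>();

        SimpleNode(String name, SimpleNode... kids) {
            this.name = name;
            for (SimpleNode kid : kids) {
                children.add(kid);
            }
        }

        /** Add this node if it passes the filter, then recurse into the children. */
        void collectInto(List<SimpleNode> list, Predicate<SimpleNode> filter) {
            if (filter.test(this)) {
                list.add(this);
            }
            for (SimpleNode child : children) {
                child.collectInto(list, filter);
            }
        }
    }

    public static void main(String[] args) {
        // One link at the top level, one nested inside a form and a paragraph.
        SimpleNode page = new SimpleNode("BODY",
            new SimpleNode("A"),
            new SimpleNode("FORM", new SimpleNode("P", new SimpleNode("A"))));

        List<SimpleNode> links = new ArrayList<>();
        page.collectInto(links, node -> node.name.equals("A"));
        System.out.println(links.size()); // prints 2
    }
}
```

Both links are found although one sits two levels below the form tag, which is the behaviour the description above relies on.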
Specified by: collectInto in interface Node
Parameters:
list - The list to collect nodes into.
filter - The criteria to use when deciding if a node should be added to the list.

public int getStartPosition()
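Gets the starting position of the node.
Specified by: getStartPosition in interface Node
See Also: Node.setStartPosition(int)

public void setStartPosition(int position)
Sets the starting position of the node.
Specified by: setStartPosition in interface Node
Parameters: position - The new start position.
See Also: Node.getStartPosition()

public int getEndPosition()
Gets the ending position of the node.
Specified by: getEndPosition in interface Node
See Also: Node.setEndPosition(int)

public void setEndPosition(int position)
Sets the ending position of the node.
Specified by: setEndPosition in interface Node
Parameters: position - The new end position.
See Also: Node.getEndPosition()

public Page getPage()
Get the page this node came from.
Specified by: getPage in interface Node
See Also: Node.setPage(org.htmlparser.lexer.Page)

public void setPage(Page page)
Set the page this node came from.
Specified by: setPage in interface Node
Parameters: page - The page that supplied this node.
See Also: Node.getPage()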
public boolean equals(java.lang.Object arg0)
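public Node getParent()
Description copied from interface: Node
Get the parent of this node. This will always return null when parsing with the Lexer. Currently, the object returned from this method can be safely cast to a CompositeTag, but this behaviour should not be expected in the future.
Specified by: getParent in interface Node
Returns: The parent of this node if there is one, null otherwise.
See Also: Node.setParent(org.htmlparser.Node)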
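public java.lang.String getText()
Description copied from interface: Text
Accesses the textual contents of the node.
Specified by: getText in interface Text
See Also: Text.setText(java.lang.String)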
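public void setParent(Node node)
Description copied from interface: Node
Sets the parent of this node.
Specified by: setParent in interface Node
Parameters: node - The node that contains this node.
See Also: Node.getParent()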
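public NodeList getChildren()
Get the children of this node.
Specified by: getChildren in interface Node
Returns: The children of this node if it has any, null otherwise.
See Also: Node.setChildren(org.htmlparser.util.NodeList)

public void setChildren(NodeList children)
Set the children of this node.
Specified by: setChildren in interface Node
Parameters: children - The new list of children this node contains.
See Also: Node.getChildren()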
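public Node getFirstChild()
Description copied from interface: Node
Get the first child of this node.
Specified by: getFirstChild in interface Node
Returns: The first child of this node if there is one, null otherwise.

public Node getLastChild()
Description copied from interface: Node
Get the last child of this node.
Specified by: getLastChild in interface Node
Returns: The last child of this node if there is one, null otherwise.

public Node getPreviousSibling()
Description copied from interface: Node
Get the previous sibling to this node.
Specified by: getPreviousSibling in interface Node
Returns: The previous sibling to this node if there is one, null otherwise.

public Node getNextSibling()
Description copied from interface: Node
Get the next sibling to this node.
Specified by: getNextSibling in interface Node
Returns: The next sibling to this node if there is one, null otherwise.

public void setText(java.lang.String text)
Description copied from interface: Text
Sets the contents of the node.
Specified by: setText in interface Text
Parameters: text - The new text for the node.
See Also: Text.getText()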
public java.lang.String toHtml()
Description copied from interface: Node
Return the HTML for this node.
Specified by: toHtml in interface Node
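public java.lang.String toPlainTextString()
Description copied from interface: Node
A string representation of the node. For example, to print the plain text of each node returned by the parser:

    for (NodeIterator e = parser.elements (); e.hasMoreNodes ();)
        // or do whatever processing you wish with the plain text string
        System.out.println (e.nextNode ().toPlainTextString ());

Specified by: toPlainTextString in interface Node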
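public java.lang.String toString()
Description copied from interface: Node
Return the string representation of the node. This is typically used in the manner

    System.out.println (node);

or within a debugging environment.
Specified by: toString in interface Node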
public void doSemanticAction() throws ParserException
Description copied from interface: Node
Perform the meaning of this tag.
Specified by: doSemanticAction in interface Node
Throws: ParserException - If a problem is encountered performing the semantic action.
See Also: Node.getChildren()