org.apache.xml.serialize
Class TextSerializer

java.lang.Object
  extended by org.apache.xml.serialize.BaseMarkupSerializer
      extended by org.apache.xml.serialize.TextSerializer
All Implemented Interfaces:
org.xml.sax.ContentHandler, org.xml.sax.ext.DeclHandler, org.xml.sax.DocumentHandler, DOMSerializer, org.xml.sax.DTDHandler, org.xml.sax.ext.LexicalHandler, Serializer

public class TextSerializer
extends BaseMarkupSerializer

Implements a text serializer supporting both DOM and SAX serializing. For usage instructions see Serializer.

If an output stream is used, the encoding is taken from the output format (defaults to UTF-8). If a writer is used, make sure the writer uses the same encoding (if applicable) as specified in the output format.

The serializer supports both DOM and SAX. DOM serializing is done by calling BaseMarkupSerializer.serialize(org.w3c.dom.Element) and SAX serializing is done by firing SAX events and using the serializer as a document handler.

If an I/O exception occurs while serializing, the serializer will not throw an exception directly, but only throw it at the end of serializing (either DOM or SAX's DocumentHandler.endDocument()).
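
The deferred-exception behaviour described above can be sketched with plain JDK SAX classes. This is only an illustration, not the Xerces implementation: the class DeferredIoTextHandler and its textOf helper are hypothetical names, and the sketch assumes that "text" output means writing character data only.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.Writer;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical illustration; not the Xerces TextSerializer implementation.
final class DeferredIoTextHandler extends DefaultHandler {
    private final Writer out;
    private IOException pending; // first I/O failure, reported at endDocument()

    DeferredIoTextHandler(Writer out) {
        this.out = out;
    }

    @Override
    public void characters(char[] chars, int start, int length) {
        if (pending != null) {
            return; // output already failed; skip further writes
        }
        try {
            out.write(chars, start, length);
        } catch (IOException e) {
            pending = e; // remember the failure instead of throwing now
        }
    }

    @Override
    public void endDocument() throws SAXException {
        if (pending != null) {
            throw new SAXException(pending); // surface the deferred failure
        }
    }

    // Parses the given XML string and returns only its character data.
    static String textOf(String xml) {
        StringWriter sink = new StringWriter();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)),
                           new DeferredIoTextHandler(sink));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return sink.toString();
    }
}
```

A caller that needs to detect output failures should therefore check the result of the end-of-document step rather than expect each event to fail individually.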

Version:
$Revision: 1.14 $ $Date: 2004/02/24 23:34:03 $
Author:
Assaf Arkin
See Also:
Serializer

Field Summary
 
Fields inherited from class org.apache.xml.serialize.BaseMarkupSerializer
_docTypePublicId, _docTypeSystemId, _encodingInfo, _format, _indenting, _prefixes, _printer, _started, fCurrentNode, fDOMError, fDOMErrorHandler, fDOMFilter, features, fStrBuffer
 
Constructor Summary
TextSerializer()
          Constructs a new serializer.
 
Method Summary
 void characters(char[] chars, int start, int length)
          Receive notification of character data.
protected  void characters(java.lang.String text, boolean unescaped)
           
 void comment(char[] chars, int start, int length)
          Report an XML comment anywhere in the document.
 void comment(java.lang.String text)
           
protected  ElementState content()
          Must be called by a method about to print any type of content.
 void endElement(java.lang.String tagName)
          Receive notification of the end of an element.
 void endElement(java.lang.String namespaceURI, java.lang.String localName, java.lang.String rawName)
          Receive notification of the end of an element.
 void endElementIO(java.lang.String tagName)
           
protected  java.lang.String getEntityRef(int ch)
          Returns the suitable entity reference for this character value, or null if no such entity exists.
 void processingInstructionIO(java.lang.String target, java.lang.String code)
           
protected  void serializeElement(org.w3c.dom.Element elem)
          Called to serialize a DOM element.
protected  void serializeNode(org.w3c.dom.Node node)
          Serialize the DOM node.
 void setOutputFormat(OutputFormat format)
          Specifies an output format for this serializer.
protected  void startDocument(java.lang.String rootTagName)
          Called to serialize the document's DOCTYPE by the root element.
 void startElement(java.lang.String tagName, org.xml.sax.AttributeList attrs)
          Receive notification of the beginning of an element.
 void startElement(java.lang.String namespaceURI, java.lang.String localName, java.lang.String rawName, org.xml.sax.Attributes attrs)
          Receive notification of the beginning of an element.
 
Methods inherited from class org.apache.xml.serialize.BaseMarkupSerializer
asContentHandler, asDocumentHandler, asDOMSerializer, attributeDecl, characters, checkUnboundNamespacePrefixedNode, elementDecl, endCDATA, endDocument, endDTD, endEntity, endNonEscaping, endPrefixMapping, endPreserving, enterElementState, externalEntityDecl, fatalError, getElementState, getPrefix, ignorableWhitespace, internalEntityDecl, isDocumentState, leaveElementState, modifyDOMError, notationDecl, prepare, printCDATAText, printDoctypeURL, printEscaped, printEscaped, printText, printText, processingInstruction, reset, serialize, serialize, serialize, serializePreRoot, setDocumentLocator, setOutputByteStream, setOutputCharStream, skippedEntity, startCDATA, startDocument, startDTD, startEntity, startNonEscaping, startPrefixMapping, startPreserving, surrogates, unparsedEntityDecl
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

TextSerializer

public TextSerializer()
Constructs a new serializer. The serializer cannot be used without calling BaseMarkupSerializer.setOutputCharStream(java.io.Writer) or BaseMarkupSerializer.setOutputByteStream(java.io.OutputStream) first.

Method Detail

setOutputFormat

public void setOutputFormat(OutputFormat format)
Description copied from interface: Serializer
Specifies an output format for this serializer. If the serializer has already been associated with an output format, it will switch to the new format. This method should not be called while the serializer is in the process of serializing a document.

Specified by:
setOutputFormat in interface Serializer
Overrides:
setOutputFormat in class BaseMarkupSerializer

startElement

public void startElement(java.lang.String namespaceURI,
                         java.lang.String localName,
                         java.lang.String rawName,
                         org.xml.sax.Attributes attrs)
                  throws org.xml.sax.SAXException
Description copied from interface: org.xml.sax.ContentHandler
Receive notification of the beginning of an element.

The Parser will invoke this method at the beginning of every element in the XML document; there will be a corresponding endElement event for every startElement event (even when the element is empty). All of the element's content will be reported, in order, before the corresponding endElement event.

This event allows up to three name components for each element:

  1. the Namespace URI;
  2. the local name; and
  3. the qualified (prefixed) name.

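Any or all of these may be provided, depending on the values of the http://xml.org/sax/features/namespaces and the http://xml.org/sax/features/namespace-prefixes properties:

  1. the Namespace URI and local name are required when the namespaces property is true (the default), and are optional when the namespaces property is false (if one is specified, both must be);
  2. the qualified name is required when the namespace-prefixes property is true, and is optional when the namespace-prefixes property is false (the default).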

Note that the attribute list provided will contain only attributes with explicit values (specified or defaulted): #IMPLIED attributes will be omitted. The attribute list will contain attributes used for Namespace declarations (xmlns* attributes) only if the http://xml.org/sax/features/namespace-prefixes property is true (it is false by default, and support for a true value is optional).

Like characters(), attribute values may have characters that need more than one char value.

Parameters:
namespaceURI - the Namespace URI, or the empty string if the element has no Namespace URI or if Namespace processing is not being performed
localName - the local name (without prefix), or the empty string if Namespace processing is not being performed
rawName - the qualified name (with prefix), or the empty string if qualified names are not available
attrs - the attributes attached to the element. If there are no attributes, it shall be an empty Attributes object. The value of this object after startElement returns is undefined
Throws:
org.xml.sax.SAXException - any SAX exception, possibly wrapping another exception
See Also:
ContentHandler.endElement(java.lang.String, java.lang.String, java.lang.String), Attributes, AttributesImpl
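
As a rough illustration of the name components described for startElement, the following sketch parses a small document with a namespace-aware JAXP parser and records the Namespace URI and local name of each element. ElementNameDemo and startEvents are hypothetical names; the qualified name is left out because it is optional under the default feature settings.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper showing which name components a handler receives.
final class ElementNameDemo {
    // Returns one "{namespaceURI}localName" entry per startElement event.
    static List<String> startEvents(String xml) {
        List<String> events = new ArrayList<>();
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true); // turns on Namespace processing
        try {
            factory.newSAXParser().parse(
                    new InputSource(new StringReader(xml)),
                    new DefaultHandler() {
                        @Override
                        public void startElement(String namespaceURI, String localName,
                                                 String rawName, Attributes attrs) {
                            events.add("{" + namespaceURI + "}" + localName);
                        }
                    });
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return events;
    }
}
```

For example, startEvents("<p:doc xmlns:p=\"urn:example\"><p:item/></p:doc>") yields one entry for doc and one for item, both carrying the urn:example Namespace URI.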

endElement

public void endElement(java.lang.String namespaceURI,
                       java.lang.String localName,
                       java.lang.String rawName)
                throws org.xml.sax.SAXException
Description copied from interface: org.xml.sax.ContentHandler
Receive notification of the end of an element.

The SAX parser will invoke this method at the end of every element in the XML document; there will be a corresponding startElement event for every endElement event (even when the element is empty).

For information on the names, see startElement.

Parameters:
namespaceURI - the Namespace URI, or the empty string if the element has no Namespace URI or if Namespace processing is not being performed
localName - the local name (without prefix), or the empty string if Namespace processing is not being performed
rawName - the qualified XML name (with prefix), or the empty string if qualified names are not available
Throws:
org.xml.sax.SAXException - any SAX exception, possibly wrapping another exception

startElement

public void startElement(java.lang.String tagName,
                         org.xml.sax.AttributeList attrs)
                  throws org.xml.sax.SAXException
Description copied from interface: org.xml.sax.DocumentHandler
Receive notification of the beginning of an element.

The Parser will invoke this method at the beginning of every element in the XML document; there will be a corresponding endElement() event for every startElement() event (even when the element is empty). All of the element's content will be reported, in order, before the corresponding endElement() event.

If the element name has a namespace prefix, the prefix will still be attached. Note that the attribute list provided will contain only attributes with explicit values (specified or defaulted): #IMPLIED attributes will be omitted.

Parameters:
tagName - The element type name.
attrs - The attributes attached to the element, if any.
Throws:
org.xml.sax.SAXException - Any SAX exception, possibly wrapping another exception.
See Also:
DocumentHandler.endElement(java.lang.String), AttributeList

endElement

public void endElement(java.lang.String tagName)
                throws org.xml.sax.SAXException
Description copied from interface: org.xml.sax.DocumentHandler
Receive notification of the end of an element.

The SAX parser will invoke this method at the end of every element in the XML document; there will be a corresponding startElement() event for every endElement() event (even when the element is empty).

If the element name has a namespace prefix, the prefix will still be attached to the name.

Parameters:
tagName - The element type name
Throws:
org.xml.sax.SAXException - Any SAX exception, possibly wrapping another exception.

endElementIO

public void endElementIO(java.lang.String tagName)
                  throws java.io.IOException
Throws:
java.io.IOException

processingInstructionIO

public void processingInstructionIO(java.lang.String target,
                                    java.lang.String code)
                             throws java.io.IOException
Overrides:
processingInstructionIO in class BaseMarkupSerializer
Throws:
java.io.IOException

comment

public void comment(java.lang.String text)
Overrides:
comment in class BaseMarkupSerializer

comment

public void comment(char[] chars,
                    int start,
                    int length)
Description copied from interface: org.xml.sax.ext.LexicalHandler
Report an XML comment anywhere in the document.

This callback will be used for comments inside or outside the document element, including comments in the external DTD subset (if read). Comments in the DTD must be properly nested inside start/endDTD and start/endEntity events (if used).

Specified by:
comment in interface org.xml.sax.ext.LexicalHandler
Overrides:
comment in class BaseMarkupSerializer
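
Comment events only reach a handler that has been registered as a lexical handler. A minimal sketch of that registration with a JAXP parser follows; CommentCollector and commentsIn are hypothetical names, and the handler is a plain DefaultHandler2 rather than this serializer.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.ext.DefaultHandler2;

// Hypothetical helper that collects the text of every reported comment.
final class CommentCollector {
    static List<String> commentsIn(String xml) {
        List<String> comments = new ArrayList<>();
        DefaultHandler2 handler = new DefaultHandler2() {
            @Override
            public void comment(char[] chars, int start, int length) {
                // Read only the range the parser reported.
                comments.add(new String(chars, start, length));
            }
        };
        try {
            XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            reader.setContentHandler(handler);
            // LexicalHandler is an extension interface and is set through a property.
            reader.setProperty("http://xml.org/sax/properties/lexical-handler", handler);
            reader.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return comments;
    }
}
```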

characters

public void characters(char[] chars,
                       int start,
                       int length)
                throws org.xml.sax.SAXException
Description copied from interface: org.xml.sax.ContentHandler
Receive notification of character data.

The Parser will call this method to report each chunk of character data. SAX parsers may return all contiguous character data in a single chunk, or they may split it into several chunks; however, all of the characters in any single event must come from the same external entity so that the Locator provides useful information.

The application must not attempt to read from the array outside of the specified range.

Individual characters may consist of more than one Java char value. There are two important cases where this happens, because characters can't be represented in just sixteen bits. In one case, characters are represented in a Surrogate Pair, using two special Unicode values. Such characters are in the so-called "Astral Planes", with a code point above U+FFFF. A second case involves composite characters, such as a base character combining with one or more accent characters.

Your code should not assume that algorithms using char-at-a-time idioms will be working in character units; in some cases they will split characters. This is relevant wherever XML permits arbitrary characters, such as attribute values, processing instruction data, and comments as well as in data reported from this method. It's also generally relevant whenever Java code manipulates internationalized text; the issue isn't unique to XML.

Note that some parsers will report whitespace in element content using the ignorableWhitespace method rather than this one (validating parsers must do so).

Specified by:
characters in interface org.xml.sax.ContentHandler
Overrides:
characters in class BaseMarkupSerializer
Throws:
org.xml.sax.SAXException
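
The warning about char-at-a-time idioms can be made concrete with a short sketch that walks the reported range by Unicode code point instead of by char. CodePointDemo is a hypothetical helper; it handles surrogate pairs but does not group a base character with its combining accents, which would need additional logic.

```java
// Hypothetical helper for inspecting the range passed to characters().
final class CodePointDemo {
    // Counts code points in the range; a surrogate pair counts once.
    static int countCodePoints(char[] chars, int start, int length) {
        return Character.codePointCount(chars, start, length);
    }

    // Lists the code points in the range as "U+XXXX" values.
    static String describe(char[] chars, int start, int length) {
        StringBuilder sb = new StringBuilder();
        int end = start + length; // never read past the reported range
        int i = start;
        while (i < end) {
            int cp = Character.codePointAt(chars, i, end);
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("U+%04X", cp));
            i += Character.charCount(cp); // advance by 1 or 2 chars
        }
        return sb.toString();
    }
}
```

Note that a combining sequence such as "e" followed by U+0301 is still two code points, so code that must treat it as one user-visible character cannot rely on code point iteration alone.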

characters

protected void characters(java.lang.String text,
                          boolean unescaped)
                   throws java.io.IOException
Throws:
java.io.IOException

startDocument

protected void startDocument(java.lang.String rootTagName)
                      throws java.io.IOException
Called to serialize the document's DOCTYPE by the root element.

This method will check if it has not been called before (BaseMarkupSerializer._started), will serialize the document type declaration, and will serialize all pre-root comments and PIs that were accumulated in the document (see BaseMarkupSerializer.serializePreRoot()). Pre-root will be serialized even if this is not the first root element of the document.

Throws:
java.io.IOException

serializeElement

protected void serializeElement(org.w3c.dom.Element elem)
                         throws java.io.IOException
Called to serialize a DOM element. Equivalent to calling startElement(java.lang.String, java.lang.String, java.lang.String, org.xml.sax.Attributes), endElement(java.lang.String, java.lang.String, java.lang.String) and serializing everything in between, but better optimized.

Specified by:
serializeElement in class BaseMarkupSerializer
Parameters:
elem - The element to serialize
Throws:
java.io.IOException - An I/O exception occurred while serializing

serializeNode

protected void serializeNode(org.w3c.dom.Node node)
                      throws java.io.IOException
Serialize the DOM node. This method is unique to the Text serializer.

Overrides:
serializeNode in class BaseMarkupSerializer
Parameters:
node - The node to serialize
Throws:
java.io.IOException - An I/O exception occurred while serializing
See Also:
BaseMarkupSerializer.serializeElement(org.w3c.dom.Element)
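
To give a feel for node-by-node serialization of a DOM tree as text, here is a sketch using only the JDK DOM API. It is not the code of serializeNode: DomTextWalker, appendText and textOf are hypothetical names, and the sketch assumes that only text and CDATA content is emitted while comments and processing instructions are skipped.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical illustration of a text-only DOM walk.
final class DomTextWalker {
    // Appends the character data found under the given node, in document order.
    static void appendText(Node node, StringBuilder out) {
        switch (node.getNodeType()) {
            case Node.TEXT_NODE:
            case Node.CDATA_SECTION_NODE:
                out.append(node.getNodeValue());
                break;
            case Node.ELEMENT_NODE:
            case Node.DOCUMENT_NODE:
            case Node.DOCUMENT_FRAGMENT_NODE:
                for (Node child = node.getFirstChild(); child != null;
                        child = child.getNextSibling()) {
                    appendText(child, out); // recurse into child nodes
                }
                break;
            default:
                break; // assumption: comments, PIs and other nodes produce no text
        }
    }

    // Parses the XML string into a DOM document and returns its text content.
    static String textOf(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            StringBuilder out = new StringBuilder();
            appendText(doc, out);
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }
}
```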

content

protected ElementState content()
Description copied from class: BaseMarkupSerializer
Must be called by a method about to print any type of content. If the element was just opened, the opening tag is closed and will be matched to a closing tag. Returns the current element state with empty and afterElement set to false.

Overrides:
content in class BaseMarkupSerializer
Returns:
The current element state

getEntityRef

protected java.lang.String getEntityRef(int ch)
Description copied from class: BaseMarkupSerializer
Returns the suitable entity reference for this character value, or null if no such entity exists. Calling this method with '&' will return "&amp;".

Specified by:
getEntityRef in class BaseMarkupSerializer
Parameters:
ch - Character value
Returns:
Character entity name, or null


Copyright © 1999-2005 Apache XML Project. All Rights Reserved.