Class XMLEncodingDetector
- java.lang.Object
-
- org.apache.jasper.xmlparser.XMLEncodingDetector
-
public class XMLEncodingDetector extends java.lang.Object
-
-
Nested Class Summary
Modifier and Type, Class, Description
private class
XMLEncodingDetector.RewindableInputStream
This class wraps the byte input streams we are presented with.
-
Field Summary
Modifier and Type, Field, Description
private char[]
ch
private int
columnNumber
private int
count
static int
DEFAULT_BUFFER_SIZE
static int
DEFAULT_XMLDECL_BUFFER_SIZE
private java.lang.String
encoding
private ErrorDispatcher
err
private boolean
fAllowJavaEncodings
private int
fBufferSize
private XMLEncodingDetector
fCurrentEntity
private static java.lang.String
fEncodingSymbol
private int
fMarkupDepth
private static java.lang.String
fStandaloneSymbol
private XMLString
fString
private XMLStringBuffer
fStringBuffer
private XMLStringBuffer
fStringBuffer2
private java.lang.String[]
fStrings
private SymbolTable
fSymbolTable
private static java.lang.String
fVersionSymbol
private java.lang.Boolean
hasBom
private java.lang.Boolean
isBigEndian
private boolean
isEncodingSetInProlog
private int
lineNumber
private boolean
literal
private boolean
mayReadChunks
private int
position
private java.io.Reader
reader
private java.io.InputStream
stream
-
Constructor Summary
Constructor, Description
XMLEncodingDetector()
Constructor
-
Method Summary
Modifier and Type, Method, Description
private void
createInitialReader()
private java.io.Reader
createReader(java.io.InputStream inputStream, java.lang.String encoding, java.lang.Boolean isBigEndian)
Creates a reader capable of reading the given input stream in the specified encoding.
(package private) void
endEntity()
private java.lang.Object[]
getEncoding(java.io.InputStream in, ErrorDispatcher err)
static java.lang.Object[]
getEncoding(java.lang.String fname, java.util.jar.JarFile jarFile, JspCompilationContext ctxt, ErrorDispatcher err)
Autodetects the encoding of the XML document supplied by the given input stream.
private java.lang.Object[]
getEncodingName(byte[] b4, int count)
Returns the IANA encoding name that is auto-detected from the bytes specified, with the endianness of that encoding where appropriate.
boolean
isExternal()
Returns true if the current entity being scanned is external.
(package private) boolean
load(int offset, boolean changeEntity)
Loads a chunk of text.
int
peekChar()
Returns the next character on the input.
private void
reportFatalError(java.lang.String msgId, java.lang.String arg)
Convenience function used in all XML scanners.
int
scanChar()
Returns the next character on the input.
boolean
scanData(java.lang.String delimiter, XMLStringBuffer buffer)
Scans a range of character data up to the specified delimiter, setting the fields of the XMLString structure appropriately.
int
scanLiteral(int quote, XMLString content)
Scans a range of attribute value data, setting the fields of the XMLString structure appropriately.
java.lang.String
scanName()
Returns a string matching the Name production appearing immediately on the input as a symbol, or null if no Name string is present.
private void
scanPIData(java.lang.String target, XMLString data)
Scans processing instruction data.
java.lang.String
scanPseudoAttribute(boolean scanningTextDecl, XMLString value)
Scans a pseudo attribute.
private boolean
scanSurrogates(XMLStringBuffer buf)
Scans surrogates and appends them to the specified buffer.
private void
scanXMLDecl()
private void
scanXMLDeclOrTextDecl(boolean scanningTextDecl)
Scans an XML or text declaration.
private void
scanXMLDeclOrTextDecl(boolean scanningTextDecl, java.lang.String[] pseudoAttributeValues)
Scans an XML or text declaration.
boolean
skipChar(int c)
Skips a character appearing immediately on the input.
boolean
skipSpaces()
Skips space characters appearing immediately on the input.
boolean
skipString(java.lang.String s)
Skips the specified string appearing immediately on the input.
-
-
-
Field Detail
-
stream
private java.io.InputStream stream
-
encoding
private java.lang.String encoding
-
isEncodingSetInProlog
private boolean isEncodingSetInProlog
-
isBigEndian
private java.lang.Boolean isBigEndian
-
hasBom
private java.lang.Boolean hasBom
-
reader
private java.io.Reader reader
-
DEFAULT_BUFFER_SIZE
public static final int DEFAULT_BUFFER_SIZE
- See Also:
- Constant Field Values
-
DEFAULT_XMLDECL_BUFFER_SIZE
public static final int DEFAULT_XMLDECL_BUFFER_SIZE
- See Also:
- Constant Field Values
-
fAllowJavaEncodings
private boolean fAllowJavaEncodings
-
fSymbolTable
private SymbolTable fSymbolTable
-
fCurrentEntity
private XMLEncodingDetector fCurrentEntity
-
fBufferSize
private int fBufferSize
-
lineNumber
private int lineNumber
-
columnNumber
private int columnNumber
-
literal
private boolean literal
-
ch
private char[] ch
-
position
private int position
-
count
private int count
-
mayReadChunks
private boolean mayReadChunks
-
fString
private XMLString fString
-
fStringBuffer
private XMLStringBuffer fStringBuffer
-
fStringBuffer2
private XMLStringBuffer fStringBuffer2
-
fVersionSymbol
private static final java.lang.String fVersionSymbol
- See Also:
- Constant Field Values
-
fEncodingSymbol
private static final java.lang.String fEncodingSymbol
- See Also:
- Constant Field Values
-
fStandaloneSymbol
private static final java.lang.String fStandaloneSymbol
- See Also:
- Constant Field Values
-
fMarkupDepth
private int fMarkupDepth
-
fStrings
private java.lang.String[] fStrings
-
err
private ErrorDispatcher err
-
-
Method Detail
-
getEncoding
public static java.lang.Object[] getEncoding(java.lang.String fname, java.util.jar.JarFile jarFile, JspCompilationContext ctxt, ErrorDispatcher err) throws java.io.IOException, JasperException
Autodetects the encoding of the XML document supplied by the given input stream. Encoding autodetection is done according to the XML 1.0 specification, Appendix F.1: Detection Without External Encoding Information.
- Returns:
- Two-element array, where the first element (of type java.lang.String) contains the name of the (auto)detected encoding, and the second element (of type java.lang.Boolean) specifies whether the encoding was specified using the 'encoding' attribute of an XML prolog (TRUE) or autodetected (FALSE).
- Throws:
java.io.IOException
JasperException
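The shape of the two-element result can be pictured with a small stand-alone sketch. It does not call Jasper: `PrologSniffer` and `sniff` are hypothetical names, the input is assumed to be in an ASCII-compatible encoding, and the UTF-8 fallback is an assumption of the sketch rather than documented behaviour.

```java
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Jasper.
final class PrologSniffer {
    // Matches encoding="..." or encoding='...' inside a leading XML declaration.
    private static final Pattern ENCODING = Pattern.compile(
            "^<\\?xml[^>]*?\\sencoding\\s*=\\s*([\"'])([A-Za-z][A-Za-z0-9._-]*)\\1");

    /** Returns {String encodingName, Boolean declaredInProlog}. */
    static Object[] sniff(byte[] head) {
        // ISO-8859-1 maps every byte to exactly one char, so decoding cannot fail.
        String text = new String(head, StandardCharsets.ISO_8859_1);
        Matcher m = ENCODING.matcher(text);
        if (m.find()) {
            return new Object[] { m.group(2), Boolean.TRUE };
        }
        // Assumed fallback for this sketch when no encoding is declared.
        return new Object[] { "UTF-8", Boolean.FALSE };
    }

    public static void main(String[] args) {
        byte[] doc = "<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?><a/>"
                .getBytes(StandardCharsets.US_ASCII);
        Object[] result = sniff(doc);
        System.out.println(result[0] + " declared=" + result[1]);
    }
}
```

A caller would cast the first element to String and the second to Boolean, in the same way as for the array described under Returns.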
-
getEncoding
private java.lang.Object[] getEncoding(java.io.InputStream in, ErrorDispatcher err) throws java.io.IOException, JasperException
- Throws:
java.io.IOException
JasperException
-
endEntity
void endEntity()
-
createInitialReader
private void createInitialReader() throws java.io.IOException, JasperException
- Throws:
java.io.IOException
JasperException
-
createReader
private java.io.Reader createReader(java.io.InputStream inputStream, java.lang.String encoding, java.lang.Boolean isBigEndian) throws java.io.IOException, JasperException
Creates a reader capable of reading the given input stream in the specified encoding.
- Parameters:
inputStream
- The input stream.
encoding
- The encoding name that the input stream is encoded using. If the user has specified that Java encoding names are allowed, then the encoding name may be a Java encoding name; otherwise, it is an IANA encoding name.
isBigEndian
- For encodings (like UCS-4) whose names cannot specify a byte order, this tells whether the order is big-endian. null means unknown or not relevant.
- Returns:
- Returns a reader.
- Throws:
java.io.IOException
JasperException
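As an illustration of how a byte-order flag can complement an encoding name, here is a minimal sketch built only on JDK classes. `ReaderFactorySketch`, `resolve`, and `open` are hypothetical names. The sketch uses UTF-16 rather than UCS-4 because the JDK guarantees UTF-16BE and UTF-16LE charsets, and the UTF-8 default for a null name is an assumption.

```java
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not the Jasper implementation.
final class ReaderFactorySketch {
    /** Picks a charset from a name plus an optional byte-order hint. */
    static Charset resolve(String encoding, Boolean isBigEndian) {
        if (encoding == null) {
            return StandardCharsets.UTF_8; // assumed default for this sketch
        }
        // "UTF-16" alone does not fix a byte order, so consult the hint if present.
        if (encoding.equalsIgnoreCase("UTF-16") && isBigEndian != null) {
            return isBigEndian ? StandardCharsets.UTF_16BE : StandardCharsets.UTF_16LE;
        }
        // Throws an unchecked exception for illegal or unsupported names.
        return Charset.forName(encoding);
    }

    /** Wraps the stream in a reader for the resolved charset. */
    static Reader open(InputStream in, String encoding, Boolean isBigEndian) {
        return new InputStreamReader(in, resolve(encoding, isBigEndian));
    }
}
```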
-
getEncodingName
private java.lang.Object[] getEncodingName(byte[] b4, int count)
Returns the IANA encoding name that is auto-detected from the bytes specified, with the endianness of that encoding where appropriate.
- Parameters:
b4
- The first four bytes of the input.
count
- The number of bytes actually read.
- Returns:
- A two-element array: the first element is an IANA encoding name; the second is a Boolean that is true if the document is big-endian, false if it is little-endian, and null if the distinction is not relevant.
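Detection from the first four bytes can be sketched as below. This is a partial illustration of the byte patterns listed in XML 1.0 Appendix F.1, not the Jasper code: `BomSniffer` and `guess` are hypothetical names, the UCS-4 byte order marks and the EBCDIC case are left out, and the UTF-8 default is an assumption of the sketch.

```java
// Hypothetical helper covering a subset of XML 1.0 Appendix F.1.
final class BomSniffer {
    /** Returns {String ianaName, Boolean bigEndian}; bigEndian is null when not relevant. */
    static Object[] guess(byte[] b4, int count) {
        if (count < 2) {
            return new Object[] { "UTF-8", null };
        }
        int b0 = b4[0] & 0xFF;
        int b1 = b4[1] & 0xFF;
        // UTF-16 byte order marks.
        if (b0 == 0xFE && b1 == 0xFF) {
            return new Object[] { "UTF-16BE", Boolean.TRUE };
        }
        if (b0 == 0xFF && b1 == 0xFE) {
            return new Object[] { "UTF-16LE", Boolean.FALSE };
        }
        if (count < 4) {
            return new Object[] { "UTF-8", null };
        }
        int b2 = b4[2] & 0xFF;
        int b3 = b4[3] & 0xFF;
        // No BOM: look at how the leading '<' (0x3C) and '?' (0x3F) are laid out.
        if (b0 == 0x00 && b1 == 0x00 && b2 == 0x00 && b3 == 0x3C) {
            return new Object[] { "ISO-10646-UCS-4", Boolean.TRUE };
        }
        if (b0 == 0x3C && b1 == 0x00 && b2 == 0x00 && b3 == 0x00) {
            return new Object[] { "ISO-10646-UCS-4", Boolean.FALSE };
        }
        if (b0 == 0x00 && b1 == 0x3C && b2 == 0x00 && b3 == 0x3F) {
            return new Object[] { "UTF-16BE", Boolean.TRUE };
        }
        if (b0 == 0x3C && b1 == 0x00 && b2 == 0x3F && b3 == 0x00) {
            return new Object[] { "UTF-16LE", Boolean.FALSE };
        }
        // Assumed default, which also covers "<?xm" in an ASCII-compatible encoding.
        return new Object[] { "UTF-8", null };
    }
}
```

The null in the second slot mirrors the "not relevant" case described under Returns.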
-
isExternal
public boolean isExternal()
Returns true if the current entity being scanned is external.
-
peekChar
public int peekChar() throws java.io.IOException
Returns the next character on the input.
Note: The character is not consumed.
- Throws:
java.io.IOException
- Thrown if an I/O error occurs.
java.io.EOFException
- Thrown on end of file.
-
scanChar
public int scanChar() throws java.io.IOException
Returns the next character on the input.
Note: The character is consumed.
- Throws:
java.io.IOException
- Thrown if an I/O error occurs.
java.io.EOFException
- Thrown on end of file.
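The difference between peeking and scanning comes down to whether the read position advances. A minimal sketch over an in-memory buffer follows. `CharCursor` is a hypothetical class, and returning -1 at end of input is a simplification made here in place of the EOFException documented above.

```java
// Hypothetical cursor over a fixed character buffer.
final class CharCursor {
    private final char[] ch;
    private int position;

    CharCursor(String text) {
        this.ch = text.toCharArray();
    }

    /** Returns the next character without advancing, or -1 at end of input. */
    int peekChar() {
        return position < ch.length ? ch[position] : -1;
    }

    /** Returns the next character and advances past it, or -1 at end of input. */
    int scanChar() {
        return position < ch.length ? ch[position++] : -1;
    }
}
```

Calling `peekChar()` twice in a row yields the same character, whereas each `scanChar()` call moves on to the next one.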
-
scanName
public java.lang.String scanName() throws java.io.IOException
Returns a string matching the Name production appearing immediately on the input as a symbol, or null if no Name string is present.
Note: The Name characters are consumed.
Note: The string returned must be a symbol. The SymbolTable can be used for this purpose.
- Throws:
java.io.IOException
- Thrown if an I/O error occurs.
java.io.EOFException
- Thrown on end of file.
- See Also:
SymbolTable, XMLChar.isName(int), XMLChar.isNameStart(int)
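A rough sketch of Name scanning is shown below. `NameScanner` is a hypothetical class, its character tests are a simplified approximation of the XML Name production rather than the XMLChar tables, and `String.intern()` stands in for the SymbolTable so that equal names share one instance.

```java
// Hypothetical scanner with simplified Name character classes.
final class NameScanner {
    static boolean isNameStart(int c) {
        return Character.isLetter(c) || c == '_' || c == ':';
    }

    static boolean isName(int c) {
        return isNameStart(c) || Character.isDigit(c) || c == '.' || c == '-';
    }

    /** Returns the interned Name starting at offset, or null if none is present. */
    static String scanName(String input, int offset) {
        if (offset >= input.length() || !isNameStart(input.charAt(offset))) {
            return null;
        }
        int end = offset + 1;
        while (end < input.length() && isName(input.charAt(end))) {
            end++;
        }
        // intern() plays the role of a symbol table in this sketch.
        return input.substring(offset, end).intern();
    }
}
```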
-
scanLiteral
public int scanLiteral(int quote, XMLString content) throws java.io.IOException
Scans a range of attribute value data, setting the fields of the XMLString structure appropriately.
Note: The characters are consumed.
Note: This method does not guarantee to return the longest run of attribute value data. This method may return before the quote character due to reaching the end of the input buffer or any other reason.
Note: The fields contained in the XMLString structure are not guaranteed to remain valid upon subsequent calls to the entity scanner. Therefore, the caller is responsible for immediately using the returned character data or making a copy of the character data.
- Parameters:
quote
- The quote character that signifies the end of the attribute value data.
content
- The content structure to fill.
- Returns:
- Returns the next character on the input, if known. This value may be -1, but that does not designate end of file.
- Throws:
java.io.IOException
- Thrown if an I/O error occurs.
java.io.EOFException
- Thrown on end of file.
-
scanData
public boolean scanData(java.lang.String delimiter, XMLStringBuffer buffer) throws java.io.IOException
Scans a range of character data up to the specified delimiter, setting the fields of the XMLString structure appropriately.
Note: The characters are consumed.
Note: This assumes that the internal buffer is at least as large as the delimiter and that the delimiter contains at least one character.
Note: This method does not guarantee to return the longest run of character data. This method may return before the delimiter due to reaching the end of the input buffer or any other reason.
Note: The fields contained in the XMLString structure are not guaranteed to remain valid upon subsequent calls to the entity scanner. Therefore, the caller is responsible for immediately using the returned character data or making a copy of the character data.
- Parameters:
delimiter
- The string that signifies the end of the character data to be scanned.
buffer
- The data structure to fill.
- Returns:
- Returns true if there is more data to scan, false otherwise.
- Throws:
java.io.IOException
- Thrown if an I/O error occurs.
java.io.EOFException
- Thrown on end of file.
-
skipChar
public boolean skipChar(int c) throws java.io.IOException
Skips a character appearing immediately on the input.
Note: The character is consumed only if it matches the specified character.
- Parameters:
c
- The character to skip.
- Returns:
- Returns true if the character was skipped.
- Throws:
java.io.IOException
- Thrown if an I/O error occurs.
java.io.EOFException
- Thrown on end of file.
-
skipSpaces
public boolean skipSpaces() throws java.io.IOException
Skips space characters appearing immediately on the input.
Note: The characters are consumed only if they are space characters.
- Returns:
- Returns true if at least one space character was skipped.
- Throws:
java.io.IOException
- Thrown if an I/O error occurs.
java.io.EOFException
- Thrown on end of file.
- See Also:
XMLChar.isSpace(int)
-
skipString
public boolean skipString(java.lang.String s) throws java.io.IOException
Skips the specified string appearing immediately on the input.
Note: The characters are consumed only if they match the specified string.
- Parameters:
s
- The string to skip.
- Returns:
- Returns true if the string was skipped.
- Throws:
java.io.IOException
- Thrown if an I/O error occurs.
java.io.EOFException
- Thrown on end of file.
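The three skip methods share one idea: input is consumed only when it matches, and the return value reports whether anything was consumed. A minimal sketch of that contract over an in-memory string follows. `SkipCursor` and its `position()` accessor are hypothetical, and the space test covers the four characters of the XML S production.

```java
// Hypothetical cursor illustrating conditional consumption.
final class SkipCursor {
    private final String in;
    private int pos;

    SkipCursor(String in) {
        this.in = in;
    }

    int position() {
        return pos;
    }

    /** Consumes the next character only if it equals c. */
    boolean skipChar(int c) {
        if (pos < in.length() && in.charAt(pos) == c) {
            pos++;
            return true;
        }
        return false;
    }

    /** Consumes a run of XML space characters; true if at least one was skipped. */
    boolean skipSpaces() {
        int start = pos;
        while (pos < in.length() && isSpace(in.charAt(pos))) {
            pos++;
        }
        return pos > start;
    }

    /** Consumes s only if the input continues with exactly s. */
    boolean skipString(String s) {
        if (in.startsWith(s, pos)) {
            pos += s.length();
            return true;
        }
        return false;
    }

    private static boolean isSpace(char c) {
        return c == ' ' || c == '\t' || c == '\n' || c == '\r';
    }
}
```

A failed `skipString` leaves the position unchanged, so the caller can try a different alternative at the same point.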
-
load
final boolean load(int offset, boolean changeEntity) throws java.io.IOException
Loads a chunk of text.
- Parameters:
offset
- The offset into the character buffer to read the next batch of characters.
changeEntity
- True if the load should change entities at the end of the entity, otherwise leave the current entity in place and the entity boundary will be signaled by the return value.
- Throws:
java.io.IOException
-
scanXMLDecl
private void scanXMLDecl() throws java.io.IOException, JasperException
- Throws:
java.io.IOException
JasperException
-
scanXMLDeclOrTextDecl
private void scanXMLDeclOrTextDecl(boolean scanningTextDecl) throws java.io.IOException, JasperException
Scans an XML or text declaration.
[23] XMLDecl ::= '<?xml' VersionInfo EncodingDecl? SDDecl? S? '?>'
[24] VersionInfo ::= S 'version' Eq ("'" VersionNum "'" | '"' VersionNum '"')
[80] EncodingDecl ::= S 'encoding' Eq ('"' EncName '"' | "'" EncName "'" )
[81] EncName ::= [A-Za-z] ([A-Za-z0-9._] | '-')*
[32] SDDecl ::= S 'standalone' Eq (("'" ('yes' | 'no') "'") | ('"' ('yes' | 'no') '"'))
[77] TextDecl ::= '<?xml' VersionInfo? EncodingDecl S? '?>'
- Parameters:
scanningTextDecl
- True if a text declaration is to be scanned instead of an XML declaration.
- Throws:
java.io.IOException
JasperException
-
scanXMLDeclOrTextDecl
private void scanXMLDeclOrTextDecl(boolean scanningTextDecl, java.lang.String[] pseudoAttributeValues) throws java.io.IOException, JasperException
Scans an XML or text declaration.
[23] XMLDecl ::= '<?xml' VersionInfo EncodingDecl? SDDecl? S? '?>'
[24] VersionInfo ::= S 'version' Eq ("'" VersionNum "'" | '"' VersionNum '"')
[80] EncodingDecl ::= S 'encoding' Eq ('"' EncName '"' | "'" EncName "'" )
[81] EncName ::= [A-Za-z] ([A-Za-z0-9._] | '-')*
[32] SDDecl ::= S 'standalone' Eq (("'" ('yes' | 'no') "'") | ('"' ('yes' | 'no') '"'))
[77] TextDecl ::= '<?xml' VersionInfo? EncodingDecl S? '?>'
- Parameters:
scanningTextDecl
- True if a text declaration is to be scanned instead of an XML declaration.
pseudoAttributeValues
- An array of size 3 to return the version, encoding and standalone pseudo attribute values (in that order).
Note: This method uses fString; anything in it at the time of calling is lost.
- Throws:
java.io.IOException
JasperException
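The size-3 output array can be illustrated with a regex-based sketch. `XmlDeclSketch` and `parse` are hypothetical names. Unlike a real scanner, this version neither enforces the order of the pseudo attributes nor reports errors; it only shows how version, encoding, and standalone land in slots 0, 1, and 2.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical, lenient parser for an XML or text declaration string.
final class XmlDeclSketch {
    private static final Pattern PSEUDO_ATTR = Pattern.compile(
            "\\s+(version|encoding|standalone)\\s*=\\s*([\"'])(.*?)\\2");

    /** Returns {version, encoding, standalone}; missing entries stay null. */
    static String[] parse(String decl) {
        String[] values = new String[3];
        if (!decl.startsWith("<?xml") || !decl.endsWith("?>")) {
            return values; // not a declaration: all three slots remain null
        }
        Matcher m = PSEUDO_ATTR.matcher(decl);
        while (m.find()) {
            switch (m.group(1)) {
                case "version":
                    values[0] = m.group(3);
                    break;
                case "encoding":
                    values[1] = m.group(3);
                    break;
                default: // "standalone" is the only remaining alternative
                    values[2] = m.group(3);
                    break;
            }
        }
        return values;
    }
}
```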
-
scanPseudoAttribute
public java.lang.String scanPseudoAttribute(boolean scanningTextDecl, XMLString value) throws java.io.IOException, JasperException
Scans a pseudo attribute.
- Parameters:
scanningTextDecl
- True if scanning this pseudo-attribute for a TextDecl; false if scanning XMLDecl. This flag is needed to report the correct type of error.
value
- The string to fill in with the attribute value.
- Returns:
- The name of the attribute.
Note: This method uses fStringBuffer2; anything in it at the time of calling is lost.
- Throws:
java.io.IOException
JasperException
-
scanPIData
private void scanPIData(java.lang.String target, XMLString data) throws java.io.IOException, JasperException
Scans processing instruction data. This is needed to handle the situation where a document starts with a processing instruction whose target name starts with "xml" (e.g. xmlfoo).
Note: This method uses fStringBuffer; anything in it at the time of calling is lost.
- Parameters:
target
- The PI target.
data
- The string to fill in with the data.
- Throws:
java.io.IOException
JasperException
-
scanSurrogates
private boolean scanSurrogates(XMLStringBuffer buf) throws java.io.IOException, JasperException
Scans surrogates and appends them to the specified buffer.
Note: This assumes the current char has already been identified as a high surrogate.
- Parameters:
buf
- The StringBuffer to append the read surrogates to.
- Throws:
java.io.IOException
JasperException
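Surrogate handling can be sketched with the JDK's own character tests. `SurrogateSketch` and `appendSurrogates` are hypothetical names, and a StringBuilder stands in for XMLStringBuffer. The method appends a pair only when a high surrogate is directly followed by a low surrogate.

```java
// Hypothetical helper using java.lang.Character surrogate checks.
final class SurrogateSketch {
    /**
     * Appends the surrogate pair starting at offset to buf.
     * Returns false, leaving buf untouched, if no valid pair is present.
     */
    static boolean appendSurrogates(CharSequence input, int offset, StringBuilder buf) {
        if (offset + 1 >= input.length()) {
            return false; // a pair needs two chars
        }
        char high = input.charAt(offset);
        char low = input.charAt(offset + 1);
        if (!Character.isHighSurrogate(high) || !Character.isLowSurrogate(low)) {
            return false;
        }
        buf.append(high).append(low);
        return true;
    }
}
```

After a successful call, `buf.codePointAt(...)` at the position of the appended pair gives the supplementary code point that the pair encodes.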
-
reportFatalError
private void reportFatalError(java.lang.String msgId, java.lang.String arg) throws JasperException
Convenience function used in all XML scanners.
- Throws:
JasperException
-
-