Package net.sf.saxon.regex
Class BMPString
java.lang.Object
net.sf.saxon.regex.UnicodeString
net.sf.saxon.regex.BMPString
- All Implemented Interfaces:
CharSequence, Comparable<UnicodeString>, AtomicMatchKey
An implementation of UnicodeString optimized for strings that contain
no characters outside the BMP (that is, no characters whose codepoints exceed 65535).
-
Field Summary
Fields inherited from interface net.sf.saxon.expr.sort.AtomicMatchKey
NaN_MATCH_KEY
-
Constructor Summary
Constructors
BMPString(CharSequence src) | Create a BMPString
-
Method Summary
Modifier and Type | Method | Description
char | charAt(int index) | Returns the char value at the specified index.
CharSequence | getCharSequence() | Get the underlying CharSequence
boolean | isEnd(int pos) | Ask whether a given position is at (or beyond) the end of the string
int | length() | Returns the length of this character sequence.
 | subSequence(int start, int end) | Returns a new CharSequence that is a subsequence of this sequence.
String | toString() |
int | uCharAt(int pos) | Get the character at a specified position
int | uIndexOf(int search, int pos) | Get the first match for a given character
int | uLength() | Get the length of the string, in Unicode codepoints
 | uSubstring(int beginIndex, int endIndex) | Get a substring of this string
Methods inherited from class net.sf.saxon.regex.UnicodeString
asAtomic, compareTo, containsSurrogatePairs, equals, hashCode, makeUnicodeString, makeUnicodeString
Methods inherited from class java.lang.Object
clone, finalize, getClass, notify, notifyAll, wait, wait, wait
Methods inherited from interface java.lang.CharSequence
chars, codePoints, isEmpty
-
Constructor Details
-
BMPString
Create a BMPString
- Parameters:
src - encapsulated CharSequence. The client must ensure that this contains no surrogate pairs, and that it is immutable
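The constructor places two obligations on the caller: no surrogate pairs, and immutability. A minimal sketch of how a caller might enforce both before wrapping a sequence is shown here. It uses only the JDK; the class BmpGuard and its methods are hypothetical helpers, not part of Saxon, and the check is deliberately stricter than required because it rejects any surrogate code unit, paired or not.

```java
final class BmpGuard {

    /** Returns true if the sequence contains no surrogate code units (BMP-only content). */
    static boolean isBmpOnly(CharSequence cs) {
        for (int i = 0; i < cs.length(); i++) {
            if (Character.isSurrogate(cs.charAt(i))) {
                return false;
            }
        }
        return true;
    }

    /**
     * Takes an immutable snapshot of the input and rejects it if it
     * contains surrogates. The returned String is safe to encapsulate.
     */
    static String checkedSnapshot(CharSequence cs) {
        // String is immutable, unlike StringBuilder or other mutable CharSequences
        String snapshot = cs.toString();
        if (!isBmpOnly(snapshot)) {
            throw new IllegalArgumentException("input contains surrogate code units");
        }
        return snapshot;
    }
}
```

The value returned by checkedSnapshot could then be passed as the src argument. Whether the extra scan is worthwhile depends on how much the caller already knows about the input.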
-
-
Method Details
-
uSubstring
Description copied from class: UnicodeString
Get a substring of this string
- Specified by:
uSubstring in class UnicodeString
- Parameters:
beginIndex - the index of the first character to be included (counting codepoints, not 16-bit characters)
endIndex - the index of the first character to be NOT included (counting codepoints, not 16-bit characters)
- Returns:
a substring
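The distinction between codepoint indices and 16-bit char indices matters only when a string contains characters outside the BMP. The following sketch, using plain JDK strings rather than Saxon classes, shows the general mapping from codepoint indices to char offsets; the class and method names are hypothetical. For BMP-only input the mapping is the identity, which is presumably what makes a BMP-specific implementation cheaper.

```java
final class CodepointSubstring {

    /**
     * Substring with indices counted in codepoints. The codepoint indices
     * are first translated to char offsets, which costs a scan in general.
     */
    static String byCodepoints(String s, int beginIndex, int endIndex) {
        int from = s.offsetByCodePoints(0, beginIndex);
        int to = s.offsetByCodePoints(from, endIndex - beginIndex);
        return s.substring(from, to);
    }
}
```

With the input "hello", byCodepoints(s, 1, 3) and s.substring(1, 3) select the same characters. With a string containing a surrogate pair, the two diverge because one codepoint occupies two chars.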
-
uCharAt
public int uCharAt(int pos)
Description copied from class: UnicodeString
Get the character at a specified position
- Specified by:
uCharAt in class UnicodeString
- Parameters:
pos - the index of the required character (counting codepoints, not 16-bit characters)
- Returns:
a character (Unicode codepoint) at the specified position.
-
uIndexOf
public int uIndexOf(int search, int pos)
Description copied from class: UnicodeString
Get the first match for a given character
- Specified by:
uIndexOf in class UnicodeString
- Parameters:
search - the character to look for
pos - the first position to look
- Returns:
the position of the first occurrence of the sought character, or -1 if not found
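A minimal sketch of a search with this contract over a BMP-only sequence follows. It is not Saxon's implementation: the class BmpIndexOf is hypothetical, and the treatment of a negative pos (clamped to zero) is an assumption of the sketch, since the documentation above does not specify it.

```java
final class BmpIndexOf {

    /**
     * Returns the first index at or after pos whose char equals the
     * codepoint 'search', or -1 if there is none.
     */
    static int indexOf(CharSequence src, int search, int pos) {
        if (search > 0xFFFF) {
            // A supplementary codepoint cannot occur in a BMP-only sequence
            return -1;
        }
        // Assumption: a negative start position is treated as zero
        for (int i = Math.max(pos, 0); i < src.length(); i++) {
            if (src.charAt(i) == search) {
                return i;
            }
        }
        return -1;
    }
}
```

Because every character fits in one char, the returned position is both a char index and a codepoint index.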
-
uLength
public int uLength()
Description copied from class: UnicodeString
Get the length of the string, in Unicode codepoints
- Specified by:
uLength in class UnicodeString
- Returns:
the number of codepoints in the string
-
isEnd
public boolean isEnd(int pos)
Description copied from class: UnicodeString
Ask whether a given position is at (or beyond) the end of the string
- Specified by:
isEnd in class UnicodeString
- Parameters:
pos - the index of the required character (counting codepoints, not 16-bit characters)
- Returns:
true if and only if the specified index is at or beyond the end of the string
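A check of this kind is typically used as a loop guard when scanning a string position by position. The sketch below mirrors the documented contract on a plain CharSequence; the class EndCheck and the digit-counting example are hypothetical and only illustrate the pattern.

```java
final class EndCheck {

    /** True when pos is at or beyond the end of the sequence. */
    static boolean isEnd(CharSequence src, int pos) {
        return pos >= src.length();
    }

    /** Counts leading ASCII digits, using isEnd to stop before reading past the end. */
    static int countLeadingDigits(CharSequence src) {
        int pos = 0;
        while (!isEnd(src, pos) && src.charAt(pos) >= '0' && src.charAt(pos) <= '9') {
            pos++;
        }
        return pos;
    }
}
```

Testing isEnd before each charAt call avoids an IndexOutOfBoundsException when the whole input matches.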
-
toString
- Specified by:
toString in interface CharSequence
- Overrides:
toString in class Object
-
getCharSequence
Get the underlying CharSequence
- Returns:
the underlying CharSequence
-
length
public int length()
Returns the length of this character sequence. The length is the number of 16-bit chars in the sequence.
- Returns:
the number of chars in this sequence
-
charAt
public char charAt(int index)
Returns the char value at the specified index. An index ranges from zero to length() - 1. The first char value of the sequence is at index zero, the next at index one, and so on, as for array indexing.
If the char value specified by the index is a surrogate, the surrogate value is returned.
- Parameters:
index - the index of the char value to be returned
- Returns:
the specified char value
- Throws:
IndexOutOfBoundsException - if the index argument is negative or not less than length()
-
subSequence
Returns a new CharSequence that is a subsequence of this sequence. The subsequence starts with the char value at the specified index and ends with the char value at index end - 1. The length (in chars) of the returned sequence is end - start, so if start == end then an empty sequence is returned.
- Parameters:
start - the start index, inclusive
end - the end index, exclusive
- Returns:
the specified subsequence
- Throws:
IndexOutOfBoundsException - if start or end are negative, if end is greater than length(), or if start is greater than end
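The bounds rules above are those of the general CharSequence contract, so they can be illustrated with a plain java.lang.String. The helper class below is hypothetical and exists only to report whether a given pair of indices is rejected.

```java
final class SubSequenceContract {

    /** Returns true if subSequence(start, end) throws IndexOutOfBoundsException. */
    static boolean throwsOutOfBounds(CharSequence cs, int start, int end) {
        try {
            cs.subSequence(start, end);
            return false;
        } catch (IndexOutOfBoundsException e) {
            return true;
        }
    }
}
```

For the string "hello", the index pairs (-1, 2), (0, 6) and (3, 2) are rejected, while (0, 5) is accepted and (2, 2) yields an empty sequence.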
-