public abstract class AbstractCharInputReader extends java.lang.Object implements CharInputReader
The base class for implementations of CharInputReader.
It provides the essential conversion of sequences of newline characters defined by Format.getLineSeparator() into the normalized newline character provided in Format.getNormalizedNewline().
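The newline conversion described above can be sketched in isolation. This is a minimal illustration and not the library's implementation: the class `NewlineNormalizerSketch` and its `normalize` method are hypothetical names, the separator is assumed to be one or two characters long, and the sketch works on a whole String instead of a streamed buffer.

```java
// Hypothetical helper, written only to illustrate the idea of normalization.
final class NewlineNormalizerSketch {

    // Replaces every occurrence of the line separator sequence (assumed to be
    // one or two characters, e.g. "\r\n") with a single normalized newline.
    static String normalize(String input, char[] lineSeparator, char normalizedNewline) {
        if (lineSeparator.length < 1 || lineSeparator.length > 2) {
            throw new IllegalArgumentException("expected a separator of 1 or 2 characters");
        }
        char first = lineSeparator[0];
        boolean twoChars = lineSeparator.length == 2;
        StringBuilder out = new StringBuilder(input.length());
        for (int i = 0; i < input.length(); i++) {
            char ch = input.charAt(i);
            if (ch == first) {
                if (!twoChars) {
                    out.append(normalizedNewline);
                    continue;
                }
                if (i + 1 < input.length() && input.charAt(i + 1) == lineSeparator[1]) {
                    out.append(normalizedNewline);
                    i++; // consume the second character of the separator
                    continue;
                }
            }
            out.append(ch); // ordinary character, or an incomplete separator
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String normalized = normalize("a,b\r\nc,d\r\n", new char[]{'\r', '\n'}, '\n');
        System.out.println(normalized.equals("a,b\nc,d\n")); // prints "true"
    }
}
```

With a two-character separator, a lone first character (for example a `\r` that is not followed by `\n`) is passed through unchanged in this sketch.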
It also provides a default implementation for most of the methods specified by the CharInputReader interface.
Extending classes must read characters from a given Reader and assign them to the public buffer when requested (in the reloadBuffer() method).
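The contract for extending classes can be sketched without the library on the classpath. Everything below is a hypothetical stand-in: `ReloadBufferSketch`, `Base`, `Simple` and `readAll` are illustrative names, and `Base` only mimics the documented public `buffer`, `length` and `i` attributes together with the `setReader` and `reloadBuffer` hooks. It is not the real AbstractCharInputReader, and returning `'\0'` at the end of the input is a simplification made for this sketch.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

// Illustrative stand-ins only: none of these types belong to the library.
final class ReloadBufferSketch {

    // Mimics the documented contract: public buffer/length/i plus two hooks.
    abstract static class Base {
        public char[] buffer;    // characters made available by reloadBuffer()
        public int length = -1;  // valid characters in buffer; -1 means end of input
        public int i;            // current position in the buffer

        protected abstract void setReader(Reader reader);

        protected abstract void reloadBuffer();

        final void start(Reader reader) {
            setReader(reader);
            reloadBuffer();
            i = 0;
        }

        // Simplification: returns '\0' once the input is exhausted.
        final char nextChar() {
            while (length != -1 && i >= length) {
                reloadBuffer(); // buffer consumed: ask the subclass for more
                i = 0;
            }
            return length == -1 ? '\0' : buffer[i++];
        }
    }

    // A subclass only has to keep the Reader and refill the buffer on request.
    static final class Simple extends Base {
        private final char[] chunk;
        private Reader reader;

        Simple(int bufferSize) {
            if (bufferSize <= 0) {
                throw new IllegalArgumentException("bufferSize must be positive");
            }
            chunk = new char[bufferSize];
        }

        @Override
        protected void setReader(Reader reader) {
            this.reader = reader;
        }

        @Override
        protected void reloadBuffer() {
            try {
                length = reader.read(chunk, 0, chunk.length); // -1 at end of stream
                buffer = chunk;
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
    }

    // Reads the whole text back through the sketch, one character at a time.
    static String readAll(String text, int bufferSize) {
        Simple in = new Simple(bufferSize);
        in.start(new StringReader(text));
        StringBuilder out = new StringBuilder();
        for (char ch = in.nextChar(); in.length != -1; ch = in.nextChar()) {
            out.append(ch);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(readAll("id,name\n1,Ana\n", 4));
    }
}
```

A deliberately small buffer size, such as the 4 used in `main`, forces several reloads and makes it easy to see that the subclass signals the end of the input only by setting `length` to -1.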
See Also: Format, DefaultCharInputReader, ConcurrentCharInputReader

| Modifier and Type | Field | Description |
|---|---|---|
| char[] | buffer | The buffer itself |
| private char | ch | |
| private long | charCount | |
| protected boolean | closeOnStop | |
| private boolean | commentProcessing | |
| private boolean | detectLineSeparator | |
| int | i | Current position in the buffer |
| private boolean | incrementLineCount | |
| private java.util.List<InputAnalysisProcess> | inputAnalysisProcesses | |
| int | length | Number of characters available in the buffer. |
| private long | lineCount | |
| private char | lineSeparator1 | |
| private char | lineSeparator2 | |
| private boolean | lineSeparatorDetected | |
| private char | normalizedLineSeparator | |
| private boolean | normalizeLineEndings | |
| private int | recordStart | |
| private boolean | skipping | |
| private ExpandingCharAppender | tmp | |
| (package private) int | whitespaceRangeStart | |

| Constructor | Description |
|---|---|
| AbstractCharInputReader(char[] lineSeparator, char normalizedLineSeparator, int whitespaceRangeStart, boolean closeOnStop) | Creates a new instance with the mandatory characters for handling newlines transparently. |
| AbstractCharInputReader(char normalizedLineSeparator, int whitespaceRangeStart, boolean closeOnStop) | Creates a new instance that attempts to detect the newlines used in the input automatically. |

| Modifier and Type | Method | Description |
|---|---|---|
| void | addInputAnalysisProcess(InputAnalysisProcess inputAnalysisProcess) | Submits a custom InputAnalysisProcess to analyze the input buffer and potentially discover configuration options such as column separators in CSV, data formats, etc. |
| long | charCount() | Returns the number of characters returned by CharInputReader.nextChar() at any given time. |
| java.lang.String | currentParsedContent() | Returns a String with the input character sequence parsed to produce the current record. |
| int | currentParsedContentLength() | Returns the length of the character sequence parsed to produce the current record. |
| void | enableNormalizeLineEndings(boolean normalizeLineEndings) | Indicates to the input reader that the parser is running in "escape" mode and new lines should be returned as-is to prevent modifying the content of the parsed value. |
| char | getChar() | Returns the last character returned by the CharInputReader.nextChar() method. |
| char[] | getLineSeparator() | Returns the line separator used by this character input reader. |
| java.lang.String | getQuotedString(char quote, char escape, char escapeEscape, int maxLength, char stop1, char stop2, boolean keepQuotes, boolean keepEscape, boolean trimLeading, boolean trimTrailing) | Attempts to collect a quoted String from the current position until a closing quote or stop character is found on the input, or a line ending is reached. |
| java.lang.String | getString(char ch, char stop, boolean trim, java.lang.String nullValue, int maxLength) | Attempts to collect a String from the current position until a stop character is found on the input, or a line ending is reached. |
| int | lastIndexOf(char ch) | Returns the last index of a given character in the current parsed content. |
| long | lineCount() | Returns the number of newlines read so far. |
| void | markRecordStart() | Marks the start of a new record in the input, used internally to calculate the result of CharInputReader.currentParsedContent(). |
| char | nextChar() | Returns the next character in the input provided by the active Reader. |
| java.lang.String | readComment() | Collects the comment line found on the input. |
| protected abstract void | reloadBuffer() | Informs the extending class that the buffer has been read entirely and requests another batch of characters. |
| private void | setLineSeparator(char[] lineSeparator) | |
| protected abstract void | setReader(java.io.Reader reader) | Passes the Reader provided in the start(Reader) method to the extending class so it can begin loading characters from it. |
| void | skipLines(long lines) | Skips characters in the input until the given number of lines is discarded. |
| boolean | skipQuotedString(char quote, char escape, char stop1, char stop2) | Attempts to skip a quoted String from the current position until a stop character is found on the input, or a line ending is reached. |
| boolean | skipString(char ch, char stop) | Attempts to skip a String from the current position until a stop character is found on the input, or a line ending is reached. |
| char | skipWhitespace(char ch, char stopChar1, char stopChar2) | Skips characters from the current input position until a non-whitespace character or a stop character is found. |
| void | start(java.io.Reader reader) | Initializes the CharInputReader implementation with a Reader which provides access to the input. |
| private void | start(java.io.Reader reader, boolean resetTmp) | |
| private void | submitLineSeparatorDetector() | |
| private void | throwEOFException() | |
| protected void | unwrapInputStream(BomInput.BytesProcessedNotification notification) | |
| private void | updateBuffer() | Requests the next batch of characters from the implementing class and updates the character count. |

Methods inherited from class java.lang.Object: clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface CharInputReader: stop

private final ExpandingCharAppender tmp
private boolean lineSeparatorDetected
private final boolean detectLineSeparator
private java.util.List<InputAnalysisProcess> inputAnalysisProcesses
private char lineSeparator1
private char lineSeparator2
private final char normalizedLineSeparator
private long lineCount
private long charCount
private int recordStart
final int whitespaceRangeStart
private boolean skipping
private boolean commentProcessing
protected final boolean closeOnStop
public int i
private char ch
public char[] buffer
public int length
private boolean incrementLineCount
private boolean normalizeLineEndings

public AbstractCharInputReader(char normalizedLineSeparator,
int whitespaceRangeStart,
boolean closeOnStop)
Creates a new instance that attempts to detect the newlines used in the input automatically.
Parameters:
normalizedLineSeparator - the normalized newline character (as defined in Format.getNormalizedNewline()) that is used to replace any lineSeparator sequence found in the input.
whitespaceRangeStart - starting range of characters considered to be whitespace.
closeOnStop - indicates whether to automatically close the input when CharInputReader.stop() is called.

public AbstractCharInputReader(char[] lineSeparator,
char normalizedLineSeparator,
int whitespaceRangeStart,
boolean closeOnStop)
Creates a new instance with the mandatory characters for handling newlines transparently.
Parameters:
lineSeparator - the sequence of characters that represent a newline, as defined in Format.getLineSeparator().
normalizedLineSeparator - the normalized newline character (as defined in Format.getNormalizedNewline()) that is used to replace any lineSeparator sequence found in the input.
whitespaceRangeStart - starting range of characters considered to be whitespace.
closeOnStop - indicates whether to automatically close the input when CharInputReader.stop() is called.

private void submitLineSeparatorDetector()
private void setLineSeparator(char[] lineSeparator)
protected abstract void setReader(java.io.Reader reader)
Passes the Reader provided in the start(Reader) method to the extending class so it can begin loading characters from it.
Parameters:
reader - the Reader provided in start(Reader)

protected abstract void reloadBuffer()
Informs the extending class that the buffer has been read entirely and requests another batch of characters. Implementors must assign the new character buffer to the public buffer attribute, as well as the number of characters available to the public length attribute.
To signal that the input does not have any more characters, length must receive the value -1.

protected final void unwrapInputStream(BomInput.BytesProcessedNotification notification)
private void start(java.io.Reader reader,
boolean resetTmp)
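public final void start(java.io.Reader reader)
Initializes the CharInputReader implementation with a Reader which provides access to the input.
Specified by:
start in interface CharInputReader
Parameters:
reader - A Reader that provides access to the input.

private void updateBuffer()
Requests the next batch of characters from the implementing class and updates the character count.
If there are no more characters in the input, the reading will stop by invoking the CharInputReader.stop() method.
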
public final void addInputAnalysisProcess(InputAnalysisProcess inputAnalysisProcess)
Submits a custom InputAnalysisProcess to analyze the input buffer and potentially discover configuration options such as column separators in CSV, data formats, etc. The process will be executed only once.
Parameters:
inputAnalysisProcess - a custom process to analyze the contents of the input buffer.

private void throwEOFException()
public final char nextChar()
Returns the next character in the input provided by the active Reader.
If the input contains a sequence of newline characters (defined by Format.getLineSeparator()), this method will automatically convert it to the newline character specified in Format.getNormalizedNewline().
A subsequent call to this method will return the character after the newline sequence.
Specified by:
nextChar in interface CharInput
nextChar in interface CharInputReader

public final char getChar()
Returns the last character returned by the CharInputReader.nextChar() method.
Specified by:
getChar in interface CharInput
getChar in interface CharInputReader
Returns:
the last character returned by the CharInputReader.nextChar() method, or '\0' if there are no more characters in the input or if the CharInputReader was stopped.

public final long lineCount()
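Returns the number of newlines read so far.
Specified by:
lineCount in interface CharInputReader

public final void skipLines(long lines)
Skips characters in the input until the given number of lines is discarded.
Specified by:
skipLines in interface CharInputReader
Parameters:
lines - the number of lines to skip from the current location in the input

public java.lang.String readComment()
Collects the comment line found on the input.
Specified by:
readComment in interface CharInputReader

public final long charCount()
Returns the number of characters returned by CharInputReader.nextChar() at any given time.
Specified by:
charCount in interface CharInputReader
Returns:
the number of characters returned by CharInputReader.nextChar()

public final void enableNormalizeLineEndings(boolean normalizeLineEndings)
Indicates to the input reader that the parser is running in "escape" mode and new lines should be returned as-is to prevent modifying the content of the parsed value.
Specified by:
enableNormalizeLineEndings in interface CharInputReader
Parameters:
normalizeLineEndings - flag indicating that the parser is escaping values and line separators are to be returned as-is.

public char[] getLineSeparator()
Returns the line separator used by this character input reader. This is either the line separator defined in the Format.getLineSeparator() configuration, or the line separator sequence identified automatically when CommonParserSettings.isLineSeparatorDetectionEnabled() evaluates to true.
Specified by:
getLineSeparator in interface CharInputReader

public final char skipWhitespace(char ch,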
char stopChar1,
char stopChar2)
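Skips characters from the current input position until a non-whitespace character or a stop character is found.
Specified by:
skipWhitespace in interface CharInputReader
Parameters:
ch - the current character of the input
stopChar1 - the first stop character (which can be a whitespace)
stopChar2 - the second stop character (which can be a whitespace)

public final int currentParsedContentLength()
Returns the length of the character sequence parsed to produce the current record.
Specified by:
currentParsedContentLength in interface CharInputReader

public final java.lang.String currentParsedContent()
Returns a String with the input character sequence parsed to produce the current record.
Specified by:
currentParsedContent in interface CharInputReader

public final int lastIndexOf(char ch)
Returns the last index of a given character in the current parsed content.
Specified by:
lastIndexOf in interface CharInputReader
Parameters:
ch - the character to look for
Returns:
the last index of the character in the current parsed content, or -1 if not found.

public final void markRecordStart()
Marks the start of a new record in the input, used internally to calculate the result of CharInputReader.currentParsedContent().
Specified by:
markRecordStart in interface CharInputReader

public final boolean skipString(char ch,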
char stop)
Attempts to skip a String from the current position until a stop character is found on the input, or a line ending is reached. If the String can be skipped, the current position of the parser will be updated to the last consumed character. If the internal buffer needs to be reloaded, this method will return false and the current position of the buffer will remain unchanged.
Specified by:
skipString in interface CharInputReader
Parameters:
ch - the current character to be considered. If equal to the stop character, false will be returned
stop - the stop character that identifies the end of the content to be collected
Returns:
true if an entire String value was found on the input and skipped, or false if the buffer needs to be reloaded.

public final java.lang.String getString(char ch,
char stop,
boolean trim,
java.lang.String nullValue,
int maxLength)
Attempts to collect a String from the current position until a stop character is found on the input, or a line ending is reached. If the String can be obtained, the current position of the parser will be updated to the last consumed character. If the internal buffer needs to be reloaded, this method will return null and the current position of the buffer will remain unchanged.
Specified by:
getString in interface CharInputReader
Parameters:
ch - the current character to be considered. If equal to the stop character, the nullValue will be returned
stop - the stop character that identifies the end of the content to be collected
trim - flag indicating whether or not trailing whitespaces should be discarded
nullValue - value to return when the length of the content to be returned is 0.
maxLength - the maximum length of the String to be returned. If the length exceeds this limit, null will be returned
Returns:
the String found on the input, or null if the buffer needs to be reloaded or the maximum length has been exceeded.

public final java.lang.String getQuotedString(char quote,
char escape,
char escapeEscape,
int maxLength,
char stop1,
char stop2,
boolean keepQuotes,
boolean keepEscape,
boolean trimLeading,
boolean trimTrailing)
Attempts to collect a quoted String from the current position until a closing quote or stop character is found on the input, or a line ending is reached. If the String can be obtained, the current position of the parser will be updated to the last consumed character. If the internal buffer needs to be reloaded, this method will return null and the current position of the buffer will remain unchanged.
Specified by:
getQuotedString in interface CharInputReader
Parameters:
quote - the quote character
escape - the quote escape character
escapeEscape - the escape of the quote escape character
maxLength - the maximum length of the String to be returned. If the length exceeds this limit, null will be returned
stop1 - the first stop character that identifies the end of the content to be collected
stop2 - the second stop character that identifies the end of the content to be collected
keepQuotes - flag to indicate that the quotes that wrap the resulting String should be kept.
keepEscape - flag to indicate that escape sequences should be kept
trimLeading - flag to indicate that leading whitespaces should be trimmed
trimTrailing - flag to indicate that trailing whitespaces should be trimmed
Returns:
the String found on the input, or null if the buffer needs to be reloaded or the maximum length has been exceeded.

public final boolean skipQuotedString(char quote,
char escape,
char stop1,
char stop2)
Attempts to skip a quoted String from the current position until a stop character is found on the input, or a line ending is reached. If the String can be skipped, the current position of the parser will be updated to the last consumed character. If the internal buffer needs to be reloaded, this method will return false and the current position of the buffer will remain unchanged.
Specified by:
skipQuotedString in interface CharInputReader
Parameters:
quote - the quote character
escape - the quote escape character
stop1 - the first stop character that identifies the end of the content to be collected
stop2 - the second stop character that identifies the end of the content to be collected
Returns:
true if an entire String value was found on the input and skipped, or false if the buffer needs to be reloaded.
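
The "return null or false and leave the position unchanged when the buffer must be reloaded" behaviour shared by getString, skipString, getQuotedString and skipQuotedString can be illustrated with a small stand-alone sketch. The class `BufferScanSketch` and its `tryGetString` method are hypothetical and do not belong to the library. The sketch ignores trimming, nullValue, maxLength and quoting, hard-codes `'\n'` as the line ending, and leaves the position on the stop character, which is a simplification of the documented positioning.

```java
// Hypothetical illustration of scanning only the characters already loaded.
final class BufferScanSketch {
    private final char[] buffer; // stands in for the currently loaded batch
    private final int length;    // number of valid characters in the batch
    int i;                       // index of the next unread character

    BufferScanSketch(String loaded) {
        buffer = loaded.toCharArray();
        length = buffer.length;
    }

    // Collects characters from position i up to (excluding) the stop character
    // or a newline. Returns null and leaves i untouched if the batch ends first,
    // because the value might continue in the next batch.
    String tryGetString(char stop) {
        for (int pos = i; pos < length; pos++) {
            char ch = buffer[pos];
            if (ch == stop || ch == '\n') {
                String value = new String(buffer, i, pos - i);
                i = pos; // simplification: position now points at the stop character
                return value;
            }
        }
        return null; // caller falls back to character-by-character reading
    }

    public static void main(String[] args) {
        BufferScanSketch in = new BufferScanSketch("abc,de");
        System.out.println(in.tryGetString(',')); // prints "abc"
        in.i++;                                    // step over the stop character
        System.out.println(in.tryGetString(',')); // prints "null"
        System.out.println(in.i);                 // prints "4"
    }
}
```

Leaving the position untouched on failure is what lets a caller retry the same value one character at a time, so a value that spans two batches is not lost.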