Class CommonParserSettings<F extends Format>
- java.lang.Object
-
- com.univocity.parsers.common.CommonSettings<F>
-
- com.univocity.parsers.common.CommonParserSettings<F>
-
- Type Parameters:
F - the format supported by this parser.
- All Implemented Interfaces:
java.lang.Cloneable
- Direct Known Subclasses:
CsvParserSettings, FixedWidthParserSettings, TsvParserSettings
public abstract class CommonParserSettings<F extends Format> extends CommonSettings<F>
This is the parent class for all configuration classes used by parsers (AbstractParser).
By default, all parsers work with, at least, the following configuration options in addition to the ones provided by CommonSettings:
- rowProcessor: a callback implementation of the interface RowProcessor, which handles the life cycle of the parsing process and processes each record extracted from the input.
- headerExtractionEnabled (defaults to false): indicates whether the first valid record parsed from the input should be considered the row containing the names of each column.
- columnReorderingEnabled (defaults to true): indicates whether fields selected using the field selection methods (defined by the parent class CommonSettings) should be reordered.
When disabled, each parsed record will contain values for all columns, in the order they occur in the input. Fields which were not selected will not be parsed, but the record will contain empty values for them.
When enabled, each parsed record will contain values only for the selected columns. The values will be ordered according to the selection.
- inputBufferSize (defaults to 1024*1024 characters): the number of characters held by the parser's buffer when processing the input.
- readInputOnSeparateThread (defaults to true if the number of available processors at runtime is greater than 1):
When enabled, a reading thread (in input.concurrent.ConcurrentCharInputReader) will be started and load characters from the input while the parser is processing its input buffer. This yields better performance, especially when reading from big inputs (greater than 100 MB).
When disabled, the parsing process will briefly pause so the buffer can be replenished every time it is exhausted (in DefaultCharInputReader). This is not as bad or slow as it sounds, and can even be (slightly) more efficient if your input is small.
- numberOfRecordsToRead (defaults to -1): defines how many (valid) records are to be parsed before the process is stopped. A negative value indicates there's no limit.
- lineSeparatorDetectionEnabled (defaults to false): attempts to identify the line separator used in the input. The first row of the input will be read until a sequence of '\r\n', or one of the characters '\r' or '\n', is found. If a match is found, it will be used as the line separator to parse the input.
- See Also:
RowProcessor, CsvParserSettings, FixedWidthParserSettings
-
-
Field Summary
Fields
Modifier and Type / Field / Description
- private boolean autoClosingEnabled
- private boolean columnReorderingEnabled
- private boolean commentCollectionEnabled
- protected java.lang.Boolean headerExtractionEnabled
- private int inputBufferSize
- private boolean lineSeparatorDetectionEnabled
- private long numberOfRecordsToRead
- private long numberOfRowsToSkip
- private Processor<? extends Context> processor
- private boolean readInputOnSeparateThread
-
Fields inherited from class com.univocity.parsers.common.CommonSettings
headerSourceClass
-
-
Constructor Summary
Constructors
Constructor / Description
- CommonParserSettings()
-
Method Summary
Modifier and Type / Method / Description
- protected void addConfiguration(java.util.Map<java.lang.String,java.lang.Object> out)
- protected void clearInputSpecificSettings()
Clears settings that are likely to be specific to a given input.
- protected CommonParserSettings clone()
Clones this configuration object.
- protected CommonParserSettings clone(boolean clearInputSpecificSettings)
Clones this configuration object to reuse user-provided settings.
- protected void configureFromAnnotations(java.lang.Class<?> beanClass)
Configures the parser based on the annotations provided in a given class.
- (package private) FieldSelector getFieldSelector()
Returns the FieldSelector object, which handles selected fields.
- (package private) FieldSet<?> getFieldSet()
Returns the set of selected fields, if any.
- int getInputBufferSize()
Informs the number of characters held by the parser's buffer when processing the input (defaults to 1024*1024 characters).
- long getNumberOfRecordsToRead()
The number of valid records to be parsed before the process is stopped.
- long getNumberOfRowsToSkip()
Returns the number of rows to skip from the input before the parser can begin to execute.
- <T extends Context> Processor<T> getProcessor()
Returns the callback implementation of the interface Processor which handles the lifecycle of the parsing process and processes each record extracted from the input.
- boolean getReadInputOnSeparateThread()
Indicates whether or not a separate thread will be used to read characters from the input while parsing (defaults true if the number of available processors at runtime is greater than 1).
- RowProcessor getRowProcessor()
Deprecated. Use the getProcessor() method as it allows format-specific processors to be built to work with different implementations of Context.
- boolean isAutoClosingEnabled()
Indicates whether automatic closing of the input (reader, stream, etc) is enabled.
- boolean isColumnReorderingEnabled()
Indicates whether fields selected using the field selection methods (defined by the parent class CommonSettings) should be reordered (defaults to true).
- boolean isCommentCollectionEnabled()
Indicates that comments found in the input must be collected (disabled by default).
- boolean isHeaderExtractionEnabled()
Indicates whether or not the first valid record parsed from the input should be considered as the row containing the names of each column.
- boolean isLineSeparatorDetectionEnabled()
Indicates whether the parser should detect the line separator automatically.
- protected CharAppender newCharAppender()
Returns an instance of CharAppender with the configured limit of maximum characters per column and the default value used to represent a null value (when the String parsed from the input is empty).
- protected CharInputReader newCharInputReader(int whitespaceRangeStart)
An implementation of CharInputReader which loads the parser buffer in parallel or sequentially, as defined by the readInputOnSeparateThread property.
- private boolean preventReordering()
- (package private) void runAutomaticConfiguration()
- void setAutoClosingEnabled(boolean autoClosingEnabled)
Configures whether the parser should always close the input (reader, stream, etc) automatically when all records have been parsed or when an error occurs.
- void setColumnReorderingEnabled(boolean columnReorderingEnabled)
Defines whether fields selected using the field selection methods (defined by the parent class CommonSettings) should be reordered (defaults to true).
- void setCommentCollectionEnabled(boolean commentCollectionEnabled)
Enables collection of comments found in the input (disabled by default).
- void setHeaderExtractionEnabled(boolean headerExtractionEnabled)
Defines whether or not the first valid record parsed from the input should be considered as the row containing the names of each column.
- void setInputBufferSize(int inputBufferSize)
Defines the number of characters held by the parser's buffer when processing the input (defaults to 1024*1024 characters).
- void setLineSeparatorDetectionEnabled(boolean lineSeparatorDetectionEnabled)
Defines whether the parser should detect the line separator automatically.
- void setNumberOfRecordsToRead(long numberOfRecordsToRead)
Defines the number of valid records to be parsed before the process is stopped.
- void setNumberOfRowsToSkip(long numberOfRowsToSkip)
Defines a number of rows to skip from the input before the parser can begin to execute.
- void setProcessor(Processor<? extends Context> processor)
Defines the callback implementation of the interface Processor which handles the lifecycle of the parsing process and processes each record extracted from the input.
- void setReadInputOnSeparateThread(boolean readInputOnSeparateThread)
Defines whether or not a separate thread will be used to read characters from the input while parsing (defaults true if the number of available processors at runtime is greater than 1).
- void setRowProcessor(RowProcessor processor)
Deprecated. Use the setProcessor(Processor) method as it allows format-specific processors to be built to work with different implementations of Context.
-
Methods inherited from class com.univocity.parsers.common.CommonSettings
autoConfigure, createDefaultFormat, deriveHeadersFrom, excludeFields, excludeFields, excludeIndexes, getErrorContentLength, getFormat, getHeaders, getIgnoreLeadingWhitespaces, getIgnoreTrailingWhitespaces, getMaxCharsPerColumn, getMaxColumns, getNullValue, getProcessorErrorHandler, getRowProcessorErrorHandler, getSkipBitsAsWhitespace, getSkipEmptyLines, getWhitespaceRangeStart, isAutoConfigurationEnabled, isProcessorErrorHandlerDefined, selectFields, selectFields, selectIndexes, setAutoConfigurationEnabled, setErrorContentLength, setFormat, setHeaders, setHeadersDerivedFromClass, setIgnoreLeadingWhitespaces, setIgnoreTrailingWhitespaces, setMaxCharsPerColumn, setMaxColumns, setNullValue, setProcessorErrorHandler, setRowProcessorErrorHandler, setSkipBitsAsWhitespace, setSkipEmptyLines, toString, trimValues
-
-
-
-
Field Detail
-
headerExtractionEnabled
protected java.lang.Boolean headerExtractionEnabled
-
columnReorderingEnabled
private boolean columnReorderingEnabled
-
inputBufferSize
private int inputBufferSize
-
readInputOnSeparateThread
private boolean readInputOnSeparateThread
-
numberOfRecordsToRead
private long numberOfRecordsToRead
-
lineSeparatorDetectionEnabled
private boolean lineSeparatorDetectionEnabled
-
numberOfRowsToSkip
private long numberOfRowsToSkip
-
commentCollectionEnabled
private boolean commentCollectionEnabled
-
autoClosingEnabled
private boolean autoClosingEnabled
-
-
Method Detail
-
getReadInputOnSeparateThread
public boolean getReadInputOnSeparateThread()
Indicates whether a separate thread will be used to read characters from the input while parsing (defaults to true if the number of available processors at runtime is greater than 1).
When enabled, a reading thread (in com.univocity.parsers.common.input.concurrent.ConcurrentCharInputReader) will be started and load characters from the input while the parser is processing its input buffer. This yields better performance, especially when reading from big inputs (greater than 100 MB).
When disabled, the parsing process will briefly pause so the buffer can be replenished every time it is exhausted (in DefaultCharInputReader). This is not as bad or slow as it sounds, and can even be (slightly) more efficient if your input is small.
- Returns:
- true if the input should be read on a separate thread, false otherwise
-
setReadInputOnSeparateThread
public void setReadInputOnSeparateThread(boolean readInputOnSeparateThread)
Defines whether a separate thread will be used to read characters from the input while parsing (defaults to true if the number of available processors at runtime is greater than 1).
When enabled, a reading thread (in com.univocity.parsers.common.input.concurrent.ConcurrentCharInputReader) will be started and load characters from the input while the parser is processing its input buffer. This yields better performance, especially when reading from big inputs (greater than 100 MB).
When disabled, the parsing process will briefly pause so the buffer can be replenished every time it is exhausted (in DefaultCharInputReader). This is not as bad or slow as it sounds, and can even be (slightly) more efficient if your input is small.
- Parameters:
readInputOnSeparateThread - the flag indicating whether the input should be read on a separate thread
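The idea of a reading thread that loads characters while the consumer works on the previous buffer can be sketched with the JDK alone. The class below is a hypothetical illustration (ConcurrentReadSketch, readAll and END are names invented here); it is not the source of ConcurrentCharInputReader, and it simplifies error handling by ending the stream early when reading fails.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.Arrays;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical sketch of reading on a separate thread; not the library's implementation.
final class ConcurrentReadSketch {

    // Sentinel chunk that marks the end of the input.
    private static final char[] END = new char[0];

    static String readAll(Reader input, int bufferSize) {
        if (bufferSize <= 0) {
            throw new IllegalArgumentException("bufferSize must be positive");
        }
        // Holds chunks that were read but not yet consumed.
        BlockingQueue<char[]> filled = new ArrayBlockingQueue<>(2);

        Thread readerThread = new Thread(() -> {
            try {
                char[] chunk = new char[bufferSize];
                int count;
                while ((count = input.read(chunk)) != -1) {
                    if (count > 0) {
                        filled.put(Arrays.copyOf(chunk, count));
                    }
                }
                filled.put(END);
            } catch (IOException | InterruptedException e) {
                // Simplification for this sketch: drop pending chunks and end the stream early.
                filled.clear();
                filled.offer(END);
            }
        });
        readerThread.setDaemon(true);
        readerThread.start();

        // The "parser" side: consumes one chunk while the reader thread loads the next.
        StringBuilder consumed = new StringBuilder();
        try {
            for (char[] chunk = filled.take(); chunk != END; chunk = filled.take()) {
                consumed.append(chunk);
            }
            readerThread.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting for input", e);
        }
        return consumed.toString();
    }

    public static void main(String[] args) {
        System.out.print(readAll(new StringReader("a,b\n1,2\n"), 4));
    }
}
```

The bounded queue is what lets the reader run at most a couple of buffers ahead of the consumer; with the setting disabled, the equivalent sequential design would simply call read on the consuming thread whenever its buffer is exhausted.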
-
isHeaderExtractionEnabled
public boolean isHeaderExtractionEnabled()
Indicates whether the first valid record parsed from the input should be considered the row containing the names of each column.
- Returns:
- true if the first valid record parsed from the input should be considered the row containing the names of each column, false otherwise
-
setHeaderExtractionEnabled
public void setHeaderExtractionEnabled(boolean headerExtractionEnabled)
Defines whether the first valid record parsed from the input should be considered the row containing the names of each column.
- Parameters:
headerExtractionEnabled - a flag indicating whether the first valid record parsed from the input should be considered the row containing the names of each column
-
getRowProcessor
@Deprecated public RowProcessor getRowProcessor()
Deprecated. Use the getProcessor() method, as it allows format-specific processors to be built to work with different implementations of Context. Implementations based on RowProcessor can only be used with parsers that provide a ParsingContext.
Returns the callback implementation of the interface RowProcessor, which handles the lifecycle of the parsing process and processes each record extracted from the input.
- Returns:
- the RowProcessor used by the parser to handle each record
- See Also:
ObjectRowProcessor, ObjectRowListProcessor, MasterDetailProcessor, MasterDetailListProcessor, BeanProcessor, BeanListProcessor
-
setRowProcessor
@Deprecated public void setRowProcessor(RowProcessor processor)
Deprecated. Use the setProcessor(Processor) method, as it allows format-specific processors to be built to work with different implementations of Context. Implementations based on RowProcessor can only be used with parsers that provide a ParsingContext.
Defines the callback implementation of the interface RowProcessor, which handles the lifecycle of the parsing process and processes each record extracted from the input.
- Parameters:
processor - the RowProcessor instance which should be used by the parser to handle each record
- See Also:
ObjectRowProcessor, ObjectRowListProcessor, MasterDetailProcessor, MasterDetailListProcessor, BeanProcessor, BeanListProcessor
-
getProcessor
public <T extends Context> Processor<T> getProcessor()
Returns the callback implementation of the interface Processor, which handles the lifecycle of the parsing process and processes each record extracted from the input.
- Type Parameters:
T - the context type supported by the parser implementation.
- Returns:
- the Processor used by the parser to handle each record
- See Also:
AbstractObjectProcessor, AbstractObjectListProcessor, AbstractMasterDetailProcessor, AbstractMasterDetailListProcessor, AbstractBeanProcessor, AbstractBeanListProcessor
-
setProcessor
public void setProcessor(Processor<? extends Context> processor)
Defines the callback implementation of the interface Processor, which handles the lifecycle of the parsing process and processes each record extracted from the input.
- Parameters:
processor - the Processor instance which should be used by the parser to handle each record
- See Also:
AbstractObjectProcessor, AbstractObjectListProcessor, AbstractMasterDetailProcessor, AbstractMasterDetailListProcessor, AbstractBeanProcessor, AbstractBeanListProcessor, AbstractColumnProcessor
-
newCharInputReader
protected CharInputReader newCharInputReader(int whitespaceRangeStart)
An implementation of CharInputReader which loads the parser buffer in parallel or sequentially, as defined by the readInputOnSeparateThread property.
- Parameters:
whitespaceRangeStart - starting range of characters considered to be whitespace.
- Returns:
- The input reader as chosen with the readInputOnSeparateThread property.
-
getNumberOfRecordsToRead
public long getNumberOfRecordsToRead()
The number of valid records to be parsed before the process is stopped. A negative value indicates there's no limit (defaults to -1).
- Returns:
- the number of records to read before stopping the parsing process.
-
setNumberOfRecordsToRead
public void setNumberOfRecordsToRead(long numberOfRecordsToRead)
Defines the number of valid records to be parsed before the process is stopped. A negative value indicates there's no limit (defaults to -1).
- Parameters:
numberOfRecordsToRead - the number of records to read before stopping the parsing process.
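The limit semantics described above (a negative value means no limit) can be illustrated with a small standalone sketch. RecordLimitSketch and take are hypothetical names used only for this illustration, and treating a value of 0 as "read nothing" is an assumption of the sketch rather than something this page states.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration of the numberOfRecordsToRead semantics; not library code.
final class RecordLimitSketch {

    // Keeps at most numberOfRecordsToRead records; a negative value means "no limit".
    static List<String> take(List<String> records, long numberOfRecordsToRead) {
        List<String> kept = new ArrayList<>();
        for (String record : records) {
            // Assumption of this sketch: a limit of 0 keeps nothing.
            if (numberOfRecordsToRead >= 0 && kept.size() >= numberOfRecordsToRead) {
                break; // stop the process once the limit is reached
            }
            kept.add(record);
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> records = Arrays.asList("r1", "r2", "r3");
        System.out.println(take(records, 2));
        System.out.println(take(records, -1));
    }
}
```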
-
isColumnReorderingEnabled
public boolean isColumnReorderingEnabled()
Indicates whether fields selected using the field selection methods (defined by the parent class CommonSettings) should be reordered (defaults to true).
When disabled, each parsed record will contain values for all columns, in the order they occur in the input. Fields which were not selected will not be parsed, but the record will contain empty values for them.
When enabled, each parsed record will contain values only for the selected columns. The values will be ordered according to the selection.
- Returns:
- true if the selected fields should be reordered and returned by the parser, false otherwise
-
getFieldSet
FieldSet<?> getFieldSet()
Returns the set of selected fields, if any.
- Overrides:
getFieldSet in class CommonSettings<F extends Format>
- Returns:
- the set of selected fields. Null if no field was selected/excluded
-
getFieldSelector
FieldSelector getFieldSelector()
Returns the FieldSelector object, which handles selected fields.
- Overrides:
getFieldSelector in class CommonSettings<F extends Format>
- Returns:
- the FieldSelector object, which handles selected fields. Null if no field was selected/excluded
-
setColumnReorderingEnabled
public void setColumnReorderingEnabled(boolean columnReorderingEnabled)
Defines whether fields selected using the field selection methods (defined by the parent class CommonSettings) should be reordered (defaults to true).
When disabled, each parsed record will contain values for all columns, in the order they occur in the input. Fields which were not selected will not be parsed, but the record will contain empty values for them.
When enabled, each parsed record will contain values only for the selected columns. The values will be ordered according to the selection.
- Parameters:
columnReorderingEnabled - the flag indicating whether selected fields should be reordered and returned by the parser
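The difference between the two modes is easiest to see on a single record. The following standalone sketch uses hypothetical names (ColumnReorderingSketch, apply) and represents the "empty values" of unselected columns as null, which is an assumption made for this illustration; it mimics the documented behavior and is not the parser's own code.

```java
import java.util.Arrays;

// Hypothetical illustration of the columnReorderingEnabled semantics; not library code.
final class ColumnReorderingSketch {

    // row: every value of one record, in input order.
    // selectedIndexes: the selected column positions, in selection order.
    static String[] apply(String[] row, int[] selectedIndexes, boolean columnReorderingEnabled) {
        if (columnReorderingEnabled) {
            // Only the selected columns, ordered according to the selection.
            String[] reordered = new String[selectedIndexes.length];
            for (int i = 0; i < selectedIndexes.length; i++) {
                reordered[i] = row[selectedIndexes[i]];
            }
            return reordered;
        }
        // All columns in input order; unselected positions stay empty (null in this sketch).
        String[] sameOrder = new String[row.length];
        for (int index : selectedIndexes) {
            sameOrder[index] = row[index];
        }
        return sameOrder;
    }

    public static void main(String[] args) {
        String[] row = {"1", "2", "3"};
        int[] selection = {2, 0}; // third column first, then the first column
        System.out.println(Arrays.toString(apply(row, selection, true)));
        System.out.println(Arrays.toString(apply(row, selection, false)));
    }
}
```

With reordering enabled the result has two elements in selection order; with it disabled the result keeps three positions and leaves the middle one empty.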
-
getInputBufferSize
public int getInputBufferSize()
Returns the number of characters held by the parser's buffer when processing the input (defaults to 1024*1024 characters).
- Returns:
- the number of characters held by the parser's buffer when processing the input
-
setInputBufferSize
public void setInputBufferSize(int inputBufferSize)
Defines the number of characters held by the parser's buffer when processing the input (defaults to 1024*1024 characters).
- Parameters:
inputBufferSize - the new input buffer size (in number of characters)
-
newCharAppender
protected CharAppender newCharAppender()
Returns an instance of CharAppender with the configured limit of maximum characters per column and the default value used to represent a null value (when the String parsed from the input is empty).
- Returns:
- an instance of CharAppender with the configured limit of maximum characters per column and the default value used to represent a null value (when the String parsed from the input is empty)
-
isLineSeparatorDetectionEnabled
public final boolean isLineSeparatorDetectionEnabled()
Indicates whether the parser should detect the line separator automatically.
- Returns:
- true if the first line of the input should be used to search for common line separator sequences (the matching sequence will be used as the line separator for parsing). Otherwise false.
-
setLineSeparatorDetectionEnabled
public final void setLineSeparatorDetectionEnabled(boolean lineSeparatorDetectionEnabled)
Defines whether the parser should detect the line separator automatically.
- Parameters:
lineSeparatorDetectionEnabled - a flag indicating whether the first line of the input should be used to search for common line separator sequences (the matching sequence will be used as the line separator for parsing).
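The search for common line separator sequences in the first line can be pictured as a single scan that stops at the first '\r' or '\n'. The sketch below is a hypothetical, JDK-only illustration (LineSeparatorDetectionSketch and detect are invented names); returning null when nothing is found is an assumption of the sketch, and the parser's actual detection code may differ.

```java
// Hypothetical illustration of line separator detection; not library code.
final class LineSeparatorDetectionSketch {

    // Scans until the first "\r\n", '\r' or '\n' and returns the matching sequence,
    // or null when the input contains no line separator (assumption of this sketch).
    static String detect(CharSequence input) {
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (c == '\n') {
                return "\n";
            }
            if (c == '\r') {
                boolean followedByLineFeed = i + 1 < input.length() && input.charAt(i + 1) == '\n';
                return followedByLineFeed ? "\r\n" : "\r";
            }
        }
        return null;
    }

    public static void main(String[] args) {
        String detected = detect("a,b\r\n1,2\r\n");
        // Print the separator's characters as numeric codes so they stay visible.
        System.out.println(detected == null ? "none" : detected.chars().boxed().toList().toString());
    }
}
```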
-
getNumberOfRowsToSkip
public final long getNumberOfRowsToSkip()
Returns the number of rows to skip from the input before the parser can begin to execute.
- Returns:
- number of rows to skip before parsing
-
setNumberOfRowsToSkip
public final void setNumberOfRowsToSkip(long numberOfRowsToSkip)
Defines a number of rows to skip from the input before the parser can begin to execute.
- Parameters:
numberOfRowsToSkip - number of rows to skip before parsing
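As a rough picture of what skipping rows before parsing means, the sketch below discards the first lines of an input and hands the rest on unchanged. RowSkipSketch and linesAfterSkipping are hypothetical names, and treating a "row" as one line returned by BufferedReader.readLine is an assumption of this illustration, not a statement about how the parser counts rows.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of the numberOfRowsToSkip semantics; not library code.
final class RowSkipSketch {

    // Returns the lines that remain after discarding the first numberOfRowsToSkip lines.
    static List<String> linesAfterSkipping(String input, long numberOfRowsToSkip) {
        List<String> remaining = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(new StringReader(input))) {
            long skipped = 0;
            String line;
            while ((line = reader.readLine()) != null) {
                if (skipped < numberOfRowsToSkip) {
                    skipped++; // discarded before any parsing would start
                    continue;
                }
                remaining.add(line);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return remaining;
    }

    public static void main(String[] args) {
        System.out.println(linesAfterSkipping("generated file\nversion 2\na,b\n1,2\n", 2));
    }
}
```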
-
addConfiguration
protected void addConfiguration(java.util.Map<java.lang.String,java.lang.Object> out)
- Overrides:
addConfiguration in class CommonSettings<F extends Format>
-
preventReordering
private boolean preventReordering()
-
isCommentCollectionEnabled
public boolean isCommentCollectionEnabled()
Indicates whether comments found in the input must be collected (disabled by default). If enabled, comment lines will be stored by the parser and made available via AbstractParser.getContext().comments() and AbstractParser.getContext().lastComment().
- Returns:
- a flag indicating whether collection of comments is enabled.
-
setCommentCollectionEnabled
public void setCommentCollectionEnabled(boolean commentCollectionEnabled)
Enables collection of comments found in the input (disabled by default). If enabled, comment lines will be stored by the parser and made available via AbstractParser.getContext().comments() and AbstractParser.getContext().lastComment().
- Parameters:
commentCollectionEnabled - flag indicating whether or not to enable collection of comments.
-
runAutomaticConfiguration
final void runAutomaticConfiguration()
- Overrides:
runAutomaticConfiguration in class CommonSettings<F extends Format>
-
configureFromAnnotations
protected void configureFromAnnotations(java.lang.Class<?> beanClass)
Configures the parser based on the annotations provided in a given class.
- Parameters:
beanClass - the class whose annotations will be processed to derive configurations for parsing
-
clone
protected CommonParserSettings clone(boolean clearInputSpecificSettings)
Description copied from class: CommonSettings
Clones this configuration object to reuse user-provided settings. Properties that are specific to a given input (such as header names and selection of fields) can be reset to their defaults if the clearInputSpecificSettings flag is set to true.
- Overrides:
clone in class CommonSettings<F extends Format>
- Parameters:
clearInputSpecificSettings - flag indicating whether to clear settings that are likely to be associated with a given input.
- Returns:
- a copy of the configurations applied to the current instance.
-
clone
protected CommonParserSettings clone()
Description copied from class: CommonSettings
Clones this configuration object. Use the alternative CommonSettings.clone(boolean) method to reset properties that are specific to a given input, such as header names and selection of fields.
- Overrides:
clone in class CommonSettings<F extends Format>
- Returns:
- a copy of all configurations applied to the current instance.
-
clearInputSpecificSettings
protected void clearInputSpecificSettings()
Description copied from class: CommonSettings
Clears settings that are likely to be specific to a given input.
- Overrides:
clearInputSpecificSettings in class CommonSettings<F extends Format>
-
isAutoClosingEnabled
public boolean isAutoClosingEnabled()
Indicates whether automatic closing of the input (reader, stream, etc) is enabled. If true, the parser will always close the input automatically when all records have been parsed or when an error occurs. Defaults to true.
- Returns:
- flag indicating whether automatic input closing is enabled.
-
setAutoClosingEnabled
public void setAutoClosingEnabled(boolean autoClosingEnabled)
Configures whether the parser should always close the input (reader, stream, etc) automatically when all records have been parsed or when an error occurs. Defaults to true.
- Parameters:
autoClosingEnabled - flag determining whether automatic input closing should be enabled.
-
-