Class ScannerOptions
- java.lang.Object
-
- org.apache.accumulo.core.clientImpl.ScannerOptions
-
- All Implemented Interfaces:
AutoCloseable, Iterable<Map.Entry<Key,Value>>, ScannerBase
- Direct Known Subclasses:
ClientSideIteratorScanner, IsolatedScanner, OfflineScanner, ScannerImpl, TabletServerBatchReader
public class ScannerOptions extends Object implements ScannerBase
-
-
Field Summary
Fields
Modifier and Type, Field
protected long batchTimeOut
protected String classLoaderContext
protected Map<String,String> executionHints
protected SortedSet<Column> fetchedColumns
protected List<IterInfo> serverSideIteratorList
protected Map<String,Map<String,String>> serverSideIteratorOptions
protected long timeOut
-
Constructor Summary
Constructors
Modifier, Constructor
protected ScannerOptions()
public ScannerOptions(ScannerOptions so)
-
Method Summary
Methods
Modifier and Type, Method, Description
void addScanIterator(IteratorSetting si)
    Add a server-side scan iterator.
void clearClassLoaderContext()
    Clears the current classloader context set on this scanner.
void clearColumns()
    Clears the columns to be fetched (useful for resetting the scanner for reuse).
void clearSamplerConfiguration()
    Clears sampler configuration making a scanner read all data.
void clearScanIterators()
    Clears scan iterators prior to returning a scanner to the pool.
void close()
    Closes any underlying connections on the scanner.
void fetchColumn(IteratorSetting.Column column)
    Adds a column to the list of columns that will be fetched by this scanner.
void fetchColumn(org.apache.hadoop.io.Text colFam, org.apache.hadoop.io.Text colQual)
    Adds a column to the list of columns that will be fetched by this scanner.
void fetchColumnFamily(org.apache.hadoop.io.Text col)
    Adds a column family to the list of columns that will be fetched by this scanner.
Authorizations getAuthorizations()
    Returns the authorizations that have been set on the scanner.
long getBatchTimeout(TimeUnit timeUnit)
    Returns the timeout to fill a batch in the given TimeUnit.
String getClassLoaderContext()
    Returns the name of the current classloader context set on this scanner.
SortedSet<Column> getFetchedColumns()
SamplerConfiguration getSamplerConfiguration()
long getTimeout(TimeUnit timeunit)
    Returns the setting for how long a scanner will automatically retry when a failure occurs.
Iterator<Map.Entry<Key,Value>> iterator()
    Returns an iterator over an Accumulo table.
void removeScanIterator(String iteratorName)
    Remove an iterator from the list of iterators.
void setBatchTimeout(long timeout, TimeUnit timeUnit)
    This setting determines how long a scanner will wait to fill the returned batch.
void setClassLoaderContext(String classLoaderContext)
    Sets the name of the classloader context on this scanner.
void setExecutionHints(Map<String,String> hints)
    Set hints for the configured ScanPrioritizer and ScanDispatcher.
protected static void setOptions(ScannerOptions dst, ScannerOptions src)
void setSamplerConfiguration(SamplerConfiguration samplerConfig)
    Setting this will cause the scanner to read sample data, as long as that sample data was generated with the given configuration.
void setTimeout(long timeout, TimeUnit timeUnit)
    This setting determines how long a scanner will automatically retry when a failure occurs.
void updateScanIteratorOption(String iteratorName, String key, String value)
    Update the options for an iterator.
-
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
-
Methods inherited from interface java.lang.Iterable
forEach, spliterator
-
Methods inherited from interface org.apache.accumulo.core.client.ScannerBase
fetchColumn, fetchColumnFamily
-
-
-
-
Field Detail
-
timeOut
protected long timeOut
-
batchTimeOut
protected long batchTimeOut
-
classLoaderContext
protected String classLoaderContext
-
-
Constructor Detail
-
ScannerOptions
protected ScannerOptions()
-
ScannerOptions
public ScannerOptions(ScannerOptions so)
-
-
Method Detail
-
addScanIterator
public void addScanIterator(IteratorSetting si)
Description copied from interface: ScannerBase
Add a server-side scan iterator.
- Specified by:
addScanIterator in interface ScannerBase
- Parameters:
si - fully specified scan-time iterator, including all options for the iterator. Any changes to the iterator setting after this call are not propagated to the stored iterator.
-
removeScanIterator
public void removeScanIterator(String iteratorName)
Description copied from interface: ScannerBase
Remove an iterator from the list of iterators.
- Specified by:
removeScanIterator in interface ScannerBase
- Parameters:
iteratorName - nickname used for the iterator
-
updateScanIteratorOption
public void updateScanIteratorOption(String iteratorName, String key, String value)
Description copied from interface: ScannerBase
Update the options for an iterator. Note that this does not change the iterator options during a scan; it only replaces the given option on a configured iterator before a scan is started.
- Specified by:
updateScanIteratorOption in interface ScannerBase
- Parameters:
iteratorName - the name of the iterator to change
key - the name of the option
value - the new value for the named option
-
fetchColumnFamily
public void fetchColumnFamily(org.apache.hadoop.io.Text col)
Description copied from interface: ScannerBase
Adds a column family to the list of columns that will be fetched by this scanner. By default, when no columns have been added, the scanner fetches all columns. To fetch multiple column families, call this method multiple times.
This can help limit which locality groups are read on the server side.
When used in conjunction with custom iterators, the set of column families fetched is passed to the top iterator's seek method. Custom iterators may change this set of column families when calling seek on their source.
- Specified by:
fetchColumnFamily in interface ScannerBase
- Parameters:
col - the column family to be fetched
-
fetchColumn
public void fetchColumn(org.apache.hadoop.io.Text colFam, org.apache.hadoop.io.Text colQual)
Description copied from interface: ScannerBase
Adds a column to the list of columns that will be fetched by this scanner. The column is identified by family and qualifier. By default, when no columns have been added, the scanner fetches all columns.
WARNING. Using this method with custom iterators may have unexpected results. Iterators have control over which column families are fetched. However, iterators have no control over which column qualifiers are fetched. When this method is called, it activates a system iterator that only allows the requested family/qualifier pairs through. This low-level filtering prevents custom iterators from requesting additional column families when calling seek.
For example, assume fetchColumn(A, Q1) and fetchColumn(B, Q1) are called on a scanner and a custom iterator is configured. The families (A,B) will be passed to the seek method of the custom iterator. If the custom iterator seeks its source iterator using the families (A,B,C), it will never see any data from C because the system iterator filtering A:Q1 and B:Q1 will prevent the C family from getting through. ACCUMULO-3905 also has an example of the type of problem this method can cause.
tl;dr If you use a custom iterator with a seek method that adds column families, you may want to avoid using this method.
- Specified by:
fetchColumn in interface ScannerBase
- Parameters:
colFam - the column family of the column to be fetched
colQual - the column qualifier of the column to be fetched
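The A/B/C scenario in the warning can be modelled without a cluster. The sketch below uses only the JDK. `FetchColumnSketch`, `systemFilter`, and `customSeek` are hypothetical names invented for illustration and are not Accumulo classes. It assumes the system iterator behaves like a plain allow-list of family/qualifier pairs sitting beneath the custom iterator.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical model of the documented behavior; none of this is Accumulo code.
// A cell is reduced to a (family, qualifier) pair stored in a Map.Entry.
class FetchColumnSketch {

  // Models the system iterator activated by fetchColumn: only the requested
  // family/qualifier pairs pass through.
  static List<Map.Entry<String, String>> systemFilter(
      List<Map.Entry<String, String>> data, Set<Map.Entry<String, String>> fetched) {
    return data.stream().filter(fetched::contains).collect(Collectors.toList());
  }

  // Models a custom iterator seeking its source with a set of families.
  // It can only see what the layer beneath it lets through.
  static List<Map.Entry<String, String>> customSeek(
      List<Map.Entry<String, String>> source, Set<String> families) {
    return source.stream()
        .filter(cell -> families.contains(cell.getKey()))
        .collect(Collectors.toList());
  }

  static List<Map.Entry<String, String>> scan() {
    List<Map.Entry<String, String>> table = List.of(
        Map.entry("A", "Q1"), Map.entry("A", "Q2"),
        Map.entry("B", "Q1"), Map.entry("C", "Q1"));
    // Stand-in for fetchColumn(A, Q1) and fetchColumn(B, Q1).
    Set<Map.Entry<String, String>> fetched =
        Set.of(Map.entry("A", "Q1"), Map.entry("B", "Q1"));
    // The custom iterator asks for A, B and C, but C was already removed beneath it.
    return customSeek(systemFilter(table, fetched), Set.of("A", "B", "C"));
  }

  public static void main(String[] args) {
    System.out.println(scan());
  }
}
```

In this model the C:Q1 cell never reaches the custom iterator, even though the iterator asked for family C. That is the situation the warning describes.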
-
fetchColumn
public void fetchColumn(IteratorSetting.Column column)
Description copied from interface: ScannerBase
Adds a column to the list of columns that will be fetched by this scanner.
- Specified by:
fetchColumn in interface ScannerBase
- Parameters:
column - the IteratorSetting.Column to fetch
-
clearColumns
public void clearColumns()
Description copied from interface: ScannerBase
Clears the columns to be fetched (useful for resetting the scanner for reuse). Once cleared, the scanner will fetch all columns.
- Specified by:
clearColumns in interface ScannerBase
-
clearScanIterators
public void clearScanIterators()
Description copied from interface: ScannerBase
Clears scan iterators prior to returning a scanner to the pool.
- Specified by:
clearScanIterators in interface ScannerBase
-
setOptions
protected static void setOptions(ScannerOptions dst, ScannerOptions src)
-
iterator
public Iterator<Map.Entry<Key,Value>> iterator()
Description copied from interface: ScannerBase
Returns an iterator over an Accumulo table. This iterator uses the options that are currently set for its lifetime, so setting options will have no effect on existing iterators.
Keys returned by the iterator are not guaranteed to be in sorted order.
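The "options are fixed for the iterator's lifetime" rule can be pictured as a snapshot taken when iterator() is called. The sketch below is a JDK-only model of that idea. `SnapshotScannerSketch` and its string-based rows are hypothetical and do not reflect how ScannerOptions or its subclasses are implemented.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.SortedSet;
import java.util.TreeSet;

// Hypothetical model, not Accumulo code: options are copied when iterator() is called.
class SnapshotScannerSketch implements Iterable<String> {
  private final List<String> rows; // each row is written as "family:value"
  private final SortedSet<String> fetchedFamilies = new TreeSet<>();

  SnapshotScannerSketch(List<String> rows) {
    this.rows = rows;
  }

  void fetchColumnFamily(String family) {
    fetchedFamilies.add(family);
  }

  @Override
  public Iterator<String> iterator() {
    // Copy the current options so that later changes cannot affect this iterator.
    SortedSet<String> snapshot = new TreeSet<>(fetchedFamilies);
    return rows.stream()
        .filter(r -> snapshot.isEmpty() || snapshot.contains(r.substring(0, r.indexOf(':'))))
        .iterator();
  }

  // Helper that collects everything an iterator returns.
  static List<String> drain(Iterator<String> it) {
    List<String> out = new ArrayList<>();
    it.forEachRemaining(out::add);
    return out;
  }

  public static void main(String[] args) {
    SnapshotScannerSketch scanner = new SnapshotScannerSketch(List.of("A:1", "B:2"));
    scanner.fetchColumnFamily("A");
    Iterator<String> first = scanner.iterator();
    scanner.fetchColumnFamily("B"); // does not affect 'first'
    System.out.println(drain(first));
    System.out.println(drain(scanner.iterator()));
  }
}
```

In this model, an option change only takes effect for iterators created after the change, so code that reconfigures a scanner needs to call iterator() again.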
-
setTimeout
public void setTimeout(long timeout, TimeUnit timeUnit)
Description copied from interface: ScannerBase
This setting determines how long a scanner will automatically retry when a failure occurs. By default, a scanner will retry forever.
Setting the timeout to zero (with any time unit) or Long.MAX_VALUE (with TimeUnit.MILLISECONDS) means no timeout.
- Specified by:
setTimeout in interface ScannerBase
- Parameters:
timeout - the length of the timeout
timeUnit - the units of the timeout
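The "zero or Long.MAX_VALUE means no timeout" rule amounts to a small normalization step. The following is a minimal sketch of that documented rule using only java.util.concurrent.TimeUnit. It is not the ScannerOptions source. `TimeoutSketch` and its method names are hypothetical, and rejecting negative values is an assumption that the description above does not state.

```java
import java.util.concurrent.TimeUnit;

// Sketch of the documented rule only; not the actual ScannerOptions implementation.
class TimeoutSketch {

  // Converts a user-supplied timeout to milliseconds, mapping zero to "no timeout".
  static long toStoredMillis(long timeout, TimeUnit unit) {
    if (timeout < 0) {
      // Assumption: negative values are rejected. The documentation does not say.
      throw new IllegalArgumentException("timeout must not be negative: " + timeout);
    }
    // TimeUnit.toMillis saturates at Long.MAX_VALUE instead of overflowing.
    return timeout == 0 ? Long.MAX_VALUE : unit.toMillis(timeout);
  }

  // True when the stored value means "no timeout".
  static boolean isUnlimited(long storedMillis) {
    return storedMillis == Long.MAX_VALUE;
  }

  public static void main(String[] args) {
    System.out.println(isUnlimited(toStoredMillis(0, TimeUnit.SECONDS)));
    System.out.println(toStoredMillis(30, TimeUnit.SECONDS));
  }
}
```

The same rule is stated for setBatchTimeout, so a model like this would apply to both settings.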
-
getTimeout
public long getTimeout(TimeUnit timeunit)
Description copied from interface: ScannerBase
Returns the setting for how long a scanner will automatically retry when a failure occurs.
- Specified by:
getTimeout in interface ScannerBase
- Returns:
the timeout configured for this scanner
-
close
public void close()
Description copied from interface: ScannerBase
Closes any underlying connections on the scanner. This may invalidate any iterators derived from the Scanner, causing them to throw exceptions.
- Specified by:
close in interface AutoCloseable
- Specified by:
close in interface ScannerBase
-
getAuthorizations
public Authorizations getAuthorizations()
Description copied from interface: ScannerBase
Returns the authorizations that have been set on the scanner.
- Specified by:
getAuthorizations in interface ScannerBase
- Returns:
the authorizations set on the scanner instance
-
setSamplerConfiguration
public void setSamplerConfiguration(SamplerConfiguration samplerConfig)
Description copied from interface: ScannerBase
Setting this will cause the scanner to read sample data, as long as that sample data was generated with the given configuration. By default this is not set and all data is read.
One way to use this method is as follows, where the sampler configuration is obtained from the table configuration. Sample data can be generated in many different ways, so it's important to verify that the sample data configuration meets expectations.
    // could cache this if creating many scanners to avoid RPCs.
    SamplerConfiguration samplerConfig = client.tableOperations().getSamplerConfiguration(table);
    // verify table's sample data is generated in an expected way before using
    userCode.verifySamplerConfig(samplerConfig);
    scanner.setSamplerConfiguration(samplerConfig);
Of course, this is not the only way to obtain a SamplerConfiguration; it could be a constant, configuration, etc.
If sample data is not present or sample data was generated with a different configuration, then the scanner iterator will throw a SampleNotPresentException. Also, if a table's sampler configuration is changed while a scanner is iterating over a table, a SampleNotPresentException may be thrown.
- Specified by:
setSamplerConfiguration in interface ScannerBase
-
getSamplerConfiguration
public SamplerConfiguration getSamplerConfiguration()
- Specified by:
getSamplerConfiguration in interface ScannerBase
- Returns:
currently set sampler configuration. Returns null if no sampler configuration is set.
-
clearSamplerConfiguration
public void clearSamplerConfiguration()
Description copied from interface: ScannerBase
Clears sampler configuration making a scanner read all data. After calling this, ScannerBase.getSamplerConfiguration() should return null.
- Specified by:
clearSamplerConfiguration in interface ScannerBase
-
setBatchTimeout
public void setBatchTimeout(long timeout, TimeUnit timeUnit)
Description copied from interface: ScannerBase
This setting determines how long a scanner will wait to fill the returned batch. By default, a scanner waits until the batch is full.
Setting the timeout to zero (with any time unit) or Long.MAX_VALUE (with TimeUnit.MILLISECONDS) means no timeout.
- Specified by:
setBatchTimeout in interface ScannerBase
- Parameters:
timeout - the length of the timeout
timeUnit - the units of the timeout
-
getBatchTimeout
public long getBatchTimeout(TimeUnit timeUnit)
Description copied from interface: ScannerBase
Returns the timeout to fill a batch in the given TimeUnit.
- Specified by:
getBatchTimeout in interface ScannerBase
- Returns:
the batch timeout configured for this scanner
-
setClassLoaderContext
public void setClassLoaderContext(String classLoaderContext)
Description copied from interface: ScannerBase
Sets the name of the classloader context on this scanner. See the administration chapter of the user manual for details on how to configure and use classloader contexts.
- Specified by:
setClassLoaderContext in interface ScannerBase
- Parameters:
classLoaderContext - name of the classloader context
-
clearClassLoaderContext
public void clearClassLoaderContext()
Description copied from interface: ScannerBase
Clears the current classloader context set on this scanner.
- Specified by:
clearClassLoaderContext in interface ScannerBase
-
getClassLoaderContext
public String getClassLoaderContext()
Description copied from interface: ScannerBase
Returns the name of the current classloader context set on this scanner.
- Specified by:
getClassLoaderContext in interface ScannerBase
- Returns:
name of the current context
-
setExecutionHints
public void setExecutionHints(Map<String,String> hints)
Description copied from interface: ScannerBase
Set hints for the configured ScanPrioritizer and ScanDispatcher. These hints are available on the server side via ScanInfo.getExecutionHints(). Depending on the configuration, these hints may be ignored. Hints will never impact what data is returned by a scan, only how quickly it is returned.
Using the hint scan_type=<type> and documenting all of the types for your application is one strategy to consider. This allows administrators to adjust executor and prioritizer config for your application scan types without having to change the application source code.
The default configuration for Accumulo will ignore hints. See HintScanPrioritizer and SimpleScanDispatcher for examples of classes that can react to hints.
- Specified by:
setExecutionHints in interface ScannerBase
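The scan_type strategy can be illustrated with a plain Map<String,String>, which is the type setExecutionHints accepts. The sketch below is JDK-only. `ExecutionHintSketch`, the "report" scan type, and the executor names are hypothetical, and `chooseExecutor` is only a stand-in for whatever dispatcher an administrator has configured, not Accumulo's SimpleScanDispatcher.

```java
import java.util.Map;

// Hypothetical model; the class, scan type, and executor names are invented for illustration.
class ExecutionHintSketch {

  // Hints an application might pass to setExecutionHints for a long-running report scan.
  static Map<String, String> reportHints() {
    return Map.of("scan_type", "report");
  }

  // Models a dispatcher that maps scan types to executor names chosen by an administrator.
  // Missing or unrecognized hints fall back to the default executor, so a hint can
  // change where a scan runs but never which data it returns.
  static String chooseExecutor(Map<String, String> hints, Map<String, String> executorsByType) {
    String type = hints.get("scan_type");
    return type == null ? "default" : executorsByType.getOrDefault(type, "default");
  }

  public static void main(String[] args) {
    Map<String, String> adminConfig = Map.of("report", "slow-pool");
    System.out.println(chooseExecutor(reportHints(), adminConfig));
    System.out.println(chooseExecutor(Map.of(), adminConfig));
  }
}
```

With a real scanner, the map returned by a method like `reportHints()` would be passed to setExecutionHints before calling iterator(). Keeping the set of scan_type values small and documented lets administrators change the type-to-executor mapping without touching application code.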
-
-