public class LuceneIterator extends AbstractLuceneIterator
| Modifier and Type | Field and Description |
|---|---|
| protected String | idField |
| protected Set<String> | idFieldSelector |

Fields inherited from class AbstractLuceneIterator: bump, field, indexReader, maxErrorDocs, nextDocId, nextLogRecord, normPower, numErrorDocs, skippedErrorMessages, terminfo, weight

| Constructor and Description |
|---|
| LuceneIterator(org.apache.lucene.index.IndexReader indexReader, String idField, String field, TermInfo terminfo, Weight weight, double normPower) Produce a LuceneIterable that can create the Vector plus normalize it. |
| LuceneIterator(org.apache.lucene.index.IndexReader indexReader, String idField, String field, TermInfo terminfo, Weight weight, double normPower, double maxPercentErrorDocs) |
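
The two numeric constructor arguments are easier to read with a small standalone sketch. The code below is not Mahout code: `ConstructorArgsSketch`, `normalize`, `maxErrorDocs`, and the value chosen for `NO_NORMALIZING` are hypothetical stand-ins. It assumes that `normPower` selects the p-norm a vector is divided by, and that `maxPercentErrorDocs` is a fraction in [0,1] that is scaled by the document count to get an absolute limit. Both are assumptions about behavior this page does not spell out.

```java
import java.util.Arrays;

/** Standalone illustration; none of these names belong to the Mahout API. */
final class ConstructorArgsSketch {

    // Hypothetical stand-in for LuceneIterable.NO_NORMALIZING (the real value is not stated on this page).
    static final double NO_NORMALIZING = -1.0;

    /** Returns a copy of v divided by its p-norm, where p = normPower (assumed meaning). */
    static double[] normalize(double[] v, double normPower) {
        if (normPower == NO_NORMALIZING) {
            return v.clone(); // sentinel: leave the values untouched
        }
        if (normPower < 0.0 || Double.isNaN(normPower) || Double.isInfinite(normPower)) {
            throw new IllegalArgumentException("normPower must be finite and nonnegative, or NO_NORMALIZING");
        }
        double norm = 0.0;
        if (normPower == 0.0) {
            // Assumption: power 0 counts the non-zero entries (the usual L0 convention).
            for (double x : v) {
                if (x != 0.0) {
                    norm += 1.0;
                }
            }
        } else {
            for (double x : v) {
                norm += Math.pow(Math.abs(x), normPower);
            }
            norm = Math.pow(norm, 1.0 / normPower);
        }
        double[] out = v.clone();
        if (norm == 0.0) {
            return out; // all-zero vector: nothing to scale
        }
        for (int i = 0; i < out.length; i++) {
            out[i] /= norm;
        }
        return out;
    }

    /** Converts a tolerated fraction of error documents into an absolute count (assumed meaning). */
    static int maxErrorDocs(double maxPercentErrorDocs, int numDocs) {
        if (maxPercentErrorDocs < 0.0 || maxPercentErrorDocs > 1.0) {
            throw new IllegalArgumentException("maxPercentErrorDocs must be in [0,1]");
        }
        return (int) (maxPercentErrorDocs * numDocs);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(normalize(new double[] {3.0, 4.0}, 2.0)));
        System.out.println(maxErrorDocs(0.25, 200));
    }
}
```

Under these assumptions, a `normPower` of 2 rescales each vector to unit Euclidean length, and a `maxPercentErrorDocs` of 0.25 on a 200-document index would allow up to 50 documents without a term frequency vector.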
| Modifier and Type | Method and Description |
|---|---|
| protected String | getVectorName(int documentIndex) Given the document name, derive a name for the vector. |

Methods inherited from class AbstractLuceneIterator: computeNext

protected final String idField

public LuceneIterator(org.apache.lucene.index.IndexReader indexReader,
String idField,
String field,
TermInfo terminfo,
Weight weight,
double normPower)
Parameters:
indexReader - IndexReader to read the documents from.
idField - field containing the id. May be null.
field - field to use for the Vector
terminfo - terminfo
weight - weight
normPower - the normalization value. Must be nonnegative, or LuceneIterable.NO_NORMALIZING

public LuceneIterator(org.apache.lucene.index.IndexReader indexReader,
String idField,
String field,
TermInfo terminfo,
Weight weight,
double normPower,
double maxPercentErrorDocs)
Parameters:
indexReader - IndexReader to read the documents from.
idField - field containing the id. May be null.
field - field to use for the Vector
terminfo - terminfo
weight - weight
normPower - the normalization value. Must be nonnegative, or LuceneIterable.NO_NORMALIZING
maxPercentErrorDocs - most documents that will be tolerated without a term freq vector. In [0,1].

See Also:
LuceneIterator(org.apache.lucene.index.IndexReader, String, String, org.apache.mahout.utils.vectors.TermInfo, org.apache.mahout.vectorizer.Weight, double)

protected String getVectorName(int documentIndex) throws IOException

Description copied from class: AbstractLuceneIterator
Specified by: getVectorName in class AbstractLuceneIterator
Parameters:
documentIndex - the lucene document index.
Throws:
IOException

Copyright © 2008–2013 The Apache Software Foundation. All rights reserved.