| Interface | Description |
|---|---|
| SplitInput.SplitCallback | Used to pass information back to a caller once a file has been split without the need for a data object. |

| Class | Description |
|---|---|
| Bump125 | Helps with making nice intervals at arbitrary scale. |
| ConcatenateVectorsJob | |
| ConcatenateVectorsReducer | |
| MatrixDumper | Exports a Matrix in various text formats (currently: CSV file). Input format: Hadoop SequenceFile with Text key and MatrixWritable value, 1 pair. TODO: needs a class for the key value; should not hard-code to Text. |
| SequenceFileDumper | |
| SplitInput | A utility for splitting files into training and test sets in order to perform cross-validation. Accepts the input format used by the Bayes classifiers, anything else that has one item per line, or SequenceFiles (key/value). |
| SplitInputJob | |
| SplitInputJob.SplitInputComparator | Randomly permutes key/value pairs. |
| SplitInputJob.SplitInputMapper | Mapper which downsamples the input by downsamplingFactor. |
| SplitInputJob.SplitInputReducer | Reducer which uses MultipleOutputs to randomly allocate key/value pairs between test and training outputs. |

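
The downsample-then-split behaviour described for the SplitInputJob classes can be illustrated with a minimal in-memory sketch. This is not the Mahout implementation and uses no Hadoop classes: `SplitSketch`, `Split`, `keepEvery`, `testFraction`, and `seed` are hypothetical names chosen for illustration, and keeping every n-th item is only one assumed way to downsample.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hypothetical illustration only; not part of Mahout.
final class SplitSketch {

    /** Holds the two output partitions. */
    public static final class Split<T> {
        public final List<T> training = new ArrayList<>();
        public final List<T> test = new ArrayList<>();
    }

    /**
     * Downsamples the input, then randomly allocates each kept item to the
     * test or training partition.
     *
     * @param items        input items, one per line or key/value pair
     * @param keepEvery    assumed downsampling rule: keep every n-th item (1 keeps all)
     * @param testFraction probability in [0, 1] that a kept item goes to the test set
     * @param seed         seed for the random generator, so a run can be repeated
     */
    public static <T> Split<T> split(List<T> items, int keepEvery,
                                     double testFraction, long seed) {
        if (keepEvery < 1) {
            throw new IllegalArgumentException("keepEvery must be >= 1");
        }
        if (testFraction < 0.0 || testFraction > 1.0) {
            throw new IllegalArgumentException("testFraction must be in [0, 1]");
        }
        Random random = new Random(seed);
        Split<T> result = new Split<>();
        for (int i = 0; i < items.size(); i++) {
            if (i % keepEvery != 0) {
                continue; // dropped by downsampling
            }
            if (random.nextDouble() < testFraction) {
                result.test.add(items.get(i));
            } else {
                result.training.add(items.get(i));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> lines = new ArrayList<>();
        for (int i = 0; i < 1000; i++) {
            lines.add("item-" + i);
        }
        // Keep every 2nd item, then send roughly 20% of the kept items to the test set.
        Split<String> split = split(lines, 2, 0.2, 42L);
        System.out.println("training=" + split.training.size()
                + " test=" + split.test.size());
    }
}
```

Because allocation is random per item, the test set size only approximates `testFraction` of the kept items; the two partitions together always contain exactly the items that survived downsampling.
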
Copyright © 2008–2013 The Apache Software Foundation. All rights reserved.