|
||||||||||
| PREV CLASS NEXT CLASS | FRAMES NO FRAMES | |||||||||
| SUMMARY: NESTED | FIELD | CONSTR | METHOD | DETAIL: FIELD | CONSTR | METHOD | |||||||||
java.lang.Objectorg.apache.hadoop.mapreduce.InputFormat<K,V>
org.apache.hadoop.mapreduce.lib.input.FileInputFormat<org.apache.hadoop.io.Text,org.apache.hadoop.io.Text>
org.apache.hadoop.examples.terasort.TeraInputFormat
public class TeraInputFormat
An input format that reads the first 10 characters of each line as the key and the rest of the line as the value. Both key and value are represented as Text.
| Nested Class Summary |
|---|
| Nested classes/interfaces inherited from class org.apache.hadoop.mapreduce.lib.input.FileInputFormat |
|---|
org.apache.hadoop.mapreduce.lib.input.FileInputFormat.Counter |
| Field Summary |
|---|
| Fields inherited from class org.apache.hadoop.mapreduce.lib.input.FileInputFormat |
|---|
DEFAULT_LIST_STATUS_NUM_THREADS, INPUT_DIR, INPUT_DIR_RECURSIVE, LIST_STATUS_NUM_THREADS, NUM_INPUT_FILES, PATHFILTER_CLASS, SPLIT_MAXSIZE, SPLIT_MINSIZE |
| Constructor Summary | |
|---|---|
TeraInputFormat()
|
|
| Method Summary | |
|---|---|
org.apache.hadoop.mapreduce.RecordReader<org.apache.hadoop.io.Text,org.apache.hadoop.io.Text> |
createRecordReader(org.apache.hadoop.mapreduce.InputSplit split,
org.apache.hadoop.mapreduce.TaskAttemptContext context)
|
List<org.apache.hadoop.mapreduce.InputSplit> |
getSplits(org.apache.hadoop.mapreduce.JobContext job)
|
static void |
writePartitionFile(org.apache.hadoop.mapreduce.JobContext job,
org.apache.hadoop.fs.Path partFile)
Use the input splits to take samples of the input and generate sample keys. |
| Methods inherited from class org.apache.hadoop.mapreduce.lib.input.FileInputFormat |
|---|
addInputPath, addInputPathRecursively, addInputPaths, computeSplitSize, getBlockIndex, getFormatMinSplitSize, getInputDirRecursive, getInputPathFilter, getInputPaths, getMaxSplitSize, getMinSplitSize, isSplitable, listStatus, makeSplit, setInputDirRecursive, setInputPathFilter, setInputPaths, setInputPaths, setMaxInputSplitSize, setMinInputSplitSize |
| Methods inherited from class java.lang.Object |
|---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
| Constructor Detail |
|---|
public TeraInputFormat()
| Method Detail |
|---|
public static void writePartitionFile(org.apache.hadoop.mapreduce.JobContext job,
org.apache.hadoop.fs.Path partFile)
throws Throwable
job - the job to samplepartFile - where to write the output file to
Throwable - if something goes wrong
public org.apache.hadoop.mapreduce.RecordReader<org.apache.hadoop.io.Text,org.apache.hadoop.io.Text> createRecordReader(org.apache.hadoop.mapreduce.InputSplit split,
org.apache.hadoop.mapreduce.TaskAttemptContext context)
throws IOException
createRecordReader in class org.apache.hadoop.mapreduce.InputFormat<org.apache.hadoop.io.Text,org.apache.hadoop.io.Text>IOException
public List<org.apache.hadoop.mapreduce.InputSplit> getSplits(org.apache.hadoop.mapreduce.JobContext job)
throws IOException
getSplits in class org.apache.hadoop.mapreduce.lib.input.FileInputFormat<org.apache.hadoop.io.Text,org.apache.hadoop.io.Text>IOException
|
||||||||||
| PREV CLASS NEXT CLASS | FRAMES NO FRAMES | |||||||||
| SUMMARY: NESTED | FIELD | CONSTR | METHOD | DETAIL: FIELD | CONSTR | METHOD | |||||||||