com.twitter.elephantbird.mapreduce.output
Class RCFileThriftOutputFormat
java.lang.Object
org.apache.hadoop.mapreduce.OutputFormat<K,V>
org.apache.hadoop.mapreduce.lib.output.FileOutputFormat<org.apache.hadoop.io.NullWritable,org.apache.hadoop.io.Writable>
com.twitter.elephantbird.mapreduce.output.RCFileOutputFormat
com.twitter.elephantbird.mapreduce.output.RCFileThriftOutputFormat
public class RCFileThriftOutputFormat
extends RCFileOutputFormat
OutputFormat for storing Thrift objects in RCFile.
Each top-level field is stored in a separate column.
Thrift field ids are stored in the RCFile metadata.
The user can write either a ThriftWritable wrapping the Thrift object
or a BytesWritable containing serialized Thrift bytes. The latter
ensures that all fields are preserved even if the current Thrift
definition does not match the definition used to produce the serialized bytes.
Any fields not recognized by the current Thrift class are stored in the last
column.
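The column layout described above can be illustrated with a small, self-contained model. This is a sketch using only the JDK, not Elephant Bird's implementation: the class ColumnLayoutSketch, its toRow method, and the use of a Map keyed by Thrift field id are assumptions made for illustration. Each known field id gets its own column, and anything the current definition does not recognize is collected in one extra trailing column.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical model of the column layout; not the library's code. */
public class ColumnLayoutSketch {

    /**
     * Lays out one record as a row of columns.
     *
     * @param knownFieldIds field ids of the current Thrift definition, in column order
     * @param record        field id -> serialized value, as found in the incoming bytes
     * @return one column per known field (null if absent), followed by a last
     *         column holding every field the current definition does not recognize
     */
    public static List<Object> toRow(List<Integer> knownFieldIds, Map<Integer, String> record) {
        List<Object> row = new ArrayList<>();
        // Work on a copy so the caller's map is left untouched.
        Map<Integer, String> unknown = new LinkedHashMap<>(record);
        for (Integer id : knownFieldIds) {
            // remove() returns the value (or null) and leaves only unknown ids behind.
            row.add(unknown.remove(id));
        }
        // Last column: whatever the current definition did not claim.
        row.add(unknown);
        return row;
    }

    public static void main(String[] args) {
        Map<Integer, String> record = new LinkedHashMap<>();
        record.put(1, "alice");
        record.put(2, "42");
        record.put(7, "added-in-newer-schema");
        System.out.println(toRow(Arrays.asList(1, 2), record));
    }
}
```

In this model, writing serialized bytes produced by a newer definition does not lose field 7; it lands in the trailing column instead of being dropped.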
Nested classes/interfaces inherited from class org.apache.hadoop.mapreduce.lib.output.FileOutputFormat
org.apache.hadoop.mapreduce.lib.output.FileOutputFormat.Counter
Fields inherited from class org.apache.hadoop.mapreduce.lib.output.FileOutputFormat
BASE_OUTPUT_NAME, PART
Method Summary
org.apache.hadoop.mapreduce.RecordWriter<org.apache.hadoop.io.NullWritable,org.apache.hadoop.io.Writable> | getRecordWriter(org.apache.hadoop.mapreduce.TaskAttemptContext job)
protected ColumnarMetadata | makeColumnarMetadata()
static void | setClassConf(Class<? extends org.apache.thrift.TBase<?,?>> thriftClass, org.apache.hadoop.conf.Configuration conf)
Stores supplied class name in configuration.
Methods inherited from class org.apache.hadoop.mapreduce.lib.output.FileOutputFormat
checkOutputSpecs, getCompressOutput, getDefaultWorkFile, getOutputCommitter, getOutputCompressorClass, getOutputName, getOutputPath, getPathForWorkFile, getUniqueFile, getWorkOutputPath, setCompressOutput, setOutputCompressorClass, setOutputName, setOutputPath
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
RCFileThriftOutputFormat
public RCFileThriftOutputFormat()
Internal; for MapReduce use only.
RCFileThriftOutputFormat
public RCFileThriftOutputFormat(TypeRef<? extends org.apache.thrift.TBase<?,?>> typeRef)
makeColumnarMetadata
protected ColumnarMetadata makeColumnarMetadata()
setClassConf
public static void setClassConf(Class<? extends org.apache.thrift.TBase<?,?>> thriftClass,
org.apache.hadoop.conf.Configuration conf)
Stores the name of the supplied Thrift class in the configuration. The
remote tasks read this setting to initialize the output format correctly.
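The hand-off this method describes, where the submitting side records a class name and each remote task reads it back, can be sketched with the JDK alone. In the sketch below java.util.Properties stands in for Hadoop's Configuration, and the key sketch.thrift.class, the class ClassConfSketch, and the getClassConf helper are all hypothetical; the key that Elephant Bird actually uses is not stated on this page.

```java
import java.util.Properties;

/** Hypothetical model of passing a class name through job configuration. */
public class ClassConfSketch {

    // Assumed key name, chosen for illustration only.
    static final String KEY = "sketch.thrift.class";

    /** Submitting side: store the class name as a plain string. */
    public static void setClassConf(Class<?> cls, Properties conf) {
        conf.setProperty(KEY, cls.getName());
    }

    /** Task side: read the name back and load the class. */
    public static Class<?> getClassConf(Properties conf) {
        String name = conf.getProperty(KEY);
        if (name == null) {
            throw new IllegalStateException(KEY + " is not set");
        }
        try {
            return Class.forName(name);
        } catch (ClassNotFoundException e) {
            // The class must be on the task's classpath for this to succeed.
            throw new IllegalStateException("cannot load " + name, e);
        }
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        setClassConf(java.util.ArrayList.class, conf);
        System.out.println(getClassConf(conf).getName());
    }
}
```

Only a string travels with the job, which is why the class has to be set before submission and be loadable on the tasks. In a real job the corresponding call would take the job's own Configuration, for example the object returned by Job.getConfiguration().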
getRecordWriter
public org.apache.hadoop.mapreduce.RecordWriter<org.apache.hadoop.io.NullWritable,org.apache.hadoop.io.Writable> getRecordWriter(org.apache.hadoop.mapreduce.TaskAttemptContext job)
throws IOException, InterruptedException
Overrides:
getRecordWriter in class RCFileOutputFormat
Throws:
IOException
InterruptedException
Copyright © 2015 Twitter. All Rights Reserved.