org.apache.hadoop.mapred.gridmix
Class Gridmix

java.lang.Object
  extended by org.apache.hadoop.conf.Configured
      extended by org.apache.hadoop.mapred.gridmix.Gridmix
All Implemented Interfaces:
org.apache.hadoop.conf.Configurable, org.apache.hadoop.util.Tool

public class Gridmix
extends org.apache.hadoop.conf.Configured
implements org.apache.hadoop.util.Tool

Driver class for the Gridmix3 benchmark. Gridmix accepts a timestamped stream (trace) of job/task descriptions. For each job in the trace, the client submits a corresponding synthetic job to the target cluster at the rate recorded in the original trace. The intent is to provide a benchmark that can be configured and extended to closely match the measured resource profile of actual production loads.


Field Summary
static String GRIDMIX_JOBMONITOR_SLEEPTIME_MILLIS
          The configuration key which determines the duration for which the job-monitor sleeps while polling for job status.
static int GRIDMIX_JOBMONITOR_SLEEPTIME_MILLIS_DEFAULT
          Default value for GRIDMIX_JOBMONITOR_SLEEPTIME_MILLIS.
static String GRIDMIX_JOBMONITOR_THREADS
          The configuration key which determines the total number of job-status monitoring threads.
static int GRIDMIX_JOBMONITOR_THREADS_DEFAULT
          Default value for GRIDMIX_JOBMONITOR_THREADS.
static String GRIDMIX_OUT_DIR
          Output (scratch) directory for submitted jobs.
static String GRIDMIX_QUE_DEP
          The depth of the queue of job descriptions.
static String GRIDMIX_SUB_MUL
          Multiplier to accelerate or decelerate job submission.
static String GRIDMIX_SUB_THR
          Number of submitting threads at the client and upper bound for in-memory split data.
static String GRIDMIX_USR_RSV
          Class used to resolve users in the trace to the list of target users on the cluster.
static org.apache.commons.logging.Log LOG
           
static String ORIGINAL_JOB_ID
          Configuration property, set in the simulated job's configuration, whose value is the corresponding original job's ID.
static String ORIGINAL_JOB_NAME
          Configuration property, set in the simulated job's configuration, whose value is the corresponding original job's name.
 
Constructor Summary
Gridmix()
           
 
Method Summary
protected  org.apache.hadoop.mapred.gridmix.JobFactory createJobFactory(org.apache.hadoop.mapred.gridmix.JobSubmitter submitter, String traceIn, org.apache.hadoop.fs.Path scratchDir, org.apache.hadoop.conf.Configuration conf, CountDownLatch startFlag, UserResolver resolver)
           
protected  org.apache.hadoop.mapred.gridmix.JobMonitor createJobMonitor(Statistics stats, org.apache.hadoop.conf.Configuration conf)
           
protected  org.apache.hadoop.tools.rumen.JobStoryProducer createJobStoryProducer(String traceIn, org.apache.hadoop.conf.Configuration conf)
          Create an appropriate JobStoryProducer object for the given trace.
protected  org.apache.hadoop.mapred.gridmix.JobSubmitter createJobSubmitter(org.apache.hadoop.mapred.gridmix.JobMonitor monitor, int threads, int queueDepth, org.apache.hadoop.mapred.gridmix.FilePool pool, UserResolver resolver, Statistics statistics)
           
 UserResolver getCurrentUserResolver()
           
protected static org.apache.hadoop.mapred.gridmix.GridmixJobSubmissionPolicy getJobSubmissionPolicy(org.apache.hadoop.conf.Configuration conf)
           
protected  org.apache.hadoop.mapred.gridmix.Summarizer getSummarizer()
           
static void main(String[] argv)
           
protected  void printUsage(PrintStream out)
           
 int run(String[] argv)
           
protected  void writeDistCacheData(org.apache.hadoop.conf.Configuration conf)
          Write random bytes to the distributed cache files that will be used by all simulated jobs of the current Gridmix run, if files are to be generated.
protected  int writeInputData(long genbytes, org.apache.hadoop.fs.Path inputDir)
          Write random bytes at the path <inputDir> if needed.
 
Methods inherited from class org.apache.hadoop.conf.Configured
getConf, setConf
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 
Methods inherited from interface org.apache.hadoop.conf.Configurable
getConf, setConf
 

Field Detail

LOG

public static final org.apache.commons.logging.Log LOG

GRIDMIX_OUT_DIR

public static final String GRIDMIX_OUT_DIR
Output (scratch) directory for submitted jobs. Relative paths are resolved against the path provided as input, while absolute paths remain independent of it. The default is "gridmix".

See Also:
Constant Field Values
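
The resolution rule described for GRIDMIX_OUT_DIR can be illustrated with a small sketch. It uses java.nio.file.Path as a stand-in for org.apache.hadoop.fs.Path, so the class and method names (ScratchDirSketch, resolveScratchDir) and the sample paths are hypothetical and not part of Gridmix; only the "gridmix" default comes from the description above.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of Gridmix: mimics the documented rule with
// java.nio.file.Path standing in for org.apache.hadoop.fs.Path.
final class ScratchDirSketch {
    // Default output directory name, as stated in the field description.
    static final String DEFAULT_OUT_DIR = "gridmix";

    // Relative values are resolved against ioPath; absolute values are
    // returned unchanged because Path.resolve ignores the base for them.
    static Path resolveScratchDir(Path ioPath, String configuredOutDir) {
        String outDir = (configuredOutDir == null) ? DEFAULT_OUT_DIR : configuredOutDir;
        return ioPath.resolve(outDir);
    }

    public static void main(String[] args) {
        Path ioPath = Paths.get("/data/gridmix-io"); // assumed sample input path
        System.out.println(resolveScratchDir(ioPath, null));
        System.out.println(resolveScratchDir(ioPath, "run-01"));
        System.out.println(resolveScratchDir(ioPath, "/tmp/scratch"));
    }
}
```

On a POSIX file system, the first two calls yield directories under the input path, and the third yields the absolute path as given.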

GRIDMIX_SUB_THR

public static final String GRIDMIX_SUB_THR
Number of submitting threads at the client and upper bound for in-memory split data. Submitting threads precompute InputSplits for submitted jobs. This limits the number of splits held in memory waiting for submission and also permits parallel computation of split data.

See Also:
Constant Field Values
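
To make the two roles of the submitting-thread count concrete, the following sketch runs a fixed pool of worker threads, each of which computes the splits for one job and then "submits" it. Because a thread holds at most one job's splits at a time, the pool size also bounds how many split sets are in memory. This is only an illustration under that assumption: SubmitThreadsSketch, submitAll, and computeSplits are hypothetical names, and the actual JobSubmitter may be structured differently.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch, not the Gridmix JobSubmitter.
final class SubmitThreadsSketch {

    // Returns {jobs submitted, peak number of split sets held at once}.
    static int[] submitAll(int jobCount, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        AtomicInteger inMemory = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        AtomicInteger submitted = new AtomicInteger();

        for (int i = 0; i < jobCount; i++) {
            final int jobId = i;
            pool.execute(() -> {
                // Track how many split sets exist concurrently.
                int now = inMemory.incrementAndGet();
                peak.accumulateAndGet(now, Math::max);
                List<Integer> splits = computeSplits(jobId);
                if (!splits.isEmpty()) {
                    submitted.incrementAndGet(); // stand-in for job submission
                }
                inMemory.decrementAndGet();
            });
        }

        pool.shutdown();
        try {
            pool.awaitTermination(30, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return new int[] { submitted.get(), peak.get() };
    }

    // Stand-in for InputSplit computation: four fake split ids per job.
    static List<Integer> computeSplits(int jobId) {
        List<Integer> splits = new ArrayList<>();
        for (int s = 0; s < 4; s++) {
            splits.add(jobId * 4 + s);
        }
        return splits;
    }
}
```

With 20 jobs and 3 threads, all 20 jobs are submitted and the peak never exceeds 3.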

GRIDMIX_QUE_DEP

public static final String GRIDMIX_QUE_DEP
The depth of the queue of job descriptions. Before splits are computed, a queue of pending descriptions is stored in memory. This parameter limits the depth of that queue.

See Also:
Constant Field Values
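
A bounded queue is one straightforward way to picture this limit. The sketch below is an assumption-laden illustration rather than the Gridmix implementation: QueueDepthSketch and enqueueWithoutBlocking are hypothetical names, and it uses the non-blocking offer call so the effect of the depth is visible, whereas a real producer might block until space frees up.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical sketch of a depth-limited queue of pending job descriptions.
final class QueueDepthSketch {

    // Offers `offered` descriptions to a queue of capacity `queueDepth`
    // and returns how many were accepted without blocking.
    static int enqueueWithoutBlocking(int queueDepth, int offered) {
        BlockingQueue<String> pending = new LinkedBlockingQueue<>(queueDepth);
        int accepted = 0;
        for (int i = 0; i < offered; i++) {
            // offer() returns false once the queue holds queueDepth items.
            if (pending.offer("job-" + i)) {
                accepted++;
            }
        }
        return accepted;
    }
}
```

Offering 8 descriptions to a queue of depth 5 accepts only the first 5; the rest would have to wait until a consumer drains the queue.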

GRIDMIX_SUB_MUL

public static final String GRIDMIX_SUB_MUL
Multiplier to accelerate or decelerate job submission. As a crude means of sizing a job trace to a cluster, the time separating two jobs is multiplied by this factor.

See Also:
Constant Field Values
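
The arithmetic behind the multiplier can be sketched as follows. The example takes submission times from a trace, multiplies each gap between consecutive jobs by the factor, and accumulates the result into a replay schedule. SubmissionMultiplierSketch and scaleSubmitTimes are hypothetical names, and starting the replay at offset 0 and rounding to whole milliseconds are assumptions of this sketch.

```java
// Hypothetical sketch of how a submission multiplier reshapes a trace.
final class SubmissionMultiplierSketch {

    // Input: submission timestamps from the trace, in milliseconds, ascending.
    // Output: replay offsets in milliseconds, with every inter-job gap
    // multiplied by `multiplier`.
    static long[] scaleSubmitTimes(long[] traceSubmitMillis, double multiplier) {
        long[] replay = new long[traceSubmitMillis.length];
        for (int i = 1; i < traceSubmitMillis.length; i++) {
            long gap = traceSubmitMillis[i] - traceSubmitMillis[i - 1];
            replay[i] = replay[i - 1] + Math.round(gap * multiplier);
        }
        return replay; // replay[0] stays 0: the first job starts the run
    }
}
```

For trace times 1000, 3000, and 6000 ms, a factor of 0.5 gives offsets 0, 1000, and 2500 ms (faster submission), while a factor of 2.0 gives 0, 4000, and 10000 ms (slower submission).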

GRIDMIX_USR_RSV

public static final String GRIDMIX_USR_RSV
Class used to resolve users in the trace to the list of target users on the cluster.

See Also:
Constant Field Values
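
As an illustration of what resolving trace users to target users can mean, the sketch below assigns each distinct user found in the trace to one of the cluster's users in round-robin order and remembers the assignment. It does not implement the UserResolver API; the class RoundRobinResolverSketch, its resolve method, and the user names are hypothetical, and round-robin is just one possible policy.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of a round-robin mapping from trace users to target users.
final class RoundRobinResolverSketch {
    private final List<String> targetUsers;
    private final Map<String, String> assigned = new HashMap<>();
    private int next = 0;

    RoundRobinResolverSketch(List<String> targetUsers) {
        if (targetUsers.isEmpty()) {
            throw new IllegalArgumentException("at least one target user is required");
        }
        this.targetUsers = new ArrayList<>(targetUsers);
    }

    // The first time a trace user is seen, it gets the next target user;
    // later lookups return the same target user.
    String resolve(String traceUser) {
        return assigned.computeIfAbsent(
            traceUser, u -> targetUsers.get(next++ % targetUsers.size()));
    }
}
```

With target users "u1" and "u2", the trace users "alice", "bob", and "carol" map to "u1", "u2", and "u1" respectively, and a second lookup of "alice" still returns "u1".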

GRIDMIX_JOBMONITOR_SLEEPTIME_MILLIS

public static final String GRIDMIX_JOBMONITOR_SLEEPTIME_MILLIS
The configuration key which determines the duration for which the job-monitor sleeps while polling for job status. This value should be specified in milliseconds.

See Also:
Constant Field Values
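
The role of the sleep time in a polling monitor can be sketched as a simple loop: check the job status, and if the job is not finished, sleep for the configured number of milliseconds before checking again. JobMonitorPollSketch and pollUntilDone are hypothetical names, and the status check is abstracted as a BooleanSupplier; the real JobMonitor is not shown here and may work differently.

```java
import java.util.function.BooleanSupplier;

// Hypothetical sketch of a status-polling loop with a configurable sleep.
final class JobMonitorPollSketch {

    // Polls isComplete until it returns true, sleeping sleepMillis between
    // polls. Returns the number of status checks performed.
    static int pollUntilDone(BooleanSupplier isComplete, long sleepMillis) {
        int polls = 0;
        while (true) {
            polls++;
            if (isComplete.getAsBoolean()) {
                return polls;
            }
            try {
                Thread.sleep(sleepMillis);
            } catch (InterruptedException e) {
                // Preserve the interrupt for the caller and stop polling.
                Thread.currentThread().interrupt();
                return polls;
            }
        }
    }
}
```

A shorter sleep notices completed jobs sooner at the cost of more status requests; a longer sleep does the opposite.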

GRIDMIX_JOBMONITOR_SLEEPTIME_MILLIS_DEFAULT

public static final int GRIDMIX_JOBMONITOR_SLEEPTIME_MILLIS_DEFAULT
Default value for GRIDMIX_JOBMONITOR_SLEEPTIME_MILLIS.

See Also:
Constant Field Values

GRIDMIX_JOBMONITOR_THREADS

public static final String GRIDMIX_JOBMONITOR_THREADS
The configuration key which determines the total number of job-status monitoring threads.

See Also:
Constant Field Values

GRIDMIX_JOBMONITOR_THREADS_DEFAULT

public static final int GRIDMIX_JOBMONITOR_THREADS_DEFAULT
Default value for GRIDMIX_JOBMONITOR_THREADS.

See Also:
Constant Field Values

ORIGINAL_JOB_NAME

public static final String ORIGINAL_JOB_NAME
Configuration property, set in the simulated job's configuration, whose value is the corresponding original job's name. This property is not configurable by the Gridmix user.

See Also:
Constant Field Values

ORIGINAL_JOB_ID

public static final String ORIGINAL_JOB_ID
Configuration property, set in the simulated job's configuration, whose value is the corresponding original job's ID. This property is not configurable by the Gridmix user.

See Also:
Constant Field Values
Constructor Detail

Gridmix

public Gridmix()
Method Detail

writeInputData

protected int writeInputData(long genbytes,
                             org.apache.hadoop.fs.Path inputDir)
                      throws IOException,
                             InterruptedException
Write random bytes at the path <inputDir> if needed.

Returns:
exit status
Throws:
IOException
InterruptedException
See Also:
GenerateData

writeDistCacheData

protected void writeDistCacheData(org.apache.hadoop.conf.Configuration conf)
                           throws IOException,
                                  InterruptedException
Write random bytes to the distributed cache files that will be used by all simulated jobs of the current Gridmix run, if files are to be generated. This is done as part of the MapReduce job GenerateDistCacheData.JOB_NAME.

Throws:
IOException
InterruptedException
See Also:
GenerateDistCacheData

createJobStoryProducer

protected org.apache.hadoop.tools.rumen.JobStoryProducer createJobStoryProducer(String traceIn,
                                                                                org.apache.hadoop.conf.Configuration conf)
                                                                         throws IOException
Create an appropriate JobStoryProducer object for the given trace.

Parameters:
traceIn - the path to the trace file. The special path "-" denotes the standard input stream.
conf - the configuration to be used.
Throws:
IOException - if there was an error.
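
The handling of the special path "-" can be pictured with a short sketch that picks the input source for a trace. It returns a plain InputStream rather than a JobStoryProducer and reads from the local file system via java.nio, both of which are simplifications; TraceInputSketch and openTrace are hypothetical names.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical sketch: choose the trace source based on the traceIn argument.
final class TraceInputSketch {

    // "-" selects standard input; any other value is treated as a file path.
    static InputStream openTrace(String traceIn) {
        if ("-".equals(traceIn)) {
            return System.in;
        }
        try {
            return Files.newInputStream(Paths.get(traceIn));
        } catch (IOException e) {
            // Wrapped so callers of this sketch need no checked-exception handling.
            throw new UncheckedIOException(e);
        }
    }
}
```

This convention lets a trace be piped into the tool from another process instead of being written to a file first.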

getJobSubmissionPolicy

protected static org.apache.hadoop.mapred.gridmix.GridmixJobSubmissionPolicy getJobSubmissionPolicy(org.apache.hadoop.conf.Configuration conf)

createJobMonitor

protected org.apache.hadoop.mapred.gridmix.JobMonitor createJobMonitor(Statistics stats,
                                                                       org.apache.hadoop.conf.Configuration conf)
                                                                throws IOException
Throws:
IOException

createJobSubmitter

protected org.apache.hadoop.mapred.gridmix.JobSubmitter createJobSubmitter(org.apache.hadoop.mapred.gridmix.JobMonitor monitor,
                                                                           int threads,
                                                                           int queueDepth,
                                                                           org.apache.hadoop.mapred.gridmix.FilePool pool,
                                                                           UserResolver resolver,
                                                                           Statistics statistics)
                                                                    throws IOException
Throws:
IOException

createJobFactory

protected org.apache.hadoop.mapred.gridmix.JobFactory createJobFactory(org.apache.hadoop.mapred.gridmix.JobSubmitter submitter,
                                                                       String traceIn,
                                                                       org.apache.hadoop.fs.Path scratchDir,
                                                                       org.apache.hadoop.conf.Configuration conf,
                                                                       CountDownLatch startFlag,
                                                                       UserResolver resolver)
                                                                throws IOException
Throws:
IOException

getCurrentUserResolver

public UserResolver getCurrentUserResolver()

run

public int run(String[] argv)
        throws IOException,
               InterruptedException
Specified by:
run in interface org.apache.hadoop.util.Tool
Throws:
IOException
InterruptedException

main

public static void main(String[] argv)
                 throws Exception
Throws:
Exception

printUsage

protected void printUsage(PrintStream out)

getSummarizer

protected org.apache.hadoop.mapred.gridmix.Summarizer getSummarizer()


Copyright © 2014 Apache Software Foundation. All Rights Reserved.