org.apache.hadoop.mapreduce.v2.app
Class MRAppMaster

java.lang.Object
  extended by org.apache.hadoop.yarn.service.AbstractService
      extended by org.apache.hadoop.yarn.service.CompositeService
          extended by org.apache.hadoop.mapreduce.v2.app.MRAppMaster
All Implemented Interfaces:
org.apache.hadoop.yarn.service.Service

public class MRAppMaster
extends org.apache.hadoop.yarn.service.CompositeService

The Map-Reduce Application Master. The state machine is encapsulated in the implementation of the Job interface. All state changes happen via the Job interface, and each event results in a finite state transition in the Job. The MR AppMaster is a composition of loosely coupled services that interact with each other via events; the component model resembles the Actor model. Each component acts on the events it receives and sends events to other components. This keeps the AppMaster highly concurrent, with no or minimal synchronization needs. Events are dispatched by a central dispatch mechanism, and all components register with the Dispatcher. Information is shared across the different components using the AppContext.
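The event-driven composition described above can be sketched with a minimal, self-contained analogue. Note that `MiniDispatcher`, its nested `EventHandler`, and the event names below are illustrative stand-ins, not the Hadoop/YARN API:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentLinkedQueue;

// Illustrative sketch of a central dispatcher with registered components,
// in the spirit of the AppMaster's design. Not the Hadoop API.
public class MiniDispatcher {
    interface EventHandler { void handle(String event); }

    final List<EventHandler> handlers = new ArrayList<>();
    private final ConcurrentLinkedQueue<String> queue = new ConcurrentLinkedQueue<>();

    // Components register with the dispatcher, as services do in the AppMaster.
    public void register(EventHandler h) { handlers.add(h); }

    // Components communicate only by posting events, never by direct calls.
    public void dispatch(String event) { queue.add(event); }

    // Drain the queue, delivering each event to every registered component.
    public void drain() {
        String e;
        while ((e = queue.poll()) != null) {
            for (EventHandler h : handlers) h.handle(e);
        }
    }

    public static void main(String[] args) {
        MiniDispatcher d = new MiniDispatcher();
        List<String> seen = new ArrayList<>();
        d.register(seen::add);                 // component A records events
        d.register(e -> {                      // component B reacts with a follow-up event
            if (e.equals("TASK_FINISHED")) d.dispatch("JOB_FINISHED");
        });
        d.dispatch("TASK_FINISHED");
        d.drain();
        System.out.println(seen);              // [TASK_FINISHED, JOB_FINISHED]
    }
}
```

Because components only exchange events through the dispatcher, they need no shared locks, which is the concurrency property the class description claims.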


Nested Class Summary
 
Nested classes/interfaces inherited from class org.apache.hadoop.yarn.service.CompositeService
org.apache.hadoop.yarn.service.CompositeService.CompositeServiceShutdownHook
 
Nested classes/interfaces inherited from interface org.apache.hadoop.yarn.service.Service
org.apache.hadoop.yarn.service.Service.STATE
 
Field Summary
protected  MRAppMetrics metrics
           
 
Constructor Summary
MRAppMaster(org.apache.hadoop.yarn.api.records.ApplicationAttemptId applicationAttemptId, org.apache.hadoop.yarn.api.records.ContainerId containerId, String nmHost, int nmPort, int nmHttpPort, org.apache.hadoop.yarn.Clock clock, long appSubmitTime)
           
MRAppMaster(org.apache.hadoop.yarn.api.records.ApplicationAttemptId applicationAttemptId, org.apache.hadoop.yarn.api.records.ContainerId containerId, String nmHost, int nmPort, int nmHttpPort, long appSubmitTime)
           
 
Method Summary
protected  void addIfService(Object object)
           
 void cleanupStagingDir()
          Clean up the staging directories for the job.
protected  ClientService createClientService(AppContext context)
           
protected  ContainerAllocator createContainerAllocator(ClientService clientService, AppContext context)
           
protected  ContainerLauncher createContainerLauncher(AppContext context)
           
protected  org.apache.hadoop.yarn.event.Dispatcher createDispatcher()
           
protected  Job createJob(org.apache.hadoop.conf.Configuration conf)
          Create and initialize (but don't start) a single job.
protected  org.apache.hadoop.yarn.event.EventHandler<JobFinishEvent> createJobFinishEventHandler()
          Create an event handler that handles the job finish event.
protected  org.apache.hadoop.yarn.event.EventHandler<JobHistoryEvent> createJobHistoryHandler(AppContext context)
           
protected  Recovery createRecoveryService(AppContext appContext)
          Create the recovery service.
protected  Speculator createSpeculator(org.apache.hadoop.conf.Configuration conf, AppContext context)
           
protected  TaskAttemptListener createTaskAttemptListener(AppContext context)
           
protected  TaskCleaner createTaskCleaner(AppContext context)
           
protected  void downloadTokensAndSetupUGI(org.apache.hadoop.conf.Configuration conf)
          Obtain the tokens needed by the job and put them in the UGI.
 List<org.apache.hadoop.mapreduce.v2.api.records.AMInfo> getAllAMInfos()
           
 org.apache.hadoop.yarn.api.records.ApplicationId getAppID()
           
 org.apache.hadoop.yarn.api.records.ApplicationAttemptId getAttemptID()
           
 org.apache.hadoop.mapreduce.OutputCommitter getCommitter()
           
 Map<org.apache.hadoop.mapreduce.v2.api.records.TaskId,org.apache.hadoop.mapreduce.jobhistory.JobHistoryParser.TaskInfo> getCompletedTaskFromPreviousRun()
           
 ContainerAllocator getContainerAllocator()
           
 ContainerLauncher getContainerLauncher()
           
 AppContext getContext()
           
 org.apache.hadoop.yarn.event.Dispatcher getDispatcher()
           
protected  org.apache.hadoop.fs.FileSystem getFileSystem(org.apache.hadoop.conf.Configuration conf)
          Create the default file system for this job.
 org.apache.hadoop.mapreduce.v2.api.records.JobId getJobId()
           
 int getStartCount()
           
 TaskAttemptListener getTaskAttemptListener()
           
 void init(org.apache.hadoop.conf.Configuration conf)
           
protected static void initAndStartAppMaster(MRAppMaster appMaster, org.apache.hadoop.yarn.conf.YarnConfiguration conf, String jobUserName)
           
 boolean isNewApiCommitter()
           
protected  boolean keepJobFiles(org.apache.hadoop.mapred.JobConf conf)
           
static void main(String[] args)
           
 void start()
           
protected  void startJobs()
          This can be overridden to instantiate multiple jobs and create a workflow.
protected  void sysexit()
          Exit call.
 
Methods inherited from class org.apache.hadoop.yarn.service.CompositeService
addService, getServices, removeService, stop
 
Methods inherited from class org.apache.hadoop.yarn.service.AbstractService
getConfig, getName, getServiceState, getStartTime, register, unregister
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

metrics

protected final MRAppMetrics metrics
Constructor Detail

MRAppMaster

public MRAppMaster(org.apache.hadoop.yarn.api.records.ApplicationAttemptId applicationAttemptId,
                   org.apache.hadoop.yarn.api.records.ContainerId containerId,
                   String nmHost,
                   int nmPort,
                   int nmHttpPort,
                   long appSubmitTime)

MRAppMaster

public MRAppMaster(org.apache.hadoop.yarn.api.records.ApplicationAttemptId applicationAttemptId,
                   org.apache.hadoop.yarn.api.records.ContainerId containerId,
                   String nmHost,
                   int nmPort,
                   int nmHttpPort,
                   org.apache.hadoop.yarn.Clock clock,
                   long appSubmitTime)
Method Detail

init

public void init(org.apache.hadoop.conf.Configuration conf)
Specified by:
init in interface org.apache.hadoop.yarn.service.Service
Overrides:
init in class org.apache.hadoop.yarn.service.CompositeService

createDispatcher

protected org.apache.hadoop.yarn.event.Dispatcher createDispatcher()

keepJobFiles

protected boolean keepJobFiles(org.apache.hadoop.mapred.JobConf conf)

getFileSystem

protected org.apache.hadoop.fs.FileSystem getFileSystem(org.apache.hadoop.conf.Configuration conf)
                                                 throws IOException
Create the default file system for this job.

Parameters:
conf - the configuration object
Returns:
the default filesystem for this job
Throws:
IOException

cleanupStagingDir

public void cleanupStagingDir()
                       throws IOException
Clean up the staging directories for the job.

Throws:
IOException

sysexit

protected void sysexit()
Exit call, wrapped in its own method so that tests can override it.
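The idea of keeping the exit call in an overridable method is a common testability pattern; a minimal self-contained sketch follows. `AppShell` and its members are illustrative names, not part of the Hadoop API:

```java
// Sketch of the "exit call in its own method" testability pattern.
// AppShell is an illustrative stand-in, not a Hadoop class.
public class AppShell {
    protected void sysexit() {     // production behavior: terminate the JVM
        System.exit(0);
    }

    public void shutdown() {
        // ... cleanup work would go here ...
        sysexit();
    }

    public static void main(String[] args) {
        // In a test, override sysexit so the JVM keeps running
        // and we can observe that an exit was requested.
        final boolean[] exited = {false};
        AppShell testable = new AppShell() {
            @Override protected void sysexit() { exited[0] = true; }
        };
        testable.shutdown();
        System.out.println("exit requested: " + exited[0]);  // exit requested: true
    }
}
```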


createJobFinishEventHandler

protected org.apache.hadoop.yarn.event.EventHandler<JobFinishEvent> createJobFinishEventHandler()
Create an event handler that handles the job finish event.

Returns:
the job finish event handler.

createRecoveryService

protected Recovery createRecoveryService(AppContext appContext)
Create the recovery service.

Returns:
an instance of the recovery service.

createJob

protected Job createJob(org.apache.hadoop.conf.Configuration conf)
Create and initialize (but don't start) a single job.


downloadTokensAndSetupUGI

protected void downloadTokensAndSetupUGI(org.apache.hadoop.conf.Configuration conf)
Obtain the tokens needed by the job and put them in the UGI.

Parameters:
conf - the job configuration

addIfService

protected void addIfService(Object object)
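A method with this shape typically adopts an object as a child service only when it actually implements the service interface; a hedged, self-contained sketch of that idea (the types below are simplified stand-ins for the YARN service classes, not the real API):

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of the addIfService idea: a composite parent adds an object as a
// child only when it implements the Service interface; other objects are
// simply ignored. Simplified stand-ins for the YARN service classes.
public class AddIfServiceDemo {
    interface Service { void start(); }

    static class Composite {
        final List<Service> children = new ArrayList<>();

        void addIfService(Object object) {
            if (object instanceof Service) {
                children.add((Service) object);
            }
        }
    }

    public static void main(String[] args) {
        Composite parent = new Composite();
        parent.addIfService((Service) () -> System.out.println("child started"));
        parent.addIfService("not a service");        // silently ignored
        System.out.println(parent.children.size());  // 1
    }
}
```

This lets a component be wired into the composite lifecycle when it happens to be a service, without forcing every collaborator to implement the interface.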

createJobHistoryHandler

protected org.apache.hadoop.yarn.event.EventHandler<JobHistoryEvent> createJobHistoryHandler(AppContext context)

createSpeculator

protected Speculator createSpeculator(org.apache.hadoop.conf.Configuration conf,
                                      AppContext context)

createTaskAttemptListener

protected TaskAttemptListener createTaskAttemptListener(AppContext context)

createTaskCleaner

protected TaskCleaner createTaskCleaner(AppContext context)

createContainerAllocator

protected ContainerAllocator createContainerAllocator(ClientService clientService,
                                                      AppContext context)

createContainerLauncher

protected ContainerLauncher createContainerLauncher(AppContext context)

createClientService

protected ClientService createClientService(AppContext context)

getAppID

public org.apache.hadoop.yarn.api.records.ApplicationId getAppID()

getAttemptID

public org.apache.hadoop.yarn.api.records.ApplicationAttemptId getAttemptID()

getJobId

public org.apache.hadoop.mapreduce.v2.api.records.JobId getJobId()

getCommitter

public org.apache.hadoop.mapreduce.OutputCommitter getCommitter()

isNewApiCommitter

public boolean isNewApiCommitter()

getStartCount

public int getStartCount()

getContext

public AppContext getContext()

getDispatcher

public org.apache.hadoop.yarn.event.Dispatcher getDispatcher()

getCompletedTaskFromPreviousRun

public Map<org.apache.hadoop.mapreduce.v2.api.records.TaskId,org.apache.hadoop.mapreduce.jobhistory.JobHistoryParser.TaskInfo> getCompletedTaskFromPreviousRun()

getAllAMInfos

public List<org.apache.hadoop.mapreduce.v2.api.records.AMInfo> getAllAMInfos()

getContainerAllocator

public ContainerAllocator getContainerAllocator()

getContainerLauncher

public ContainerLauncher getContainerLauncher()

getTaskAttemptListener

public TaskAttemptListener getTaskAttemptListener()

start

public void start()
Specified by:
start in interface org.apache.hadoop.yarn.service.Service
Overrides:
start in class org.apache.hadoop.yarn.service.CompositeService

startJobs

protected void startJobs()
This can be overridden to instantiate multiple jobs and create a workflow. TODO: Rework the design to actually support this. Currently much of the job setup has been moved to init() above to support uberization (MR-1220). In a typical workflow, one would presumably want to uberize only a subset of the jobs (the "small" ones), which is awkward with the current design.
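The override point described above can be sketched with a minimal analogue: a subclass replaces the single-job default with a small workflow. The classes and job names below are hypothetical stand-ins, not the Hadoop types:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of overriding a startJobs()-style hook to run a
// workflow of several jobs instead of one. Not the Hadoop classes.
public class WorkflowDemo {
    static class BaseMaster {
        final List<String> started = new ArrayList<>();

        protected void startJobs() {          // default: start a single job
            started.add("single-job");
        }
    }

    static class WorkflowMaster extends BaseMaster {
        private final List<String> jobs;
        WorkflowMaster(List<String> jobs) { this.jobs = jobs; }

        @Override
        protected void startJobs() {          // workflow: start each job in order
            started.addAll(jobs);
        }
    }

    public static void main(String[] args) {
        BaseMaster m = new WorkflowMaster(List.of("prepare", "transform", "publish"));
        m.startJobs();
        System.out.println(m.started);        // [prepare, transform, publish]
    }
}
```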


main

public static void main(String[] args)

initAndStartAppMaster

protected static void initAndStartAppMaster(MRAppMaster appMaster,
                                            org.apache.hadoop.yarn.conf.YarnConfiguration conf,
                                            String jobUserName)
                                     throws IOException,
                                            InterruptedException
Throws:
IOException
InterruptedException


Copyright © 2012 Apache Software Foundation. All Rights Reserved.