public class JobMaster extends org.apache.flink.runtime.rpc.FencedRpcEndpoint<JobMasterId> implements JobMasterGateway, JobMasterService
JobMaster implementation. The job master is responsible for the execution of a single JobGraph.
It offers the following methods as part of its RPC interface to interact with the JobMaster remotely:
updateTaskExecutionState(org.apache.flink.runtime.taskmanager.TaskExecutionState) updates the task execution state for a given task
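Most methods of the RPC interface listed on this page return a CompletableFuture. The following is a minimal sketch of how a caller might consume such a result with a bounded wait. It does not use Flink classes: StatusGateway and statusOrFallback are hypothetical names introduced only for illustration, and the status is modelled as a plain String. The real gateway methods additionally take a Time timeout for the RPC call itself.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class GatewayCallSketch {

    /** Hypothetical stand-in for a future-returning gateway method; not part of Flink. */
    public interface StatusGateway {
        CompletableFuture<String> requestJobStatus();
    }

    /**
     * Waits up to timeoutMillis for the answer and returns the fallback
     * if the future fails, times out, or the waiting thread is interrupted.
     */
    public static String statusOrFallback(StatusGateway gateway, long timeoutMillis, String fallback) {
        try {
            return gateway.requestJobStatus().get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            // Restore the interrupt flag so callers can still observe it.
            Thread.currentThread().interrupt();
            return fallback;
        } catch (ExecutionException | TimeoutException e) {
            return fallback;
        }
    }

    public static void main(String[] args) {
        // A gateway that answers immediately.
        StatusGateway answering = () -> CompletableFuture.completedFuture("RUNNING");
        // A gateway whose future never completes, to exercise the timeout path.
        StatusGateway silent = () -> new CompletableFuture<>();

        System.out.println(statusOrFallback(answering, 100, "UNKNOWN"));
        System.out.println(statusOrFallback(silent, 100, "UNKNOWN"));
    }
}
```

Blocking with get is used here only to keep the sketch short; code that already runs on an asynchronous executor would more likely chain thenApply or handle on the returned future.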
| Modifier and Type | Field | Description |
|---|---|---|
| static String | JOB_MANAGER_NAME | Default names for Flink's distributed components. |
| Constructor and Description |
|---|
| JobMaster(org.apache.flink.runtime.rpc.RpcService rpcService, JobMasterId jobMasterId, JobMasterConfiguration jobMasterConfiguration, ResourceID resourceId, JobGraph jobGraph, HighAvailabilityServices highAvailabilityService, SlotPoolServiceSchedulerFactory slotPoolServiceSchedulerFactory, JobManagerSharedServices jobManagerSharedServices, HeartbeatServices heartbeatServices, JobManagerJobMetricGroupFactory jobMetricGroupFactory, OnCompletionActions jobCompletionActions, org.apache.flink.runtime.rpc.FatalErrorHandler fatalErrorHandler, ClassLoader userCodeLoader, ShuffleMaster<?> shuffleMaster, PartitionTrackerFactory partitionTrackerFactory, ExecutionDeploymentTracker executionDeploymentTracker, ExecutionDeploymentReconciler.Factory executionDeploymentReconcilerFactory, BlocklistHandler.Factory blocklistHandlerFactory, long initializationTimestamp) |
| Modifier and Type | Method | Description |
|---|---|---|
| void | acknowledgeCheckpoint(org.apache.flink.api.common.JobID jobID, ExecutionAttemptID executionAttemptID, long checkpointId, CheckpointMetrics checkpointMetrics, org.apache.flink.util.SerializedValue<TaskStateSnapshot> checkpointState) | |
| CompletableFuture<Acknowledge> | cancel(org.apache.flink.api.common.time.Time timeout) | Cancels the currently executed job. |
| void | declineCheckpoint(DeclineCheckpoint decline) | |
| CompletableFuture<CoordinationResponse> | deliverCoordinationRequestToCoordinator(OperatorID operatorId, org.apache.flink.util.SerializedValue<CoordinationRequest> serializedRequest, org.apache.flink.api.common.time.Time timeout) | Delivers a coordination request to a specified coordinator and returns the response. |
| void | disconnectResourceManager(ResourceManagerId resourceManagerId, Exception cause) | Disconnects the resource manager from the job manager because of the given cause. |
| CompletableFuture<Acknowledge> | disconnectTaskManager(ResourceID resourceID, Exception cause) | Disconnects the given TaskExecutor from the JobMaster. |
| void | failSlot(ResourceID taskManagerId, AllocationID allocationId, Exception cause) | Fails the slot with the given allocation id and cause. |
| JobMasterGateway | getGateway() | Gets the JobMasterGateway belonging to this service. |
| CompletableFuture<Void> | heartbeatFromResourceManager(ResourceID resourceID) | Sends a heartbeat request from the resource manager. |
| CompletableFuture<Void> | heartbeatFromTaskManager(ResourceID resourceID, TaskExecutorToJobManagerHeartbeatPayload payload) | Sends the heartbeat to the job manager from the task manager. |
| CompletableFuture<Acknowledge> | notifyKvStateRegistered(org.apache.flink.api.common.JobID jobId, JobVertexID jobVertexId, KeyGroupRange keyGroupRange, String registrationName, org.apache.flink.queryablestate.KvStateID kvStateId, InetSocketAddress kvStateServerAddress) | Notifies that queryable state has been registered. |
| CompletableFuture<Acknowledge> | notifyKvStateUnregistered(org.apache.flink.api.common.JobID jobId, JobVertexID jobVertexId, KeyGroupRange keyGroupRange, String registrationName) | Notifies that queryable state has been unregistered. |
| CompletableFuture<Acknowledge> | notifyNewBlockedNodes(Collection<BlockedNode> newNodes) | Notifies about new blocked node records. |
| void | notifyNotEnoughResourcesAvailable(Collection<ResourceRequirement> acquiredResources) | Notifies that not enough resources are available to fulfill the resource requirements of a job. |
| CompletableFuture<Collection<SlotOffer>> | offerSlots(ResourceID taskManagerId, Collection<SlotOffer> slots, org.apache.flink.api.common.time.Time timeout) | Offers the given slots to the job manager. |
| protected void | onStart() | |
| CompletableFuture<Void> | onStop() | Suspends the job and shuts down all other services, including RPC. |
| CompletableFuture<RegistrationResponse> | registerTaskManager(org.apache.flink.api.common.JobID jobId, TaskManagerRegistrationInformation taskManagerRegistrationInformation, org.apache.flink.api.common.time.Time timeout) | Registers the task manager at the job manager. |
| void | reportCheckpointMetrics(org.apache.flink.api.common.JobID jobID, ExecutionAttemptID executionAttemptID, long checkpointId, CheckpointMetrics checkpointMetrics) | |
| CompletableFuture<ExecutionGraphInfo> | requestJob(org.apache.flink.api.common.time.Time timeout) | Requests the ExecutionGraphInfo of the executed job. |
| CompletableFuture<JobDetails> | requestJobDetails(org.apache.flink.api.common.time.Time timeout) | Requests the details of the executed job. |
| CompletableFuture<org.apache.flink.api.common.JobStatus> | requestJobStatus(org.apache.flink.api.common.time.Time timeout) | Requests the current job status. |
| CompletableFuture<KvStateLocation> | requestKvStateLocation(org.apache.flink.api.common.JobID jobId, String registrationName) | Requests a KvStateLocation for the specified InternalKvState registration name. |
| CompletableFuture<SerializedInputSplit> | requestNextInputSplit(JobVertexID vertexID, ExecutionAttemptID executionAttempt) | Requests the next input split for the ExecutionJobVertex. |
| CompletableFuture<ExecutionState> | requestPartitionState(IntermediateDataSetID intermediateResultId, ResultPartitionID resultPartitionId) | Requests the current state of the partition. |
| CompletableFuture<Acknowledge> | sendOperatorEventToCoordinator(ExecutionAttemptID task, OperatorID operatorID, org.apache.flink.util.SerializedValue<OperatorEvent> serializedEvent) | |
| CompletableFuture<CoordinationResponse> | sendRequestToCoordinator(OperatorID operatorID, org.apache.flink.util.SerializedValue<CoordinationRequest> serializedRequest) | |
| CompletableFuture<?> | stopTrackingAndReleasePartitions(Collection<ResultPartitionID> partitionIds) | Notifies the JobMasterPartitionTracker to stop tracking the target result partitions and release the locally occupied resources on TaskExecutors if any. |
| CompletableFuture<String> | stopWithSavepoint(String targetDirectory, org.apache.flink.core.execution.SavepointFormatType formatType, boolean terminate, org.apache.flink.api.common.time.Time timeout) | Stops the job with a savepoint. |
| CompletableFuture<CompletedCheckpoint> | triggerCheckpoint(org.apache.flink.core.execution.CheckpointType checkpointType, org.apache.flink.api.common.time.Time timeout) | Triggers taking a checkpoint of the executed job. |
| CompletableFuture<String> | triggerSavepoint(String targetDirectory, boolean cancelJob, org.apache.flink.core.execution.SavepointFormatType formatType, org.apache.flink.api.common.time.Time timeout) | Triggers taking a savepoint of the executed job. |
| CompletableFuture<Object> | updateGlobalAggregate(String aggregateName, Object aggregand, byte[] serializedAggregateFunction) | Updates the aggregate and returns the new value. |
| CompletableFuture<Acknowledge> | updateTaskExecutionState(TaskExecutionState taskExecutionState) | Updates the task execution state for a given task. |
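The parameter documentation for updateGlobalAggregate says that a serialized function is applied to the current aggregate and the aggregand to obtain the new aggregate value. The sketch below illustrates that idea only; it is not the JobMaster implementation. SimpleAggregate and SumAggregate are hypothetical stand-ins (Flink expects an AggregateFunction), plain Java serialization is assumed for the byte[] argument, and the result is returned directly rather than wrapped in a CompletableFuture.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.HashMap;
import java.util.Map;

public class GlobalAggregateSketch {

    /** Hypothetical, simplified stand-in for an aggregate function; not Flink's AggregateFunction. */
    public interface SimpleAggregate extends Serializable {
        Object apply(Object currentAggregate, Object aggregand);
    }

    /** Example function: sums Long values, treating a missing aggregate as 0. */
    public static class SumAggregate implements SimpleAggregate {
        private static final long serialVersionUID = 1L;

        @Override
        public Object apply(Object currentAggregate, Object aggregand) {
            long current = currentAggregate == null ? 0L : (Long) currentAggregate;
            return current + (Long) aggregand;
        }
    }

    // Current value of each named aggregate.
    private final Map<String, Object> aggregates = new HashMap<>();

    /** Serializes a function to bytes, as a caller would before sending it. */
    public static byte[] serialize(SimpleAggregate function) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(function);
        } catch (IOException e) {
            throw new IllegalStateException("Could not serialize aggregate function", e);
        }
        return bytes.toByteArray();
    }

    /** Deserializes the function, applies it to the stored aggregate, stores and returns the result. */
    public synchronized Object updateGlobalAggregate(
            String aggregateName, Object aggregand, byte[] serializedFunction) {
        SimpleAggregate function;
        try (ObjectInputStream in =
                new ObjectInputStream(new ByteArrayInputStream(serializedFunction))) {
            function = (SimpleAggregate) in.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("Could not deserialize aggregate function", e);
        }
        Object updated = function.apply(aggregates.get(aggregateName), aggregand);
        aggregates.put(aggregateName, updated);
        return updated;
    }

    public static void main(String[] args) {
        GlobalAggregateSketch sketch = new GlobalAggregateSketch();
        byte[] sum = serialize(new SumAggregate());
        System.out.println(sketch.updateGlobalAggregate("records", 5L, sum)); // prints 5
        System.out.println(sketch.updateGlobalAggregate("records", 3L, sum)); // prints 8
    }
}
```

Shipping the function with every call keeps the receiving side free of job-specific aggregate logic, at the cost of deserializing it on each update.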
Methods inherited from class org.apache.flink.runtime.rpc.RpcEndpoint:
callAsync, closeAsync, getAddress, getEndpointId, getHostname, getMainThreadExecutor, getRpcService, getSelfGateway, getTerminationFuture, internalCallOnStart, internalCallOnStop, isRunning, registerResource, runAsync, scheduleRunAsync, scheduleRunAsync, start, stop, unregisterResource, validateRunsInMainThread

Methods inherited from class java.lang.Object:
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Methods inherited from implemented interfaces:
triggerCheckpoint
getAddress, getTerminationFuture

public JobMaster(org.apache.flink.runtime.rpc.RpcService rpcService,
JobMasterId jobMasterId,
JobMasterConfiguration jobMasterConfiguration,
ResourceID resourceId,
JobGraph jobGraph,
HighAvailabilityServices highAvailabilityService,
SlotPoolServiceSchedulerFactory slotPoolServiceSchedulerFactory,
JobManagerSharedServices jobManagerSharedServices,
HeartbeatServices heartbeatServices,
JobManagerJobMetricGroupFactory jobMetricGroupFactory,
OnCompletionActions jobCompletionActions,
org.apache.flink.runtime.rpc.FatalErrorHandler fatalErrorHandler,
ClassLoader userCodeLoader,
ShuffleMaster<?> shuffleMaster,
PartitionTrackerFactory partitionTrackerFactory,
ExecutionDeploymentTracker executionDeploymentTracker,
ExecutionDeploymentReconciler.Factory executionDeploymentReconcilerFactory,
BlocklistHandler.Factory blocklistHandlerFactory,
long initializationTimestamp)
throws Exception
Throws: Exception

protected void onStart()
throws JobMasterException
Overrides: onStart in class org.apache.flink.runtime.rpc.RpcEndpoint
Throws: JobMasterException

public CompletableFuture<Void> onStop()
Suspends the job and shuts down all other services, including RPC.
Overrides: onStop in class org.apache.flink.runtime.rpc.RpcEndpoint

public CompletableFuture<Acknowledge> cancel(org.apache.flink.api.common.time.Time timeout)
Description copied from interface: JobMasterGateway
Cancels the currently executed job.
Specified by: cancel in interface JobMasterGateway
Parameters:
timeout - of this operation

public CompletableFuture<Acknowledge> updateTaskExecutionState(TaskExecutionState taskExecutionState)
Description copied from interface: JobMasterGateway
Updates the task execution state for a given task.
Specified by: updateTaskExecutionState in interface JobMasterGateway
Parameters:
taskExecutionState - New task execution state for a given task

public CompletableFuture<SerializedInputSplit> requestNextInputSplit(JobVertexID vertexID, ExecutionAttemptID executionAttempt)
Description copied from interface: JobMasterGateway
Requests the next input split for the ExecutionJobVertex. The next input split is sent back to the sender as a SerializedInputSplit message.
Specified by: requestNextInputSplit in interface JobMasterGateway
Parameters:
vertexID - The job vertex id
executionAttempt - The execution attempt id

public CompletableFuture<ExecutionState> requestPartitionState(IntermediateDataSetID intermediateResultId, ResultPartitionID resultPartitionId)
Description copied from interface: JobMasterGateway
Requests the current state of the partition.
Specified by: requestPartitionState in interface JobMasterGateway
Parameters:
intermediateResultId - The ID of the intermediate result that the partition belongs to.
resultPartitionId - The partition ID of the partition to request the state of.

public CompletableFuture<Acknowledge> disconnectTaskManager(ResourceID resourceID, Exception cause)
Description copied from interface: JobMasterGateway
Disconnects the given TaskExecutor from the JobMaster.
Specified by: disconnectTaskManager in interface JobMasterGateway
Parameters:
resourceID - identifying the TaskManager to disconnect
cause - for the disconnection of the TaskManager

public void acknowledgeCheckpoint(org.apache.flink.api.common.JobID jobID,
ExecutionAttemptID executionAttemptID,
long checkpointId,
CheckpointMetrics checkpointMetrics,
@Nullable
org.apache.flink.util.SerializedValue<TaskStateSnapshot> checkpointState)
public void reportCheckpointMetrics(org.apache.flink.api.common.JobID jobID,
ExecutionAttemptID executionAttemptID,
long checkpointId,
CheckpointMetrics checkpointMetrics)
public void declineCheckpoint(DeclineCheckpoint decline)
public CompletableFuture<Acknowledge> sendOperatorEventToCoordinator(ExecutionAttemptID task, OperatorID operatorID, org.apache.flink.util.SerializedValue<OperatorEvent> serializedEvent)
public CompletableFuture<CoordinationResponse> sendRequestToCoordinator(OperatorID operatorID, org.apache.flink.util.SerializedValue<CoordinationRequest> serializedRequest)
public CompletableFuture<KvStateLocation> requestKvStateLocation(org.apache.flink.api.common.JobID jobId, String registrationName)
Description copied from interface: KvStateLocationOracle
Requests a KvStateLocation for the specified InternalKvState registration name.
Specified by: requestKvStateLocation in interface KvStateLocationOracle
Parameters:
jobId - identifying the job for which to request the KvStateLocation
registrationName - Name under which the KvState has been registered.
Returns: future of the InternalKvState location

public CompletableFuture<Acknowledge> notifyKvStateRegistered(org.apache.flink.api.common.JobID jobId, JobVertexID jobVertexId, KeyGroupRange keyGroupRange, String registrationName, org.apache.flink.queryablestate.KvStateID kvStateId, InetSocketAddress kvStateServerAddress)
Description copied from interface: KvStateRegistryGateway
Notifies that queryable state has been registered.
Specified by: notifyKvStateRegistered in interface KvStateRegistryGateway
Parameters:
jobId - identifying the job for which to register a key value state
jobVertexId - JobVertexID the KvState instance belongs to.
keyGroupRange - Key group range the KvState instance belongs to.
registrationName - Name under which the KvState has been registered.
kvStateId - ID of the registered KvState instance.
kvStateServerAddress - Server address where to find the KvState instance.

public CompletableFuture<Acknowledge> notifyKvStateUnregistered(org.apache.flink.api.common.JobID jobId, JobVertexID jobVertexId, KeyGroupRange keyGroupRange, String registrationName)
Description copied from interface: KvStateRegistryGateway
Notifies that queryable state has been unregistered.
Specified by: notifyKvStateUnregistered in interface KvStateRegistryGateway
Parameters:
jobId - identifying the job for which to unregister a key value state
jobVertexId - JobVertexID the KvState instance belongs to.
keyGroupRange - Key group range the KvState instance belongs to.
registrationName - Name under which the KvState has been registered.

public CompletableFuture<Collection<SlotOffer>> offerSlots(ResourceID taskManagerId, Collection<SlotOffer> slots, org.apache.flink.api.common.time.Time timeout)
Description copied from interface: JobMasterGateway
Offers the given slots to the job manager.
Specified by: offerSlots in interface JobMasterGateway
Parameters:
taskManagerId - identifying the task manager
slots - to offer to the job manager
timeout - for the rpc call

public void failSlot(ResourceID taskManagerId, AllocationID allocationId, Exception cause)
Description copied from interface: JobMasterGateway
Fails the slot with the given allocation id and cause.
Specified by: failSlot in interface JobMasterGateway
Parameters:
taskManagerId - identifying the task manager
allocationId - identifying the slot to fail
cause - of the failing

public CompletableFuture<RegistrationResponse> registerTaskManager(org.apache.flink.api.common.JobID jobId, TaskManagerRegistrationInformation taskManagerRegistrationInformation, org.apache.flink.api.common.time.Time timeout)
Description copied from interface: JobMasterGateway
Registers the task manager at the job manager.
Specified by: registerTaskManager in interface JobMasterGateway
Parameters:
jobId - jobId specifying the job for which the JobMaster should be responsible
taskManagerRegistrationInformation - the information for registering a task manager at the job manager
timeout - for the rpc call

public void disconnectResourceManager(ResourceManagerId resourceManagerId, Exception cause)
Description copied from interface: JobMasterGateway
Disconnects the resource manager from the job manager because of the given cause.
Specified by: disconnectResourceManager in interface JobMasterGateway
Parameters:
resourceManagerId - identifying the resource manager leader id
cause - of the disconnect

public CompletableFuture<Void> heartbeatFromTaskManager(ResourceID resourceID, TaskExecutorToJobManagerHeartbeatPayload payload)
Description copied from interface: JobMasterGateway
Sends the heartbeat to the job manager from the task manager.
Specified by: heartbeatFromTaskManager in interface JobMasterGateway
Parameters:
resourceID - unique id of the task manager
payload - report payload

public CompletableFuture<Void> heartbeatFromResourceManager(ResourceID resourceID)
Description copied from interface: JobMasterGateway
Sends a heartbeat request from the resource manager.
Specified by: heartbeatFromResourceManager in interface JobMasterGateway
Parameters:
resourceID - unique id of the resource manager

public CompletableFuture<JobDetails> requestJobDetails(org.apache.flink.api.common.time.Time timeout)
Description copied from interface: JobMasterGateway
Requests the details of the executed job.
Specified by: requestJobDetails in interface JobMasterGateway
Parameters:
timeout - for the rpc call

public CompletableFuture<org.apache.flink.api.common.JobStatus> requestJobStatus(org.apache.flink.api.common.time.Time timeout)
Description copied from interface: JobMasterGateway
Requests the current job status.
Specified by: requestJobStatus in interface JobMasterGateway
Parameters:
timeout - for the rpc call

public CompletableFuture<ExecutionGraphInfo> requestJob(org.apache.flink.api.common.time.Time timeout)
Description copied from interface: JobMasterGateway
Requests the ExecutionGraphInfo of the executed job.
Specified by: requestJob in interface JobMasterGateway
Parameters:
timeout - for the rpc call
Returns: future of the ExecutionGraphInfo of the executed job

public CompletableFuture<CompletedCheckpoint> triggerCheckpoint(org.apache.flink.core.execution.CheckpointType checkpointType, org.apache.flink.api.common.time.Time timeout)
Description copied from interface: JobMasterGateway
Triggers taking a checkpoint of the executed job.
Specified by: triggerCheckpoint in interface JobMasterGateway
Parameters:
checkpointType - to determine how checkpoint should be taken
timeout - for the rpc call

public CompletableFuture<String> triggerSavepoint(@Nullable String targetDirectory, boolean cancelJob, org.apache.flink.core.execution.SavepointFormatType formatType, org.apache.flink.api.common.time.Time timeout)
Description copied from interface: JobMasterGateway
Triggers taking a savepoint of the executed job.
Specified by: triggerSavepoint in interface JobMasterGateway
Parameters:
targetDirectory - to which to write the savepoint data or null if the default savepoint directory should be used
formatType - binary format for the savepoint
timeout - for the rpc call

public CompletableFuture<String> stopWithSavepoint(@Nullable String targetDirectory, org.apache.flink.core.execution.SavepointFormatType formatType, boolean terminate, org.apache.flink.api.common.time.Time timeout)
Description copied from interface: JobMasterGateway
Stops the job with a savepoint.
Specified by: stopWithSavepoint in interface JobMasterGateway
Parameters:
targetDirectory - to which to write the savepoint data or null if the default savepoint directory should be used
terminate - flag indicating if the job should terminate or just suspend
timeout - for the rpc call

public void notifyNotEnoughResourcesAvailable(Collection<ResourceRequirement> acquiredResources)
Description copied from interface: JobMasterGateway
Notifies that not enough resources are available to fulfill the resource requirements of a job.
Specified by: notifyNotEnoughResourcesAvailable in interface JobMasterGateway
Parameters:
acquiredResources - the resources that have been acquired for the job

public CompletableFuture<Object> updateGlobalAggregate(String aggregateName, Object aggregand, byte[] serializedAggregateFunction)
Description copied from interface: JobMasterGateway
Updates the aggregate and returns the new value.
Specified by: updateGlobalAggregate in interface JobMasterGateway
Parameters:
aggregateName - The name of the aggregate to update
aggregand - The value to add to the aggregate
serializedAggregateFunction - The function to apply to the current aggregate and aggregand to obtain the new aggregate value; this should be of type AggregateFunction

public CompletableFuture<CoordinationResponse> deliverCoordinationRequestToCoordinator(OperatorID operatorId, org.apache.flink.util.SerializedValue<CoordinationRequest> serializedRequest, org.apache.flink.api.common.time.Time timeout)
Description copied from interface: JobMasterGateway
Delivers a coordination request to a specified coordinator and returns the response.
Specified by: deliverCoordinationRequestToCoordinator in interface JobMasterGateway
Parameters:
operatorId - identifying the coordinator to receive the request
serializedRequest - serialized request to deliver
Returns: a future with the response; it fails with a FlinkException if the task is not running, or no operator/coordinator exists for the given ID, or the coordinator cannot handle client events.

public CompletableFuture<?> stopTrackingAndReleasePartitions(Collection<ResultPartitionID> partitionIds)
Description copied from interface: JobMasterGateway
Notifies the JobMasterPartitionTracker to stop tracking the target result partitions and release the locally occupied resources on TaskExecutors if any.

public CompletableFuture<Acknowledge> notifyNewBlockedNodes(Collection<BlockedNode> newNodes)
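Description copied from interface: BlocklistListener
Notifies about new blocked node records.
Specified by: notifyNewBlockedNodes in interface BlocklistListener
Parameters:
newNodes - the new blocked node records

public JobMasterGateway getGateway()
Description copied from interface: JobMasterService
Gets the JobMasterGateway belonging to this service.
Specified by: getGateway in interface JobMasterService

Copyright © 2014–2023 The Apache Software Foundation. All rights reserved.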