public class ActiveResourceManager<WorkerType extends ResourceIDRetrievable> extends ResourceManager<WorkerType> implements ResourceEventHandler<WorkerType>
An active implementation of ResourceManager.
This resource manager actively requests resources from, and releases them to, external resource
management frameworks. Depending on the ResourceManagerDriver provided, it can work with
various frameworks.
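To illustrate the division of labor described above, here is a minimal, self-contained sketch. It does not use Flink's classes: `SketchDriver`, `InMemoryDriver`, and `SketchActiveManager` are hypothetical stand-ins, worker ids are plain strings, and `scaleTo` is an invented simplification rather than part of the ActiveResourceManager API. It only shows the general idea that the manager keeps the bookkeeping while every framework-specific call goes through the driver.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a framework driver; NOT Flink's ResourceManagerDriver. */
interface SketchDriver {
    /** Asks the external framework for a worker and returns its id. */
    String requestResource(String spec);

    /** Hands a worker back to the external framework. */
    void releaseResource(String workerId);
}

/** Hypothetical in-memory driver standing in for a real external framework. */
final class InMemoryDriver implements SketchDriver {
    private int nextId = 0;
    final List<String> live = new ArrayList<>();

    @Override
    public String requestResource(String spec) {
        String id = spec + "-" + nextId++;
        live.add(id);
        return id;
    }

    @Override
    public void releaseResource(String workerId) {
        live.remove(workerId);
    }
}

/** Hypothetical manager: owns the bookkeeping, delegates framework calls to the driver. */
final class SketchActiveManager {
    private final SketchDriver driver;
    private final List<String> workers = new ArrayList<>();

    SketchActiveManager(SketchDriver driver) {
        this.driver = driver;
    }

    /** Requests or releases workers until exactly {@code target} are held (invented helper). */
    void scaleTo(int target, String spec) {
        while (workers.size() < target) {
            workers.add(driver.requestResource(spec));
        }
        while (workers.size() > target) {
            // Release the most recently acquired worker first.
            driver.releaseResource(workers.remove(workers.size() - 1));
        }
    }

    int workerCount() {
        return workers.size();
    }
}
```

In this sketch, supporting a different framework would mean writing another `SketchDriver` implementation and passing it to the same `SketchActiveManager`.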
| Modifier and Type | Field and Description |
|---|---|
| protected org.apache.flink.configuration.Configuration | flinkConfig |

Fields inherited from class ResourceManager: blocklistHandler, ioExecutor, RESOURCE_MANAGER_NAME, resourceManagerMetricGroup

| Constructor and Description |
|---|
| ActiveResourceManager(ResourceManagerDriver<WorkerType> resourceManagerDriver, org.apache.flink.configuration.Configuration flinkConfig, org.apache.flink.runtime.rpc.RpcService rpcService, UUID leaderSessionId, ResourceID resourceId, HeartbeatServices heartbeatServices, DelegationTokenManager delegationTokenManager, SlotManager slotManager, ResourceManagerPartitionTrackerFactory clusterPartitionTrackerFactory, BlocklistHandler.Factory blocklistHandlerFactory, JobLeaderIdService jobLeaderIdService, ClusterInformation clusterInformation, org.apache.flink.runtime.rpc.FatalErrorHandler fatalErrorHandler, ResourceManagerMetricGroup resourceManagerMetricGroup, ThresholdMeter startWorkerFailureRater, java.time.Duration retryInterval, java.time.Duration workerRegistrationTimeout, java.time.Duration previousWorkerRecoverTimeout, Executor ioExecutor) |
| Modifier and Type | Method | Description |
|---|---|---|
| void | declareResourceNeeded(Collection<ResourceDeclaration> resourceDeclarations) | |
| CompletableFuture<Void> | getReadyToServeFuture() | Returns the ready-to-serve future of the resource manager. |
| protected ResourceAllocator | getResourceAllocator() | |
| protected Optional<WorkerType> | getWorkerNodeIfAcceptRegistration(ResourceID resourceID) | Returns the worker node if the worker resource is accepted. |
| protected void | initialize() | Initializes the framework-specific components. |
| protected void | internalDeregisterApplication(ApplicationStatus finalStatus, String optionalDiagnostics) | The framework-specific code to deregister the application. |
| void | onError(Throwable exception) | Notifies that an error has occurred and the process cannot proceed. |
| void | onPreviousAttemptWorkersRecovered(Collection<WorkerType> recoveredWorkers) | Notifies that workers of the previous attempt have been recovered from the external resource manager. |
| protected void | onWorkerRegistered(WorkerType worker, WorkerResourceSpec workerResourceSpec) | |
| void | onWorkerTerminated(ResourceID resourceId, String diagnostics) | Notifies that the worker has been terminated. |
| protected void | registerMetrics() | |
| void | requestNewWorker(WorkerResourceSpec workerResourceSpec) | Allocates a resource using the worker resource specification. |
| protected void | terminate() | Terminates the framework-specific components. |
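This page does not spell out how getReadyToServeFuture(), onPreviousAttemptWorkersRecovered(...), and the previousWorkerRecoverTimeout constructor argument relate to each other. One plausible reading, sketched below purely as an assumption, is that the future completes once every recovered worker has registered again, or once the recovery timeout fires, whichever comes first. `ReadyToServeSketch` and `onRecoveryTimeout` are hypothetical names, worker ids are plain strings instead of ResourceID, and the class is assumed to be called from a single thread.

```java
import java.util.Collection;
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.CompletableFuture;

/** Hypothetical sketch of a ready-to-serve future; NOT Flink's implementation. */
final class ReadyToServeSketch {
    private final CompletableFuture<Void> readyToServe = new CompletableFuture<>();
    private final Set<String> awaitingRegistration = new HashSet<>();
    private boolean recoveryReported = false;

    CompletableFuture<Void> getReadyToServeFuture() {
        return readyToServe;
    }

    /** Records which previous-attempt workers are expected to register again. */
    void onPreviousAttemptWorkersRecovered(Collection<String> recoveredWorkerIds) {
        recoveryReported = true;
        awaitingRegistration.addAll(recoveredWorkerIds);
        completeIfNothingPending();
    }

    /** Marks one worker as registered; ids that were never recovered are ignored. */
    void onWorkerRegistered(String workerId) {
        awaitingRegistration.remove(workerId);
        completeIfNothingPending();
    }

    /** Assumed to be invoked by a timer once the recovery timeout has elapsed. */
    void onRecoveryTimeout() {
        readyToServe.complete(null);
    }

    private void completeIfNothingPending() {
        // Only complete after recovery was reported, so an early registration
        // cannot mark the manager as ready prematurely.
        if (recoveryReported && awaitingRegistration.isEmpty()) {
            readyToServe.complete(null);
        }
    }
}
```

A caller would typically chain work onto the returned future, for example with `thenRun`, instead of polling `isDone()`.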
Methods inherited from class ResourceManager: closeJobManagerConnection, closeTaskManagerConnection, declareRequiredResources, deregisterApplication, disconnectJobManager, disconnectTaskManager, getClusterPartitionsShuffleDescriptors, getInstanceIdByResourceId, getNumberOfRegisteredTaskManagers, getStartedFuture, getWorkerByInstanceId, heartbeatFromJobManager, heartbeatFromTaskManager, jobLeaderLostLeadership, listDataSets, notifyNewBlockedNodes, notifySlotAvailable, onFatalError, onNewTokensObtained, onStart, onStop, registerJobMaster, registerTaskExecutor, releaseClusterPartitions, removeJob, reportClusterPartitions, requestResourceOverview, requestTaskExecutorThreadInfoGateway, requestTaskManagerDetailsInfo, requestTaskManagerFileUploadByName, requestTaskManagerFileUploadByType, requestTaskManagerInfo, requestTaskManagerLogList, requestTaskManagerMetricQueryServiceAddresses, requestThreadDump, sendSlotReport, setFailUnfulfillableRequest, stopWorkerIfSupported

Methods inherited from class RpcEndpoint: callAsync, closeAsync, getAddress, getEndpointId, getHostname, getMainThreadExecutor, getRpcService, getSelfGateway, getTerminationFuture, internalCallOnStart, internalCallOnStop, isRunning, registerResource, runAsync, scheduleRunAsync, scheduleRunAsync, start, stop, unregisterResource, validateRunsInMainThread

public ActiveResourceManager(ResourceManagerDriver<WorkerType> resourceManagerDriver,
org.apache.flink.configuration.Configuration flinkConfig,
org.apache.flink.runtime.rpc.RpcService rpcService,
UUID leaderSessionId,
ResourceID resourceId,
HeartbeatServices heartbeatServices,
DelegationTokenManager delegationTokenManager,
SlotManager slotManager,
ResourceManagerPartitionTrackerFactory clusterPartitionTrackerFactory,
BlocklistHandler.Factory blocklistHandlerFactory,
JobLeaderIdService jobLeaderIdService,
ClusterInformation clusterInformation,
org.apache.flink.runtime.rpc.FatalErrorHandler fatalErrorHandler,
ResourceManagerMetricGroup resourceManagerMetricGroup,
ThresholdMeter startWorkerFailureRater,
java.time.Duration retryInterval,
java.time.Duration workerRegistrationTimeout,
java.time.Duration previousWorkerRecoverTimeout,
Executor ioExecutor)

protected void initialize() throws ResourceManagerException
Description copied from class: ResourceManager
Initializes the framework-specific components.
initialize in class ResourceManager<WorkerType extends ResourceIDRetrievable>
Throws: ResourceManagerException - which occurs during initialization and causes the resource manager to fail.

protected void terminate() throws ResourceManagerException
Description copied from class: ResourceManager
Terminates the framework-specific components.
terminate in class ResourceManager<WorkerType extends ResourceIDRetrievable>
Throws: ResourceManagerException

protected void internalDeregisterApplication(ApplicationStatus finalStatus, @Nullable String optionalDiagnostics) throws ResourceManagerException
Description copied from class: ResourceManager
The framework-specific code to deregister the application. This method also needs to make sure all pending containers that are not registered yet are returned.
internalDeregisterApplication in class ResourceManager<WorkerType extends ResourceIDRetrievable>
Parameters: finalStatus - The application status to report.
optionalDiagnostics - A diagnostics message or null.
Throws: ResourceManagerException - if the application could not be shut down.

protected Optional<WorkerType> getWorkerNodeIfAcceptRegistration(ResourceID resourceID)
Description copied from class: ResourceManager
Returns the worker node if the worker resource is accepted.
getWorkerNodeIfAcceptRegistration in class ResourceManager<WorkerType extends ResourceIDRetrievable>
Parameters: resourceID - The worker resource id.

@VisibleForTesting public void declareResourceNeeded(Collection<ResourceDeclaration> resourceDeclarations)

protected void onWorkerRegistered(WorkerType worker, WorkerResourceSpec workerResourceSpec)
onWorkerRegistered in class ResourceManager<WorkerType extends ResourceIDRetrievable>

protected void registerMetrics()
registerMetrics in class ResourceManager<WorkerType extends ResourceIDRetrievable>

public void onPreviousAttemptWorkersRecovered(Collection<WorkerType> recoveredWorkers)
Description copied from interface: ResourceEventHandler
Notifies that workers of the previous attempt have been recovered from the external resource manager.
onPreviousAttemptWorkersRecovered in interface ResourceEventHandler<WorkerType extends ResourceIDRetrievable>
Parameters: recoveredWorkers - Collection of worker nodes, in the deployment-specific type.

public void onWorkerTerminated(ResourceID resourceId, String diagnostics)
Description copied from interface: ResourceEventHandler
Notifies that the worker has been terminated.
onWorkerTerminated in interface ResourceEventHandler<WorkerType extends ResourceIDRetrievable>
Parameters: resourceId - Identifier of the terminated worker.
diagnostics - Diagnostic message about the worker termination.

public void onError(Throwable exception)
Description copied from interface: ResourceEventHandler
Notifies that an error has occurred and the process cannot proceed.
onError in interface ResourceEventHandler<WorkerType extends ResourceIDRetrievable>
Parameters: exception - Exception that describes the error.

@VisibleForTesting public void requestNewWorker(WorkerResourceSpec workerResourceSpec)
Allocates a resource using the worker resource specification.
Parameters: workerResourceSpec - specifies the size of the resource to be allocated.

public CompletableFuture<Void> getReadyToServeFuture()
Description copied from class: ResourceManager
Returns the ready-to-serve future of the resource manager.
getReadyToServeFuture in class ResourceManager<WorkerType extends ResourceIDRetrievable>

protected ResourceAllocator getResourceAllocator()
getResourceAllocator in class ResourceManager<WorkerType extends ResourceIDRetrievable>

Copyright © 2014–2023 The Apache Software Foundation. All rights reserved.