public class ZooKeeperHaServices extends AbstractHaServices
An implementation of AbstractHaServices using Apache ZooKeeper. The services store
data in ZooKeeper's nodes as illustrated by the following tree structure:
    /flink
         +/cluster_id_1/leader/resource_manager/latch
         |            |                        /connection_info
         |            |       /dispatcher/latch
         |            |                  /connection_info
         |            |       /rest_server/latch
         |            |                   /connection_info
         |            |
         |            |
         |            +jobgraphs/job-id-1
         |            |         /job-id-2
         |            +jobs/job-id-1/leader/latch
         |                 |               /connection_info
         |                 |        /checkpoints/latest
         |                 |                    /latest-1
         |                 |                    /latest-2
         |                 |       /checkpoint_id_counter
         |
         +/cluster_id_2/leader/resource_manager/latch
         |            |                        /connection_info
         |            |       /dispatcher/latch
         |            |                  /connection_info
         |            |       /rest_server/latch
         |            |                   /connection_info
         |            |
         |            +jobgraphs/job-id-2
         |            +jobs/job-id-2/leader/latch
         |                 |               /connection_info
         |                 |        /checkpoints/latest
         |                 |                    /latest-1
         |                 |                    /latest-2
         |                 |       /checkpoint_id_counter
The root path "/flink" is configurable via the option HighAvailabilityOptions.HA_ZOOKEEPER_ROOT. This ensures that Flink stores its data under a specific
subtree in ZooKeeper, for example to accommodate specific permissions.
The "cluster_id" part identifies the data stored for a specific Flink "cluster". This "cluster" can be either a standalone or containerized Flink cluster, or it can be a job on a framework like YARN (in a "per-job-cluster" mode).
In case of a "per-job-cluster" on YARN, the cluster-id is generated and configured automatically by the client or dispatcher that submits the Job to YARN.
In the case of a standalone cluster, that cluster-id needs to be configured via HighAvailabilityOptions.HA_CLUSTER_ID. All nodes with the same cluster id will join the same
cluster and participate in the execution of the same set of jobs.
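
To make the layout above concrete, the following sketch assembles a few of the paths shown in the tree from a root path, a cluster-id, and a job id. It is an illustration only: the class `HaPathSketch` and all of its methods are hypothetical and are not part of Flink, and it uses a plain `Map` rather than Flink's `Configuration` class. The two key strings are the configuration keys behind `HighAvailabilityOptions.HA_ZOOKEEPER_ROOT` and `HighAvailabilityOptions.HA_CLUSTER_ID`.

```java
import java.util.HashMap;
import java.util.Map;

public class HaPathSketch {
    // Configuration keys behind HA_ZOOKEEPER_ROOT and HA_CLUSTER_ID.
    static final String ROOT_KEY = "high-availability.zookeeper.path.root";
    static final String CLUSTER_ID_KEY = "high-availability.cluster-id";

    /** Hypothetical helper: joins segments with single slashes, ignoring stray leading/trailing ones. */
    static String join(String... segments) {
        StringBuilder sb = new StringBuilder();
        for (String segment : segments) {
            String trimmed = segment.replaceAll("^/+|/+$", "");
            if (!trimmed.isEmpty()) {
                sb.append('/').append(trimmed);
            }
        }
        return sb.length() == 0 ? "/" : sb.toString();
    }

    /** Subtree that holds all data of one Flink "cluster". */
    static String clusterPath(Map<String, String> conf) {
        String root = conf.getOrDefault(ROOT_KEY, "/flink");
        String clusterId = conf.getOrDefault(CLUSTER_ID_KEY, "/default");
        return join(root, clusterId);
    }

    static String resourceManagerLatch(Map<String, String> conf) {
        return join(clusterPath(conf), "leader", "resource_manager", "latch");
    }

    static String jobGraph(Map<String, String> conf, String jobId) {
        return join(clusterPath(conf), "jobgraphs", jobId);
    }

    static String jobLeaderLatch(Map<String, String> conf, String jobId) {
        return join(clusterPath(conf), "jobs", jobId, "leader", "latch");
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put(ROOT_KEY, "/flink");
        conf.put(CLUSTER_ID_KEY, "cluster_id_1");
        // /flink/cluster_id_1/leader/resource_manager/latch
        System.out.println(resourceManagerLatch(conf));
        // /flink/cluster_id_1/jobs/job-id-1/leader/latch
        System.out.println(jobLeaderLatch(conf, "job-id-1"));
    }
}
```

Two processes that resolve the same root and cluster-id end up under the same subtree, which is why nodes configured with the same cluster id join the same cluster.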
Fields inherited from class AbstractHaServices: configuration, ioExecutor, logger
Fields inherited from interface HighAvailabilityServices: DEFAULT_JOB_ID, DEFAULT_LEADER_ID

| Constructor and Description |
|---|
| ZooKeeperHaServices(CuratorFrameworkWithUnhandledErrorListener curatorFrameworkWrapper, Executor executor, org.apache.flink.configuration.Configuration configuration, BlobStoreService blobStoreService) |

| Modifier and Type | Method and Description |
|---|---|
| CheckpointRecoveryFactory | createCheckpointRecoveryFactory(): Create the checkpoint recovery factory for the job manager. |
| JobGraphStore | createJobGraphStore(): Create the submitted job graph store for the job manager. |
| protected LeaderElectionService | createLeaderElectionService(String leaderPath): Create leader election service with specified leaderName. |
| protected LeaderRetrievalService | createLeaderRetrievalService(String leaderPath): Create leader retrieval service with specified leaderName. |
| RunningJobsRegistry | createRunningJobsRegistry(): Create the registry that holds information about whether jobs are currently running. |
| protected String | getLeaderPathForDispatcher(): Get the leader path for Dispatcher. |
| String | getLeaderPathForJobManager(org.apache.flink.api.common.JobID jobID): Get the leader path for specific JobManager. |
| protected String | getLeaderPathForResourceManager(): Get the leader path for ResourceManager. |
| protected String | getLeaderPathForRestServer(): Get the leader path for RestServer. |
| void | internalCleanup(): Clean up the meta data in the distributed system (e.g. ...) |
| void | internalCleanupJobData(org.apache.flink.api.common.JobID jobID): Clean up the meta data in the distributed system (e.g. ...) |
| void | internalClose(): Closes the components which are used for external operations (e.g. ...) |
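
The documentation for `internalCleanup` states that an exception during internal cleanup does not stop `AbstractHaServices.closeAndCleanupAllData()`: the remaining cleanup steps are still attempted, and exceptions are reported only afterwards. A minimal sketch of that "attempt everything, then report" pattern in plain Java follows. The names `CleanupSketch`, `CleanupStep`, and `runAll` are hypothetical, and this is not Flink's actual implementation; it only illustrates the idea using suppressed exceptions.

```java
import java.util.ArrayList;
import java.util.List;

public class CleanupSketch {
    /** Hypothetical type: a cleanup action that may fail. */
    interface CleanupStep {
        void run() throws Exception;
    }

    /**
     * Runs every step even if earlier ones fail. The first failure is
     * rethrown at the end; later failures are attached to it as suppressed.
     */
    static void runAll(List<CleanupStep> steps) throws Exception {
        Exception first = null;
        for (CleanupStep step : steps) {
            try {
                step.run();
            } catch (Exception e) {
                if (first == null) {
                    first = e;
                } else {
                    first.addSuppressed(e);
                }
            }
        }
        if (first != null) {
            throw first;
        }
    }

    public static void main(String[] args) {
        List<String> attempted = new ArrayList<>();
        List<CleanupStep> steps = new ArrayList<>();
        // Stand-ins for real cleanup work; the step names are illustrative only.
        steps.add(() -> {
            attempted.add("metadata");
            throw new IllegalStateException("metadata cleanup failed");
        });
        steps.add(() -> attempted.add("blobs"));
        steps.add(() -> {
            attempted.add("close");
            throw new IllegalStateException("close failed");
        });
        try {
            runAll(steps);
        } catch (Exception e) {
            System.out.println("attempted: " + attempted);
            System.out.println("reported: " + e.getMessage()
                    + " (+" + e.getSuppressed().length + " suppressed)");
        }
    }
}
```

The design choice here is that a failing step never hides the outcome of the steps after it, while the caller still receives a single exception that carries every failure.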

Methods inherited from class AbstractHaServices: cleanupJobData, close, closeAndCleanupAllData, createBlobStore, getCheckpointRecoveryFactory, getClusterRestEndpointLeaderElectionService, getClusterRestEndpointLeaderRetriever, getDispatcherLeaderElectionService, getDispatcherLeaderRetriever, getJobGraphStore, getJobManagerLeaderElectionService, getJobManagerLeaderRetriever, getJobManagerLeaderRetriever, getResourceManagerLeaderElectionService, getResourceManagerLeaderRetriever, getRunningJobsRegistry
Methods inherited from class Object: clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface HighAvailabilityServices: getWebMonitorLeaderElectionService, getWebMonitorLeaderRetriever

public ZooKeeperHaServices(CuratorFrameworkWithUnhandledErrorListener curatorFrameworkWrapper, Executor executor, org.apache.flink.configuration.Configuration configuration, BlobStoreService blobStoreService)

public CheckpointRecoveryFactory createCheckpointRecoveryFactory() throws Exception
Description copied from class: AbstractHaServices
Specified by: createCheckpointRecoveryFactory in class AbstractHaServices
Throws: Exception

public JobGraphStore createJobGraphStore() throws Exception
Description copied from class: AbstractHaServices
Specified by: createJobGraphStore in class AbstractHaServices
Throws: Exception - if the submitted job graph store could not be created

public RunningJobsRegistry createRunningJobsRegistry()
Description copied from class: AbstractHaServices
Specified by: createRunningJobsRegistry in class AbstractHaServices

protected LeaderElectionService createLeaderElectionService(String leaderPath)
Description copied from class: AbstractHaServices
Specified by: createLeaderElectionService in class AbstractHaServices
Parameters: leaderPath - ConfigMap name in Kubernetes or child node path in ZooKeeper.

protected LeaderRetrievalService createLeaderRetrievalService(String leaderPath)
Description copied from class: AbstractHaServices
Specified by: createLeaderRetrievalService in class AbstractHaServices
Parameters: leaderPath - ConfigMap name in Kubernetes or child node path in ZooKeeper.

public void internalClose()
Description copied from class: AbstractHaServices
Specified by: internalClose in class AbstractHaServices

public void internalCleanup() throws Exception
Description copied from class: AbstractHaServices
If an exception occurs during internal cleanup, we will continue the cleanup in AbstractHaServices.closeAndCleanupAllData() and report exceptions only after all cleanup steps have been attempted.
Specified by: internalCleanup in class AbstractHaServices
Throws: Exception - when the cleanup operation on external storage fails.

public void internalCleanupJobData(org.apache.flink.api.common.JobID jobID) throws Exception
Description copied from class: AbstractHaServices
Specified by: internalCleanupJobData in class AbstractHaServices
Parameters: jobID - The identifier of the job to clean up.
Throws: Exception - when the cleanup operation on external storage fails.

protected String getLeaderPathForResourceManager()
Description copied from class: AbstractHaServices
Specified by: getLeaderPathForResourceManager in class AbstractHaServices

protected String getLeaderPathForDispatcher()
Description copied from class: AbstractHaServices
Specified by: getLeaderPathForDispatcher in class AbstractHaServices

public String getLeaderPathForJobManager(org.apache.flink.api.common.JobID jobID)
Description copied from class: AbstractHaServices
Specified by: getLeaderPathForJobManager in class AbstractHaServices
Parameters: jobID - job id

protected String getLeaderPathForRestServer()
Description copied from class: AbstractHaServices
Specified by: getLeaderPathForRestServer in class AbstractHaServices

Copyright © 2014–2022 The Apache Software Foundation. All rights reserved.