聊聊flink的MemoryStateBackend

8次阅读

共计 19009 个字符,预计需要花费 48 分钟才能阅读完成。


本文主要研究一下 flink 的 MemoryStateBackend
StateBackend
flink-runtime_2.11-1.7.0-sources.jar!/org/apache/flink/runtime/state/StateBackend.java
@PublicEvolving
public interface StateBackend extends java.io.Serializable {

// ————————————————————————
// Checkpoint storage – the durable persistence of checkpoint data
// ————————————————————————

/**
* Resolves the given pointer to a checkpoint/savepoint into a checkpoint location. The location
* supports reading the checkpoint metadata, or disposing the checkpoint storage location.
*
* <p>If the state backend cannot understand the format of the pointer (for example because it
* was created by a different state backend) this method should throw an {@code IOException}.
*
* @param externalPointer The external checkpoint pointer to resolve.
* @return The checkpoint location handle.
*
* @throws IOException Thrown, if the state backend does not understand the pointer, or if
* the pointer could not be resolved due to an I/O error.
*/
CompletedCheckpointStorageLocation resolveCheckpoint(String externalPointer) throws IOException;

/**
* Creates a storage for checkpoints for the given job. The checkpoint storage is
* used to write checkpoint data and metadata.
*
* @param jobId The job to store checkpoint data for.
* @return A checkpoint storage for the given job.
*
* @throws IOException Thrown if the checkpoint storage cannot be initialized.
*/
CheckpointStorage createCheckpointStorage(JobID jobId) throws IOException;

// ————————————————————————
// Structure Backends
// ————————————————————————

/**
* Creates a new {@link AbstractKeyedStateBackend} that is responsible for holding <b>keyed state</b>
* and checkpointing it. Uses default TTL time provider.
*
* <p><i>Keyed State</i> is state where each value is bound to a key.
*
* @param <K> The type of the keys by which the state is organized.
*
* @return The Keyed State Backend for the given job, operator, and key group range.
*
* @throws Exception This method may forward all exceptions that occur while instantiating the backend.
*/
default <K> AbstractKeyedStateBackend<K> createKeyedStateBackend(
Environment env,
JobID jobID,
String operatorIdentifier,
TypeSerializer<K> keySerializer,
int numberOfKeyGroups,
KeyGroupRange keyGroupRange,
TaskKvStateRegistry kvStateRegistry) throws Exception {
return createKeyedStateBackend(
env,
jobID,
operatorIdentifier,
keySerializer,
numberOfKeyGroups,
keyGroupRange,
kvStateRegistry,
TtlTimeProvider.DEFAULT
);
}

/**
* Creates a new {@link AbstractKeyedStateBackend} that is responsible for holding <b>keyed state</b>
* and checkpointing it.
*
* <p><i>Keyed State</i> is state where each value is bound to a key.
*
* @param <K> The type of the keys by which the state is organized.
*
* @return The Keyed State Backend for the given job, operator, and key group range.
*
* @throws Exception This method may forward all exceptions that occur while instantiating the backend.
*/
default <K> AbstractKeyedStateBackend<K> createKeyedStateBackend(
Environment env,
JobID jobID,
String operatorIdentifier,
TypeSerializer<K> keySerializer,
int numberOfKeyGroups,
KeyGroupRange keyGroupRange,
TaskKvStateRegistry kvStateRegistry,
TtlTimeProvider ttlTimeProvider
) throws Exception {
return createKeyedStateBackend(
env,
jobID,
operatorIdentifier,
keySerializer,
numberOfKeyGroups,
keyGroupRange,
kvStateRegistry,
ttlTimeProvider,
new UnregisteredMetricsGroup());
}

/**
* Creates a new {@link AbstractKeyedStateBackend} that is responsible for holding <b>keyed state</b>
* and checkpointing it.
*
* <p><i>Keyed State</i> is state where each value is bound to a key.
*
* @param <K> The type of the keys by which the state is organized.
*
* @return The Keyed State Backend for the given job, operator, and key group range.
*
* @throws Exception This method may forward all exceptions that occur while instantiating the backend.
*/
<K> AbstractKeyedStateBackend<K> createKeyedStateBackend(
Environment env,
JobID jobID,
String operatorIdentifier,
TypeSerializer<K> keySerializer,
int numberOfKeyGroups,
KeyGroupRange keyGroupRange,
TaskKvStateRegistry kvStateRegistry,
TtlTimeProvider ttlTimeProvider,
MetricGroup metricGroup) throws Exception;

/**
* Creates a new {@link OperatorStateBackend} that can be used for storing operator state.
*
* <p>Operator state is state that is associated with parallel operator (or function) instances,
* rather than with keys.
*
* @param env The runtime environment of the executing task.
* @param operatorIdentifier The identifier of the operator whose state should be stored.
*
* @return The OperatorStateBackend for operator identified by the job and operator identifier.
*
* @throws Exception This method may forward all exceptions that occur while instantiating the backend.
*/
OperatorStateBackend createOperatorStateBackend(Environment env, String operatorIdentifier) throws Exception;
}

StateBackend 接口定义了有状态的 streaming 应用的 state 是如何 stored 以及 checkpointed
flink 目前内置支持 MemoryStateBackend、FsStateBackend、RocksDBStateBackend 三种,如果没有配置默认为 MemoryStateBackend;在 flink-conf.yaml 里头可以进行全局的默认配置,不过具体每个 job 还可以通过 StreamExecutionEnvironment.setStateBackend 来覆盖全局的配置
MemoryStateBackend 可以在构造器中指定大小,默认是 5MB,可以增大但是不能超过 akka frame size;FsStateBackend 模式把 TaskManager 的 state 存储在内存,但是可以把 checkpoint 的 state 存储到 filesystem 中 (比如 HDFS);RocksDBStateBackend 把 working state 存储在 RocksDB 中,checkpoint 的 state 存储在 filesystem
StateBackend 接口定义了 createCheckpointStorage、createKeyedStateBackend、createOperatorStateBackend 方法;同时继承了 Serializable 接口;StateBackend 接口的实现要求是线程安全的
StateBackend 有个直接实现的抽象类 AbstractStateBackend,而 AbstractFileStateBackend 及 RocksDBStateBackend 继承了 AbstractStateBackend,之后 MemoryStateBackend、FsStateBackend 都继承了 AbstractFileStateBackend

AbstractStateBackend
flink-runtime_2.11-1.7.0-sources.jar!/org/apache/flink/runtime/state/AbstractStateBackend.java
/**
* An abstract base implementation of the {@link StateBackend} interface.
*
* <p>This class has currently no contents and only kept to not break the prior class hierarchy for users.
*/
@PublicEvolving
public abstract class AbstractStateBackend implements StateBackend, java.io.Serializable {

private static final long serialVersionUID = 4620415814639230247L;

// ————————————————————————
// State Backend – State-Holding Backends
// ————————————————————————

@Override
public abstract <K> AbstractKeyedStateBackend<K> createKeyedStateBackend(
Environment env,
JobID jobID,
String operatorIdentifier,
TypeSerializer<K> keySerializer,
int numberOfKeyGroups,
KeyGroupRange keyGroupRange,
TaskKvStateRegistry kvStateRegistry,
TtlTimeProvider ttlTimeProvider,
MetricGroup metricGroup) throws IOException;

@Override
public abstract OperatorStateBackend createOperatorStateBackend(
Environment env,
String operatorIdentifier) throws Exception;
}
AbstractStateBackend 声明实现 StateBackend 及 Serializable 接口,这里没有新增其他内容
AbstractFileStateBackend
flink-runtime_2.11-1.7.0-sources.jar!/org/apache/flink/runtime/state/filesystem/AbstractFileStateBackend.java
@PublicEvolving
public abstract class AbstractFileStateBackend extends AbstractStateBackend {

private static final long serialVersionUID = 1L;

// ————————————————————————
// State Backend Properties
// ————————————————————————

/** The path where checkpoints will be stored, or null, if none has been configured. */
@Nullable
private final Path baseCheckpointPath;

/** The path where savepoints will be stored, or null, if none has been configured. */
@Nullable
private final Path baseSavepointPath;

//……

// ————————————————————————
// Initialization and metadata storage
// ————————————————————————

@Override
public CompletedCheckpointStorageLocation resolveCheckpoint(String pointer) throws IOException {
return AbstractFsCheckpointStorage.resolveCheckpointPointer(pointer);
}

// ————————————————————————
// Utilities
// ————————————————————————

/**
* Checks the validity of the path’s scheme and path.
*
* @param path The path to check.
* @return The URI as a Path.
*
* @throws IllegalArgumentException Thrown, if the URI misses scheme or path.
*/
private static Path validatePath(Path path) {
final URI uri = path.toUri();
final String scheme = uri.getScheme();
final String pathPart = uri.getPath();

// some validity checks
if (scheme == null) {
throw new IllegalArgumentException(“The scheme (hdfs://, file://, etc) is null. ” +
“Please specify the file system scheme explicitly in the URI.”);
}
if (pathPart == null) {
throw new IllegalArgumentException(“The path to store the checkpoint data in is null. ” +
“Please specify a directory path for the checkpoint data.”);
}
if (pathPart.length() == 0 || pathPart.equals(“/”)) {
throw new IllegalArgumentException(“Cannot use the root directory for checkpoints.”);
}

return path;
}

@Nullable
private static Path parameterOrConfigured(@Nullable Path path, Configuration config, ConfigOption<String> option) {
if (path != null) {
return path;
}
else {
String configValue = config.getString(option);
try {
return configValue == null ? null : new Path(configValue);
}
catch (IllegalArgumentException e) {
throw new IllegalConfigurationException(“Cannot parse value for ” + option.key() +
” : ” + configValue + ” . Not a valid path.”);
}
}
}
}
AbstractFileStateBackend 继承了 AbstractStateBackend,它有 baseCheckpointPath、baseSavepointPath 两个属性,允许为 null,路径的格式为 hdfs:// 或者 file:// 开头;resolveCheckpoint 方法用于解析 checkpoint 或 savepoint 的 location,这里使用 AbstractFsCheckpointStorage.resolveCheckpointPointer(pointer) 来完成
MemoryStateBackend
flink-runtime_2.11-1.7.0-sources.jar!/org/apache/flink/runtime/state/memory/MemoryStateBackend.java
@PublicEvolving
public class MemoryStateBackend extends AbstractFileStateBackend implements ConfigurableStateBackend {

private static final long serialVersionUID = 4109305377809414635L;

/** The default maximal size that the snapshotted memory state may have (5 MiBytes). */
public static final int DEFAULT_MAX_STATE_SIZE = 5 * 1024 * 1024;

/** The maximal size that the snapshotted memory state may have. */
private final int maxStateSize;

/** Switch to chose between synchronous and asynchronous snapshots.
* A value of ‘UNDEFINED’ means not yet configured, in which case the default will be used. */
private final TernaryBoolean asynchronousSnapshots;

// ————————————————————————

/**
* Creates a new memory state backend that accepts states whose serialized forms are
* up to the default state size (5 MB).
*
* <p>Checkpoint and default savepoint locations are used as specified in the
* runtime configuration.
*/
public MemoryStateBackend() {
this(null, null, DEFAULT_MAX_STATE_SIZE, TernaryBoolean.UNDEFINED);
}

/**
* Creates a new memory state backend that accepts states whose serialized forms are
* up to the default state size (5 MB). The state backend uses asynchronous snapshots
* or synchronous snapshots as configured.
*
* <p>Checkpoint and default savepoint locations are used as specified in the
* runtime configuration.
*
* @param asynchronousSnapshots Switch to enable asynchronous snapshots.
*/
public MemoryStateBackend(boolean asynchronousSnapshots) {
this(null, null, DEFAULT_MAX_STATE_SIZE, TernaryBoolean.fromBoolean(asynchronousSnapshots));
}

/**
* Creates a new memory state backend that accepts states whose serialized forms are
* up to the given number of bytes.
*
* <p>Checkpoint and default savepoint locations are used as specified in the
* runtime configuration.
*
* <p><b>WARNING:</b> Increasing the size of this value beyond the default value
* ({@value #DEFAULT_MAX_STATE_SIZE}) should be done with care.
* The checkpointed state needs to be send to the JobManager via limited size RPC messages, and there
* and the JobManager needs to be able to hold all aggregated state in its memory.
*
* @param maxStateSize The maximal size of the serialized state
*/
public MemoryStateBackend(int maxStateSize) {
this(null, null, maxStateSize, TernaryBoolean.UNDEFINED);
}

/**
* Creates a new memory state backend that accepts states whose serialized forms are
* up to the given number of bytes and that uses asynchronous snashots as configured.
*
* <p>Checkpoint and default savepoint locations are used as specified in the
* runtime configuration.
*
* <p><b>WARNING:</b> Increasing the size of this value beyond the default value
* ({@value #DEFAULT_MAX_STATE_SIZE}) should be done with care.
* The checkpointed state needs to be send to the JobManager via limited size RPC messages, and there
* and the JobManager needs to be able to hold all aggregated state in its memory.
*
* @param maxStateSize The maximal size of the serialized state
* @param asynchronousSnapshots Switch to enable asynchronous snapshots.
*/
public MemoryStateBackend(int maxStateSize, boolean asynchronousSnapshots) {
this(null, null, maxStateSize, TernaryBoolean.fromBoolean(asynchronousSnapshots));
}

/**
* Creates a new MemoryStateBackend, setting optionally the path to persist checkpoint metadata
* to, and to persist savepoints to.
*
* @param checkpointPath The path to write checkpoint metadata to. If null, the value from
* the runtime configuration will be used.
* @param savepointPath The path to write savepoints to. If null, the value from
* the runtime configuration will be used.
*/
public MemoryStateBackend(@Nullable String checkpointPath, @Nullable String savepointPath) {
this(checkpointPath, savepointPath, DEFAULT_MAX_STATE_SIZE, TernaryBoolean.UNDEFINED);
}

/**
* Creates a new MemoryStateBackend, setting optionally the paths to persist checkpoint metadata
* and savepoints to, as well as configuring state thresholds and asynchronous operations.
*
* <p><b>WARNING:</b> Increasing the size of this value beyond the default value
* ({@value #DEFAULT_MAX_STATE_SIZE}) should be done with care.
* The checkpointed state needs to be send to the JobManager via limited size RPC messages, and there
* and the JobManager needs to be able to hold all aggregated state in its memory.
*
* @param checkpointPath The path to write checkpoint metadata to. If null, the value from
* the runtime configuration will be used.
* @param savepointPath The path to write savepoints to. If null, the value from
* the runtime configuration will be used.
* @param maxStateSize The maximal size of the serialized state.
* @param asynchronousSnapshots Flag to switch between synchronous and asynchronous
* snapshot mode. If null, the value configured in the
* runtime configuration will be used.
*/
public MemoryStateBackend(
@Nullable String checkpointPath,
@Nullable String savepointPath,
int maxStateSize,
TernaryBoolean asynchronousSnapshots) {

super(checkpointPath == null ? null : new Path(checkpointPath),
savepointPath == null ? null : new Path(savepointPath));

checkArgument(maxStateSize > 0, “maxStateSize must be > 0”);
this.maxStateSize = maxStateSize;

this.asynchronousSnapshots = asynchronousSnapshots;
}

/**
* Private constructor that creates a re-configured copy of the state backend.
*
* @param original The state backend to re-configure
* @param configuration The configuration
*/
private MemoryStateBackend(MemoryStateBackend original, Configuration configuration) {
super(original.getCheckpointPath(), original.getSavepointPath(), configuration);

this.maxStateSize = original.maxStateSize;

// if asynchronous snapshots were configured, use that setting,
// else check the configuration
this.asynchronousSnapshots = original.asynchronousSnapshots.resolveUndefined(
configuration.getBoolean(CheckpointingOptions.ASYNC_SNAPSHOTS));
}

// ————————————————————————
// Properties
// ————————————————————————

/**
* Gets the maximum size that an individual state can have, as configured in the
* constructor (by default {@value #DEFAULT_MAX_STATE_SIZE}).
*
* @return The maximum size that an individual state can have
*/
public int getMaxStateSize() {
return maxStateSize;
}

/**
* Gets whether the key/value data structures are asynchronously snapshotted.
*
* <p>If not explicitly configured, this is the default value of
* {@link CheckpointingOptions#ASYNC_SNAPSHOTS}.
*/
public boolean isUsingAsynchronousSnapshots() {
return asynchronousSnapshots.getOrDefault(CheckpointingOptions.ASYNC_SNAPSHOTS.defaultValue());
}

// ————————————————————————
// Reconfiguration
// ————————————————————————

/**
* Creates a copy of this state backend that uses the values defined in the configuration
* for fields where that were not specified in this state backend.
*
* @param config the configuration
* @return The re-configured variant of the state backend
*/
@Override
public MemoryStateBackend configure(Configuration config) {
return new MemoryStateBackend(this, config);
}

// ————————————————————————
// checkpoint state persistence
// ————————————————————————

@Override
public CheckpointStorage createCheckpointStorage(JobID jobId) throws IOException {
return new MemoryBackendCheckpointStorage(jobId, getCheckpointPath(), getSavepointPath(), maxStateSize);
}

// ————————————————————————
// state holding structures
// ————————————————————————

@Override
public OperatorStateBackend createOperatorStateBackend(
Environment env,
String operatorIdentifier) throws Exception {

return new DefaultOperatorStateBackend(
env.getUserClassLoader(),
env.getExecutionConfig(),
isUsingAsynchronousSnapshots());
}

@Override
public <K> AbstractKeyedStateBackend<K> createKeyedStateBackend(
Environment env,
JobID jobID,
String operatorIdentifier,
TypeSerializer<K> keySerializer,
int numberOfKeyGroups,
KeyGroupRange keyGroupRange,
TaskKvStateRegistry kvStateRegistry,
TtlTimeProvider ttlTimeProvider,
MetricGroup metricGroup) {

TaskStateManager taskStateManager = env.getTaskStateManager();
HeapPriorityQueueSetFactory priorityQueueSetFactory =
new HeapPriorityQueueSetFactory(keyGroupRange, numberOfKeyGroups, 128);
return new HeapKeyedStateBackend<>(
kvStateRegistry,
keySerializer,
env.getUserClassLoader(),
numberOfKeyGroups,
keyGroupRange,
isUsingAsynchronousSnapshots(),
env.getExecutionConfig(),
taskStateManager.createLocalRecoveryConfig(),
priorityQueueSetFactory,
ttlTimeProvider);
}

// ————————————————————————
// utilities
// ————————————————————————

@Override
public String toString() {
return “MemoryStateBackend (data in heap memory / checkpoints to JobManager) ” +
“(checkpoints: ‘” + getCheckpointPath() +
“‘, savepoints: ‘” + getSavepointPath() +
“‘, asynchronous: ” + asynchronousSnapshots +
“, maxStateSize: ” + maxStateSize + “)”;
}
}

MemoryStateBackend 继承了 AbstractFileStateBackend,实现 ConfigurableStateBackend 接口 (configure 方法);它将 TaskManager 的 working state 及 JobManager 的 checkpoint state 存储在 JVM heap 中 (但是为了高可用,也可以设置 checkpoint state 存储到 filesystem);MemoryStateBackend 仅仅用来做实验用途,比如本地启动或者所需的 state 非常小,对于生产需要改为使用 FsStateBackend(将 TaskManager 的 working state 存储在内存,但是将 JobManager 的 checkpoint state 存储到文件系统以支持更大的 state 存储)
MemoryStateBackend 有个 maxStateSize 属性 (默认 DEFAULT_MAX_STATE_SIZE 为 5MB),每个 state 的大小不能超过 maxStateSize,一个 task 的所有 state 不能超过 RPC 系统的限制 (默认是 10MB,可以修改但不建议),所有 retained checkpoints 的 state 大小总和不能超过 JobManager 的 JVM heap 大小;另外如果创建 MemoryStateBackend 时未指定 checkpointPath 及 savepointPath,则会从 flink-conf.yaml 中读取全局默认值;MemoryStateBackend 里头还有一个 asynchronousSnapshots 属性,是 TernaryBoolean 类型 (TRUE、FALSE、UNDEFINED),其中 UNDEFINED 表示没有配置,将会使用默认值
MemoryStateBackend 的 createCheckpointStorage 创建的是 MemoryBackendCheckpointStorage;createOperatorStateBackend 方法创建的是 OperatorStateBackend;createKeyedStateBackend 方法创建的是 HeapKeyedStateBackend

小结

StateBackend 接口定义了有状态的 streaming 应用的 state 是如何 stored 以及 checkpointed;目前内置支持 MemoryStateBackend、FsStateBackend、RocksDBStateBackend 三种,如果没有配置默认为 MemoryStateBackend;在 flink-conf.yaml 里头可以进行全局的默认配置,不过具体每个 job 还可以通过 StreamExecutionEnvironment.setStateBackend 来覆盖全局的配置
StateBackend 接口定义了 createCheckpointStorage、createKeyedStateBackend、createOperatorStateBackend 方法;同时继承了 Serializable 接口;StateBackend 接口的实现要求是线程安全的;StateBackend 有个直接实现的抽象类 AbstractStateBackend,而 AbstractFileStateBackend 及 RocksDBStateBackend 继承了 AbstractStateBackend,之后 MemoryStateBackend、FsStateBackend 都继承了 AbstractFileStateBackend
MemoryStateBackend 继承了 AbstractFileStateBackend,实现 ConfigurableStateBackend 接口 (configure 方法);它将 TaskManager 的 working state 及 JobManager 的 checkpoint state 存储在 JVM heap 中;MemoryStateBackend 的 createCheckpointStorage 创建的是 MemoryBackendCheckpointStorage;createOperatorStateBackend 方法创建的是 OperatorStateBackend;createKeyedStateBackend 方法创建的是 HeapKeyedStateBackend

doc
State Backends

正文完
 0