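1. Overview of the class-loading mechanism
Java's default class-loading mechanism is the parent-delegation model.
Flink addresses conflicts between user classes and framework classes with a child-first class-loading mode. To some extent, this reduces the incompatibilities that arise when a framework upgrade changes the version of a dependency that the user application also depends on. It cannot eliminate the problem entirely: replacing such classes with the user's version can still cause the Flink framework to fail at runtime with errors such as NoSuchMethodError.
For certain core classes, however, Flink still loads from the framework-provided environment first.
To support this behavior, Flink provides two class-loading configuration options: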
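- classloader.resolve-order: accepts two values, child-first and parent-first. The former is the mode described above, which prefers user classes; the latter is Java's default parent-delegation mode. Flink uses child-first by default.
- classloader.parent-first-patterns.default: defines a list of patterns that match critical classes. Classes matching one of these patterns are always loaded from the Flink environment first rather than from the user classes. For example, flink-connector-kafka_2.11-1.12.2.jar must be placed in Flink's lib directory; simply adding it through an extra classpath entry is not enough.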
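How the two options interact can be illustrated with a minimal sketch. The class ResolveOrderSketch and its method firstLookup are hypothetical names introduced only for illustration and are not part of Flink; the sketch assumes that patterns are plain prefixes of the fully qualified class name, as the option description quoted later in this article states.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper, not part of Flink: predicts which side is consulted first. */
class ResolveOrderSketch {

    /** Returns "parent" or "child" depending on where the lookup would start. */
    static String firstLookup(String resolveOrder, List<String> parentFirstPatterns, String className) {
        // With parent-first, every class goes through normal parent delegation.
        if ("parent-first".equals(resolveOrder)) {
            return "parent";
        }
        // With child-first, classes matching a parent-first prefix still start at the parent.
        for (String pattern : parentFirstPatterns) {
            if (className.startsWith(pattern)) {
                return "parent";
            }
        }
        // Everything else is looked up in the user code first.
        return "child";
    }

    public static void main(String[] args) {
        List<String> patterns = Arrays.asList("java.", "org.apache.flink.");
        System.out.println(firstLookup("child-first", patterns, "org.apache.flink.api.common.JobID")); // parent
        System.out.println(firstLookup("child-first", patterns, "com.example.MyFunction"));            // child
        System.out.println(firstLookup("parent-first", patterns, "com.example.MyFunction"));           // parent
    }
}
```

The class name com.example.MyFunction is a made-up placeholder for a user class.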
2. Source code walkthrough
The entry point for creating the class loader is the buildUserCodeClassLoader method of ClientUtils:
public static URLClassLoader buildUserCodeClassLoader(
        List<URL> jars, List<URL> classpaths, ClassLoader parent, Configuration configuration) {
    // Merge the job jars and the extra classpath entries into a single URL array
    URL[] urls = new URL[jars.size() + classpaths.size()];
    for (int i = 0; i < jars.size(); i++) {
        urls[i] = jars.get(i);
    }
    for (int i = 0; i < classpaths.size(); i++) {
        urls[i + jars.size()] = classpaths.get(i);
    }
    // 1. Read alwaysParentFirstLoaderPatterns from the configuration; classes matching
    //    these patterns are always loaded from the Flink environment first
    final String[] alwaysParentFirstLoaderPatterns =
            CoreOptions.getParentFirstLoaderPatterns(configuration);
    // 2. Read the class resolution order from the configuration
    final String classLoaderResolveOrder =
            configuration.getString(CoreOptions.CLASSLOADER_RESOLVE_ORDER);
    FlinkUserCodeClassLoaders.ResolveOrder resolveOrder =
            FlinkUserCodeClassLoaders.ResolveOrder.fromString(classLoaderResolveOrder);
    final boolean checkClassloaderLeak =
            configuration.getBoolean(CoreOptions.CHECK_LEAKED_CLASSLOADER);
    return FlinkUserCodeClassLoaders.create(
            resolveOrder,
            urls,
            parent,
            alwaysParentFirstLoaderPatterns,
            NOOP_EXCEPTION_HANDLER,
            checkClassloaderLeak);
}
Take a look at CoreOptions.getParentFirstLoaderPatterns(configuration). Like the read of classLoaderResolveOrder, it calls Configuration.getString():
public static String[] getParentFirstLoaderPatterns(Configuration config) {
    String base = config.getString(ALWAYS_PARENT_FIRST_LOADER_PATTERNS);
    String append = config.getString(ALWAYS_PARENT_FIRST_LOADER_PATTERNS_ADDITIONAL);
    return parseParentFirstLoaderPatterns(base, append);
}
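The body of parseParentFirstLoaderPatterns is not shown here. As a rough sketch of what merging the two semicolon-separated lists could look like, the following uses a hypothetical class PatternMergeSketch; dropping empty entries and trimming whitespace are assumptions of this sketch, not confirmed behavior of Flink.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch, not Flink's implementation: merges two ';'-separated pattern lists. */
class PatternMergeSketch {

    static String[] mergePatterns(String base, String additional) {
        List<String> result = new ArrayList<>();
        for (String source : new String[] {base, additional}) {
            if (source == null) {
                continue;
            }
            for (String part : source.split(";")) {
                // Assumption of this sketch: surrounding whitespace and empty entries are ignored.
                String trimmed = part.trim();
                if (!trimmed.isEmpty()) {
                    result.add(trimmed);
                }
            }
        }
        return result.toArray(new String[0]);
    }

    public static void main(String[] args) {
        // "com.example.shared." is a made-up additional pattern.
        String[] merged = mergePatterns("java.;scala.;org.apache.flink.", "com.example.shared.");
        for (String pattern : merged) {
            System.out.println(pattern);
        }
    }
}
```

With an empty additional value (the default shown in 2.3), the result is simply the list of default patterns.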
Let's look at the declarations of the three ConfigOptions in turn:
2.1 CLASSLOADER_RESOLVE_ORDER
public static final ConfigOption<String> CLASSLOADER_RESOLVE_ORDER =
        ConfigOptions.key("classloader.resolve-order")
                .defaultValue("child-first")
                .withDescription(
                        "Defines the class resolution strategy when loading classes from user code, meaning whether to "
                                + "first check the user code jar (\"child-first\") or the application classpath (\"parent-first\"). "
                                + "The default settings indicate to load classes first from the user code jar, which means that user code "
                                + "jars can include and load different dependencies than Flink uses (transitively).");
As shown, the default value is set to child-first.
2.2 ALWAYS_PARENT_FIRST_LOADER_PATTERNS
public static final ConfigOption<String> ALWAYS_PARENT_FIRST_LOADER_PATTERNS =
        ConfigOptions.key("classloader.parent-first-patterns.default")
                .defaultValue("java.;scala.;org.apache.flink.;com.esotericsoftware.kryo;org.apache.hadoop.;javax.annotation.;org.slf4j;org.apache.log4j;org.apache.logging;org.apache.commons.logging;ch.qos.logback;org.xml;javax.xml;org.apache.xerces;org.w3c")
                .withDeprecatedKeys("classloader.parent-first-patterns")
                .withDescription(
                        "A (semicolon-separated) list of patterns that specifies which classes should always be "
                                + "resolved through the parent ClassLoader first. A pattern is a simple prefix that is checked against "
                                + "the fully qualified class name. This setting should generally not be modified. To add another pattern we "
                                + "recommend to use \"classloader.parent-first-patterns.additional\" instead.");
The default value here is
"java.;scala.;org.apache.flink.;com.esotericsoftware.kryo;org.apache.hadoop.;javax.annotation.;org.slf4j;org.apache.log4j;org.apache.logging;org.apache.commons.logging;ch.qos.logback;org.xml;javax.xml;org.apache.xerces;org.w3c"
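The values are separated by ;, meaning that classes whose fully qualified names start with java., org.apache.flink., and so on are always loaded from the Flink environment first.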
2.3 ALWAYS_PARENT_FIRST_LOADER_PATTERNS_ADDITIONAL
public static final ConfigOption<String> ALWAYS_PARENT_FIRST_LOADER_PATTERNS_ADDITIONAL =
        ConfigOptions.key("classloader.parent-first-patterns.additional")
                .defaultValue("")
                .withDescription(
                        "A (semicolon-separated) list of patterns that specifies which classes should always be "
                                + "resolved through the parent ClassLoader first. A pattern is a simple prefix that is checked against "
                                + "the fully qualified class name. These patterns are appended to \""
                                + ALWAYS_PARENT_FIRST_LOADER_PATTERNS.key()
                                + "\".");
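This is similar to 2.2: it can be understood as the user-defined set of classes that should also be loaded from the Flink environment first.
2.4 Creating the user-code class loader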
public static URLClassLoader create(
        ResolveOrder resolveOrder,
        URL[] urls,
        ClassLoader parent,
        String[] alwaysParentFirstPatterns,
        Consumer<Throwable> classLoadingExceptionHandler,
        boolean checkClassLoaderLeak) {
    // Create a different class loader depending on resolveOrder
    switch (resolveOrder) {
        case CHILD_FIRST:
            return childFirst(
                    urls,
                    parent,
                    alwaysParentFirstPatterns,
                    classLoadingExceptionHandler,
                    checkClassLoaderLeak);
        case PARENT_FIRST:
            return parentFirst(urls, parent, classLoadingExceptionHandler, checkClassLoaderLeak);
        default:
            throw new IllegalArgumentException("Unknown class resolution order: " + resolveOrder);
    }
}
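The switch above operates on an enum value that buildUserCodeClassLoader obtains through ResolveOrder.fromString. The source of fromString is not shown in this article, so the following is only a sketch of how such a string-to-enum mapping might look. The class ResolveOrderParseSketch is a hypothetical name, and the case-insensitive comparison and the exception message are assumptions of this sketch.

```java
/** Hypothetical sketch of mapping the configured string to an enum; not Flink's code. */
class ResolveOrderParseSketch {

    enum Order {
        CHILD_FIRST,
        PARENT_FIRST
    }

    static Order fromString(String value) {
        // Assumption of this sketch: the comparison ignores case.
        if ("parent-first".equalsIgnoreCase(value)) {
            return Order.PARENT_FIRST;
        } else if ("child-first".equalsIgnoreCase(value)) {
            return Order.CHILD_FIRST;
        }
        throw new IllegalArgumentException("Unknown resolve order: " + value);
    }

    public static void main(String[] args) {
        System.out.println(fromString("child-first"));  // CHILD_FIRST
        System.out.println(fromString("parent-first")); // PARENT_FIRST
    }
}
```

Failing fast on an unknown value means a typo in the configuration surfaces when the class loader is built rather than as a confusing class-loading error later.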
2.5 ChildFirstClassLoader
The parent-first class loader is simply the native parent-delegation model, so we only need to look at the loading logic of ChildFirstClassLoader:
protected Class<?> loadClassWithoutExceptionHandling(String name, boolean resolve)
        throws ClassNotFoundException {
    // First, check if the class has already been loaded
    Class<?> c = findLoadedClass(name);
    if (c == null) {
        // check whether the class should go parent-first
        for (String alwaysParentFirstPattern : alwaysParentFirstPatterns) {
            if (name.startsWith(alwaysParentFirstPattern)) {
                return super.loadClassWithoutExceptionHandling(name, resolve);
            }
        }
        try {
            // check the URLs
            c = findClass(name);
        } catch (ClassNotFoundException e) {
            // let URLClassLoader do it, which will eventually call the parent
            c = super.loadClassWithoutExceptionHandling(name, resolve);
        }
    } else if (resolve) {
        resolveClass(c);
    }
    return c;
}
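- First, check whether the class has already been loaded;
- If it has not been loaded, try to match its name against alwaysParentFirstPatterns;
- Classes that match are loaded through the parent class's logic, which still follows the parent-delegation model;
- Classes that do not match are loaded by this class loader itself from the user-code URLs, falling back to the parent-delegation path only if they are not found there.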
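The steps above can be reproduced with the JDK alone. The following is a minimal sketch that applies the same order of checks on top of a plain URLClassLoader. The class ChildFirstSketchLoader is a hypothetical name and not Flink's ChildFirstClassLoader: it overrides loadClass directly instead of loadClassWithoutExceptionHandling, and it omits Flink's exception handler and leak check.

```java
import java.net.URL;
import java.net.URLClassLoader;

/** Hypothetical child-first loader built on the JDK only; a sketch, not Flink's class. */
class ChildFirstSketchLoader extends URLClassLoader {

    private final String[] parentFirstPrefixes;

    ChildFirstSketchLoader(URL[] urls, ClassLoader parent, String[] parentFirstPrefixes) {
        super(urls, parent);
        this.parentFirstPrefixes = parentFirstPrefixes;
    }

    @Override
    protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
        synchronized (getClassLoadingLock(name)) {
            // 1. Reuse the class if this loader has already loaded it.
            Class<?> c = findLoadedClass(name);
            if (c == null) {
                // 2. Names matching a parent-first prefix use normal parent delegation.
                for (String prefix : parentFirstPrefixes) {
                    if (name.startsWith(prefix)) {
                        return super.loadClass(name, resolve);
                    }
                }
                try {
                    // 3. Otherwise look in this loader's own URLs first.
                    c = findClass(name);
                } catch (ClassNotFoundException e) {
                    // 4. Not found locally: fall back to parent delegation.
                    c = super.loadClass(name, resolve);
                }
            }
            if (resolve) {
                resolveClass(c);
            }
            return c;
        }
    }

    public static void main(String[] args) throws Exception {
        // No URLs are registered here, so every lookup ends up at the parent.
        try (ChildFirstSketchLoader loader = new ChildFirstSketchLoader(
                new URL[0], ClassLoader.getSystemClassLoader(), new String[] {"java."})) {
            System.out.println(loader.loadClass("java.lang.String") == String.class);
        }
    }
}
```

In real use the URL array would point at the user jars, so a class present both there and in the parent would be taken from the user jar unless its name matches one of the prefixes.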