Big Data: Compiling Hudi from Source

Environment and source preparation

Getting the source with git

Open https://github.com/apache/hudi and clone the repository:

git clone https://github.com/apache/hudi.git

Building the code

A Maven settings configuration that works (a few packages will still be missing):

<mirrors>
  <!-- mirror | Specifies a repository mirror site to use instead of a given
    repository. The repository that | this mirror serves has an ID that matches
    the mirrorOf element of this mirror. IDs are used | for inheritance and direct
    lookup purposes, and must be unique across the set of mirrors. | -->
  <mirror>
    <id>aliyunmaven</id>
    <mirrorOf>*</mirrorOf>
    <name>spring-plugin</name>
    <url>https://maven.aliyun.com/repository/spring-plugin</url>
  </mirror>

  <mirror>
    <id>central</id>
    <name>Maven Repository Switchboard</name>
    <url>https://repo1.maven.org/maven2/</url>
    <mirrorOf>central</mirrorOf>
  </mirror>
</mirrors>

Once the preparation is done, you can start the build:

mvn clean package -DskipTests

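Here are the problems I ran into during the build.
1. First, a missing package:
pentaho-aggdesigner-algorithm-5.1.5-jhyde.jar
I downloaded this one manually. Download address:
https://public.nexus.pentaho….

2. Missing io.confluent packages
Reference: https://blog.csdn.net/weixin_…
Once the missing packages are filled in, the build succeeds.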
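
A manually downloaded jar only helps if Maven can resolve it. One common way to do that is the `install:install-file` goal, which copies the jar into the local repository. The following is a minimal sketch, not necessarily what was done for this build: `install_pentaho_jar` is a hypothetical helper name, and the coordinates `org.pentaho:pentaho-aggdesigner-algorithm:5.1.5-jhyde` are an assumption that should be checked against the "Could not find artifact" line in your own build output.

```shell
# Sketch: install a manually downloaded jar into the local Maven repository (~/.m2).
# install_pentaho_jar is a hypothetical helper; the caller passes the jar path.
install_pentaho_jar() {
  jar_path="$1"
  if [ ! -f "$jar_path" ]; then
    echo "jar not found: $jar_path" >&2
    return 1
  fi
  # Assumed coordinates; confirm them against the missing-artifact message
  # printed by the failed build before running this.
  mvn install:install-file \
    -Dfile="$jar_path" \
    -DgroupId=org.pentaho \
    -DartifactId=pentaho-aggdesigner-algorithm \
    -Dversion=5.1.5-jhyde \
    -Dpackaging=jar
}
```

For example, `install_pentaho_jar "$HOME/Downloads/pentaho-aggdesigner-algorithm-5.1.5-jhyde.jar"` (the download directory is a placeholder), followed by rerunning `mvn clean package -DskipTests`.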

Launch command

/usr/local/spark/bin/spark-shell \
  --jars `ls /Users/##/SourceCode/hudi/hudi/packaging/hudi-spark-bundle/target/hudi-spark-bundle_2.11-*.*.*-SNAPSHOT.jar`
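
The backtick `ls` in the launch command expands to nothing if the build did not produce a bundle jar, and spark-shell then fails with a less obvious error. A small sketch of checking for the jar first is shown here. `find_hudi_bundle` is a hypothetical helper, and its glob is deliberately looser than the one above (any Scala version suffix, not only `2.11`), which is an assumption you may want to tighten.

```shell
# Sketch: resolve the Spark bundle jar before launching spark-shell.
# find_hudi_bundle is a hypothetical helper; pass the root of your Hudi checkout.
find_hudi_bundle() {
  hudi_root="$1"
  # Same target directory as the launch command, relative to the checkout root.
  for jar in "$hudi_root"/packaging/hudi-spark-bundle/target/hudi-spark-bundle_*-SNAPSHOT.jar; do
    if [ -f "$jar" ]; then
      echo "$jar"
      return 0
    fi
  done
  echo "no hudi-spark-bundle jar under $hudi_root; did the build finish?" >&2
  return 1
}
```

With the checkout root in a variable such as `HUDI_HOME` (a placeholder name), the launch becomes `BUNDLE=$(find_hudi_bundle "$HUDI_HOME") && /usr/local/spark/bin/spark-shell --jars "$BUNDLE"`, so spark-shell only starts when a jar was actually found.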
