Scheduled Baidu Page Submission for a Halo Blog


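Preface

  • After finally getting the blog up and writing a few posts I thought were interesting, nobody was reading them! The posts had never been submitted to a search engine, so almost nobody could find them. The Next theme does offer a setting for Baidu active push, but Baidu no longer provides that push service, so the theme setting is useless
  • Baidu's indexing site offers three ways to submit pages, of which API submission is the fastest, so I decided to use Java to push Halo blog posts to Baidu

    • API submission
    • sitemap submission
    • manual submission
  • Halo provides an API for fetching the post list, so the idea is simple: a Java scheduled thread pool fetches all post links from the Halo API at a fixed interval, then calls the Baidu submission API to submit them

    • Baidu restricts frequent resubmission of old links: repeatedly resubmitting old links lowers your quota and may even cost you access to API push, while regularly submitting new post links may raise the quota. So a simple cache is needed to filter out links that have already been submitted
  • Google already crawls links well from a sitemap alone, with no separate submission needed, but Google likewise recommends using the indexing API to actively submit links; see the companion post, Scheduled Google Page Submission for a Halo Blog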
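The cache described above can be as small as a set. Here is a minimal sketch, assuming a hypothetical helper class named SubmittedLinkFilter (not part of the project): Set.add returns false for an element that is already present, so it can serve directly as the filter.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

/** Hypothetical helper: remembers links that have already been handed out. */
final class SubmittedLinkFilter {
    private final Set<String> seen = new HashSet<>();

    /** Returns only the links not seen before, and remembers them. */
    List<String> onlyNew(List<String> links) {
        // Set.add returns false when the element is already in the set,
        // so it doubles as the filter predicate.
        return links.stream().filter(seen::add).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        SubmittedLinkFilter filter = new SubmittedLinkFilter();
        System.out.println(filter.onlyNew(List.of("/a", "/b"))); // first run: both are new
        System.out.println(filter.onlyNew(List.of("/b", "/c"))); // second run: only /c is new
    }
}
```

The set lives in memory only, so it starts empty again whenever the process restarts.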

Project Setup

  • Create a Gradle project with the following build file

    plugins {
        id 'java'
        id 'application'
    }

    group 'xyz.demoli'
    version '1.0-SNAPSHOT'

    // Target Java 11 (the original value 1.11 is not a conventional version number)
    sourceCompatibility = JavaVersion.VERSION_11

    mainClassName = 'xyz.demoli.Main'

    repositories {
        mavenCentral()
    }

    application {
        // Make the JVM use UTC+8 so log timestamps are correct inside a container
        applicationDefaultJvmArgs = ['-Duser.timezone=GMT+8']
    }

    dependencies {
        testImplementation 'org.junit.jupiter:junit-jupiter-api:5.8.1'
        testRuntimeOnly 'org.junit.jupiter:junit-jupiter-engine:5.8.1'
        // https://mvnrepository.com/artifact/com.squareup.okhttp3/okhttp
        implementation group: 'com.squareup.okhttp3', name: 'okhttp', version: '4.9.3'
        implementation 'com.google.code.gson:gson:2.9.0'
        // Log4j 2.14.1 is affected by CVE-2021-44228 (Log4Shell); 2.17.1 contains the fixes
        // https://mvnrepository.com/artifact/org.apache.logging.log4j/log4j-api
        implementation group: 'org.apache.logging.log4j', name: 'log4j-api', version: '2.17.1'
        // https://mvnrepository.com/artifact/org.apache.logging.log4j/log4j-core
        implementation group: 'org.apache.logging.log4j', name: 'log4j-core', version: '2.17.1'
        // https://mvnrepository.com/artifact/org.projectlombok/lombok
        compileOnly group: 'org.projectlombok', name: 'lombok', version: '1.18.22'
        annotationProcessor group: 'org.projectlombok', name: 'lombok', version: '1.18.22'
    }

    test {
        useJUnitPlatform()
    }
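    • The line annotationProcessor group: 'org.projectlombok', name: 'lombok', version: '1.18.22' ensures that Lombok annotations are processed correctly in a Gradle project
    • applicationDefaultJvmArgs is set so that, once the service runs in a container, log timestamps are printed in UTC+8 rather than in the container's default time zone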
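To check what the -Duser.timezone=GMT+8 argument does, a small sketch like the following may help. TimeZoneCheck is a hypothetical class name used only for illustration; it prints the JVM's default zone and computes the UTC offset of a given zone ID.

```java
import java.time.Instant;
import java.util.TimeZone;

/** Hypothetical helper for inspecting the JVM's time zone settings. */
final class TimeZoneCheck {
    /** Offset from UTC, in whole hours, of the given zone ID at the given instant. */
    static int offsetHours(String zoneId, Instant at) {
        return TimeZone.getTimeZone(zoneId).getOffset(at.toEpochMilli()) / 3_600_000;
    }

    public static void main(String[] args) {
        // Run once without and once with -Duser.timezone=GMT+8 to compare.
        System.out.println("Default zone: " + TimeZone.getDefault().getID());
        System.out.println("GMT+8 offset: " + offsetHours("GMT+8", Instant.now()) + "h");
    }
}
```

Running the class with and without the JVM argument shows whether the default zone actually changes in a given environment.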
  • The configuration file (config.properties, loaded from the classpath) is as follows:

    prefix=https://blog.demoli.xyz
    postAPI=%s/api/content/posts?api_access_key=%s&page=%d
    apiAccessKey=***
    baiduUrl=http://data.zz.baidu.com/urls?
    token=***
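    • apiAccessKey is the access key configured in the Halo blog settings
    • prefix is the URL of the Halo blog's home page
    • token is the submission token that the Baidu submission platform issues to each user; it is shown on the Baidu submission page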
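To make the placeholders concrete, the sketch below shows how these values end up as request URLs, following the same String.format call and string concatenation used in the classes later in this post. UrlTemplates, KEY, and TOKEN are illustrative names and values, not part of the project.

```java
/** Hypothetical illustration of how the configuration values become URLs. */
final class UrlTemplates {
    // Same templates as in the properties file; real keys and tokens are not shown.
    static final String POST_API = "%s/api/content/posts?api_access_key=%s&page=%d";
    static final String BAIDU_URL = "http://data.zz.baidu.com/urls?";

    /** URL of one page of the Halo post list. */
    static String postsUrl(String prefix, String apiAccessKey, int page) {
        return String.format(POST_API, prefix, apiAccessKey, page);
    }

    /** URL of the Baidu submission endpoint for the given site and token. */
    static String submitUrl(String site, String token) {
        return BAIDU_URL + "site=" + site + "&token=" + token;
    }

    public static void main(String[] args) {
        System.out.println(postsUrl("https://blog.demoli.xyz", "KEY", 0));
        System.out.println(submitUrl("https://blog.demoli.xyz", "TOKEN"));
    }
}
```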
  • The Log4j 2 logging configuration file is as follows:

    <?xml version="1.0" encoding="utf-8" ?>
    
    <configuration status="INFO">
        <appenders>
            <console name="console" target="SYSTEM_OUT">
                <PatternLayout pattern="%d{yyyy-MM-dd HH:mm:ss.SSS} [%t] %-5level %logger{36} - %msg%n"/>
            </console>
        </appenders>
    
        <loggers>
            <root level="INFO">
                <appender-ref ref="console"/>
            </root>
        </loggers>
    </configuration>
  • The whole project has only two core classes

    • PostScrap

      import com.google.gson.Gson;
      import com.google.gson.JsonArray;
      import com.google.gson.JsonElement;
      import com.google.gson.JsonObject;
      import java.io.IOException;
      import java.io.InputStream;
      import java.util.ArrayList;
      import java.util.HashSet;
      import java.util.List;
      import java.util.Properties;
      import java.util.Set;
      import java.util.stream.Collectors;
      import okhttp3.OkHttpClient;
      import okhttp3.Request;
      import okhttp3.Response;
      
      /**
       * Fetches post links through the Halo API
       */
      public class PostScrap {
      
          private static String postAPI;
          private static String apiAccessKey;
          private static String prefix;
          // In-memory cache of links that have already been returned for submission
          private static final Set<String> links = new HashSet<>();
      
          // Note: string values in the properties file must not be wrapped in quotes
          static {
              try (InputStream stream = PostScrap.class.getResourceAsStream("/config.properties")) {
                  Properties properties = new Properties();
                  properties.load(stream);
                  apiAccessKey = properties.getProperty("apiAccessKey");
                  prefix = properties.getProperty("prefix");
                  postAPI = properties.getProperty("postAPI");
              } catch (IOException e) {
                  e.printStackTrace();
              }
          }
      
          /**
           * Requests the post list and collects the links of all posts.
           *
           * @return full URLs of the posts that have not been returned before
           */
          public static List<String> getPosts() {
              List<String> res = new ArrayList<>();

              OkHttpClient client = new OkHttpClient();
              // Start with page 0; handlePage fetches the remaining pages
              Request initialRequest =
                  new Request.Builder().get().url(String.format(postAPI, prefix, apiAccessKey, 0))
                      .build();

              try (Response response = client.newCall(initialRequest).execute()) {
                  res = handlePage(response, client);
              } catch (IOException e) {
                  e.printStackTrace();
              }
              return res;
          }
      
          /**
           * Handles pagination: reads the first page, then requests the remaining pages.
           *
           * @param initialResponse response for page 0
           * @param client          HTTP client reused for the remaining pages
           * @return full URLs of the posts that were not already in the cache
           * @throws IOException if the body of the first response cannot be read
           */
          private static List<String> handlePage(Response initialResponse, OkHttpClient client)
              throws IOException {

              JsonObject jsonObject =
                  new Gson().fromJson(initialResponse.body().string(), JsonObject.class);
              JsonArray array = jsonObject.get("data").getAsJsonObject().get("content").getAsJsonArray();
              int pages = jsonObject.get("data").getAsJsonObject().get("pages").getAsInt();

              // Convert the JsonArray into a List of post paths
              List<String> posts = new ArrayList<>();
              for (JsonElement element : array) {
                  posts.add(element.getAsJsonObject().get("fullPath").getAsString());
              }

              // Request the remaining pages; a page that fails is logged and skipped
              for (int i = 1; i < pages; i++) {
                  Request request =
                      new Request.Builder().get().url(String.format(postAPI, prefix, apiAccessKey, i))
                          .build();
                  try (Response response = client.newCall(request).execute()) {
                      jsonObject = new Gson().fromJson(response.body().string(), JsonObject.class);
                      array = jsonObject.get("data").getAsJsonObject().get("content").getAsJsonArray();
                      for (JsonElement element : array) {
                          posts.add(element.getAsJsonObject().get("fullPath").getAsString());
                      }
                  } catch (IOException e) {
                      e.printStackTrace();
                  }
              }

              // Cache filter: Set.add returns false for links that were already cached
              return posts.stream().map(content -> prefix + content).filter(links::add)
                  .collect(Collectors.toList());
          }
      }
    • BaiduSubmitter

      import com.google.gson.Gson;
      import com.google.gson.JsonObject;
      import java.io.IOException;
      import java.io.InputStream;
      import java.util.Optional;
      import java.util.Properties;
      import lombok.extern.log4j.Log4j2;
      import okhttp3.MediaType;
      import okhttp3.OkHttpClient;
      import okhttp3.Request;
      import okhttp3.RequestBody;
      import okhttp3.Response;
      
      /**
       * Submits links to Baidu for indexing
       */
      @Log4j2
      public class BaiduSubmitter {
      
          private static String submitUrl;

          static {
              try (InputStream stream = PostScrap.class.getResourceAsStream("/config.properties")) {
                  Properties properties = new Properties();
                  properties.load(stream);
                  String baiduUrl = properties.getProperty("baiduUrl");
                  String site = properties.getProperty("prefix");
                  String token = properties.getProperty("token");
                  submitUrl = baiduUrl + "site=" + site + "&token=" + token;
              } catch (IOException e) {
                  e.printStackTrace();
              }
          }
      
          /**
           * Submits the new links, one URL per line, to the Baidu API.
           */
          public static void submit() {
              OkHttpClient client = new OkHttpClient();
              Optional<String> urlStrings =
                  PostScrap.getPosts().stream().reduce((out, url) -> out + "\n" + url);
              if (urlStrings.isEmpty()) {
                  log.info("No new posts");
                  return;
              }
              RequestBody body = RequestBody.create(urlStrings.get(), MediaType.get("text/plain"));
              Request request = new Request.Builder().post(body).url(submitUrl)
                  .header("Content-Type", "text/plain")
                  .build();
              try (Response response = client.newCall(request).execute()) {
                  JsonObject jsonObject = new Gson().fromJson(response.body().string(), JsonObject.class);
                  if (jsonObject.get("error") != null) {
                      log.error("Submission failed: {}", jsonObject.get("message").getAsString());
                      // Stop here instead of falling through to the success log below
                      return;
                  }
                  log.info("Submitted {} links successfully, remaining quota: {}; the links are:",
                      jsonObject.get("success").getAsInt(),
                      jsonObject.get("remain").getAsInt());
                  log.info(urlStrings.get());
              } catch (IOException e) {
                  e.printStackTrace();
              }
          }
      }
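One consequence of the two classes above is that PostScrap caches a link as soon as it returns it, so a link whose submission fails is not retried until the process restarts. A possible alternative, sketched here under the assumption of a hypothetical ConfirmedLinkCache class, is to record links only after the submitter reports success. The Predicate stands in for the HTTP call and is not part of the original code.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Predicate;
import java.util.stream.Collectors;

/** Hypothetical cache that records links only after a submission succeeds. */
final class ConfirmedLinkCache {
    private final Set<String> confirmed = new HashSet<>();

    /** Links not yet confirmed as submitted; does not modify the cache. */
    List<String> pending(List<String> links) {
        return links.stream().distinct()
            .filter(link -> !confirmed.contains(link))
            .collect(Collectors.toList());
    }

    /**
     * Passes the pending links to the submitter (a stand-in for the HTTP call)
     * and caches them only if the submitter reports success.
     */
    boolean submit(List<String> links, Predicate<List<String>> submitter) {
        List<String> batch = pending(links);
        if (batch.isEmpty()) {
            return true; // nothing to do
        }
        boolean ok = submitter.test(batch);
        if (ok) {
            confirmed.addAll(batch);
        }
        return ok;
    }
}
```

With this shape, a failed run leaves the cache untouched, so the same links are picked up again on the next scheduled run.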
  • Main

    import java.util.concurrent.Executors;
    import java.util.concurrent.TimeUnit;

    public class Main {
        public static void main(String[] args) {
            Executors.newScheduledThreadPool(1)
                .scheduleWithFixedDelay(BaiduSubmitter::submit, 0, 12, TimeUnit.HOURS);
        }
    }
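One detail of scheduleWithFixedDelay is worth knowing: if any run of the task throws, all later runs are suppressed. The submit method catches IOException, but an unexpected runtime exception (for example from parsing an unexpected response) would silently end the schedule. A minimal sketch of a guard follows; SafeScheduling and guarded are hypothetical names, and the lambda in main stands in for BaiduSubmitter::submit.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

/** Hypothetical wrapper that keeps a periodic task alive after a failed run. */
final class SafeScheduling {
    /** Wraps a task so that a runtime exception is reported instead of propagating. */
    static Runnable guarded(Runnable task) {
        return () -> {
            try {
                task.run();
            } catch (RuntimeException e) {
                // Swallow and report, so the executor does not cancel later runs.
                System.err.println("Scheduled task failed: " + e);
            }
        };
    }

    public static void main(String[] args) {
        ScheduledExecutorService pool = Executors.newScheduledThreadPool(1);
        // The lambda stands in for BaiduSubmitter::submit.
        pool.scheduleWithFixedDelay(
            guarded(() -> System.out.println("submitting...")), 0, 12, TimeUnit.HOURS);
    }
}
```

As in the original Main, the pool thread keeps the JVM running until the process is stopped.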

Deployment

  • Run gradle build -x test in the project root
  • Copy build/distributions/BaiduSubmitter-1.0-SNAPSHOT.tar to a server with a Java runtime installed

    tar xf BaiduSubmitter-1.0-SNAPSHOT.tar
    cd BaiduSubmitter-1.0-SNAPSHOT
    nohup bin/BaiduSubmitter > nohup.out 2>&1 &
  • Use tail -f nohup.out to follow the logs

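Addendum

  • I am a die-hard fan of Docker containers, because containers keep the host environment "clean", so here is how to deploy the service with Docker
  • First copy the built package build/distributions/BaiduSubmitter-1.0-SNAPSHOT.tar to the server, extract and rename it, and create a Dockerfile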

    tar xf BaiduSubmitter-1.0-SNAPSHOT.tar
    mkdir -p blogSubmitter/baiduSubmitter
    mv BaiduSubmitter-1.0-SNAPSHOT blogSubmitter/baiduSubmitter/baidu
    cd blogSubmitter/baiduSubmitter
    touch Dockerfile
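  • The Dockerfile is as follows:

    FROM openjdk:11
    COPY . /submitter
    WORKDIR /submitter
    # Set the container time zone to Asia/Shanghai
    RUN ln -sf /usr/share/zoneinfo/Asia/Shanghai /etc/localtime
    # Run in the foreground as the container's main process; the exec form has no shell, so nohup and "&" do not apply
    CMD ["baidu/bin/BaiduSubmitter"]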
  • Create a YAML file in the blogSubmitter directory and use Docker Compose to build the service

    cd blogSubmitter
    touch submitter.yaml
  • submitter.yaml contains:

    version: '3.1'
    services:
      blog-baidu-submitter:
        build: ./baiduSubmitter
        container_name: blogBaiduSubmitter
        restart: unless-stopped
  • Run docker-compose -f submitter.yaml up -d to create the service

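Notes

  • If you change the source code, the image has to be rebuilt, which I currently do by deleting the old image first (there should be a better way, still to be improved, for example mounting the application through a volume; passing --build to docker-compose up may also be enough to force a rebuild)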
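A related point when containers are rebuilt or restarted: the link cache in PostScrap lives in memory, so after a restart every link counts as new and is submitted again. One way to avoid that is to keep the submitted links in a file, which in a container could sit on a mounted volume. The sketch below assumes a hypothetical PersistentLinkCache class and a caller-chosen file path; neither is part of the original project.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical file-backed variant of the in-memory link cache. */
final class PersistentLinkCache {
    private final Path file;
    private final Set<String> seen = new HashSet<>();

    PersistentLinkCache(Path file) throws IOException {
        this.file = file;
        if (Files.exists(file)) {
            // One link per line; reload whatever an earlier run recorded.
            seen.addAll(Files.readAllLines(file, StandardCharsets.UTF_8));
        }
    }

    /** Returns the links not recorded yet and appends them to the file. */
    List<String> filterNew(List<String> links) throws IOException {
        List<String> fresh = new ArrayList<>();
        for (String link : links) {
            if (seen.add(link)) {
                fresh.add(link);
            }
        }
        if (!fresh.isEmpty()) {
            Files.write(file, fresh, StandardCharsets.UTF_8,
                StandardOpenOption.CREATE, StandardOpenOption.APPEND);
        }
        return fresh;
    }
}
```

A second instance created on the same file sees the links recorded by the first, which is what a restarted process would need.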

References

  • Gradle Application Plugin
  • Fixing the time difference between a Docker container and its host