Scheduled Baidu Page Submission for a Halo Blog

Preface

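  • I finally got my blog up and running and wrote a few posts I thought were interesting, but nobody read them! They had never been submitted to a search engine, so hardly anyone could find them. The Next theme does offer a setting for Baidu active push, but Baidu's indexing platform no longer provides that push service, so the theme setting is of no use
  • The Baidu indexing platform offers three submission methods, of which API submission is the fastest, so I decided to use Java to push Halo blog posts to Baidu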

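    • API submission
    • Sitemap submission
    • Manual submission
  • Halo provides an API for listing posts, so the idea is simple: a Java scheduled thread pool fetches all post links from the Halo API at a fixed interval and then calls the Baidu submission API to submit them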

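    • Baidu restricts frequent resubmission of old links: repeatedly submitting old links lowers the quota and may even cost you access to API push, whereas regularly submitting new post links can raise the quota. A simple cache is therefore needed so that links that were already submitted are filtered out before each submission
  • Although Google already crawls links well from the sitemap alone and needs no separate submission, it likewise recommends the Indexing API for actively submitting links; for details see the companion post, Scheduled Google Page Submission for a Halo Blog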
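The cache-filtering idea described above can be sketched in a few lines. This is only a minimal illustration, not the project's actual class: the name SubmittedLinkCache and the example.com links used to try it are made up for the sketch. It relies on the fact that Set.add returns false when the element is already present.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

/** Hypothetical helper: remembers every link it has handed out before. */
final class SubmittedLinkCache {
    private final Set<String> seen = new HashSet<>();

    /** Returns only the links not seen before and records them as seen. */
    List<String> filterNew(List<String> links) {
        // Set.add returns false for an element that is already present,
        // so it doubles as the "is this link new?" predicate.
        return links.stream().filter(seen::add).collect(Collectors.toList());
    }
}
```

Calling filterNew twice with overlapping lists returns the overlapping links only the first time. Note that a cache like this lives in memory only, so after a restart every link counts as new again.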

Project Setup

  • Create a Gradle project with the following build file

    plugins {
        id 'java'
        id 'application'
    }

    group 'xyz.demoli'
    version '1.0-SNAPSHOT'
    sourceCompatibility = 1.11
    mainClassName = "xyz.demoli.Main"

    repositories {
        mavenCentral()
    }

    application {
        applicationDefaultJvmArgs = ['-Duser.timezone=GMT+8']
    }

    dependencies {
        testImplementation 'org.junit.jupiter:junit-jupiter-api:5.8.1'
        testRuntimeOnly 'org.junit.jupiter:junit-jupiter-engine:5.8.1'
        // https://mvnrepository.com/artifact/com.squareup.okhttp3/okhttp
        implementation group: 'com.squareup.okhttp3', name: 'okhttp', version: '4.9.3'
        implementation 'com.google.code.gson:gson:2.9.0'
        // https://mvnrepository.com/artifact/org.apache.logging.log4j/log4j-api
        implementation group: 'org.apache.logging.log4j', name: 'log4j-api', version: '2.14.1'
        // https://mvnrepository.com/artifact/org.apache.logging.log4j/log4j-core
        implementation group: 'org.apache.logging.log4j', name: 'log4j-core', version: '2.14.1'
        // https://mvnrepository.com/artifact/org.projectlombok/lombok
        compileOnly group: 'org.projectlombok', name: 'lombok', version: '1.18.22'
        annotationProcessor group: 'org.projectlombok', name: 'lombok', version: '1.18.22'
    }

    test {
        useJUnitPlatform()
    }
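    • annotationProcessor group: 'org.projectlombok', name: 'lombok', version: '1.18.22' ensures that Lombok annotations are processed correctly in the Gradle project
    • applicationDefaultJvmArgs is set so that, when the service is later deployed in a container, log timestamps are printed in the UTC+8 time zone
  • The configuration file (config.properties) is as follows: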

    prefix=https://blog.demoli.xyz
    postAPI=%s/api/content/posts?api_access_key=%s&page=%d
    apiAccessKey=***
    baiduUrl=http://data.zz.baidu.com/urls?
    token=***
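    • apiAccessKey is the API access key configured in the Halo blog settings

    • prefix is the URL of the Halo blog's home page
    • token is the submission token that the Baidu submission platform issues to each user; it is shown on the Baidu submission page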
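To make the postAPI template concrete, here is a small sketch of how its three placeholders are filled in. The class name HaloUrlDemo and the method postsUrl are hypothetical and exist only for this illustration; the argument order (prefix, access key, page index) follows the order of the placeholders in the template.

```java
/** Hypothetical helper showing how the postAPI template is expanded. */
final class HaloUrlDemo {
    /** Fills the template with the site prefix, the access key and a zero-based page index. */
    static String postsUrl(String template, String prefix, String apiAccessKey, int page) {
        // %s, %s and %d are replaced in order by prefix, apiAccessKey and page
        return String.format(template, prefix, apiAccessKey, page);
    }
}
```

With the template above, the prefix https://blog.demoli.xyz, a placeholder key KEY and page 0, this yields https://blog.demoli.xyz/api/content/posts?api_access_key=KEY&page=0.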
  • The logging configuration file is as follows:

    <?xml version="1.0" encoding="utf-8" ?>
    <configuration status="INFO">
        <appenders>
            <console name="console" target="SYSTEM_OUT">
                <PatternLayout pattern="%d{yyyy-MM-dd HH:mm:ss.SSS} [%t] %-5level %logger{36} - %msg%n"/>
            </console>
        </appenders>
        <loggers>
            <root level="INFO">
                <appender-ref ref="console"/>
            </root>
        </loggers>
    </configuration>
  • The whole project has only two core classes

    • PostScrap

      import com.google.gson.Gson;
      import com.google.gson.JsonArray;
      import com.google.gson.JsonElement;
      import com.google.gson.JsonObject;
      import java.io.IOException;
      import java.io.InputStream;
      import java.util.ArrayList;
      import java.util.HashSet;
      import java.util.List;
      import java.util.Properties;
      import java.util.Set;
      import java.util.stream.Collectors;
      import okhttp3.OkHttpClient;
      import okhttp3.Request;
      import okhttp3.Response;

      /**
       * Fetches post links through the Halo API
       */
      public class PostScrap {

          private static String postAPI;
          private static String apiAccessKey;
          private static String prefix;
          // Cache of links that have already been returned
          private static final Set<String> links = new HashSet<>();

          // Note: string values in a properties file must not be quoted
          static {
              try (InputStream stream = PostScrap.class.getResourceAsStream("/config.properties")) {
                  Properties properties = new Properties();
                  properties.load(stream);
                  apiAccessKey = properties.getProperty("apiAccessKey");
                  prefix = properties.getProperty("prefix");
                  postAPI = properties.getProperty("postAPI");
              } catch (IOException e) {
                  e.printStackTrace();
              }
          }

          /**
           * Sends requests to fetch all post links
           *
           * @return the links that have not been returned before
           */
          public static List<String> getPosts() {
              List<String> res = new ArrayList<>();
              OkHttpClient client = new OkHttpClient();
              // Pages are zero-based, so the first request asks for page 0
              Request initialRequest =
                  new Request.Builder().get().url(String.format(postAPI, prefix, apiAccessKey, 0))
                      .build();
              try (Response response = client.newCall(initialRequest).execute()) {
                  res = handlePage(response, client);
              } catch (IOException e) {
                  e.printStackTrace();
              }
              return res;
          }

          /**
           * Handles pagination
           *
           * @param initialResponse response for page 0
           * @param client          HTTP client reused for the remaining pages
           * @return the links that have not been returned before
           * @throws IOException if the first response body cannot be read
           */
          private static List<String> handlePage(Response initialResponse, OkHttpClient client)
              throws IOException {
              JsonObject jsonObject =
                  new Gson().fromJson(initialResponse.body().string(), JsonObject.class);
              JsonArray array = jsonObject.get("data").getAsJsonObject().get("content").getAsJsonArray();
              int pages = jsonObject.get("data").getAsJsonObject().get("pages").getAsInt();
              // Convert the JsonArray to a List
              List<String> posts = new ArrayList<>();
              for (JsonElement element : array) {
                  posts.add(element.getAsJsonObject().get("fullPath").getAsString());
              }
              // Query the remaining pages
              for (int i = 1; i < pages; i++) {
                  Request request =
                      new Request.Builder().get().url(String.format(postAPI, prefix, apiAccessKey, i))
                          .build();
                  try (Response response = client.newCall(request).execute()) {
                      jsonObject = new Gson().fromJson(response.body().string(), JsonObject.class);
                      array = jsonObject.get("data").getAsJsonObject().get("content").getAsJsonArray();
                      for (JsonElement element : array) {
                          posts.add(element.getAsJsonObject().get("fullPath").getAsString());
                      }
                  } catch (IOException e) {
                      e.printStackTrace();
                  }
              }
              // Cache filter: Set.add returns false for links that were seen before
              return posts.stream().map(content -> prefix + content).filter(links::add)
                  .collect(Collectors.toList());
          }
      }
    • BaiduSubmitter

      import com.google.gson.Gson;
      import com.google.gson.JsonObject;
      import java.io.IOException;
      import java.io.InputStream;
      import java.util.Optional;
      import java.util.Properties;
      import lombok.extern.log4j.Log4j2;
      import okhttp3.MediaType;
      import okhttp3.OkHttpClient;
      import okhttp3.Request;
      import okhttp3.RequestBody;
      import okhttp3.Response;

      /**
       * Submits links to Baidu for indexing
       */
      @Log4j2
      public class BaiduSubmitter {

          private static String submitUrl;

          static {
              try (InputStream stream = BaiduSubmitter.class.getResourceAsStream("/config.properties")) {
                  Properties properties = new Properties();
                  properties.load(stream);
                  String baiduUrl = properties.getProperty("baiduUrl");
                  String site = properties.getProperty("prefix");
                  String token = properties.getProperty("token");
                  submitUrl = baiduUrl + "site=" + site + "&token=" + token;
              } catch (IOException e) {
                  e.printStackTrace();
              }
          }

          /**
           * Submits the links
           */
          public static void submit() {
              OkHttpClient client = new OkHttpClient();
              // Join the new links into one newline-separated request body
              Optional<String> urlStrings =
                  PostScrap.getPosts().stream().reduce((out, url) -> out + "\n" + url);
              if (urlStrings.isEmpty()) {
                  log.info("No new posts");
                  return;
              }
              RequestBody body = RequestBody.create(MediaType.get("text/plain"), urlStrings.get());
              Request request = new Request.Builder().post(body).url(submitUrl)
                  .header("Content-Type", "text/plain")
                  .build();
              try (Response response = client.newCall(request).execute()) {
                  JsonObject jsonObject = new Gson().fromJson(response.body().string(), JsonObject.class);
                  if (jsonObject.get("error") != null) {
                      log.error("Submission failed: {}", jsonObject.get("message").getAsString());
                      // Stop here: an error response carries no success/remain fields
                      return;
                  }
                  log.info("Submitted {} links successfully, remaining quota: {}, links:",
                      jsonObject.get("success").getAsInt(),
                      jsonObject.get("remain").getAsInt());
                  log.info(urlStrings.get());
              } catch (IOException e) {
                  e.printStackTrace();
              }
          }
      }
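The request body is simply the new links joined by newlines, one link per line. As an alternative to the reduce call, the same body can be built with String.join. The helper below is a hypothetical sketch (the name SubmitBody is not part of the project) that also keeps the "nothing to submit" case explicit.

```java
import java.util.List;
import java.util.Optional;

/** Hypothetical helper: builds the newline-separated body for a batch of links. */
final class SubmitBody {
    /** Returns an empty Optional when there is nothing to submit. */
    static Optional<String> of(List<String> links) {
        if (links.isEmpty()) {
            return Optional.empty();
        }
        // One link per line, with no trailing newline
        return Optional.of(String.join("\n", links));
    }
}
```

The result can be used in the same way as urlStrings in submit(): an empty Optional means there are no new posts, otherwise the contained string is the request body.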
  • Main

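    import java.util.concurrent.Executors;
    import java.util.concurrent.TimeUnit;

    public class Main {

        public static void main(String[] args) {
            // Submit once at startup, then again 12 hours after each run finishes
            Executors.newScheduledThreadPool(1)
                .scheduleWithFixedDelay(BaiduSubmitter::submit, 0, 12, TimeUnit.HOURS);
        }
    }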
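One thing to keep in mind with scheduleWithFixedDelay: if a run of the task throws an exception, all subsequent runs are suppressed, so a single unexpected failure (for example a malformed response) would silently stop the submitter. A minimal sketch of guarding against that is shown below; the class GuardedScheduling and its methods are hypothetical names for this illustration and are not part of the project.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

/** Hypothetical helper: keeps a periodic task alive even if one run fails. */
final class GuardedScheduling {
    /** Wraps a task so that an unchecked exception is reported instead of propagated. */
    static Runnable guarded(Runnable task) {
        return () -> {
            try {
                task.run();
            } catch (RuntimeException e) {
                // Swallowing the exception here keeps the periodic schedule running
                System.err.println("Scheduled run failed: " + e);
            }
        };
    }

    /** Starts the guarded task immediately and repeats it with a fixed delay. */
    static ScheduledExecutorService start(Runnable task, long delay, TimeUnit unit) {
        ScheduledExecutorService pool = Executors.newScheduledThreadPool(1);
        pool.scheduleWithFixedDelay(guarded(task), 0, delay, unit);
        return pool;
    }
}
```

In Main this would amount to calling GuardedScheduling.start(BaiduSubmitter::submit, 12, TimeUnit.HOURS). In the real project the failure would more naturally go to the logger than to System.err.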

Deployment

  • Run gradle build -x test in the project root
  • Copy build/distributions/BaiduSubmitter-1.0-SNAPSHOT.tar to a server with a Java runtime installed, then unpack it and start the service

    tar xf BaiduSubmitter-1.0-SNAPSHOT.tar
    cd BaiduSubmitter-1.0-SNAPSHOT
    nohup bin/BaiduSubmitter > nohup.out &
  • Run tail -f nohup.out to follow the log

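Addendum

  • I am a die-hard fan of Docker containers, because using containers keeps the host environment clean, so here is how to deploy the service in a Docker container as well
  • First copy the package produced by the build, build/distributions/BaiduSubmitter-1.0-SNAPSHOT.tar, to the server, unpack and rename it, and create a Dockerfile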

    tar xf BaiduSubmitter-1.0-SNAPSHOT.tar
    mkdir -p blogSubmitter/baiduSubmitter
    mv BaiduSubmitter-1.0-SNAPSHOT blogSubmitter/baiduSubmitter/baidu
    cd blogSubmitter/baiduSubmitter
    touch Dockerfile
  • The Dockerfile is as follows:

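    FROM openjdk:11
    COPY . /submitter
    WORKDIR /submitter
    # Change the time zone
    RUN rm -rf /etc/localtime
    RUN ln -s /usr/share/zoneinfo/Asia/Shanghai /etc/localtime
    # Run in the foreground; in exec form "nohup" and "&" are not needed
    CMD ["baidu/bin/BaiduSubmitter"]
  • Create a YAML configuration file and use Docker Compose to build the service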

    cd blogSubmitter
    touch submitter.yaml

    version: '3.1'
    services:
      blog-baidu-submitter:
        build: ./baiduSubmitter
        container_name: blogBaiduSubmitter
        restart: unless-stopped
  • Run docker-compose -f submitter.yaml up -d to create the service

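Notes

  • If the source code changes, the image has to be rebuilt, and the previous image must be deleted first (there should be a better way, still to be improved, for example mounting the package as a volume)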

References

  • Gradle Application Plugin
  • Fixing the time mismatch between a Docker container and its host