Take the weekly specials page of the Turing Community (图灵社区) as an example: https://www.ituring.com.cn/tag/36527

Scraping the content

This page is updated every Monday with three half-price e-books. Start by analyzing the page's XPath expressions to find the pattern.

Copying the XPath of each book title and pasting the results side by side shows that the three books' XPaths are:

//*[@id="tag-book"]/div/ul/li[1]/div[2]/h4/a

//*[@id="tag-book"]/div/ul/li[2]/div[2]/h4/a

//*[@id="tag-book"]/div/ul/li[3]/div[2]/h4/a

Only the index inside li[] in the middle of the path differs.
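The observation above can be checked mechanically. The following is a minimal Python sketch, not part of the original setup: it replaces the numeric li index in each of the three copied XPaths with a wildcard and confirms that all three collapse to the same pattern. The helper name generalize is hypothetical.

```python
import re

# The three XPaths copied from the page, exactly as listed above.
xpaths = [
    '//*[@id="tag-book"]/div/ul/li[1]/div[2]/h4/a',
    '//*[@id="tag-book"]/div/ul/li[2]/div[2]/h4/a',
    '//*[@id="tag-book"]/div/ul/li[3]/div[2]/h4/a',
]


def generalize(xpath):
    """Replace the numeric index of every li step with a wildcard predicate."""
    # Only li[<number>] is rewritten; div[2] is left untouched.
    return re.sub(r"\bli\[\d+\]", "li[*]", xpath)


# A set keeps one copy of each distinct pattern.
patterns = {generalize(x) for x in xpaths}
print(patterns)
```

If the set ends up with exactly one element, the li index is the only part that varies between the three paths.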

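So in the options section, the xpath can use li[*] to match all three cases. (Strictly speaking, [*] is a predicate that keeps every li with at least one child element, which holds for all three items here; a plain li step would select the same nodes.) The options can be written like this: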

{
  "expected_update_period_in_days": "2",
  "url": "https://www.ituring.com.cn/tag/36527",
  "type": "html",
  "mode": "on_change",
  "extract": {
    "url": {
      "xpath": "//*[@id=\"tag-book\"]/div/ul/li[*]/div[2]/h4/a",
      "value": "@href"
    },
    "title": {
      "xpath": "//*[@id=\"tag-book\"]/div/ul/li[*]/div[2]/h4/a",
      "value": "@title"
    }
  }
}
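To see what this extract configuration is meant to produce, here is a rough Python sketch that applies the same path to a small, invented HTML fragment and reads the href and title attributes of each matched link. It is only an illustration under stated assumptions: the sample markup, book titles, and /book/... paths are made up and merely mirror the structure implied by the XPath, and the function name extract_events is hypothetical. Python's standard xml.etree.ElementTree supports only a subset of XPath, so the sketch uses a plain li step and a relative path starting with .// instead of li[*].

```python
import xml.etree.ElementTree as ET

# Invented sample markup that mirrors the structure implied by the XPath:
# #tag-book > div > ul > li > (div, div > h4 > a). Not the real page source.
SAMPLE_HTML = """
<html><body>
<div id="tag-book"><div><ul>
  <li><div>cover</div><div><h4><a href="/book/1001" title="Sample Book A">Sample Book A</a></h4></div></li>
  <li><div>cover</div><div><h4><a href="/book/1002" title="Sample Book B">Sample Book B</a></h4></div></li>
  <li><div>cover</div><div><h4><a href="/book/1003" title="Sample Book C">Sample Book C</a></h4></div></li>
</ul></div></div>
</body></html>
"""

# ElementTree-compatible version of the path used in the options above.
XPATH = ".//*[@id='tag-book']/div/ul/li/div[2]/h4/a"


def extract_events(html, xpath):
    """Return one dict per matched <a>, with the same keys as the extract block."""
    root = ET.fromstring(html.strip())
    events = []
    for link in root.findall(xpath):
        events.append({"url": link.get("href"), "title": link.get("title")})
    return events


events = extract_events(SAMPLE_HTML, XPATH)
for event in events:
    print(event)
```

Each dict corresponds to one event with a url and a title field, which is the shape the message template in the push step relies on.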

Pushing the content

Use the Slack Agent to push the notifications.
The options section:

{
  "webhook_url": "https://hooks.slack.com/services/xxxx/xxxxxxxx",
  "channel": "#book",
  "username": "Huginn",
  "message": "{{title}}    https://www.ituring.com.cn{{url}}",
  "icon": ""
}
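The message field combines the two extracted values: the title comes first, and the relative href is appended to the site's base URL to form a full link. The Python sketch below imitates that substitution for a single event and assembles a JSON body of the kind a Slack incoming webhook accepts. It is a simplified stand-in, not the agent's actual template engine: it only handles plain {{name}} placeholders, the sample event is invented, the helper name render is hypothetical, and nothing is sent over the network.

```python
import json
import re

# Template copied from the options above.
MESSAGE = "{{title}}    https://www.ituring.com.cn{{url}}"


def render(template, event):
    """Substitute {{name}} placeholders with values from the event dict."""
    def replace(match):
        # Unknown placeholders become an empty string in this sketch.
        return str(event.get(match.group(1), ""))
    return re.sub(r"\{\{\s*(\w+)\s*\}\}", replace, template)


# Invented event in the shape produced by the scraping step.
event = {"title": "Sample Book A", "url": "/book/1001"}

text = render(MESSAGE, event)

# Body for the webhook; channel and username are taken from the options above.
payload = {"channel": "#book", "username": "Huginn", "text": text}
body = json.dumps(payload, ensure_ascii=False)
print(body)
```

Because the scraped href is site-relative, the https://www.ituring.com.cn prefix in the template is what turns it into a clickable link in the Slack message.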

Result