## 1. Resources
##### 1.1 Third-party packages
* [github.com/PuerkitoBio/goquery](https://godoc.org/github.com/PuerkitoBio/goquery)
* [github.com/go-redis/redis](https://godoc.org/github.com/go-redis/redis)
* [beego framework scheduled-task package (toolbox)](https://beego.me/docs/module/toolbox.md#task)
##### 1.2 APIs
* [Baidu News: search for the keyword 美剧 (American TV series)](http://news.baidu.com/ns?cl=2&rn=20&tn=news&word=%E7%BE%8E%E5%89%A7)
* [DingTalk group bot documentation](https://open-doc.dingtalk.com/docs/doc.htm?spm=a219a.7629140.0.0.t8inXi&treeId=257&articleId=105735&docType=1#s6)
## 2. Initializing project variables
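    package main

    import (
        "bytes"
        "fmt"
        "net/http"

        "github.com/PuerkitoBio/goquery"
        "github.com/astaxie/beego/toolbox"
        "github.com/go-redis/redis"
    )

    var (
        redisClient *redis.Client // Redis client used as the cache
        // DingTalk group bot webhook; replace the token with your own
        dingdingURL = "https://oapi.dingtalk.com/robot/send?access_token=dingding_talk_group_bot_webhook_token"
        // Baidu News search URL; the keyword here is 物联网 (Internet of Things)
        baiduNewsUrlWithSearchKeyword = "http://news.baidu.com/ns?cl=2&rn=20&tn=news&word=%E7%89%A9%E8%81%94%E7%BD%91"
    )

    const (
        newsFeed = "news_feed" // Redis key: set of crawled Baidu News URLs
        newsPost = "news_post" // Redis key: set of news URLs already sent
        newsList = "iot_news"  // Redis key: hash of stored news, URL to markdown line
    )

    func init() {
        redisClient = redis.NewClient(&redis.Options{
            Addr:     "",                 // Redis address, host:port
            Password: "ddfrfgtre4353252", // Redis password
            DB:       0,                  // Redis database ID
        })
    }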
In the DingTalk group bot settings, click the "Copy" button to get the webhook address for this bot, then assign it to `dingdingURL`.
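
Hard-coding the token in source makes it easy to leak. As a minimal sketch, the webhook address could instead be built at startup from an environment variable. The variable name `DINGTALK_TOKEN` and the helper `webhookURL` are hypothetical names chosen for this example and are not part of the original project.

```go
package main

import (
	"fmt"
	"net/url"
	"os"
)

// webhookURL builds the DingTalk robot webhook address from an access token.
// The base address is the one used in this article.
func webhookURL(token string) string {
	q := url.Values{}
	q.Set("access_token", token)
	return "https://oapi.dingtalk.com/robot/send?" + q.Encode()
}

func main() {
	// DINGTALK_TOKEN is a hypothetical variable name chosen for this sketch.
	token := os.Getenv("DINGTALK_TOKEN")
	if token == "" {
		fmt.Println("DINGTALK_TOKEN is not set")
		return
	}
	fmt.Println(webhookURL(token))
}
```

The result of `webhookURL` could then be assigned to `dingdingURL` inside `init`.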
## 3. `func newsBot`
##### 3.1 Extracting useful information with goquery and CSS selector syntax
    func newsBot() error {
        // Fetch and parse the HTML document
        doc, err := goquery.NewDocument(baiduNewsUrlWithSearchKeyword)
        if err != nil {
            return err
        }
        // Use a Redis pipeline to batch commands and reduce round trips
        pipe := redisClient.Pipeline()
        // Use CSS selectors to pull out the useful fields.
        // Store each news item in the Redis hash newsList, and add its URL to the
        // Redis set newsFeed so that SDIFF can later find the unsent news.
        doc.Find("div.result").Each(func(i int, s *goquery.Selection) {
            // For each result found, get the URL, source, and title
            URL, _ := s.Find("h3 > a").Attr("href")
            Source := s.Find("p.c-author").Text()
            Title := s.Find("h3 > a").Text()
            markdown := fmt.Sprintf("- [%s](%s) _%s_", Title, URL, Source)
            pipe.HSet(newsList, URL, markdown)
            pipe.SAdd(newsFeed, URL)
        })
        // Execute the Redis pipeline
        if _, err := pipe.Exec(); err != nil {
            return err
        }
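
Text scraped from a page often carries stray spaces and newlines, which would end up inside the markdown line. The sketch below shows one way to normalize the fields before formatting. `formatItem` is a hypothetical helper that is not in the original code; it produces the same `- [title](url) _source_` shape as the `fmt.Sprintf` call in this section.

```go
package main

import (
	"fmt"
	"strings"
)

// formatItem builds one markdown list entry in the same shape as the article's
// fmt.Sprintf call. strings.Fields splits on any run of whitespace, so joining
// the pieces with a single space collapses newlines and repeated spaces.
func formatItem(title, link, source string) string {
	clean := func(s string) string { return strings.Join(strings.Fields(s), " ") }
	return fmt.Sprintf("- [%s](%s) _%s_", clean(title), strings.TrimSpace(link), clean(source))
}

func main() {
	fmt.Println(formatItem("  Hello\n   World ", "http://example.com/a", "\tExample News  1 hour ago "))
	// prints: - [Hello World](http://example.com/a) _Example News 1 hour ago_
}
```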
##### 3.2 Excluding already-sent news and building the markdown string
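        // Use Redis SDIFF to find the news URLs that have not been sent yet
        unSendNewsUrls := redisClient.SDiff(newsFeed, newsPost).Val()
        // Join the news items following the markdown rules in the DingTalk docs
        content := ""
        for _, url := range unSendNewsUrls {
            md := redisClient.HGet(newsList, url).Val()
            content = content + " \n " + md
            // Queued only; written to the sent set when the pipeline executes
            pipe.SAdd(newsPost, url)
        }
        // Flush the queued SAdd commands so these URLs count as sent
        if len(unSendNewsUrls) > 0 {
            if _, err := pipe.Exec(); err != nil {
                return err
            }
        }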
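
To make the role of SDIFF concrete, here is an in-memory sketch of the same set difference: everything in the crawled set that is not in the sent set. `unsent` is a hypothetical helper written only for illustration, and the URLs are made-up sample data. Redis sets are unordered, while this version happens to keep the order of the input slice.

```go
package main

import "fmt"

// unsent returns the members of feed that are not in posted,
// which is what SDIFF feed posted computes on the Redis side.
func unsent(feed, posted []string) []string {
	seen := make(map[string]struct{}, len(posted))
	for _, u := range posted {
		seen[u] = struct{}{}
	}
	var out []string
	for _, u := range feed {
		if _, ok := seen[u]; !ok {
			out = append(out, u)
		}
	}
	return out
}

func main() {
	feed := []string{"http://example.com/a", "http://example.com/b", "http://example.com/c"}
	posted := []string{"http://example.com/b"}
	fmt.Println(unsent(feed, posted)) // [http://example.com/a http://example.com/c]
}
```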
##### 3.3 Calling the DingTalk group bot API
        // If there is unsent news, call the DingTalk webhook
        if content != "" {
            format := `
            {
                "msgtype": "markdown",
                "markdown": {
                    "text": "%s"
                }
            }`
            body := fmt.Sprintf(format, content)
            jsonValue := []byte(body)
            // See the DingTalk docs for details: https://open-doc.dingtalk.com/docs/doc.htm?spm=a219a.7629140.0.0.karFPe&treeId=257&articleId=105735&docType=1
            resp, err := http.Post(dingdingURL, "application/json", bytes.NewBuffer(jsonValue))
            if err != nil {
                return err
            }
            defer resp.Body.Close()
        }
        return nil
    }
The `newsBot` function is now complete.
## 4. Setting up the scheduled task
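    func main() {
        defer redisClient.Close()
        // Run the crawler and bot automatically at 08:00, 13:00, and 18:00 every day
        dingdingNewBot := toolbox.NewTask("dingding-news-bot", "0 0 8,13,18 * * *", newsBot)
        // Alternative spec: run at minute 40 of every hour
        //dingdingNewBot := toolbox.NewTask("dingding-news-bot", "0 40 */1 * * *", newsBot)
        // Uncomment to run the task once immediately for testing (needs the "log" import)
        //err := dingdingNewBot.Run()
        // if err != nil {
        // log.Fatal(err)
        // }
        toolbox.AddTask("dingding-news-bot", dingdingNewBot)
        // Start the scheduler; without this call no task fires
        toolbox.StartTask()
        defer toolbox.StopTask()
        // Block forever so the scheduler keeps running
        select {}
    }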
> [The spec format follows the beego toolbox task documentation](https://beego.me/docs/module/toolbox.md#task)
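
The spec string has six space-separated fields: second, minute, hour, day of month, month, and day of week. As a reading aid, the sketch below labels each position of the spec used above. `describeSpec` is a hypothetical helper for illustration only; it does not validate ranges and is not how beego parses the spec.

```go
package main

import (
	"fmt"
	"strings"
)

// fieldNames lists the six positions of a task spec, in order.
var fieldNames = []string{"second", "minute", "hour", "day of month", "month", "day of week"}

// describeSpec pairs each field of a six-field spec with its name.
// It only labels the positions and rejects specs with a different field count.
func describeSpec(spec string) (map[string]string, error) {
	parts := strings.Fields(spec)
	if len(parts) != len(fieldNames) {
		return nil, fmt.Errorf("want %d fields, got %d", len(fieldNames), len(parts))
	}
	out := make(map[string]string, len(parts))
	for i, name := range fieldNames {
		out[name] = parts[i]
	}
	return out, nil
}

func main() {
	fields, err := describeSpec("0 0 8,13,18 * * *")
	if err != nil {
		panic(err)
	}
	for _, name := range fieldNames {
		fmt.Printf("%-12s %s\n", name, fields[name])
	}
}
```

Read this way, `0 0 8,13,18 * * *` means second 0 and minute 0 of hours 8, 13, and 18, on every day of every month.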
## 5. Building and running
    go build main.go
    nohup ./main &
## 6. Finally
