Simple web crawlers in several languages
A simple crawler in Python
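import re

import requests

if __name__ == "__main__":
    r = requests.get('http://docs.python-requests.org/zh_CN/latest/user/quickstart.html')
    r.encoding = "UTF-8"
    print(r.text)  # print the page content
    # Regex search: "." matches any character except a newline, "*" means any
    # number of repetitions, and group(1) is the text captured by the first parentheses.
    search = re.search('href="#">(.*)</a><ul>', r.text)
    if search:  # re.search returns None when nothing matches
        print(search.group(1))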
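The `(.*)` in the pattern above is greedy: it captures as much text as it can while still letting the rest of the pattern match. A minimal sketch of the difference between `(.*)` and the non-greedy `(.*?)`, run on a made-up snippet of markup rather than a live page. The helper name `first_link_text` and the sample string are assumptions for illustration only.

```python
import re


def first_link_text(html, greedy=False):
    """Return the text captured after 'href="#">' and before '</a>', or None."""
    # (.*) grabs as much as possible; (.*?) stops at the first possible '</a>'.
    pattern = r'href="#">(.*)</a>' if greedy else r'href="#">(.*?)</a>'
    match = re.search(pattern, html)
    return match.group(1) if match else None


# Hypothetical sample markup, used instead of fetching a real page.
SAMPLE = '<a href="#">Quickstart</a><a href="#">Advanced</a>'

if __name__ == "__main__":
    print(first_link_text(SAMPLE))               # Quickstart
    print(first_link_text(SAMPLE, greedy=True))  # Quickstart</a><a href="#">Advanced
```

When a page has several links on one line, the non-greedy form is usually the one you want, because the greedy form runs on to the last `</a>` on that line.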
A simple crawler in Go
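package main

import (
	"fmt"
	"io"
	"log"
	"net/http"
	"regexp"
)

func main() {
	resp, err := http.Get("https://studygolang.com/static/pkgdoc/pkg/net_http.htm")
	if err != nil {
		log.Fatal(err)
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(resp.Body) // io.ReadAll replaces ioutil.ReadAll since Go 1.16
	if err != nil {
		log.Fatal(err)
	}
	// Capture the content attribute of the private:description meta tag.
	re := regexp.MustCompile(`<meta name="private:description" content="(.*)">`)
	m := re.FindSubmatch(body)
	if m == nil { // FindSubmatch returns nil when nothing matches
		log.Fatal("no private:description meta tag found")
	}
	fmt.Println(string(m[1]))
}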
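For comparison with the Go program above, here is a rough Python sketch of the same meta-tag extraction. It swaps the regular expression for the standard library's `html.parser`, so it does not depend on the order of the attributes inside the tag. The class name `MetaContentParser`, the helper `meta_content`, and the sample markup are all assumptions for illustration; the sketch works on a string rather than a fetched page.

```python
from html.parser import HTMLParser


class MetaContentParser(HTMLParser):
    """Remember the content attribute of the first <meta> tag with a given name."""

    def __init__(self, name):
        super().__init__()
        self.name = name
        self.content = None

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (attribute, value) pairs with lowercased attribute names.
        attr_map = dict(attrs)
        if tag == "meta" and attr_map.get("name") == self.name and self.content is None:
            self.content = attr_map.get("content")


def meta_content(html, name):
    """Return the content of <meta name=...> in html, or None if it is absent."""
    parser = MetaContentParser(name)
    parser.feed(html)
    return parser.content


# Hypothetical sample markup; note that content comes before name here.
SAMPLE_PAGE = (
    '<html><head>'
    '<meta content="HTTP client and server" name="private:description">'
    '</head></html>'
)

if __name__ == "__main__":
    print(meta_content(SAMPLE_PAGE, "private:description"))
```

A regex like the one in the Go program is shorter, but it only matches when `name` comes directly before `content`; the parser-based version trades a few more lines for tolerance of attribute order.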