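#### Background
For the past couple of days I have been trying to write a Go crawler for the BYR forum (北邮人论坛, bbs.byr.cn). The goal is to log in, save the cookie, and send that cookie with every subsequent request. From what I have read, `net/http/cookiejar` is the recommended way to do this.
The login itself succeeds and I get the success JSON back. However, no post-login cookie is captured, so a subsequent GET of a post body fails with **"You are not logged in; please log in to continue" (您未登录,请登录后继续操作)**.
Could anyone point out what is going wrong here?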
#### Implementation
```
package main

import (
	"crypto/tls"
	"fmt"
	"io/ioutil"
	"log"
	"net/http"
	"net/http/cookiejar"
	"net/url"
	"strings"
)

func main() {
	// init cookiejar
	cookieJar, err := cookiejar.New(nil)
	if err != nil {
		log.Fatal(err)
	}
	// init client with cookiejar
	httpClient := &http.Client{
		Jar: cookieJar,
	}
	// login param
	postValues := url.Values{}
	postValues.Set("id", "ID")
	postValues.Set("passwd", "PWD")
	postValues.Set("s-mode", "0")
	postValues.Set("CookieDate", "3")
	// request for login
	httpReq, err := http.NewRequest("POST", "https://bbs.byr.cn/user/ajax_login.json", strings.NewReader(postValues.Encode()))
	if err != nil {
		log.Fatal(err)
	}
	httpReq.Header.Set("Content-Type", "application/x-www-form-urlencoded; param=value")
	httpReq.Header.Add("X-Requested-With", "XMLHttpRequest")
	httpReq.Header.Add("Connection", "keep-alive")
	httpReq.Header.Add("User-Agent", "Mozilla/5.0")
	httpReq.Header.Add("Referer", "https://bbs.byr.cn")
	httpReq.Header.Add("Accept", "application/json, text/javascript, */*; q=0.01")
	httpReq.Header.Add("authority", "bbs.byr.cn")
	// for nginx/1.10: a non-nil, empty TLSNextProto map disables HTTP/2
	httpClient.Transport = &http.Transport{
		TLSNextProto: make(map[string]func(authority string, c *tls.Conn) http.RoundTripper),
	}
	// login
	httpResp, err := httpClient.Do(httpReq)
	if err != nil {
		log.Fatal(err)
	}
	defer httpResp.Body.Close()
	fmt.Printf("req cookies: %s \n", httpReq.Cookies())
	fmt.Printf("resp cookies: %s \n", httpResp.Cookies())
	// request to get article content
	httpReq1, err := http.NewRequest("GET", "https://bbs.byr.cn/article/Golang/842", nil)
	if err != nil {
		log.Fatal(err)
	}
	httpReq1.Header.Add("X-Requested-With", "XMLHttpRequest")
	httpResp1, err := httpClient.Do(httpReq1)
	if err != nil {
		log.Fatal(err)
	}
	defer httpResp1.Body.Close()
	body, err := ioutil.ReadAll(httpResp1.Body)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(string(body))
}
```
Output (note that both cookie lists are empty):
```
req cookies: []
resp cookies: []
(... omitted ...)
<h5>产生错误的可能原因:</h5><ul><li><samp class="ico-pos-dot"></samp>您未登录,请登录后继续操作</li>
(... omitted ...)
```
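**This has been bothering me for a long time; any pointers would be much appreciated.**
More comments
It looks like Go treats `[` as an invalid character in a cookie name when parsing `Set-Cookie`, which is why `httpResp.Cookies()` returns an empty list. Printing the whole `httpResp` shows the following `Set-Cookie` header: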
"Set-Cookie":[]string{"nforum[UTMPUSERID]=guest; path=/; domain=bbs.byr.cn", "nforum[UTMPKEY]=21970208; path=/; domain=bbs.byr.cn", "nforum[UTMPNUM]=29282; path=/; domain=bbs.byr.cn", "nforum[UTMPUSERID]=guest; path=/; domain=bbs.byr.cn", "nforum[UTMPKEY]=21970208; path=/; domain=bbs.byr.cn", "nforum[UTMPNUM]=29282; path=/; domain=bbs.byr.cn"}
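The behavior can be reproduced without touching the network. The following is a minimal sketch, assuming only the standard `net/http` package: it feeds a raw `Set-Cookie` line into a hand-built `http.Response` and counts how many cookies the parser accepts. The helper name `parseSetCookie` and the underscore variant of the cookie name are made up for illustration; only the bracketed line comes from the dump above.

```go
package main

import (
	"fmt"
	"net/http"
)

// parseSetCookie (hypothetical helper) runs one raw Set-Cookie line
// through net/http's own parser and returns the cookies it accepted.
func parseSetCookie(line string) []*http.Cookie {
	h := http.Header{}
	h.Add("Set-Cookie", line)
	resp := &http.Response{Header: h}
	return resp.Cookies()
}

func main() {
	// Line taken from the header dump: the name contains "[" and "]".
	bracketed := "nforum[UTMPUSERID]=guest; path=/; domain=bbs.byr.cn"
	// Hypothetical variant with a token-only name, for comparison.
	plain := "nforum_UTMPUSERID=guest; path=/; domain=bbs.byr.cn"

	fmt.Println(len(parseSetCookie(bracketed))) // 0: name rejected, cookie dropped
	fmt.Println(len(parseSetCookie(plain)))     // 1: name is a valid token
}
```

The raw lines themselves are still available through `httpResp.Header["Set-Cookie"]`, even when `Cookies()` returns nothing.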
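Go parses cookies according to RFC 6265: <http://tools.ietf.org/html/rfc6265>. The relevant grammar is shown below (the `token` and `separators` rules are defined in RFC 2616, section 2.2, which RFC 6265 refers to). Because `[` and `]` are separators, they cannot appear in a `cookie-name`: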
cookie-pair = cookie-name "=" cookie-value
cookie-name = token
cookie-value = *cookie-octet / ( DQUOTE *cookie-octet DQUOTE )
cookie-octet = %x21 / %x23-2B / %x2D-3A / %x3C-5B / %x5D-7E
; US-ASCII characters excluding CTLs,
; whitespace DQUOTE, comma, semicolon,
; and backslash
token = 1*<any CHAR except CTLs or separators>
separators = "(" | ")" | "<" | ">" | "@"
| "," | ";" | ":" | "\" | <">
| "/" | "[" | "]" | "?" | "="
| "{" | "}" | SP | HT
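Since the grammar above rules out `[` and `]` in a cookie name, the cookie jar never sees these cookies. One possible workaround, sketched here under assumptions, is to read the raw `Set-Cookie` lines and build the `Cookie` request header by hand. The helper name `rawCookieHeader` is hypothetical, the sketch ignores `path`, `domain`, and expiry attributes entirely, and whether the forum accepts the resulting header has not been verified.

```go
package main

import (
	"fmt"
	"strings"
)

// rawCookieHeader (hypothetical helper) takes raw Set-Cookie lines and
// returns a value usable as a Cookie request header. It keeps only the
// leading name=value pair of each line; for repeated names, the last
// value wins while the first-seen order is preserved.
func rawCookieHeader(setCookie []string) string {
	var order []string
	values := make(map[string]string)
	for _, line := range setCookie {
		pair := strings.TrimSpace(strings.SplitN(line, ";", 2)[0])
		eq := strings.Index(pair, "=")
		if eq <= 0 {
			continue // no "=" or empty name: skip this line
		}
		name, value := pair[:eq], pair[eq+1:]
		if _, seen := values[name]; !seen {
			order = append(order, name)
		}
		values[name] = value
	}
	parts := make([]string, 0, len(order))
	for _, name := range order {
		parts = append(parts, name+"="+values[name])
	}
	return strings.Join(parts, "; ")
}

func main() {
	// Lines copied from the Set-Cookie dump in the reply above.
	lines := []string{
		"nforum[UTMPUSERID]=guest; path=/; domain=bbs.byr.cn",
		"nforum[UTMPKEY]=21970208; path=/; domain=bbs.byr.cn",
		"nforum[UTMPNUM]=29282; path=/; domain=bbs.byr.cn",
		"nforum[UTMPUSERID]=guest; path=/; domain=bbs.byr.cn",
	}
	fmt.Println(rawCookieHeader(lines))
}
```

In the program from the question, this would mean passing `httpResp.Header["Set-Cookie"]` to the helper and calling `httpReq1.Header.Set("Cookie", ...)` with the result before the second `httpClient.Do`.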