net/html adds a body tag even if the source document doesn't have one?

blov · 501 views


While writing a web scraper in Go with the <code>net/html</code> package, I noticed that one of my unit tests could not fail the way I intended it to: the parsed tree contains a <code>body</code> node even though the source document used for the test has no <code><body></code> tag. The node looks like this:
<pre><code>&{Parent:0x18813100 FirstChild:<nil> LastChild:<nil> PrevSibling:0x18813240 NextSibling:<nil> Type:3 DataAtom:body Data:body Namespace: Attr:[]}
</code></pre>
And this is the source document used for that test:
<pre><code><!DOCTYPE html>
<html>
<head>
<meta charset='utf-8'>
</head>
</html>
</code></pre>
Is that how net/html is designed to work? I would like to know! :)
<hr/>
**Comments:**

HectorJ: <pre>It seems it does: <a href="https://github.com/golang/net/blob/master/html/parse.go#L678" rel="nofollow">https://github.com/golang/net/blob/master/html/parse.go#L678</a>
<pre><code>p.parseImpliedToken(StartTagToken, a.Body, a.Body.String())
</code></pre>
<a href="https://github.com/golang/net/blob/master/html/parse.go#L1956" rel="nofollow">https://github.com/golang/net/blob/master/html/parse.go#L1956</a>
<pre><code>// parseImpliedToken parses a token as though it had appeared in the parser's
// input.
</code></pre></pre>

HadronHubbub: <pre>In HTML, <a href="https://html.spec.whatwg.org/multipage/semantics.html#the-body-element" rel="nofollow">the body tags are optional.</a> Any HTML parser that conforms to the specification will do the same thing and give you a body element even if there are no body tags. (Unfortunately, you'll find that many "HTML" parsing libraries get this wrong and basically act as if they were parsing XML without draconian error handling.)</pre>
