Why does go/token.FileSet give each file's EOF a position?

agolangf · 425 views


See <a href="https://github.com/golang/go/blob/master/src/go/token/position.go#L404" rel="nofollow">https://github.com/golang/go/blob/master/src/go/token/position.go#L404</a>: <pre><code>base += size + 1 // +1 because EOF also has a position </code></pre> <hr/>**Comments:**
jerf: <pre>I doubt it's anything "profound". They probably just decided to A: represent EOF as a token, which is good for the parsing phase; B: require that all tokens have a position; and C: require that no two tokens overlap (which, if you think about it, also precludes tokens of size zero, since a size-zero token would always overlap with its neighbors if you think in terms of byte ranges). None of those three properties is all that shocking on its own, and all of them are sensible and easier to deal with than tokens that violate them. Taken together, they imply that the EOF token must be one past the end of the file. (And presumably of length one, because why would you do it any other way? I mean, you could, but to what advantage?)</pre>
indil7: <pre>I forgot there’s an EOF token. You’re probably right. Thanks!</pre>
0xjnml: <pre>Because an N-rune file has N + 1 scanner positions. Imagine a single-letter file. In the beginning the scanner position is before the letter; after the file is scanned completely, the scanner is after the single letter (and at EOF). Those two positions are clearly distinct, aren't they?</pre>
indil7: <pre>In that case, wouldn’t there need to be a position before the first letter too? Is that how the Go scanner actually works? I can imagine it instead working by starting at the letter, processing the first token, then advancing past the end of the file and stopping. No extra valid position is required; it’s just past the end.</pre>
0xjnml: <pre><blockquote> In that case, wouldn’t there need to be a position before the first letter too? </blockquote> That's exactly how it is. In the one-letter example, two positions are needed, as discussed before: <blockquote> In the beginning the scanner position is before the letter; after the file is scanned completely, the scanner is after the single letter (and at EOF). </blockquote> <pre><code>| A |
-----
^   ^
|   |__ after A, at EOF
|
|__ before A
</code></pre> Edit: formatting</pre>
indil7: <pre>But there’s only an extra position after the end. There isn’t one before the beginning too.</pre>
0xjnml: <pre>One letter, two positions. One before the letter, one after the letter. Are we looking at the same picture (above)?</pre>
indil7: <pre>You’ve drawn three positions for a one-byte file.</pre>
0xjnml: <pre>The picture consists of a single letter <code>A</code> and two position markers <code>|</code>. The first position marker is labeled <code>before A</code>, the second position marker is labeled <code>after A, at EOF</code>. There's no third position or third position marker in the picture.</pre>
apparentlymart: <pre>Along with what others already said: I can't speak for the Go scanner in particular, but in my own adventures in scanning/parsing I always give EOF a position, because then you have something to report in error messages triggered by an unexpected EOF without having to special-case EOF with its own error message. That is, you can report e.g. "unterminated string literal" with the source position pointing at either a real token or EOF, rather than needing a special "unexpected EOF during string literal" error case to pass this context on to the user. Additionally, if text editor build/verify steps are integrated with your language, it gives the editor a specific location to show the squiggly red underline when an error occurs at EOF.</pre>
indil7: <pre>What’s bad about simulating an unexpected EOF token by detecting that there are no more tokens? You can still report the same error message if you want. Wouldn’t reporting an unexpected EOF token at a file offset outside the file be invalid for an editor?</pre>
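
To make the discussion above concrete, here is a minimal sketch using only the standard `go/token` and `go/scanner` packages. The helper names `eofInfo` and `twoFiles` and the file names `a.go` and `b.go` are made up for illustration. It checks two things: that the scanner reports EOF at offset `size` (one past the last byte), and that, because of the `+1`, the EOF position of one file does not collide with the first position of the next file in the same `FileSet`.

```go
package main

import (
	"fmt"
	"go/scanner"
	"go/token"
)

// eofInfo (hypothetical helper) scans src as the only file in a fresh
// FileSet. It returns the file size, the offset at which the scanner
// reports token.EOF, and the FileSet's base after the file was added.
func eofInfo(src string) (size, eofOffset, nextBase int) {
	fset := token.NewFileSet()
	file := fset.AddFile("a.go", fset.Base(), len(src))

	var s scanner.Scanner
	s.Init(file, []byte(src), nil, 0)
	for {
		pos, tok, _ := s.Scan()
		if tok == token.EOF {
			return file.Size(), file.Offset(pos), fset.Base()
		}
	}
}

// twoFiles (hypothetical helper) adds two one-byte files to one FileSet.
// It returns the EOF position of the first file, the first position of
// the second file, and the name of the file that owns the EOF position.
func twoFiles() (eofA, startB token.Pos, eofOwner string) {
	fset := token.NewFileSet()
	a := fset.AddFile("a.go", fset.Base(), 1)
	b := fset.AddFile("b.go", fset.Base(), 1)

	eofA = a.Pos(a.Size()) // offset == size is valid: it is the EOF position
	startB = b.Pos(0)
	return eofA, startB, fset.Position(eofA).Filename
}

func main() {
	size, eofOffset, nextBase := eofInfo("x")
	fmt.Println("size:", size, "EOF offset:", eofOffset, "next base:", nextBase)

	eofA, startB, owner := twoFiles()
	fmt.Println("EOF of a.go:", eofA, "start of b.go:", startB, "EOF belongs to:", owner)
}
```

If the base advanced by only `size`, the EOF position of `a.go` and the first position of `b.go` would be the same `token.Pos`, and the `FileSet` could not tell which file a position at that value belongs to. This also bears on the last question in the thread: the EOF offset equals the file size, which is the same offset an editor uses for a cursor placed after the final character.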
