<p>Hi everyone, </p>
<p>I'm trying to go through a file through a Scanner or a Reader line by line, then check if each line matches a certain pattern. </p>
<p>Once this is done, I'd like to record the position of the line to edit it, but file.Seek(0,1) doesn't work well, as Scanner and Reader are buffered and will only show the position of their buffer (every 4096 bytes if I'm not wrong) which does not help a lot. </p>
<p>Any idea about how I could tackle that? Thank you!</p>
<p>Edit: why the downvotes, did I miss something? </p>
<hr/>**Comments:**<br/><br/>TheMerovius: <pre><p>You can simply <a href="https://play.golang.org/p/71SU2zAw1oe" rel="nofollow">count the bytes</a> you get from the Scanner. This still has a caveat, though, if you have non-Unix line endings. A way around that is to supply <a href="https://play.golang.org/p/pc27YHolgfv" rel="nofollow">your own SplitFunc</a> and have that keep track of the offset, but I found the SplitFunc API somewhat hard to use, and it includes a bunch of subtleties. Be aware that the code I linked isn't actually well-tested, so it might be totally wrong.</p>
<p>Also, as others have pointed out, keep in mind that editing files is non-trivial, unless your replacements are 1:1 in number of bytes. Like, if you want to delete (or shorten) a line, you'd have to move all bytes <em>after</em> that line forward and truncate the file. If you want to insert text, you have to actually take care of buffering the overwritten contents. Like, in a sense, files behave like <code>[]byte</code>: You can <code>append</code>, you can <code>copy</code>, but you can't delete or insert (and the <a href="https://github.com/golang/go/wiki/SliceTricks#insert" rel="nofollow">common tricks</a> for slices use the fact that the language will hide a bunch of the book-keeping and buffering for you, which you'll have to do manually). Of course, if you'd change the length of a file, you'd also invalidate all the other line-offsets <em>after</em> it.</p>
<p>So, unless you always want to do 1:1 replacement, it's <em>far</em> easier, more sensible and probably just as performant, to write a new file line-by-line and rename it over the old one.</p></pre>Vaglame: <pre><p>Oh thank you very much that's exactly what I was looking for!</p></pre>iCurlmyster: <pre><p>If I understand your question correctly you could open it as an *os.File object and use WriteAt. I mean you would have to record the byte position though</p></pre>Vaglame: <pre><p>Thanks for the answer. It's not exactly the writing part that bothers me, it is the reading part. I have found no way to keep track of the actual position when reading a file with a Scanner or a Reader. </p>
<p>For example if I have two lines </p>
<pre><code>this is line one
this is line two
</code></pre>
<p>and I do scanner.Scan(), even if I only get the text of the first line, the file.Seek method returns the offset of the end of the file.</p></pre>iCurlmyster: <pre><p>Oh okay, yeah, I read it wrong then.</p>
<p>The only thing I can think of right now is you could keep track of how many bytes you are reading with Reader.Read. But other than that I can’t think of anything off the top of my head with scanner, but I also don’t claim to be a golang guru. </p>
<p>Sorry I can’t be more help than that.</p></pre>Vaglame: <pre><p>I'll give it a try, thanks :)</p></pre>albatr0s: <pre><p>It's much easier to open a second file and write the results there, once you are done you just rename it.</p></pre>shovelpost: <pre><p>You could use <a href="https://golang.org/pkg/bufio/#example_Scanner_lines" rel="nofollow">Scanner</a> which by default reads the input as a set of lines.</p></pre>Vaglame: <pre><p>Yes but it reads it using a buffer, which makes it very difficult to record its actual position in the file, so far I haven't found a seek() function that could say what the Scanner has read so far. In short I'm looking for something like this: <a href="https://golang.org/pkg/text/scanner/#Scanner.Pos" rel="nofollow">https://golang.org/pkg/text/scanner/#Scanner.Pos</a>, but for bufio.</p></pre>shovelpost: <pre><blockquote>
<p>I'm trying to go through a file through a Scanner or a Reader line by line, then check if each line matches a certain pattern.</p>
<p>Once this is done, I'd like to record the position of the line to edit it</p>
</blockquote>
<p>It might be helpful to tell us why you want to do this. Manipulating files has never been the easiest of tasks. Depending on what you are trying to achieve there might be a better way.</p></pre>Vaglame: <pre><p>I'm trying to keep a record of IP addresses of a few devices. I write them in a file with this format:</p>
<pre><code>device1 address1
device2 address2
</code></pre>
<p>etc.</p>
<p>So, every time this address changes, I (try to) update this information in the file</p></pre>shovelpost: <pre><blockquote>
<p>I'm trying to keep a record of IP addresses of a few devices. I write them in a file with this format:</p>
<pre><code>device1 address1
device2 address2
</code></pre>
<p>etc.</p>
<p>So, every time this address changes, I (try to) update this information in the file</p>
</blockquote>
<p>Based on what you said, I can't see any good reason to save the information in a file. You're only making your life harder.</p>
<p>Save yourself from all the trouble and use an embedded database like <a href="https://github.com/boltdb/bolt" rel="nofollow">Bolt</a>.</p></pre>Vaglame: <pre><p>I'll give it a try, thank you :) </p></pre>tgaz: <pre><p>If your file is line-based text, you usually don't edit in-place since line lengths generally change (as <a href="/u/albatr0s" rel="nofollow">/u/albatr0s</a> points out)...</p>
<p>So for a generic "framework/scaffolding" for this problem, I'd use the scanner and output the result to a new temporary file, and when you hit the thing you want to change, you output that instead while discarding the input. Then rename the temp file to the old file. Remove the temp file on error (use a <code>defer</code> and just ignore the error).</p>
<p>Oh, and I have no idea why you were downvoted. Maybe because the question is about a trivial problem and better suited for StackOverflow or such. But the downvoting feels like it breaks the first subreddit rule.</p></pre>Vaglame: <pre><p>Thanks a lot. Would you consider this solution even if the file must be accessed relatively frequently? </p></pre>tgaz: <pre><p>It's the only sane option in a POSIX environment, so yes. Depends on the file size, I guess. But if you care about performance, you should probably not be using a text format anyway. More importantly, it's the only atomic way of modifying a file, which solves concurrency correctness.</p>
<p>Just use a buffered writer, or you might pay a penalty for writing the small <code>Scanner</code> fragments.</p></pre>Vaglame: <pre><p>What format would you advise instead of the text one? </p></pre>tgaz: <pre><p>That's impossible to say without knowing your use-case. If this is data you have to interface with some other system, you may of course not have a choice.</p>
<p>I've never had the need to modify a text file from Go. For quick things, I'd normally use a Shell script with <code>awk</code> or <code>sed</code>, or a Python script. Manipulating text in Go isn't that bad, but for simple text file manipulation, Python code is definitely nicer.</p></pre>0xjnml: <pre><p><code>sed</code></p></pre>