Why is checking whether a string is a palindrome so hard?

agolangf · · 314 次点击    
这是一个分享于 的资源,其中的信息可能已经有所发展或是发生改变。
<p>Many of you have probably heard of this question or been assigned it in school or an interview. It seems simple:</p> <p>for i := 0; i &lt; len(str) / 2; i++ { if str[i] != str[len(str) - 1 - i] { return false } }</p> <pre><code>return true </code></pre> <p>Except it isn&#39;t outside of ASCII land. Code points vary in byte length. Graphemes vary in code point length.</p> <p>You can get something messy looking working for single code point Unicode subsets with unicode.DecodeLastRuneInString. You can get something the kind-of-sort-of working with the extended text library&#39;s Form iterator, but it isn&#39;t in-place.</p> <p>One component of what makes this problem so difficult is how difficult it is to get the number of characters in a Unicode string.</p> <p>Please advise.</p> <p>Edit: I am trying to build an in-place solution.</p> <hr/>**评论:**<br/><br/>skarlso: <pre><p>Why not just reverse it, and compare? Runes go a long way too.</p></pre>venju: <pre><p>Characters may be multiple runes long (see grapheme clusters), and reversing at the rune level can make two identical grapheme clusters differ.</p></pre>skarlso: <pre><p>That&#39;s why I&#39;m saying runes. Runes should keep the multibyte character like &#39;你&#39; in tack, I think. Though I have to cook up a short script to be absolutely sure about that. :)</p></pre>skarlso: <pre><p>Yep, this works absolutely fine:</p> <p><a href="https://play.golang.org/p/_MD94Z3qhB" rel="nofollow">https://play.golang.org/p/_MD94Z3qhB</a> Playground to prove it.</p></pre>Fwippy: <pre><p>That doesn&#39;t work for multi-rune characters: <a href="https://play.golang.org/p/-S3T_Qqh12" rel="nofollow">https://play.golang.org/p/-S3T_Qqh12</a></p> <p>Further reading: <a href="https://blog.golang.org/normalization" rel="nofollow">https://blog.golang.org/normalization</a></p> <blockquote> <p>Theoretically, there is no bound to the number of runes that can make up a Unicode character.</p> </blockquote></pre>skarlso: <pre><p>Fair enough. Nice.</p></pre>chewxy: <pre><pre><code>asRunes := []rune(str) l := len(asRunes) </code></pre></pre>robpike: <pre><p>That (converting to runes) is also wrong in the presence of combining characters.</p> <p>Human language is messy, that&#39;s all there is to it. So handling it correctly is in general also messy, and very difficult.</p> <p>To answer the original question with a question: What is a character? The answer is nothing like as simple as you&#39;d like. See <a href="https://blog.golang.org/strings" rel="nofollow">https://blog.golang.org/strings</a> for some background.</p></pre>skarlso: <pre><p>Hello Mr. Pike. I thought that using runes would solve this issue. Also, going with the rune reverse algo which is in the bare string package, would that solve the problem? I have a short playground here to prove it: <a href="https://play.golang.org/p/_MD94Z3qhB" rel="nofollow">https://play.golang.org/p/_MD94Z3qhB</a> which correctly preserve a multibyte character.</p></pre>TheMerovius: <pre><p>Again, the question is, &#34;what is a character&#34;. Example: <a href="https://play.golang.org/p/-tKQxgiezM" rel="nofollow">https://play.golang.org/p/-tKQxgiezM</a></p></pre>skarlso: <pre><p>Good point!</p></pre>chewxy: <pre><p>Agreed</p></pre>cjbprime: <pre><p>I guess I don&#39;t see the problem. Why would the divide-by-2 compare ever fail on realistic input? Yes, you could have different size characters, but if you do, I think you just guaranteed that it&#39;s not a palindrome?</p></pre>venju: <pre><p>Reversing the string at the byte level will make two identical multibyte characters differ from one another.</p></pre>

入群交流(和以上内容无关):加入Go大咖交流群,或添加微信:liuxiaoyan-s 备注:入群;或加QQ群:692541889

314 次点击  
加入收藏 微博
0 回复
暂无回复
添加一条新回复 (您需要 登录 后才能回复 没有账号 ?)
  • 请尽量让自己的回复能够对别人有帮助
  • 支持 Markdown 格式, **粗体**、~~删除线~~、`单行代码`
  • 支持 @ 本站用户;支持表情(输入 : 提示),见 Emoji cheat sheet
  • 图片支持拖拽、截图粘贴等方式上传