Golang Date Parsing

agolangf · 545 views
<p>Am I missing something in trying to parse this XML?</p> <p><a href="https://play.golang.org/p/j4DaQLYAXid" rel="nofollow">https://play.golang.org/p/j4DaQLYAXid</a></p> <p>The date attribute on schedule is getting parsed into &#34;0001-01-01 00:00:00 +0000 UTC&#34; instead of the appropriate date.</p> <hr/>**Comments:**<br/><br/>dcowboy: <pre><p>Set the type of Date to string and handle any parsing needs afterward. time.Time does not play nice with XML unmarshalling.</p> <pre><code>package main

import (
	&#34;encoding/xml&#34;
	&#34;fmt&#34;
	&#34;time&#34;
)

type Schedule struct {
	Date string `xml:&#34;date,attr&#34;`
}

func main() {
	validXML := `&lt;schedule date=&#34;2018-01-01&#34; /&gt;`
	s := Schedule{}
	// Check the error rather than discarding it; a failed parse is otherwise silent.
	if err := xml.Unmarshal([]byte(validXML), &amp;s); err != nil {
		fmt.Println(&#34;unmarshal:&#34;, err)
		return
	}
	fmt.Println(s.Date)
	// Parse the date-only string with a matching layout.
	parsedDate, err := time.Parse(&#34;2006-01-02&#34;, s.Date)
	fmt.Println(parsedDate, err)
}
</code></pre></pre>JGailor: <pre><p>Thanks, this is very useful. But is the expectation really that you would reparse this date everywhere you passed a reference to that struct around?</p></pre>Jemaclus: <pre><p>Try checking the Unmarshal error: <a href="https://play.golang.org/p/wz8uyts3_a4" rel="nofollow">https://play.golang.org/p/wz8uyts3_a4</a></p></pre>JGailor: <pre><p>+1 for the help and useful tip, but can Go really not parse a basic RFC 3339 spec date out of the box? This is not an uncommon way of representing dates across APIs and data formats.</p></pre>threemux: <pre><p>You keep saying this - Go can parse any time format you want, you just have to tell it what format to use. There are hundreds of time formats in use (several of which are specified using RFCs!) What are you expecting here? I&#39;d argue that RFC3339 is usually used to reference the full datetime, which this is not. 
Go even has a constant for an RFC3339 date format:</p> <p><a href="https://godoc.org/time#pkg-constants" rel="nofollow">https://godoc.org/time#pkg-constants</a></p></pre>JGailor: <pre><p>I&#39;m saying this because Go will already automatically parse other dates, and this particular format is called explicitly in RFC 3339:</p> <p><a href="https://tools.ietf.org/html/rfc3339" rel="nofollow">https://tools.ietf.org/html/rfc3339</a></p> <p><code>full-date = date-fullyear &#34;-&#34; date-month &#34;-&#34; date-mday</code></p> <p>I&#39;m surprised because it seems like a very arbitrary decision and is not one I&#39;ve typically encountered in other languages. I know that Go can parse arbitrary formats, but given that the Go XML parser will automatically convert other date formats it doesn&#39;t seem unusual to expect it to handle a standard.</p></pre>threemux: <pre><p>Yeah that&#39;s a part of the ABNF format given for the standard, which is this:</p> <p>date-time = full-date &#34;T&#34; full-time</p> <p>It&#39;d be a bit strange for a standard called &#34;Date and Time on the Internet: Timestamps&#34; to define a standard for just dates, no? I think that&#39;s where the disconnect really is here: you&#39;re expecting a timestamp type (called time.Time) to handle a format which omits time. I&#39;m curious what other formats it automatically converts - seems to me it (ironically) tries RFC3339 timestamp and then fails if the passed value doesn&#39;t conform. </p></pre>JGailor: <pre><p>Yeah, my mistake in a misreading.</p> <p><code>Pretty much, yes - RFC 3339 is listed as a profile of ISO 8601. Most notably RFC 3339 requires a complete representation of date and time (only fractional seconds are optional). The</code></p> <p>An assumption I had from working so long with ISO 8601, where just a date is part of the standard (as well as just a time, and a combined datetime). 
A bit frustrating, as there is a lot of data out there in the world that is bound to a particular day but not any particular time of day.</p></pre>threemux: <pre><p>Agreed - given your use case I can see why this is frustrating. If you get things in formats other than XML, take a look at UnmarshalText. That&#39;s what&#39;s being used in this case, as it&#39;s defined on time.Time:</p> <p><a href="https://github.com/golang/go/blob/master/src/time/time.go#L1249" rel="nofollow">https://github.com/golang/go/blob/master/src/time/time.go#L1249</a></p></pre>threemux: <pre><p>time.Time is a common format for all time types, so it&#39;s picking a default to try and parse and failing in this case. You&#39;ll need to define UnmarshalXMLAttr on a new type (or as a struct member) and then use that type as part of the unmarshal:</p> <p><a href="https://play.golang.org/p/_LOw1xB6Uxt" rel="nofollow">https://play.golang.org/p/_LOw1xB6Uxt</a></p> <p>EDIT: Forgot to mention that what you&#39;re really doing here is implementing xml.UnmarshalerAttr for your type. If the value you want is in something other than an attribute, you&#39;ll need to implement xml.Unmarshaler: <a href="https://godoc.org/encoding/xml#Unmarshaler" rel="nofollow">https://godoc.org/encoding/xml#Unmarshaler</a></p></pre>JGailor: <pre><p>Got it; this is really useful to know. I was kind of pulling out my hair and getting to the point of using SAX parsing and doing most of the work by hand. I understand that the core libraries can&#39;t cover every use case, but for a &#34;batteries included&#34; language it feels like being able to handle the most common standards should be a minimum bar to clear.</p> <p>*** Also kind of a bummer that once you&#39;ve decided to implement the UnmarshalerAttr interface you basically are on your own for all attribute fields.</p></pre>threemux: <pre><blockquote> <p>You are basically on your own for all attribute fields</p> </blockquote> <p>I don&#39;t follow. 
Do you have a large variety of different datetime formats you&#39;re trying to parse? Most attributes I imagine you&#39;ll be unmarshalling into a number or string or similar and that will work. You can reuse this type anytime you see a date in that format and it will parse correctly. </p></pre>JGailor: <pre><p>I actually do have a wide variety of date time formats to parse; these are television schedule files that are used around the world, and have different types of dates and times in them.</p></pre>JGailor: <pre><p>However, carefully reading over your code, and using the type alias for time.Time has made it a lot more clear to me how you would implement this. I appreciate the help.</p></pre>threemux: <pre><p>No problem - given your use case this is going to be a bit more painful than parsing XML normally is. Thankfully you&#39;ll only have to do this once for each different kind of time you see. </p></pre>wittywitwitty: <pre><p><a href="http://fuckinggodateformat.com/" rel="nofollow">http://fuckinggodateformat.com/</a></p></pre>JGailor: <pre><p>I get the Go way of date formats, but this is an RFC 3339 spec date (and an ISO 8601 date) and it&#39;s more than a little surprising that Go&#39;s XML parser cannot properly deserialize these out of the box. It really feels like missing the forest for the trees.</p></pre>

