Best way to process varying XML structures

polaris · 790 views
<p>I am building a server that will receive XML payloads with varying structures. Some will be similar but with added or missing nodes; others will be entirely different objects.</p>
<p>Luckily, I can read the root node and figure out the type of object, but even then I first have to unmarshal the XML to find out what type it is.</p>
<pre><code>&lt;Message Type=&#34;sometype&#34;&gt; </code></pre>
<p>or</p>
<pre><code>&lt;Message Type=&#34;someothertype&#34;&gt; </code></pre>
<p>The built-in XML package (encoding/xml) requires a rigid struct, so I am tinkering with <a href="https://github.com/clbanning/mxj" rel="nofollow">https://github.com/clbanning/mxj</a>, which lets me decode into a map[string]interface{}. This is still pretty ugly and requires multiple node and type checks.</p>
<pre><code>// Decode the whole payload into a generic map just to read the Type attribute.
mv, _ := mxj.NewMapXml([]byte(XML))
var t map[string]interface{}
if mv[&#34;Message&#34;] != nil {
    t = mv[&#34;Message&#34;].(map[string]interface{})
    // The Type attribute shows up under the key &#34;-Type&#34;.
    if t[&#34;-Type&#34;] != nil {
        switch t[&#34;-Type&#34;] {
        case &#34;sometype&#34;:
            processSomeType(XML)
        case &#34;someothertype&#34;:
            processSomeOtherType(XML)
        }
    }
}
</code></pre>
<p>Is there a better way than decoding twice? If I can reliably pull the Type out without first decoding the entire payload into a map[string]interface{} and then re-decoding it into one of my rigid structs, I will be fine. 
</p>
<p>One thought was to build a very basic parser that scans for the type and then decode into whichever struct matches.</p>
<hr/>**Comments:**<br/><br/>
JokerSp3: <pre><p>I will try to get you a code example later, but look at using a custom unmarshal function on a struct that can handle all message types.</p></pre>
jerf: <pre><p>First, just to set expectations, any convenient mechanism for dealing with XML must necessarily fail on some XML.</p> <p>But it sounds to me like you will probably be OK with encoding/xml, because the Decoder object has what seems to be a little-understood feature: it can be used as a hybrid of a SAX parser using <code>Token</code> and a struct decoder using <code>DecodeElement</code>. See for example <a href="https://play.golang.org/p/4RE1JpEqVf" rel="nofollow">https://play.golang.org/p/4RE1JpEqVf</a> .</p> <p>This allows you to examine any wrapper elements up to and including the first element of the thing you want to parse before having to decide on a type, without double-parsing.</p> <p>The downside is that you will do a bit of parsing yourself. You may want to write some convenience functions for things like &#34;get me the next start element&#34;, since <code>Token</code> will return all the whitespace as char data, for instance.</p></pre>
RwKroon: <pre><p>It&#39;s not an option to have clients send a header with the message type?</p></pre>
NikkoTheGreeko: <pre><p>No, it&#39;s data coming from a Fortune 50 company. Having them change the API is highly unlikely.</p> <p>To top off my frustration, it&#39;s a brand new system that was just launched, and they decided to use XML instead of JSON.</p></pre>
RwKroon: <pre><p>Is there some plaintext you can grep on maybe? 
Cheaper than double decoding...</p></pre>
clbanning: <pre><p>clbanning/mxj is not the best fit for a simple switch{} statement - try something like <a href="https://play.golang.org/p/rKaJVqypx_" rel="nofollow">https://play.golang.org/p/rKaJVqypx_</a></p></pre>
NikkoTheGreeko: <pre><p>So you suggest unmarshaling twice?</p></pre>
clbanning: <pre><p>Stylistically, that&#39;s the simplest unless you want to work with a map[string]interface{} value. If all the &#34;types&#34; are known, your code will be much cleaner. clbanning/mxj was originally designed for a message bus where messages could have any structure the connected services wanted to use - there are lots of other examples in the clbanning/mxj/examples subdirectory as well.</p></pre>
NikkoTheGreeko: <pre><p>Ya, this is a message bus. I am pretty sure I know ahead of time all the possible different structures, but there are dozens of combinations.</p></pre>
