Cleaning up JSON before unmarshaling

polaris · 2017-10-17 09:00:13 · 629 views

Hi!

I'm consuming two separate third-party services. First service uses PascalCase for their keys, and the second uses camelCase. I decided to standardize over snake_case.

The payload of one of these services can have up to hundreds of keys.

How can I unmarshal camelCase or snake_case, but always marshal snake_case?


Comments:

TheMerovius:

If you have a field that will always be there, you can define two structs with different fields and use that field to decide which one to use: https://play.golang.org/p/7_YcTY4Qt2

You could also unmarshal into a map[string]interface{} and then iterate over that and, depending on the key, manually unmarshal them, or write the equivalent using reflection, or try and find a package that allows you to decode a map[string]interface{} into a struct via reflection (there probably is one).

Frankly, this is one of the reasons I've always been kind of dissatisfied with the encoding/json API; it treats how to encode/decode as a property of a type, instead of a property of the process.

fllr:

Yeah. It has been incredibly frustrating. It was disappointing to notice the discrepancies between the APIs, for sure, but this should definitely have been easier.

Do you have an example of the manual unmarshal? It doesn’t seem at all straightforward in a strongly typed system (I’m coming from Python).

The two struct solution will unfortunately not work for the api with hundreds of keys

TheMerovius:

The two struct solution will unfortunately not work for the api with hundreds of keys

You are going to have to define a struct anyway. Seems to be a straightforward case for copy-paste, followed by search-and-replace. If you are not defining a struct (i.e. are throwing around maps, which FTR you shouldn't do), then just use map[string]interface{} and do

for k, v := range m {
    k2 := camel2snake(k)
    if k != k2 {
        // Move the value to its snake_case key.
        delete(m, k)
        m[k2] = v
    }
}

in your unmarshal function.
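The thread never defines `camel2snake`, so here is one possible sketch of it. It is an assumption about what such a helper could look like, not code from the thread: it inserts an underscore before an uppercase letter that follows a lowercase letter or digit, and lowercases everything. Runs of capitals are kept together, so a boundary such as the one in `HTTPServer` is not detected.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// camel2snake converts camelCase or PascalCase keys to snake_case.
// Keys that are already snake_case are returned unchanged.
func camel2snake(s string) string {
	var b strings.Builder
	var prev rune
	for _, r := range s {
		if unicode.IsUpper(r) {
			// Start a new word only when leaving a lowercase letter or digit.
			if unicode.IsLower(prev) || unicode.IsDigit(prev) {
				b.WriteByte('_')
			}
			b.WriteRune(unicode.ToLower(r))
		} else {
			b.WriteRune(r)
		}
		prev = r
	}
	return b.String()
}

func main() {
	for _, k := range []string{"SomeKey", "someKey", "some_key", "userID"} {
		fmt.Println(k, "->", camel2snake(k))
	}
}
```

If a payload contains both `someKey` and `some_key`, the normalization loop above will let one silently overwrite the other, so it may be worth checking for collisions first.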

Do you have an example of the manual unmarshal? It doesn’t seem at all straightforward in a strongly typed system

That's because it isn't :) If I had a simple example, I would've provided it. When I said "manually unmarshal it", I meant writing the equivalent of

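var m map[string]interface{}
if err := json.Unmarshal(data, &m); err != nil {
    return err
}
// The two-value assertion avoids a panic when a key is missing or not a string.
if v, ok := m["some_key"].(string); ok {
    foo.SomeKey = v
}
if v, ok := m["someKey"].(string); ok {
    foo.SomeKey = v
}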

which, of course, is "not practical for 100s of fields" :)

For the reflection-based solution, you want to do essentially the same thing that encoding/json is doing to decide where to store unmarshaled values. Unfortunately, this is non-trivial, which is why I am not providing an example. If you want to go that route, I'd suggest starting by reading the source code of encoding/json.

But I do believe there is a high likelihood that someone had a similar need and already wrote something like func Map2Value(m map[string]interface{}, v interface{}) error, which you could use like

var m map[string]interface{}
if err := json.Unmarshal(data, &m); err != nil {
    return err
}
for k, v := range m {
    k2 := camel2snake(k)
    if k != k2 {
        // Move the value to its snake_case key.
        delete(m, k)
        m[k2] = v
    }
}
var foo Foo
return Map2Value(m, &foo)

But as I never had this problem, I don't know any specific packages and don't want to recommend any I don't know. So I'd recommend googling this.
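Short of pulling in a package, a `Map2Value` with that signature can be approximated with the standard library alone by re-encoding the map and decoding it into the struct. This is a sketch under that assumption, not an existing library function; `Foo` and its fields are hypothetical. It costs an extra encode/decode pass, and because numbers in a `map[string]interface{}` produced by `json.Unmarshal` are `float64`, very large integers can lose precision.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Foo is a hypothetical target type with snake_case tags.
type Foo struct {
	SomeKey  string `json:"some_key"`
	OtherKey int    `json:"other_key"`
}

// Map2Value is a hypothetical helper: it re-encodes m as JSON and decodes
// the result into v, so the usual struct tag matching applies.
func Map2Value(m map[string]interface{}, v interface{}) error {
	data, err := json.Marshal(m)
	if err != nil {
		return err
	}
	return json.Unmarshal(data, v)
}

func main() {
	// The keys are assumed to have been normalized to snake_case already.
	var m map[string]interface{}
	if err := json.Unmarshal([]byte(`{"some_key":"hello","other_key":42}`), &m); err != nil {
		fmt.Println("error:", err)
		return
	}
	var foo Foo
	if err := Map2Value(m, &foo); err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%+v\n", foo) // {SomeKey:hello OtherKey:42}
}
```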

Honestly, IMO the "two structs" approach is the most sensible, even if it requires copious amounts of copy-paste. But that's just… like… um… my opinion. Man. :)

fllr:

I ended up writing my own UnmarshalJSON function, and it works like a charm! Thanks for the sensible responses! :)

drvd:

Best advice is probably: don't. Keep each API the way it works. A snake_case standard is not helpful at all when dealing with encoding/json.

8lall0:

I agree. If you can't guarantee the same api "standard", it's just useless work.

fllr:

What do you mean? It’s not useful to try and consume two different api sets and make something useful out of it?

sacrehubert:

It’s not useful to try and consume two different api sets and make something useful out of it?

That very much depends on what you're trying to do. We need context in order to give you a good answer.

fllr:

No. It's not up to you what is necessary to make my project work.

drvd:

It’s not useful to try and consume two different api sets and make something useful out of it?

It is. But this can be done more easily if the data is consumed without further modification or rewriting (even if you call this wasteful rewriting "standardization").

fllr:

It's about keeping a consistent interface inside the project. Consistency matters. And the fact that these two services are differing is important. I found a solution, though.

drvd:

Your consistent interface inside the application does not depend on the details of the external serialization format.

fllr:

That's what I'm trying to fix

hell_0n_wheel:

I decided to standardize

You're committing the classic blunder of "too many standards! I'll create my own..." as parodied by XKCD.

Unless you can give us a very compelling reason for doing so (and so far you haven't), I'd say this is your mistake right here.

fllr:

I'm not publishing this standard. This is a standard that will live inside of my environment.

Honestly, it shouldn't have been this hard to allow my service to consume datasets using multiple different key casings...

birkbork:

Instead of keeping your internal standard as snake_case keys, you could internally keep a standardized representation of the state using a struct, which can be created from either of your external APIs.

This way you can have a "standard" internal structure that is by no means tied to either of the external APIs.

fllr:

I mean... That is what I'm trying to do. How would I go about doing that?

sacrehubert:

Why is this necessary?

fllr:

I have no control over the data coming from the wire, and standardization is important

sacrehubert:

I have no control over the data coming from the wire

Right

and standardization is important

Why? What happens if it's not standardized?

fllr:

I'm going to have to remember whether I'm talking to service X or service Y, and thus whether it uses PascalCase or camelCase. Either way, it should be simple to consume this however I see fit. I have 10 years of experience in the industry. I don't need anyone second-guessing the needs I find in my own project just because they don't know the answer to the question I'm asking.

snippet2:

You could embed the struct, but I'd just keep things simple, in case you want to add more.

