The JSON below is an Elasticsearch response. Each additional aggregation adds another nested level of buckets, and the data fields cannot be known in advance.
```json
{
"took": 596,
"timed_out": false,
"_shards": {
"total": 11,
"successful": 11,
"failed": 0
},
"hits": {
"total": 1121497,
"max_score": 0,
"hits": []
},
"aggregations": {
"day.raw": {
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0,
"buckets": [
{
"key": "20170418",
"doc_count": 1121497,
"channel.raw": {
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0,
"buckets": [
{
"key": "channel01",
"doc_count": 901649,
"acttype.raw": {
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0,
"buckets": [
{
"key": "show",
"doc_count": 424711,
"ct": {
"value": 143760
}
},
{
"key": "click",
"doc_count": 253006,
"ct": {
"value": 114883
}
},
{
"key": "install",
"doc_count": 139527,
"ct": {
"value": 68115
}
},
{
"key": "installed",
"doc_count": 84405,
"ct": {
"value": 49037
}
}
]
}
},
{
"key": "channel02",
"doc_count": 107639,
"acttype.raw": {
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0,
"buckets": [
{
"key": "show",
"doc_count": 50364,
"ct": {
"value": 17019
}
},
{
"key": "click",
"doc_count": 32334,
"ct": {
"value": 14123
}
},
{
"key": "install",
"doc_count": 19891,
"ct": {
"value": 9259
}
},
{
"key": "installed",
"doc_count": 5050,
"ct": {
"value": 2922
}
}
]
}
},
{
"key": "channel03",
"doc_count": 69671,
"acttype.raw": {
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0,
"buckets": [
{
"key": "show",
"doc_count": 26617,
"ct": {
"value": 8229
}
},
{
"key": "click",
"doc_count": 22793,
"ct": {
"value": 7812
}
},
{
"key": "install",
"doc_count": 19919,
"ct": {
"value": 6165
}
},
{
"key": "installed",
"doc_count": 342,
"ct": {
"value": 290
}
}
]
}
},
{
"key": "channel04",
"doc_count": 42511,
"acttype.raw": {
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0,
"buckets": [
{
"key": "show",
"doc_count": 22565,
"ct": {
"value": 8044
}
},
{
"key": "click",
"doc_count": 11601,
"ct": {
"value": 5890
}
},
{
"key": "install",
"doc_count": 7208,
"ct": {
"value": 3802
}
},
{
"key": "installed",
"doc_count": 1137,
"ct": {
"value": 761
}
}
]
}
},
{
"key": "channel05",
"doc_count": 27,
"acttype.raw": {
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0,
"buckets": [
{
"key": "show",
"doc_count": 20,
"ct": {
"value": 7
}
},
{
"key": "click",
"doc_count": 7,
"ct": {
"value": 2
}
}
]
}
}
]
}
}
]
}
}
}
```
In the end I want to parse it into a two-dimensional array and print it as a table, something like:
```
+--------------+-----------+---------+----+
| day          | channel   | acttype | ct |
+--------------+-----------+---------+----+
| cloudmonitor | *         | 1       | 1  |
| mysql.sys    | localhost | 1       | 1  |
| root         | 192.%     | 1       | 1  |
| root         | localhost | 1       | 1  |
+--------------+-----------+---------+----+
4 rows in set (0.00 sec)
```
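For the output side, here is a minimal Python sketch of turning a two-dimensional array into a table of that shape. The function name `render_table` is made up for illustration, and the two sample rows are taken from the `channel05` buckets in the response above; it assumes every row has the same number of cells as the header.

```python
def render_table(headers, rows):
    """Render a header plus rows (a 2-D list) as a MySQL-style ASCII table."""
    # Convert every cell to text so widths can be measured uniformly.
    table = [[str(cell) for cell in headers]]
    table += [[str(cell) for cell in row] for row in rows]

    # Each column is as wide as its longest cell.
    widths = [max(len(row[i]) for row in table) for i in range(len(headers))]

    border = "+" + "+".join("-" * (w + 2) for w in widths) + "+"

    def format_row(row):
        cells = (cell.ljust(w) for cell, w in zip(row, widths))
        return "| " + " | ".join(cells) + " |"

    lines = [border, format_row(table[0]), border]
    lines += [format_row(row) for row in table[1:]]
    lines.append(border)
    return "\n".join(lines)


# Sample rows copied from the channel05 buckets in the response.
sample_rows = [
    ["20170418", "channel05", "show", 7],
    ["20170418", "channel05", "click", 2],
]
print(render_table(["day", "channel", "acttype", "ct"], sample_rows))
```

Column widths are computed from the data, so the borders stay aligned however long the keys turn out to be.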
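Leaving character encoding aside, JSON can be read from the first character to the last and, because it follows a regular grammar, parsed completely in one pass.
For JSON with an unusual shape like this, implementing the parsing yourself tends to work out better than relying on a basic library.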
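buckets is an Array. Each element has the keys: key, doc_count, [ct], [xxx.raw]. [ ] means the key may or may not be present.
xxx.raw is an Object. Keys: doc_count_error_upper_bound, sum_other_doc_count, buckets
aggregations is an Object. Keys: xxx.raw
Write one function to handle each of them. The first two call each other, but the buckets always end at a level with no xxx.raw entry, so the recursion terminates instead of looping forever.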
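The three-function idea can be sketched in Python roughly as follows. This is only an illustration under some assumptions: the function names are invented here, a sub-aggregation is recognised as any object with a `buckets` list, a metric as any object with a `value` key, and the `.raw` suffix is stripped to get column names. The trimmed `sample` reuses the `channel05` values from the response above.

```python
def column_name(agg_name):
    # Assumption: drop the ".raw" suffix so "day.raw" becomes column "day".
    return agg_name[:-4] if agg_name.endswith(".raw") else agg_name


def is_sub_aggregation(value):
    return isinstance(value, dict) and isinstance(value.get("buckets"), list)


def is_metric(value):
    return isinstance(value, dict) and "value" in value


def parse_buckets(agg_name, buckets, parent_row):
    """buckets is an array; each bucket may hold metrics and nested xxx.raw objects."""
    rows = []
    for bucket in buckets:
        row = dict(parent_row)
        row[column_name(agg_name)] = bucket["key"]
        # Collect metrics such as "ct" first so deeper levels inherit them.
        for name, value in bucket.items():
            if is_metric(value):
                row[name] = value["value"]
        subs = [(n, v) for n, v in bucket.items() if is_sub_aggregation(v)]
        if not subs:
            rows.append(row)  # leaf bucket: no xxx.raw entry, recursion stops here
        for name, value in subs:
            rows.extend(parse_aggregation(name, value, row))
    return rows


def parse_aggregation(agg_name, agg, parent_row):
    """xxx.raw is an object; only its buckets matter for the table."""
    return parse_buckets(agg_name, agg["buckets"], parent_row)


def parse_aggregations(aggregations):
    """aggregations is an object whose keys are the top-level xxx.raw names."""
    rows = []
    for name, agg in aggregations.items():
        if is_sub_aggregation(agg):
            rows.extend(parse_aggregation(name, agg, {}))
    return rows


# Trimmed-down version of the response above (channel05 only).
sample = {"aggregations": {"day.raw": {"buckets": [
    {"key": "20170418", "doc_count": 27, "channel.raw": {"buckets": [
        {"key": "channel05", "doc_count": 27, "acttype.raw": {"buckets": [
            {"key": "show", "doc_count": 20, "ct": {"value": 7}},
            {"key": "click", "doc_count": 7, "ct": {"value": 2}},
        ]}},
    ]}},
]}}}

rows = parse_aggregations(sample["aggregations"])
for row in rows:
    print(row)
```

Each row comes out as a dict keyed by column name, so the header never has to be hard-coded; converting to a two-dimensional array is then just a matter of picking a column order.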