I can't work out how to convert my JSON data into a usable data frame. Here is some sample data showing its structure:
{ "data":[ {"track":[ {"time":"2015","midpoint":{"x":6,"y":8},"realworld":{"x":1,"y":3},"coordinate":{"x":16,"y":38}},{"time":"2015","y":37}},{"time":"2016","y":9},"realworld":{"x":2,"y":38}} ]},{"track":[ {"time":"2015","midpoint":{"x":5,"realworld":{"x":-1,"midpoint":{"x":3,"y":15},"realworld":{"x":-9,"y":2},"coordinate":{"x":17,"y":7},"realworld":{"x":-2,"y":39}} ]}]}
I have many tracks, and I would like the dataset to look like this:
track  time  midpoint  realworld  coordinate
1
1
1
2
2
2
2
3
This is what I have so far:
json_file <- "testdata.json"
data <- fromJSON(json_file)
data2 <- list.stack(data, fill = TRUE)
Right now it comes out like this:
How can I get this into the right format?
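Solution
Add the flatten = TRUE argument when reading the data with fromJSON. This gives you a nested list whose deepest level holds three data frames, one per track. Using: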
library(jsonlite)

# read the json
jsondata <- fromJSON(txt, flatten = TRUE)

# bind the dataframes in the nested 'track' list together
dat <- do.call(rbind, jsondata$data$track)

# add a track variable
dat$track <- rep(1:length(jsondata$data$track), sapply(jsondata$data$track, nrow))
gives:
> dat
  time midpoint.x midpoint.y realworld.x realworld.y coordinate.x coordinate.y track
1 2015          6          8           1           3           16           38     1
2 2015          6          8           1           3           16           37     1
3 2016          6          9           2           3           16           38     1
4 2015          5          9          -1           3           16           38     2
5 2015          5          9          -1           3           16           38     2
6 2016          5          9          -1           3           16           38     2
7 2015          3         15          -9           2           17           38     2
8 2015          6          7          -2           3           16           39     3
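To see what flatten = TRUE produces before binding, here is a minimal self-contained sketch. The JSON in sample_txt is a small hypothetical sample that follows the same data/track layout as the question; it is not the original data, and the names sample_txt, parsed, tracks and combined are made up for illustration.

```r
library(jsonlite)

# Hypothetical sample: two tracks, each point has time, midpoint,
# realworld and coordinate (not the original data)
sample_txt <- '{"data":[
  {"track":[
    {"time":"2015","midpoint":{"x":6,"y":8},"realworld":{"x":1,"y":3},"coordinate":{"x":16,"y":38}},
    {"time":"2016","midpoint":{"x":6,"y":9},"realworld":{"x":2,"y":3},"coordinate":{"x":16,"y":37}}
  ]},
  {"track":[
    {"time":"2015","midpoint":{"x":5,"y":9},"realworld":{"x":-1,"y":3},"coordinate":{"x":17,"y":38}}
  ]}
]}'

parsed <- fromJSON(sample_txt, flatten = TRUE)

# 'track' is a list with one data frame per track; the nested x/y
# objects should appear as columns such as midpoint.x and midpoint.y
tracks <- parsed$data$track
str(tracks, max.level = 1)

# Stack the per-track data frames and record which track each row came from
combined <- do.call(rbind, tracks)
combined$track <- rep(seq_along(tracks), vapply(tracks, nrow, integer(1)))
print(combined)
```

seq_along and vapply are used here instead of 1:length(...) and sapply only because they behave more predictably when the list is empty; the logic is otherwise the same as in the answer's code.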
Another, shorter approach is to combine jsonlite with rbindlist from the data.table package:
library(jsonlite)
library(data.table)

# read the json
jsondata <- fromJSON(txt, flatten = TRUE)

# bind the dataframes in the nested 'track' list together
# and include an id-column at the same time
dat <- rbindlist(jsondata$data$track, idcol = 'track')
Or use bind_rows from the dplyr package in a similar way:
library(dplyr)
dat <- bind_rows(jsondata$data$track, .id = 'track')
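The question passes fill = TRUE, which suggests that some tracks may lack fields that others have. The sketch below shows how both binding functions can cope with that, assuming a hypothetical sample (uneven_txt) in which the second track has no coordinate field. The names uneven_txt, tracks, dt and df are illustrative only.

```r
library(jsonlite)
library(data.table)
library(dplyr)

# Hypothetical sample: the second track has no 'coordinate' field
uneven_txt <- '{"data":[
  {"track":[
    {"time":"2015","midpoint":{"x":6,"y":8},"coordinate":{"x":16,"y":38}}
  ]},
  {"track":[
    {"time":"2016","midpoint":{"x":5,"y":9}}
  ]}
]}'

tracks <- fromJSON(uneven_txt, flatten = TRUE)$data$track

# data.table: fill = TRUE pads columns missing from a track with NA
dt <- rbindlist(tracks, idcol = "track", fill = TRUE)

# dplyr: bind_rows matches columns by name and fills gaps with NA;
# the .id column may come back as character, so convert it if a
# numeric track index is wanted
df <- bind_rows(tracks, .id = "track")
df$track <- as.integer(df$track)

print(dt)
print(df)
```

Without fill = TRUE, rbindlist would be expected to stop with an error when the tracks have different columns, so the argument is worth keeping whenever the JSON is not perfectly regular.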
Data used:
txt <- '{ "data":[ {"track":[ {"time":"2015","y":39}} ]}]}'