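Is there a way to have dplyr pipe data from a database into a new table in that same database, without ever downloading the data locally?
I'd like to do something like: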
tbl(con,"mytable") %>% group_by(dt) %>% tally() %>% write_to(name = "mytable_2",schema = "transformed")
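Solution
While I wholeheartedly agree with the suggestion to learn SQL, you can take advantage of the fact that dplyr doesn't pull data until it absolutely has to: build the query with dplyr, append a TO TABLE clause, and then run the SQL statement with dplyr::do(), as in: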
# CREATE A DATABASE WITH A 'FLIGHTS' TABLE
library(RSQLite)
library(dplyr)
library(nycflights13)

my_db <- src_sqlite("~/my_db.sqlite3", create = TRUE)
flights_sqlite <- copy_to(my_db, flights, temporary = FALSE,
                          indexes = list(c("year", "month", "day"), "carrier", "tailnum"))

# BUILD A QUERY (nothing is executed or downloaded yet)
QUERY = filter(flights_sqlite, year == 2013, month == 1, day == 1) %>%
  select(year, month, day, carrier, dep_delay, air_time, distance) %>%
  mutate(speed = distance / air_time * 60) %>%
  arrange(year, carrier)

# ADD THE "TO TABLE" CLAUSE AND EXECUTE THE QUERY
do(paste(unclass(QUERY$query$sql), "TO TABLE foo"))
You could even write a small function to do this:
to_table <- function(qry,tbl) dplyr::do(paste(unclass(qry$query$sql),"TO TABLE",tbl))
filter(flights_sqlite,carrier) %>% to_table('foo')
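The same idea (build the query lazily, then have the database materialize it) can be sketched with DBI and dbplyr. This is only an illustration under some assumptions: the query$sql field used above may not be available in newer dplyr versions, so this sketch renders the SQL with dbplyr::sql_render() instead, and it uses SQLite's CREATE TABLE ... AS SELECT form in place of a TO TABLE clause. The in-memory database, the sample rows, and this version of the to_table() helper are hypothetical, made up for the example.

```r
library(DBI)
library(dplyr)
library(dbplyr)

# In-memory SQLite database with a tiny made-up table (illustrative data only).
con <- dbConnect(RSQLite::SQLite(), ":memory:")
dbWriteTable(con, "mytable", data.frame(
  dt = c("2013-01-01", "2013-01-01", "2013-01-02"),
  x  = c(1, 2, 3)
))

# Build the query lazily; no rows are pulled into R at this point.
qry <- tbl(con, "mytable") %>%
  group_by(dt) %>%
  tally()

# Hypothetical helper: wrap the rendered SELECT in CREATE TABLE ... AS
# and run it entirely inside the database.
to_table <- function(qry, con, name) {
  stmt <- paste0(
    "CREATE TABLE ", dbQuoteIdentifier(con, name), " AS ",
    sql_render(qry)
  )
  dbExecute(con, stmt)
  invisible(tbl(con, name))  # lazy reference to the new table
}

to_table(qry, con, "mytable_2")

# Read the new table back only to inspect the result.
result <- dbReadTable(con, "mytable_2")
```

If a built-in route is preferred, dplyr::compute(qry, name = "mytable_2", temporary = FALSE) should achieve much the same thing on a database-backed tbl. Writing into a specific schema such as "transformed" depends on the backend and is not covered by this sketch.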