R packages

    Pooling database connections in R The goal of the pool package is to abstract away the logic of connection management and the performance cost of fetching a new connection from a remote database. These concerns are especially prominent in interactive contexts, like Shiny apps (which connect to a remote database) or even at the R console. So, while this package is of most practical value to Shiny developers, there is no harm if it is used in other contexts.
    Using DBI DBI separates the connectivity to the DBMS into a “front-end” and a “back-end”. Applications use only the exposed front-end API. The back-end facilities that communicate with specific DBMSs (SQLite, MySQL, PostgreSQL, MonetDB, etc.) are provided by drivers (other packages) that get invoked automatically through S4 methods. The following example illustrates some of the DBI capabilities: library(DBI) # Create an ephemeral in-memory RSQLite database con <- dbConnect(RSQLite::SQLite(), dbname = ":memory:") dbListTables(con) dbWriteTable(con, "mtcars", mtcars) dbListTables(con) dbListFields(con, "mtcars") dbReadTable(con, "mtcars") # You can fetch all results: res <- dbSendQuery(con, "SELECT * FROM mtcars WHERE cyl = 4") dbFetch(res) dbClearResult(res) # Or a chunk at a time res <- dbSendQuery(con, "SELECT * FROM mtcars WHERE cyl = 4") while(!
    Using an ODBC driver The odbc package provides a DBI-compliant interface to Open Database Connectivity (ODBC) drivers. It allows for an efficient, easy way to setup connection to any database using an ODBC driver, including SQL Server, Oracle, MySQL, PostgreSQL, SQLite and others. The implementation builds on the nanodbc C++ library. ODBC drivers can typically be downloaded from your database vendor, or they can be downloaded from RStudio when used with RStudio professional products.
    Using dplyr with databases As well as working with local in-memory data stored in data frames, dplyr also works with remote on-disk data stored in databases. This is particularly useful in two scenarios: Your data is already in a database. You have so much data that it does not all fit into memory simultaneously and you need to use some external storage engine. (If your data fits in memory, there is no advantage to putting it in a database; it will only be slower and more frustrating.