Connecting to Snowflake With R
Table of Contents
Introduction
As my organization Lederne are in a journey of switching from on-prem systems into Snowflake as our dataplatform, I have looked far and wide in order to find a production ready docker image that supports using R with Snowflake. As I could not find anything in my search, I decided to build an image myself. Thankfully a lot of the driver setup was already build for a Python image created by Zohar Nissare-Houssen in his snowflake-jupyter repo.
Usecases
I use R for a lot of data-wrangling, but also to help me with a lot of Data Engineering tasks. With R connected to snowflake, I have been able to:
- Speed up the writing of some SQL queries by utilizing the DBPlyr package
- Create and clean tables
- Schedule tasks that builds models from SnowfLake
How to use the container
The container is open for everyone to get at My github page. Below I have written how to use the image:
-
To build the image after pulling it from GitHub, write
docker build --tag dev:latest .
in your terminal where you have changed the directory to the folder the docker file is in. -
Write
docker run -d -p 8787:8787 -e PASSWORD=1234 dev:latest
(replace pass to your liking). If you want, you can mount a directory with a project to work on like this:
docker run --rm -p 8787:8787 -v C:/Users/DannysComputer/Documents/rstudio:/home/rstudio -e PASSWORD=1234 dev:latest
In order to just execute a script you can use:
docker run dev:latest Rscript -e 'source("/home/rstudio/main.R")'
-
You can now log in at http://localhost:8787/
Connect to snowflake
Use this to set up a connection to Snowflake:
#Load libraries
library(tidyverse)
library(odbc)
library(DBI)
#log in
con <- DBI::dbConnect(
drv = odbc::odbc(),
UID = "username",
PWD = "password",
Server = "yourAccount.west-europe.azure.snowflakecomputing.com",
Warehouse = 'COMPUTE_WH',
Driver = "SnowflakeDSIIDriver",
Database = "yourDataBase",
Schema = "yourSchema")
Alternatively, you can edit the odbc.ini file included in this repo and connect by running:
con <- dbConnect(odbc(), "snowflake")
Using snowflake with R and Tidyverse commands
Below is some examples of things you can do after you have set up your connection with Snowflake
#Connect to a table or view
df <- tbl(con, 'yourTableName')
#Full load table
data <- df%>%collect
#Load 10 rows
data <- head(10)%>%collect
#add table
dbCreateTable(con, "iris", iris)
#remove table
dbRemoveTable(con, "iris")
#upload data
dbWriteTable(con, "iris", iris, overwrite = TRUE, row.names = FALSE)
Rounding things up
There are so many ways to interact with Snowflake. It is definitely one of my favorite platforms to work on at the moment. I personally enjoy using R for a variety of tasks, and having build this image has made things a bit easier for me. I hope this can be of use to others in the R commmunity using Snowflake.
Let me know if this post is easy to follow or if it needs additional explanation by leaving a comment.