Connecting to Snowflake With R

Table of Contents

Introduction

As my organization Lederne are in a journey of switching from on-prem systems into Snowflake as our dataplatform, I have looked far and wide in order to find a production ready docker image that supports using R with Snowflake. As I could not find anything in my search, I decided to build an image myself. Thankfully a lot of the driver setup was already build for a Python image created by Zohar Nissare-Houssen in his snowflake-jupyter repo.

Usecases

I use R for a lot of data-wrangling, but also to help me with a lot of Data Engineering tasks. With R connected to snowflake, I have been able to:

  • Speed up the writing of some SQL queries by utilizing the DBPlyr package
  • Create and clean tables
  • Schedule tasks that builds models from SnowfLake

How to use the container

The container is open for everyone to get at My github page. Below I have written how to use the image:

  1. To build the image after pulling it from GitHub, write docker build --tag dev:latest . in your terminal where you have changed the directory to the folder the docker file is in.

  2. Write docker run -d -p 8787:8787 -e PASSWORD=1234 dev:latest

    (replace pass to your liking). If you want, you can mount a directory with a project to work on like this:

    docker run --rm -p 8787:8787 -v C:/Users/DannysComputer/Documents/rstudio:/home/rstudio -e PASSWORD=1234 dev:latest

    In order to just execute a script you can use:

    docker run dev:latest Rscript -e 'source("/home/rstudio/main.R")' 
    
  3. You can now log in at http://localhost:8787/

Connect to snowflake

Use this to set up a connection to Snowflake:

#Load libraries
library(tidyverse)
library(odbc)
library(DBI)
#log in
con <- DBI::dbConnect(
  drv    = odbc::odbc(), 
  UID    = "username", 
  PWD    = "password", 
  Server = "yourAccount.west-europe.azure.snowflakecomputing.com",
  Warehouse = 'COMPUTE_WH',
  Driver = "SnowflakeDSIIDriver",
  Database = "yourDataBase",
  Schema = "yourSchema")

Alternatively, you can edit the odbc.ini file included in this repo and connect by running:

con <- dbConnect(odbc(), "snowflake")

Using snowflake with R and Tidyverse commands

Below is some examples of things you can do after you have set up your connection with Snowflake

#Connect to a table or view
df <- tbl(con, 'yourTableName')

#Full load table
data <- df%>%collect

#Load 10 rows
data <- head(10)%>%collect

#add table
dbCreateTable(con, "iris", iris)

#remove table
dbRemoveTable(con, "iris")

#upload data
dbWriteTable(con, "iris", iris, overwrite = TRUE, row.names = FALSE)

Rounding things up

There are so many ways to interact with Snowflake. It is definitely one of my favorite platforms to work on at the moment. I personally enjoy using R for a variety of tasks, and having build this image has made things a bit easier for me. I hope this can be of use to others in the R commmunity using Snowflake.

Let me know if this post is easy to follow or if it needs additional explanation by leaving a comment.

Avatar
Danny Mawani Holmgaard
Data Engineer

I work with getting the right data into the right places, so actions can be taken from it.

comments powered by Disqus

Related