Credential Management

Introduction

This vignette provides best practices for credential management to improve the ease and portability of code across analysts.

With access to PEPFAR’s DATIM, USAID Google Drive, and USAID’s Data Development Commons (DDC), analysts have a different set of credentials for each of these systems. Passwords should never be stored in scripts, and it’s best not to include usernames either. Using glamr systematizes USAID analysts’ code: usernames are called the same thing and stored in the same location across machines.

Let’s go through an example of how this might be used, starting by loading the packages we’ll need.

library(glamr)

Store credentials

Before using credentials, you’ll need to store them on your local OS. The glamr package utilizes keyring to store and access usernames and passwords, providing some wrappers around keyring functions.

Let’s start by storing our USAID email address, which we will use to access Google Drive. You’ll only be storing your email, not a password. This function also stores your email in your .Rprofile, allowing googledrive::drive_auth() and googlesheets4::gs4_auth() to run without you having to enter your email.

set_email("[email protected]")

Next up, we’ll store our DATIM credentials. For this, you’ll need to enter your DATIM username. When you run the code, you will get a prompt/pop-up window in RStudio to enter your password.

set_datim("rshah")

Lastly, if you have access to Amazon Web Services (S3) for the DDC, you can also store your S3 access and secret keys.

set_s3access(access = "ABDCEDFF", 
             secret = "MLIZD998SD")

These functions have now passed your credentials to your operating system’s credential manager - Credential Store on Windows and Keychain on macOS. We can use keyring to see all the “services”, or accounts, stored there.

keyring::key_list()
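The output is a data frame with a service and a username column. Illustratively, it might look something like the below; the exact service names depend on your glamr version, so treat these as placeholders rather than the real entries.

#>     service username
#> 1     datim    rshah
#> 2 s3_access ABDCEDFF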

Old way of doing things

Without the glamr package, analysts would have had to write out their username and authenticate manually. Prior to saving and pushing the code to GitHub, the analyst would have to remove their email, and any other analyst running the code would have to review it and figure out where to insert their own email.

library(googlesheets4)

#email
  email <- "[email protected]"

#authenticate
  gs4_auth(email)

#specific Google Sheet Unique ID (found at the end of the URL)
  sheet_id <- '5mD3ndk08Sdd3dn1dm29smD'
  
#read directly into R (similar to read_csv or read_xlsx)
  df <- read_sheet(as_sheets_id(sheet_id), "Sheet1")
  

We faced a similar issue with DATIM credentials: analysts either had to write out their username and remove it prior to pushing to GitHub, or they stored this information in scripts in different locations on their different machines, using different object names, e.g. user, myuser, datim_acct, datim_login, etc.

library(glamr)

#DATIM user
  user <- "rshah"

#pull DATIM table of OU/country UIDs and sub-national unit levels
  ou_table <- datim_outable(user, mypwd(user))

Accessing stored credentials

The old way of doing things was inefficient and posed a risk of accidentally posting credentials. The previous method required storing the username in your code and then using it with glamr::mypwd() to pull the associated password from an encrypted local file. Each analyst would have to change the username if they ran the code. Now the analyst performs the same task, but doesn’t have to write out their username in the code, since it’s loaded into the session with load_secrets() and always assigned/called the same thing.

Now that our credentials are stored via set_email() and set_datim(), we can load them at the beginning of our code and they will be available for the current R session. In addition to loading your email, load_secrets() will authenticate with Google Drive if you have the googledrive and/or googlesheets4 packages installed.

load_secrets()

Your account information is stored for the session in Options and can be accessed directly via getOption("email") or getOption("datim"). We also have two wrapper functions to pull your DATIM information, since you may need to include it in an API request: datim_user() and datim_pwd().
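For example, since DATIM’s API accepts basic authentication, the wrappers can be dropped straight into a request. Below is a minimal sketch using httr; the endpoint URL is illustrative, not one prescribed by glamr.

library(httr)

#example API call using stored DATIM credentials (illustrative endpoint)
  resp <- GET("https://final.datim.org/api/organisationUnits",
              authenticate(datim_user(), datim_pwd()))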

How does this help? Instead of having to manually enter your USAID email, it can be loaded and authenticated for the Google APIs automatically by running load_secrets(). You can also specify which credentials to load in a session, rather than loading all of them. For example, you could authenticate only for use with Google Drive and Sheets via load_secrets("email"), as shown below.
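Both calls described above look like this in practice:

#load all stored credentials for the session
  load_secrets()

#or limit authentication to Google Drive/Sheets
  load_secrets("email")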

library(googlesheets4)
library(glamr)

#setup session
  load_secrets()

#specific Google Sheet Unique ID (found at the end of the URL)
  sheet_id <- '5mD3ndk08Sdd3dn1dm29smD'

#read directly into R (similar to read_csv or read_xlsx)
  df <- read_sheet(as_sheets_id(sheet_id), "Sheet1")
  
#pull DATIM table of OU/country UIDs and sub-national unit levels
  ou_table <- datim_outable(datim_user(), datim_pwd())