This vignette provides best practices for credential management to improve the ease and portability of code across analysts.
With access to data PEPFAR’s DATIM, USAID Google Drive, and, the
USAID’s Data Development Commons, analysts have different credentials to
all these systems. Passwords should never be stored in scripts and its
best not to include user names as well. Using glamr
will
systematize USAID analysts code. User names will be called the same
thing and will be stored in the same location across machines.
Let’s go through an example of how this might be used, staring by loading the packages we’ll need.
Before using credentials you’ll need to store then on your local OS.
The glamr
package utilize keyring
to store and
access the usernames and passwords, using some wrappers around
keyring
functions.
Let’s start by storing our USAID email address, which we will use to
access Google Drive. You’ll only be storing your email, not password.
This function also allows you to store your email in your .Rprofile,
allowing googledrive::drive_auth()
and
googlesheets4::gs4_auth()
to run without you having to
enter your email.
set_email("[email protected]")
Next up we’ll store our DATIM credentials. For this, you’ll need to enter you DATIM user name. When you run the code, you will get a prompt/pop-up window in RStudio to enter your password.
Lastly, if you have access to Amazon Web Services (s3) for the DDC, you can also store your s3 access and secret keys.
These functions have now passed your credentials in your operating
system’s credential manager - Credential Store on Windows and Keychain
on MacOS. We can use keyring
to see all the “services”, or
accounts, stored there.
Without using the glamr
package, analysts would have had
to write out their username and authenticate. Prior to saving and
pushing the code to GitHub, the analyst would have to remove their
email. Another analyst would have to review the code and find where to
put their email in manually if running themselves.
library(googlesheets4)
#email
email <- "[email protected]"
#authenticate
gs4_auth(email)
#specific Google Sheet Unique ID (found at the end of the URL)
sheet_id <- '5mD3ndk08Sdd3dn1dm29smD'
#read directly into R (similar to read_csv or read_xlsx)
df <- read_sheet(as_sheets_id(sheet_id), "Sheet1")
We faced a similar issue with DATIM credentials, where we had to
either write out our username and remove prior to pushing to GitHub or
some analysts stored this information in scripts in different locations
on their different machines, using different object name, eg
user
, myuser
, datim_acct
,
datim_login
, etc.
The old way of doing things was inefficient and posed a risk of
posting credentials accidentally. The previous method required storing
the username in your code and then using it to pull from an encrypted
local file that stored the password associated with the username using
glamr::mypwd()
. Each analyst would have to change the
username if they ran the code. The analyst, now, will perform the same
task, but won’t have to write our their username in the code since its
loaded into the session with load_secrets()
and always
assigned/called the same thing.
Now that they’re stored after using set_email()
and
set_datim()
, we can load up our credentials at the
beginning of our code and be available for your current R session. In
addition to storing your email, load_secrets()
will
authenticate with Google Drive if you have the googledrive
and/or googlesheets4
packages.
Your account information is stored for the session in
Options
and can be accessed directly via
getOptions("email")
or getOptions("datim")
. We
also have two wrapper functions to pull your DATIM information since you
may need to include that in an API request - datim_user()
and datim_pwd()
How does this help? Instead of having to manually enter your USAID
email, it can be loaded automatically and already authenticated for
Google API by running load_secrets()
. The user can also
specify which authentication they want to provide in a session, rather
than loading all of them. For example, you could just specify to
authenticate for use with Google Drive and Sheets -
load_secrets("email")
.
library(googlesheets4)
library(glamr)
#setup session
load_secrets()
#specific Google Sheet Unique ID (found at the end of the URL)
sheet_id <- '5mD3ndk08Sdd3dn1dm29smD'
#read directly into R (similar to read_csv or read_xlsx)
df <- read_sheet(as_sheets_id(sheet_id), "Sheet1")
#pull DATIM table of OU/country UIDs and sub-national unit levels
ou_table <- datim_outable(datim_user(), datim_pwd())