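The allofus package
includes several functions designed to help you manage and transfer
files between your personal workspace and a shared bucket in Google
Cloud Storage. Understanding the difference between these two storage
locations is crucial: the workspace holds the files stored locally with
your session, while the bucket is where files are kept permanently and
shared with collaborators.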
Newer (“workbench 2.0”) All of Us workspaces don’t come with a bucket
already set up, so aou_ls_bucket() and the other bucket
functions below have nothing to point to until you create one. Run this
once per workspace:
This creates (or resolves, if you’ve already created one) a bucket
named "workspace-bucket" and sets
WORKSPACE_BUCKET for you. You can also create a temporary
bucket, whose contents are automatically deleted after a couple of
weeks, for intermediate files you don’t need to keep:
You only need to do this once: the resolved bucket URL is cached to
~/.aou-env, so future R sessions pick it up automatically
without recreating anything. If you ever want to check what’s cached
(e.g., which buckets you’ve already created in this workspace), you can
read the file directly:
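
A minimal sketch of reading the cache with base R. It assumes only the path given above (~/.aou-env). The file's internal format is not described here, so the lines are printed as stored rather than parsed.

```r
# Path to the cache file mentioned above
env_file <- path.expand("~/.aou-env")

if (file.exists(env_file)) {
  # Print the cached entries exactly as they are stored
  cached <- readLines(env_file, warn = FALSE)
  print(cached)
} else {
  # Nothing has been cached yet (or we are not on the workbench)
  cached <- character(0)
  message("No cache file found at ", env_file)
}
```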
Use aou_ls_workspace() to list files in your workspace.
This function is handy for quickly checking which files you have stored
locally.
Similarly, aou_ls_bucket() lists files in your bucket.
This function can be used to view files that you or your collaborators
have saved for shared access.
You can also use the pattern argument with these
functions to filter the listed files based on a naming pattern.
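
As an illustration, the listing calls might look like the sketch below. It assumes the allofus package is installed and that the code runs in a workbench session where WORKSPACE_BUCKET is set. The glob-style value "*.csv" is an assumption about what pattern accepts, so check the function documentation before relying on it.

```r
# Assumption: these calls only make sense inside an All of Us workbench
# session with a bucket configured, so guard them to keep the sketch runnable.
on_workbench <- requireNamespace("allofus", quietly = TRUE) &&
  nzchar(Sys.getenv("WORKSPACE_BUCKET"))

if (on_workbench) {
  # List everything, then only the CSV files, in the workspace
  allofus::aou_ls_workspace()
  allofus::aou_ls_workspace(pattern = "*.csv")

  # Same filter applied to the bucket
  allofus::aou_ls_bucket(pattern = "*.csv")
} else {
  message("Not in a workbench session; skipping the listing calls.")
}
```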
These functions work alongside R’s own reading and writing functions, and you can store any type of data in either the workspace or the bucket.
Once you’ve processed or created a file in your workspace, you might
want to move it to the bucket for permanent storage or to share it with
collaborators. Use aou_workspace_to_bucket() for this
purpose.
Here’s a typical workflow using these functions:
1. Save a file to your workspace with write.csv() or saveRDS().
2. Send it to the bucket with aou_workspace_to_bucket().
3. Use aou_bucket_to_workspace() to bring files into your
workspace as needed.
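
The workflow above can be sketched roughly as follows. The data frame and the file name example_data.csv are hypothetical, and the transfer calls assume that a file name can be passed as the first argument and that the code runs in a workbench session with a bucket configured. Outside the workbench, the guard skips the transfers so that only the local write and read run.

```r
# Assumption: bucket transfers are only attempted inside a workbench session
on_workbench <- requireNamespace("allofus", quietly = TRUE) &&
  nzchar(Sys.getenv("WORKSPACE_BUCKET"))

# 1. Save a small, made-up data frame to the workspace (the working directory)
example_data <- data.frame(id = 1:3, value = c(2.5, 3.1, 4.8))
csv_name <- "example_data.csv"  # hypothetical file name
write.csv(example_data, csv_name, row.names = FALSE)

if (on_workbench) {
  # 2. Send the file to the bucket for permanent, shared storage
  allofus::aou_workspace_to_bucket(csv_name)

  # 3. Later (for example, in a new session), bring it back into the workspace
  allofus::aou_bucket_to_workspace(csv_name)
}

# Read the file with an ordinary R reading function
restored <- read.csv(csv_name)
```

For R objects that are not plain tables, saveRDS() and readRDS() can take the place of write.csv() and read.csv() in the same pattern.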