class: center, middle, inverse, title-slide

.title[
# Local R, R Projects, and Data Management
]
.subtitle[
## IBS 519 - week 8 - TA session
]
.author[
### Ashlyn Johnson
]
.date[
### 10/12/22
]

---
## Agenda

### First Half of Class

- How to install R and RStudio on your computer!
- What are R projects and why are they useful?
- How should we organize our files?
- RStudio and GitHub

### Second Half of Class

- Homework Questions
- Help with R/RStudio install if you want it!

---

class: inverse, center, middle

# Installing R and RStudio

---

## Why?

---

## Why?

- After the course is over, your access to RStudio Cloud will be limited.
  - 25 project hours per month
  - 1 GB RAM per project

--

- More customization
- Easier to access files on your own computer
- Don't (always) need access to the internet

---

## How (R first, then RStudio)

<iframe src="https://rstudio-education.github.io/hopr/starting.html" width="100%" height="400px" data-external="1"></iframe>

---

## Note:

- While base R (what you just installed) comes with many libraries/packages preloaded, many others do not come built in.
- In RStudio Cloud, we have installed packages for you.
- You will need to install packages yourself on your local version of R.

```r
# Install once per computer (replace package_name with the real name)
install.packages("package_name")
# Load the package in every new R session that uses it
library(package_name)
```

---

class: middle, center, inverse

# Local R and RStudio Demo

---

class: inverse, center, middle

# R Projects

---

## What is an R project?

--

- Each instance/assignment that you work on in RStudio Cloud is an R project.
- Working with an R project allows all of your work to be bundled together and (somewhat) self-contained.
- Elements of an R project:
  - .Rproj file: contains project options and serves as a shortcut for opening the project
  - .Rproj.user: hidden directory/folder, contains project-specific temporary files
  - Your files! (code, data, reports, etc.)

???

You have seen and worked with R projects before in RStudio Cloud.

---

## Why use an R project?

- When working within an R project, the **working directory** is automatically set to the folder where the .Rproj file lives.
  - No longer need to set the working directory manually
  - Allows you to use relative instead of absolute file paths (less typing!)
- All of your files for a particular scientific project/analysis are contained together, enhancing organization
- Easily combine R projects with version control systems (Git and GitHub)
- Disclaimer: If an R Markdown file lives in a subfolder of your main R project folder, the working directory for that file will revert to the folder in which the R Markdown file lives.
  - Use the `here` package to circumvent this or store all of your Rmd files in the root of your R project.

???

Enhancing organization keeps you from making errors and facilitates reproducibility.

By root of your R project, I mean the same folder where your .Rproj file lives.

---

class: inverse, center, middle

# R project demo

---

class: inverse, center, middle

# File Management within R Projects

---

### Poor file management now will haunt your future self.

--

.pull-left[
<blockquote class="twitter-tweet"><p lang="en" dir="ltr">Finishing a PhD is like finishing a group project where your partner made a ton of mistakes at the beginning of the assignment. Except your partner is just you 4 years ago.</p>&mdash; John M. Mola (@_JohnMola) <a href="https://twitter.com/_JohnMola/status/1125193701205012480?ref_src=twsrc%5Etfw">May 6, 2019</a></blockquote> <script async src="https://platform.twitter.com/widgets.js" charset="utf-8"></script>
]

--

.pull-right[
Let's be kind to future us!
]

---

## File Management within R Projects

- There is no one solution
- Find what works for you and be consistent
- Generally, it is advisable to keep R scripts, raw data, processed data, and results in separate folders/directories

---

## Examples

.pull-left.w60[
[Example 1, Krista L. Destasio](https://kdestasio.github.io/post/r_best_practices/)

```
project
├── .Rproj
├── README.txt
├── R scripts / R Markdown files
├── data: raw data
├── doc: manuscripts/summaries
├── figs: plots, images, etc. created by code
├── output: non-figures created by scripts
└── src: functions
```
]

.pull-right.w40[
[Example 2, David Keyes](https://kdestasio.github.io/post/r_best_practices/)

```
project
├── .Rproj
├── README.txt
├── R scripts / R Markdown files
├── data-raw
├── data: processed
└── R: functions
```
]

---

class: inverse, center, middle

# R and Git

---

## Version Control

- An important aspect of data management is **version control**
- **Version control**: a system that tracks changes to file(s) over time, allowing you to revert to past versions
- **Git** is software for **distributed version control**
- **GitHub** is a website that offers cloud-based hosting for Git repositories
- Importantly, Git is easily integrated with RStudio to link individual R projects with individual GitHub repositories

???

Mention that Git is free and open source.

Distributed version control is when the complete codebase and its entire history are mirrored on every developer's computer, so each user has a working copy and the full history. Good for collaboration.

---

## Examples

- [Hadley Wickham](https://github.com/hadley)
- [Melinda Higgins](https://github.com/melindahiggins2000)

--

- [Me](https://github.com/ashlyngjohnson)

???

But GitHub is not just for famous software developers. If any of you plan to continue on your coding journey, I encourage you to create a GitHub account. For example, here is mine.

Earlier this semester, Jim encouraged you all to create a personal website. In a few different ways, a GitHub account can facilitate that.

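---

## Sketch: can R find Git?

Linking an R project to a GitHub repository assumes that Git is installed and visible to R. The following is a minimal sketch of that check using only base R (`Sys.which()` and `system2()`); the messages are illustrative and are not RStudio's own output.

```r
# Look for the git executable on the PATH.
# Sys.which() returns "" when the program cannot be found.
git_path <- Sys.which("git")

if (nzchar(git_path)) {
  message("Git found at: ", git_path)
  # Print the installed Git version to the console
  system2("git", "--version")
} else {
  message("Git not found; install it before linking RStudio to GitHub.")
}
```

An empty result usually means Git is missing or not on the PATH; Happy Git with R walks through installation.
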
---

## Happy Git with R by Jenny Bryan

.pull-left[
- Takes you from Git-Zero to Git-Hero
- My favorite book for using GitHub with RStudio
]

.pull-right[
<iframe src="https://happygitwithr.com/" width="100%" height="400px" data-external="1"></iframe>
]

---

class: center, inverse, middle

# RStudio and Git Demo

---

# Reproducibility

- The tools and strategies that we've talked about today will make your analyses more **reproducible**
- R projects keep analyses pertaining to a particular scientific project contained
- Organized files within an R project allow others and yourself to easily navigate and reproduce your code/analysis
- R projects can be linked with GitHub repositories, which facilitates version control, collaboration with others, and sharing projects online

---

class: middle, center, inverse

# Questions?
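
---

## Sketch: project folders and relative paths

The folder layouts and the working-directory behavior discussed earlier can be sketched in base R. This is only an illustration under stated assumptions: `my_project` and `example.csv` are hypothetical names, a temporary folder stands in for a real project, and `setwd()` imitates what opening the .Rproj file does for you.

```r
# Hypothetical project root; a temporary folder stands in for the folder
# that would hold your .Rproj file.
project_root <- file.path(tempdir(), "my_project")

# Folder names loosely follow the examples in this deck; adjust to taste.
folders <- c("data-raw", "data", "R", "figs", "output")
for (folder in folders) {
  dir.create(file.path(project_root, folder),
             recursive = TRUE, showWarnings = FALSE)
}

# Opening the .Rproj file sets the working directory to the project root.
# setwd() is used here only to imitate that behavior in a plain script.
old_wd <- setwd(project_root)

# Save a small built-in data set using a path relative to the project root
raw_path <- file.path("data-raw", "example.csv")
write.csv(head(mtcars), raw_path, row.names = TRUE)

# Read it back with the same relative path; no absolute path is needed
example_data <- read.csv(raw_path, row.names = 1)

# Restore the original working directory
setwd(old_wd)
```

Because every path is relative to the project root, the same code should run unchanged if the whole project folder is moved or shared.
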