# Example of reading a CSV file from the data folder
library(readr)  # provides read_csv(); skip if the tidyverse is already loaded
data <- read_csv("r-data/your-dataset.csv")
Appendix B — Downloading and Preparing the Data
To fully engage with the exercises and examples in this book, you’ll need to download the datasets provided. The data is organized in a folder named r-data, which contains all the files we’ll use throughout the chapters.
B.1 Downloading the Data
- Access the Data Folder
  - Visit either of the following links to access the r-data folder on Google Drive: https://bit.ly/r-data-directory or https://drive.google.com/drive/folders/1ZhI-t94uZa82KD8hEN0f1WALfCiRFWCP
- Download the r-data Folder
  - Once you’re on the Google Drive page, you should see the r-data folder listed.
  - Right-click on the r-data folder and select Download.
  - Google Drive will compress the folder into a ZIP file before downloading it to your computer.
- Unzip the Folder
  - After the download is complete, locate the ZIP file on your computer (usually in your Downloads folder).
  - Extract the contents of the ZIP file:
    - Windows: Right-click the ZIP file and select Extract All, then follow the prompts.
    - macOS: Double-click the ZIP file to extract it.
    - Linux: Right-click and select Extract Here, or run unzip filename.zip on the command line.
- Verify the Contents
  - Open the extracted r-data folder to ensure all files are present.
  - You should see various datasets in formats like CSV, Excel, and others, which we’ll use in different labs.
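If you prefer to do the unzip and verify steps from the R console, they can be sketched as follows. This is only an illustration in base R: the archive name and location in zip_path are assumptions (Google Drive chooses the actual file name), the sketch assumes the archive unpacks into a folder called r-data, and count_by_extension() is a hypothetical helper, not a function from this book.

```r
# Hypothetical helper: count the files in a folder by extension,
# so you can see at a glance which formats (csv, xlsx, ...) are present.
count_by_extension <- function(dir) {
  files <- list.files(dir, recursive = TRUE)
  table(tolower(tools::file_ext(files)))
}

# Assumed location and name of the download; adjust to match your computer.
zip_path <- path.expand(file.path("~", "Downloads", "r-data.zip"))

# Extract next to the ZIP file, but only if the archive is really there.
if (file.exists(zip_path)) {
  unzip(zip_path, exdir = dirname(zip_path))
}

# Assumption: the archive unpacks into a folder named r-data.
data_dir <- file.path(dirname(zip_path), "r-data")
if (dir.exists(data_dir)) {
  print(count_by_extension(data_dir))
}
```

If nothing is printed, the paths above most likely do not match where your browser saved or extracted the files.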
B.2 Setting Up Your Working Directory
To keep your work organized and ensure consistency across exercises, we’ll create a dedicated RStudio Project for each lab or exercise that uses data from the r-data folder. This approach helps manage your files efficiently and ensures that your working directory is correctly set for each task.
B.2.1 Creating a New RStudio Project for Each Exercise
- Identify the Lab or Exercise
  - Determine which lab or exercise you’re working on (e.g., Lab 2, Exercise 4.1).
- Create a Directory for the Project
  - On your computer, create a new folder with a meaningful name for the lab or exercise, such as Lab2_Project or Exercise4_1_Project.
- Copy Necessary Data Files
  - From the extracted r-data folder, copy the specific data files needed for the exercise into your new project folder.
  - Alternatively, you can copy the entire r-data folder into your project directory if multiple datasets are required.
- Create a New RStudio Project
  - Open RStudio.
  - Go to File > New Project.
  - Choose Existing Directory.
  - Browse to the directory you just created for the lab or exercise.
  - Select the folder and click Create Project.
- Organize Your Project Files
  - Within your project directory, consider creating subfolders such as data, scripts, and output to further organize your work.
  - Place your data files in the data folder.
  - Save your R scripts in the scripts folder.
  - Direct any output files (like graphs or reports) to the output folder.
- Working Within the Project
  - When you open the RStudio Project, your working directory is automatically set to the project’s root directory.
  - When reading or writing files, use relative paths starting from the project directory so that your code works on any system where the project folder is the working directory.
  - Use forward slashes (/) in file paths, even on Windows; R treats a backslash inside a string as the start of an escape sequence.
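The folder layout and relative-path advice above can be sketched in base R. This is a minimal illustration, not a required part of the setup: set_up_project() and copy_dataset() are hypothetical helper names, the file paths in the usage comments are placeholders, and read.csv() is used only so the sketch runs without extra packages (read_csv() from readr takes the same relative path).

```r
# Hypothetical helper: create the suggested subfolders inside a project.
# Inside an open RStudio Project, project_dir would normally be "."
# because the working directory is already the project root.
set_up_project <- function(project_dir = ".") {
  subfolders <- file.path(project_dir, c("data", "scripts", "output"))
  for (folder in subfolders) {
    dir.create(folder, recursive = TRUE, showWarnings = FALSE)
  }
  invisible(subfolders)
}

# Hypothetical helper: copy one dataset into the project's data folder.
# Returns TRUE if the copy succeeded.
copy_dataset <- function(source_file, project_dir = ".") {
  file.copy(source_file, file.path(project_dir, "data"), overwrite = TRUE)
}

# Example usage (placeholder paths; uncomment and adjust before running):
# set_up_project()
# copy_dataset("~/Downloads/r-data/your-dataset.csv")
# data <- read.csv("data/your-dataset.csv")  # relative path, forward slashes
```

Because every path after setup is relative to the project root, the same script should keep working if the whole project folder is moved or shared.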
B.2.2 Benefits of Using Separate Projects for Each Exercise
Organization: Keeps your work for each lab or exercise neatly contained, preventing files from different tasks from mixing.
Reproducibility: By maintaining all necessary files within each project, you make it easier to revisit or share your work without missing dependencies.
Clarity: Helps you focus on the specific objectives of each exercise without distractions from other projects.
B.3 Data Usage and Ethics
The datasets and link provided are safe and intended for educational use in conjunction with this book to help you practice and apply the concepts covered. Please use the data responsibly and refrain from using it for any unauthorized purposes.
Privacy: Be mindful that while the datasets are fictional or anonymized, they may represent sensitive topics. Handle all data with respect and confidentiality.
Attribution: If you use the datasets in any presentations or projects outside of this book’s exercises, please acknowledge the source appropriately.
B.4 Getting Help
If you encounter any issues downloading or accessing the data:
Check Your Internet Connection: Ensure you have a stable connection when downloading the data.
Try a Different Browser: Sometimes switching browsers can resolve download issues.
By setting up the data as described, you’ll be ready to dive into the hands-on labs and fully engage with the practical exercises. Having the data organized and accessible will streamline your workflow and enhance your learning experience.
Happy analyzing!