Chapter 6

The text covers the use of folders and projects. The authors provide some OS-specific hints. Read the text.

What Is Real?

The point here is that keeping the base data file (or a bunch of data files) together with the code that massages it, creates the dataset that is analyzed, and produces the analysis is better than keeping only the final data or the output. Reproducibility is a big topic these days. Projects with RMD files make reproducibility easy.
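As a rough illustration of that idea, here is a minimal sketch in R. The data frame, the column names, and the helper `clean_scores()` are all made up for this example; in a real project the raw data would come from a file in the project folder. The point is only that the analyzed dataset and the result are both regenerated from the raw data by code.

```r
# Hypothetical raw data; in practice this would be read from a raw data file
raw <- data.frame(
  id = 1:5,
  score = c("90", "85", "n/a", "77", "88"),
  stringsAsFactors = FALSE
)

# Step 1: massage the raw data into the dataset that gets analyzed
clean_scores <- function(df) {
  # Text that is not a number (here "n/a") becomes NA
  df$score <- suppressWarnings(as.numeric(df$score))
  # Drop the rows without a usable score
  df[!is.na(df$score), ]
}
analysis_data <- clean_scores(raw)

# Step 2: the analysis itself, computed from the cleaned data
mean_score <- mean(analysis_data$score)
print(mean_score)
```

If the raw data change, rerunning the script updates both the cleaned dataset and the result, which is what a saved output file alone cannot do.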

Where Does Your Analysis Live?

Set up a project folder and stick to it. When you use RStudio to set up a project, it associates the working directory with the project. You can use subfolders for things like raw data and backups. Be consistent across projects and you will have a much easier time reproducing your work when someone asks for something a couple of years after the original analysis.
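One way to keep the layout consistent is to create the same subfolders at the start of every project. The sketch below is only an assumption about what such a layout might look like: the folder names and the location under `tempdir()` are hypothetical, not something the text prescribes.

```r
# Hypothetical project location; a real project would live wherever you keep your work
project_dir <- file.path(tempdir(), "my_analysis")

# Hypothetical subfolder names; pick your own, then reuse them in every project
subfolders <- c("data-raw", "data", "R", "output", "backup")

for (folder in subfolders) {
  # recursive = TRUE also creates project_dir itself if it does not exist yet
  dir.create(file.path(project_dir, folder),
             recursive = TRUE, showWarnings = FALSE)
}

# List what was created
created <- list.files(project_dir)
print(created)
```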

Paths and Directories/Folders

Beware of the special character “\” (backslash). I agree with the authors' use of “/” (forward slash). Stick with the Linux/OS X convention and you will not run into escape sequences like “\n” (newline) and “\t” (tab) hiding inside your paths. You can try to remember to double every backslash, but at some point you will forget, and experience suggests that debugging that pesky missing “\” is a real pain.
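A small sketch of what goes wrong, using a made-up file name. In an R string, “\n” is read as a single newline character, so the path silently changes:

```r
bad     <- "data\new.csv"    # "\n" is parsed as a newline, not as "\" followed by "n"
good    <- "data/new.csv"    # forward slash: no escaping needed
escaped <- "data\\new.csv"   # doubled backslash: correct, but easy to forget

nchar(bad)      # 11, because "\n" counts as one character
nchar(good)     # 12
nchar(escaped)  # 12

# The "bad" path really does contain a newline
has_newline <- grepl("\n", bad, fixed = TRUE)

# file.path() joins the pieces with "/" for you
built <- file.path("data", "new.csv")
print(built)
```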

Use relative, not absolute, paths. If you ever lend a project to someone and they store it in a different place than you do, they will thank you for using relative paths. I have a number of partitions on my machine. My school stuff goes on my “D:” drive/partition. I generally put things in a semester folder. Within that I use different folders for different classes. Within each class folder are folders for different things like TeX files, old grades, and notes. If I used absolute paths in my RMD files, you would have to replicate my folder structure, or replace all of my paths with your own. By using relative paths, we only have to agree on the file structure within the project.
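Here is a minimal sketch of the difference. The project name, the `data` subfolder, and the file `scores.csv` are hypothetical. The call to `setwd()` only stands in for what opening an RStudio project does for you; you would not normally write it in your own scripts.

```r
# Stand-in for a project folder; on another machine this could be anywhere
project_dir <- file.path(tempdir(), "demo_project")
dir.create(file.path(project_dir, "data"),
           recursive = TRUE, showWarnings = FALSE)

# Opening an RStudio project sets the working directory like this
old_wd <- setwd(project_dir)

# Relative paths: they mention only the structure inside the project
write.csv(data.frame(id = 1:3, score = c(90, 85, 77)),
          file.path("data", "scores.csv"), row.names = FALSE)
scores <- read.csv(file.path("data", "scores.csv"))

# Restore the previous working directory
setwd(old_wd)
print(scores)
```

Nothing in the read and write calls mentions a drive letter or a semester folder, so the same code runs unchanged wherever the project folder ends up.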