2 Some Basics
“Learning to write programs stretches your mind, and helps you think better."
- Bill Gates, 1955-
2.1 First Steps
Upon opening R in Windows, two things will appear in the console of the R Graphical User Interface (R-GUI)17. These are the license disclaimer (blue text at the top of the console) and the command line prompt, i.e., \(\boldsymbol{>}\) (Fig 2.1). The prompt indicates that R is ready for a command. All commands in R must begin at \(\boldsymbol{>}\).
The default appearance of this simple interface will vary slightly among operating systems. In the Windows R-GUI, the command line prompt and user commands are colored red (Fig 2.1), and output, including errors and warnings, is colored blue. In Mac OS, the command line prompt will be purple, user inputs will be blue, and output will be black. In Unix/Linux, wherein R will generally run from a shell command line, absent of any menus, all three will be black18. These console defaults can often be greatly modified and customized when using R with an appropriate Integrated Development Environment (IDE) (Section 2.7.3).
We can exit R at any time by typing q() in the console, closing the GUI window (non-Linux only), or by selecting Exit from the pulldown File menu (non-Linux only).

Figure 2.1: An aged, but still recognizable R console: R version 2.15.1, ‘Roasted Marshmallows’, ca. 2012.
2.2 First Operations
As an introduction we can use R to evaluate a simple mathematical expression. Type 2 + 2
and press Enter.
2 + 2
[1] 4
The output term [1] means, “this is the first requested element.” In this case there is just one requested element, \(4\), the solution to \(2 + 2\). If the output elements cannot be held on a single console line, then R begins each additional line of output with the index of the first element on that line. For instance, the command rnorm(20) will take 20 pseudo-random samples (see footnote in Section 9.6.7) from a standard normal distribution (see Ch 3 in Aho (2014)). We have:
rnorm(20)
[1] -1.07681600 -0.12916910 0.43102091 -0.59504704 0.28437423
[6] 2.09798848 -0.47150646 0.66189228 -0.50411240 0.57065698
[11] 0.19438861 -0.09344153 0.75595706 -3.18338031 -2.13038113
[16] -0.66466199 -0.47244628 -0.46971852 0.28603184 -0.43981785
The reappearance of the command line prompt indicates that R is ready for another command. Multiple commands can be entered on a single line, separated by semicolons. Note, however, that this is considered poor programming style, as it may make your code more difficult to understand by a third party.
2 + 2; 3 + 2
[1] 4
[1] 5
R commands are generally insensitive to spaces. This allows the use of spaces to make code more legible. To my eyes, the command 2 + 2
is simply easier to read and debug than 2+2
.
2.2.1 Use Your Scroll Keys
As with many other command line environments, the scroll keys (Fig 2.2) provide an important shortcut in R. Instead of editing a line of code by tediously mouse-searching for an earlier command to copy, paste and then modify, you can simply scroll back through your earlier work using the upper scroll key, i.e., \(\uparrow\) . Accordingly, scrolling down using \(\downarrow\) will allow you to move forward through earlier commands.

Figure 2.2: Typical scroll direction keys on a keyboard.
2.2.2 Note to Self: #
R will not recognize commands preceded by #
. As a result this is a good way for us to leave messages to ourselves.
# Note at beginning of line
2 + 2
[1] 4
We can even place comments in the middle of an expression, as long as the expression is finished on a new line.
2 +  # Note in middle of line
+ 2
[1] 4
In the “best” code writing style it is recommended that one place a space after #
before beginning a comment, and to insert two spaces following code before placing #
in the middle of a line. This convention is followed above.
2.2.3 Unfinished Commands
R will be unable to move on to a new task when a command line is unfinished. For example, type 2 + and press Enter. We note that the continuation prompt, +, is now where the command prompt should be. R is telling us the command is unfinished. We can get back to the command prompt by finishing the command, clicking Misc\(>\)Stop current computation or Misc\(>\)Stop all computations from the R-toolbar (non-Linux only), typing Ctrl + C (Linux), or by pressing the Esc key (all OS).
2.3 Expressions, Assignments and Objects
All entries in R are either expressions or assignments. If an entry is an expression, it will be evaluated, printed, and discarded. Examples include: 2 + 2
. Conversely, an assignment evaluates an expression, and assigns a label to the expression output, thereby creating an R-object. This important activity has prompted the motto: “everything created or loaded in R is an object.”19
To create an object, we use the assignment operator: <-
. The operator represents an arrow that points toward a user-defined expression label.
Example 2.1 \(\text{}\)
To create an R-object named x
, that contains the result of the expression 2 + 2
, I can type:
x <- 2 + 2
The code: x <- 2 + 2
literally means: “x
is \(2 + 2\).”
The assignment operator can go on either side of an expression. Thus, as an alternative, I could have typed:
2 + 2 -> x
The leftward assignment operator, <-, is generally used instead of the rightward, ->, because it is easier to conceptualize the relationship object name <- object definition.
\(\blacksquare\)
Results of an assignment are generally not automatically printed. However, for most common object classes (see Section 2.3.4) summaries can be easily obtained20.
Example 2.2 \(\text{}\)
To print the result of Example 2.1 (to see the object x
), I can simply type its name:
x
[1] 4
or
print(x)
[1] 4
\(\blacksquare\)
Notably, for Example 2.1 above, we could have typed x = 2 + 2
with the same assignment results.
x = 2 + 2
x
[1] 4
The equals sign, however, has limited applicability as an assignment operator21. Thus, in this document, I use <-
for object assignments, and save =
for specifying arguments in R functions (Ch 8).
Note that the R-console can quickly become cluttered and confusing. To remove console text (without actually getting rid of any of the objects created in a session) press Ctrl + L or, from the Edit pulldown menu, click Clear console (non-Linux only).
2.3.1 Naming Objects
When assigning names to R-objects we should try to keep the names simple, and avoid names that already represent important definitions and functions. These include: TRUE, FALSE, NULL, NA, NaN,
and Inf
. In addition, we cannot have names:
- beginning with a numeric value,
- containing spaces, colons, and semicolons,
- containing mathematical operators (e.g., *, +, -, ^, /, =),
- containing important R metacharacters (e.g., @, #, ?, !, %, &, |).
However, even these “forbidden” names and characters can be used if one encloses them in backticks, also called accent grave characters. For example, the code, `?` <- 2 + 2
will create an object named `?`
, containing the number 4.
Names should, if possible, be descriptive. Thus, for an object containing 20 random observations from a normal distribution, the name rN20 may be superior to the easily-typed, but anonymous name, x. Finally, we should remember that R is case sensitive. That is, each of the following \(2^4\) combinations will be recognized as distinct: name, Name, nAme, naMe, namE, NAme, nAMe, naME, NaMe, nAmE, NamE, NAMe, nAME, NaME, NAmE, NAME.
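The naming rules above can be tried directly in the console. The following is a minimal sketch; the object names (name, Name, NAME, and `my value`) are arbitrary choices made here for illustration.

```r
# Names that differ only in case refer to different objects.
name <- 1
Name <- 2
NAME <- 3
c(name, Name, NAME)

# A name containing a space is normally forbidden,
# but becomes usable when enclosed in backticks.
`my value` <- 2 + 2
`my value`

# make.names() shows how R itself would repair invalid names.
make.names(c("my value", "2nd", "a-b"))
```

Backtick names work, but they must be quoted the same way every time they are used, which is a good reason to prefer simple, syntactically valid names.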
2.3.2 Listing Objects
The lexical scoping characteristics of R (Section 1.4.1.2) have important consequences when considering objects and their names. An object name will be tied to a particular R environment, a specialized storage system whose features are formally considered, alongside R functions, in Ch 8.
The R session itself is defined to be the so-called global environment: .GlobalEnv.
.GlobalEnv
<environment: R_GlobalEnv>
Only objects in the current (caller) environment can be directly accessed by calling their names22. By default, an object will be assigned by R to the environment where it was defined, although this can be modified. I can list objects within particular environments using the function objects()
or ls()
.
Example 2.3 \(\text{}\)
Object searches from objects() and ls() are limited, by default, to the current environment, which, for this document, is the global environment. Currently, I only have x (which has been assigned and modified several times) in .GlobalEnv.
objects()
[1] "x"
\(\blacksquare\)
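To see that object names really are tied to environments, we can list the contents of an environment other than the global one. This is only a sketch: the environment e and the object y are hypothetical names introduced here, and new.env() and assign() are taken up properly in Ch 8.

```r
# Create a new, empty environment.
e <- new.env()

# Place an object named y inside e (not in the global environment).
assign("y", 10, envir = e)

# List the objects stored in e.
ls(envir = e)

# Confirm that y exists in e itself, ignoring enclosing environments.
exists("y", envir = e, inherits = FALSE)

# Retrieve the value of y from e.
get("y", envir = e)
```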
2.3.3 Combining Data
To define a collection of numbers (or other data or objects) as a single entity one can use the important R function c()
, which means “combine”.
Example 2.4 \(\text{}\)
To define the numbers 23, 34, and 10 collectively as an object named x
, I would type:
x <- c(23, 34, 10)
We could then do something like:
x + 7
[1] 30 41 17
Note that seven was added to each element in x
.
\(\blacksquare\)
2.3.4 Object Classes
Under the idiom of object oriented programming (OOP), an object may have attributes that allow it to be evaluated correctly, and associated methods appropriate for those attributes (e.g., specific functions for plotting, printing, etc.)23.
R objects will generally have a class, identifiable with the function class()
.
class(x)
[1] "numeric"
Objects in class numeric
and several other common classes can be evaluated mathematically. Common R classes are shown in Table 2.1. We will create objects from all of these classes, and learn about their characteristics over the next few chapters. We will also learn how to create our own personalized classes and associated methods (Ch 8).
Class | Example |
---|---|
logical | x <- TRUE |
numeric | x <- 2 + 2 |
integer | x <- 1:3 |
character | x <- c("a","b","c") |
complex | x <- 5i |
raw | x <- raw(2) |
expression | x <- expression(x * 4) |
list | x <- list() |
factor | x <- factor(c("a","a","b")) |
function | x <- function(y)y + 1 |
matrix | x <- matrix(nrow = 2, rnorm(4)) |
array | x <- array(rnorm(8), c(2, 2, 2)) |
data.frame | x <- data.frame(v1 = c(1,2), v2 = c("a","b")) |
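As a quick check of Table 2.1, we can build one object per class and ask R for its class. This is a sketch only; the list examples and the vector cls are names invented here. Because class() can return more than one value (a matrix, for instance, reports both "matrix" and "array" in recent versions of R), only the first value is kept.

```r
# One example object per class, labelled with the class we expect.
examples <- list(
  logical    = TRUE,
  numeric    = 2 + 2,
  integer    = 1:3,
  character  = c("a", "b", "c"),
  complex    = 5i,
  raw        = raw(2),
  expression = expression(x * 4),
  list       = list(),
  factor     = factor(c("a", "a", "b")),
  `function` = function(y) y + 1,
  matrix     = matrix(rnorm(4), nrow = 2),
  array      = array(rnorm(8), c(2, 2, 2)),
  data.frame = data.frame(v1 = c(1, 2), v2 = c("a", "b"))
)

# Keep the first class reported for each object.
cls <- sapply(examples, function(obj) class(obj)[1])
cls
```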
2.3.5 Object Base Types
All R objects will have so-called base types that define their underlying C language data structures24. There are currently 24 base types used by R (R Core Team 2024a), and it is unlikely that more will be developed in the near future (Wickham 2019). These entities are listed in Table 2.2. The meaning and usage of some of the base types may seem clear, for instance, integer and character, which are also class designations (Table 2.1). Several base types are specifically addressed in later chapters, including list, complex, logical, integer, and NULL (Chs 3 and 7), character and language (Chs 4 and 5), closure, special, builtin, environment, pairlist, S4, promise, and symbol (Ch 8) and raw and double (Ch 12). Base types meant for C-internal processes, i.e., any, bytecode, promise, ..., weakref, externalptr, and char, are not easily accessible with interpreted R code (R Core Team 2024b).
Base type | Example | Application |
---|---|---|
NULL | x <- NULL | vectors |
logical | x <- TRUE | vectors |
integer | x <- 1L | vectors |
complex | x <- 1i | vectors |
double | x <- 1 | vectors |
list | x <- list() | vectors |
character | x <- "a" | vectors |
raw | x <- raw(2) | vectors |
closure | x <- function(y)y + 1 | closure functions |
special | x <- `[` | special functions |
builtin | x <- sum | builtin functions |
expression | x <- expression(x * 4) | expressions |
environment | x <- globalenv() | environments |
symbol | x <- quote(a) | language components |
language | x <- quote(a + 1) | language components |
pairlist | x <- formals(mean) | language components |
S4 | x <- stats4::mle(function(x=1)x^2) | non-simple objects |
any | No example | C-internal |
bytecode | No example | C-internal |
promise | No example | C-internal |
... | No example | C-internal |
weakref | No example | C-internal |
externalptr | No example | C-internal |
char | No example | C-internal |
Base types of numeric objects define their storage mode, i.e., the way R caches them in its primary memory25. Base types can be identified using the function typeof()
.
Example 2.5 \(\text{}\)
For example:
typeof(x)
[1] "double"
We see that x has storage mode double, meaning that its numeric values are stored in 64-bit double precision with a 53-bit significand, resulting in recognizable and distinguishable positive values between approximately \(5 \times 10^{-324}\) and \(1.8 \times 10^{308}\) (see Ch 12 for more information).
The R session itself (the global environment) has base type environment
:
typeof(.GlobalEnv)
[1] "environment"
\(\blacksquare\)
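Class and base type are easy to confuse, so it can help to query both for the same objects. The sketch below assumes x is defined as in Example 2.4; the object y is introduced here only for comparison.

```r
x <- c(23, 34, 10)   # as in Example 2.4
class(x)             # the class is "numeric"
typeof(x)            # the base type is "double"

y <- 1:3
class(y)             # for integers, class and base type agree
typeof(y)

# Functions have class "function" but differ in base type.
typeof(sum)          # a builtin function
typeof(mean)         # a closure

# Number of significand bits used for doubles on this machine.
.Machine$double.digits
```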
2.3.6 Object Attributes
Many R-objects will also have attributes (i.e., characteristics particular to the object or object class).
Example 2.6 \(\text{}\)
Typing:
attributes(x)
NULL
indicates that x
(as defined in Example 2.4) does not have additional attributes. However, using coercion (Section 3.3.4) we can define x
to be an object of class matrix
(a collection of data in a row and column format (see Section 3.1.2)).
attributes(as.matrix(x))
$dim
[1] 3 1
The coerced object has the attribute dim (i.e., dimension). Specifically, as.matrix(x) is a three-celled matrix. It has three rows and one column.
\(\blacksquare\)
Amazingly, classes and attributes allow R to simultaneously store and distinguish objects with the same name. For instance:
[1] 2
[1] 2
In general, it is not advisable to name objects after frequently used functions. Nonetheless, the function mean()
, which calculates the arithmetic mean of a collection of data, is distinguishable from the new user-created object mean
, because these objects have different identifiable class characteristics. We can remove the user-created object mean
, with the function rm()
. This leaves behind only the function mean()
.
rm(mean)
mean
function (x, ...)
UseMethod("mean")
<bytecode: 0x000001b7b181e7e8>
<environment: namespace:base>
The process of how these objects are distinguished by R is further elaborated in Section 8.8.
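The behavior described above can be reproduced with a short sketch. The data values 1, 2, 3 are an assumption made here for illustration; any numeric values would do.

```r
# Create a numeric object that shares its name with the function mean().
mean <- mean(c(1, 2, 3))

mean                 # prints the user-created numeric object
mean(c(1, 2, 3))     # in a function call, R skips non-function objects

class(mean)          # class of the user-created object
class(base::mean)    # class of the function in package base

rm(mean)             # remove the user-created object
is.function(mean)    # only the function mean() remains
```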
2.4 Getting Help
There is no single perfect source for information/documentation for all aspects of R. Detailed manuals from CRAN are available concerning the R language definition, basic operations, and package development. These resources, however, often assume a familiarity with Unix/Linux operating systems and computer science terminology. Thus, they may not be particularly helpful to biologists who are new to R.
2.4.1 help() and ?
A comprehensive help system is available for many R components including operators, and loaded package datasets and functions. The system can be accessed via the question mark operator, ?, and the function help().
Example 2.7 \(\text{}\)
For instance, if I wanted to know more about the plot()
function, I could type:
?plot
or
help(plot)
\(\blacksquare\)
Documentation for packaged R functions (Section 3.5) must include an annotated description of function arguments, along with other pertinent information, and documentation for packaged datasets must include descriptions of dataset variables26. The quality of documentation will generally be excellent for functions from packages in the default R download (i.e., the R-distribution packages, see Section 3.5), but will vary from package to package otherwise. A list of arguments for a function, and their default values, can (often) be obtained using the function formals()
.
formals(plot)
$x
$y
$...
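The qualifier "often" matters here. As a hedged illustration, the sketch below uses the base functions sd() and sum(), chosen only as convenient examples: formals() returns the argument list for an ordinary (closure) function, but returns NULL for primitive functions, for which args() is the more reliable tool.

```r
# Argument names of an ordinary function.
names(formals(sd))

# Default value of a particular argument.
formals(sd)$na.rm

# Primitive functions have no formals; the result is NULL.
formals(sum)

# args() prints the function header, and also works for primitives.
args(sum)
```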
For help and documentation concerning programming metacharacters used in R (for instance @
, #
, ?
, !
, %
, &
, |
), one would enclose the metacharacters with quotes. For example, to find out more information about the logical operator & I could type help("&") or ? "&". Placing two question marks in front of a topic will cause R to search the help files of all installed packages on a workstation for that topic.
Example 2.8 \(\text{}\)
For instance, type:
??lm
or, alternatively
help.search("lm")
for a huge number of help files on linear model functions identified through fuzzy matching.
\(\blacksquare\)
Help for particular R-questions can often be found online using the search engine at http://search.r-project.org/. This link is provided in the Help pulldown menu in the R console (non-Linux only). Helpful online discussions can also be found at Stack Overflow, and at Cross Validated (the statistics site of Stack Exchange).
2.4.2 demo() and example()
The function demo()
allows one access to coded examples that developers have worked out for a particular function or topic. For instance, type:
demo(graphics)
for a brief demonstration of R graphics. Typing
demo(persp)
will provide a demonstration of 3D perspective plots. And, typing:
demo(Hershey)
will provide a demonstration of available modifiable symbols from the Hershey family of fonts (see Ch 6 in Hershey (1967)). Finally, typing:
demo()
lists all of the demos available in the loaded packages for a particular workstation. The function example() usually provides less involved demonstrations drawn from the man directories (short for user manual, see Ch 10) of R packages. For instance, type:
example(plotmath)
for a coded demonstration of mathematical graphics.
2.4.3 Vignettes
R packages often contain vignettes. These are short documents that generally describe the theory underlying algorithms and provide guidance on how to correctly use package functions. Vignettes can be accessed with the function vignette(). To view all vignettes for all installed packages (Section 3.5.1), type:
vignette(all = TRUE)
To view all vignettes available for loaded packages (see Section 3.5.2), type:
vignette(all = FALSE)
To view vignettes for the R contributed package asbio (following its installation), type:
vignette(package = "asbio")
To see the vignette simpson
in package asbio, type:
vignette("simpson", package = "asbio")
The function browseVignettes()
provides an HTML-browser that allows interactive vignette searches.
2.5 Options
To enhance an R session, we can adjust the appearance of the R-console and customize options that affect expression output. These include the characteristics of the graphics devices, the width of print output in the R-console, and the number of print lines and print digits. Changes to some of these parameters can be made by going to Edit\(>\)GUI Preferences in the R-toolbar. Many other parameters can be changed using the options()
function. To see all alterable options one can type:
options()
The resulting list is extensive. To modify options, one would simply define the desired change within parentheses following a call to options
. For instance, to see the default number of digits, I would type:
options("digits")
$digits
[1] 7
To change the default number of digits in output from 7 to 5 in the current session, I would type:
options(digits = 5)
# demonstration using pi
pi
[1] 3.1416
One can revert back to default options by restarting an R session.
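Restarting is not the only way back. A call to options() invisibly returns the previous values of whatever it changes, so these can be stored and restored within the same session. A minimal sketch (the names original and old are arbitrary):

```r
# Record the current setting, for comparison later.
original <- getOption("digits")

# Change the option; the previous value is returned and stored in old.
old <- options(digits = 5)
print(pi)

# Restore the previous setting.
options(old)
getOption("digits")
```

The same pattern is useful inside scripts that temporarily change an option and should leave the session as they found it.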
2.5.1 Advanced Options
To store user-defined options and start up procedures, an .Rprofile file will exist in your R program etc directory. This location would be something like: \(\ldots\)R/R-version/etc. R will silently run commands in the .Rprofile file upon opening. Thus, by customizing the .Rprofile file one can “permanently” set session options, load installed packages, define a favorite package repository (Section 3.5), and even create aliases and defaults for frequently used functions.
The .Rprofile file located in the etc directory is the so-called Rprofile.site file. Additional .Rprofile files can be placed in the working directory (see below). R will check for these and run them after running the Rprofile.site file.
Example 2.9 \(\text{}\)
Here is the content of one of my current .Rprofile
files.
options(repos = structure(c("http://ftp.osuosl.org/pub/cran/")))
.First <- function(){
library(asbio)
cat("\nWelcome to R Ken! ", date(), "\n")
}
.Last <- function(){
cat("\nGoodbye Ken", date(), "\n")
}
The command options(repos = structure(c("http://ftp.osuosl.org/pub/cran/")))
(Line 1) defines my preferred CRAN repository mirror site (Section 3.5). The function .First( )
(Lines 2-5) will be run at the start of the R session and .Last( )
(Lines 6-8) will be run at the end of the session. R functions will be formally introduced in Ch 8. As we go through this book it will become clear that these lines of code force R to say hello, load the package asbio, and print the date/time (using the function date()) when it opens, and to say goodbye and print the date/time when it closes (although the farewell will only be seen when running R from a shell interface, e.g., the Windows Command Prompt).
\(\blacksquare\)
One can create .Rprofile
files, and many other types of R extension files using the function file.create()
. For instance, the code:
file.create("defaults.Rprofile")
will place an empty, editable, .Rprofile file called defaults in the working directory.
2.6 The Working Directory
By default, the R working directory is set to be the home directory of the workstation. The command getwd()
shows the current file path for the working directory.
getwd()
[1] "C:/Users/ahoken/Documents/GitHub/Amalgam"
The working directory can be changed with the command setwd(filepath), where filepath is the location of the desired directory, or by using pulldown menus, i.e., File\(>\)Change dir (non-Linux only). Because R was developed under Unix, we must specify directory hierarchies using forward slashes or by doubling backslashes.
Example 2.10 \(\text{}\)
To establish a working directory file path to the Windows directory:
C:\Users\User\Documents, I would type:
setwd("C:/Users/User/Documents")
or
setwd("C:\\Users\\User\\Documents")
\(\blacksquare\)
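When changing directories inside a script, it is often convenient to be able to return to where we started. The sketch below relies on the fact that setwd() invisibly returns the previous working directory. The temporary directory from tempdir() is used only so that the example runs on any machine, and old_wd is a name chosen here.

```r
# Move to a temporary directory, remembering the previous one.
old_wd <- setwd(tempdir())
getwd()

# file.path() assembles a path using forward slashes on every platform.
file.path("C:", "Users", "User", "Documents")

# Return to the original working directory.
setwd(old_wd)
```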
2.7 Saving and Loading Your Work
As noted in Ch 1, an R session is allocated with a fixed amount of memory that is managed in an on-the-fly manner. An unfortunate consequence of this is that if R crashes, all unsaved information from the work session will be lost. Thus, session work should be saved often. Note that R will not give a warning if you are writing over session files from the R console. The old file will simply be replaced. Three general approaches for saving non-graphics data are possible. These are: 1) saving the history, 2) saving objects, and 3) saving R script. All three of these operations can be greatly facilitated by using an R integrated development environment (IDE) like RStudio (Section 2.9).
2.7.1 R History
To view the history (i.e., the commands that have been used in a session) one can use history(n)
where n
is the number of previous command lines one wishes to see27. For instance, to see the last three commands, one would type28:
history(3)
To save the session history in Windows one can use File\(>\)Save History or the function savehistory()
. For instance, to save the session history to the working directory under the name history1
, I could type:
savehistory(file = "history1.Rhistory")
We can view the code in this file from any text editor. To load the history from a previous session one can use File\(>\)Load History (non-Linux only) or the function
loadhistory()
. For instance, to load history1
I would type:
loadhistory(file = "history1.Rhistory")
To save the history at the end of (almost) every interactive Windows or Unix-alike R session, one can alter the .Rprofile
file .Last
function to include:
.Last <- function() if(interactive()) try(savehistory("~/.Rhistory"))
2.7.2 R Objects
To save all of the objects available in the current R-session one can use File\(>\)Save Workspace (non-Linux only), or simply type:
save.image()
This procedure saves session objects to the working directory as a nameless file using an .RData extension. The file will be opened, silently, with the inception of the next R session started in that directory, and cause objects used or created in the previous session to be available. Indeed, at startup R will automatically load the file named .RData in the working directory, if one exists. Stored .RData files can also be loaded using File\(>\)Load Workspace (non-Linux only). One can also save .RData objects to a specific directory location and use a specific file name using: File\(>\)Save Workspace, or with the flexible function save().
R data files, with the extensions .rda and .RData, can be read into R using the function load(). R scripts (.R files) are instead run with the function source() (Section 2.7.3). Users new to a command line environment will be reassured by typing:
load(file.choose())
The function file.choose()
will allow one to browse interactively for files to load using dialog boxes. Detailed procedures for importing (reading) and exporting (saving) data with a row and column format, and an explicit delimiter (e.g. .csv files) are described in Ch 3.
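A complete save-and-restore cycle with save() and load() might look like the following sketch. The objects a and b and the temporary file location are placeholders invented here; in practice one would supply a meaningful file name in the working directory.

```r
# Two objects to be saved.
a <- c(23, 34, 10)
b <- "some text"

# Save selected objects to a specific file.
path <- tempfile(fileext = ".RData")
save(a, b, file = path)

# Remove the objects, then restore them from the file.
rm(a, b)
loaded <- load(path)

# load() returns the names of the objects it restored.
loaded
a
```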
2.7.3 R Scripts
To save an R script as an executable source file, it is best to use an Integrated Development Environment (IDE) compatible with R. R contains its own IDE, the R-editor, which is useful for writing, editing, and saving scripts as .r extension files. To access the R-editor go to File\(>\)New script (non-Linux only) or type the shortcut Ctrl + F + N (Fig 2.3). Code written in the R IDE can be sent directly to the R-console by copying and pasting or by selecting code and using the shortcut Ctrl + R.

Figure 2.3: The R-editor providing code for a famous computational exercise.
Aside from the R-editor, a number of other IDEs allow straightforward generation of R script files, and provide a direct link between text editors with syntax highlighting for R code and the R-console itself. These include RWinEdt (an R package plugin for WinEdt), Tinn-R (a recursive acronym for Tinn is not Notepad), ESS (Emacs Speaks Statistics), Jupyter Notebook (a web-based IDE originally designed for Python, but useful for many languages), and particularly RStudio, which will be introduced later in this chapter29.
Saved R scripts can be called and executed using the function source(). To browse interactively for source code files, one can type:
source(file.choose())
or go to File\(>\)Source R code.
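Scripts can also be written and sourced entirely from code, which is a handy way to see what source() does. In this sketch the script is a two-line temporary file; the file location and the object z are placeholders chosen here.

```r
# Write a small script to a temporary .R file.
script <- tempfile(fileext = ".R")
writeLines(c("z <- 2 + 2", "print(z)"), script)

# Run every line of the script, as if it were typed at the console.
source(script)

# Objects created by the script are now available in the session.
z

# Remove the temporary file.
unlink(script)
```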
2.8 Basic Mathematics
A large number of mathematical operators and functions are available with a conventional download of R.
Elementary mathematical operators, common mathematical constants, trigonometric functions, derivative functions, integration approaches, and basic statistical functions are shown in Tables 2.3 - 2.9.
2.8.1 Elementary Operations
Operator | Operation | To find: | We type: |
---|---|---|---|
+ | addition | \(2 + 2\) | 2 + 2 |
- | subtraction | \(2 - 2\) | 2 - 2 |
* | multiplication | \(2 \times 2\) | 2 * 2 |
/ | division | \(\frac{2}{3}\) | 2/3 |
%% | modulo | remainder of \(\frac{5}{2}\) | 5%%2 |
%/% | integer division | \(\frac{5}{2}\) without remainder | 5%/%2 |
^ | exponentiation | \(2^3\) | 2^3 |
abs(x) | \(\mid x \mid\) | \(\mid -23.7 \mid\) | abs(-23.7) |
round(x, digits = d) | round \(x\) to \(d\) digits | round \(-23.71\) to 1 digit | round(-23.71, 1) |
ceiling(x) | round \(x\) up to closest whole num. | \(\lceil 2.3 \rceil\) | ceiling(2.3) |
floor(x) | round \(x\) down to closest whole num. | \(\lfloor 2.3 \rfloor\) | floor(2.3) |
sqrt(x) | \(\sqrt{x}\) | \(\sqrt{2}\) | sqrt(2) |
log(x) | \(\log_e{x}\) | \(\log_e{5}\) | log(5) |
log(x, base = b) | \(\log_b{x}\) | \(\log_{10}{5}\) | log(5, base = 10) |
factorial(x) | \(x!\) | \(5!\) | factorial(5) |
gamma(x) | \(\Gamma(x)\) | \(\Gamma(3.2)\) | gamma(3.2) |
choose(n,x) | \(\binom{n}{x}\) | \(\binom{5}{2}\) | choose(5,2) |
sum(x) | \(\sum_{i=1}^{n}x_i\) | sum of x | sum(x) |
cumsum(x) | cumulative sum | cum. sum of x | cumsum(x) |
prod(x) | \(\prod_{i=1}^{n}x_i\) | product of x | prod(x) |
cumprod(x) | cumulative product | cum. prod. of x | cumprod(x) |
2.8.2 Associativity and Precedence
Note that the operation:
2 + 6 * 5
[1] 32
is equivalent to \(2 + (6 \cdot 5) = 32\). This is because the *
operator gets higher priority (precedence) than +
. Evaluation precedence can be modified with parentheses:
(2 + 6) * 5
[1] 40
Among operators with equal precedence, mathematical operations in R are (generally) read from left to right (that is, their associativity is from left to right) (Table 2.4). Together, precedence and associativity reproduce the conventional order of operations in mathematics. For instance:
2 + 2^(2 + 1)
[1] 10
Precedence | Operator | Description | Associativity |
---|---|---|---|
1 | ^ | exponent | right to left |
2 | %% | modulo | left to right |
3 | * / | multiplication, division | left to right |
4 | + - | addition, subtraction | left to right |
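A few one-line experiments make the rules in Table 2.4 concrete. These particular expressions are illustrations chosen here; each comment gives the grouping R applies.

```r
2^3^2        # right to left: 2^(3^2) = 512
(2^3)^2      # parentheses force the other grouping: 64
-2^2         # exponentiation precedes the minus sign: -(2^2) = -4
10 - 4 - 3   # left to right: (10 - 4) - 3 = 3
7 %% 3 * 2   # modulo precedes multiplication: (7 %% 3) * 2 = 2
```

When in doubt, adding parentheses costs nothing and makes the intended order explicit.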
2.8.3 Function Arguments
R functions generally require a user to specify arguments (in parentheses) following the function name. For instance, sqrt() and factorial() each require one argument, a call to data itself. Thus, to solve \(1/\sqrt{22!}\), I could type:
1/sqrt(factorial(22))
[1] 2.9827e-11
To solve \(\Gamma \left( \sqrt[3]{23\pi} \right)\), I could type:
gamma((23 * pi)^(1/3))
[1] 7.411
By default the function log() computes natural logarithms, i.e.,
log(exp(1))
[1] 1
The log() function can also compute logarithms to a particular base by specifying the base in an optional second argument called base. For instance, to solve the operation: \(\log_{10}3 + \log_{3}5\), one could type:
log(3, 10) + log(5, 3)
[1] 1.9421
Arguments can be specified by the order that they occur in the list of arguments in the function code, or by calling the argument by name. In the code above I know that the first argument in log() is a call to data, and the second argument defines the base. I may not, however, remember the argument order in a function, or may wish to only change certain arguments from a large allotment. In this case it is better to specify an argument by calling its name and defining its value with an equals sign.
log(x = 3, base = 10) + log(x = 5, base = 3)
[1] 1.9421
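One practical consequence of naming arguments is that their order no longer matters. The following sketch uses \(\log_{10} 100\), a value chosen here because its answer is easy to check by hand.

```r
# Positional matching: the first argument is x, the second is base.
log(100, 10)

# Named matching gives the same result.
log(x = 100, base = 10)

# Named arguments may be supplied in any order.
log(base = 10, x = 100)
```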
2.8.4 Constants
R allows easy access to most conventional constants (Table 2.5).
Operator | Operation | To find: | We type: |
---|---|---|---|
-Inf | \(-\infty\) | \(-\infty\) | -Inf |
Inf | \(\infty\) | \(\infty\) | Inf |
pi | \(\pi = 3.141593 \dots\) | \(\pi\) | pi |
exp(1) | \(e = 2.718282 \dots\) | \(e\) | exp(1) |
exp(x) | \(e^x\) | \(e^3\) | exp(3) |
2.8.5 Trigonometry
R assumes that the inputs for trigonometric functions are in radians. Of course degrees can be obtained from radians using \(Degrees = Radians \times 180/\pi\), or conversely \(Radians = Degrees \times \pi /180\) (Table 2.6).
Operator | Operation | To find: | We type: |
---|---|---|---|
cos(x) | \(\text{cos}(x)\) | \(\text{cos}(3 \text{ rad.})\) | cos(3) |
sin(x) | \(\text{sin}(x)\) | \(\text{sin}(45^{\circ})\) | sin(45 * pi/180) |
tan(x) | \(\text{tan}(x)\) | \(\text{tan}(3 \text{ rad.})\) | tan(3) |
acos(x) | \(\text{acos}(x)\) | \(\text{acos}(0.5)\) | acos(0.5) |
asin(x) | \(\text{asin}(x)\) | \(\text{asin}(0.5)\) | asin(0.5) |
atan(x) | \(\text{atan}(x)\) | \(\text{atan}(1)\) | atan(1) |
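Because the conversion between degrees and radians is needed so often, it can be wrapped in two small helper functions. The names deg2rad() and rad2deg() are hypothetical (they are not part of base R), and functions themselves are covered in Ch 8. Note that the inverse functions acos(), asin(), and atan() take a ratio as input and return an angle in radians.

```r
# Hypothetical helpers for converting between degrees and radians.
deg2rad <- function(deg) deg * pi / 180
rad2deg <- function(rad) rad * 180 / pi

# Sine of 30 degrees.
sin(deg2rad(30))

# Angle, in degrees, whose tangent is 1.
rad2deg(atan(1))

# A round trip returns the starting value.
rad2deg(deg2rad(45))
```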
2.8.6 Derivatives
The function D() finds symbolic and numerical derivatives of simple expressions. It requires two arguments, 1) a mathematical function specified as an object of class expression, and 2) the variable name in the differential (the denominator in the difference quotient).
Objects of class expression can be created using the function expression(), and evaluated with the function eval().
Example 2.11 \(\text{}\)
Here is an example of how the functions expression()
and eval()
are used:
eval(expression(2 + 2))
[1] 4
Of course we wouldn’t bother to use expression()
and eval()
in such simple applications.
\(\blacksquare\)
Table 2.7 contains specific examples using D()
.
To find: | We type: |
---|---|
\(\frac{d}{dx}5x\) | D(expression(5 * x), "x") |
\(\frac{d^2}{dx^2} 5x^2\) | D(D(expression(5 * x^2), "x"), "x") |
\(\frac{\partial}{\partial x} 5xy + y\) | D(expression(5 * x * y + y), "x") |
Example 2.12 \(\text{}\)
Thus, to solve:
\[\frac{d}{dx} 20x^{-4}\]
I could use:
e <- expression(20 * x^(-4))
D(e, "x")
20 * (x^((-4) - 1) * (-4))
Unfortunately, it is left to us to simplify the ugly output. That is, \[\begin{aligned} \frac{d}{dx}\left(20x^{-4}\right) &= 20 \times \left(x^{(-4) - 1} \times (-4)\right)\\ &= -80x^{-5} \\ &= -\frac{80}{x^5} \end{aligned}\]
\(\blacksquare\)
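Although the output of D() is untidy, it is an ordinary R language object, so it can be evaluated numerically with eval(). The sketch below evaluates the derivative from Example 2.12 at \(x = 2\) (a value chosen here for illustration) and compares it with the simplified form \(-80/x^5\).

```r
e <- expression(20 * x^(-4))
dx <- D(e, "x")

# Evaluate the unsimplified derivative at x = 2.
eval(dx, list(x = 2))

# The simplified form gives the same value.
-80 / 2^5
```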
Several other R functions provide tidier derivative results compared to `D()`, although they require the installation and loading of additional packages (see Section 9.6.2). For instance, the function `Deriv()`, from the package Deriv can be applied using two approaches30.
- Under the first approach a differentiable function is defined as an R `function` (see Ch 8) whose one argument is the variable name in the differential. This function is then used as the single required argument in `Deriv()`.
- With the second approach a differentiable function is defined as a character string. This is then used as the first argument in `Deriv()`. The variable name in the differential is defined in a second argument.
Example 2.13 \(\text{}\)
To obtain the derivative in Example 2.12 using `Deriv()` we would first install the Deriv package (for instance using: `install.packages("Deriv")`) and load the package:
library(Deriv)
Under the first approach we could then type:
d <- Deriv(function(x) 20 * x^(-4))
d
function (x)
-(80/x^5)
Note that the output, `d`, is a function, allowing one to obtain instantaneous slopes for specified `x` values.
d(c(-1, 2, 3, 5.2))
[1] 80.000000 -2.500000 -0.329218 -0.021041
Under the second approach, we could specify
Deriv("20 * x^(-4)", "x")
[1] "-(80/x^5)"
Note that the output is a character string.
Both approaches allow one to obtain higher order derivatives and partial derivatives. For instance,
Deriv(d) # second derivative
function (x)
400/x^6
[1] "400/x^6"
`D()` results can also be simplified directly with the function `Simplify()` from the package Deriv. For the current example one could use:
e <- expression(20 * x^(-4))
Simplify(D(e, "x"))
-(80/x^5)
\(\blacksquare\)
2.8.7 Integration
The function `integrate()` numerically approximates definite integrals. It requires three arguments. The first is an R function defining the integrand. The second and third are the lower and upper bounds of integration.
Example 2.14 \(\text{}\)
To solve:
\[\int^4_2 3x^2dx\]
we could type:
f <- function(x){3 * x^2}
integrate(f, 2, 4)
56 with absolute error < 6.2e-13
\(\blacksquare\)
R functions are explicitly addressed in Ch 8.
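The object returned by `integrate()` is a list, so the numerical estimate and its error bound can be extracted for later use, and `-Inf` or `Inf` can be given as bounds. A small sketch, using an integrand chosen here only for illustration, \(\int_0^\infty e^{-x}dx = 1\):

```r
g <- function(x) exp(-x)           # integrand; accepts and returns a vector
res <- integrate(g, lower = 0, upper = Inf)
res$value                          # numerical estimate, close to 1
res$abs.error                      # estimated absolute error

# Check Example 2.14 against the antiderivative x^3 evaluated at the bounds
f <- function(x) 3 * x^2
all.equal(integrate(f, 2, 4)$value, 4^3 - 2^3)
```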
2.8.8 Statistics
R, of course, contains a huge number of statistical functions. These will generally require sample data for summarization. Data can be brought into R from spreadsheet files or other data storage files (we will learn how to do this shortly). As we have learned, data can also be assembled in R. For instance,
x <- c(1, 2, 3)
Statistical estimators can be separated into point estimators, which estimate an underlying parameter that has a single true value (from a Frequentist viewpoint), and intervallic estimators, which estimate the bounds of an interval that is expected, preceding sampling, to contain a parameter at some probability (Aho 2014). Point estimators can be further classified as estimators of location, scale, shape, and order statistics (Table 2.8). Measures of location estimate the typical or central value from a sample. Examples include the arithmetic mean and the sample median. Measures of scale quantify data variability or dispersion. Examples include the sample standard deviation and the sample interquartile range (IQR). Shape estimators describe the shape (i.e., symmetry and peakedness) of a data distribution. Examples include the sample skewness and sample kurtosis. Finally, the \(k\)th order statistic of a sample is equal to its \(k\)th-smallest value. Examples include the data minimum, the data maximum, and other quantiles (including the median). Intervallic estimators include confidence intervals (Table 2.9). A huge number of other statistical estimating, modelling, and hypothesis testing algorithms are also available for the R environment. For guidance, see Venables and Ripley (2002), Aho (2014), and Fox and Weisberg (2019), among others.
Function | Acronym | Description | Estimator type |
---|---|---|---|
mean(x) | \(\bar{x}\) | arithmetic mean of \(x\) | location |
mean(x, trim = t) | | trimmed mean of \(x\) for \(0 \leq t \leq 0.5\) | location |
asbio::G.mean(x) | \(GM\) | geometric mean of \(x\) | location |
asbio::H.mean(x) | \(HM\) | harmonic mean of \(x\) | location |
median(x) | \(\tilde{x}\) | median of \(x\) | location, order statistic |
asbio::Mode(x) | \(mode(x)\) | mode of \(x\) | location |
sd(x) | \(s\) | standard deviation of \(x\) | scale |
var(x) | \(s^2\) | variance of \(x\) | scale |
cov(x, y) | \(cov(x,y)\) | covariance of \(x\) and \(y\) | scale |
cor(x, y) | \(r_{x,y}\) | Pearson correlation of \(x\) and \(y\) | scale |
IQR(x) | \(IQR\) | interquartile range of \(x\) | scale, order statistic |
mad(x) | \(MAD\) | median absolute deviation of \(x\) | scale |
asbio::skew(x) | \(g_1\) | skew of \(x\) | shape |
asbio::kurt(x) | \(g_2\) | kurtosis of \(x\) | shape |
min(x) | \(min(x)\) | min of \(x\) | order statistic |
max(x) | \(max(x)\) | max of \(x\) | order statistic |
quantile(x, probs = p) | \(\hat{F}^{-1}(p)\) | quantile of \(x\) at lower-tailed probability \(p\) | order statistic |
Function | Description |
---|---|
asbio::ci.mu.z(x, conf, sigma) | Conf. int. for \(\mu\) at level conf. True SD = sigma. |
asbio::ci.mu.t(x, conf) | Conf. int. for \(\mu\) at level conf. \(\sigma\) unknown. |
asbio::ci.median(x, conf) | Conf. int. for true median at level conf. |
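Several of the base R functions in Table 2.8 can be applied directly to a small dataset. The data below are made up for illustration, and the confidence interval is computed by hand from the usual \(t\)-based formula, \(\bar{x} \pm t_{1-\alpha/2,\,n-1}\, s/\sqrt{n}\), rather than with the asbio functions in Table 2.9, so that only base R is needed.

```r
x <- c(2, 4, 4, 5, 7, 9, 11)       # made-up sample

mean(x)                            # arithmetic mean: 6
mean(x, trim = 0.2)                # drops one value from each end: 5.8
median(x)                          # 5
var(x)                             # 10
sd(x)                              # square root of the variance
IQR(x)
quantile(x, probs = c(0.1, 0.9))
range(x)                           # min and max together

# 95% t-based confidence interval for the true mean
n <- length(x)
ci <- mean(x) + c(-1, 1) * qt(0.975, df = n - 1) * sd(x) / sqrt(n)
ci
```

The hand-computed interval should agree with the one reported by `t.test(x)$conf.int`.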
2.9 RStudio
RStudio is an open source IDE for R (Fig 2.4). RStudio greatly facilitates writing R code, saving and examining R objects and history, and many other processes. These include, but are not limited to, documenting session workflows, writing R package documentation, calling and receiving code from other languages, and even developing web-based graphical user interfaces. RStudio can currently be downloaded at (https://posit.co/products/open-source/rstudio/). Like R itself, RStudio can be used with Windows, Mac, and Unix/Linux operating systems. Unlike R, RStudio has both freeware and commercial versions31. We will use the former here.

Figure 2.4: The RStudio logo.
RStudio is generally implemented using a four pane workspace (Fig 2.5). These are: 1) the code editor, 2) the R-console, 3) the environment and histories panel, and 4) the plots and other miscellany panel. Tabs in panels may vary to a small degree depending on the source code being edited, and whether an RStudio project is open (Section 2.9.1).

Figure 2.5: Interfaces for RStudio 2023.06.2 Build 561.
The RStudio Code Editor panel (Fig 2.5, Panel 1) allows one to create R scripts and even scripts for other languages that can be called to and from R (Ch 9). The code panel can also be used to create and edit session documentation files (see Section 2.9.2 below) and other important R file types. A new R script can be created for editing within the code editor by going to File\(>\)New File\(>\)R Script. Commands from an R script can be sent to the R console using the shortcut Ctrl + Enter (Windows and Linux) or Cmd + Enter (Mac).
The R-console panel (Fig 2.5, Panel 2) by default, is identical in functionality to the R console of the most recent version of R on your workstation (assuming that all of the paths and environments are set up correctly on your computer). Thus, the console panel can be used directly for typing and executing R code, or for receiving commands from the code editor (Panel 1).
The Environments and History panel (Fig 2.5, Panel 3) can be used to: 1) show a list of R objects available in your R session (the Environment tab), or 2) show, search, and select from the history of all previous commands (History tab). This panel also provides an interface for point and click import of data files including .csv, .xls, and many other file formats (Import Dataset pulldown within the Environment tab).
The Plots and Miscellany panel (Fig 2.5, Panel 4) can be used to show: 1) files in the working directory, 2) a scrollable history of plots and image files, and 3) a list of available packages (via the Packages tab), with facilities for updating and installing packages. If a package is checked in the GUI list, then the package is currently loaded. Packages and their installation, updating, and loading are formally introduced in Section 3.5. The panel’s Files pulldown tab allows straightforward establishment of working directories (although this can still be done at the command line using `setwd()`) (Fig 2.7). The panel’s Help tab opens automatically when one uses `?` or `help()` for particular R topics (Section 2.4).
CAUTION!
Be very careful when managing files in the Plots and Miscellany panel, as you can permanently delete files without (currently) the possibility of recovery from a Recycling Bin.
2.9.1 RStudio Project
An RStudio project can be created via the File pulldown menu (Fig 2.7). A project allows all related files (data, figures, summaries, etc.) to be easily organized together by setting the working directory to be the location of the project .Rproj file.
2.9.2 Workflow Documentation
We can document workflow and simultaneously run/test R session code by either:
- Creating an R Markdown32 .rmd file that can be compiled to generate an .html, .pdf, or MS Word\(^{\circledR}\) .docx file, or
- Using Sweave, an approach that implements the LaTeX33 document preparation system.
2.9.2.1 R Markdown
The R Markdown document processing workflow in RStudio is shown in Fig 2.6. These steps are highly modifiable, but can also be run in a more or less automated manner, requiring little understanding of underlying processes.

Figure 2.6: The process of document creation in R Markdown. Functions in the package rmarkdown control conversion of .rmd files to Markdown .md files, using utilities in the package knitr. Pandoc first creates a .tex file when rendering LaTeX PDF documents.
Use of R Markdown and .rmd files requires the package rmarkdown (Allaire et al. 2024), which comes pre-installed in RStudio.
As an initial step, all underlying .rmd files must include a brief YAML34 header (see below) containing document metadata. The remainder of the .rmd document will contain text written in Markdown syntax, and code chunks. The `knit()` function from the package knitr (Xie 2015), also installed with RStudio, executes all evaluable code within chunks, and formats the code and output for processing within Pandoc, a program for converting markup files from one language to another35. Pandoc uses the YAML header to guide this conversion. As an example, if one has requested HTML output, the simple Markdown text: `This is a script` will be converted to the HTML formatted: `<p>This is a script</p>`. One can also write HTML script or CSS code36 directly into an .rmd document (see Section 11.5).
If the desired output is PDF, Pandoc will convert the .md file into a temporary .tex file, which is then processed by the LaTeX typesetting system. Support for LaTeX can be found at its official website, and at a large number of informal user-driven venues, including Stack Exchange and Overleaf, an online LaTeX application. LaTeX will compile the .tex file into a .pdf file. In this process, the tinytex package (Xie 2024), which installs the stripped-down LaTeX distribution TinyTeX, can be used.
Creating an R Markdown document is simple in RStudio. We first open an empty .rmd document by navigating to File \(>\) New File \(>\)R Markdown (Fig 2.7).

Figure 2.7: Part of the RStudio File pulldown menu.
You will be delivered to the GUI shown in Fig 2.8. Note that by default Markdown compilation generates an HTML document.

Figure 2.8: RStudio GUI for creating an R Markdown document.
The GUI opens an R Markdown (.rmd) skeleton document with a tentative YAML header.

Figure 2.9: YAML header to an R Markdown (.rmd) skeleton document.
Among other options37, the default HTML output can be changed to one of:
output: pdf_document
to create a LaTeX \(\rightarrow\) PDF document, or
output: word_document
to create a Word\(^{\circledR}\) document.
Unformatted text can generally be written directly into an R Markdown document. There are a number of useful Markdown shortcuts for creating headings, formatted text, and other content.
- Pound signs (e.g., `#`, `##`, `###`) can be used as (increasingly nested) hierarchical section delimiters.
- Italic, bold, and monospace code fonts can be specified by enclosing text in asterisks, double asterisks, and back ticks, respectively. That is, `*italic*`, `**bold**`, and `` `code` `` result in: *italic*, **bold**, and `code`.
- Unordered lists can be created with newlines preceded with asterisks, `*`, and ordered lists can be specified with newlines beginning with numbers, e.g., `1.`, `2.`, etc.
A brief introduction to R Markdown can be found at: http://rmarkdown.rstudio.com. A thorough description of R Markdown is given in Xie, Allaire, and Grolemund (2018) and Xie, Dervieux, and Riederer (2020). The latter text is currently available as an online resource.
2.9.2.1.1 R code in R Markdown chunks
The knitr R package facilitates report building in both HTML and LaTeX \(\rightarrow\) PDF formats, within the framework of rmarkdown (Fig 2.6). Under knitr, R Markdown lines beginning ```{r } and ending ``` delimit an R code “chunk” to be potentially run in R.
Example 2.15 \(\text{}\)
For example, the chunk:
would prompt knitr to: 1) show the code in an appropriate highlighted style, 2) run the code in R (i.e., take the mean of the three numbers), and 3) print the evaluation result.
\(\blacksquare\)
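One way to experiment with chunks outside of RStudio's menus is to write a small .rmd file from R itself. The sketch below is only illustrative: the file name `chunk_demo.Rmd`, the document title, and the three numbers (1, 2, and 3) are assumptions. The chunk delimiters are built with `strrep()` simply to avoid typing literal back ticks inside an R string.

```r
fence <- strrep("`", 3)                      # three back ticks
rmd_lines <- c(
  "---",
  "title: \"Chunk demo\"",
  "output: html_document",
  "---",
  "",
  "The mean of three numbers:",
  "",
  paste0(fence, "{r}"),                      # chunk header
  "mean(c(1, 2, 3))",                        # R code to be evaluated
  fence                                      # chunk close
)
rmd_file <- file.path(tempdir(), "chunk_demo.Rmd")
writeLines(rmd_lines, rmd_file)

# With rmarkdown and Pandoc installed, the file could then be knit using:
# rmarkdown::render(rmd_file)
```

Knitting this file would show the code, run it, and print the result, `[1] 2`, in the output document.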
The chunk header, ```{r }, can contain additional options. These include suppressing code evaluation, ```{r, eval = FALSE}, and/or suppressing code printing, ```{r, echo = FALSE}. For a complete list of chunk options, run
str(knitr::opts_chunk$get())
Code chunks can be generated by going to Code\(>\)Insert Chunk or by using the RStudio shortcut Ctrl + Alt + I (Windows and Linux) or Cmd + Alt + I (Mac).
R code can also be invoked inline in an R Markdown document using the format:
`r some code`
For instance, I could seamlessly place three random numbers generated from the continuous uniform distribution, \(f(x) = UNIF(0,1)\), inline into text using:
`r runif(3)`
Here I run an iteration using “hidden” inline R code: 0.05143, 0.27118, 0.48557.
2.9.2.1.2 Equations
Inline equations for both R Markdown and Sweave (discussed below) can be specified under the LaTeX system, which uses dollar signs, `$`, to delimit equations. For instance, to obtain the inline equation: \(P(\theta|y) = \frac{P(y|\theta)P(\theta)}{P(y)}\), i.e., Bayes' theorem, I could type the LaTeX script into R Markdown:
$P(\theta|y) = \frac{P(y|\theta)P(\theta)}{P(y)}$
Display-style equations can be specified with two dollar signs, `$$`. For instance, `$$P(\theta|y) = \frac{P(y|\theta)P(\theta)}{P(y)}$$` results in:
\[P(\theta|y) = \frac{P(y|\theta)P(\theta)}{P(y)}\]
A cheatsheet for LaTeX equation writing can be found here.
2.9.2.1.3 Figures
Probably the simplest way to place external figures into a document is by applying the function `knitr::include_graphics()` from within a chunk. The following R Markdown code would insert `Fig1.jpg` (contained in the working directory) into an R Markdown document.
Figures can also be generated from the execution of R plotting functions (see Ch 6, 7). For instance, the following R Markdown code would place a simple R-generated scatterplot into the document:
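The bodies of two such chunks might look like the following sketch. The external file `Fig1.jpg` is only assumed to exist, so that call is left commented out, and the scatterplot uses the `Loblolly` dataset from the datasets package purely as an illustration. In an .rmd file each piece would sit inside its own chunk.

```r
# Chunk body 1: insert an external image from the working directory
# (commented out because Fig1.jpg is assumed, not supplied)
# knitr::include_graphics("Fig1.jpg")

# Chunk body 2: an R-generated scatterplot
plot(height ~ age, data = Loblolly,
     xlab = "Age (yrs)", ylab = "Height (ft)")
```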
2.9.2.1.4 Tables
R Markdown tables can be created by specifying the following format (outside of a chunk).
First Header | Second Header
------------- | -------------
Content Cell | Content Cell
Content Cell | Content Cell
Tables, however, can also be generated by executing R functions within chunks. I generally use the function `knitr::kable()` to create R Markdown \(\rightarrow\) Pandoc \(\rightarrow\) HTML tables because it is relatively simple to use, and allows straightforward tabling of R output.
Example 2.16 \(\text{}\)
Table 2.10 shows data from the `Loblolly` dataset in the package datasets. The data track the growth of loblolly pine trees (Pinus taeda) with respect to seed type and age. The function `head()`, nested in `kable()`, returns the first components of an R data storage object. By default, `head()` returns the first six values (in this case, the first six dataframe rows).
 | height | age | Seed |
---|---|---|---|
1 | 4.51 | 3 | 301 |
15 | 10.89 | 5 | 301 |
29 | 28.72 | 10 | 301 |
43 | 41.74 | 15 | 301 |
57 | 52.70 | 20 | 301 |
71 | 60.92 | 25 | 301 |
\(\blacksquare\)
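A table like Table 2.10 could be produced from within a chunk along the following lines. This is a sketch rather than the exact call used for the table; the object name `lob_head` and the caption text are assumptions, and the `kable()` call is skipped if knitr is not installed.

```r
lob_head <- head(Loblolly)        # first six rows of the dataframe
lob_head

if (requireNamespace("knitr", quietly = TRUE)) {
  # Produces a Markdown, HTML, or LaTeX table depending on the output format
  knitr::kable(lob_head, caption = "First six rows of the Loblolly dataset.")
}
```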
I often use functions in the package xtable to build R Markdown \(\rightarrow\) Pandoc \(\rightarrow\) LaTeX \(\rightarrow\) PDF tables. Under this approach, one could create Table 2.10 using:
This method would also require that one use the option `results = 'asis'` in the chunk options.
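The xtable code for this approach might resemble the sketch below. The caption and label are assumptions made for illustration, and the code is skipped if xtable is not installed. In an .rmd file it would be placed in a chunk that uses `results = 'asis'`, so that the LaTeX emitted by `print()` is passed through unchanged.

```r
if (requireNamespace("xtable", quietly = TRUE)) {
  xt <- xtable::xtable(head(Loblolly),
                       caption = "First six rows of the Loblolly dataset.",
                       label = "tab:lob")     # hypothetical label
  # comment = FALSE suppresses the xtable timestamp comment in the output
  print(xt, type = "latex", comment = FALSE)
}
```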
One can even call for different table approaches on the fly. For instance, I could use the command `eval = knitr::is_html_output()` in the options of a Markdown chunk when using table code that optimizes HTML formatting, and use `eval = knitr::is_latex_output()` to create a table that optimizes LaTeX formatting.
Aside from `knitr::kable()` and xtable, there are many other R functions and packages that can be used to create R Markdown tables, particularly for HTML output. These include:
- The kableExtra (Zhu et al. 2022) package, which extends `knitr::kable()` by including styles for fonts, features for specific rows, columns, and cells, and straightforward merging and grouping of rows and/or columns. Most kableExtra features extend to both HTML and PDF formats.
- DT (Xie, Cheng, and Tan 2024), a wrapper for HTML tables that uses the JavaScript (see Section 11.3) library DataTables. Among other features, DT allows straightforward implementation in interactive Shiny apps (Section 11.5).
- The reactable package (Lin 2023), which, like DT, creates flexible, interactive HTML embedded tables. As with DT, the interactivity of reactable tables makes it harder to treat them as conventional R Markdown tables with captions and referable labels.
Xie, Dervieux, and Riederer (2020) discuss several other alternatives.
Below I use the function `reactable()` from the reactable package to create a table with sortable columns and scrollable rows (Table 2.11).
# install.packages("reactable")
library(reactable)
reactable(Loblolly, pagination = FALSE, highlight = TRUE, height = 250)
Example 2.17 \(\text{}\)
An R Markdown (.rmd) skeleton file generated by RStudio (Figs 2.7-2.9) contains documentation text, interspersed with example R code in chunks. These have been modified below to create a simple R Markdown document for summarizing the `Loblolly` dataset (Fig 2.10).

Figure 2.10: An R Markdown (.rmd) file with documentation text and interspersed R code in chunks.
Note the use of `echo = FALSE` in the final chunk to suppress printing of R code. A snapshot of the knitted HTML is shown in Fig 2.11.

Figure 2.11: An HTML document knit from Markdown code in the previous figure. Note that code is displayed (by default) as well as executed.
\(\blacksquare\)
2.9.2.1.5 bookdown
A large number of useful auxiliary features are available for R Markdown through the R package bookdown (Xie 2023). These include an extended capacity for figure, table, and section numbering and referencing. The bookdown package is not pre-installed in RStudio, and will require user-installation. See Section 9.6.2 for more information on loading and installing packages.
install.packages("bookdown") # install bookdown package
To use bookdown we must modify the `output:` designation in the YAML header to have a bookdown-specific output. For instance,
output: bookdown::html_document2
to create an HTML document, or
output: bookdown::pdf_document2
to create a LaTeX \(\rightarrow\) PDF document, or
output: bookdown::word_document2
to create an MS Word\(^{\circledR}\) document38.
Numbering R-generated plots and tables in bookdown requires specification of a chunk label after the language reference, e.g., `r`, in the chunk generating the plot or table. Importantly, many table generating R functions (e.g., `knitr::kable()` and `xtable::xtable()`, see Section 2.9.2.1.4) also contain a `label` argument that allows referencing and numbering.
Example 2.18 \(\text{}\)
In the chunk header below I use the label `lobplot`. Note that a space is included after `r`. Captions can be specified in the chunk header using the chunk option `fig.cap` or `tab.cap` for figures and tables, respectively. The option `fig.cap` is used below:
```{r lobplot, echo=FALSE, fig.cap= "Loblolly pine height versus age."}
\(\blacksquare\)
Cross-references within the text can be made using the syntax `\@ref(type:label)`, where `label` is the chunk label and `type` is the environment being referenced (e.g., `fig`, `tab`, or `eq`). For Example 2.18, we might want to type something like: “see Figure `\@ref(fig:lobplot)`.” in some non-chunk component of the Markdown document.
Specification of a bookdown output format will result in automated numbering of sections39. To turn this numbering off, one could modify the YAML output to be:
output:
  bookdown::html_document2:
    number_sections: false
The code indents shown above are important because YAML, like the language Python, uses significant indentation.
To omit numbering for certain sections, one would retain the default bookdown output, and add `{-}` after the unnumbered section heading, e.g.,
# This section is unnumbered {-}
Thorough guidance for bookdown is provided in Xie (2016), which can be viewed as an open-source online document.
2.9.2.2 Sweave
Under the Sweave documentation approach, high quality PDF documents are generated from LaTeX .tex files, which in turn are created from Sweave .rnw files. A skeleton .rnw document can be generated in RStudio by going to File\(>\)New File\(>\)R Sweave40.
2.9.2.2.1 R code in Sweave chunks
Sweave chunks can be implemented using knitr-style formatting, or with formatting under the function `Sweave()` (Leisch 2002). Switching between these formats in RStudio requires altering options in Build\(>\)Configure Build Tools\(>\)Sweave.
In RStudio, Sweave code chunks are initiated with `<<>>=`, which serves as a chunk header, and are closed with `@`.
Example 2.19 \(\text{}\)
Including the chunk below in an .rnw file would: 1) cause the R source code to be printed in a LaTeX-rendered PDF, 2) run the code in R (the mean of the three numbers would be calculated), and 3) print the evaluated result in the output PDF.
\(\blacksquare\)
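A self-contained way to see such a chunk processed is to write a tiny .rnw file from R and weave it with `utils::Sweave()`, which produces the intermediate .tex file without requiring a LaTeX installation. In this sketch the file names and the three numbers (1, 2, and 3) are assumptions made for illustration.

```r
rnw_lines <- c(
  "\\documentclass{article}",
  "\\begin{document}",
  "The mean of three numbers:",
  "<<>>=",                       # chunk header
  "mean(c(1, 2, 3))",
  "@",                           # chunk close
  "\\end{document}"
)
rnw_file <- file.path(tempdir(), "chunk_demo.Rnw")
tex_file <- file.path(tempdir(), "chunk_demo.tex")
writeLines(rnw_lines, rnw_file)

# Weave: run the chunk and write the code plus its output into the .tex file
utils::Sweave(rnw_file, output = tex_file, quiet = TRUE)
tex <- readLines(tex_file)
tex[grepl("mean|\\[1\\]", tex)]   # the echoed code and its result
```

Compiling the resulting .tex file to PDF would still require a LaTeX distribution.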
Chunk options in `Sweave()` are often similar to those in knitr, but are more limited (see `vignette("Sweave")`).
Example 2.20 \(\text{}\)
In Fig 2.12 I create an .rnw file, based on an RStudio skeleton, with text and analyses reflecting those used with R Markdown in Example 2.17. We note that instead of the Markdown YAML header, we now have lines in the preamble defining the type of desired document (e.g., article) and the LaTeX packages needed for document compilation (e.g., amsmath). All non-chunk text, including figure and table captions and cross-referencing must follow LaTeX guidelines.

Figure 2.12: A Sweave (.rnw) file with documentation text and interspersed code in chunks.
Fig 2.13 shows a snapshot of the result, following automated .rnw \(\rightarrow\) knitr \(\rightarrow\) LaTeX \(\rightarrow\) .pdf compilation in RStudio.

Figure 2.13: A .pdf document resulting from compilation of Sweave code in the previous figure.
\(\blacksquare\)
2.9.2.3 Purl
R chunk code can be extracted from an .rmd or an .rnw file using the function `knitr::purl()`. For instance, assume that the R Markdown loblolly pine summary shown in Fig 2.10 is saved in the working directory under the name `lob.rmd`. Code from the file will be extracted to a script file called `lob.R`, located in the working directory, if one types:
knitr::purl("lob.rmd")
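To try `purl()` without an existing file, one can write a small .rmd file to a temporary directory and extract its code. The file names below are hypothetical, and the extraction step is skipped if knitr is not installed.

```r
fence <- strrep("`", 3)                       # three back ticks
rmd_file <- file.path(tempdir(), "purl_demo.Rmd")
r_file   <- file.path(tempdir(), "purl_demo.R")
writeLines(c("Some documentation text.", "",
             paste0(fence, "{r}"), "x <- c(1, 2, 3)", "mean(x)", fence),
           rmd_file)

if (requireNamespace("knitr", quietly = TRUE)) {
  # Extract only the chunk code into an R script
  knitr::purl(rmd_file, output = r_file, quiet = TRUE)
  readLines(r_file)
}
```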
Exercises
- Create an R Markdown document to contain your homework assignment. Modify the YAML header to allow numbering of figures and tables, but not sections. This will require use of the bookdown package (see Section 2.9.2.1.5). Install bookdown at the R console (not within a document chunk). To test the formatting, perform the following steps:
  - Create a section header called `Question 1` and a subsection header called (a). Under (a) type `"completed"`.
  - Under the subsection header (b), insert a chunk, and create a simple plot of points at the coordinates: \(\{1,1\}\), \(\{2,2\}\), \(\{3,3\}\), by typing the code: `plot(1:3)` in the chunk. Create a label for the chunk, and create a caption for the plot using the knitr chunk option, `fig.cap`.
  - Under the subsection header (c), create a cross reference for the plot from (b) (see Section 2.9.2.1.5).
  - Under the subsection header (d), write the equation, \(y_i = \hat{\beta}_0 + \hat{\beta}_1x_i + \hat{\varepsilon}_i\), using LaTeX. As noted earlier, a LaTeX equation cheatsheet can be found here.
  - Render (knit) the final document as either an .html file or a .docx file. Include other assigned exercises for this Chapter as directed, using the general formatting approach given in Question 1.
- Perform the following operations.
  - Leave a note to yourself.
  - Create and examine an object called `x` that contains the numeric entries 1, 2, and 3.
  - Make a copy of `x` called `y`.
  - Show the class of `y`.
  - Show the base type of `y`.
  - Show the attributes of `y`.
  - List the current objects in your work session.
  - Identify your working directory.
- Distinguish R expressions and assignments.
- Sometimes R reports unexpected results for its classes and base types.
- Solve the following mathematical operations using R.
  - \(1 + 3/10 + 2\)
  - \((1 + 3)/10 + 2\)
  - \(\left(4 \cdot \frac{(3 - 4)}{23}\right)^2\)
  - \(\log_2(3^{1/2})\)
  - \(3\boldsymbol{x}^3 + 3\boldsymbol{x}^2 + 2\) where \(\boldsymbol{x} = \{0, 1.5, 4, 6, 8, 10\}\)
  - \(4(\boldsymbol{x} + \boldsymbol{y})\) where \(\boldsymbol{x} = \{0, 1.5, 4, 6, 8\}\) and \(\boldsymbol{y} = \{-2, 0.5, 3, 5, 8\}\).
  - \(\frac{d}{dx} \tan(x) 2.3 \cdot e^{3x}\)
  - \(\frac{d^2}{dx^2} \frac{3}{4x^4}\)
  - \(\int_3^{12} 24x + \ln(x)dx\)
  - \(\int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi}}e^{-\frac{x^2}{2}}dx\) (i.e., find the area under a standard normal pdf).
  - \(\int_{-\infty}^{\infty}\frac{x}{\sqrt{2\pi}}e^{-\frac{x^2}{2}}dx\) (i.e., find \(E(X)\) for a standard normal pdf).
  - \(\int_{-\infty}^{\infty}\frac{x^2}{\sqrt{2\pi}}e^{-\frac{x^2}{2}}dx\) (i.e., find \(E(X^2)\) for a standard normal pdf).
  - Find the sum, cumulative sum, product, cumulative product, arithmetic mean, median and variance of the data `x = c(0, 1.5, 4, 6, 8, 10)`.
- The velocity of the earth’s rotation on its axis at the equator, \(E\), is approximately 1674.364 km/h, or 1040.401 mi/h41. We can calculate the velocity of the rotation of the earth at any latitude with the equation, \(V = \cos(\text{latitude}^{\circ}) \times E\). Using R, simultaneously calculate rotational velocities for latitudes of 0, 30, 60, and 90 degrees north, or south, latitude (they will be the same). Remember, the function `cos()` assumes inputs are in radians, not degrees.