12.3 Data types and data structures

Individual values are either strings, numbers, or logical values (R also supports complex values with an imaginary component).

There is a distinction between integers and real values, but integer values tend to be coerced to real values if anything is done to them. If an integer is required it is best to ensure it by explicitly using a function that generates integer values.
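As a minimal sketch of this behaviour (the variable names are purely illustrative), the following shows an integer being coerced to a real value by ordinary arithmetic, and as.integer() being used to get an integer back:

```r
# An integer literal is written with an L suffix; a plain 5 is a real (double).
x <- 5L
is.integer(x)        # TRUE

# Combining an integer with a real value yields a real value.
y <- x + 1
is.integer(y)        # FALSE
typeof(y)            # "double"

# Division yields a real value even when both operands are integers.
typeof(x / 1L)       # "double"

# Explicitly request an integer where one is required.
z <- as.integer(y)
typeof(z)            # "integer"
```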

Chapter 7 discussed the amount of memory required to store various types of values. For the specific case of R (on a 32-bit operating system), (ASCII) text uses 1 byte per character. An integer uses 4 bytes, as does a logical value, and a real number uses 8 bytes. The function object.size() returns the approximate number of bytes used by an R object in memory.

> object.size(1:1000)

[1] 4024

> object.size(as.numeric(1:1000))

[1] 8024

The simplest data structure in R is a vector. Most operators and many functions accept vector arguments and return a vector result. All elements of a vector must have the same basic type.
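A short illustration, using made-up values, of vectors being passed to an operator and to a function, and of what happens when values of different types are combined in one vector:

```r
# A numeric vector; arithmetic is applied element by element.
v <- c(1.5, 2, 2.5)
v * 2                    # 3 4 5

# Many functions also accept a whole vector and return a vector.
sqrt(c(4, 9, 16))        # 2 3 4

# Combining different types forces a single common type (here, character).
mixed <- c(1, "a", TRUE)
typeof(mixed)            # "character"
```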

Matrices and arrays are multidimensional analogues of the vector. All elements must have the same type.
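For example, a matrix and a three-dimensional array might be created as follows (a sketch only; the object names and dimensions are arbitrary):

```r
# A 2 x 3 matrix; values are filled column by column.
m <- matrix(1:6, nrow = 2)
dim(m)                   # 2 3
m[2, 3]                  # 6 (row 2, column 3)

# An array generalises this to more than two dimensions.
a <- array(1:24, dim = c(2, 3, 4))
dim(a)                   # 2 3 4
```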

Data frames are collections of vectors where each vector must have the same length, but different vectors can have different types. This data structure is the standard way to represent a data set in R.
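The following sketch builds a small data frame from three vectors of equal length but different types. The column names and values are invented for illustration, and stringsAsFactors is set explicitly because its default differs between versions of R:

```r
# Three columns of equal length, each with its own basic type.
people <- data.frame(
    name = c("Ann", "Bob", "Cy"),
    age = c(34L, 27L, 45L),
    member = c(TRUE, FALSE, TRUE),
    stringsAsFactors = FALSE   # keep names as character strings
)

dim(people)              # 3 3 (rows, columns)
sapply(people, class)    # the class of each column
people$age               # extract one column as a vector
```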

Lists are like vectors that can have a different type of object in each component. In the simplest case, each component of a list may be a vector of values. As with a data frame, each component can be a vector of a different basic type, but for lists there is no requirement that the components have the same length. More generally, the components of a list can be more complex objects, such as matrices, data frames, or even other lists. Lists can be used to efficiently represent hierarchical data in R.
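A minimal sketch of a list whose components differ in both type and length, including a nested list to represent a small hierarchy (all names and values here are hypothetical):

```r
# Components of different types and lengths, including a nested list.
record <- list(
    id = 7L,
    scores = c(81.5, 92, 77),
    tags = c("a", "b"),
    address = list(city = "Auckland", postcode = "1010")
)

length(record)           # 4 (top-level components)
record$scores[2]         # 92
record$address$city      # "Auckland"
str(record)              # compact display of the whole structure
```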

12.3.1 The workspace

When quitting R, the option is given to save the current workspace. The workspace consists of all symbols that have been assigned a value during the session.

It is possible to save the value of only specific symbols using the save() function. The load() function can be used to load objects from disk that were created using save(). For very large objects, the save() function has a compress argument that controls whether the saved file is compressed.
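As a rough sketch, saving a single object and restoring it might look like the following. The file is a temporary one created with tempfile() purely for illustration; in practice a meaningful file name would be used.

```r
x <- 1:10
y <- c("a", "b")

# Save only x (not y) to a file, with compression requested explicitly.
f <- tempfile(fileext = ".RData")
save(x, file = f, compress = TRUE)

# Remove x from the workspace, then restore it from the file.
rm(x)
loaded <- load(f)        # load() returns the names of the restored objects
loaded                   # "x"
x                        # 1 2 3 4 5 6 7 8 9 10

unlink(f)                # delete the temporary file
```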

It is possible to have R exit without asking whether to save the workspace by supplying the argument --no-save when starting R.

Paul Murrell

This document is licensed under a Creative Commons Attribution-Noncommercial-Share Alike 3.0 License.