DATA SCIENCE USING R
VIII SEMESTER
DS-427T
UNIT-2
Department of Computer Science and Engineering, BVCOE New Delhi Subject: Data Science Using R , Instructor: Ms Rachna Narula
1
9/20/2024
How to Run R
R Sessions and Functions
R Sessions:- R is a case-sensitive, interpreted language. You can enter commands one at a time at the command prompt (>) or run a set of commands from a source file.
There are a wide variety of data types, including vectors, matrices, data frames (similar to datasets), and lists (collections of objects).
The standard assignment operator in R is <-. The = operator can also be used, but this is discouraged because it does not act as an assignment in some situations (for example, inside a function call, where = names an argument instead).
There are no fixed types associated with variables.
A variable can be printed without a print statement by typing its name at the prompt.
Comments (#) are especially valuable for documenting program code, but they are also useful in interactive sessions.
Note:- The prompt for new input is ">".
"+" is the line continuation prompt in R; the console shows it when a command is incomplete.
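The points above can be sketched as a short session. The variable names here are illustrative only and are not taken from the original slides.

```r
# Assignment with the standard operator; no type declaration is needed
x <- 5
x               # typing the name prints the value

# The same variable can later hold a value of a different type
x <- "five"
class(x)        # "character"

# R is case-sensitive: total and Total are two different variables
total <- 10
Total <- 20
total + Total   # 30

# An incomplete expression continues on the next line
# (the interactive console shows the + prompt here)
y <- (1 + 2 +
      3)
y               # 6
```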
Functions:- A function is a group of instructions that takes inputs, uses them to compute other values, and returns a result.
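As a minimal sketch of this definition, the hypothetical function below (the name to_fahrenheit is an assumption made for illustration) takes an input, computes a value from it, and returns the result.

```r
# Hypothetical function: converts a temperature from Celsius to Fahrenheit
to_fahrenheit <- function(celsius) {
  celsius * 9 / 5 + 32   # the value of the last expression is returned
}

boiling <- to_fahrenheit(100)       # 212
temps   <- to_fahrenheit(c(0, 37))  # works element-wise on a vector
boiling
temps
```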
BASIC MATH
VARIABLES
Variables:- Variables are an integral part of any programming language. R does not require variable types to be declared. A variable can take on any available data type and can hold any R object, such as a function, the result of an analysis, or a plot. A single variable can hold a number at one point, later a character string, and then a number again.
Variable Assignment:- There are a number of ways to assign a value to a variable, and the choice does not depend on the type of value being assigned. There is no need to declare the variable first.
Example:-
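A minimal sketch of the common assignment forms follows; all variable names are illustrative.

```r
x <- 2            # leftward arrow (the preferred form)
y = 5             # equals sign
3 -> z            # rightward arrow
assign("w", 4)    # assign() takes the variable name as a string
a <- b <- 7       # chained assignment gives both variables the value 7

# No declaration is needed, and the type can change on reassignment
x <- "now a character string"
class(x)          # "character"
```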
Removing Variables:- The rm() function is used to remove variables. This frees up memory so that R can store more objects, although it does not necessarily free up memory for the operating system.
There is no “undo”: once a variable is removed, it cannot be recovered.
Variable names are case sensitive.
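The following sketch shows removal and case sensitivity together; the variable names are assumptions chosen for illustration.

```r
j <- 4
exists("j")             # TRUE
rm(j)                   # remove the variable
exists("j")             # FALSE: j is gone and cannot be restored

theVariable <- 17
exists("theVariable")   # TRUE
exists("THEVARIABLE")   # FALSE: names differing only in case are distinct
```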
Variable (Object) Names: Certain short names are already used by R for built-in functions and constants. They are not strictly reserved (R will let you assign to them), but reusing them masks the built-in object and should be avoided. Some of these names are: c q t C D F I T
### meaning of c q t C D F I T
?name ## use ? followed by a name to see its help document
?c ## c means Combine Values into a Vector or List
?q ## q means Terminate an R Session
?t ## t means Matrix Transpose
?C ## C means Sets Contrasts for a Factor
?D ## D means Symbolic and Algorithmic Derivatives of Simple Expressions
?F ## F is documented under Logical Vectors
F ## [1] FALSE
?I ## I means Inhibit Interpretation/Conversion of Objects
?T ## T is documented under Logical Vectors
When converting to logical, the character strings c("T", "TRUE", "True", "true") are regarded as true, c("F", "FALSE", "False", "false") as false, and all others as NA.
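The conversion rule for character strings can be checked with as.logical(). The sketch below also illustrates why T is risky as a variable name: it is an ordinary variable bound to TRUE, so it can be masked. The name masked is illustrative.

```r
as.logical(c("T", "TRUE", "True", "true"))      # all TRUE
as.logical(c("F", "FALSE", "False", "false"))   # all FALSE
as.logical(c("yes", "0"))                       # NA NA: other strings are not recognised

# T can be overwritten, unlike the keyword TRUE
masked <- local({
  T <- 0        # masks the built-in T inside this block only
  isTRUE(T)
})
masked          # FALSE
```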
DATA TYPES
VECTORS
Vector:- Vectors are one-dimensional arrays that can hold numeric data, character data, or logical data. Vectors must be homogeneous, i.e., all the data in a given vector must be of the same type. The combine function c() is used to form a vector.
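A minimal sketch of each type of vector follows, with illustrative names and values. It also shows what happens when types are mixed: c() coerces all elements to a common type.

```r
nums  <- c(1, 2, 5, 3, 6, -2, 4)     # numeric vector
chars <- c("one", "two", "three")    # character vector
flags <- c(TRUE, TRUE, FALSE)        # logical vector

class(nums)      # "numeric"
class(chars)     # "character"
class(flags)     # "logical"

# Mixing types does not fail; everything is coerced to character
mixed <- c(1, "two", TRUE)
class(mixed)     # "character"

nums[2]          # second element: 2
nums[c(1, 3)]    # first and third elements: 1 5
length(nums)     # 7
```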
ADVANCED DATA STRUCTURES
Advanced Data Structures:- R has a wide variety of objects for holding data, including scalars, vectors, matrices, arrays, data frames, and lists. They differ in terms of the type of data they can hold, how they’re created, their structural complexity, and the notation used to identify and access individual elements.
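To make the differences in creation and access notation concrete, here is a small sketch; the object names and values are assumptions chosen for illustration.

```r
v   <- c(10, 20, 30)                            # vector: one type, one dimension
m   <- matrix(1:6, nrow = 2, ncol = 3)          # matrix: one type, two dimensions
df  <- data.frame(id = 1:3,
                  name = c("a", "b", "c"),
                  stringsAsFactors = FALSE)     # data frame: columns may differ in type
lst <- list(numbers = v, label = "demo", table = df)  # list: any objects

v[2]             # 20
m[2, 3]          # row 2, column 3: 6 (the matrix is filled by column)
df$name[1]       # "a"
df[2, "id"]      # 2
lst[["label"]]   # "demo"
lst$numbers[3]   # 30
```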
DATA FRAMES
LISTS
MATRICES
ARRAYS
Arrays:- An array is essentially a multidimensional vector. All of its elements must be of the same type, and individual elements are accessed in a similar fashion using brackets. The first index is the row index, the second is the column index, and the remaining indices are for the outer dimensions.
The array() function takes a dim argument that sets the size of each dimension. For instance, dim = c(2, 3, 2) creates an array made up of two 2x3 matrices.
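Creating and indexing such an array can be sketched as follows; the name arr and the values 1:12 are illustrative.

```r
# Two 2x3 matrices stacked along a third dimension
arr <- array(1:12, dim = c(2, 3, 2))

dim(arr)        # 2 3 2
arr[1, , 1]     # first row of the first matrix: 1 3 5
arr[2, 3, 2]    # row 2, column 3 of the second matrix: 12
arr[, , 2]      # the whole second matrix
```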
CLASSES
Classes:- R possesses a simple generic function mechanism which can be used for an object-oriented style of programming. Method dispatch takes place based on the class of the first argument to the generic function.
Usage
class(x)
class(x) <- value
unclass(x)
inherits(x, what, which = FALSE)
oldClass(x)
oldClass(x) <- value
Arguments
Details
Here, we describe the so-called “S3” classes (and methods). For “S4” classes (and methods), see “Formal classes” below.
Many R objects have a class attribute, a character vector giving the names of the classes from which the object inherits. (Functions oldClass and oldClass<- get and set the attribute, which can also be done directly.)
If the object does not have a class attribute, it has an implicit class, notably "matrix", "array", "function" or "numeric", or the result of typeof(x) (which is similar to mode(x)). The exception is type "language" with mode "call", where the following extra classes exist for the corresponding function calls: if, while, for, =, <-, (, {, call.
Note that NULL objects cannot have attributes (hence no classes), and attempting to assign a class to NULL is an error.
When a generic function fun is applied to an object with class attribute c("first", "second"), the system searches for a function called fun.first and, if it finds it, applies it to the object. If no such function is found, a function called fun.second is tried. If no class name produces a suitable function, the function fun.default is used (if it exists). If there is no class attribute, the implicit class is tried, then the default method.
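The search order described above can be sketched with a small generic. The names fun, fun.first, fun.second and fun.default follow the text; the object and the returned strings are illustrative.

```r
# A generic function: UseMethod() dispatches on the class of the first argument
fun <- function(x, ...) UseMethod("fun")
fun.second  <- function(x, ...) "method for class 'second'"
fun.default <- function(x, ...) "default method"

obj <- structure(list(), class = c("first", "second"))

# No fun.first exists yet, so the search falls through to fun.second
before <- fun(obj)

# Once fun.first is defined, it is found first
fun.first <- function(x, ...) "method for class 'first'"
after <- fun(obj)

# A plain number has none of these classes, so fun.default is used
plain <- fun(42)

before
after
plain
```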
Formal classes
An additional mechanism of formal classes, nicknamed “S4”, is available in package methods, which is attached by default. For objects which have a formal class, its name is returned by class as a character vector of length one, and method dispatch can happen on several arguments instead of only the first. However, S3 method selection attempts to treat objects from an S4 class as if they had the appropriate S3 class attribute, as does inherits. Therefore, S3 methods can be defined for S4 classes. See the “Introduction” and “Methods_for_S3” help pages for basic information on S4 methods and for the relation between these and S3 methods.
The replacement version of the function sets the class to the value provided. For classes that have a formal definition, directly replacing the class this way is strongly deprecated. The expression as(object, value) is the way to coerce an object to a particular class.
The analogue of inherits for formal classes is is. The two functions behave consistently with one exception: S4 classes can have conditional inheritance, with an explicit test. In this case, is will test the condition, but inherits ignores all conditional superclasses.
Functions oldClass and oldClass<- behave in the same way as functions of those names in S-PLUS 5/6, but in R UseMethod dispatches on the class as returned by class (with some interpolated classes: see the link) rather than oldClass. However, group generics dispatch on the oldClass for efficiency, and internal generics only dispatch on objects for which is.object is true.
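A minimal sketch of formal classes follows. The class names (Shape, Circle, Square) and the generic describe_pair are hypothetical, chosen only to show that class returns a single name, that is tests inheritance, and that dispatch can use more than one argument.

```r
library(methods)   # attached by default; loaded explicitly for clarity

# Hypothetical class hierarchy
setClass("Shape", representation("VIRTUAL"))
setClass("Circle", representation(r = "numeric"), contains = "Shape")
setClass("Square", representation(side = "numeric"), contains = "Shape")

# A generic whose methods are selected on both arguments
setGeneric("describe_pair", function(a, b) standardGeneric("describe_pair"))
setMethod("describe_pair", signature("Circle", "Square"),
          function(a, b) "circle then square")
setMethod("describe_pair", signature("Shape", "Shape"),
          function(a, b) "two shapes")

circ <- new("Circle", r = 1)
sq   <- new("Square", side = 2)

length(class(circ))       # 1: a formal class has a single class name
is(circ, "Shape")         # TRUE: is() is the analogue of inherits()
r1 <- describe_pair(circ, sq)   # exact match on both arguments
r2 <- describe_pair(sq, circ)   # falls back to the inherited (Shape, Shape) method
r1
r2
```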
In older versions of R, assigning a zero-length vector with class removed the class: it is now an error (whereas it still works for oldClass). It is clearer to always assign NULL to remove the class.
Examples
x <- 10
class(x) # "numeric"
oldClass(x) # NULL
inherits(x, "a") # FALSE
class(x) <- c("a", "b")
inherits(x, "a") # TRUE
inherits(x, "a", TRUE) # 1
inherits(x, c("a", "b", "c"), TRUE) # 1 2 0
class( quote(pi) ) # "name"
## regular calls
class( quote(sin(pi*x)) ) # "call"
## special calls
class( quote(x <- 1) ) # "<-"
class( quote((1 < 2)) ) # "("
class( quote( if(8<3) pi ) ) # "if"
R PROGRAMMING STRUCTURES & CONTROL STATEMENTS
LOOPING OVER NON-VECTOR SETS
Looping Over Non-vector Sets:- R does not directly support iteration over non-vector sets, but there are a couple of indirect yet easy ways to accomplish it:
apply( ):- Applies a function over the rows or columns of a 2D array (matrix) or data frame, typically an aggregate function such as sum, mean, median, or standard deviation.
Syntax:- apply(matrix, margin, fun, ...)
margin = 1 indicates rows; margin = 2 indicates columns.
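A short sketch of apply() on a small matrix follows; the matrix m and its values are illustrative.

```r
# 2 x 3 matrix filled by column: rows are (1, 3, 5) and (2, 4, 6)
m <- matrix(1:6, nrow = 2)

row_sums  <- apply(m, 1, sum)    # margin = 1: one result per row
col_means <- apply(m, 2, mean)   # margin = 2: one result per column

row_sums     # 9 12
col_means    # 1.5 3.5 5.5
```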
Use lapply( ), assuming that the iterations of the loop are independent of each other, thus allowing them to be performed in any order. lapply( ) can be applied to data frames, lists, and vectors. It returns a list of the same length as X, each element of which is the result of applying FUN to the corresponding element of X.
Syntax:- lapply(X, FUN, ...)
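A minimal sketch of lapply() on a list, a vector, and a data frame follows; all object names and values are illustrative.

```r
# On a list: one result per element, returned as a list
scores <- list(a = c(1, 2, 3), b = c(10, 20))
means  <- lapply(scores, mean)
means$a      # 2
means$b      # 15

# On a vector: still returns a list, of the same length as the input
squares <- lapply(1:3, function(i) i^2)
length(squares)   # 3
squares[[3]]      # 9

# On a data frame: FUN is applied to each column
df <- data.frame(x = 1:3, y = 4:6)
col_sums <- lapply(df, sum)
col_sums$y   # 15
```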
IF-ELSE
ARITHMETIC AND BOOLEAN OPERATORS
RETURN VALUES
Return Values:- Functions are generally used for computing some value, so they need a mechanism to supply that value back to the caller. This is called returning.
The return value of a function can be any R object. Although the return value is often a list, it could even be another function.
You can transmit a value back to the caller by explicitly calling return(). Without this call, the value of the last executed statement will be returned by default.
If the last statement in the called function is a for( ) loop, the function returns the value NULL, because a for( ) statement itself evaluates to NULL.
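These return rules can be sketched with a few small functions; all function names below are hypothetical.

```r
# Explicit return(): exits the function immediately with the given value
sign_label <- function(x) {
  if (x < 0) return("negative")
  "non-negative"               # otherwise the last evaluated expression is returned
}

# Implicit return: the value of the last statement
add <- function(x, y) x + y

# A function whose last statement is a for loop returns NULL
loop_last <- function(n) {
  for (i in 1:n) i
}

# The return value can be any R object, including another function
make_adder <- function(k) function(x) x + k
add_two <- make_adder(2)

sign_label(-3)             # "negative"
sign_label(4)              # "non-negative"
add(1, 2)                  # 3
is.null(loop_last(3))      # TRUE
add_two(5)                 # 7
```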
FUNCTIONS ARE OBJECTS
DEFAULT VALUES FOR ARGUMENT
RECURSION
Recursion:- Recursion is a programming technique in which a function calls itself, each time on a smaller input, until it reaches a base case that can be answered directly.
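A classic illustration is the factorial. The sketch below uses a hypothetical function named fact and assumes a non-negative whole-number input.

```r
# Recursive factorial: n! = n * (n - 1)!
fact <- function(n) {
  if (n <= 1) return(1)   # base case stops the recursion
  n * fact(n - 1)         # recursive call on a smaller input
}

fact(5)   # 120
fact(0)   # 1
```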