R Programming Language - Basics of R
Welcome to Our R Programming Blog
R is a powerful language used for statistical computing, data analysis, and data visualization. This blog is developed by I M.Sc. (PG) Computer Science students to share basic and practical knowledge of R programming.
The main objective of this blog is to help students understand R concepts easily through explanations, examples, and programs.
Introduction to R Programming
R is a powerful programming language specially designed for data analysis, statistics, and visualization. It is widely used by data analysts, statisticians, researchers, and anyone who works with data.
R is an open-source programming language used mainly for:
- Data cleaning
- Data manipulation
- Statistical analysis
- Data visualization
- Machine learning
- Report generation
It was created by Ross Ihaka and Robert Gentleman in the 1990s and has since become one of the most widely used tools in data science.
Why Use R Programming?
- Simple and powerful for statistics
- Excellent data visualization tools
- Free and open-source
- Supports advanced analytics
- Large number of packages
- Works on Windows, Linux, and macOS
- Strong community support
- Best suited for researchers and analysts
How to Install R Programming
Steps to install R:
- Visit the official R website
- Download R for your operating system (Windows / Linux / Mac)
- Run the installer
- Install RStudio (recommended IDE)
What is RStudio?
RStudio is an integrated development environment (IDE) that makes working with R easier. It provides a console, script editor, and plotting tools.
Data Types in R
Data types define the kind of value that a variable can store and the operations that can be performed on it.
In R, the data type specifies whether a value is numeric, integer, complex, character, or logical:
- Numeric
- Integer
- Complex
- Character
- Logical
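As a small sketch of the five types listed above, the snippet below stores one value of each type and asks R to report its type with class(). The variable names are only illustrative.

```r
# One value of each basic data type; class() reports how R classifies it.
num  <- 10.5      # numeric (double precision)
int  <- 10L       # integer: the L suffix asks for an integer, not a double
cplx <- 3 + 2i    # complex
chr  <- "R blog"  # character (text)
lgl  <- TRUE      # logical (TRUE or FALSE)

class(num)   # "numeric"
class(int)   # "integer"
class(cplx)  # "complex"
class(chr)   # "character"
class(lgl)   # "logical"
```

Note that a plain number such as 10 is stored as numeric; it becomes an integer only when written as 10L or converted with as.integer().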
R Variables
Variables are containers for storing data values.
Example:
name <- "John"
age <- 40
name   # prints [1] "John"
age    # prints [1] 40
Operators in R
An operator is a symbol that tells R to perform a specific mathematical or logical operation. R is an interpreted language with a rich set of built-in operators, grouped into the types listed below.
Types of Operators
R programming has the following types of operators:
- Arithmetic Operators
- Relational Operators
- Logical Operators
- Assignment Operators
- Miscellaneous Operators
Example:
v <- c(2, 5.5, 6)
t <- c(8, 3, 4)
print(v + t)   # output: [1] 10.0  8.5 10.0
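To make the operator types above more concrete, here is a short sketch with one or two operators from each group. The values of a and b are arbitrary and chosen only for illustration.

```r
a <- 7
b <- 2

# Arithmetic operators
a + b     # 9
a %/% b   # 3  (integer division)
a %% b    # 1  (remainder)
a ^ b     # 49 (power)

# Relational operators
a > b     # TRUE
a == b    # FALSE

# Logical operators
(a > 5) & (b > 5)   # FALSE (both must be TRUE)
(a > 5) | (b > 5)   # TRUE  (at least one is TRUE)
!(a > 5)            # FALSE (negation)

# Assignment operators
x <- 5    # leftward assignment (the usual style)
6 -> y    # rightward assignment
z = 7     # also allowed at the top level

# Miscellaneous operators
1:5          # creates the sequence 1 2 3 4 5
3 %in% 1:5   # TRUE, because 3 is an element of the sequence
```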
Decision Making Statements in R
Decision-making statements allow a program to take decisions and execute different code blocks based on given conditions.
1. if Statement
The if statement executes a block of code only when the given condition is TRUE.
Example
x <- 10
if (x > 5) {
  print("x is greater than 5")
}
2. if...else Statement
The if...else statement executes one block when the condition is TRUE and another block when it is FALSE.
Example
x <- 3
if (x %% 2 == 0) {
  print("Even number")
} else {
  print("Odd number")
}
3. else if Statement
The else if statement checks multiple conditions in sequence and runs the block of the first condition that is TRUE.
Example
marks <- 75
if (marks >= 90) {
  print("Grade A")
} else if (marks >= 60) {
  print("Grade B")
} else {
  print("Grade C")
}
4. Nested if Statement
A nested if is an if statement placed inside another if statement. The inner condition is checked only when the outer condition is TRUE.
Example
x <- 15
if (x > 10) {
  if (x < 20) {
    print("x is between 10 and 20")
  }
}
5. switch Statement
The switch statement selects one block of code from multiple choices based on an expression.
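The following is a minimal sketch of switch(). The variable operation and the small calculations are made up for illustration. When the expression is a character string, R runs the branch whose name matches it, and a final unnamed value acts as the default.

```r
operation <- "multiply"

result <- switch(operation,
  "add"      = 4 + 2,
  "subtract" = 4 - 2,
  "multiply" = 4 * 2,
  "unknown operation"   # default when no name matches
)

print(result)   # [1] 8
```

If operation were set to a name that is not listed, such as "divide", result would be the default string "unknown operation".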
Loops in R
A loop statement allows us to execute a statement or group of statements multiple times.
1. for Loop
The for loop is used to iterate over a sequence.
Example
for (i in 1:5) {
  print(i)
}
2. while Loop
The while loop executes the code as long as the condition is TRUE.
Example
i <- 1
while (i <= 5) {
  print(i)
  i <- i + 1
}
3. break Statement
break is used to terminate the loop immediately.
Example
for (i in 1:10) {
  if (i == 6) {
    break
  }
  print(i)
}
4. next Statement
next skips the current iteration and moves to the next one.
Example
for (i in 1:5) {
  if (i == 3) {
    next
  }
  print(i)
}
Data Structures in R
Types of Data Structures
- Vector
- List
- Matrix
- Data Frame
- Factor
Example of Vector
numbers <- c(10, 20, 30, 40)
print(numbers)
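The other data structures in the list can be sketched in the same way. All names and values below (student, m, df, grade) are invented for illustration.

```r
# List: elements can have different types
student <- list(name = "John", age = 40, passed = TRUE)
student$name     # "John"

# Matrix: two dimensions, all elements of the same type
m <- matrix(1:6, nrow = 2, ncol = 3)
dim(m)           # 2 3

# Data frame: a table whose columns can have different types
df <- data.frame(name = c("Asha", "Ravi"), marks = c(75, 92))
nrow(df)         # 2
df$marks[2]      # 92

# Factor: categorical data with a fixed set of levels
grade <- factor(c("B", "A", "B", "C"))
levels(grade)    # "A" "B" "C"
```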
Linear Regression in R
● Linear regression is a statistical approach used to model the relationship between a dependent variable and one or more independent variables.
● A straight line is assumed to approximate this relationship.
● The goal is to identify the line that minimizes the discrepancies (residuals) between the observed data points and the predicted values, usually by minimizing the sum of their squares.
Simple Linear Regression
● Models a single dependent variable using a single independent variable.
Multiple Linear Regression
● Models a single dependent variable using multiple independent variables.
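As a hedged sketch of both cases, the code below fits a simple and a multiple linear regression with lm(), using the mtcars data set that ships with R. Treating mpg as the dependent variable and wt and hp as independent variables is an assumption made only for this illustration.

```r
# Simple linear regression: mpg explained by weight (wt) only
simple_fit <- lm(mpg ~ wt, data = mtcars)
coef(simple_fit)    # intercept and the slope for wt

# Multiple linear regression: mpg explained by weight and horsepower (hp)
multi_fit <- lm(mpg ~ wt + hp, data = mtcars)
coef(multi_fit)     # intercept and one slope per independent variable

# R-squared: the share of variation in mpg explained by the model
summary(multi_fit)$r.squared

# Predict mpg for a hypothetical car with wt = 3 (weight in 1000 lbs)
predict(simple_fit, newdata = data.frame(wt = 3))
```

Calling summary() on a fitted model also prints standard errors and p-values for each coefficient.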
Linear Regression Line
● A regression line shows the relationship between the dependent and independent variables. It can exhibit either of the following:
Positive Linear Relationship:
● As the independent variable increases, the dependent variable increases.
Negative Linear Relationship:
● As the independent variable increases, the dependent variable decreases.
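The two kinds of relationship show up as the sign of the fitted slope. The sketch below uses made-up, perfectly linear data so that the slopes are easy to read; real data would include noise.

```r
x <- 1:10
y_up   <- 3 + 2 * x      # increases as x increases
y_down <- 20 - 1.5 * x   # decreases as x increases

slope_up   <- coef(lm(y_up ~ x))[["x"]]     # about 2: positive relationship
slope_down <- coef(lm(y_down ~ x))[["x"]]   # about -1.5: negative relationship

slope_up > 0     # TRUE
slope_down < 0   # TRUE
```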
Assumptions of Linear Regression
The linear regression model assumes the following:
Linear relationship:
● The dependent and independent variables are linearly related.
No multicollinearity:
● Independent variables should not be highly correlated.
Homoscedasticity:
● The variance of the error term should remain constant across all levels of the independent variables.
Normal distribution of error terms:
● Error terms should follow a normal distribution.
No autocorrelation:
● The error terms should not be correlated with one another, so the residuals should show no pattern over time or observation order.
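Some of these assumptions can be checked roughly with base R. The sketch below is one possible approach, not a complete diagnostic procedure; it reuses the built-in mtcars data set, and the choice of mpg, wt, and hp is an assumption for illustration.

```r
fit <- lm(mpg ~ wt + hp, data = mtcars)
res <- residuals(fit)

# No multicollinearity: correlation between the two independent variables
# (a value close to 1 or -1 would be a warning sign)
cor(mtcars$wt, mtcars$hp)

# Normal distribution of error terms: Shapiro-Wilk test on the residuals
# (a small p-value suggests the residuals are not normal)
shapiro.test(res)$p.value

# No autocorrelation: lag-1 correlation of the residuals
# (mainly meaningful when the rows have a natural order, such as time)
acf(res, lag.max = 1, plot = FALSE)$acf[2]

# Linear relationship and homoscedasticity are usually judged visually:
# the points should scatter evenly around zero with no curve or funnel shape
plot(fitted(fit), res, xlab = "Fitted values", ylab = "Residuals")
abline(h = 0, lty = 2)
```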