R Programming Language - Basics of R

Welcome to Our R Programming Blog

R is a powerful language used for statistical computing, data analysis, and data visualization. This blog is developed by first-year M.Sc. (PG) Computer Science students to share basic and practical knowledge of R programming.

The main objective of this blog is to help students understand R concepts easily through explanations, examples, and programs.

Introduction to R Programming

R is a powerful programming language specially designed for data analysis, statistics, and visualization. It is widely used by data analysts, statisticians, researchers, and anyone who works with data.

R is an open-source programming language used mainly for:

  • Data cleaning
  • Data manipulation
  • Statistical analysis
  • Data visualization
  • Machine learning
  • Report generation

R was created by Ross Ihaka and Robert Gentleman in the 1990s and has since become one of the most widely used tools in data science.

 

Why Use R Programming?

  • Simple and powerful for statistics
  • Excellent data visualization tools
  • Free and open-source
  • Supports advanced analytics
  • Large number of packages
  • Works on Windows, Linux, and macOS
  • Strong community support
  • Best suited for researchers and analysts

INSTALLATION OF R

How to Install R Programming

Steps to install R:

  1. Visit the official R website (CRAN, https://cran.r-project.org)
  2. Download R for your operating system (Windows / Linux / macOS)
  3. Run the installer
  4. Install RStudio (recommended IDE)

What is RStudio?

RStudio is an integrated development environment (IDE) that makes working with R easier. It provides a console, script editor, and plotting tools.
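
Once R is installed, a quick way to check that it works is to type a few commands in the console (in RStudio or in plain R). The lines below are only a small sketch; the variable name result is our own choice for illustration.

```r
# Print the version of R that is installed
print(R.version.string)

# Run a trivial calculation to confirm the console evaluates expressions
result <- 1 + 1
print(result)   # output: [1] 2
```

If both commands print a value without an error message, the installation is most likely working.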

Data Types in R

Data types define the kind of value that a variable can store and the operations that can be performed on it.

In R, the commonly used basic data types are:

  • Numeric
  • Integer
  • Complex
  • Character
  • Logical

Example:

x <- 10.5
class(x)   # output: [1] "numeric"
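
The example above shows only the numeric type. As a small sketch, the remaining types from the list can be checked in the same way with class(). The variable names (num, int, cmp, chr, lgl) are chosen here only for illustration.

```r
num <- 10.5      # numeric (decimal number)
int <- 10L       # integer (the L suffix marks an integer literal)
cmp <- 3 + 2i    # complex (real part 3, imaginary part 2)
chr <- "R"       # character (text in quotes)
lgl <- TRUE      # logical (TRUE or FALSE)

class(num)   # output: [1] "numeric"
class(int)   # output: [1] "integer"
class(cmp)   # output: [1] "complex"
class(chr)   # output: [1] "character"
class(lgl)   # output: [1] "logical"
```

Note that 10 without the L suffix is stored as numeric, not integer.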

R Variables

Variables are containers for storing data values.

Example:

name <- "John"
age <- 40

name   # output: [1] "John"
age    # output: [1] 40

OPERATORS

An operator is a symbol that tells the R interpreter to perform a specific mathematical or logical operation. R has a rich set of built-in operators.

Types of Operators

R provides the following types of operators:

  • Arithmetic Operators
  • Relational Operators
  • Logical Operators
  • Assignment Operators
  • Miscellaneous Operators

Example:

v <- c(2, 5.5, 6)

t <- c(8, 3, 4)

print(v + t)   # output: [1] 10.0  8.5 10.0
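
The example above uses only an arithmetic operator. The sketch below tries one or two operators from each of the other groups in the list. The vectors a and b and the variables x and y are illustrative names, not part of any standard.

```r
a <- c(2, 5.5, 6)
b <- c(8, 3, 4)

# Arithmetic operators work element by element
a * b               # output: [1] 16.0 16.5 24.0
b %% 3              # remainder after division, output: [1] 2 0 1

# Relational operators compare values and return TRUE or FALSE
a > b               # output: [1] FALSE  TRUE  TRUE

# Logical operators combine or negate conditions
(a > 3) & (b > 3)   # output: [1] FALSE FALSE  TRUE
!(a > 3)            # output: [1]  TRUE FALSE FALSE

# Assignment operators: leftward (<-) and rightward (->)
x <- 5
7 -> y

# Miscellaneous operators: ":" builds a sequence, "%in%" tests membership
1:5                 # output: [1] 1 2 3 4 5
5.5 %in% a          # output: [1] TRUE
```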

Decision Making Statements in R

Decision-making statements allow a program to take decisions and execute different code blocks based on given conditions.

1. if Statement

The if statement executes a block of code only when the given condition is TRUE.

Example

x <- 10

if (x > 5) {

  print("x is greater than 5")

}

2. If Else Statement

The if...else statement executes one block when the condition is TRUE and another block when it is FALSE.

Example

x <- 3

if (x %% 2 == 0) {

  print("Even number")

} else {

  print("Odd number")

}

3. Else if Statement

Used to check multiple conditions in sequence.

Example

marks <- 75

if (marks >= 90) {

  print("Grade A")

} else if (marks >= 60) {

  print("Grade B")

} else {

  print("Grade C")

}

4. Nested if Statement

An if statement inside another if statement.

Example

x <- 15

if (x > 10) {

  if (x < 20) {

    print("x is between 10 and 20")

  }

}

5. Switch Statement

The switch statement selects one block of code from multiple choices based on an expression.
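
Example

A minimal sketch of switch() is given below. The variable names (day, day_name, second) and the day abbreviations are made up for illustration. When the expression is a character string, R returns the value whose name matches it; an unnamed last value acts as the default.

```r
day <- "tue"   # hypothetical input value

day_name <- switch(day,
  mon = "Monday",
  tue = "Tuesday",
  wed = "Wednesday",
  "Unknown day"   # default, used when no name matches
)

print(day_name)   # output: [1] "Tuesday"

# With a number, switch() picks the value at that position
second <- switch(2, "red", "green", "blue")
print(second)     # output: [1] "green"
```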

Loops

A loop statement allows us to execute a statement or group of statements multiple times.

1. for Loop

The for loop is used to iterate over a sequence.

Example

for (i in 1:5) {

  print(i)

}

2. while Loop

The while loop executes the code as long as the condition is TRUE.

Example

i <- 1

while (i <= 5) {

  print(i)

  i <- i + 1

}

3. break Statement

break is used to terminate the loop immediately.

Example

for (i in 1:10) {

  if (i == 6) {

    break

  }

  print(i)

}

4. next Statement

next skips the current iteration and moves to the next one.

Example

for (i in 1:5) {

  if (i == 3) {

    next

  }

  print(i)

}

DATA STRUCTURES IN R 

Types of Data Structures

  1. Vector
  2. List
  3. Matrix
  4. Data Frame
  5. Factor

Example of Vector

numbers <- c(10, 20, 30, 40)

print(numbers)
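
The other four data structures in the list can be created in a similar way. The sketch below uses made-up names and values (student, m, df, grade) purely for illustration.

```r
# List: elements can have different types
student <- list(name = "John", age = 40, passed = TRUE)
student$name          # output: [1] "John"

# Matrix: rows and columns, all elements of the same type
m <- matrix(1:6, nrow = 2, ncol = 3)   # filled column by column
dim(m)                # output: [1] 2 3
m[2, 3]               # element in row 2, column 3, output: [1] 6

# Data frame: a table where each column can have its own type
df <- data.frame(name = c("Asha", "Ravi"), marks = c(85, 72))
nrow(df)              # output: [1] 2
mean(df$marks)        # output: [1] 78.5

# Factor: categorical data with a fixed set of levels
grade <- factor(c("B", "A", "B", "C"))
levels(grade)         # output: [1] "A" "B" "C"
```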

Linear Regression in R

Linear regression is a statistical approach used to model the relationship between a dependent variable and one or more independent variables.

A straight line is assumed to approximate this relationship.

The goal is to identify the line that minimizes the discrepancies between the observed data points and the predicted values.

Types of linear regression:

Simple Linear Regression

(single dependent variable, single independent variable)

Multiple Linear Regression

(single dependent variable, multiple independent variables)

Linear Regression Line

A regression line shows the relationship between the dependent and independent variables. It can exhibit either of the following:

Positive Linear Relationship:

As the independent variable increases, the dependent variable increases.

Negative Linear Relationship:

As the independent variable increases, the dependent variable decreases.
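
Example

In R, a linear regression can be fitted with the built-in lm() function. The sketch below is only one possible illustration: it assumes the mtcars dataset that ships with R, and the choice of mpg as the dependent variable and wt and hp as independent variables is ours. The names simple_fit, multi_fit, and slope are also illustrative.

```r
# Simple linear regression: one independent variable (car weight)
simple_fit <- lm(mpg ~ wt, data = mtcars)
coef(simple_fit)      # intercept and slope of the fitted line

# Multiple linear regression: two independent variables
multi_fit <- lm(mpg ~ wt + hp, data = mtcars)
coef(multi_fit)       # intercept and one coefficient per variable

# The sign of the slope tells the direction of the relationship.
# Here the slope is negative: heavier cars tend to have lower mpg.
slope <- coef(simple_fit)["wt"]
slope < 0

# Predict mpg for a hypothetical car with wt = 3 (weight in 1000 lbs)
predict(simple_fit, newdata = data.frame(wt = 3))
```

Calling summary(simple_fit) prints more detail, such as standard errors and the R-squared value.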

Assumptions of Linear Regression

The linear regression model assumes the following:

Linear relationship:

The dependent and independent variables are linearly related.

No multicollinearity:

The independent variables should not be highly correlated with one another.

Homoscedasticity:

The variance of the error terms should remain constant across all levels of the independent variables.

Normal distribution of error terms:

The error terms should follow a normal distribution.

No autocorrelation:

The error terms should be uncorrelated with one another, so the residuals should not show patterns.
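
Some of these assumptions can be checked roughly with functions from base R. The sketch below is not a complete diagnostic procedure. It assumes the mtcars dataset and a model chosen only for illustration, and the names fit, res, predictor_cor, normality, and lag1 are our own.

```r
fit <- lm(mpg ~ wt + hp, data = mtcars)
res <- residuals(fit)   # estimated error terms

# Multicollinearity: correlation between the two independent variables
# (values close to 1 or -1 would be a warning sign)
predictor_cor <- cor(mtcars$wt, mtcars$hp)
predictor_cor

# Normality of error terms: Shapiro-Wilk test on the residuals
# (a very small p-value suggests the residuals are not normal)
normality <- shapiro.test(res)
normality$p.value

# Autocorrelation: lag-1 autocorrelation of the residuals
# (mainly meaningful when the rows are in time order)
lag1 <- acf(res, plot = FALSE)$acf[2]
lag1
```

For linearity and homoscedasticity, a plot of residuals against fitted values is commonly used; plot(fit) draws this and other standard diagnostic plots.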


