R: Basic usage

R is widely used software for statistical research.

CSV load

The read.csv() function is a preset loader for comma-separated files whose first row is a header.
You can assign the parsed data to a variable with the <- operator.

> data <- read.csv("target.csv")

Type read.csv without () to print the function definition, which shows the detailed options.
You can specify the separator, the quote character, and so on.

> read.csv
function (file, header = TRUE, sep = ",", quote = "\"", dec = ".",
    fill = TRUE, comment.char = "", ...)
read.table(file = file, header = header, sep = sep, quote = quote,
    dec = dec, fill = fill, comment.char = comment.char, ...)
<bytecode: 0x55a2daa926e0>
<environment: namespace:utils>
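
As a minimal sketch of overriding those defaults, the following reads semicolon-separated data. The inline text and the column names name and points are made up for illustration. The text argument is not listed in read.csv itself; it is passed on to read.table() through `...`, which lets the example run without a file.

```r
# Hypothetical semicolon-separated input, supplied inline instead of a file.
csv_text <- "name;points\nalice;50\nbob;70"

# sep overrides the default comma; text is handed on to read.table() via `...`.
scores <- read.csv(text = csv_text, sep = ";")

print(scores)
```

With a real file, the call would look like read.csv("target.csv", sep = ";").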

Basic Metrics

The summary() function shows basic metrics, including min/max, mean and quartiles.

> summary(data)
     name              points
 Length:112         Min.   : 3.00
 Class :character   1st Qu.:35.00
 Mode  :character   Median :50.50
                    Mean   :52.02
                    3rd Qu.:68.75
                    Max.   :99.00
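
The individual figures in that table can also be computed one at a time, which is handy when a script needs a single value. This is a sketch on a small made-up data frame; the names and points values are assumptions, not the contents of target.csv.

```r
# Made-up sample standing in for the loaded CSV.
scores <- data.frame(
  name   = c("a", "b", "c", "d", "e"),
  points = c(3, 35, 50, 68, 99)
)

min_points    <- min(scores$points)
max_points    <- max(scores$points)
mean_points   <- mean(scores$points)
median_points <- median(scores$points)
# quantile() returns the values at the probabilities requested in probs.
quartiles     <- quantile(scores$points, probs = c(0.25, 0.5, 0.75))

print(c(min = min_points, max = max_points,
        mean = mean_points, median = median_points))
print(quartiles)
```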

Draw histogram

The hist() function draws a histogram of the specified column.
You can use it to check the shape of the distribution.

Because read.csv() returns a data frame, select the column to plot with the $ operator, as in data$points.

> hist(data$points, breaks=20)

The breaks option is the number of classes. R treats a single number as a suggestion, so the actual count may differ.
The resulting graph opens in a separate window. If no window appears, you may need to configure your window system, such as X.
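
If no window system is available, one option is to write the plot to a file instead. The sketch below uses the pdf() device from base R; the sample values and the temporary output path are assumptions for illustration. It also prints how many classes hist() actually used.

```r
# Made-up sample values standing in for data$points.
pts <- c(3, 12, 35, 41, 50, 51, 58, 68, 75, 99)

# Open a PDF device so that no window is needed; the path is a temporary file.
out_file <- tempfile(fileext = ".pdf")
pdf(out_file)
h <- hist(pts, breaks = 20)
dev.off()  # close the device to finish writing the file

# hist() returns the class boundaries and counts it used.
cat("classes used:", length(h$counts), "\n")
cat("written to:", out_file, "\n")
```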

R script

An R script is just a series of R function calls written in a plain text file.

#!/usr/bin/env Rscript

# Get the command-line arguments that follow the script name
args <- commandArgs(trailingOnly = TRUE)

# Load the CSV given as the first argument and print its summary
data <- read.csv(args[1])
summary(data)
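  • You can execute the script directly, like ./summary.R target.csv
    • Specify Rscript in the shebang line.
    • Make the file executable with chmod +x ./summary.R
  • commandArgs() returns the command-line args.
    • Setting trailingOnly to TRUE (T is shorthand) keeps only the args after the script name, which is what most scripts need.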
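
A script like this fails with a confusing error when the file name is missing, so a small argument check can help. The following is only a sketch: main() is a hypothetical helper name, and the usage text assumes the script is saved as summary.R.

```r
#!/usr/bin/env Rscript

# Hypothetical helper: returns 0 on success and 1 when the argument is missing.
main <- function(args) {
  if (length(args) != 1) {
    message("usage: ./summary.R target.csv")
    return(1L)
  }
  data <- read.csv(args[1])
  # Inside a function the result is not printed automatically, so print it.
  print(summary(data))
  0L
}

status <- main(commandArgs(trailingOnly = TRUE))
# A standalone script could end with quit(status = status)
# so that the shell sees the exit code.
```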
Feb 27, 2023 (updated Nov 7, 2024)
中馬崇尋
Chuma Takahiro