R is widely used software for statistical research.
CSV load
The read.csv() function is a preset loader for comma-separated files whose first row is a header.
You can assign the parsed data to a variable with the <- operator.
> data <- read.csv("target.csv")
Type read.csv without () to print its definition and see the available options.
You can specify the separator, the quote character, and so on.
> read.csv
function (file, header = TRUE, sep = ",", quote = "\"", dec = ".",
    fill = TRUE, comment.char = "", ...)
read.table(file = file, header = header, sep = sep, quote = quote,
    dec = dec, fill = fill, comment.char = comment.char, ...)
<bytecode: 0x55a2daa926e0>
<environment: namespace:utils>
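As a minimal sketch of overriding those defaults, the example below reads a semicolon-separated file. The file contents and the temporary path are made up for illustration; only the sep argument comes from the definition printed above.

```r
# Write a small semicolon-separated sample file (made-up data) to a temp path.
path <- tempfile(fileext = ".csv")
writeLines(c("name;points", "alice;42", "bob;58"), path)

# Override the default separator; header = TRUE is already the default.
data <- read.csv(path, sep = ";")

print(data)
str(data)
```

For semicolon-separated files that also use a comma as the decimal mark, R ships read.csv2() with those defaults preset.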
Basic Metrics
The summary() function shows basic metrics, including the minimum, maximum, mean, and quartiles.
> summary(data)
     name               points
 Length:112         Min.   : 3.00
 Class :character   1st Qu.:35.00
 Mode  :character   Median :50.50
                    Mean   :52.02
                    3rd Qu.:68.75
                    Max.   :99.00
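If you only need some of these figures, each one can also be computed separately. The sketch below uses a short made-up vector in place of data$points, so the values will not match the output above.

```r
# Made-up scores standing in for data$points.
points <- c(3, 35, 50, 51, 68, 72, 99)

# Each metric that summary() reports has its own function.
stats <- c(
  min    = min(points),
  median = median(points),
  mean   = mean(points),
  max    = max(points)
)

# First and third quartiles, using R's default quantile method.
q <- quantile(points, probs = c(0.25, 0.75))

print(stats)
print(q)
```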
Draw histogram
The hist() function draws a histogram of the specified column, which lets you check the shape of the distribution.
Because read.csv() returns a data frame, select the column by name with the $ operator.
> hist(data$points, breaks=20)
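The breaks option sets the number of bins. R treats a single number as a suggestion and may adjust it so that the bin edges fall on round values.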
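To see which bins R actually chose, you can ask hist() for the histogram object instead of a plot. This is a small sketch on made-up data; the vector points stands in for data$points.

```r
set.seed(1)  # made-up data, fixed seed for repeatability
points <- round(runif(112, min = 3, max = 99))

# plot = FALSE returns the histogram object without opening a window.
h <- hist(points, breaks = 20, plot = FALSE)

print(h$breaks)          # the bin edges R actually chose
print(length(h$counts))  # the resulting number of bins, which may differ from 20
```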
The resulting graph opens in a separate window. If no window appears, you may need to configure your window system, such as X.
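When no window system is available, one possible workaround is to draw into a file instead. The sketch below assumes made-up data and a hypothetical output file name, and uses the pdf() device that ships with R.

```r
points <- c(3, 35, 50, 51, 68, 72, 99)  # made-up data

# Hypothetical output location; any writable path works.
out <- file.path(tempdir(), "points-hist.pdf")

pdf(out)                  # open a file-based graphics device
hist(points, breaks = 5)  # the plot goes to the file, not to a window
invisible(dev.off())      # close the device so the file is written
```

The same pattern works with other file devices such as png(), provided your R build supports them.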
R script
An R script is just a series of R function calls written in a plain text file.
#!/usr/bin/env Rscript
# Parse commandline args
args <- commandArgs(trailingOnly = TRUE)
# List of functions
data <- read.csv(args[1])
summary(data)
- You can execute the script directly, like ./summary.R target.csv
- Specify Rscript in the shebang line.
- The script must be executable: chmod +x ./summary.R
- commandArgs() returns the command-line arguments. The trailingOnly = TRUE option is needed in most cases, because it keeps only the arguments given after the script name and drops R's own startup arguments.
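The script above assumes that a valid file name is always passed. A slightly more defensive variant might look like the sketch below. The helper name summarize_csv and the usage message are hypothetical additions, not part of the original script.

```r
#!/usr/bin/env Rscript
# Hypothetical, more defensive variant of summary.R.
args <- commandArgs(trailingOnly = TRUE)

# Return the summary of a CSV file, failing early with a clear message.
summarize_csv <- function(path) {
  if (!file.exists(path)) {
    stop("File not found: ", path)
  }
  summary(read.csv(path))
}

if (length(args) >= 1) {
  # Print explicitly because the value is produced inside an if block.
  print(summarize_csv(args[1]))
} else {
  # No file given: show how to call the script instead of failing in read.csv().
  message("Usage: ./summary.R target.csv")
}
```

Wrapping the work in a function keeps the argument handling separate from the analysis, so the same function can be reused from an interactive session.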
Chuma Takahiro