-
Notifications
You must be signed in to change notification settings - Fork 0
/
CHAPTER 3c_training and test sets_calssification modeling.Rmd
44 lines (28 loc) · 1.62 KB
/
CHAPTER 3c_training and test sets_calssification modeling.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
---
title: "Chapter3C_TRAINING AND TEST SETS_CLASSIFICATION MODELING"
author: "Silvia Antón"
date: "`r Sys.Date()`"
output: html_document
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```
You evaluate a classification model's performance by starting with a "confusion matrix". This informs you as to how many true positives, true negatives, false positives, and false negatives there are as a result.As with regression statistics, classification statistics have many tools with which you can evaluate the final performarnce of the model.
```{r}
iris.df<-iris
iris.df$Species<-as.character(iris.df$Species)
iris.df$Species[iris.df$Species!="setosa"]<-"other"
iris.df$Species<-as.factor(iris.df$Species)
iris.samples<-sample(1:nrow(iris.df),nrow(iris.df)*0.7,replace=FALSE)
training.iris<-iris.df[iris.samples,]
test.iris<-iris.df[-iris.samples,]
library(randomForest)
iris.rf<-randomForest(Species~.,data=training.iris)
iris.predictions<-predict(iris.rf,test.iris)
table(iris.predictions,test.iris$Species)
```
In a binary class truth table, there are two outcomes:either the predicted value is some class, or it is not. In this case, you're focusing on wheter the model predicted a setosa class or something else. There are 4 values for the confusion table.
TRUE POSITIVES: The model predicted setosa classes and got them right.
TRUE NEGATIVES: tHE model predicted other classes and got them right.
FALSE POSITIVES: The model predicted setosa classes, but the correct answer was other.
FALSE NEGATIVES: The model predicted other classes, but the correct answer was setosa.