CHAPTER 3c_training and test sets_calssification modeling.Rmd

---
title: "Chapter3C_TRAINING AND TEST SETS_CLASSIFICATION MODELING"
author: "Silvia Antón"
date: "`r Sys.Date()`"
output: html_document
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```

You evaluate a classification model's performance by starting with a "confusion matrix". This informs you as to how many true positives, true negatives, false positives, and false negatives there are as a result.As with regression statistics, classification statistics have many tools with which you can evaluate the final performarnce of the model. 

```{r}

iris.df<-iris

iris.df$Species<-as.character(iris.df$Species)

iris.df$Species[iris.df$Species!="setosa"]<-"other"

iris.df$Species<-as.factor(iris.df$Species)

iris.samples<-sample(1:nrow(iris.df),nrow(iris.df)*0.7,replace=FALSE)

training.iris<-iris.df[iris.samples,]
test.iris<-iris.df[-iris.samples,]

library(randomForest)

iris.rf<-randomForest(Species~.,data=training.iris)
iris.predictions<-predict(iris.rf,test.iris)
table(iris.predictions,test.iris$Species)
```

In a binary class truth table, there are two outcomes:either the predicted value is some class, or it is not. In this case, you're focusing on wheter the model predicted a setosa class or something else. There are 4 values for the confusion table.

TRUE POSITIVES: The model predicted setosa classes and got them right.

TRUE NEGATIVES: tHE model predicted other classes and got them right.

FALSE POSITIVES: The model predicted setosa classes, but the correct answer was other.

FALSE NEGATIVES: The model predicted other classes, but the correct answer was setosa.