[ML] Tree Based Learning Algorithms

Tree-based learning algorithms, often referred to as CART (Classification and Regression Trees), are a popular technique for predicting numeric and categorical outputs.
Tree-based methods, which include decision trees, bagging, random forests, and boosting, are considered highly effective for supervised learning.
This is partly due to their high accuracy and their versatility: they can be used to predict both discrete and continuous outcomes.

Decision Trees

Decision trees build a decision structure by repeatedly splitting the data on the variable that best separates it into homogeneous groups (for classification) or numerically similar groups (for regression). For classification, the quality of a split is commonly measured with entropy, a measure of how mixed the class labels within a group are.
The primary appeal of decision trees is that they can be displayed graphically as a tree-like graph.
Unlike an actual tree, the decision tree is displayed upside down, with the leaves located at the bottom, or foot, of the tree.
Each branch represents the outcome of a decision/variable and each leaf node represents a class label, such as “Go to beach” or “Stay in.”
Decision rules are subsequently marked by the path from the root of the tree to a terminal leaf node.
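
To make the idea of an entropy-based split concrete, here is a minimal sketch that computes entropy and the information gain of a candidate split by hand. The helper names (`entropy`, `information_gain`) and the toy labels are assumptions made for illustration; they are not part of the advertising example. Note also that scikit-learn's `DecisionTreeClassifier` uses Gini impurity by default, and entropy has to be selected with `criterion="entropy"`.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def information_gain(parent, left, right):
    """Entropy of the parent minus the size-weighted entropy of its two children."""
    total = len(parent)
    weighted = (len(left) / total) * entropy(left) + (len(right) / total) * entropy(right)
    return entropy(parent) - weighted


# Hypothetical toy labels: 1 = clicked, 0 = did not click
parent = [1, 1, 1, 1, 0, 0, 0, 0]

# Candidate split A leaves both groups mixed
mixed_left, mixed_right = [1, 1, 1, 0], [1, 0, 0, 0]

# Candidate split B separates the classes completely
pure_left, pure_right = [1, 1, 1, 1], [0, 0, 0, 0]

print("parent entropy:", entropy(parent))
print("gain of split A:", information_gain(parent, mixed_left, mixed_right))
print("gain of split B:", information_gain(parent, pure_left, pure_right))
```

Split B has the larger information gain, so a tree that splits on entropy would prefer it over split A.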

Example

Let’s use a decision tree classifier to predict whether a user clicks on an advert, using the advertising dataset.

1-2. Import libraries/ Dataset

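# pandas for data handling; scikit-learn for the split, the model, and the metrics
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import classification_report, confusion_matrix

# Load the dataset (this path assumes the file sits in /content, e.g. after uploading it to a notebook environment)
df = pd.read_csv('/content/advertising.csv')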

3. Convert non-numeric variables

# One-hot encode the categorical columns so the model receives numeric inputs
df = pd.get_dummies(df, columns=['Country', 'City'])
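
As a rough illustration of what `pd.get_dummies` does to a categorical column, here is a small sketch on a made-up frame. The `toy` DataFrame and its values are hypothetical and are not taken from the advertising dataset.

```python
import pandas as pd

# Hypothetical toy frame, not the advertising dataset
toy = pd.DataFrame({
    "Age": [25, 40, 31],
    "Country": ["Korea", "Japan", "Korea"],
})

# Each distinct value of "Country" becomes its own indicator column
encoded = pd.get_dummies(toy, columns=["Country"])

print(encoded.columns.tolist())
print(encoded)
```

The original `Country` column is replaced by one indicator column per country, while `Age` is left untouched. With high-cardinality columns such as `City`, this can produce a very wide table.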

4. Remove columns

# Drop the free-text and timestamp columns, which are not used as features here
del df['Ad Topic Line']
del df['Timestamp']

df.head()

5. Set X and y variables

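# Features: every column except the target; target: whether the ad was clicked
X = df.drop('Clicked on Ad', axis=1)
y = df['Clicked on Ad']

# Hold out 30% of the rows for testing; random_state makes the split reproducible
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=10, shuffle=True)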
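
To see what `test_size=0.3` and `random_state=10` do in practice, here is a short sketch on made-up arrays. The names `X_toy` and `y_toy` and their contents are assumptions for illustration only.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical toy data: 10 rows, 2 features, alternating labels
X_toy = np.arange(20).reshape(10, 2)
y_toy = np.array([0, 1] * 5)

# The same arguments as in the step above, applied twice
split_a = train_test_split(X_toy, y_toy, test_size=0.3, random_state=10, shuffle=True)
split_b = train_test_split(X_toy, y_toy, test_size=0.3, random_state=10, shuffle=True)

X_tr, X_te, y_tr, y_te = split_a
print(X_tr.shape, X_te.shape)  # 7 training rows and 3 test rows

# With a fixed random_state, both calls return identical partitions
same = all(np.array_equal(a, b) for a, b in zip(split_a, split_b))
print(same)
```

Fixing `random_state` is what makes the evaluation numbers comparable from one run to the next.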

6. Set algorithm

# Create the classifier with default settings and fit it on the training data
model = DecisionTreeClassifier()
model.fit(X_train, y_train)
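
Once a tree is fitted, its decision rules (the paths from the root to each leaf) can be printed as text. The sketch below does this on a tiny made-up dataset with `sklearn.tree.export_text`. The feature names, the toy values, and the choices `criterion="entropy"`, `max_depth=2`, and `random_state=10` are assumptions for illustration, not settings used in the example above.

```python
from sklearn.tree import DecisionTreeClassifier, export_text

# Hypothetical toy data: [daily minutes on site, age]
X_toy = [[80, 25], [75, 30], [85, 22], [40, 45], [35, 50], [45, 41]]
y_toy = [0, 0, 0, 1, 1, 1]  # 1 = clicked on the ad

# A shallow tree that splits on entropy; random_state fixes tie-breaking
toy_model = DecisionTreeClassifier(criterion="entropy", max_depth=2, random_state=10)
toy_model.fit(X_toy, y_toy)

# Print the learned rules as an indented, upside-down tree
rules = export_text(toy_model, feature_names=["minutes_on_site", "age"])
print(rules)

# Classify a new, unseen user
print(toy_model.predict([[38, 48]]))
```

Limiting `max_depth` is one common way to keep a tree small enough to read and to reduce overfitting. The right value depends on the data.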

7. Evaluate

# Predict labels for the held-out test set
model_predict = model.predict(X_test)

# Compare predictions with the true labels
print(confusion_matrix(y_test, model_predict))
print(classification_report(y_test, model_predict))
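
For reading the two reports, it helps to know how the confusion matrix maps to the metrics. The following sketch uses made-up labels (`y_true` and `y_pred` are hypothetical, not results from the advertising model) to show how the four cells relate to accuracy, precision, and recall.

```python
from sklearn.metrics import accuracy_score, confusion_matrix, precision_score, recall_score

# Hypothetical true and predicted labels (1 = clicked, 0 = did not click)
y_true = [0, 0, 0, 0, 1, 1, 1, 1, 1, 1]
y_pred = [0, 0, 0, 1, 1, 1, 1, 1, 0, 0]

# For binary labels, rows are true classes and columns are predicted classes:
# [[TN, FP],
#  [FN, TP]]
cm = confusion_matrix(y_true, y_pred)
tn, fp, fn, tp = cm.ravel()
print(cm)

# The same metrics that classification_report summarises, computed from the cells
accuracy = (tp + tn) / (tp + tn + fp + fn)
precision = tp / (tp + fp)  # of the predicted clicks, how many were real
recall = tp / (tp + fn)     # of the real clicks, how many were found

print(accuracy, accuracy_score(y_true, y_pred))
print(precision, precision_score(y_true, y_pred))
print(recall, recall_score(y_true, y_pred))
```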

https://github.com/erica00j/machinelearning/blob/main/decision_Tree.ipynb