Decision tree analysis
Abstract
Decision tree analysis is a common machine learning algorithm used to classify data and predict outcomes from a collection of input features [1]. Each internal node represents a test on a feature, each branch represents a possible outcome of that test, and each leaf represents a prediction, forming a tree-like model of decisions and their consequences. The algorithm learns which test to apply at each node from a training set of labeled data.
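As an illustration, the minimal sketch below fits a small tree on a labeled binary-classification dataset, assuming scikit-learn as the implementation (the cited sources do not prescribe any particular library); the printed output shows the learned sequence of feature tests.

```python
# Minimal sketch of decision tree learning, assuming scikit-learn.
# Dataset and parameter choices are illustrative only.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier, export_text

# A binary classification task: input features plus a 0/1 label.
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# At each internal node, the learner picks the feature test that
# best separates the training labels (CART-style splitting).
clf = DecisionTreeClassifier(max_depth=3, random_state=0)
clf.fit(X_train, y_train)

print(clf.score(X_test, y_test))  # accuracy on held-out data
print(export_text(clf))           # the learned if/else structure
```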
Many industries, including banking, medicine, marketing, and engineering, use decision tree analysis [2]. It is especially helpful for problems with binary outcomes, such as predicting customer churn, assessing credit risk, and detecting fraud [3]. Because decision trees are simple to interpret, they are a popular choice for explaining the decision-making process to stakeholders [4].
Decision trees can, however, be prone to overfitting, which results in poor performance on new data. To mitigate this, techniques such as pruning, ensembling, and regularization can be used to improve the model (Hastie et al., 2009).
Pruning removes branches of the tree that do not improve the model's accuracy on new data. Ensembling combines multiple decision trees to produce a more robust model. Regularization penalizes complexity in the tree, which keeps it from becoming overly elaborate and overfitting the training data (Hastie et al., 2009).
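A minimal sketch of all three remedies, again assuming scikit-learn (the parameter values below are illustrative, not taken from the sources):

```python
# Three ways to curb overfitting in decision trees, assuming
# scikit-learn; hyperparameter values are illustrative only.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Regularization: cap tree complexity up front so the tree
# cannot memorize the training data.
regularized = DecisionTreeClassifier(
    max_depth=4, min_samples_leaf=20, random_state=0
)

# Pruning: grow the tree, then cut back branches whose removal
# does not hurt accuracy (minimal cost-complexity pruning).
pruned = DecisionTreeClassifier(ccp_alpha=0.01, random_state=0)

# Ensembling: combine many de-correlated trees (a random forest)
# so individual trees' overfitting averages out.
forest = RandomForestClassifier(n_estimators=200, random_state=0)

for model in (regularized, pruned, forest):
    model.fit(X_train, y_train)
    print(type(model).__name__, model.score(X_test, y_test))
```

Here ccp_alpha controls minimal cost-complexity pruning, the scheme introduced for CART by Breiman et al. (1984).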
Application
Limitations
References
- ↑ Breiman, L., Friedman, J., Stone, C. J., & Olshen, R. A. (1984). Classification and regression trees. Chapman and Hall.
- ↑ Kotsiantis, S. B., Zaharakis, I. D., & Pintelas, P. E. (2006). Machine learning: a review of classification and combining techniques. Artificial Intelligence Review, 26(3), 159-190.
- ↑ Wasserman, L. (2013). All of statistics: A concise course in statistical inference. Springer Science & Business Media.
- ↑ Hastie, T., Tibshirani, R., & Friedman, J. (2009). The elements of statistical learning: Data mining, inference, and prediction (2nd ed.). Springer.