What is Random Forest Regression?
Random Forest is an ensemble learning method that builds multiple decision trees and merges their predictions.
For regression tasks, it averages the predictions from all trees to produce a final output.
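To make this concrete, here is a minimal sketch using scikit-learn's RandomForestRegressor on synthetic data; the dataset shape and parameter values are illustrative assumptions, not prescriptions:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

# Synthetic regression problem: y depends non-linearly on two features
rng = np.random.default_rng(42)
X = rng.uniform(-3, 3, size=(500, 2))
y = np.sin(X[:, 0]) + 0.5 * X[:, 1] ** 2 + rng.normal(0, 0.1, size=500)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# 100 trees are trained independently; predict() returns their average
model = RandomForestRegressor(n_estimators=100, random_state=42)
model.fit(X_train, y_train)
print("Test R^2:", model.score(X_test, y_test))
```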
Key Components:
- Decision Trees: Each tree recursively splits the data on feature values to create subgroups whose target values are as similar as possible
- Bootstrap Sampling: Each tree is trained on a bootstrap sample, i.e. rows drawn at random from the training data with replacement
- Feature Randomness: At each split, only a random subset of features is considered, which decorrelates the trees
- Ensemble Averaging: The final prediction is the average of all individual tree predictions (a from-scratch sketch follows this list)
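To show how these components fit together, here is a hypothetical from-scratch sketch built on scikit-learn's DecisionTreeRegressor; the class name TinyForestRegressor and its defaults are illustrative assumptions, not a reference implementation:

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

class TinyForestRegressor:
    """Illustrative mini random forest: bootstrap sampling,
    feature randomness (via max_features), and ensemble averaging."""

    def __init__(self, n_trees=50, max_features="sqrt", random_state=0):
        self.n_trees = n_trees
        self.max_features = max_features
        self.rng = np.random.default_rng(random_state)
        self.trees = []

    def fit(self, X, y):
        X, y = np.asarray(X), np.asarray(y)
        n = len(X)
        for _ in range(self.n_trees):
            # Bootstrap sampling: draw n row indices with replacement
            idx = self.rng.integers(0, n, size=n)
            # Feature randomness: each split considers a random feature
            # subset, delegated to the tree via max_features
            tree = DecisionTreeRegressor(
                max_features=self.max_features,
                random_state=int(self.rng.integers(0, 2**31 - 1)),
            )
            tree.fit(X[idx], y[idx])
            self.trees.append(tree)
        return self

    def predict(self, X):
        # Ensemble averaging: mean of all individual tree predictions
        return np.mean([t.predict(X) for t in self.trees], axis=0)
```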
Hyperparameters:
- Number of Trees: More trees generally improve stability and accuracy, with diminishing returns, but increase training and prediction time
- Max Depth: Controls how deep each tree can grow (deeper trees can model more complex patterns but may overfit)
- Min Samples Split: Minimum number of samples required to split a node (helps control overfitting)
- Max Features: Number of features to consider when looking for the best split (see the tuning sketch after this list)
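As a sketch of how these knobs map onto scikit-learn's RandomForestRegressor, the code below runs a small grid search; the grid values are illustrative assumptions, not a recommended search space:

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV

X, y = make_regression(n_samples=400, n_features=8, noise=0.3, random_state=0)

# Each key corresponds to a hyperparameter described in the list above
param_grid = {
    "n_estimators": [100, 300],      # number of trees
    "max_depth": [None, 10],         # None lets each tree grow fully
    "min_samples_split": [2, 10],    # larger values curb overfitting
    "max_features": [1.0, "sqrt"],   # features considered per split
}

search = GridSearchCV(
    RandomForestRegressor(random_state=0),
    param_grid,
    cv=5,
    scoring="neg_mean_squared_error",
)
search.fit(X, y)
print("Best parameters:", search.best_params_)
```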
Advantages:
- Handles non-linear relationships well
- Relatively robust to outliers and noise, since averaging across many trees dampens individual errors
- Provides built-in feature importance measures (see the sketch after this list)
- Less prone to overfitting than individual decision trees
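For the feature importance point above, a brief sketch using scikit-learn's feature_importances_ attribute follows; the synthetic dataset is an illustrative assumption:

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor

# Synthetic data where only 2 of 5 features carry signal
X, y = make_regression(n_samples=300, n_features=5, n_informative=2, random_state=1)

model = RandomForestRegressor(n_estimators=200, random_state=1).fit(X, y)

# Impurity-based importances: the average variance reduction each feature provides
for i, importance in enumerate(model.feature_importances_):
    print(f"feature_{i}: {importance:.3f}")
```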