AutoML Module

Model selection, hyperparameter optimization with Optuna, ensemble creation, and model evaluation.

Overview

The AutoML module covers three tasks, each handled by its own class: model selection and training (AutoMLSelector), hyperparameter optimization with Optuna (HyperparameterOptimizer), and model evaluation and comparison (ModelEvaluator). AutoMLSelector can also build voting, stacking, or blending ensembles.

Basic Usage

from ai_ml_framework.auto_ml import AutoMLSelector

automl = AutoMLSelector(problem_type='classification')
best_model = automl.auto_select_and_train(X, y)
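
The snippets in this document assume that X is a pandas DataFrame of features and y is a pandas Series of labels, but they do not show where the data comes from. As a minimal sketch, one way to get such a pair for experimentation is scikit-learn's built-in breast cancer dataset. The dataset choice is an assumption for illustration and is not part of ai_ml_framework.

```python
from sklearn.datasets import load_breast_cancer

# as_frame=True returns pandas objects: X is a DataFrame, y is a Series.
# The dataset is a binary classification problem, matching
# problem_type='classification' in the usage above.
X, y = load_breast_cancer(return_X_y=True, as_frame=True)

print(X.shape)            # (569, 30)
print(type(y).__name__)   # Series
```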

With Optimization

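# Note: optimization and cv_folds are not listed in the AutoMLSelector
# constructor signature documented in this reference.
automl = AutoMLSelector(
    problem_type='classification',
    optimization=True,
    cv_folds=5
)

# Returns a dictionary of trained models
models = automl.auto_select_models(X, y)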

AutoMLSelector

class AutoMLSelector(problem_type: str, models: List[str] = None)

Intelligent model selection and training.

Key Features

  • Automatic model selection
  • Problem type detection
  • Ensemble methods
  • Performance evaluation
  • Model recommendations
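
How AutoMLSelector implements these features is not documented here. As a rough sketch of what "problem type detection" and "automatic model selection" can look like, the following uses plain scikit-learn: it infers the problem type from the target and picks the candidate with the highest mean cross-validated score. The helper names detect_problem_type and select_best and the candidate list are hypothetical and are not part of ai_ml_framework.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier
from sklearn.utils.multiclass import type_of_target


def detect_problem_type(y):
    """Hypothetical helper: map scikit-learn's target type to a problem type."""
    kind = type_of_target(y)  # e.g. 'binary', 'multiclass', 'continuous'
    return "regression" if kind.startswith("continuous") else "classification"


def select_best(candidates, X, y, cv=5):
    """Hypothetical helper: return (best_name, mean CV score per candidate)."""
    scores = {
        name: cross_val_score(model, X, y, cv=cv).mean()
        for name, model in candidates.items()
    }
    best_name = max(scores, key=scores.get)
    return best_name, scores


# Synthetic data so the sketch runs on its own
X, y = make_classification(n_samples=300, n_features=10, random_state=0)

# Assumed candidate pool; a real selector may use a different one
candidates = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "decision_tree": DecisionTreeClassifier(random_state=0),
    "random_forest": RandomForestClassifier(n_estimators=50, random_state=0),
}

problem_type = detect_problem_type(y)
best_name, scores = select_best(candidates, X, y)
print(problem_type, best_name, scores)
```

Scoring every candidate with the same cross-validation setup keeps the comparison fair; the default score for classifiers is accuracy.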

Methods

auto_select_and_train(X: pd.DataFrame, y: pd.Series) -> Any

Auto-select and train best model.

Parameters:
  • X: Feature matrix
  • y: Target vector
Returns: Trained best model

auto_select_models(X: pd.DataFrame, y: pd.Series) -> Dict[str, Any]

Auto-select multiple models.

Parameters:
  • X: Feature matrix
  • y: Target vector
Returns: Dictionary of trained models

create_ensemble(X: pd.DataFrame, y: pd.Series, method: str = 'voting') -> Any

Create ensemble model.

Parameters:
  • X: Feature matrix
  • y: Target vector
  • method: Ensemble method ('voting', 'stacking', 'blending')
Returns: Ensemble model
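
To make the difference between the ensemble methods concrete, here is a minimal sketch built on scikit-learn's VotingClassifier and StackingClassifier. It is an illustration under assumptions, not the internals of create_ensemble: the helper build_ensemble and the base estimators are hypothetical. scikit-learn has no built-in blending class (a meta-learner fitted on a single hold-out split), so the sketch does not cover 'blending'.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    RandomForestClassifier,
    StackingClassifier,
    VotingClassifier,
)
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier


def build_ensemble(estimators, method="voting"):
    """Hypothetical helper: build an unfitted ensemble from (name, model) pairs."""
    if method == "voting":
        # Soft voting averages the predicted class probabilities
        return VotingClassifier(estimators=estimators, voting="soft")
    if method == "stacking":
        # A meta-learner is trained on out-of-fold predictions of the base models
        return StackingClassifier(
            estimators=estimators,
            final_estimator=LogisticRegression(max_iter=1000),
            cv=5,
        )
    raise ValueError(f"Unsupported method in this sketch: {method!r}")


# Synthetic data so the sketch runs on its own
X, y = make_classification(n_samples=400, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Assumed base models
estimators = [
    ("lr", LogisticRegression(max_iter=1000)),
    ("tree", DecisionTreeClassifier(max_depth=5, random_state=0)),
    ("rf", RandomForestClassifier(n_estimators=50, random_state=0)),
]

accuracies = {}
for method in ("voting", "stacking"):
    ensemble = build_ensemble(estimators, method).fit(X_train, y_train)
    accuracies[method] = ensemble.score(X_test, y_test)

print(accuracies)
```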

HyperparameterOptimizer

class HyperparameterOptimizer(problem_type: str)

Advanced hyperparameter optimization using Optuna.

Key Features

  • Optuna integration
  • Multiple optimization strategies
  • Ensemble optimization
  • Early stopping
  • Parallel optimization

Methods

optimize_model(model: Any, X: pd.DataFrame, y: pd.Series, n_trials: int = 100) -> Tuple[Any, Dict[str, Any]]

Optimize model hyperparameters.

Parameters:
  • model: Base model to optimize
  • X: Feature matrix
  • y: Target vector
  • n_trials: Number of optimization trials
Returns: Tuple of (optimized_model, best_parameters)

ModelEvaluator

class ModelEvaluator(problem_type: str)

Comprehensive model evaluation and comparison.

Key Features

  • Multiple evaluation metrics
  • Cross-validation
  • Model comparison
  • Performance analysis
  • Visualization support

Methods

evaluate_models(models: Dict[str, Any], X: pd.DataFrame, y: pd.Series, cv: int = 5) -> Dict[str, Dict[str, float]]

Evaluate multiple models.

Parameters:
  • models: Dictionary of trained models
  • X: Feature matrix
  • y: Target vector
  • cv: Cross-validation folds
Returns: Dictionary of evaluation results
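
The return type Dict[str, Dict[str, float]] maps each model name to a dictionary of metric values. As a sketch of how such a structure can be produced, the following uses scikit-learn's cross_validate with several metrics. The function name cross_validated_metrics and the metric list are assumptions; the metrics that ModelEvaluator actually reports are not listed in this reference.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_validate
from sklearn.tree import DecisionTreeClassifier


def cross_validated_metrics(models, X, y, cv=5):
    """Hypothetical helper: return {model_name: {metric: mean CV score}}."""
    scoring = ["accuracy", "f1", "roc_auc"]  # assumed metrics, binary targets
    results = {}
    for name, model in models.items():
        # cross_validate clones and refits the model on each training fold,
        # so the fitted state of the model passed in is not used
        cv_out = cross_validate(model, X, y, cv=cv, scoring=scoring)
        results[name] = {
            metric: float(cv_out[f"test_{metric}"].mean()) for metric in scoring
        }
    return results


# Synthetic binary classification data so the sketch runs on its own
X, y = make_classification(n_samples=300, n_features=10, random_state=0)

models = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "decision_tree": DecisionTreeClassifier(max_depth=5, random_state=0),
}

results = cross_validated_metrics(models, X, y, cv=5)
for name, metrics in results.items():
    print(name, metrics)
```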

Examples

Basic AutoML

from ai_ml_framework.auto_ml import AutoMLSelector, ModelEvaluator
from sklearn.model_selection import train_test_split

# X (pd.DataFrame) and y (pd.Series) are assumed to be loaded already.
# Hold out 20% of the data for testing; a fixed seed makes the split reproducible
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Initialize AutoML
automl = AutoMLSelector(problem_type='classification')

# Auto-select and train models
models = automl.auto_select_models(X_train, y_train)
print(f"Selected {len(models)} models")

# Evaluate models (evaluate_models is documented on ModelEvaluator)
evaluator = ModelEvaluator(problem_type='classification')
results = evaluator.evaluate_models(models, X_test, y_test)
best_model_name = max(results, key=lambda name: results[name]['accuracy'])
print(f"Best model: {best_model_name}")

# Get best model
best_model = automl.auto_select_and_train(X_train, y_train)
print(f"Best model accuracy: {best_model.score(X_test, y_test):.3f}")

Hyperparameter Optimization

from ai_ml_framework.auto_ml import HyperparameterOptimizer

# Initialize optimizer
optimizer = HyperparameterOptimizer(problem_type='classification')

# Optimize the best model from the Basic AutoML example
# (models, best_model_name, and the train/test split are defined there)
optimized_model, best_params = optimizer.optimize_model(
    models[best_model_name],
    X_train,
    y_train,
    n_trials=50
)

print(f"Best parameters: {best_params}")
print(f"Optimized accuracy: {optimized_model.score(X_test, y_test):.3f}")

Ensemble Methods

from ai_ml_framework.auto_ml import ModelEvaluator

# automl and the train/test split come from the Basic AutoML example

# Create voting ensemble
voting_ensemble = automl.create_ensemble(
    X_train, y_train, method='voting'
)

# Create stacking ensemble
stacking_ensemble = automl.create_ensemble(
    X_train, y_train, method='stacking'
)

# Compare ensembles
evaluator = ModelEvaluator(problem_type='classification')

voting_results = evaluator.evaluate_single_model(
    voting_ensemble, X_test, y_test
)
stacking_results = evaluator.evaluate_single_model(
    stacking_ensemble, X_test, y_test
)

print(f"Voting ensemble accuracy: {voting_results['accuracy']:.3f}")
print(f"Stacking ensemble accuracy: {stacking_results['accuracy']:.3f}")

Model Comparison

# Compare multiple models
comparison_df = evaluator.compare_models(results, 'accuracy')
print("Model Comparison:")
print(comparison_df)

# Get model recommendations
recommendations = automl.get_model_recommendations(X_train, y_train)
print("\nModel Recommendations:")
for model, score in recommendations.items():
    print(f"  {model}: {score:.3f}")
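
compare_models is not described in the Methods reference, so the shape of comparison_df is not specified here. As a hedged sketch of what a comparison table over a results dictionary can look like, the following turns a {model: {metric: value}} mapping into a pandas DataFrame sorted by one metric. The function name comparison_table is hypothetical, and the scores are placeholders made up for illustration, not measured results.

```python
import pandas as pd


def comparison_table(results, metric):
    """Hypothetical helper: one row per model, best value of `metric` first."""
    table = pd.DataFrame.from_dict(results, orient="index")
    table.index.name = "model"
    return table.sort_values(metric, ascending=False)


# Placeholder scores, made up for illustration only (not measured results)
results = {
    "model_a": {"accuracy": 0.80, "f1": 0.78},
    "model_b": {"accuracy": 0.90, "f1": 0.88},
    "model_c": {"accuracy": 0.85, "f1": 0.86},
}

table = comparison_table(results, "accuracy")
print(table)
```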