AutoML Module
Intelligent model selection, hyperparameter optimization, and ensemble creation with advanced optimization algorithms.
Overview
The AutoML module automates model selection, hyperparameter optimization, and ensemble creation: it chooses the models best suited to your data and tunes their performance with advanced search algorithms.
AutoMLSelector
class AutoMLSelector(problem_type: str, models: List[str] = None)
Intelligent model selection and training.
Key Features
- Automatic model selection
- Problem type detection
- Ensemble methods
- Performance evaluation
- Model recommendations
Methods
auto_select_and_train(X: pd.DataFrame, y: pd.Series) -> Any
Auto-select and train best model.
Parameters:
- X: Feature matrix
- y: Target vector
Returns: Trained best model
auto_select_models(X: pd.DataFrame, y: pd.Series) -> Dict[str, Any]
Auto-select multiple models.
Parameters:
- X: Feature matrix
- y: Target vector
Returns: Dictionary of trained models
create_ensemble(X: pd.DataFrame, y: pd.Series, method: str = 'voting') -> Any
Create ensemble model.
Parameters:
- X: Feature matrix
- y: Target vector
- method: Ensemble method ('voting', 'stacking', 'blending')
Returns: Ensemble model
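As a rough illustration of what the 'voting' method does (a conceptual sketch in plain Python, not the framework's actual implementation), hard voting takes the majority label across each model's predictions:

```python
from collections import Counter

def majority_vote(predictions):
    """Combine per-model prediction lists by majority vote (hard voting)."""
    return [Counter(col).most_common(1)[0][0] for col in zip(*predictions)]

# Three toy "models" predicting labels for four samples
preds = [
    [0, 1, 1, 0],
    [0, 1, 0, 0],
    [1, 1, 1, 0],
]
print(majority_vote(preds))  # -> [0, 1, 1, 0]
```

Stacking and blending differ in that a second-level model is trained on the base models' outputs rather than taking a simple vote.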
HyperparameterOptimizer
class HyperparameterOptimizer(problem_type: str)
Advanced hyperparameter optimization using Optuna.
Key Features
- Optuna integration
- Multiple optimization strategies
- Ensemble optimization
- Early stopping
- Parallel optimization
Methods
optimize_model(model: Any, X: pd.DataFrame, y: pd.Series, n_trials: int = 100) -> Tuple[Any, Dict[str, Any]]
Optimize model hyperparameters.
Parameters:
- model: Base model to optimize
- X: Feature matrix
- y: Target vector
- n_trials: Number of optimization trials
Returns: Tuple of (optimized_model, best_parameters)
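Trial-based optimizers like Optuna repeatedly sample a candidate configuration, score it, and keep the best result. A minimal random-search sketch of that loop (illustrative only; the search space, objective, and function names below are hypothetical, not the framework's API):

```python
import random

def random_search(objective, space, n_trials=100, seed=0):
    """Minimal trial loop: sample params from the space, keep the best score (maximize)."""
    rng = random.Random(seed)
    best_params, best_score = None, float("-inf")
    for _ in range(n_trials):
        params = {name: rng.choice(choices) for name, choices in space.items()}
        score = objective(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score

# Toy search space and objective with a peak at max_depth=8, learning_rate=0.1
space = {"max_depth": [2, 4, 8, 16], "learning_rate": [0.01, 0.1, 0.3]}
objective = lambda p: -abs(p["max_depth"] - 8) - abs(p["learning_rate"] - 0.1)

best, score = random_search(objective, space, n_trials=200)
print(best, score)
```

Optuna's samplers (e.g. TPE) improve on this by modeling which regions of the space look promising, and early stopping prunes trials that are clearly underperforming.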
ModelEvaluator
class ModelEvaluator(problem_type: str)
Comprehensive model evaluation and comparison.
Key Features
- Multiple evaluation metrics
- Cross-validation
- Model comparison
- Performance analysis
- Visualization support
Methods
evaluate_models(models: Dict[str, Any], X: pd.DataFrame, y: pd.Series, cv: int = 5) -> Dict[str, Dict[str, float]]
Evaluate multiple models.
Parameters:
- models: Dictionary of trained models
- X: Feature matrix
- y: Target vector
- cv: Cross-validation folds
Returns: Dictionary of evaluation results
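Cross-validation underpins this evaluation: the data is split into cv folds, and each fold serves once as the held-out set while the rest is used for training. A stdlib-only sketch of the fold-index generation (illustrative, not the framework's implementation):

```python
def kfold_indices(n_samples, cv=5):
    """Yield (train_idx, test_idx) index lists for cv contiguous folds."""
    base, extra = divmod(n_samples, cv)
    start = 0
    for i in range(cv):
        size = base + (1 if i < extra else 0)  # spread the remainder over early folds
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size

folds = list(kfold_indices(10, cv=5))
print([test for _, test in folds])  # -> [[0, 1], [2, 3], [4, 5], [6, 7], [8, 9]]
```

Each metric is then averaged over the cv train/test splits, which gives a more stable estimate than a single hold-out set.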
Examples
Basic AutoML
```python
from ai_ml_framework.auto_ml import AutoMLSelector, ModelEvaluator
from sklearn.model_selection import train_test_split

# Split data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)

# Initialize AutoML
automl = AutoMLSelector(problem_type='classification')

# Auto-select and train models
models = automl.auto_select_models(X_train, y_train)
print(f"Selected {len(models)} models")

# Evaluate models
evaluator = ModelEvaluator(problem_type='classification')
results = evaluator.evaluate_models(models, X_test, y_test)
best_model_name = max(results.keys(), key=lambda x: results[x]['accuracy'])
print(f"Best model: {best_model_name}")

# Get best model
best_model = automl.auto_select_and_train(X_train, y_train)
print(f"Best model accuracy: {best_model.score(X_test, y_test):.3f}")
```
Hyperparameter Optimization
```python
from ai_ml_framework.auto_ml import HyperparameterOptimizer

# Initialize optimizer
optimizer = HyperparameterOptimizer(problem_type='classification')

# Optimize model
optimized_model, best_params = optimizer.optimize_model(
    models[best_model_name],
    X_train,
    y_train,
    n_trials=50
)

print(f"Best parameters: {best_params}")
print(f"Optimized accuracy: {optimized_model.score(X_test, y_test):.3f}")
```
Ensemble Methods
```python
# Create voting ensemble
voting_ensemble = automl.create_ensemble(
    X_train, y_train, method='voting'
)

# Create stacking ensemble
stacking_ensemble = automl.create_ensemble(
    X_train, y_train, method='stacking'
)

# Compare ensembles
evaluator = ModelEvaluator(problem_type='classification')
voting_results = evaluator.evaluate_single_model(
    voting_ensemble, X_test, y_test
)
stacking_results = evaluator.evaluate_single_model(
    stacking_ensemble, X_test, y_test
)

print(f"Voting ensemble accuracy: {voting_results['accuracy']:.3f}")
print(f"Stacking ensemble accuracy: {stacking_results['accuracy']:.3f}")
```
Model Comparison
```python
# Compare multiple models
comparison_df = evaluator.compare_models(results, 'accuracy')
print("Model Comparison:")
print(comparison_df)

# Get model recommendations
recommendations = automl.get_model_recommendations(X_train, y_train)
print("\nModel Recommendations:")
for model, score in recommendations.items():
    print(f"  {model}: {score:.3f}")
```
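Conceptually, comparing models by a metric reduces to ranking the evaluation dictionary. A sketch with a toy results dictionary (the model names and scores below are made up for illustration):

```python
# Hypothetical evaluation results keyed by model name
results = {
    "random_forest": {"accuracy": 0.91, "f1": 0.90},
    "logistic_regression": {"accuracy": 0.87, "f1": 0.86},
    "gradient_boosting": {"accuracy": 0.93, "f1": 0.92},
}

# Rank models by a single metric, best first
ranked = sorted(results, key=lambda name: results[name]["accuracy"], reverse=True)
print(ranked)  # -> ['gradient_boosting', 'random_forest', 'logistic_regression']
```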