Custom Metrics in CatBoost: Regression and Classification Examples
CatBoost, a high-performance gradient boosting library, lets you define custom metrics tailored to specific business requirements or domain-specific goals. This article demonstrates how to create and use custom metrics in CatBoost for both classification and regression tasks.
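As a minimal sketch of what such a metric object looks like: CatBoost's Python package accepts, as `eval_metric`, an object exposing three methods, `is_max_optimal`, `evaluate`, and `get_final_error`. The `ThresholdAccuracy` class below is a hypothetical illustration, not code from this article. It assumes a binary task, where `approxes` holds a single sequence of raw (pre-sigmoid) scores, so a score above 0 corresponds to a probability above 0.5.

```python
class ThresholdAccuracy:
    """Hypothetical custom metric: weighted accuracy for binary classification."""

    def is_max_optimal(self):
        # Higher accuracy is better, so the best value is the maximum.
        return True

    def evaluate(self, approxes, target, weight):
        # approxes: list of sequences, one per model dimension (one for binary tasks).
        # target: true labels; weight: per-object weights, or None.
        raw_scores = approxes[0]
        error_sum = 0.0
        weight_sum = 0.0
        for i in range(len(raw_scores)):
            w = 1.0 if weight is None else weight[i]
            # Assumption: a raw score above 0 means predicted class 1.
            predicted = 1 if raw_scores[i] > 0 else 0
            error_sum += w * (predicted == int(target[i]))
            weight_sum += w
        # Return the weighted count of correct predictions and the total weight.
        return error_sum, weight_sum

    def get_final_error(self, error, weight):
        # Turn the accumulated sums into the final metric value.
        return error / (weight + 1e-38)
```

An instance of such a class would typically be passed to the model as `CatBoostClassifier(eval_metric=ThresholdAccuracy())`.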
Custom Metric for Classification
We’ll start with a classification example. We’ll use a custom metric based on profit calculation for a binary classification problem using the Titanic dataset.
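Before the full listing, it may help to see the profit arithmetic in isolation. The sketch below uses only the standard library and the payoffs from this article's metric (+400 per true positive, -200 per false negative, -100 per false positive). The function name `profit_from_scores`, the 0.5 threshold, and the sample scores are assumptions made for illustration.

```python
import math

def profit_from_scores(raw_scores, labels, threshold=0.5):
    """Turn raw scores into class labels, then compute the profit."""
    tp = fp = fn = 0
    for score, label in zip(raw_scores, labels):
        prob = 1.0 / (1.0 + math.exp(-score))  # logistic (sigmoid) function
        pred = 1 if prob > threshold else 0     # assumed decision threshold
        if pred == 1 and label == 1:
            tp += 1
        elif pred == 1 and label == 0:
            fp += 1
        elif pred == 0 and label == 1:
            fn += 1
        # True negatives carry no payoff, so they are not counted.
    return 400 * tp - 200 * fn - 100 * fp

# Made-up scores and labels: 2 true positives, 1 false positive, 1 false negative.
raw_scores = [2.0, -1.5, 0.3, -0.2, 1.1]
labels = [1, 0, 0, 1, 1]
print(profit_from_scores(raw_scores, labels))  # 400*2 - 200*1 - 100*1 = 500
```

Note that the probabilities are thresholded rather than cast directly to integers: a value strictly between 0 and 1 truncates to 0, which would label every sample as negative.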
Here’s the complete code for the custom classification metric:
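from catboost import CatBoostClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import confusion_matrix
from scipy.special import expit
import numpy as np
import seaborn as sns

# Load dataset and keep the target plus four numeric features
df = sns.load_dataset('titanic')
X = df[['survived', 'pclass', 'age', 'sibsp', 'fare']].copy()
y = X.pop('survived')

# Split data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=100)

class ProfitMetric:
    @staticmethod
    def get_profit(y_true, y_pred):
        # Apply the logistic function to turn raw scores into probabilities,
        # then threshold at 0.5 to get class labels (casting probabilities
        # straight to int would truncate every value to 0)
        y_pred = (expit(np.asarray(y_pred)) > 0.5).astype(int)
        y_true = np.asarray(y_true).astype(int)
        # labels=[0, 1] keeps the matrix 2x2 even if one class is absent
        tn, fp, fn, tp = confusion_matrix(y_true, y_pred, labels=[0, 1]).ravel()
        # Calculate profit: reward true positives, penalize false negatives and false positives
        profit = 400 * tp - 200 * fn - 100 * fp
        return profit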
def…