Model Sources
ModelSource / ComparedModelSource — fitted-model adapters that feed the diagnostics.
Ferrum — a statistical visualization library with a Rust core.
ComparedModelSource
Multi-model wrapper exposing the same surface as ModelSource.
Every derived-data method is proxied through each underlying
ModelSource and the per-model outputs are concatenated with a
model: Utf8 column stamped on each frame, so downstream chart
builders can route color="model" to render one curve per model.
_X, _y, _feature_names, and _class_names resolve to
the first source's values (every wrapped source shares X / y
by construction in ModelSource.compare, so any one will do);
accessing _model raises since there is no single estimator.
model_names reports the configured ordering.
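The proxy-and-concatenate behavior described above can be sketched in plain Python. This is an illustration under stated assumptions, not Ferrum's implementation: rows are plain dicts rather than polars frames, and `stamp_and_concat` is a hypothetical helper name.

```python
def stamp_and_concat(per_model_rows):
    """Concatenate per-model rows, adding a `model` key to every row.

    `per_model_rows` maps a model name to a list of row dicts. Dict
    insertion order decides the output order, mirroring `model_names`.
    """
    combined = []
    for name, rows in per_model_rows.items():
        for row in rows:
            # Stamp the model name first so it reads like a leading column.
            combined.append({"model": name, **row})
    return combined


# Toy per-model outputs standing in for derived-data frames.
frames = {
    "ridge": [{"fpr": 0.0, "tpr": 0.0}, {"fpr": 1.0, "tpr": 1.0}],
    "lasso": [{"fpr": 0.0, "tpr": 0.0}, {"fpr": 1.0, "tpr": 1.0}],
}
long_form = stamp_and_concat(frames)
model_names = list(frames)
```

A chart builder can then group `long_form` by the `model` key to draw one curve per model.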
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| sources | dict[str, ModelSource] | Mapping from model name (used for the … | required |
Examples:
>>> import ferrum as fm
>>> cms = fm.ModelSource.compare({"ridge": ridge, "lasso": lasso}, X, y)
>>> fm.roc_chart(cms) # overlay both curves
>>> cms.model_names
['ridge', 'lasso']
>>> cms.roc_curve() # long-form frame with `model` column
model_names property
Ordered list of model display names.
Returns the keys of the sources dict supplied at construction time,
in insertion order. Each name corresponds to the value written into the
model column on every derived-data DataFrame.
Returns:
| Type | Description |
|---|---|
| list[str] | Model names in the order they were registered. |
ModelSource
Bases: PredictionsMixin, ClassificationCurvesMixin, FeatureImportanceMixin, ModelSelectionMixin, ClusteringMixin, RankingMixin, BaseSource
Wrap a fitted estimator + dataset and expose model-diagnostic derived data as polars DataFrames.
Constructing a ModelSource is sklearn-free — only attribute
introspection runs at __init__ time. Derived-data methods that
need sklearn / shap / umap lazy-import on call, so import ferrum
never pulls those packages into the user's process unless they
actually compute a diagnostic that requires them.
Each derived-data method returns a long-form polars DataFrame
whose schema is documented in ferrum._diagnostics.schemas —
chart builders and Visualizers consume the same frames.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| model | Any | A fitted estimator. Must expose at least … | required |
| X | DataFrame \| DataFrame \| Table \| ndarray | Feature matrix. Coerced internally to a polars DataFrame; any … | required |
| y | array-like | Target. Required by methods that depend on ground truth (every method except … | None |
| feature_names | sequence of str | Column labels. Defaults to … | None |
| class_names | sequence of str | Per-class display labels for classification diagnostics. Defaults to … | None |
| sample_weight | array-like | Per-row weights forwarded to sklearn scorers that accept them. | None |
| random_state | int | Seed propagated to every derived-data method whose underlying compute consumes randomness (importances permutation, SHAP background sampling, UMAP / t-SNE / MDS embeddings, cross-validation curves, partial-dependence sampling). Deterministic methods ignore the value. | None |
Examples:
>>> import ferrum as fm
>>> source = fm.ModelSource(model, X, y, random_state=0)
>>> fm.roc_chart(source) # use directly with a figure function
>>> source.predictions() # access derived data as a DataFrame
>>> source.confusion_matrix(normalize="true")
X property
Feature matrix coerced to a polars DataFrame.
Returns the value supplied to __init__ (after coercion).
Use this for read-only access from chart builders and external
callers; source._X is an internal alias preserved for
backward compatibility.
y property
Target series, or None when no y was supplied.
Returns the polars Series the constructor coerced from the
y argument. None means unsupervised — methods that
need ground truth raise on call.
model property
The wrapped fitted estimator.
Returns the model object supplied at construction time unchanged.
Chart builders use it for occasional native introspection (e.g.
model.classes_, model.n_clusters); prefer the public
derived-data methods when one exists.
feature_names property
capabilities property
Protocol attributes present on the wrapped estimator.
A frozen subset of _PROTOCOL_ATTRS ("predict",
"predict_proba", "coef_", "feature_importances_", …)
detected at construction time via hasattr. Derived-data methods
gate on this set to pick the appropriate code path and raise
AttributeError with a clear message when a required attribute
is absent.
Returns:
| Type | Description |
|---|---|
| frozenset[str] | Attribute names that are present on the wrapped model. |
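The hasattr-based detection and gating described for capabilities might look roughly like the sketch below. The attribute tuple is a hypothetical subset (the real `_PROTOCOL_ATTRS` is not shown in this page), and `detect_capabilities`, `require`, and `ToyLinear` are illustrative names, not Ferrum API.

```python
# Hypothetical subset of protocol attributes; the real list may differ.
PROTOCOL_ATTRS = ("predict", "predict_proba", "coef_", "feature_importances_")


def detect_capabilities(model):
    """Return the protocol attributes that `model` actually exposes."""
    return frozenset(name for name in PROTOCOL_ATTRS if hasattr(model, name))


def require(capabilities, attr):
    """Raise a descriptive AttributeError when a needed attribute is absent."""
    if attr not in capabilities:
        raise AttributeError(
            f"wrapped model has no '{attr}', which this diagnostic requires"
        )


class ToyLinear:
    """Minimal stand-in for a fitted linear estimator."""

    coef_ = [0.5, -1.2]

    def predict(self, X):
        return [sum(c * v for c, v in zip(self.coef_, row)) for row in X]


caps = detect_capabilities(ToyLinear())
require(caps, "coef_")  # passes silently because coef_ is present
```

Freezing the set at construction time keeps later gating checks cheap and independent of any mutation of the estimator.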
rank1d
Univariate feature ranking.
algorithm in {"shapiro", "variance", "covariance"}. The
Shapiro-Wilk and variance algorithms operate on X alone;
"covariance" ranks features by absolute sample covariance with
y and therefore requires y to be present.
Output schema (SCHEMA_RANK1D): feature: Utf8,
score: Float64, rank: Int64. Rows are pre-sorted by descending
score so rank=1 is always the top feature.
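As a rough illustration of the output shape, the "variance" ranking can be sketched in pure Python. This assumes the sample variance (n - 1 denominator); whether Ferrum's Rust kernel uses the sample or population variance is not stated here, and `rank1d_variance` is a hypothetical name.

```python
from statistics import variance


def rank1d_variance(columns):
    """Rank features by sample variance, highest first.

    `columns` maps feature name -> list of numeric values.
    """
    scores = {name: variance(values) for name, values in columns.items()}
    ordered = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    # rank starts at 1 so the top feature always has rank == 1.
    return [
        {"feature": name, "score": float(score), "rank": position}
        for position, (name, score) in enumerate(ordered, start=1)
    ]


X = {
    "age": [20, 30, 40, 50],
    "height": [1.6, 1.7, 1.8, 1.7],
    "income": [10, 200, 50, 400],
}
rows = rank1d_variance(X)
```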
rank2d
Pairwise feature ranking — long-form correlation matrix.
algorithm in {"pearson", "spearman", "kendall", "covariance"}.
All algorithms now run in Rust (Kendall uses Knight's O(n log n) algorithm).
Output schema (SCHEMA_RANK2D): feature_x: Utf8,
feature_y: Utf8, correlation: Float64 — one row per
ordered pair of features, p × p rows total.
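The long-form layout (one row per ordered pair, p × p rows) can be illustrated with a pure-Python Pearson sketch. This is only a shape demonstration under the assumption that every column is non-constant; it is not the Rust kernel, and the function names are hypothetical.

```python
from itertools import product
from math import sqrt


def pearson(a, b):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)


def rank2d_pearson(columns):
    """One row per ordered (feature_x, feature_y) pair, diagonal included."""
    return [
        {
            "feature_x": fx,
            "feature_y": fy,
            "correlation": pearson(columns[fx], columns[fy]),
        }
        for fx, fy in product(columns, repeat=2)
    ]


X = {
    "a": [1.0, 2.0, 3.0, 4.0],
    "b": [2.0, 4.0, 6.0, 8.0],
    "c": [4.0, 3.0, 2.0, 1.0],
}
rows = rank2d_pearson(X)
```

With three features the result has nine rows, which maps directly onto a heatmap with feature_x and feature_y as the two axes.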
silhouette
Per-sample silhouette values, sorted within cluster descending.
Returns one row per sample with columns sample_id (original X
index), y_position (sequential 0..n-1 stack order — used by
mark_silhouette to render bars in a tightly-packed Rousseeuw
layout), cluster, and silhouette_value.
k is informational; if provided, the result is filtered to
clusters in range(k).
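The stacking order can be sketched as follows, taking already-computed silhouette values as input. The assumption that clusters are stacked in ascending label order is mine; the text above only specifies the descending sort within each cluster. `stack_silhouette` is a hypothetical helper.

```python
def stack_silhouette(cluster, silhouette_value):
    """Assign y_position so bars are grouped by cluster and sorted within it.

    Both arguments are equal-length lists indexed by the original sample id.
    """
    order = sorted(
        range(len(cluster)),
        key=lambda i: (cluster[i], -silhouette_value[i]),
    )
    return [
        {
            "sample_id": i,          # original index into X
            "y_position": position,  # sequential 0..n-1 stack order
            "cluster": cluster[i],
            "silhouette_value": silhouette_value[i],
        }
        for position, i in enumerate(order)
    ]


clusters = [1, 0, 1, 0, 0]
values = [0.2, 0.7, 0.9, 0.1, 0.4]
rows = stack_silhouette(clusters, values)
```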
pca_variance
Explained-variance ratio per principal component plus the cumulative running sum.
If the wrapped model exposes explained_variance_ratio_ (e.g.
sklearn.decomposition.PCA), reads it directly (backward compat).
Otherwise computes from raw X via Rust SVD.
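For the SVD path, the explained-variance ratio of each component is its squared singular value divided by the sum of squared singular values (for centered data). The sketch below starts from given singular values; the column names are illustrative assumptions, since the actual schema is not listed on this page.

```python
from itertools import accumulate


def variance_table(singular_values):
    """Explained-variance ratio per component plus its running sum."""
    squared = [s * s for s in singular_values]
    total = sum(squared)
    ratios = [v / total for v in squared]
    return [
        {"component": i, "explained_variance_ratio": r, "cumulative": c}
        for i, (r, c) in enumerate(zip(ratios, accumulate(ratios)))
    ]


# Singular values of a hypothetical centered feature matrix.
rows = variance_table([3.0, 2.0, 1.0])
```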
embeddings
Low-dimensional embedding of X via UMAP / t-SNE / PCA.
Returns dim_0 … dim_{n_components-1} plus a label column
(y when provided, else zeros — used to color the scatter).
random_state is taken from the source's random_state.
intercluster_distance
2D embedding of cluster centers + cluster size.
Returns one row per cluster with cluster (Int64), x / y
(Float64, the 2D embedded coordinate), and size (Int64, sample
count). Requires the wrapped model to expose cluster_centers_.
learning_curve
Learning curve: score per (train_size, fold, split).
Returns long-form rows — one per (train_size, fold, split). Each
row carries the per-fold score plus the per-(train_size, split)
aggregates mean_score, std_score, lower, upper (95%
CI on the mean). Chart builders dedupe by (train_size, split) to
render a ribbon + line; the per-fold rows enable per-fold strip
overlays if a future caller wants them.
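The per-(train_size, split) aggregates could be computed along these lines. This sketch assumes a normal-approximation interval (mean ± 1.96 · std / √n) with the sample standard deviation; the exact interval formula Ferrum uses is not documented here, so treat the numbers as illustrative only.

```python
from math import sqrt
from statistics import mean, stdev


def aggregate(fold_scores, z=1.96):
    """Mean, std, and an approximate 95% CI on the mean of per-fold scores."""
    m = mean(fold_scores)
    s = stdev(fold_scores)  # sample std (n - 1), an assumption
    half_width = z * s / sqrt(len(fold_scores))
    return {
        "mean_score": m,
        "std_score": s,
        "lower": m - half_width,
        "upper": m + half_width,
    }


# Five folds for one (train_size, split) group.
agg = aggregate([0.80, 0.82, 0.78, 0.84, 0.76])
```

Because the aggregates repeat on every fold row of a group, deduplicating by (train_size, split) yields exactly one ribbon point per group.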
validation_curve
Validation curve: score per (param_value, fold, split).
Same shape as learning_curve but parameterized by an
estimator hyperparameter sweep. param is the kwarg name on
the wrapped estimator (e.g. "alpha" for Ridge).
cv_scores
Per-fold cross-validation scores.
Returns one row per (fold, split) — train and test scores for each cross-validation fold. Chart builders use this for boxplot / bar / strip distributions across folds.
alpha_selection
Regularization-strength sweep for linear models.
Returns one row per (alpha, fold) — the per-fold test score on the
held-out split — plus per-alpha mean_score / std_score
aggregates. Chart builders dedupe by alpha to render a single
line, and use argmax(mean_score) to mark the best alpha.
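The dedupe-then-argmax step can be sketched over plain row dicts. The `score` key is an assumed column name for the per-fold test score, and `best_alpha` is a hypothetical helper.

```python
from collections import defaultdict
from statistics import mean


def best_alpha(rows):
    """Collapse per-fold rows to one mean per alpha and pick the best."""
    by_alpha = defaultdict(list)
    for row in rows:
        by_alpha[row["alpha"]].append(row["score"])
    means = {alpha: mean(scores) for alpha, scores in by_alpha.items()}
    # argmax over mean_score; higher is assumed to be better.
    return max(means, key=means.get), means


rows = [
    {"alpha": 0.1, "fold": 0, "score": 0.70},
    {"alpha": 0.1, "fold": 1, "score": 0.74},
    {"alpha": 1.0, "fold": 0, "score": 0.80},
    {"alpha": 1.0, "fold": 1, "score": 0.78},
    {"alpha": 10.0, "fold": 0, "score": 0.60},
    {"alpha": 10.0, "fold": 1, "score": 0.66},
]
alpha, means = best_alpha(rows)
```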
importances
importances(*, method: str = 'builtin', n_repeats: int = 30, scoring: Any = None, random_state: int | None = None) -> pl.DataFrame
Feature importance per feature, sorted by descending |importance|.
method="builtin" reads the wrapped model's feature_importances_
(tree-based estimators) or coef_ (linear estimators, averaged
absolute value across classes for multi-output linears). std is
zero in this path — sklearn's built-in attribute exposes no
per-feature variance.
method="permutation" calls sklearn's
permutation_importance with n_repeats/scoring and
populates std with the per-feature standard deviation across
repeats.
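The coef_ branch of the builtin path, as described above, can be approximated like this. The sketch assumes a 2-D coefficient list of shape (n_classes, n_features) and column names `feature`, `importance`, and `std`; it is not the library's code.

```python
def builtin_importances(feature_names, coef):
    """Mean absolute coefficient across classes, sorted by |importance|."""
    n_classes = len(coef)
    rows = []
    for j, name in enumerate(feature_names):
        importance = sum(abs(class_row[j]) for class_row in coef) / n_classes
        # std is zero: a fitted attribute carries no per-feature variance.
        rows.append({"feature": name, "importance": importance, "std": 0.0})
    rows.sort(key=lambda r: abs(r["importance"]), reverse=True)
    return rows


coef = [
    [0.5, -2.0, 0.1],   # class 0
    [-1.5, 1.0, 0.3],   # class 1
]
rows = builtin_importances(["a", "b", "c"], coef)
```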
shap_values
Long-form SHAP values per (sample, feature, class).
Returns a DataFrame with sample_id, feature, shap_value,
feature_value, feature_value_normalized, class_label.
- Regression: class_label is the constant "target" on every row.
- Binary classifiers: class_label is the positive-class name on every row; SHAP values are for the positive class.
- Multi-class classifiers: one row per (sample, feature, class); class_label carries the class name. The result has n_samples * n_features * n_classes rows total.
Explainer is auto-picked by model capability:
- coef_: shap.LinearExplainer (deterministic, fast).
- feature_importances_: shap.TreeExplainer (deterministic for tree ensembles).
- otherwise: shap.KernelExplainer (model-agnostic; uses the first min(50, N) rows of X as the background unless an explicit background array is passed).
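The capability-based dispatch listed above can be sketched without importing shap at all. The function returns explainer names as strings; checking coef_ before feature_importances_ follows the listing order and is an assumption about precedence when a model exposes both. Both helper names are hypothetical.

```python
def pick_explainer(capabilities):
    """Map a capability set to the name of the SHAP explainer to use."""
    if "coef_" in capabilities:
        return "LinearExplainer"
    if "feature_importances_" in capabilities:
        return "TreeExplainer"
    return "KernelExplainer"


def default_background(X_rows, limit=50):
    """First min(limit, N) rows of X, used when no background is passed."""
    return X_rows[: min(limit, len(X_rows))]


linear_choice = pick_explainer(frozenset({"predict", "coef_"}))
tree_choice = pick_explainer(frozenset({"predict", "feature_importances_"}))
fallback_choice = pick_explainer(frozenset({"predict"}))
background = default_background([[float(i)] for i in range(120)])
```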
partial_dependence
partial_dependence(features: list[str | int], *, grid_resolution: int = 100, kind: str = 'average') -> pl.DataFrame
Partial dependence per feature.
kind="average" (default) returns the marginal PD curve per
feature with sample_id = -1 (one row per grid point per
feature).
kind="individual" returns per-sample ICE curves: one row per
(feature, sample_id, grid_point) triple with sample_id in
[0, n_samples). Chart builders pair this with the detail
encoding channel on sample_id to render one polyline per
sample.
kind="both" returns the union of the two: ICE rows plus
average rows (sample_id = -1), so a downstream chart can
overlay both layers on the same DataFrame.
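A minimal sketch of the kind="both" layout, assuming a plain callable as the model and hypothetical column names `grid_value` and `prediction`: each ICE row fixes the feature at a grid value for one sample, and the average row (sample_id = -1) is the mean of the ICE predictions at that grid value.

```python
def partial_dependence_both(predict, X, feature, grid):
    """ICE rows for every sample plus an average row per grid point."""
    rows = []
    for grid_value in grid:
        predictions = []
        for sample_id, row in enumerate(X):
            modified = list(row)
            modified[feature] = grid_value  # overwrite the feature of interest
            prediction = predict(modified)
            predictions.append(prediction)
            rows.append({
                "feature": feature,
                "sample_id": sample_id,
                "grid_value": grid_value,
                "prediction": prediction,
            })
        # sample_id = -1 marks the marginal (average) curve.
        rows.append({
            "feature": feature,
            "sample_id": -1,
            "grid_value": grid_value,
            "prediction": sum(predictions) / len(predictions),
        })
    return rows


def toy_predict(row):
    """Stand-in model: a fixed linear function of two features."""
    return 2.0 * row[0] + row[1]


X = [[0.0, 1.0], [1.0, 3.0]]
rows = partial_dependence_both(toy_predict, X, feature=0, grid=[0.0, 1.0])
```

Filtering on sample_id == -1 recovers the kind="average" rows, and filtering on sample_id >= 0 recovers the kind="individual" rows.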
roc_curve
ROC curve(s). One row per (class, threshold); the auc value is repeated on every row of its class.
For binary classifiers with average=None (default), returns a
single curve on the positive (second) class. For multiclass,
returns one-vs-rest curves per class; pass average in
{"micro", "macro", "weighted"} to additionally include a summary
curve under class="<average>".
pr_curve
Precision-recall curve(s). One row per (class, threshold).
For binary classifiers, returns a single curve on the positive
(second) class — average is accepted for API symmetry with
the multiclass path but has no effect because binary classifiers
have only one curve to draw. For multiclass:
- average=None (default): returns one-vs-rest curves per class.
- average in {"micro", "macro", "weighted"}: returns a single summary curve with class="<average>" and no per-class rows. Macro / weighted variants interpolate per-class precision over a shared recall grid (100 points); micro ravels the binarized labels into one curve. threshold is NaN on every row of macro / weighted summaries (recall-grid interpolation discards thresholds) and follows sklearn's padding convention for micro.
threshold is NaN at the final (recall=0) point of every
per-class curve per sklearn's convention.
calibration_curve
Calibration (reliability) curve for binary classifiers.
Returns one row per non-empty bin with mean_predicted,
fraction_positive, and count. Delegates to the
calibration_kernel Rust kernel.
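The row layout can be illustrated with a pure-Python binning sketch. It assumes equal-width bins over [0, 1]; the binning strategy and default bin count of the Rust kernel are not stated here, so this is a stand-in rather than a port.

```python
def calibration_rows(y_true, y_prob, n_bins=5):
    """One row per non-empty equal-width probability bin."""
    sums = [0.0] * n_bins
    positives = [0] * n_bins
    counts = [0] * n_bins
    for label, prob in zip(y_true, y_prob):
        # Clamp so that prob == 1.0 falls into the last bin.
        b = min(int(prob * n_bins), n_bins - 1)
        sums[b] += prob
        positives[b] += label
        counts[b] += 1
    return [
        {
            "mean_predicted": sums[b] / counts[b],
            "fraction_positive": positives[b] / counts[b],
            "count": counts[b],
        }
        for b in range(n_bins)
        if counts[b] > 0  # empty bins produce no row
    ]


y_true = [0, 0, 1, 1, 1, 0]
y_prob = [0.05, 0.15, 0.85, 0.95, 0.55, 0.45]
rows = calibration_rows(y_true, y_prob, n_bins=5)
```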
cumulative_gain
Cumulative-gain curve per class. Appends a 2-row class='baseline'
diagonal for plotting reference.
lift_curve
Lift curve per class. Appends a 2-row class='baseline' line at
lift=1.0.
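Cumulative gain and lift are closely related: after sorting samples by descending score, gain is the share of positives captured in the top fraction, and lift is gain divided by that fraction. The sketch below covers a single positive class with hypothetical column names, and assumes at least one positive label.

```python
def gain_and_lift(y_true, y_score):
    """Cumulative gain and lift for one positive class."""
    order = sorted(range(len(y_true)), key=lambda i: y_score[i], reverse=True)
    total_positives = sum(y_true)
    n = len(y_true)
    rows = []
    captured = 0
    for k, i in enumerate(order, start=1):
        captured += y_true[i]
        fraction = k / n                   # share of samples examined
        gain = captured / total_positives  # share of positives captured
        rows.append({"fraction": fraction, "gain": gain, "lift": gain / fraction})
    return rows


y_true = [1, 0, 1, 0]
y_score = [0.9, 0.8, 0.7, 0.1]
rows = gain_and_lift(y_true, y_score)
```

A random ranking has gain equal to fraction, which is why the baseline is a diagonal for gain and a flat line at 1.0 for lift.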
discrimination_threshold
Discrimination threshold sweep — binary classifiers only.
Sweeps n_thresholds evenly-spaced thresholds in [0, 1] and
reports precision, recall, F1, and queue_rate at each. queue_rate
is the hand-computed fraction (y_score >= t).mean().
When cv is an int, runs the same sweep on each fold's held-out
scores from a freshly-cloned + re-fit estimator and averages
per-threshold metrics across folds. Pass a splitter object with a
.split() method to override.
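The non-CV sweep can be sketched in pure Python. Returning 0.0 when a metric's denominator is zero is my assumption; the text does not say how undefined precision or recall is handled, and the cross-validated averaging is omitted.

```python
def threshold_sweep(y_true, y_score, n_thresholds=5):
    """Precision, recall, F1 and queue rate at evenly spaced thresholds."""
    rows = []
    for step in range(n_thresholds):
        t = step / (n_thresholds - 1)
        predicted = [score >= t for score in y_score]
        tp = sum(1 for p, y in zip(predicted, y_true) if p and y == 1)
        fp = sum(1 for p, y in zip(predicted, y_true) if p and y == 0)
        fn = sum(1 for p, y in zip(predicted, y_true) if not p and y == 1)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        rows.append({
            "threshold": t,
            "precision": precision,
            "recall": recall,
            "f1": f1,
            # Fraction of samples flagged positive: (y_score >= t).mean()
            "queue_rate": sum(predicted) / len(predicted),
        })
    return rows


y_true = [0, 0, 1, 1]
y_score = [0.1, 0.4, 0.6, 0.9]
rows = threshold_sweep(y_true, y_score, n_thresholds=5)
```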
confusion_matrix
Confusion matrix in long form: one row per (actual, predicted) cell.
normalize: None for raw counts, "true"/"pred"/"all"
for sklearn-style normalization. value is the (possibly
normalized) count; value_fmt is a stringified label suitable for
mark_text overlay (integer counts when unnormalized, two-decimal
fractions when normalized).
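The long-form layout and the value_fmt rule can be sketched as below. The `actual` and `predicted` column names are inferred from the description, and treating an all-zero row or column as 0.0 under normalization is an assumption.

```python
from collections import Counter


def confusion_long(y_true, y_pred, labels, normalize=None):
    """One row per (actual, predicted) cell with a display-ready label."""
    counts = Counter(zip(y_true, y_pred))
    rows = []
    for actual in labels:
        for predicted in labels:
            count = counts[(actual, predicted)]
            if normalize is None:
                value, value_fmt = count, str(count)
            else:
                if normalize == "true":    # normalize over each actual row
                    denom = sum(counts[(actual, q)] for q in labels)
                elif normalize == "pred":  # normalize over each predicted column
                    denom = sum(counts[(q, predicted)] for q in labels)
                elif normalize == "all":   # normalize over the whole matrix
                    denom = len(y_true)
                else:
                    raise ValueError(f"unknown normalize: {normalize!r}")
                value = count / denom if denom else 0.0
                value_fmt = f"{value:.2f}"
            rows.append({
                "actual": actual,
                "predicted": predicted,
                "value": value,
                "value_fmt": value_fmt,
            })
    return rows


y_true = [0, 0, 0, 1]
y_pred = [0, 0, 1, 1]
raw = confusion_long(y_true, y_pred, labels=[0, 1])
by_true = confusion_long(y_true, y_pred, labels=[0, 1], normalize="true")
```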
predictions
Return y_true, y_pred, residual, studentized_residual, cooks_distance, leverage.
leverage is the diagonal of the hat matrix
H = X (XᵀX)⁻¹ Xᵀ for linear estimators (those exposing
coef_); NaN otherwise. Used by the residuals-vs-leverage
panel of multi-panel residuals charts.
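To make the hat-matrix diagonal concrete, the sketch below evaluates H = X (XᵀX)⁻¹ Xᵀ in closed form for a two-column design matrix [1, x] (an intercept column plus one feature). Including an intercept column is an assumption made for the example; how Ferrum builds the design matrix is not described here.

```python
def leverage_two_column(x):
    """Diagonal of H = X (X^T X)^-1 X^T for the design matrix X = [1, x].

    X^T X = [[n, Sx], [Sx, Sxx]], whose inverse is
    (1 / det) * [[Sxx, -Sx], [-Sx, n]] with det = n * Sxx - Sx^2.
    """
    n = len(x)
    sx = sum(x)
    sxx = sum(v * v for v in x)
    det = n * sxx - sx * sx
    # h_i = [1, x_i] (X^T X)^-1 [1, x_i]^T, expanded term by term.
    return [(sxx - 2.0 * v * sx + n * v * v) / det for v in x]


leverage = leverage_two_column([1.0, 2.0, 3.0, 10.0])
```

The leverages sum to the number of columns in the design matrix (here 2), and the point far from the others, x = 10, receives by far the largest value.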
probabilities
Return y_true + one column per class with predicted probability.
compare classmethod
Build a ComparedModelSource over one ModelSource per model.
Each value in models is wrapped in its own ModelSource with the
shared X and y. The returned ComparedModelSource proxies
every derived-data method through all wrapped sources and stamps the
model name as a model column on the concatenated output, so
downstream chart builders can route color="model".
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| models | dict[str, Any] | Mapping from display name to fitted estimator. Each estimator is wrapped in its own … | required |
| X | array-like | Feature matrix shared by all models. Accepted types match … | required |
| y | array-like | Target shared by all models. Required by most derived-data methods (same constraints as … | None |
| **kwargs | Any | Keyword arguments forwarded verbatim to each … | {} |
Returns:
| Type | Description |
|---|---|
| ComparedModelSource | Multi-model wrapper whose derived-data methods return long-form DataFrames with an extra … |
Examples: