ferrum.plots

Unified plot function package — all ferrum figure-level functions.

Organized by domain: classification, regression, distribution, matrix, model_selection, clustering, explanation, ranking.

roc_chart

roc_chart(model: Any = None, X: Any = None, y: Any = None, *, y_true: Any = None, y_pred: Any = None, per_class: bool = True, average: str | None = 'macro', annotate_auc: bool = True, subtitle: str | None = None, compare: dict[str, Any] | None = None, random_state: int | None = None, mark: dict | None = None, encode: dict | None = None, properties: dict | None = None, layers: list | None = None, theme: Any = None) -> 'Chart'

ROC curve chart for a classifier.

Plots true-positive rate vs. false-positive rate, one curve per class (default) or a single macro/micro/weighted-averaged curve. Supports multi-model comparison via compare=.

Parameters:

Name Type Description Default
model estimator, ModelSource, or dict of str -> estimator

Fitted sklearn-compatible classifier, an explicit ferrum.ModelSource, or a dict of named estimators for comparison. When a dict is passed, each estimator is evaluated and curves are overlaid. Mutually exclusive with y_true/y_pred.

None
X array-like

Feature matrix. Required when model is a raw estimator.

None
y array-like

True class labels. Required when model is a raw estimator.

None
y_true array-like

Ground-truth class labels for the precomputed path. Must be paired with y_pred; mutually exclusive with model.

None
y_pred array-like

Soft scores / probabilities for the precomputed path. 1-D for binary classifiers (positive-class scores); 2-D (n_samples, n_classes) for multiclass.

None
per_class bool

When True, one ROC curve per class is drawn using the one-vs-rest scheme. When False, a single curve is drawn, averaged as specified by average.

True
average ('macro', 'micro', 'weighted')

Averaging method used when per_class=False. Ignored when per_class=True.

"macro"
annotate_auc bool

When True, injects one text label per class showing the AUC value to 3 decimal places, anchored in the lower-right corner of the plot.

True
subtitle str or None

Optional subtitle rendered beneath the active chart title.

None
compare dict of str -> estimator or None

Additional estimators to overlay. Keys become model labels. model is treated as the base model (label "base").

None
random_state int or None

Seed forwarded to ModelSource.

None
theme Theme or None

Ferrum theme to apply to the returned chart.

None
mark dict

Per-layer mark overrides. For composite-mark charts, keys are layer names (e.g. {"scatter": {"opacity": 0.5}}); for single-mark charts, a flat dict of mark properties.

None
encode dict

Additional encoding kwargs merged via Chart.encode(**encode).

None
properties dict

Chart properties merged via Chart.properties(**properties) (e.g. {"width": 400, "title": "My chart"}).

None
layers list

Extra layers appended via Chart.layer(*layers).

None

Returns:

Type Description
Chart

ROC curve chart with one line per class (or averaged curve).

Examples:

>>> import ferrum as fm
>>> from sklearn.linear_model import LogisticRegression
>>> fm.roc_chart(LogisticRegression().fit(X_train, y_train), X_test, y_test)

Precomputed path — bypass the model entirely:

>>> fm.roc_chart(y_true=y_test, y_pred=clf.predict_proba(X_test))
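
To make the plotted quantities concrete, the following standard-library sketch computes the points of a binary ROC curve and its trapezoidal AUC. It illustrates the textbook definition only and is not ferrum's implementation; roc_points and auc are hypothetical helper names. Under the one-vs-rest scheme, the same computation would be repeated once per class with that class treated as positive.

```python
def roc_points(y_true, scores):
    """Return (fpr, tpr) lists for binary labels (1 = positive) and scores."""
    pos = sum(y_true)
    neg = len(y_true) - pos
    # Visit samples from highest to lowest score, i.e. lower the threshold.
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    fpr, tpr = [0.0], [0.0]
    tp = fp = 0
    for rank, i in enumerate(order):
        if y_true[i] == 1:
            tp += 1
        else:
            fp += 1
        # Emit a point only after the last sample of a group of tied scores.
        is_last = rank == len(order) - 1
        if is_last or scores[order[rank + 1]] != scores[i]:
            fpr.append(fp / neg)
            tpr.append(tp / pos)
    return fpr, tpr


def auc(fpr, tpr):
    """Area under the curve by the trapezoidal rule."""
    return sum((fpr[k] - fpr[k - 1]) * (tpr[k] + tpr[k - 1]) / 2
               for k in range(1, len(fpr)))


y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
fpr, tpr = roc_points(y_true, scores)
print(f"AUC = {auc(fpr, tpr):.3f}")  # prints "AUC = 0.750"
```

The three-decimal formatting mirrors the label format described for annotate_auc.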

pr_chart

pr_chart(model: Any = None, X: Any = None, y: Any = None, *, y_true: Any = None, y_pred: Any = None, per_class: bool = True, average: str | None = 'macro', annotate_ap: bool = True, iso_lines: bool = False, subtitle: str | None = None, compare: dict[str, Any] | None = None, random_state: int | None = None, mark: dict | None = None, encode: dict | None = None, properties: dict | None = None, layers: list | None = None, theme: Any = None) -> 'Chart'

Precision-recall curve chart for a classifier.

Plots precision vs. recall, one curve per class (default) or a single averaged curve. Both axes are pinned to [0, 1.05] so curves render against the full precision-recall space (same convention as sklearn and yellowbrick). Supports multi-model comparison via compare=.

Parameters:

Name Type Description Default
model estimator, ModelSource, or dict of str -> estimator

Fitted sklearn-compatible classifier, an explicit ferrum.ModelSource, or a dict of named estimators for comparison.

None
X array-like

Feature matrix. Required when model is a raw estimator.

None
y array-like

True class labels. Required when model is a raw estimator.

None
y_true array-like

Ground-truth class labels for the precomputed path. Must be paired with y_pred; mutually exclusive with model.

None
y_pred array-like

Soft scores / probabilities for the precomputed path. 1-D for binary classifiers (positive-class scores); 2-D (n_samples, n_classes) for multiclass.

None
per_class bool

When True, one PR curve per class is drawn using the one-vs-rest scheme. When False, a single curve is drawn, averaged as specified by average. Binary classifiers always render a single curve, regardless of this setting.

True
average ('macro', 'micro', 'weighted')

Averaging method used when per_class=False. Ignored when per_class=True. Macro and weighted variants interpolate per-class precision over a shared recall grid; micro ravels the binarized labels into a single curve.

"macro"
annotate_ap bool

When True, injects one text label per class near the lower-right corner of the plot showing average precision (AP) to 3 decimal places.

True
iso_lines bool

When True, overlays F-score iso-contours at F={0.2, 0.4, 0.6, 0.8} so users can read off combined precision-recall quality directly from the chart.

False
subtitle str or None

Optional subtitle rendered beneath the active chart title.

None
compare dict of str -> estimator or None

Additional estimators to overlay. Keys become model labels. model is treated as the base model (label "base").

None
random_state int or None

Seed forwarded to ModelSource.

None
theme Theme or None

Ferrum theme to apply to the returned chart.

None
mark dict

Per-layer mark overrides. For composite-mark charts, keys are layer names (e.g. {"scatter": {"opacity": 0.5}}); for single-mark charts, a flat dict of mark properties.

None
encode dict

Additional encoding kwargs merged via Chart.encode(**encode).

None
properties dict

Chart properties merged via Chart.properties(**properties) (e.g. {"width": 400, "title": "My chart"}).

None
layers list

Extra layers appended via Chart.layer(*layers).

None

Returns:

Type Description
Chart

Precision-recall curve chart with one line per class (or an averaged summary curve).

Examples:

>>> import ferrum as fm
>>> from sklearn.linear_model import LogisticRegression
>>> fm.pr_chart(LogisticRegression().fit(X_train, y_train), X_test, y_test)

Precomputed path:

>>> fm.pr_chart(y_true=y_test, y_pred=clf.predict_proba(X_test))
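
The iso-contours enabled by iso_lines follow from the F1 definition F = 2PR / (P + R): solving for precision gives P = F·R / (2R − F), which is only attainable where R > F / 2. A minimal sketch of that relationship, assuming the F1 form of the F-score and a simple recall grid (iso_f_precision and iso_f_contour are hypothetical names, not ferrum functions):

```python
def iso_f_precision(f, recall):
    """Precision giving F1 == f at this recall, or None if unattainable."""
    denom = 2 * recall - f
    if denom <= 0:
        return None  # recall too low: no precision can reach this F
    p = f * recall / denom
    return p if p <= 1.0 else None  # precision cannot exceed 1


def iso_f_contour(f, n=100):
    """(recall, precision) points along the F1 == f contour."""
    points = []
    for k in range(1, n + 1):
        r = k / n
        p = iso_f_precision(f, r)
        if p is not None:
            points.append((r, p))
    return points


levels = (0.2, 0.4, 0.6, 0.8)
contours = {f: iso_f_contour(f) for f in levels}
for f in levels:
    r, p = contours[f][-1]
    print(f"F={f}: at recall {r:.2f} the contour sits at precision {p:.3f}")
```

Any point on the measured PR curve that lies above a contour has an F1 greater than that contour's level.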

calibration_chart

calibration_chart(model: Any = None, X: Any = None, y: Any = None, *, y_true: Any = None, y_pred: Any = None, n_bins: int = 10, strategy: str = 'uniform', annotate_brier: bool = True, subtitle: str | None = None, compare: dict[str, Any] | None = None, random_state: int | None = None, mark: dict | None = None, encode: dict | None = None, properties: dict | None = None, layers: list | None = None, theme: Any = None) -> 'Chart'

Calibration (reliability) curve for one or more classifiers.

Plots mean predicted probability vs. fraction of positives in each bin, following sklearn's calibration_curve convention. Supports multi-model comparison via compare=.

Parameters:

Name Type Description Default
model estimator, ModelSource, or dict of str -> estimator

Fitted sklearn-compatible classifier, an explicit ferrum.ModelSource, or a dict of named estimators for comparison. When a dict is passed, each estimator is evaluated and curves are overlaid.

None
X array-like

Feature matrix. Required when model is a raw estimator.

None
y array-like

True binary labels. Required when model is a raw estimator.

None
y_true array-like

Ground-truth class labels for the precomputed path. Must be paired with y_pred; mutually exclusive with model.

None
y_pred array-like

Soft scores / probabilities for the precomputed path. 1-D for binary classifiers (positive-class scores); 2-D (n_samples, n_classes) for multiclass.

None
n_bins int

Number of probability bins for the reliability diagram.

10
strategy ('uniform', 'quantile')

Binning strategy forwarded to sklearn.calibration.calibration_curve. "uniform" uses equally-spaced bins; "quantile" uses equal-frequency bins.

"uniform"
annotate_brier bool

When True, attaches the BrierLabel composite (spec 3.11), which shows the Brier score per series. The chart title also encodes the Brier value when exactly one model is shown.

True
subtitle str or None

Optional one-line subtitle drawn beneath the chart title.

None
compare dict of str -> estimator or None

Additional estimators to overlay. Keys become model labels. model is treated as the base model (label "base").

None
random_state int or None

Seed forwarded to ModelSource.

None
theme Theme or None

Ferrum theme to apply to the returned chart.

None
mark dict

Per-layer mark overrides. For composite-mark charts, keys are layer names (e.g. {"scatter": {"opacity": 0.5}}); for single-mark charts, a flat dict of mark properties.

None
encode dict

Additional encoding kwargs merged via Chart.encode(**encode).

None
properties dict

Chart properties merged via Chart.properties(**properties) (e.g. {"width": 400, "title": "My chart"}).

None
layers list

Extra layers appended via Chart.layer(*layers).

None

Returns:

Type Description
Chart

Reliability diagram with one curve per model plus a perfect-calibration diagonal reference.

Examples:

>>> import ferrum as fm
>>> from sklearn.linear_model import LogisticRegression
>>> fm.calibration_chart(LogisticRegression().fit(X_train, y_train), X_test, y_test)

Precomputed path (y_pred = 1-D predicted probabilities for positive class):

>>> fm.calibration_chart(y_true=y_test, y_pred=clf.predict_proba(X_test)[:, 1])
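
For intuition about what the reliability curve and Brier annotation represent, here is a standard-library sketch of uniform binning and the Brier score. It is an illustration under simplifying assumptions, not ferrum's or sklearn's code: in particular, the handling of probabilities that fall exactly on a bin edge may differ from sklearn.calibration.calibration_curve, and reliability_curve and brier_score are hypothetical names.

```python
def reliability_curve(y_true, y_prob, n_bins=10):
    """Return (mean predicted probability, fraction of positives) per non-empty bin."""
    sums = [0.0] * n_bins   # sum of predicted probabilities per bin
    hits = [0] * n_bins     # number of positives per bin
    counts = [0] * n_bins   # number of samples per bin
    for t, p in zip(y_true, y_prob):
        b = min(int(p * n_bins), n_bins - 1)  # p == 1.0 falls in the last bin
        sums[b] += p
        hits[b] += t
        counts[b] += 1
    mean_pred = [sums[b] / counts[b] for b in range(n_bins) if counts[b]]
    frac_pos = [hits[b] / counts[b] for b in range(n_bins) if counts[b]]
    return mean_pred, frac_pos


def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - t) ** 2 for t, p in zip(y_true, y_prob)) / len(y_true)


y_true = [0, 1, 1, 1]
y_prob = [0.25, 0.25, 0.75, 0.75]
mean_pred, frac_pos = reliability_curve(y_true, y_prob, n_bins=2)
print(mean_pred, frac_pos, brier_score(y_true, y_prob))
```

A perfectly calibrated model would place every (mean_pred, frac_pos) pair on the diagonal; here both bins sit above it, so the model under-predicts the positive rate.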

gain_chart

gain_chart(model: Any = None, X: Any = None, y: Any = None, *, y_true: Any = None, y_pred: Any = None, compare: dict[str, Any] | None = None, subtitle: str | None = None, random_state: int | None = None, mark: dict | None = None, encode: dict | None = None, properties: dict | None = None, layers: list | None = None, theme: Any = None) -> 'Chart'

Cumulative-gain curve for a classifier.

Plots the fraction of positive cases captured vs. the fraction of samples scored, one curve per class. Useful for evaluating the benefit of targeting a top-ranked subset. The categorical legend is always replaced with endpoint-anchored direct labels, matching the learning_curve, validation_curve, and lift sibling figures (Schwabish C8 audit-rework, 2026-05-12). To keep the legend, use Chart(df).mark_gain(...) directly.

Parameters:

Name Type Description Default
model estimator or ModelSource

Fitted sklearn-compatible classifier or an explicit ferrum.ModelSource.

None
X array-like

Feature matrix. Required when model is a raw estimator.

None
y array-like

True class labels. Required when model is a raw estimator.

None
y_true array-like

Ground-truth class labels for the precomputed path. Must be paired with y_pred; mutually exclusive with model.

None
y_pred array-like

Soft scores / probabilities for the precomputed path. 1-D for binary classifiers (positive-class scores); 2-D (n_samples, n_classes) for multiclass.

None
compare dict[str, estimator] or None

Multi-model overlay. Keys are display names; values are fitted estimators. Routes through _resolve_source -> ComparedModelSource.

None
subtitle str or None

Optional subtitle rendered beneath the active chart title.

None
random_state int or None

Seed forwarded to ModelSource.

None
theme Theme or None

Ferrum theme to apply to the returned chart.

None
mark dict

Per-layer mark overrides. For composite-mark charts, keys are layer names (e.g. {"scatter": {"opacity": 0.5}}); for single-mark charts, a flat dict of mark properties.

None
encode dict

Additional encoding kwargs merged via Chart.encode(**encode).

None
properties dict

Chart properties merged via Chart.properties(**properties) (e.g. {"width": 400, "title": "My chart"}).

None
layers list

Extra layers appended via Chart.layer(*layers).

None

Returns:

Type Description
Chart

Cumulative-gain curve with one line per class.

Examples:

>>> import ferrum as fm
>>> from sklearn.linear_model import LogisticRegression
>>> fm.gain_chart(LogisticRegression().fit(X_train, y_train), X_test, y_test)

Precomputed path (y_pred = soft scores, 1-D binary or 2-D multiclass):

>>> fm.gain_chart(y_true=y_test, y_pred=clf.predict_proba(X_test))
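
The quantity on each axis can be reproduced by hand. The sketch below ranks samples by score and accumulates the share of positives captured at each depth; it assumes binary labels with 1 as the positive class and is not ferrum's implementation (cumulative_gain is a hypothetical name).

```python
def cumulative_gain(y_true, scores):
    """Return (fraction of samples scored, fraction of positives captured)."""
    n = len(y_true)
    total_pos = sum(y_true)
    # Highest-scoring samples are targeted first.
    order = sorted(range(n), key=lambda i: -scores[i])
    frac_scored, gain = [0.0], [0.0]
    captured = 0
    for rank, i in enumerate(order, start=1):
        captured += y_true[i]
        frac_scored.append(rank / n)
        gain.append(captured / total_pos)
    return frac_scored, gain


y_true = [1, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.1]
frac_scored, gain = cumulative_gain(y_true, scores)
for x, g in zip(frac_scored, gain):
    print(f"top {x:.0%} of samples captures {g:.0%} of positives")
```

A random ranking would follow the diagonal gain == frac_scored, so the vertical distance above that diagonal is the benefit of targeting by model score.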

lift_chart

lift_chart(model: Any = None, X: Any = None, y: Any = None, *, y_true: Any = None, y_pred: Any = None, compare: dict[str, Any] | None = None, subtitle: str | None = None, random_state: int | None = None, mark: dict | None = None, encode: dict | None = None, properties: dict | None = None, layers: list | None = None, theme: Any = None) -> 'Chart'

Lift curve for a classifier.

Plots the ratio of the positive-hit rate in the scored top-n to the random baseline, one curve per class. Values above 1 indicate the model outperforms random selection at that depth. The categorical legend is always replaced with endpoint-anchored direct labels (Schwabish C8 audit-rework, 2026-05-12). To keep the legend, use Chart(df).mark_lift(...) directly.

Parameters:

Name Type Description Default
model estimator or ModelSource

Fitted sklearn-compatible classifier or an explicit ferrum.ModelSource.

None
X array-like

Feature matrix. Required when model is a raw estimator.

None
y array-like

True class labels. Required when model is a raw estimator.

None
y_true array-like

Ground-truth class labels for the precomputed path. Must be paired with y_pred; mutually exclusive with model.

None
y_pred array-like

Soft scores / probabilities for the precomputed path. 1-D for binary classifiers (positive-class scores); 2-D (n_samples, n_classes) for multiclass.

None
compare dict[str, estimator] or None

Multi-model overlay. Keys are display names; values are fitted estimators. Routes through _resolve_source -> ComparedModelSource.

None
subtitle str or None

Optional subtitle rendered beneath the active chart title.

None
random_state int or None

Seed forwarded to ModelSource.

None
theme Theme or None

Ferrum theme to apply to the returned chart.

None
mark dict

Per-layer mark overrides. For composite-mark charts, keys are layer names (e.g. {"scatter": {"opacity": 0.5}}); for single-mark charts, a flat dict of mark properties.

None
encode dict

Additional encoding kwargs merged via Chart.encode(**encode).

None
properties dict

Chart properties merged via Chart.properties(**properties) (e.g. {"width": 400, "title": "My chart"}).

None
layers list

Extra layers appended via Chart.layer(*layers).

None

Returns:

Type Description
Chart

Lift curve with one line per class.

Examples:

>>> import ferrum as fm
>>> from sklearn.linear_model import LogisticRegression
>>> fm.lift_chart(LogisticRegression().fit(X_train, y_train), X_test, y_test)

Precomputed path (y_pred = soft scores, 1-D binary or 2-D multiclass):

>>> fm.lift_chart(y_true=y_test, y_pred=clf.predict_proba(X_test))
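
As a worked illustration of the ratio described above, lift at a given depth is the positive rate among the top-ranked samples divided by the overall positive rate. This sketch assumes binary labels with 1 as the positive class; lift_curve is a hypothetical helper, not part of ferrum.

```python
def lift_curve(y_true, scores):
    """Return (depth, lift) lists, where depth is the fraction of samples scored."""
    n = len(y_true)
    base_rate = sum(y_true) / n  # hit rate of random selection
    order = sorted(range(n), key=lambda i: -scores[i])
    depth, lift = [], []
    hits = 0
    for rank, i in enumerate(order, start=1):
        hits += y_true[i]
        depth.append(rank / n)
        lift.append((hits / rank) / base_rate)
    return depth, lift


y_true = [1, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.1]
depth, lift = lift_curve(y_true, scores)
for d, l in zip(depth, lift):
    print(f"depth {d:.2f}: lift {l:.2f}")
```

Lift always ends at 1 when every sample is scored, because the top-n hit rate then equals the base rate.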

discrimination_threshold_chart

discrimination_threshold_chart(model: Any = None, X: Any = None, y: Any = None, *, y_true: Any = None, y_pred: Any = None, n_thresholds: int = 50, metrics: tuple[str, ...] = ('precision', 'recall', 'f1', 'queue_rate'), cv: Any = None, threshold_line: bool = False, optimum_label: bool = True, compare: dict[str, Any] | None = None, subtitle: str | None = None, random_state: int | None = None, mark: dict | None = None, encode: dict | None = None, properties: dict | None = None, layers: list | None = None, theme: Any = None) -> 'Chart'

Discrimination-threshold sweep chart for a binary classifier.

Plots multiple classification metrics (precision, recall, F1, queue rate) as a function of the decision threshold, allowing users to select an operating point that balances competing objectives. The underlying data is unpivoted to long form for multi-metric rendering.

Parameters:

Name Type Description Default
model estimator or ModelSource

Fitted sklearn-compatible binary classifier or an explicit ferrum.ModelSource.

None
X array-like

Feature matrix. Required when model is a raw estimator.

None
y array-like

True binary labels. Required when model is a raw estimator.

None
y_true array-like

Ground-truth class labels for the precomputed path. Must be paired with y_pred; mutually exclusive with model.

None
y_pred array-like

Soft scores / probabilities for the precomputed path. 1-D for binary classifiers (positive-class scores); 2-D (n_samples, n_classes) for multiclass.

None
n_thresholds int

Number of evenly spaced threshold values in [0, 1] to evaluate.

50
metrics tuple of str

Metric names to plot. Each must be a column in the threshold sweep DataFrame.

("precision", "recall", "f1", "queue_rate")
cv int, cross-validator, or None

When provided, the threshold sweep is computed via cross-validation rather than a single train/test split.

None
threshold_line bool

When True, injects a vertical reference rule at the threshold that maximises F1.

False
optimum_label bool

When True, overlays a text annotation at the F1-optimum point showing the threshold and F1 value (e.g. "max F1 = 0.872 @ t=0.43"). Composes with threshold_line; either can be enabled independently.

True
compare dict[str, estimator] or None

Multi-model overlay. Keys are display names; values are fitted estimators. Routes through _resolve_source -> ComparedModelSource.

None
subtitle str or None

Optional subtitle rendered beneath the active chart title.

None
random_state int or None

Seed forwarded to ModelSource.

None
theme Theme or None

Ferrum theme to apply to the returned chart.

None
mark dict

Per-layer mark overrides. For composite-mark charts, keys are layer names (e.g. {"scatter": {"opacity": 0.5}}); for single-mark charts, a flat dict of mark properties.

None
encode dict

Additional encoding kwargs merged via Chart.encode(**encode).

None
properties dict

Chart properties merged via Chart.properties(**properties) (e.g. {"width": 400, "title": "My chart"}).

None
layers list

Extra layers appended via Chart.layer(*layers).

None

Returns:

Type Description
Chart

Multi-metric line chart over the threshold range.

Examples:

>>> import ferrum as fm
>>> from sklearn.linear_model import LogisticRegression
>>> fm.discrimination_threshold_chart(LogisticRegression().fit(X_train, y_train), X_test, y_test)

Precomputed path (y_pred = 1-D positive-class scores; cv= not supported):

>>> fm.discrimination_threshold_chart(y_true=y_test, y_pred=clf.predict_proba(X_test)[:, 1])
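
A threshold sweep of this kind can be sketched with the standard library alone. The example below is illustrative rather than ferrum's implementation, and it makes two assumptions worth flagging: queue rate is taken to be the fraction of samples predicted positive (the Yellowbrick convention), and precision is set to 1.0 when nothing is predicted positive. threshold_sweep is a hypothetical name, and n_thresholds is assumed to be at least 2.

```python
def threshold_sweep(y_true, scores, n_thresholds=50):
    """Return one dict of metrics per threshold, evenly spaced over [0, 1]."""
    rows = []
    n = len(y_true)
    for k in range(n_thresholds):
        t = k / (n_thresholds - 1)
        pred = [1 if s >= t else 0 for s in scores]
        tp = sum(1 for y, p in zip(y_true, pred) if y == 1 and p == 1)
        fp = sum(1 for y, p in zip(y_true, pred) if y == 0 and p == 1)
        fn = sum(1 for y, p in zip(y_true, pred) if y == 1 and p == 0)
        precision = tp / (tp + fp) if tp + fp else 1.0  # assumed convention
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        rows.append({"threshold": t, "precision": precision, "recall": recall,
                     "f1": f1, "queue_rate": (tp + fp) / n})
    return rows


y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
rows = threshold_sweep(y_true, scores, n_thresholds=5)
best = max(rows, key=lambda r: r["f1"])  # the F1-optimum operating point
print(f"max F1 = {best['f1']:.3f} @ t={best['threshold']:.2f}")
```

The row selected as best corresponds to the point that threshold_line and optimum_label mark on the chart.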

confusion_matrix_chart

confusion_matrix_chart(model: Any = None, X: Any = None, y: Any = None, *, y_true: Any = None, y_pred: Any = None, normalize: str | None = 'true', annotate: bool = True, random_state: int | None = None, mark: dict | None = None, encode: dict | None = None, properties: dict | None = None, layers: list | None = None, theme: Any = None) -> 'Chart'

Confusion matrix heatmap for a classifier.

Renders an ordinal heatmap of (actual class, predicted class) cell values with optional per-cell text annotations. Normalization follows sklearn's confusion_matrix conventions.

Parameters:

Name Type Description Default
model estimator or ModelSource

Fitted sklearn-compatible classifier or an explicit ferrum.ModelSource.

None
X array-like

Feature matrix. Required when model is a raw estimator.

None
y array-like

True class labels. Required when model is a raw estimator.

None
y_true array-like

Ground-truth class labels for the precomputed path. Must be paired with y_pred; mutually exclusive with model.

None
y_pred array-like

Predicted class labels for the precomputed path.

None
normalize ('true', 'pred', 'all') or None

Normalization scheme for cell values. None shows raw counts; "true" normalizes over actual classes (rows); "pred" normalizes over predicted classes (columns); "all" normalizes over the total number of samples.

"true"
annotate bool

When True, overlays the numeric cell value as a text label in each cell.

True
random_state int or None

Seed forwarded to ModelSource.

None
theme Theme or None

Ferrum theme to apply to the returned chart.

None
mark dict

Per-layer mark overrides. For composite-mark charts, keys are layer names (e.g. {"scatter": {"opacity": 0.5}}); for single-mark charts, a flat dict of mark properties.

None
encode dict

Additional encoding kwargs merged via Chart.encode(**encode).

None
properties dict

Chart properties merged via Chart.properties(**properties) (e.g. {"width": 400, "title": "My chart"}).

None
layers list

Extra layers appended via Chart.layer(*layers).

None

Returns:

Type Description
Chart

Confusion matrix heatmap with optional cell annotations.

Examples:

>>> import ferrum as fm
>>> from sklearn.linear_model import LogisticRegression
>>> fm.confusion_matrix_chart(LogisticRegression().fit(X_train, y_train), X_test, y_test)

Precomputed path (y_pred = 1-D hard class labels):

>>> fm.confusion_matrix_chart(y_true=y_test, y_pred=clf.predict(X_test))
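
The four normalize settings differ only in the denominator applied to each cell. A small standard-library sketch of those conventions (rows are actual classes, columns are predicted classes; confusion_matrix and normalize_matrix here are illustrative helpers, not ferrum or sklearn functions):

```python
def confusion_matrix(y_true, y_pred, labels):
    """Count matrix with actual classes as rows and predicted classes as columns."""
    index = {label: k for k, label in enumerate(labels)}
    m = [[0] * len(labels) for _ in labels]
    for t, p in zip(y_true, y_pred):
        m[index[t]][index[p]] += 1
    return m


def normalize_matrix(m, mode):
    """Apply None / 'true' / 'pred' / 'all' normalization to a square matrix."""
    if mode is None:
        return [row[:] for row in m]  # raw counts
    if mode == "true":  # each row (actual class) sums to 1
        return [[v / (sum(row) or 1) for v in row] for row in m]
    if mode == "pred":  # each column (predicted class) sums to 1
        col = [sum(row[j] for row in m) or 1 for j in range(len(m))]
        return [[v / col[j] for j, v in enumerate(row)] for row in m]
    if mode == "all":  # the whole matrix sums to 1
        total = sum(map(sum, m)) or 1
        return [[v / total for v in row] for row in m]
    raise ValueError(f"unknown normalize mode: {mode!r}")


y_true = ["a", "a", "a", "b"]
y_pred = ["a", "a", "b", "b"]
m = confusion_matrix(y_true, y_pred, labels=["a", "b"])
for mode in (None, "true", "pred", "all"):
    print(mode, normalize_matrix(m, mode))
```

With "true", the diagonal reads as per-class recall; with "pred", it reads as per-class precision.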

class_prediction_error_chart

class_prediction_error_chart(model: Any = None, X: Any = None, y: Any = None, *, y_true: Any = None, y_pred: Any = None, normalize: bool = False, show_counts: bool = True, random_state: int | None = None, mark: dict | None = None, encode: dict | None = None, properties: dict | None = None, layers: list | None = None, theme: Any = None) -> 'Chart'

Class prediction-error stacked bar chart for a classifier.

One bar per predicted class, stacked by actual class so misclassified segments are visually distinct. The data source is the unnormalized confusion matrix reshaped to long form.

Parameters:

Name Type Description Default
model estimator or ModelSource

Fitted sklearn-compatible classifier or an explicit ferrum.ModelSource.

None
X array-like

Feature matrix. Required when model is a raw estimator.

None
y array-like

True class labels. Required when model is a raw estimator.

None
y_true array-like

Ground-truth class labels for the precomputed path. Must be paired with y_pred; mutually exclusive with model.

None
y_pred array-like

Predicted class labels for the precomputed path.

None
normalize bool

When True, each bar is normalized to 100% (relative composition). When False, bars show raw sample counts.

False
show_counts bool

When True, overlays per-segment count text at the vertical centre of each bar segment (empty segments -- value == 0 -- are skipped). Raw counts are shown regardless of normalize.

True
random_state int or None

Seed forwarded to ModelSource.

None
theme Theme or None

Ferrum theme to apply to the returned chart.

None
mark dict

Per-layer mark overrides. For composite-mark charts, keys are layer names (e.g. {"scatter": {"opacity": 0.5}}); for single-mark charts, a flat dict of mark properties.

None
encode dict

Additional encoding kwargs merged via Chart.encode(**encode).

None
properties dict

Chart properties merged via Chart.properties(**properties) (e.g. {"width": 400, "title": "My chart"}).

None
layers list

Extra layers appended via Chart.layer(*layers).

None

Returns:

Type Description
Chart

Stacked bar chart with one bar per predicted class.

Examples:

>>> import ferrum as fm
>>> from sklearn.linear_model import LogisticRegression
>>> fm.class_prediction_error_chart(LogisticRegression().fit(X_train, y_train), X_test, y_test)

Precomputed path (y_pred = 1-D hard class labels):

>>> fm.class_prediction_error_chart(y_true=y_test, y_pred=clf.predict(X_test))
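
The long-form reshaping described above can be pictured as one row per (predicted class, actual class) pair. The following sketch is an assumption about the general shape of that data, not ferrum's internal schema; the column names and the prediction_error_long helper are hypothetical.

```python
from collections import Counter


def prediction_error_long(y_true, y_pred, normalize=False):
    """One row per non-empty (predicted, actual) segment of the stacked bars."""
    counts = Counter(zip(y_pred, y_true))  # (predicted, actual) -> count
    bar_totals = Counter(y_pred)           # total height of each predicted-class bar
    rows = []
    for (pred, actual), n in sorted(counts.items()):
        height = n / bar_totals[pred] if normalize else n
        rows.append({"predicted": pred, "actual": actual,
                     "count": n, "height": height})
    return rows


y_true = ["a", "a", "a", "b"]
y_pred = ["a", "a", "b", "b"]
for row in prediction_error_long(y_true, y_pred, normalize=True):
    print(row)
```

Keeping count separate from height matches the documented behavior that raw counts are shown in the labels regardless of normalize; segments with zero samples never appear because Counter only stores observed pairs.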

classification_report_chart

classification_report_chart(model: Any = None, X: Any = None, y: Any = None, *, random_state: int | None = None, mark: dict | None = None, encode: dict | None = None, properties: dict | None = None, layers: list | None = None, theme: Any = None) -> 'Chart'

Per-class precision / recall / F1 heatmap for a classifier.

Renders a rect-plus-text heatmap where rows are class labels, columns are metrics (precision, recall, f1-score), and cell color encodes the metric value. Each cell is annotated with its value to two decimal places.

Parameters:

Name Type Description Default
model estimator or ModelSource

A fitted sklearn-compatible classifier that exposes predict, or an explicit ferrum.ModelSource.

None
X array-like

Feature matrix. Required when model is a raw estimator; ignored when it is already a ModelSource.

None
y array-like

True labels. Required when model is a raw estimator; ignored when it is already a ModelSource.

None
random_state int or None

Seed forwarded to ModelSource.

None
theme Theme or None

Ferrum theme to apply to the returned chart.

None
mark dict

Per-layer mark overrides. For composite-mark charts, keys are layer names (e.g. {"scatter": {"opacity": 0.5}}); for single-mark charts, a flat dict of mark properties.

None
encode dict

Additional encoding kwargs merged via Chart.encode(**encode).

None
properties dict

Chart properties merged via Chart.properties(**properties) (e.g. {"width": 400, "title": "My chart"}).

None
layers list

Extra layers appended via Chart.layer(*layers).

None

Returns:

Type Description
Chart

Heatmap with per-class precision / recall / F1-score cells.

Examples:

>>> import ferrum as fm
>>> fm.classification_report_chart(clf, X_test, y_test)
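
The cell values in such a heatmap are the usual per-class metrics. As a hedged illustration (standard definitions, computed by hand rather than through ferrum or sklearn; per_class_report is a hypothetical name), the values and their two-decimal annotations can be derived like this:

```python
def per_class_report(y_true, y_pred):
    """Per-class precision, recall and F1 from hard label predictions."""
    pairs = list(zip(y_true, y_pred))
    report = {}
    for c in sorted(set(y_true) | set(y_pred)):
        tp = sum(1 for t, p in pairs if t == c and p == c)
        fp = sum(1 for t, p in pairs if t != c and p == c)
        fn = sum(1 for t, p in pairs if t == c and p != c)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        report[c] = {"precision": precision, "recall": recall, "f1-score": f1}
    return report


y_true = ["a", "a", "a", "b"]
y_pred = ["a", "a", "b", "b"]
report = per_class_report(y_true, y_pred)
# Text annotations, formatted to two decimal places as in the heatmap cells.
cells = {c: {k: f"{v:.2f}" for k, v in metrics.items()}
         for c, metrics in report.items()}
print(cells)
```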

class_balance_chart

class_balance_chart(y: Any, *, mark: dict | None = None, encode: dict | None = None, properties: dict | None = None, layers: list | None = None, theme: Any = None) -> 'Chart'

Bar chart of per-class label counts.

Computes the count of each unique class label and renders a vertical bar chart. No model is required — operates on the target array alone.

Parameters:

Name Type Description Default
y array-like

Target label array (1-D). Accepts polars Series, numpy arrays, and any iterable convertible to a flat list.

required
theme Theme or None

Ferrum theme to apply to the returned chart.

None
mark dict

Per-layer mark overrides. For composite-mark charts, keys are layer names (e.g. {"scatter": {"opacity": 0.5}}); for single-mark charts, a flat dict of mark properties.

None
encode dict

Additional encoding kwargs merged via Chart.encode(**encode).

None
properties dict

Chart properties merged via Chart.properties(**properties) (e.g. {"width": 400, "title": "My chart"}).

None
layers list

Extra layers appended via Chart.layer(*layers).

None

Returns:

Type Description
Chart

Vertical bar chart with class labels on x and counts on y.

Examples:

>>> import ferrum as fm
>>> fm.class_balance_chart(y_train)
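
The underlying aggregation is a plain frequency count. A minimal sketch with the standard library, assuming the input is any flat iterable of hashable labels (class_counts is a hypothetical helper, not a ferrum function):

```python
from collections import Counter


def class_counts(y):
    """Return (label, count) pairs sorted by label."""
    return sorted(Counter(list(y)).items())


y = ["cat", "dog", "cat", "bird", "cat"]
counts = class_counts(y)
largest = max(n for _, n in counts)
smallest = min(n for _, n in counts)
print(counts, f"imbalance ratio = {largest / smallest:.1f}")
```

The ratio of the largest to the smallest class is one quick way to summarize what the bar heights show.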

residuals_chart

residuals_chart(model: Any = None, X: Any = None, y: Any = None, *, y_true: Any = None, y_pred: Any = None, kind: str = 'studentized', cook_threshold: float | str | None = None, panels: Any = 'auto', annotate_metrics: bool = True, subtitle: str | None = None, random_state: int | None = None, mark: dict | None = None, encode: dict | None = None, properties: dict | None = None, layers: list | None = None, theme: Any = None) -> 'Chart'

Residuals diagnostic chart for a regression estimator.

Plots residuals vs. fitted values. Optional Cook's distance highlighting marks observations whose leverage-adjusted influence exceeds a user-supplied threshold. panels="auto" returns the canonical 4-panel diagnostic layout (residuals-vs-fitted, QQ, scale-location, residuals-vs-leverage) as a 2x2 grid.

Parameters:

Name Type Description Default
model estimator, ModelSource, or dict of str -> estimator

A fitted sklearn-compatible regression estimator, an explicit ferrum.ModelSource, or a dict of named estimators.

None
X array-like

Feature matrix. Required when model is a raw estimator; ignored when it is already a ModelSource.

None
y array-like

Target vector. Required when model is a raw estimator; ignored when it is already a ModelSource.

None
y_true array-like

Ground-truth target values for the precomputed path. Must be paired with y_pred; mutually exclusive with model.

None
y_pred array-like

Predicted target values for the precomputed path.

None
kind ('studentized', 'raw')

Residual type to plot on the y axis. "studentized" uses internally studentized residuals; "raw" uses raw residuals.

"studentized"
cook_threshold float, "auto", or None

Threshold for Cook's distance outlier highlighting. A float is used as an absolute cutoff; "auto" applies the 4 / n rule (Hair et al.); None disables highlighting. Cook's distance is only defined for estimators that expose coef_; non-linear models surface NaN and produce no outliers.

None
panels "auto", None, "single", or list of str

Panel selection. "auto" ships the canonical 4-panel layout -- residuals_vs_fitted, qq, scale_location, and residuals_vs_leverage -- laid out as a 2x2 grid. None or "single" returns just the residuals-vs-fitted panel. Pass an explicit list such as ["residuals_vs_fitted", "qq"] to customize the panel set.

"auto"
annotate_metrics bool

Overlay a top-right corner annotation showing R^2/RMSE/MAE computed from the fit. Pass False to render the residual scatter alone.

True
subtitle str or None

Optional subtitle rendered beneath the active chart title.

None
random_state int or None

Seed forwarded to ModelSource; does not affect deterministic residuals computation.

None
theme Theme or None

Ferrum theme to apply to the returned chart.

None
mark dict

Per-layer mark overrides. For composite-mark charts, keys are layer names (e.g. {"scatter": {"opacity": 0.5}}); for single-mark charts, a flat dict of mark properties.

None
encode dict

Additional encoding kwargs merged via Chart.encode(**encode).

None
properties dict

Chart properties merged via Chart.properties(**properties) (e.g. {"width": 400, "title": "My chart"}).

None
layers list

Extra layers appended via Chart.layer(*layers).

None

Returns:

Type Description
Chart

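Residuals-vs-fitted scatter chart with optional Cook's-distance outlier overlay, or a multi-panel diagnostic grid when panels selects more than one panel (including the default panels="auto").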

Examples:

>>> import ferrum as fm
>>> from sklearn.linear_model import Ridge
>>> fm.residuals_chart(Ridge().fit(X_train, y_train), X_test, y_test)

Precomputed path — residuals are computed as y_true − y_pred internally. Leverage and Cook's distance are unavailable, so the leverage panel is omitted when panels="auto":

>>> fm.residuals_chart(y_true=y_test, y_pred=reg.predict(X_test))
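
To show how internally studentized residuals, leverage, and Cook's distance relate, the sketch below works them out for a one-feature least-squares fit using textbook formulas. It is a simplified illustration, not ferrum's implementation: it assumes a single predictor with an intercept (so p = 2 fitted parameters), and ols_diagnostics is a hypothetical name.

```python
import math


def ols_diagnostics(x, y):
    """Residuals, leverage, studentized residuals and Cook's distance for y ~ a + b*x."""
    n, p = len(x), 2  # p counts the intercept and the slope
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    intercept = y_bar - slope * x_bar
    resid = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    s2 = sum(e * e for e in resid) / (n - p)  # residual variance estimate
    leverage = [1 / n + (xi - x_bar) ** 2 / sxx for xi in x]
    # Internally studentized: scale each residual by its own standard error.
    student = [e / math.sqrt(s2 * (1 - h)) for e, h in zip(resid, leverage)]
    cooks = [r * r / p * h / (1 - h) for r, h in zip(student, leverage)]
    return resid, leverage, student, cooks


x = [1, 2, 3, 4, 5, 10]
y = [1.1, 1.9, 3.2, 3.9, 5.1, 7.0]
resid, leverage, student, cooks = ols_diagnostics(x, y)
threshold = 4 / len(x)  # the 4 / n rule used by cook_threshold="auto"
flagged = [i for i, d in enumerate(cooks) if d > threshold]
print("flagged observations:", flagged)
```

The point at x = 10 sits far from the other predictors, so its leverage is high and it is the only observation exceeding the 4 / n cutoff.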

prediction_error_chart

prediction_error_chart(model: Any = None, X: Any = None, y: Any = None, *, y_true: Any = None, y_pred: Any = None, reference_line: bool = True, ci: float | None = None, reference_band: bool = False, random_state: int | None = None, mark: dict | None = None, encode: dict | None = None, properties: dict | None = None, layers: list | None = None, theme: Any = None) -> 'Chart'

Actual-vs-predicted scatter for a regression estimator.

Plots y_true on the y axis against y_pred on the x axis with an optional reference line (y = x diagonal) and an optional residual-based confidence ribbon.

Parameters:

Name Type Description Default
model estimator or ModelSource

A fitted sklearn-compatible regression estimator, an explicit ferrum.ModelSource, or None when pre-computed arrays are supplied via y_true / y_pred.

None
X array-like

Feature matrix. Required when model is a raw estimator; ignored when it is already a ModelSource.

None
y array-like

Target vector. Required when model is a raw estimator; ignored when it is already a ModelSource.

None
y_true array-like

Pre-computed true target values. Use with y_pred to bypass model inference entirely.

None
y_pred array-like

Pre-computed predictions. Use with y_true to bypass model inference entirely.

None
reference_line bool

Overlay the dashed y = x diagonal.

True
ci float or None

Confidence level in (0, 1). When set, overlays a ribbon spanning the central ci fraction of residuals around the reference line. Raises ValueError if not in (0, 1).

None
reference_band bool

When True (and ci is None), overlays a ±1 RMSE ribbon around the reference line.

False
random_state int or None

Seed forwarded to ModelSource.

None
theme Theme or None

Ferrum theme to apply to the returned chart.

None
mark dict

Per-layer mark overrides. For composite-mark charts, keys are layer names (e.g. {"scatter": {"opacity": 0.5}}); for single-mark charts, a flat dict of mark properties.

None
encode dict

Additional encoding kwargs merged via Chart.encode(**encode).

None
properties dict

Chart properties merged via Chart.properties(**properties) (e.g. {"width": 400, "title": "My chart"}).

None
layers list

Extra layers appended via Chart.layer(*layers).

None

Returns:

Type Description
Chart

Actual-vs-predicted scatter with optional reference line and confidence ribbon.

Examples:

>>> import ferrum as fm
>>> fm.prediction_error_chart(model, X_test, y_test, reference_line=True)
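
The two ribbon options (reference_band and ci) can both be derived from the residuals alone. The following is a minimal sketch, not ferrum's implementation: the helper names rmse_band and central_residual_interval are hypothetical, and the linear-interpolation quantile rule is an assumption, since the parameter descriptions do not say which quantile method is used.

```python
import math


def _quantile(sorted_vals, q):
    """Quantile by linear interpolation between closest ranks (assumed rule)."""
    pos = q * (len(sorted_vals) - 1)
    lo = math.floor(pos)
    hi = math.ceil(pos)
    frac = pos - lo
    return sorted_vals[lo] * (1 - frac) + sorted_vals[hi] * frac


def rmse_band(y_true, y_pred):
    """Offsets of a +/-1 RMSE ribbon around the y = x reference line."""
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    rmse = math.sqrt(sum(r * r for r in residuals) / len(residuals))
    return -rmse, rmse


def central_residual_interval(y_true, y_pred, ci):
    """Offsets spanning the central `ci` fraction of the residuals."""
    if not 0.0 < ci < 1.0:
        raise ValueError("ci must be in the open interval (0, 1)")
    residuals = sorted(t - p for t, p in zip(y_true, y_pred))
    lower_q = (1.0 - ci) / 2.0
    upper_q = 1.0 - lower_q
    return _quantile(residuals, lower_q), _quantile(residuals, upper_q)
```

Either pair of offsets would be added to the diagonal to obtain the lower and upper edges of the ribbon.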

cooks_distance_chart

cooks_distance_chart(model: Any = None, X: Any = None, y: Any = None, *, threshold: float | str | None = None, random_state: int | None = None, mark: dict | None = None, encode: dict | None = None, properties: dict | None = None, layers: list | None = None, theme: Any = None) -> 'Chart'

Cook's distance / residuals-vs-leverage diagnostic for a linear estimator.

Renders the residuals-vs-leverage panel with optional Cook's-distance outlier highlighting. Useful for identifying influential observations in a linear regression fit.

Parameters:

Name Type Description Default
model estimator or ModelSource

A fitted sklearn-compatible regression estimator that exposes coef_ (required for leverage-aware Cook's distance), or an explicit ferrum.ModelSource.

None
X array-like

Feature matrix. Required when model is a raw estimator; ignored when it is already a ModelSource.

None
y array-like

Target vector. Required when model is a raw estimator; ignored when it is already a ModelSource.

None
threshold float, "auto", or None

Cook's-distance threshold for outlier highlighting. A float is used as an absolute cutoff; "auto" applies the 4 / n rule (Hair et al.); None disables highlighting.

None
random_state int or None

Seed forwarded to ModelSource.

None
theme Theme or None

Ferrum theme to apply to the returned chart.

None
mark dict

Per-layer mark overrides. For composite-mark charts, keys are layer names (e.g. {"scatter": {"opacity": 0.5}}); for single-mark charts, a flat dict of mark properties.

None
encode dict

Additional encoding kwargs merged via Chart.encode(**encode).

None
properties dict

Chart properties merged via Chart.properties(**properties) (e.g. {"width": 400, "title": "My chart"}).

None
layers list

Extra layers appended via Chart.layer(*layers).

None

Returns:

Type Description
Chart

Residuals-vs-leverage panel with optional Cook's-distance outlier overlay.

Examples:

>>> import ferrum as fm
>>> fm.cooks_distance_chart(linear_model, X_test, y_test, threshold="auto")
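
To make the threshold options concrete, here is a hedged sketch of Cook's distance for the simplest case, an ordinary least-squares fit with one predictor and an intercept. The function names are hypothetical and the closed-form leverage applies only to this single-predictor case; ferrum's own computation may differ.

```python
def resolve_threshold(threshold, n):
    """Map the documented threshold options to a numeric cutoff (or None)."""
    if threshold is None:
        return None
    if threshold == "auto":
        return 4.0 / n  # the 4 / n rule
    return float(threshold)


def cooks_distance_simple_ols(x, y):
    """Cook's distance per observation for y = a + b * x fitted by OLS."""
    n = len(x)
    p = 2  # fitted parameters: intercept and slope
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    intercept = y_bar - slope * x_bar
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    mse = sum(r * r for r in residuals) / (n - p)
    # Leverage (hat-matrix diagonal) for a single predictor plus intercept.
    leverage = [1.0 / n + (xi - x_bar) ** 2 / sxx for xi in x]
    return [
        (r * r / (p * mse)) * h / (1.0 - h) ** 2
        for r, h in zip(residuals, leverage)
    ]
```

Observations whose distance exceeds the resolved cutoff are the ones that would be highlighted as influential.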

lmplot

lmplot(data: Any, *, x: str, y: str, hue: Any = None, col: Any = None, row: Any = None, method: str = 'lm', ci: Any = 95, order: int = 1, scatter: bool = True, scatter_kws: Any = None, line_kws: Any = None, truncate: bool = True, x_bins: Any = None, x_estimator: Any = None, x_jitter: Any = None, logx: bool = False, show_metrics: bool = True, mark: dict | None = None, encode: dict | None = None, properties: dict | None = None, layers: list | None = None, theme: Any = None, **encode_kwargs: Any) -> Chart

Linear (and non-linear) regression scatter overlay.

Builds a layered chart with an optional scatter (mark_point) and a regression fit line, dispatching to the appropriate transform:

  • "lm" -- mark_smooth(method="lm") (polynomial degree controlled by order).
  • "loess" -- mark_smooth(method="loess").
  • "logistic" -- Logistic transform + mark_line.
  • "glm" -- Glm transform + mark_line.
  • "robust" -- Robust transform + mark_line.

Parameters:

Name Type Description Default
data DataFrame-like

Input data accepted by Chart(data).

required
x str

Column name for the horizontal (predictor) axis (required).

required
y str

Column name for the vertical (response) axis (required).

required
hue str or encoding

Column name to map to color; fit lines are drawn per hue level.

None
col str

Column name for faceting across columns.

None
row str

Column name for faceting across rows.

None
method ('lm', 'logistic', 'glm', 'loess', 'robust')

Fitting method.

"lm"
ci int or None

Confidence interval level (0 to 100) shown as a band around the fit line. Pass None to suppress.

95
order int

Polynomial degree forwarded to mark_smooth when method="lm".

1
scatter bool

Include a scatter layer (mark_point). Set to False to show only the fit line.

True
scatter_kws dict

Extra keyword arguments forwarded to the scatter mark_point call (e.g. {"opacity": 0.3, "size": 20}).

None
line_kws dict

Extra keyword arguments forwarded to the regression-line mark call (e.g. {"stroke_width": 3}).

None
truncate bool

When True (default), the fit line spans only the observed data range (min to max of x). When False, raises ValueError because extending the fit line beyond the data range requires Rust-side x_range support (tracked in the design spec WI-7).

True
x_bins any

Forwarded as x_bins to mark_smooth for binning the x-axis before fitting (method="lm" only).

None
x_estimator any

Forwarded as x_estimator to mark_smooth (method="lm" only).

None
x_jitter float or None

When set, applies Jitter(axis="x", width=x_jitter) to the scatter layer.

None
logx bool

Apply a log scale to the x-axis on both scatter and fit layers.

False
show_metrics bool

Overlay a top-right corner annotation with R^2 / RMSE / MAE computed from the OLS fit. Applies only when method="lm" and hue is not set. Silently skipped for the other methods (loess, robust, logistic, glm), whose fits live in a different metric space, and when hue is set, because per-group annotations would crowd the corner.

True
theme Theme

Visual theme applied via Chart.theme().

None
mark dict

Per-layer mark overrides. For composite-mark charts, keys are layer names (e.g. {"scatter": {"opacity": 0.5}}); for single-mark charts, a flat dict of mark properties.

None
encode dict

Additional encoding kwargs merged via Chart.encode(**encode).

None
properties dict

Chart properties merged via Chart.properties(**properties) (e.g. {"width": 400, "title": "My chart"}).

None
layers list

Extra layers appended via Chart.layer(*layers).

None
**encode_kwargs Any

Additional keyword arguments forwarded to Chart.encode().

{}

Returns:

Type Description
Chart

Layered chart (scatter + fit) or fit-only when scatter=False. May be faceted.

Raises:

Type Description
ValueError

If method is not one of the supported values.

ValueError

If truncate=False (extending the fit line beyond the data range is not yet supported).

Examples:

>>> import ferrum as fm
>>> fm.lmplot(df, x="total_bill", y="tip")

Logistic regression with per-sex fit lines:

>>> fm.lmplot(df, x="total_bill", y="smoker_int", method="logistic", hue="sex")

Polynomial fit (degree 2) with no confidence band:

>>> fm.lmplot(df, x="size", y="tip", order=2, ci=None)
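
The corner annotation enabled by show_metrics reports R^2, RMSE, and MAE for the OLS fit. The sketch below shows how those three numbers follow from a degree-1 least-squares fit. It is an illustration only: the function names are hypothetical, and dividing by n (rather than n - 2) in the RMSE is an assumption.

```python
import math


def ols_fit(x, y):
    """Slope and intercept of the least-squares line through (x, y)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def fit_metrics(x, y):
    """R^2, RMSE, and MAE of the fitted line on the same data."""
    slope, intercept = ols_fit(x, y)
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    n = len(y)
    y_bar = sum(y) / n
    ss_res = sum(r * r for r in residuals)
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    return {
        "r2": 1.0 - ss_res / ss_tot,
        "rmse": math.sqrt(ss_res / n),
        "mae": sum(abs(r) for r in residuals) / n,
    }
```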

residplot

residplot(data: Any, *, x: str, y: str, lowess: bool = False, order: int = 1, robust: bool = False, dropna: bool = True, show_metrics: bool = True, zero_line: bool = True, label: Any = None, color: Any = None, mark: dict | None = None, encode: dict | None = None, properties: dict | None = None, layers: list | None = None, theme: Any = None, **encode_kwargs: Any) -> Chart

Residual-diagnostic scatter plot.

Computes regression residuals via Smooth(output="residuals") (or Robust(output="residuals") when robust=True) and plots (x, residual) with mark_point. When lowess=True, a mark_line lowess smoother is layered over the residuals to help diagnose non-linearity.

Parameters:

Name Type Description Default
data DataFrame-like

Input data accepted by Chart(data).

required
x str

Column name for the horizontal axis (predictor; required).

required
y str

Column name used to compute residuals (response; required).

required
lowess bool

Overlay a lowess smoother on the residuals using Smooth(method="loess").

False
order int

Polynomial degree of the regression used to compute residuals.

1
robust bool

Use Robust regression (MM-estimator) instead of OLS when computing residuals. Compatible with show_metrics=True and zero_line=True: the Robust transform accepts the same opt-in kwargs, so both annotations still render.

False
dropna bool

Drop rows where x or y is null before fitting.

True
show_metrics bool

Overlay a top-right corner annotation with R^2 / RMSE / MAE. The metrics are computed inside the Rust Smooth/Robust transform via the inject_metrics=True kwarg, in the same execution pass as the fitted residuals, so the regression is not duplicated on the Python side.

True
zero_line bool

Draw a dashed horizontal reference line at y=0 via the Rust transform's inject_zero_ref=True opt-in.

True
label str

Legend label for the residual series. When set, a constant _label column is injected and mapped to color, producing a single-entry legend.

None
color str or encoding

Column name or constant color forwarded to Chart.encode(color=).

None
theme Theme

Visual theme applied via Chart.theme().

None
mark dict

Per-layer mark overrides. For composite-mark charts, keys are layer names (e.g. {"scatter": {"opacity": 0.5}}); for single-mark charts, a flat dict of mark properties.

None
encode dict

Additional encoding kwargs merged via Chart.encode(**encode).

None
properties dict

Chart properties merged via Chart.properties(**properties) (e.g. {"width": 400, "title": "My chart"}).

None
layers list

Extra layers appended via Chart.layer(*layers).

None
**encode_kwargs Any

Additional keyword arguments forwarded to Chart.encode().

{}

Returns:

Type Description
Chart

Scatter of residuals (possibly with a lowess layer, zero reference line, and corner R^2/RMSE/MAE annotation).

Examples:

>>> import ferrum as fm
>>> fm.residplot(df, x="total_bill", y="tip")

Robust residuals with annotations:

>>> fm.residplot(df, x="size", y="tip", robust=True)
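
As a rough picture of what is plotted, the sketch below drops null rows (the dropna=True behaviour) and then computes the residuals of a degree-1 OLS fit. It runs in plain Python for illustration, whereas the chart computes residuals inside the Smooth or Robust transform; the helper names are hypothetical.

```python
def drop_null_rows(xs, ys):
    """Keep only the rows where both x and y are present."""
    pairs = [(x, y) for x, y in zip(xs, ys) if x is not None and y is not None]
    return [p[0] for p in pairs], [p[1] for p in pairs]


def ols_residuals(xs, ys):
    """Residuals (observed minus fitted) of a straight-line OLS fit."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sxx
    intercept = y_bar - slope * x_bar
    return [y - (intercept + slope * x) for x, y in zip(xs, ys)]
```

With an intercept in the model, OLS residuals sum to zero, which is why the dashed y=0 reference line is a natural baseline for spotting curvature or uneven spread.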

regplot

regplot(data: Any, *, x: str, y: str, hue: Any = None, method: str = 'lm', ci: Any = 95, order: int = 1, scatter: bool = True, scatter_kws: Any = None, line_kws: Any = None, truncate: bool = True, x_jitter: Any = None, mark: dict | None = None, encode: dict | None = None, properties: dict | None = None, layers: list | None = None, theme: Any = None, **encode_kwargs: Any) -> Chart

Axes-level regression scatter plot.

The axes-level equivalent of lmplot. The signature matches lmplot's except that col= and row= are excluded, because regplot does not facet, and x_bins, x_estimator, logx, and show_metrics are not exposed. Every accepted parameter is forwarded to lmplot unchanged.

Parameters:

Name Type Description Default
data DataFrame-like

Input data accepted by Chart(data).

required
x str

Column name for the horizontal (predictor) axis (required).

required
y str

Column name for the vertical (response) axis (required).

required
hue str or encoding

Column name to map to color; fit lines are drawn per hue level.

None
method ('lm', 'logistic', 'glm', 'loess', 'robust')

Fitting method forwarded to lmplot.

"lm"
ci int or None

Confidence interval level (0 to 100) shown as a band around the fit line. Pass None to suppress.

95
order int

Polynomial degree forwarded to mark_smooth when method="lm".

1
scatter bool

Include a scatter layer (mark_point).

True
scatter_kws dict

Extra keyword arguments forwarded to the scatter mark_point call.

None
line_kws dict

Extra keyword arguments forwarded to the regression-line mark call.

None
truncate bool

When True, the fit line spans only the observed data range.

True
x_jitter float or None

When set, applies Jitter(axis="x", width=x_jitter) to the scatter layer.

None
theme Theme

Visual theme applied via Chart.theme().

None
mark dict

Per-layer mark overrides. For composite-mark charts, keys are layer names (e.g. {"scatter": {"opacity": 0.5}}); for single-mark charts, a flat dict of mark properties.

None
encode dict

Additional encoding kwargs merged via Chart.encode(**encode).

None
properties dict

Chart properties merged via Chart.properties(**properties) (e.g. {"width": 400, "title": "My chart"}).

None
layers list

Extra layers appended via Chart.layer(*layers).

None
**encode_kwargs Any

Additional keyword arguments forwarded to Chart.encode().

{}

Returns:

Type Description
Chart

Layered chart (scatter + fit) or fit-only when scatter=False.

Raises:

Type Description
ValueError

If method is not one of the supported values.

ValueError

If truncate=False (extending the fit line beyond the data range is not yet supported).

Examples:

>>> import ferrum as fm
>>> fm.regplot(df, x="total_bill", y="tip")

Robust regression:

>>> fm.regplot(df, x="size", y="tip", method="robust")
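
The forwarding relationship between regplot and lmplot can be pictured as a thin wrapper. Both functions below are hypothetical stand-ins that return a plain dict instead of a Chart; the explicit TypeError for col=/row= is an assumption made for the sketch, not documented behaviour.

```python
SUPPORTED_METHODS = ("lm", "logistic", "glm", "loess", "robust")


def lmplot_stub(data, *, x, y, col=None, row=None, method="lm",
                truncate=True, **kwargs):
    """Stand-in for lmplot: validates arguments and echoes the call."""
    if method not in SUPPORTED_METHODS:
        raise ValueError(f"unsupported method: {method!r}")
    if not truncate:
        raise ValueError("truncate=False is not yet supported")
    return {"x": x, "y": y, "col": col, "row": row, "method": method, **kwargs}


def regplot_stub(data, *, x, y, **kwargs):
    """Stand-in for regplot: same call without faceting arguments."""
    if "col" in kwargs or "row" in kwargs:
        # Assumed behaviour for this sketch only.
        raise TypeError("regplot does not facet; use lmplot for col=/row=")
    return lmplot_stub(data, x=x, y=y, **kwargs)
```

Because validation lives in the wrapped function, both documented ValueError cases surface identically from either entry point.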

importance_chart

importance_chart(model: Any, X: Any = None, y: Any = None, *, method: str = 'builtin', top_k: int | None = 20, orient: str = 'horizontal', error_bars: bool = True, show_values: bool = True, subtitle: str | None = None, random_state: int | None = None, mark: dict | None = None, encode: dict | None = None, properties: dict | None = None, layers: list | None = None, theme: Any = None) -> 'Chart'

Feature-importance bar chart for an estimator.

Extracts feature importances from the model via the selected method and renders a ranked bar chart. Error bars are drawn from importance ± std when available.

Parameters:

Name Type Description Default
model estimator or ModelSource

Fitted sklearn-compatible estimator or an explicit ferrum.ModelSource. The estimator must expose feature_importances_ or coef_ for method="builtin".

required
X array-like

Feature matrix. Required when model is a raw estimator. Also required for method="permutation".

None
y array-like

Target vector. Required when model is a raw estimator. Also required for method="permutation".

None
method ('builtin', 'permutation')

Importance extraction method. "builtin" reads feature_importances_ or coef_ directly (std=0). "permutation" runs sklearn.inspection.permutation_importance and populates std for the error bars.

"builtin"
top_k int or None

Maximum number of features to display, ordered by descending absolute importance. Pass None to show all features.

20
orient ('horizontal', 'vertical')

Bar orientation. "horizontal" maps feature names to the y axis; "vertical" maps them to the x axis.

"horizontal"
error_bars bool

When True, draws ±1 std error bars around each bar. Has no visual effect when method="builtin" (std=0).

True
show_values bool

When True, overlays the numeric importance value as a text label on each bar.

True
subtitle str or None

Optional subtitle rendered beneath the active chart title.

None
random_state int or None

Seed forwarded to ModelSource and to permutation_importance when method="permutation".

None
mark dict

Per-layer mark overrides. For composite-mark charts, keys are layer names (e.g. {"scatter": {"opacity": 0.5}}); for single-mark charts, a flat dict of mark properties.

None
encode dict

Additional encoding kwargs merged via Chart.encode(**encode).

None
properties dict

Chart properties merged via Chart.properties(**properties) (e.g. {"width": 400, "title": "My chart"}).

None
layers list

Extra layers appended via Chart.layer(*layers).

None
theme Theme or None

Ferrum theme to apply to the returned chart.

None

Returns:

Type Description
Chart

Feature-importance bar chart ranked by absolute importance.

Examples:

>>> import ferrum as fm
>>> from sklearn.ensemble import RandomForestClassifier
>>> fm.importance_chart(RandomForestClassifier().fit(X_train, y_train), X_test, y_test)
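
The top_k selection ranks by absolute importance, so large negative coefficients are not pushed to the bottom. A minimal sketch of that ranking, using a hypothetical helper and a plain dict of feature name to importance:

```python
def top_k_features(importances, top_k=20):
    """Rank features by descending absolute importance and keep the top_k."""
    ranked = sorted(
        importances.items(),
        key=lambda item: abs(item[1]),
        reverse=True,
    )
    # top_k=None mirrors the documented "show all features" behaviour.
    return ranked if top_k is None else ranked[:top_k]
```

The signed value is kept in the output, so a chart built from it can still show the direction of each coefficient.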

shap_chart

shap_chart(model: Any, X: Any = None, y: Any = None, *, kind: str = 'beeswarm', max_display: int = 20, sample_idx: int | None = None, order: str = 'abs_mean', background: Any = None, random_state: int | None = None, mark: dict | None = None, encode: dict | None = None, properties: dict | None = None, layers: list | None = None, theme: Any = None) -> 'Chart'

SHAP value chart for an estimator.

Dispatches to one of three chart types based on kind. The beeswarm (default) shows per-sample per-feature SHAP values colored by z-scored feature magnitude. The bar chart aggregates mean absolute SHAP per feature. The waterfall chart shows cumulative contributions for a single sample.

Parameters:

Name Type Description Default
model estimator or ModelSource

Fitted sklearn-compatible estimator or an explicit ferrum.ModelSource. SHAP computation requires a tree-based or kernel-explainer-compatible model.

required
X array-like

Feature matrix. Required when model is a raw estimator.

None
y array-like

Target vector. Required when model is a raw estimator; not used by the SHAP computation itself.

None
kind ('beeswarm', 'bar', 'waterfall')

Chart type. "beeswarm" renders one point per (sample, feature) colored by z-scored feature value. "bar" renders mean(|shap|) per feature as a horizontal bar. "waterfall" renders cumulative per-feature contributions for the sample at sample_idx.

"beeswarm"
max_display int

Maximum number of features to display, selected by the order ranking criterion.

20
sample_idx int or None

Row index (0-based) of the sample to explain. Required when kind="waterfall"; ignored for other kinds.

None
order ('abs_mean', 'max')

Feature ranking criterion across all three kinds. "abs_mean" ranks by mean absolute SHAP value; "max" by max absolute SHAP value. Drives both the top-max_display selection and the bar/waterfall layout order so all three chart types agree on "most important".

"abs_mean"
background array-like or None

Background dataset for kernel SHAP explainers. When None, the full training set is used. Ignored for tree SHAP.

None
random_state int or None

Seed forwarded to ModelSource; SHAP computation itself is deterministic for tree explainers.

None
mark dict

Per-layer mark overrides. For composite-mark charts, keys are layer names (e.g. {"scatter": {"opacity": 0.5}}); for single-mark charts, a flat dict of mark properties.

None
encode dict

Additional encoding kwargs merged via Chart.encode(**encode).

None
properties dict

Chart properties merged via Chart.properties(**properties) (e.g. {"width": 400, "title": "My chart"}).

None
layers list

Extra layers appended via Chart.layer(*layers).

None
theme Theme or None

Ferrum theme to apply to the returned chart.

None

Returns:

Type Description
Chart

SHAP beeswarm, bar, or waterfall chart depending on kind.

Raises:

Type Description
ValueError

If kind="waterfall" and sample_idx is not provided.

ValueError

If kind is not one of "beeswarm", "bar", "waterfall".

Examples:

>>> import ferrum as fm
>>> from sklearn.ensemble import GradientBoostingClassifier
>>> fm.shap_chart(GradientBoostingClassifier().fit(X_train, y_train), X_test, y_test)
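
The two order criteria can disagree: a feature with one large spike ranks high under "max" but may fall behind a consistently moderate feature under "abs_mean". The sketch below illustrates the ranking on a plain list-of-rows SHAP matrix; rank_features is a hypothetical helper, not part of ferrum.

```python
def rank_features(shap_values, feature_names, order="abs_mean", max_display=20):
    """Return up to max_display feature names, most important first."""
    if order not in ("abs_mean", "max"):
        raise ValueError(f"unknown order: {order!r}")
    n_samples = len(shap_values)
    scores = {}
    for j, name in enumerate(feature_names):
        column = [abs(row[j]) for row in shap_values]
        if order == "abs_mean":
            scores[name] = sum(column) / n_samples  # mean |SHAP|
        else:
            scores[name] = max(column)  # max |SHAP|
    ranked = sorted(feature_names, key=lambda name: scores[name], reverse=True)
    return ranked[:max_display]
```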

shap_beeswarm_chart

shap_beeswarm_chart(model: Any, X: Any = None, y: Any = None, *, max_display: int = 20, order: str = 'abs_mean', background: Any = None, per_class: bool = False, zero_line: bool = True, random_state: int | None = None, mark: dict | None = None, encode: dict | None = None, properties: dict | None = None, layers: list | None = None, theme: Any = None) -> 'Chart'

SHAP beeswarm chart -- per-sample SHAP scatter colored by z-scored value.

per_class=True on a multi-class classifier facets the chart by class. per_class=False (default) renders a single panel using the first class, which is the only group for regression and binary models.

Parameters:

Name Type Description Default
model estimator or ModelSource

Fitted sklearn-compatible estimator or an explicit ferrum.ModelSource.

required
X array-like

Feature matrix. Required when model is a raw estimator.

None
y array-like

Target vector. Required when model is a raw estimator.

None
max_display int

Maximum number of features to display.

20
order ('abs_mean', 'max')

Feature ranking criterion. "abs_mean" ranks by mean absolute SHAP value; "max" by max absolute SHAP value.

"abs_mean"
background array-like or None

Background dataset for kernel SHAP explainers. Ignored for tree SHAP.

None
per_class bool

Facet by class on multi-class classifiers.

False
zero_line bool

Overlay a dashed vertical reference rule at shap_value = 0 so the sign of each feature's contribution is immediately legible. Automatically skipped on the multi-panel per_class path. Pass False to suppress.

True
random_state int or None

Seed forwarded to ModelSource.

None
mark dict

Per-layer mark overrides. For composite-mark charts, keys are layer names (e.g. {"scatter": {"opacity": 0.5}}); for single-mark charts, a flat dict of mark properties.

None
encode dict

Additional encoding kwargs merged via Chart.encode(**encode).

None
properties dict

Chart properties merged via Chart.properties(**properties) (e.g. {"width": 400, "title": "My chart"}).

None
layers list

Extra layers appended via Chart.layer(*layers).

None
theme Theme or None

Ferrum theme to apply to the returned chart.

None

Returns:

Type Description
Chart

SHAP beeswarm chart.

Examples:

>>> import ferrum as fm
>>> from sklearn.ensemble import GradientBoostingClassifier
>>> fm.shap_beeswarm_chart(GradientBoostingClassifier().fit(X_train, y_train), X_test, y_test)
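
Beeswarm points are colored by the z-scored feature value, which puts features with very different units on one color scale. A sketch of per-column z-scoring follows; the helper is hypothetical, and both the use of the population standard deviation and the zero fallback for constant columns are assumptions.

```python
import statistics


def zscore_columns(X):
    """Z-score each feature column of a list-of-rows matrix."""
    n_features = len(X[0])
    out = [[0.0] * n_features for _ in X]
    for j in range(n_features):
        column = [row[j] for row in X]
        mean = statistics.fmean(column)
        std = statistics.pstdev(column)  # population std (assumed)
        for i, value in enumerate(column):
            # Constant columns carry no color information; map them to 0.
            out[i][j] = 0.0 if std == 0 else (value - mean) / std
    return out
```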

shap_bar_chart

shap_bar_chart(model: Any, X: Any = None, y: Any = None, *, max_display: int = 20, order: str = 'abs_mean', background: Any = None, per_class: bool = False, random_state: int | None = None, mark: dict | None = None, encode: dict | None = None, properties: dict | None = None, layers: list | None = None, theme: Any = None) -> 'Chart'

SHAP bar chart -- mean absolute SHAP per feature.

per_class=True on a multi-class classifier facets the chart by class. per_class=False (default) renders a single panel using the first class.

Parameters:

Name Type Description Default
model estimator or ModelSource

Fitted sklearn-compatible estimator or an explicit ferrum.ModelSource.

required
X array-like

Feature matrix. Required when model is a raw estimator.

None
y array-like

Target vector. Required when model is a raw estimator.

None
max_display int

Maximum number of features to display.

20
order ('abs_mean', 'max')

Feature ranking criterion. "abs_mean" ranks by mean absolute SHAP value; "max" by max absolute SHAP value.

"abs_mean"
background array-like or None

Background dataset for kernel SHAP explainers. Ignored for tree SHAP.

None
per_class bool

Facet by class on multi-class classifiers.

False
random_state int or None

Seed forwarded to ModelSource.

None
mark dict

Per-layer mark overrides. For composite-mark charts, keys are layer names (e.g. {"scatter": {"opacity": 0.5}}); for single-mark charts, a flat dict of mark properties.

None
encode dict

Additional encoding kwargs merged via Chart.encode(**encode).

None
properties dict

Chart properties merged via Chart.properties(**properties) (e.g. {"width": 400, "title": "My chart"}).

None
layers list

Extra layers appended via Chart.layer(*layers).

None
theme Theme or None

Ferrum theme to apply to the returned chart.

None

Returns:

Type Description
Chart

SHAP bar chart (features x mean |SHAP|).

Examples:

>>> import ferrum as fm
>>> from sklearn.ensemble import GradientBoostingClassifier
>>> fm.shap_bar_chart(GradientBoostingClassifier().fit(X_train, y_train), X_test, y_test)

shap_waterfall_chart

shap_waterfall_chart(model: Any, X: Any = None, y: Any = None, *, sample_idx: int, max_display: int = 20, order: str = 'abs_mean', background: Any = None, per_class: bool = False, random_state: int | None = None, mark: dict | None = None, encode: dict | None = None, properties: dict | None = None, layers: list | None = None, theme: Any = None) -> 'Chart'

SHAP waterfall chart -- cumulative per-feature contributions for one sample.

per_class=True on a multi-class classifier facets the chart by class (one waterfall panel per class for the same sample). per_class=False (default) renders a single panel using the first class.

Parameters:

Name Type Description Default
model estimator or ModelSource

Fitted sklearn-compatible estimator or an explicit ferrum.ModelSource.

required
X array-like

Feature matrix. Required when model is a raw estimator.

None
y array-like

Target vector. Required when model is a raw estimator.

None
sample_idx int

Row index (0-based) of the sample to explain. Required.

required
max_display int

Maximum number of features to display.

20
order ('abs_mean', 'max')

Feature ranking criterion. "abs_mean" ranks by mean absolute SHAP value; "max" by max absolute SHAP value.

"abs_mean"
background array-like or None

Background dataset for kernel SHAP explainers. Ignored for tree SHAP.

None
per_class bool

Facet by class on multi-class classifiers.

False
random_state int or None

Seed forwarded to ModelSource.

None
mark dict

Per-layer mark overrides. For composite-mark charts, keys are layer names (e.g. {"scatter": {"opacity": 0.5}}); for single-mark charts, a flat dict of mark properties.

None
encode dict

Additional encoding kwargs merged via Chart.encode(**encode).

None
properties dict

Chart properties merged via Chart.properties(**properties) (e.g. {"width": 400, "title": "My chart"}).

None
layers list

Extra layers appended via Chart.layer(*layers).

None
theme Theme or None

Ferrum theme to apply to the returned chart.

None

Returns:

Type Description
Chart

SHAP waterfall chart for the sample at sample_idx.

Examples:

>>> import ferrum as fm
>>> from sklearn.ensemble import GradientBoostingClassifier
>>> fm.shap_waterfall_chart(
...     GradientBoostingClassifier().fit(X_train, y_train), X_test, y_test,
...     sample_idx=0,
... )
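
A waterfall is a running sum: starting from a base value, each feature's SHAP value moves the total up or down until the model output for that sample is reached. The sketch below builds those steps from one row of SHAP values. The function name, the base_value argument, and the precomputed feature_order argument are assumptions for illustration.

```python
def waterfall_steps(base_value, shap_row, feature_names, feature_order):
    """Return (feature, start, end) triples for one sample's waterfall.

    feature_order is a list of column indices, most important first,
    for example the indices produced by an abs_mean or max ranking.
    """
    steps = []
    running = base_value
    for j in feature_order:
        start = running
        running = running + shap_row[j]  # each bar starts where the last ended
        steps.append((feature_names[j], start, running))
    return steps
```

The end value of the last step equals the base value plus the sum of the displayed SHAP values.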

pdp_chart

pdp_chart(model: Any, X: Any = None, y: Any = None, *, features: list | None = None, grid_resolution: int = 100, kind: str = 'average', ice_alpha: float = 0.2, center: bool = False, random_state: int | None = None, mark: dict | None = None, encode: dict | None = None, properties: dict | None = None, layers: list | None = None, theme: Any = None) -> 'Chart'

Partial-dependence plot (PDP) for one or more features.

Renders one facet panel per feature, each showing how the model prediction changes as that feature varies across its observed range while all other features are held at their column means. Supports average PDP, individual conditional expectation (ICE), and both overlaid.

Parameters:

Name Type Description Default
model estimator or ModelSource

Fitted sklearn-compatible estimator or an explicit ferrum.ModelSource.

required
X array-like

Feature matrix. Required when model is a raw estimator.

None
y array-like

Target vector. Required when model is a raw estimator; not used by PDP computation.

None
features list of str or int, required

Column names or integer indices of the features to plot. Each feature gets its own facet panel. Must be provided; raises ValueError when None.

None
grid_resolution int

Number of evenly spaced grid points along each feature's range.

100
kind ('average', 'individual', 'both')

"average" renders the mean PDP line. "individual" renders one ICE line per sample. "both" overlays the average curve on top of the ICE lines.

"average"
ice_alpha float

Opacity of individual ICE lines when kind is "individual" or "both".

0.2
center bool

When True, each curve is anchored at zero by subtracting the value at the smallest grid point, making relative changes across features directly comparable.

False
random_state int or None

Seed forwarded to ModelSource.

None
mark dict

Per-layer mark overrides. For composite-mark charts, keys are layer names (e.g. {"scatter": {"opacity": 0.5}}); for single-mark charts, a flat dict of mark properties.

None
encode dict

Additional encoding kwargs merged via Chart.encode(**encode).

None
properties dict

Chart properties merged via Chart.properties(**properties) (e.g. {"width": 400, "title": "My chart"}).

None
layers list

Extra layers appended via Chart.layer(*layers).

None
theme Theme or None

Ferrum theme to apply to the returned chart.

None

Returns:

Type Description
Chart

Faceted PDP chart with one panel per feature.

Raises:

Type Description
ValueError

If features is None.

Examples:

>>> import ferrum as fm
>>> from sklearn.ensemble import GradientBoostingRegressor
>>> fm.pdp_chart(GradientBoostingRegressor().fit(X_train, y_train), X_test, features=["age", "income"])
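
Two of the parameters above are simple transformations that are easy to check by hand: grid_resolution sets the evenly spaced grid over a feature's observed range, and center=True subtracts the curve's value at the smallest grid point. The sketch below shows both steps with hypothetical helper names; it does not evaluate a model.

```python
def make_grid(values, grid_resolution=100):
    """Evenly spaced grid from min(values) to max(values), inclusive."""
    lo, hi = min(values), max(values)
    if grid_resolution < 2 or lo == hi:
        return [lo]
    step = (hi - lo) / (grid_resolution - 1)
    return [lo + i * step for i in range(grid_resolution)]


def center_curve(curve):
    """Anchor a PDP or ICE curve at zero at the smallest grid point."""
    anchor = curve[0]
    return [value - anchor for value in curve]
```

After centering, every curve starts at 0, so the vertical axis reads as the change in prediction relative to the low end of the feature's range.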

learning_curve_chart

learning_curve_chart(model: Any, X: Any = None, y: Any = None, *, cv: int = 5, scoring: Any = None, train_sizes: Any = None, ci_style: str = 'band', subtitle: str | None = None, random_state: int | None = None, mark: dict | None = None, encode: dict | None = None, properties: dict | None = None, layers: list | None = None, theme: Any = None) -> 'Chart'

Learning curve chart showing score vs. training set size.

Plots cross-validated train and validation scores as training size grows, revealing overfitting, underfitting, and data-hunger. The CI band is drawn from per-fold score variance.

Parameters:

Name Type Description Default
model estimator or ModelSource

Fitted (or unfitted) sklearn-compatible estimator or an explicit ferrum.ModelSource.

required
X array-like

Feature matrix. Required when model is a raw estimator.

None
y array-like

Target vector. Required when model is a raw estimator.

None
cv int

Number of cross-validation folds.

5
scoring str, callable, or None

Scoring metric forwarded to sklearn.model_selection.learning_curve. When None, the estimator's default scorer is used.

None
train_sizes array-like or None

Relative or absolute training sizes to evaluate. When None, sklearn's default np.linspace(0.1, 1.0, 5) is used.

None
ci_style ('band', 'errorbar')

Visual style of the confidence interval. "band" draws a shaded ribbon; "errorbar" draws error bars.

"band"
subtitle str or None

Optional subtitle rendered beneath the active chart title.

None
random_state int or None

Seed forwarded to ModelSource.

None
theme Theme or None

Ferrum theme to apply to the returned chart.

None
mark dict

Per-layer mark overrides. For composite-mark charts, keys are layer names (e.g. {"scatter": {"opacity": 0.5}}); for single-mark charts, a flat dict of mark properties.

None
encode dict

Additional encoding kwargs merged via Chart.encode(**encode).

None
properties dict

Chart properties merged via Chart.properties(**properties) (e.g. {"width": 400, "title": "My chart"}).

None
layers list

Extra layers appended via Chart.layer(*layers).

None

Returns:

Type Description
Chart

Learning curve with train and validation score lines plus CI.

Examples:

>>> import ferrum as fm
>>> from sklearn.svm import SVC
>>> fm.learning_curve_chart(SVC(), X_train, y_train, cv=5)
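
The CI band summarizes how much the score varies across folds at each training size. One common construction, assumed here rather than taken from ferrum's source, is mean ± one standard deviation of the per-fold scores. The helper name and input layout are hypothetical.

```python
import statistics


def score_band(fold_scores_by_size):
    """Summarize per-fold scores as mean and mean +/- std per training size.

    fold_scores_by_size is a list of (train_size, [score per fold]) pairs.
    """
    rows = []
    for train_size, scores in fold_scores_by_size:
        mean = statistics.fmean(scores)
        std = statistics.pstdev(scores)  # population std across folds (assumed)
        rows.append({
            "train_size": train_size,
            "mean": mean,
            "lower": mean - std,
            "upper": mean + std,
        })
    return rows
```

The same rows serve either ci_style: "band" would shade between lower and upper, while "errorbar" would draw a bar over that span at each training size.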

validation_curve_chart

validation_curve_chart(model: Any, X: Any = None, y: Any = None, *, param: str = 'alpha', values: Any = None, cv: int = 5, scoring: Any = None, log_scale: Any = 'auto', ci_style: str = 'band', subtitle: str | None = None, random_state: int | None = None, mark: dict | None = None, encode: dict | None = None, properties: dict | None = None, layers: list | None = None, theme: Any = None) -> 'Chart'

Plot score vs. a single hyperparameter value.

Sweeps one hyperparameter over a supplied value range and plots cross-validated train and validation scores, revealing the bias-variance tradeoff for that parameter. The x axis is log-scaled automatically when the value range spans more than two orders of magnitude.

Parameters:

Name Type Description Default
model estimator or ModelSource

Fitted (or unfitted) sklearn-compatible estimator or an explicit ferrum.ModelSource.

required
X array-like

Feature matrix. Required when model is a raw estimator.

None
y array-like

Target vector. Required when model is a raw estimator.

None
param str

Name of the hyperparameter to sweep, passed to sklearn.model_selection.validation_curve as param_name.

"alpha"
values (array - like, required)

Values of param to evaluate. Must be provided; raises ValueError when None.

None
cv int

Number of cross-validation folds.

5
scoring str, callable, or None

Scoring metric. When None, the estimator's default scorer is used.

None
log_scale bool or 'auto'

Whether to use a log scale on the x axis. "auto" enables log scale when max(values) / min(non-zero values) > 100.

"auto"
ci_style ('band', 'errorbar')

Visual style of the confidence interval.

"band"
subtitle str or None

Optional subtitle rendered beneath the active chart title.

None
random_state int or None

Seed forwarded to ModelSource.

None
theme Theme or None

Ferrum theme to apply to the returned chart.

None
mark dict

Per-layer mark overrides. For composite-mark charts, keys are layer names (e.g. {"scatter": {"opacity": 0.5}}); for single-mark charts, a flat dict of mark properties.

None
encode dict

Additional encoding kwargs merged via Chart.encode(**encode).

None
properties dict

Chart properties merged via Chart.properties(**properties) (e.g. {"width": 400, "title": "My chart"}).

None
layers list

Extra layers appended via Chart.layer(*layers).

None

Returns:

Type Description
Chart

Validation curve with train and validation score lines plus CI.

Raises:

Type Description
ValueError

If values is None.

Examples:

>>> import ferrum as fm
>>> from sklearn.linear_model import Ridge
>>> fm.validation_curve_chart(Ridge(), X_train, y_train, param="alpha", values=[0.01, 0.1, 1, 10])
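
The "auto" rule for log_scale can be sketched in plain Python. This only illustrates the documented ratio test and is not ferrum's implementation: the helper name should_use_log_scale is hypothetical, and returning False for empty or negative sweeps is an assumption.

```python
def should_use_log_scale(values, log_scale="auto"):
    """Hypothetical helper mirroring the documented "auto" rule."""
    if log_scale != "auto":
        return bool(log_scale)  # an explicit True / False wins
    values = list(values)
    nonzero = [v for v in values if v != 0]
    # Assumption: no log axis for empty sweeps or sweeps with negative values.
    if not nonzero or min(nonzero) < 0:
        return False
    # Documented test: max(values) / min(non-zero values) > 100.
    return max(values) / min(nonzero) > 100
```

With the sweep from the example above, values=[0.01, 0.1, 1, 10], the ratio is 1000, so this sketch would select a log axis.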

cv_scores_chart

cv_scores_chart(model: Any, X: Any = None, y: Any = None, *, cv: int = 5, scoring: Any = None, kind: str = 'box', split: str = 'both', random_state: int | None = None, mark: dict | None = None, encode: dict | None = None, properties: dict | None = None, layers: list | None = None, theme: Any = None) -> 'Chart'

Per-fold cross-validation score distribution chart.

Visualizes the distribution of scores across folds for train and/or validation splits, making variance and consistency across folds immediately visible.

Parameters:

Name Type Description Default
model estimator or ModelSource

Fitted (or unfitted) sklearn-compatible estimator or an explicit ferrum.ModelSource.

required
X array - like

Feature matrix. Required when model is a raw estimator.

None
y array - like

Target vector. Required when model is a raw estimator.

None
cv int

Number of cross-validation folds.

5
scoring str, callable, or None

Scoring metric. When None, the estimator's default scorer is used.

None
kind ('box', 'strip', 'bar')

Chart type. "box" renders a box-and-whisker per split; "strip" renders individual fold points; "bar" renders the mean score per split as a bar (pre-aggregated).

"box"
split ('both', 'train', 'test')

Which CV split to show. "both" shows train and validation side by side; "train" or "test" restricts to one split.

"both"
random_state int or None

Seed forwarded to ModelSource.

None
theme Theme or None

Ferrum theme to apply to the returned chart.

None
mark dict

Per-layer mark overrides. For composite-mark charts, keys are layer names (e.g. {"scatter": {"opacity": 0.5}}); for single-mark charts, a flat dict of mark properties.

None
encode dict

Additional encoding kwargs merged via Chart.encode(**encode).

None
properties dict

Chart properties merged via Chart.properties(**properties) (e.g. {"width": 400, "title": "My chart"}).

None
layers list

Extra layers appended via Chart.layer(*layers).

None

Returns:

Type Description
Chart

Per-fold CV score distribution chart.

Examples:

>>> import ferrum as fm
>>> from sklearn.ensemble import RandomForestClassifier
>>> fm.cv_scores_chart(RandomForestClassifier(), X_train, y_train, cv=10)
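
The pre-aggregation behind kind="bar" (one mean score per split) can be sketched as follows. This is a minimal illustration, not ferrum's code: the helper name, the dict layout of the per-fold scores, and the ValueError on an unknown split are all assumptions.

```python
from statistics import mean


def mean_score_per_split(fold_scores, split="both"):
    """Collapse per-fold scores to one mean per split (hypothetical helper).

    fold_scores maps a split name ("train" / "test") to its per-fold scores.
    """
    if split not in ("both", "train", "test"):
        raise ValueError(f"unknown split: {split!r}")  # assumed behaviour
    keep = fold_scores if split == "both" else {split: fold_scores[split]}
    return {name: mean(scores) for name, scores in keep.items()}


scores = {"train": [0.98, 0.97, 0.99], "test": [0.90, 0.88, 0.92]}
summary = mean_score_per_split(scores)
```

By contrast, the "box" and "strip" kinds are described as showing the individual fold scores rather than a single aggregate.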

alpha_selection_chart

alpha_selection_chart(model: Any, X: Any = None, y: Any = None, *, alphas: Any = None, cv: int = 5, scoring: Any = None, log_scale: bool = True, highlight_best: bool = True, random_state: int | None = None, mark: dict | None = None, encode: dict | None = None, properties: dict | None = None, layers: list | None = None, theme: Any = None) -> 'Chart'

Regularization-strength (alpha) selection chart.

Plots cross-validated mean score as a function of regularization strength, helping users identify the optimal alpha for penalized estimators such as Ridge, Lasso, and ElasticNet.

Parameters:

Name Type Description Default
model estimator or ModelSource

Fitted (or unfitted) sklearn-compatible penalized estimator or an explicit ferrum.ModelSource. The estimator must accept an alpha constructor parameter.

required
X array - like

Feature matrix. Required when model is a raw estimator.

None
y array - like

Target vector. Required when model is a raw estimator.

None
alphas (array - like, required)

Regularization-strength values to sweep. Must be provided; raises ValueError when None.

None
cv int

Number of cross-validation folds.

5
scoring str, callable, or None

Scoring metric. When None, the estimator's default scorer is used.

None
log_scale bool

When True, the alpha axis is log-scaled.

True
highlight_best bool

When True, injects a vertical reference rule at the alpha that maximises mean CV score.

True
random_state int or None

Seed forwarded to ModelSource.

None
theme Theme or None

Ferrum theme to apply to the returned chart.

None
mark dict

Per-layer mark overrides. For composite-mark charts, keys are layer names (e.g. {"scatter": {"opacity": 0.5}}); for single-mark charts, a flat dict of mark properties.

None
encode dict

Additional encoding kwargs merged via Chart.encode(**encode).

None
properties dict

Chart properties merged via Chart.properties(**properties) (e.g. {"width": 400, "title": "My chart"}).

None
layers list

Extra layers appended via Chart.layer(*layers).

None

Returns:

Type Description
Chart

CV-score-vs-alpha line chart with optional best-alpha rule.

Raises:

Type Description
ValueError

If alphas is None.

Examples:

>>> import ferrum as fm
>>> from sklearn.linear_model import Ridge
>>> fm.alpha_selection_chart(Ridge(), X_train, y_train, alphas=[0.001, 0.01, 0.1, 1, 10, 100])
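
The position of the highlight_best rule, the alpha that maximises the mean CV score, can be sketched like this. The helper name and the input layout (alpha mapped to per-fold scores) are hypothetical, and resolving ties to the first alpha is an assumption.

```python
from statistics import mean


def best_alpha(cv_scores):
    """Return the alpha with the highest mean CV score (hypothetical helper)."""
    mean_scores = {alpha: mean(scores) for alpha, scores in cv_scores.items()}
    # Assumption: ties resolve to the first alpha in insertion order.
    return max(mean_scores, key=mean_scores.get)


sweep = {0.01: [0.70, 0.72], 0.1: [0.80, 0.82], 1: [0.78, 0.76]}
chosen = best_alpha(sweep)
```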

cluster_diagnostics

cluster_diagnostics(X: Any, *, ks: Any, method: str = 'kmeans', scoring: str = 'both', n_init: int = 10, random_state: int | None = None, mark: dict | None = None, encode: dict | None = None, properties: dict | None = None, layers: list | None = None, theme: Any = None) -> 'Chart | HConcatChart'

Elbow and silhouette diagnostics over a range of cluster counts.

Fits one clusterer per value of k and renders the requested diagnostic panel(s): distortion (inertia) vs. k, mean silhouette score vs. k, or both side-by-side. Unlike the other figure functions, this sweeps the model class itself rather than wrapping a single pre-fitted ModelSource.

Parameters:

Name Type Description Default
X array - like

Feature matrix. All samples are used for fitting and scoring. Polars DataFrames, pandas DataFrames, and 2D numpy arrays are accepted.

required
ks iterable of int

Values of k (number of clusters) to evaluate.

required
method ('kmeans', 'hierarchical')

Clustering algorithm.

  • "kmeans" -- sklearn.cluster.KMeans. Uses the estimator's native inertia_ attribute.
  • "hierarchical" -- sklearn.cluster.AgglomerativeClustering (Ward linkage). Inertia is computed manually as the sum of squared distances from each sample to its cluster centroid, since AgglomerativeClustering does not expose inertia_.

DBSCAN is intentionally not supported -- its number of clusters is determined by eps / min_samples, not by a swept k, so the chart's elbow / silhouette-vs-k axis doesn't apply.

"kmeans"
scoring ('elbow', 'silhouette', 'both')

Which diagnostic panel(s) to render.

  • "elbow" -- inertia-vs-k line chart only.
  • "silhouette" -- mean silhouette-vs-k line chart only.
  • "both" -- side-by-side HConcatChart of the two.
"both"
n_init int

Number of KMeans initializations per k; forwarded to sklearn.cluster.KMeans(n_init=...). Ignored for hierarchical clustering, which is deterministic given the linkage strategy.

10
random_state int or None

Random seed for KMeans initialisation. When None, defaults to seed 0 for deterministic results. Ignored for hierarchical clustering.

None
mark dict

Per-layer mark overrides. For composite-mark charts, keys are layer names (e.g. {"scatter": {"opacity": 0.5}}); for single-mark charts, a flat dict of mark properties.

None
encode dict

Additional encoding kwargs merged via Chart.encode(**encode).

None
properties dict

Chart properties merged via Chart.properties(**properties) (e.g. {"width": 400, "title": "My chart"}).

None
layers list

Extra layers appended via Chart.layer(*layers).

None
theme Theme or None

Ferrum theme to apply to the returned chart.

None

Returns:

Type Description
Chart or HConcatChart

Chart when scoring is "elbow" or "silhouette"; HConcatChart with both panels side-by-side when scoring="both".

Raises:

Type Description
ValueError

If method is unknown or scoring is not in the supported set.

Examples:

>>> import ferrum as fm
>>> fm.cluster_diagnostics(X_train, ks=range(2, 11))
>>> fm.cluster_diagnostics(X_train, ks=range(2, 11),
...                         method="hierarchical", scoring="silhouette")
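
The manual inertia used for method="hierarchical" is described as the sum of squared distances from each sample to its cluster centroid. A dependency-free sketch of that computation, with a hypothetical function name and plain lists standing in for arrays:

```python
def manual_inertia(X, labels):
    """Sum of squared distances from each sample to its cluster centroid."""
    clusters = {}
    for row, label in zip(X, labels):
        clusters.setdefault(label, []).append(row)

    total = 0.0
    for rows in clusters.values():
        # The centroid is the per-feature mean of the cluster's members.
        centroid = [sum(col) / len(rows) for col in zip(*rows)]
        for row in rows:
            total += sum((x - c) ** 2 for x, c in zip(row, centroid))
    return total


points = [[0, 0], [2, 0], [10, 10], [10, 12]]
inertia = manual_inertia(points, [0, 0, 1, 1])
```

Each of the four points lies one unit from its centroid, so the four squared distances sum to 4.0.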

intercluster_distance_chart

intercluster_distance_chart(model: Any, X: Any = None, *, k: int | None = None, method: str = 'mds', random_state: int | None = None, mark: dict | None = None, encode: dict | None = None, properties: dict | None = None, layers: list | None = None, theme: Any = None) -> 'Chart'

Intercluster distance map: 2D embedding of cluster centers.

Embeds cluster centers into 2D, by default with MDS so that their pairwise distances are approximately preserved; t-SNE is available through method. Each center is rendered as a size-encoded circle (bubble area proportional to the cluster's member count). A 15% padding is added around the data range so large bubbles do not clip at axis edges.

Parameters:

Name Type Description Default
model fitted clusterer or ModelSource

A fitted sklearn-compatible clusterer (must expose cluster_centers_ or n_clusters) or an explicit ferrum.ModelSource.

required
X array - like

Feature matrix used to compute cluster member counts. Required when model is a raw estimator.

None
k int or None

Number of clusters. When None, inferred from model.n_clusters or len(model.cluster_centers_); raises ValueError if neither attribute is present.

None
method ('mds', 'tsne')

Dimensionality-reduction method for embedding cluster centers. "mds" (default) uses sklearn MDS to preserve pairwise distances; "tsne" uses t-SNE (perplexity clamped to min(5, k-1) for small cluster counts).

"mds"
random_state int or None

Seed forwarded to ModelSource.

None
mark dict

Per-layer mark overrides. For composite-mark charts, keys are layer names (e.g. {"scatter": {"opacity": 0.5}}); for single-mark charts, a flat dict of mark properties.

None
encode dict

Additional encoding kwargs merged via Chart.encode(**encode).

None
properties dict

Chart properties merged via Chart.properties(**properties) (e.g. {"width": 400, "title": "My chart"}).

None
layers list

Extra layers appended via Chart.layer(*layers).

None
theme Theme or None

Ferrum theme to apply to the returned chart.

None

Returns:

Type Description
Chart

2D scatter chart of embedded cluster centers sized by cluster population.

Raises:

Type Description
ValueError

If k is None and the model exposes neither n_clusters nor cluster_centers_.

Examples:

>>> import ferrum as fm
>>> from sklearn.cluster import KMeans
>>> fm.intercluster_distance_chart(KMeans(n_clusters=5).fit(X_train), X_train)
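
Two numeric details of this chart, the axis padding and the t-SNE perplexity clamp, can be sketched as below. Both helper names are hypothetical, and reading "15% padding" as 15% of the data span added on each side is an assumption.

```python
def padded_domain(values, pad=0.15):
    """Extend the data range by a fraction of its span on both sides."""
    lo, hi = min(values), max(values)
    margin = (hi - lo) * pad  # assumption: padding is relative to the span
    return lo - margin, hi + margin


def clamped_perplexity(k):
    """Documented clamp for small cluster counts: min(5, k - 1)."""
    return min(5, k - 1)
```

For embedded coordinates spanning 0 to 10, this padding would give an axis domain of -1.5 to 11.5.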

pca_scree_chart

pca_scree_chart(model: Any, X: Any = None, *, n_components: int | None = None, cumulative_line: bool = True, threshold: float | None = 0.95, random_state: int | None = None, mark: dict | None = None, encode: dict | None = None, properties: dict | None = None, layers: list | None = None, theme: Any = None) -> 'Chart'

PCA scree chart showing explained variance per component.

Plots per-component explained variance ratio as bars, with an optional cumulative-variance overlay line and a horizontal reference rule at a target cumulative-variance threshold.

Parameters:

Name Type Description Default
model PCA estimator, ModelSource, or DataFrame

A fitted sklearn.decomposition.PCA instance, an explicit ferrum.ModelSource wrapping one, an unfitted PCA estimator (fit is run on X), or a raw polars/pandas DataFrame (variance computed via Rust SVD -- no sklearn needed).

required
X array - like

Feature matrix. Required when model is a raw (unfitted) estimator; ignored when it is already a ModelSource with data bound.

None
n_components int or None

Number of components to display. When None, all available components from the fitted PCA are shown.

None
cumulative_line bool

When True, overlays a cumulative explained-variance line.

True
threshold float or None

When a float is given, draws a horizontal reference rule at that cumulative-variance level (e.g. 0.95 marks where 95% of variance is explained). Pass None to omit the rule.

0.95
random_state int or None

Seed forwarded to ModelSource.

None
mark dict

Per-layer mark overrides. For composite-mark charts, keys are layer names (e.g. {"scatter": {"opacity": 0.5}}); for single-mark charts, a flat dict of mark properties.

None
encode dict

Additional encoding kwargs merged via Chart.encode(**encode).

None
properties dict

Chart properties merged via Chart.properties(**properties) (e.g. {"width": 400, "title": "My chart"}).

None
layers list

Extra layers appended via Chart.layer(*layers).

None
theme Theme or None

Ferrum theme to apply to the returned chart.

None

Returns:

Type Description
Chart

PCA scree bar chart with optional cumulative line and threshold rule.

Examples:

>>> import ferrum as fm
>>> from sklearn.decomposition import PCA
>>> fm.pca_scree_chart(PCA(n_components=10).fit(X_train), threshold=0.90)

Raw DataFrame (no sklearn required):

>>> fm.pca_scree_chart(X_train, n_components=10)
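
The relationship between the cumulative line and the threshold rule can be sketched as follows: the component where the cumulative explained variance first reaches the threshold is where the two cross. The helper name is hypothetical, and the ratios are made-up illustration values rather than output from a real PCA.

```python
from itertools import accumulate


def components_for_threshold(explained_variance_ratio, threshold=0.95):
    """Return the 1-based component count that first reaches the threshold.

    Returns None when the listed components never reach it.
    """
    running = accumulate(explained_variance_ratio)
    for count, cumulative in enumerate(running, start=1):
        if cumulative >= threshold:
            return count
    return None


ratios = [0.6, 0.25, 0.1, 0.05]  # illustrative values only
needed = components_for_threshold(ratios, threshold=0.90)
```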

silhouette_chart

silhouette_chart(model: Any, X: Any = None, *, random_state: int | None = None, mark: dict | None = None, encode: dict | None = None, properties: dict | None = None, layers: list | None = None, theme: Any = None) -> 'Chart'

Rousseeuw silhouette plot for a fitted clusterer.

Computes per-sample silhouette coefficients and renders a horizontal bar chart sorted by cluster, one bar per sample, colored by cluster label.

Parameters:

Name Type Description Default
model fitted clusterer or ModelSource

A fitted sklearn-compatible clustering estimator that exposes labels_, or an explicit ferrum.ModelSource.

required
X array - like

Feature matrix. Required when model is a raw estimator; ignored when it is already a ModelSource.

None
random_state int or None

Seed forwarded to ModelSource.

None
mark dict

Per-layer mark overrides. For composite-mark charts, keys are layer names (e.g. {"scatter": {"opacity": 0.5}}); for single-mark charts, a flat dict of mark properties.

None
encode dict

Additional encoding kwargs merged via Chart.encode(**encode).

None
properties dict

Chart properties merged via Chart.properties(**properties) (e.g. {"width": 400, "title": "My chart"}).

None
layers list

Extra layers appended via Chart.layer(*layers).

None
theme Theme or None

Ferrum theme to apply to the returned chart.

None

Returns:

Type Description
Chart

Horizontal silhouette bar chart with per-cluster color encoding.

Examples:

>>> import ferrum as fm
>>> from sklearn.cluster import KMeans
>>> fm.silhouette_chart(KMeans(n_clusters=3, random_state=0).fit(X), X)
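
The per-sample silhouette coefficient that each bar represents is s = (b - a) / max(a, b), where a is the mean distance to the other members of the sample's own cluster and b is the smallest mean distance to any other cluster. A small pure-Python sketch follows. It assumes Euclidean distance, at least two clusters, and no duplicate-only clusters; scoring singleton clusters as 0 follows the common convention. It does not show how ferrum computes the values.

```python
from math import dist


def silhouette_samples(X, labels):
    """Per-sample silhouette coefficients (illustrative, O(n^2))."""
    by_label = {}
    for point, label in zip(X, labels):
        by_label.setdefault(label, []).append(point)

    scores = []
    for point, label in zip(X, labels):
        own = by_label[label]
        if len(own) == 1:
            scores.append(0.0)  # convention for singleton clusters
            continue
        # The zero distance from the point to itself leaves the sum unchanged.
        a = sum(dist(point, q) for q in own) / (len(own) - 1)
        b = min(
            sum(dist(point, q) for q in members) / len(members)
            for other, members in by_label.items()
            if other != label
        )
        scores.append((b - a) / max(a, b))
    return scores


scores = silhouette_samples([(0, 0), (0, 1), (10, 0), (10, 1)], [0, 0, 1, 1])
```

Two tight, well-separated clusters like these give coefficients close to 1 for every sample.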

manifold_chart

manifold_chart(model: Any, X: Any = None, *, method: str = 'umap', random_state: int | None = None, mark: dict | None = None, encode: dict | None = None, properties: dict | None = None, layers: list | None = None, theme: Any = None) -> 'Chart'

Low-dimensional manifold-embedding scatter (UMAP / t-SNE / PCA).

Projects the input data to two dimensions via the selected embedding algorithm and renders a point chart with axes dim_0 / dim_1 colored by cluster label.

Parameters:

Name Type Description Default
model fitted clusterer or ModelSource

A fitted clustering estimator whose labels_ attribute colors the points, or an explicit ferrum.ModelSource.

required
X array - like

Feature matrix. Required when model is a raw estimator; ignored when it is already a ModelSource.

None
method str

Embedding algorithm. Typical values are "umap", "tsne", and "pca".

"umap"
random_state int or None

Seed forwarded to ModelSource for stochastic embeddings.

None
mark dict

Per-layer mark overrides. For composite-mark charts, keys are layer names (e.g. {"scatter": {"opacity": 0.5}}); for single-mark charts, a flat dict of mark properties.

None
encode dict

Additional encoding kwargs merged via Chart.encode(**encode).

None
properties dict

Chart properties merged via Chart.properties(**properties) (e.g. {"width": 400, "title": "My chart"}).

None
layers list

Extra layers appended via Chart.layer(*layers).

None
theme Theme or None

Ferrum theme to apply to the returned chart.

None

Returns:

Type Description
Chart

2-D scatter plot of the embedded data, colored by cluster label.

Examples:

>>> import ferrum as fm
>>> from sklearn.cluster import KMeans
>>> fm.manifold_chart(KMeans(n_clusters=4, random_state=0).fit(X), X, method="tsne")

elbow_chart

elbow_chart(model: Any, X: Any, *, ks: Any, metric: str = 'distortion', random_state: int | None = None, mark: dict | None = None, encode: dict | None = None, properties: dict | None = None, layers: list | None = None, theme: Any = None) -> 'Chart'

Elbow / score sweep over a range of k for a clustering algorithm.

Fits one model per k value and plots the selected score metric against k. Useful for identifying the optimal number of clusters visually.

Parameters:

Name Type Description Default
model type

Uninstantiated clustering class (e.g. sklearn.cluster.KMeans). Must accept n_clusters, random_state, and n_init keyword arguments.

required
X array - like

Feature matrix used to fit each per-k model.

required
ks sequence of int

The candidate k values to sweep (e.g. range(2, 11)).

required
metric ('distortion', 'silhouette', 'calinski_harabasz')

Score to optimize. "distortion" is minimized; "silhouette" and "calinski_harabasz" are maximized.

"distortion"
random_state int or None

Seed passed to every per-k model instantiation.

None
mark dict

Per-layer mark overrides. For composite-mark charts, keys are layer names (e.g. {"scatter": {"opacity": 0.5}}); for single-mark charts, a flat dict of mark properties.

None
encode dict

Additional encoding kwargs merged via Chart.encode(**encode).

None
properties dict

Chart properties merged via Chart.properties(**properties) (e.g. {"width": 400, "title": "My chart"}).

None
layers list

Extra layers appended via Chart.layer(*layers).

None
theme Theme or None

Ferrum theme to apply to the returned chart.

None

Returns:

Type Description
Chart

Score-vs-k line chart with the optimal k annotated.

Examples:

>>> import ferrum as fm
>>> from sklearn.cluster import KMeans
>>> fm.elbow_chart(KMeans, X_train, ks=range(2, 9))

rank_chart

rank_chart(source: Any, X: Any = None, y: Any = None, *, rank: str = '2d', algorithm: str | None = None, top_k: int | None = None, annot: bool = True, orient: str = 'horizontal', color_field: str | None = None, random_state: int | None = None, mark: dict | None = None, encode: dict | None = None, properties: dict | None = None, layers: list | None = None, theme: Any = None) -> 'Chart'

Feature-ranking chart: univariate bar or pairwise heatmap.

Computes a ranking score for each feature (or each feature pair) and renders either a ranked bar chart (rank="1d") or a pairwise correlation heatmap (rank="2d"). Accepts a fitted estimator, ModelSource, or a raw DataFrame / 2D array (no model required for most algorithms).

Parameters:

Name Type Description Default
source estimator, ModelSource, DataFrame, or array-like

Input data. When a fitted estimator or ModelSource is supplied, the feature matrix is taken from the bound data. When a DataFrame or 2D array is supplied, X is used as the feature matrix if provided.

required
X array - like

Feature matrix. Used when source is a raw estimator (not a ModelSource) or when source is a raw DataFrame and X overrides it.

None
y array - like

Target vector. Required only for algorithm="covariance", which routes through ModelSource.rank1d.

None
rank ('1d', '2d')

Ranking mode. "1d" computes a univariate score per feature and renders a horizontal bar chart (or vertical with orient="vertical"). "2d" computes pairwise scores and renders a heatmap.

"2d"
algorithm str or None

Ranking algorithm. When None, defaults to "shapiro" for rank="1d" and "pearson" for rank="2d".

None
top_k int or None

For rank="1d", truncate to the top-k features by score. Has no effect for rank="2d".

None
annot bool

For rank="2d", overlays the correlation value (2 decimal places) as a text label in each heatmap cell. Has no effect for rank="1d".

True
orient ('horizontal', 'vertical')

Bar orientation for rank="1d"; ignored for rank="2d".

"horizontal"
color_field str or None

Column name to use for bar fill color in rank="1d"; when None, a single color is used.

None
random_state int or None

Seed forwarded to ModelSource.

None
mark dict

Per-layer mark overrides. For composite-mark charts, keys are layer names (e.g. {"scatter": {"opacity": 0.5}}); for single-mark charts, a flat dict of mark properties.

None
encode dict

Additional encoding kwargs merged via Chart.encode(**encode).

None
properties dict

Chart properties merged via Chart.properties(**properties) (e.g. {"width": 400, "title": "My chart"}).

None
layers list

Extra layers appended via Chart.layer(*layers).

None
theme Theme or None

Ferrum theme to apply to the returned chart.

None

Returns:

Type Description
Chart

Ranked bar chart (rank="1d") or pairwise heatmap (rank="2d").

Raises:

Type Description
ValueError

If rank is not "1d" or "2d".

Examples:

>>> import ferrum as fm
>>> fm.rank_chart(X_train, rank="2d")
>>> fm.rank_chart(X_train, rank="1d", algorithm="shapiro", top_k=10)

Deprecated since 2026-05-12: use rank1d_chart or rank2d_chart directly. This dispatcher remains as a shim that forwards to the appropriate sibling and will be removed in a future major release.
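
The way rank and algorithm interact (algorithm=None falls back to "shapiro" for "1d" and "pearson" for "2d", and any other rank raises ValueError) can be sketched as a small resolver. The function name is hypothetical and the error message is illustrative.

```python
def resolve_rank_algorithm(rank, algorithm=None):
    """Apply the documented per-mode default algorithm (hypothetical helper)."""
    defaults = {"1d": "shapiro", "2d": "pearson"}
    if rank not in defaults:
        raise ValueError(f'rank must be "1d" or "2d", got {rank!r}')
    return defaults[rank] if algorithm is None else algorithm
```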

rank1d_chart

rank1d_chart(source: Any, X: Any = None, y: Any = None, *, algorithm: str | None = None, top_k: int | None = None, orient: str = 'horizontal', color_field: str | None = None, random_state: int | None = None, mark: dict | None = None, encode: dict | None = None, properties: dict | None = None, layers: list | None = None, theme: Any = None) -> 'Chart'

Univariate feature-ranking bar chart.

Computes a per-feature ranking score and renders a horizontal (or vertical with orient="vertical") bar chart sorted by score. Accepts a fitted estimator, ModelSource, or a raw DataFrame / 2D array; the "covariance" algorithm requires y.

Parameters:

Name Type Description Default
source estimator, ModelSource, DataFrame, or array-like

Input data. When a fitted estimator or ModelSource is supplied, the feature matrix is taken from the bound data.

required
X optional

Feature matrix, forwarded to _resolve_source when source is a raw estimator.

None
y optional

Target vector, forwarded to _resolve_source when source is a raw estimator. Required by the "covariance" algorithm.

None
algorithm str or None

Ranking algorithm. None selects "shapiro".

None
top_k int or None

Limit the chart to the top-k features by score. None shows all features.

None
orient ('horizontal', 'vertical')

Bar orientation.

"horizontal"
color_field str or None

Column name to map to bar color.

None
random_state int or None

Seed forwarded to ModelSource.

None
mark dict

Per-layer mark overrides. For composite-mark charts, keys are layer names (e.g. {"scatter": {"opacity": 0.5}}); for single-mark charts, a flat dict of mark properties.

None
encode dict

Additional encoding kwargs merged via Chart.encode(**encode).

None
properties dict

Chart properties merged via Chart.properties(**properties) (e.g. {"width": 400, "title": "My chart"}).

None
layers list

Extra layers appended via Chart.layer(*layers).

None
theme Theme or None

Ferrum theme to apply to the returned chart.

None

Returns:

Type Description
Chart

Ranked bar chart.

Examples:

>>> import ferrum as fm
>>> fm.rank1d_chart(model, X_train, algorithm="shapiro")
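
The effect of top_k can be sketched as a sort followed by a truncation. The helper name and the score values are hypothetical, and ordering bars from highest to lowest score is an assumption.

```python
def top_k_features(scores, top_k=None):
    """Sort features by score and keep the first top_k (all when None)."""
    # Assumption: higher scores rank first.
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return ranked if top_k is None else ranked[:top_k]


ranked = top_k_features({"a": 0.2, "b": 0.9, "c": 0.5}, top_k=2)
```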

rank2d_chart

rank2d_chart(source: Any, X: Any = None, y: Any = None, *, algorithm: str | None = None, annot: bool = True, random_state: int | None = None, mark: dict | None = None, encode: dict | None = None, properties: dict | None = None, layers: list | None = None, theme: Any = None) -> 'Chart'

Pairwise feature-correlation heatmap.

Computes pairwise feature correlation (or covariance) and renders a heatmap. Accepts a fitted estimator, ModelSource, or a raw DataFrame / 2D array (no model required).

Parameters:

Name Type Description Default
source estimator, ModelSource, DataFrame, or array-like

Input data.

required
X optional

Feature matrix.

None
y optional

Target vector.

None
algorithm str or None

Ranking algorithm. None selects "pearson".

None
annot bool

Overlay the correlation value (2 decimals) on each cell.

True
random_state int or None

Seed forwarded to ModelSource.

None
mark dict

Per-layer mark overrides. For composite-mark charts, keys are layer names (e.g. {"scatter": {"opacity": 0.5}}); for single-mark charts, a flat dict of mark properties.

None
encode dict

Additional encoding kwargs merged via Chart.encode(**encode).

None
properties dict

Chart properties merged via Chart.properties(**properties) (e.g. {"width": 400, "title": "My chart"}).

None
layers list

Extra layers appended via Chart.layer(*layers).

None
theme Theme or None

Ferrum theme to apply to the returned chart.

None

Returns:

Type Description
Chart

Pairwise correlation heatmap.

Examples:

>>> import ferrum as fm
>>> fm.rank2d_chart(model, X_train, algorithm="pearson")
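
The data behind a Pearson heatmap with annot=True can be sketched as one record per feature pair carrying a 2-decimal label. This is an illustration in plain Python, not ferrum's implementation. The helper names and the record layout are hypothetical, and constant columns (zero variance) are assumed not to occur.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def correlation_cells(columns):
    """One heatmap cell per ordered feature pair, labelled to 2 decimals."""
    names = list(columns)
    return [
        {"row": a, "col": b, "label": f"{pearson(columns[a], columns[b]):.2f}"}
        for a in names
        for b in names
    ]


cells = correlation_cells({"a": [1, 2, 3, 4], "b": [2, 4, 6, 8], "c": [4, 3, 2, 1]})
```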

parallel_coordinates_chart

parallel_coordinates_chart(data: Any, *, features: list[str] | None = None, hue: str | None = None, rescale: str | None = 'minmax', alpha: float = 0.5, mark: dict | None = None, encode: dict | None = None, properties: dict | None = None, layers: list | None = None, theme: Any = None) -> 'Chart'

Parallel coordinates chart for multivariate data.

Renders one polyline per sample, with each feature mapped to a vertical axis. Features are optionally rescaled to a common range before plotting so all axes are visually comparable. Samples are colored by a grouping column when hue is provided.

Parameters:

Name Type Description Default
data polars.DataFrame, pandas.DataFrame, or array-like

Input data. Polars and pandas DataFrames are used directly; 2D numpy arrays are auto-named f0, f1, etc.

required
features list of str or None

Column names to use as parallel axes. When None, all columns except hue are used.

None
hue str or None

Column name to color samples by (e.g. a target class or cluster id). Pass None for monochrome lines.

None
rescale ('minmax', 'zscore', or None)

Per-feature rescaling applied before rendering so axes share a common visual range. "minmax" maps to [0, 1]; "zscore" standardizes to zero mean and unit variance; None uses raw feature values.

"minmax"
alpha float

Opacity of individual polylines; lower values reduce overplot in dense datasets.

0.5
mark dict

Per-layer mark overrides. For composite-mark charts, keys are layer names (e.g. {"scatter": {"opacity": 0.5}}); for single-mark charts, a flat dict of mark properties.

None
encode dict

Additional encoding kwargs merged via Chart.encode(**encode).

None
properties dict

Chart properties merged via Chart.properties(**properties) (e.g. {"width": 400, "title": "My chart"}).

None
layers list

Extra layers appended via Chart.layer(*layers).

None
theme Theme or None

Ferrum theme to apply to the returned chart.

None

Returns:

Type Description
Chart

Parallel coordinates chart with one polyline per sample.

Raises:

Type Description
ValueError

If any name in features is not a column in data.

ValueError

If rescale is not one of "minmax", "zscore", or None.

Examples:

>>> import ferrum as fm
>>> fm.parallel_coordinates_chart(X_df, hue="species", rescale="minmax")
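
The three rescale modes can be sketched per column as follows. The helper name is hypothetical. Using the population standard deviation for "zscore" is an assumption, since the text only says "unit variance", and constant columns are assumed not to occur.

```python
from statistics import mean, pstdev


def rescale_column(values, rescale="minmax"):
    """Rescale one feature column the way the rescale options describe."""
    if rescale is None:
        return list(values)  # raw feature values
    if rescale == "minmax":
        lo, hi = min(values), max(values)
        return [(v - lo) / (hi - lo) for v in values]  # maps to [0, 1]
    if rescale == "zscore":
        mu, sigma = mean(values), pstdev(values)  # assumption: population std
        return [(v - mu) / sigma for v in values]
    raise ValueError('rescale must be "minmax", "zscore", or None')


scaled = rescale_column([2, 4, 6])
```

Applying the same function to every selected feature column puts all parallel axes on a shared visual range.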

decision_boundary_chart

decision_boundary_chart(model: Any, X: Any, y: Any = None, *, features: tuple = (0, 1), grid_resolution: int = 200, proba: bool = False, scatter: bool = True, random_state: int | None = None, mark: dict | None = None, encode: dict | None = None, properties: dict | None = None, layers: list | None = None, theme: Any = None) -> 'Chart'

Decision-boundary heatmap for a classifier over a 2D feature slice.

Builds a grid_resolution x grid_resolution grid over two selected features, holds all other features fixed at their column means, and colors each cell by the model's predicted class (or probability). Optionally overlays training-point scatter.

Parameters:

Name Type Description Default
model estimator or ModelSource

Fitted sklearn-compatible classifier or an explicit ferrum.ModelSource.

required
X array - like

Feature matrix. Must be provided (not optional) so the grid bounds and column means can be computed.

required
y array - like or None

True labels. Used only for the scatter overlay when scatter=True; not required otherwise.

None
features tuple of (int or str, int or str)

Two feature indices or column names to use for the x and y axes of the grid. All other features are fixed at their column means. Exactly 2 features are required.

(0, 1)
grid_resolution int

Number of grid points along each axis; total cells = grid_resolution**2.

200
proba bool

When True and the model exposes predict_proba, the color channel uses predict_proba[:, 1] (continuous probability). When False, the color channel uses predict (discrete class index).

False
scatter bool

When True, overlays a scatter of training points colored by y on top of the boundary heatmap via the + compositor. Note: the overlay currently renders as horizontal concatenation per the ChartSpec one-batch contract.

True
random_state int or None

Seed forwarded to ModelSource.

None
mark dict

Per-layer mark overrides. For composite-mark charts, keys are layer names (e.g. {"scatter": {"opacity": 0.5}}); for single-mark charts, a flat dict of mark properties.

None
encode dict

Additional encoding kwargs merged via Chart.encode(**encode).

None
properties dict

Chart properties merged via Chart.properties(**properties) (e.g. {"width": 400, "title": "My chart"}).

None
layers list

Extra layers appended via Chart.layer(*layers).

None
theme Theme or None

Ferrum theme to apply to the returned chart.

None

Returns:

Type Description
Chart

Decision-boundary heatmap, optionally with training-point scatter overlay.

Raises:

Type Description
ValueError

If features does not contain exactly 2 elements.

Examples:

>>> import ferrum as fm
>>> from sklearn.svm import SVC
>>> fm.decision_boundary_chart(SVC().fit(X_train, y_train), X_train, y_train, features=(0, 1))
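
The grid construction described above (two swept features, every other feature held at its column mean) can be sketched in plain Python. The function name is hypothetical, and the sketch assumes integer feature indices, evenly spaced grid points between each feature's minimum and maximum, and grid_resolution of at least 2.

```python
def boundary_grid(X, features=(0, 1), grid_resolution=3):
    """Build the rows a classifier would be asked to predict on."""
    if len(features) != 2:
        raise ValueError("features must contain exactly 2 elements")
    columns = list(zip(*X))
    means = [sum(col) / len(col) for col in columns]
    fx, fy = features

    def linspace(lo, hi, n):
        step = (hi - lo) / (n - 1)
        return [lo + i * step for i in range(n)]

    xs = linspace(min(columns[fx]), max(columns[fx]), grid_resolution)
    ys = linspace(min(columns[fy]), max(columns[fy]), grid_resolution)

    grid = []
    for gy in ys:
        for gx in xs:
            row = list(means)  # every feature starts at its column mean
            row[fx], row[fy] = gx, gy  # then the two swept features vary
            grid.append(row)
    return grid


grid = boundary_grid([[0, 0, 10], [2, 4, 20]], features=(0, 1), grid_resolution=3)
```

The result has grid_resolution**2 rows, which matches the documented cell count; passing them to predict or predict_proba would give one color value per cell.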

displot

displot(data: Any, *, x: Any = None, y: Any = None, hue: Any = None, col: Any = None, row: Any = None, kind: str = 'hist', fill: bool = True, cumulative: bool = False, log_scale: bool = False, stat: str = 'count', bins: Any = 'sturges', bandwidth: Any = 'scott', bw_adjust: float = 1.0, multiple: str = 'layer', kde: bool = False, rug: bool = False, height: float | None = None, aspect: float | None = None, mark: dict | None = None, encode: dict | None = None, properties: dict | None = None, layers: list | None = None, theme: Any = None, **encode_kwargs: Any) -> Chart

Univariate distribution plot.

Convenience wrapper that dispatches to mark_histogram, mark_density, or mark_tick based on kind. The multiple parameter controls how overlapping groups (from hue) are positioned, and the kde / rug flags optionally layer additional marks on top of the primary kind.

Parameters:

Name Type Description Default
data DataFrame-like

Input data accepted by Chart(data).

required
x str or encoding

Column name for the distribution variable (horizontal axis).

None
y str or encoding

Column name for the distribution variable (vertical axis).

None
hue str or encoding

Column name to map to color (one distribution per level).

None
col str

Column name for faceting across columns.

None
row str

Column name for faceting across rows.

None
kind ('hist', 'kde', 'ecdf', 'rug')

Which distribution mark to draw. "hist" calls mark_histogram; "kde" calls mark_density (filled by default); "ecdf" builds a cumulative frequency line via Bin + mark_line; "rug" calls mark_tick.

"hist"
fill bool

Fill the area under the KDE curve (kind="kde" only).

True
cumulative bool

Produce a cumulative histogram or density (kind="hist" or kind="kde" only).

False
log_scale bool

Apply a log scale to the x axis.

False
stat ('count', 'density')

Statistic to plot on the value axis for kind="hist". "density" normalises so the total area integrates to 1.

"count"
bins int or str

Binning rule for kind="hist". An integer is forwarded as bin_count; a string ("sturges", "fd", etc.) lets the Rust engine decide the count automatically.

"sturges"
bandwidth str

Bandwidth selector for kind="kde" ("scott" or "silverman").

"scott"
bw_adjust float

Multiplicative bandwidth adjustment for kind="kde".

1.0
multiple ('layer', 'stack', 'fill', 'dodge')

How to render multiple distributions (one per hue level). "layer" overlays them (Identity); "dodge" places them side by side; "stack" and "fill" use Stack with offset="zero" or offset="normalize" respectively.

"layer"
kde bool

When True and kind != "kde", layer a mark_density on top of the primary mark.

False
rug bool

When True and kind != "rug", layer a mark_tick rug on top of the primary mark.

False
height float or None

Height of the chart in pixels. Width is derived from aspect.

None
aspect float or None

Aspect ratio (width = height * aspect). Requires height.

None
theme Theme

Visual theme applied via Chart.theme().

None
mark dict

Per-layer mark overrides. For composite-mark charts, keys are layer names (e.g. {"scatter": {"opacity": 0.5}}); for single-mark charts, a flat dict of mark properties.

None
encode dict

Additional encoding kwargs merged via Chart.encode(**encode).

None
properties dict

Chart properties merged via Chart.properties(**properties) (e.g. {"width": 400, "title": "My chart"}).

None
layers list

Extra layers appended via Chart.layer(*layers).

None
**encode_kwargs Any

Additional keyword arguments forwarded to Chart.encode().

{}

Returns:

Type Description
Chart

Configured chart (possibly layered, faceted, or sized).

Raises:

Type Description
ValueError

If kind or multiple is not one of the supported values.

ValueError

If kind="ecdf" is used without specifying x=.

Examples:

>>> import ferrum as fm
>>> fm.displot(df, x="sepal_length")

KDE with per-species coloring:

>>> fm.displot(df, x="sepal_length", hue="species", kind="kde")

Stacked histogram with an overlaid rug:

>>> fm.displot(df, x="tip", hue="sex", multiple="stack", rug=True)
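The bins="sturges" default and stat="density" can be illustrated with a small plain-Python sketch. It uses the textbook Sturges rule and equal-width bins; the helper names sturges_bin_count and histogram are hypothetical, and the Rust engine may handle edges and rounding differently.

```python
import math


def sturges_bin_count(n):
    """Textbook Sturges rule: ceil(log2(n)) + 1 bins for n observations."""
    if n < 1:
        raise ValueError("need at least one observation")
    return math.ceil(math.log2(n)) + 1


def histogram(values, bin_count, stat="count"):
    """Equal-width histogram; returns (heights, bin_width)."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bin_count or 1.0  # guard against constant data
    counts = [0] * bin_count
    for v in values:
        # The maximum value falls on the right edge, so clamp it into the last bin.
        idx = min(int((v - lo) / width), bin_count - 1)
        counts[idx] += 1
    if stat == "count":
        return counts, width
    if stat == "density":
        # Divide by n * width so the bar areas sum to 1.
        n = len(values)
        return [c / (n * width) for c in counts], width
    raise ValueError(f"unsupported stat: {stat!r}")


values = [4.3, 4.9, 5.0, 5.1, 5.8, 6.4, 7.0, 7.9]
k = sturges_bin_count(len(values))
counts, width = histogram(values, k, stat="count")
density, _ = histogram(values, k, stat="density")
area = sum(d * width for d in density)
```

Here area comes out as 1 up to floating-point error, which is what "integrates to 1" means for a binned estimate.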

catplot

catplot(data: Any, *, x: Any = None, y: Any = None, hue: Any = None, col: Any = None, row: Any = None, kind: str = 'strip', order: Any = None, hue_order: Any = None, orient: Any = None, dodge: bool = False, jitter: bool = True, native_scale: bool = False, ci: Any = 95, n_boot: int = 1000, seed: int | None = None, mark: dict | None = None, encode: dict | None = None, properties: dict | None = None, layers: list | None = None, theme: Any = None, **encode_kwargs: Any) -> Chart

Categorical figure-level function.

Dispatches to the appropriate mark based on kind:

  • "strip" -- mark_point with Jitter (when jitter=True, default) or Dodge (when dodge=True and hue is set).
  • "swarm" -- mark_swarm.
  • "box" -- mark_boxplot (box + whiskers + outliers).
  • "violin" -- mark_violin (kernel-density outline).
  • "boxen" -- mark_boxen (letter-value / extended box).
  • "point" -- mark_point per observation on the categorical axis.
  • "bar" -- mark_bar per observation on the categorical axis.
  • "count" -- Aggregate(count) + mark_bar.
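For the "strip" kind, the seed parameter exists so that jittered positions are reproducible. A minimal sketch of that idea in plain Python follows; jitter_offsets and its width argument are hypothetical and do not describe ferrum's Jitter transform, which may draw its noise differently.

```python
import random


def jitter_offsets(n, width=0.2, seed=None):
    """Draw n uniform offsets in [-width, width]; a fixed seed repeats the draw."""
    rng = random.Random(seed)  # private generator, leaves global state untouched
    return [rng.uniform(-width, width) for _ in range(n)]


# Category index 2 on the categorical axis, five observations.
base = 2
first = [base + d for d in jitter_offsets(5, seed=42)]
second = [base + d for d in jitter_offsets(5, seed=42)]
```

Both lists are identical because the same seed was used; with seed=None the positions would change from call to call.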

Parameters:

Name Type Description Default
data DataFrame-like

Input data accepted by Chart(data).

required
x str or encoding

Column name for the horizontal axis (categorical by default).

None
y str or encoding

Column name for the vertical axis (value by default).

None
hue str or encoding

Column name to map to color (one visual group per level).

None
col str

Column name for faceting across columns.

None
row str

Column name for faceting across rows.

None
kind ('strip', 'swarm', 'box', 'violin', 'boxen', 'point', 'bar', 'count')

Which categorical mark to draw.

"strip"
order list of str

Explicit ordering for the categorical axis levels. Passed as sort=order on the categorical-axis encoding so the domain renders in the given order.

None
hue_order list of str

Explicit ordering for hue levels. Passed as sort=hue_order on the color encoding.

None
orient ('h', 'v', None)

"h" flips the axes (x becomes the value axis, y the category axis) and applies CoordFlip. "v" and None are both treated as vertical (the default); no error is raised for other values.

None
dodge bool

When True and hue is set, apply Dodge so each hue level is drawn side-by-side rather than overlaid.

False
jitter bool

For kind="strip", add Jitter on the categorical axis. Ignored when dodge=True.

True
native_scale bool

When True, treat the categorical axis as quantitative instead of ordinal (preserves numeric spacing rather than equal-spacing categories). Currently raises ValueError because the renderer does not support quantitative categorical axes.

False
ci int or float

Confidence-interval level (0 to 100) for "point" and "bar" kinds. Currently raises ValueError because the Summary transform is not yet wired into catplot.

95
n_boot int

Bootstrap iteration count used to compute ci. Currently raises ValueError alongside ci.

1000
seed int or None

Random seed forwarded to Jitter for reproducible strip positions.

None
theme Theme

Visual theme applied via Chart.theme().

None
mark dict

Per-layer mark overrides. For composite-mark charts, keys are layer names (e.g. {"scatter": {"opacity": 0.5}}); for single-mark charts, a flat dict of mark properties.

None
encode dict

Additional encoding kwargs merged via Chart.encode(**encode).

None
properties dict

Chart properties merged via Chart.properties(**properties) (e.g. {"width": 400, "title": "My chart"}).

None
layers list

Extra layers appended via Chart.layer(*layers).

None
**encode_kwargs Any

Additional keyword arguments forwarded to Chart.encode().

{}

Returns:

Type Description
Chart

Configured chart (possibly faceted or coord-flipped).

Raises:

Type Description
ValueError

If kind is not one of the supported values.

ValueError

If kind="count" is used without specifying x (or y when orient="h").

ValueError

If native_scale=True is passed (not yet supported by the renderer).

ValueError

If ci is not the default value 95 and the kind is "point" or "bar" (Summary transform not yet wired).

Examples:

>>> import ferrum as fm
>>> fm.catplot(df, x="species", y="sepal_length", kind="box")

Group by a hue variable with dodged bars:

>>> fm.catplot(df, x="day", y="tip", hue="sex", kind="bar", dodge=True)

Horizontal violin plot:

>>> fm.catplot(df, x="total_bill", y="day", kind="violin", orient="h")
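The four ValueError conditions listed under Raises can be summarised as a small validation sketch. The function name validate_catplot_args and the error messages are hypothetical; only the conditions themselves are taken from the table above, and ferrum may check them in a different order.

```python
SUPPORTED_KINDS = ("strip", "swarm", "box", "violin", "boxen", "point", "bar", "count")


def validate_catplot_args(kind="strip", x=None, y=None, orient=None,
                          native_scale=False, ci=95):
    """Raise ValueError for the argument combinations catplot documents as errors."""
    if kind not in SUPPORTED_KINDS:
        raise ValueError(f"unsupported kind: {kind!r}")
    if kind == "count":
        # The category variable is x, or y when the chart is horizontal.
        category = y if orient == "h" else x
        if category is None:
            raise ValueError("kind='count' needs the categorical variable")
    if native_scale:
        raise ValueError("native_scale=True is not supported by the renderer yet")
    if ci != 95 and kind in ("point", "bar"):
        raise ValueError("ci is not configurable yet for 'point' and 'bar'")


def is_rejected(**kwargs):
    """Return True when the combination raises ValueError."""
    try:
        validate_catplot_args(**kwargs)
    except ValueError:
        return True
    return False
```

For example, is_rejected(kind="bar", x="day", y="tip", ci=68) is True, while the same call with kind="box" passes because ci is only checked for "point" and "bar".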

relplot

relplot(data: Any, *, x: Any, y: Any, hue: Any = None, size: Any = None, style: Any = None, col: Any = None, row: Any = None, kind: str = 'scatter', height: float | None = None, aspect: float | None = None, mark: dict | None = None, encode: dict | None = None, properties: dict | None = None, layers: list | None = None, theme: Any = None, **encode_kwargs: Any) -> Chart

Relational figure-level function for scatter and line plots.

Dispatches to the appropriate mark based on kind:

  • "scatter" -- mark_point(). style maps to Shape.
  • "line" -- mark_line(). style maps to StrokeDash.

Parameters:

Name Type Description Default
data DataFrame-like

Input data accepted by Chart(data).

required
x str or encoding

Column name for the horizontal axis (required).

required
y str or encoding

Column name for the vertical axis (required).

required
hue str or encoding

Column name to map to color (one group per level).

None
size str or encoding

Column name to map to the Size channel (point area or line width).

None
style str or encoding

Column name to map to Shape (scatter) or StrokeDash (line).

None
col str

Column name for faceting across columns.

None
row str

Column name for faceting across rows.

None
kind ('scatter', 'line')

Which relational mark to draw.

"scatter"
height float or None

Height of the chart in pixels. Width is derived from aspect.

None
aspect float or None

Aspect ratio (width = height * aspect). Requires height.

None
theme Theme

Visual theme applied via Chart.theme().

None
mark dict

Per-layer mark overrides. For composite-mark charts, keys are layer names (e.g. {"scatter": {"opacity": 0.5}}); for single-mark charts, a flat dict of mark properties.

None
encode dict

Additional encoding kwargs merged via Chart.encode(**encode).

None
properties dict

Chart properties merged via Chart.properties(**properties) (e.g. {"width": 400, "title": "My chart"}).

None
layers list

Extra layers appended via Chart.layer(*layers).

None
**encode_kwargs Any

Additional keyword arguments forwarded to Chart.encode().

{}

Returns:

Type Description
Chart

Configured chart (possibly faceted or sized).

Raises:

Type Description
ValueError

If kind is not one of "scatter" or "line".

Examples:

Scatter with hue grouping and column faceting:

>>> import ferrum as fm
>>> fm.relplot(df, x="total_bill", y="tip", hue="sex", col="time")

Line plot with per-level dash styling:

>>> fm.relplot(df, x="timepoint", y="signal", hue="region",
...            style="region", kind="line")
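The relationship between height and aspect (width = height * aspect) can be sketched as below. The helper resolve_panel_size is hypothetical. Two behaviours are assumptions rather than documented facts for relplot: raising when aspect is given without height, and falling back to an aspect of 1.0 when only height is given.

```python
def resolve_panel_size(height=None, aspect=None):
    """Return {"width": ..., "height": ...} in pixels, or None when unsized."""
    if height is None:
        if aspect is not None:
            # Assumption: the docs only say that aspect "requires height".
            raise ValueError("aspect requires height")
        return None
    if aspect is None:
        aspect = 1.0  # assumption: square panel when aspect is omitted
    return {"width": height * aspect, "height": height}


size = resolve_panel_size(height=300, aspect=1.5)
```

A dict of this shape matches what the properties parameter accepts, for example {"width": 400, "title": "My chart"}.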

pairplot

pairplot(data: Any, *, vars: Any = None, x_vars: Any = None, y_vars: Any = None, hue: Any = None, kind: str = 'scatter', diag_kind: str = 'auto', markers: Any = None, height: float | None = None, aspect: float | None = None, corner: bool = False, dropna: bool = False, mark: dict | None = None, encode: dict | None = None, properties: dict | None = None, layers: list | None = None, theme: Any = None, **encode_kwargs: Any) -> RepeatChart

Pairwise-scatter grid (scatterplot matrix).

Returns a RepeatChart whose template repeats over the cartesian product of row x column field lists, resolved from vars or x_vars/y_vars. When neither is supplied, all numeric columns in data are used.
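The row-by-column product described above can be sketched with itertools.product. The helper panel_cells is hypothetical and works on explicit column names only (it does not inspect data for numeric columns). Keeping the diagonal when corner=True is an assumption.

```python
from itertools import product


def panel_cells(vars=None, x_vars=None, y_vars=None, corner=False):
    """Return the (row_field, column_field) pair for every panel in the grid."""
    if vars is not None and (x_vars is not None or y_vars is not None):
        raise ValueError("vars cannot be combined with x_vars/y_vars")
    if (x_vars is None) != (y_vars is None):
        raise ValueError("x_vars and y_vars must be supplied together")
    if vars is not None:
        rows = cols = list(vars)  # symmetric grid
    elif x_vars is not None:
        rows, cols = list(y_vars), list(x_vars)
    else:
        raise ValueError("this sketch needs explicit column names")
    cells = []
    for (i, row), (j, col) in product(enumerate(rows), enumerate(cols)):
        if corner and j > i:
            continue  # skip panels above the diagonal (assumed corner behaviour)
        cells.append((row, col))
    return cells


full = panel_cells(vars=["sepal_length", "sepal_width", "petal_length"])
lower = panel_cells(vars=["sepal_length", "sepal_width", "petal_length"], corner=True)
```

Three variables give nine panels in the full grid and six in the lower triangle.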

Parameters:

Name Type Description Default
data DataFrame-like

Input data accepted by Chart(data).

required
vars sequence of str

Column names to plot on both axes (symmetric grid). Cannot be combined with x_vars/y_vars.

None
x_vars sequence of str

Column names for the columns of the grid. Must be paired with y_vars.

None
y_vars sequence of str

Column names for the rows of the grid. Must be paired with x_vars.

None
hue str or encoding

Column name to map to color across all panels.

None
kind ('scatter', 'kde', 'hist', 'reg')

Mark for the off-diagonal cells. "scatter" calls mark_point; "kde" calls mark_density; "hist" calls mark_histogram; "reg" calls mark_smooth(method="lm").

"scatter"
diag_kind ('auto', 'hist', 'kde', None, 'none')

Mark for the diagonal cells (only when vars is symmetric). "auto" resolves to "kde" (KDE is smoother and more informative than histograms for overlapping distributions). Pass None or "none" to suppress diagonal marks.

"auto"
markers str or list of str

Point-marker shape(s). A single string (e.g. "square") sets the shape on every scatter panel. A list maps each hue level to a shape via the Shape encoding. Only applies when kind="scatter".

None
height float or None

Height of each individual panel in pixels. When set alongside aspect, each panel's width is height * aspect.

None
aspect float or None

Aspect ratio per panel (width = height * aspect). Defaults to 1.0 when height is set and aspect is omitted.

None
corner bool

Render only the lower-triangle panels.

False
dropna bool

When True, drop rows with any null value in the selected variable columns before building the pairplot.

False
theme Theme

Visual theme applied to all panels.

None
mark dict

Per-layer mark overrides. For composite-mark charts, keys are layer names (e.g. {"scatter": {"opacity": 0.5}}); for single-mark charts, a flat dict of mark properties.

None
encode dict

Additional encoding kwargs merged via Chart.encode(**encode).

None
properties dict

Chart properties merged via Chart.properties(**properties) (e.g. {"width": 400, "title": "My chart"}).

None
layers list

Extra layers appended via Chart.layer(*layers).

None
**encode_kwargs Any

Additional keyword arguments forwarded to Chart.encode() on every panel template.

{}

Returns:

Type Description
RepeatChart

Grid of panels sharing the same off-diagonal and (optionally) diagonal chart templates.

Raises:

Type Description
ValueError

If kind or diag_kind is not one of the supported values.

ValueError

If both vars and x_vars/y_vars are supplied.

ValueError

If only one of x_vars / y_vars is supplied.

Examples:

>>> import ferrum as fm
>>> fm.pairplot(df)

Specific variables with KDE off-diagonal and per-species color:

>>> fm.pairplot(
...     df, vars=["sepal_length", "sepal_width", "petal_length"],
...     kind="kde", hue="species",
... )

heatmap

heatmap(data: Any, *, annot: bool = True, fmt: str = '.2f', cmap: str | None = None, linewidths: float = 0.5, linecolor: str = 'white', vmin: float | None = None, vmax: float | None = None, center: float | None = None, robust: bool = False, square: bool = False, mask: Any = None, mark: dict | None = None, encode: dict | None = None, properties: dict | None = None, layers: list | None = None, theme: Any = None, **encode_kwargs: Any) -> Chart

2-D heatmap of a wide-format DataFrame.

Each row of data becomes a row of the heatmap; each numeric column becomes a column. The first non-numeric column (if any) is used as the row-label axis; rows without a label column get a synthetic integer index. The DataFrame is unpivoted to long form via Unpivot, then rendered with mark_rect and an optional annotation layer (mark_text).
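The wide-to-long step can be illustrated on a list of dicts standing in for a DataFrame. This is a sketch of the reshaping idea only, not ferrum's Unpivot transform; the function name unpivot and the output field names "row", "column", and "value" are hypothetical.

```python
from numbers import Number


def unpivot(rows):
    """Turn wide records into one record per (row label, numeric column) cell."""
    columns = list(rows[0])
    numeric = [
        c for c in columns
        if all(isinstance(r[c], Number) and not isinstance(r[c], bool) for r in rows)
    ]
    if not numeric:
        raise ValueError("data has no numeric columns")
    # First non-numeric column labels the rows; otherwise use an integer index.
    label_col = next((c for c in columns if c not in numeric), None)
    long_form = []
    for i, record in enumerate(rows):
        label = record[label_col] if label_col is not None else i
        for c in numeric:
            long_form.append({"row": label, "column": c, "value": record[c]})
    return long_form


wide = [
    {"gene": "g1", "s1": 1.0, "s2": 2.0},
    {"gene": "g2", "s1": 3.0, "s2": 4.0},
]
cells = unpivot(wide)
```

Each resulting record corresponds to one rectangle of the heatmap, and to one text label when annot=True.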

Parameters:

Name Type Description Default
data DataFrame-like

Wide-format input. Numeric columns become the heatmap cells; the first non-numeric column (if present) labels the rows.

required
annot bool

Overlay cell values as text using mark_text.

True
fmt str

Python format specifier applied to cell values in the annotation layer.

".2f"
cmap str or None

Color scheme name (e.g. "blues", "viridis", "rdbu"). None (default) defers to the theme's sequential scheme.

None
linewidths float

Width of the cell border stroke in pixels. 0 disables borders.

0.5
linecolor str

Color of the cell border stroke.

"white"
vmin float or None

Minimum of the color scale domain. Overrides robust when set.

None
vmax float or None

Maximum of the color scale domain. Overrides robust when set.

None
center float or None

Value to center a diverging color scale on (maps to scale.domainMid).

None
robust bool

When True and vmin/vmax are unset, clip the color scale to the 2nd and 98th percentiles of the data.

False
square bool

Force equal width and height per cell so the heatmap is square.

False
mask array-like, "upper", "lower", or None

Cell-masking control. When "upper", only the upper triangle (including the diagonal) is shown. When "lower", only the lower triangle (including the diagonal) is shown. When an array-like boolean matrix (same shape as the numeric portion of data), cells where the mask is True are hidden.

None
theme Theme

Visual theme applied via Chart.theme().

None
mark dict

Per-layer mark overrides. For composite-mark charts, keys are layer names (e.g. {"scatter": {"opacity": 0.5}}); for single-mark charts, a flat dict of mark properties.

None
encode dict

Additional encoding kwargs merged via Chart.encode(**encode).

None
properties dict

Chart properties merged via Chart.properties(**properties) (e.g. {"width": 400, "title": "My chart"}).

None
layers list

Extra layers appended via Chart.layer(*layers).

None
**encode_kwargs Any

Additional keyword arguments forwarded to Chart.encode().

{}

Returns:

Type Description
Chart

Possibly layered chart (mark_rect + mark_text when annot=True).

Raises:

Type Description
ValueError

If data has no numeric columns.

Examples:

>>> import ferrum as fm
>>> fm.heatmap(corr_df, cmap="rdbu", center=0)

Suppress annotations and use a custom domain:

>>> fm.heatmap(wide_df, annot=False, vmin=0, vmax=1, cmap="greens")
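How vmin, vmax, and robust interact can be sketched in plain Python. The helper color_domain is hypothetical, and the linearly interpolated percentile used here is an assumption; ferrum may compute the 2nd and 98th percentiles with a different method.

```python
def percentile(sorted_values, q):
    """Linearly interpolated percentile of an already sorted list (0 <= q <= 100)."""
    pos = (len(sorted_values) - 1) * q / 100
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    frac = pos - lo
    return sorted_values[lo] + (sorted_values[hi] - sorted_values[lo]) * frac


def color_domain(values, vmin=None, vmax=None, robust=False):
    """Explicit vmin/vmax win; otherwise use the 2nd/98th percentiles or min/max."""
    ordered = sorted(values)
    lo = percentile(ordered, 2) if robust else ordered[0]
    hi = percentile(ordered, 98) if robust else ordered[-1]
    return (vmin if vmin is not None else lo, vmax if vmax is not None else hi)


values = list(range(101))  # 0, 1, ..., 100
plain = color_domain(values)
clipped = color_domain(values, robust=True)
mixed = color_domain(values, vmin=10, robust=True)
```

In the last call the explicit vmin replaces the robust lower bound, while the upper bound still comes from the 98th percentile.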

clustermap

clustermap(data: Any, *, method: str = 'ward', metric: str = 'euclidean', cmap: str | None = None, z_score: Any = None, standard_scale: Any = None, figsize: Any = None, dendrogram_ratio: float = 0.2, mark: dict | None = None, encode: dict | None = None, properties: dict | None = None, layers: list | None = None, theme: Any = None, **encode_kwargs: Any) -> 'ClusterMapChart'

Clustered heatmap with row and column dendrograms.

Returns a ClusterMapChart composed of:

  • a center heatmap of the hierarchically-reordered wide-format DataFrame (Linkage + Reorder + Unpivot + mark_rect),
  • a column dendrogram (top, mark_segment, reading col_link_segments), and
  • a row dendrogram (left, mark_segment + CoordFlip, reading row_link_segments).
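Before clustering, the data can be standardised (z_score) or rescaled to [0, 1] (standard_scale) along rows (0) or columns (1). The sketch below shows both operations on a list of lists. The helper names are hypothetical, and using the population standard deviation is an assumption; the Linkage transform may use the sample standard deviation instead.

```python
from statistics import mean, pstdev


def _by_axis(matrix, axis, fn):
    """Apply fn to every row (axis=0) or every column (axis=1)."""
    if axis == 0:
        return [fn(list(row)) for row in matrix]
    if axis == 1:
        columns = [fn(list(col)) for col in zip(*matrix)]
        return [list(row) for row in zip(*columns)]  # transpose back
    raise ValueError("axis must be 0 (rows) or 1 (columns)")


def z_score(matrix, axis):
    """Subtract the mean and divide by the (population) standard deviation."""
    def standardise(vec):
        mu, sd = mean(vec), pstdev(vec)
        return [(v - mu) / sd if sd else 0.0 for v in vec]
    return _by_axis(matrix, axis, standardise)


def standard_scale(matrix, axis):
    """Subtract the minimum and divide by the range, giving values in [0, 1]."""
    def rescale(vec):
        lo, span = min(vec), max(vec) - min(vec)
        return [(v - lo) / span if span else 0.0 for v in vec]
    return _by_axis(matrix, axis, rescale)


data = [[1, 2, 3], [10, 20, 40]]
row_scaled = standard_scale(data, axis=0)
col_scaled = standard_scale([[1, 10], [3, 30]], axis=1)
row_z = z_score(data, axis=0)
```

Scaling by rows makes rows with very different magnitudes comparable, which changes the distances the linkage sees.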

Parameters:

Name Type Description Default
data DataFrame-like

Wide-format input. Same layout requirements as heatmap.

required
method str

Linkage algorithm forwarded to Linkage (e.g. "ward", "average", "complete", "single").

"ward"
metric str

Distance metric forwarded to Linkage (e.g. "euclidean", "cosine", "correlation").

"euclidean"
cmap str or None

Color scheme name for the center heatmap (e.g. "magma", "viridis", "rdbu"). Forwarded to the Color encoding's scheme scale option. "magma" is preferred for dense heatmaps due to its perceptual uniformity across a wide luminance range.

None
z_score (0, 1, None)

Standardise data along rows (0) or columns (1) before clustering; forwarded to Linkage.

None
standard_scale (0, 1, None)

Normalise to [0, 1] along rows (0) or columns (1); forwarded to Linkage.

None
figsize tuple of (float, float)

Overall figure size as (width, height) in pixels, applied to the center heatmap panel via .properties().

None
dendrogram_ratio float

Fraction of the total width/height allocated to each dendrogram panel, forwarded to ClusterMapChart.

0.2
theme Theme

Visual theme applied to the center and dendrogram charts.

None
mark dict

Per-layer mark overrides. For composite-mark charts, keys are layer names (e.g. {"scatter": {"opacity": 0.5}}); for single-mark charts, a flat dict of mark properties.

None
encode dict

Additional encoding kwargs merged via Chart.encode(**encode).

None
properties dict

Chart properties merged via Chart.properties(**properties) (e.g. {"width": 400, "title": "My chart"}).

None
layers list

Extra layers appended via Chart.layer(*layers).

None
**encode_kwargs Any

Additional keyword arguments forwarded to Chart.encode() on the center heatmap chart.

{}

Returns:

Type Description
ClusterMapChart

Compound view with center, row_dendrogram, and col_dendrogram sub-charts.

Raises:

Type Description
ValueError

If data has no numeric columns.

Examples:

>>> import ferrum as fm
>>> fm.clustermap(expr_df, method="average", cmap="rdbu", z_score=1)

Correlation matrix with the default Ward linkage and Euclidean metric:

>>> fm.clustermap(corr_df, cmap="viridis")

jointplot

jointplot(data: Any, *, x: str, y: str, hue: Any = None, kind: str = 'scatter', marginal_kind: str = 'hist', ratio: int = 5, space: float = 0.05, xlim: Any = None, ylim: Any = None, joint_kws: Any = None, marginal_kws: Any = None, height: float | None = None, mark: dict | None = None, encode: dict | None = None, properties: dict | None = None, layers: list | None = None, theme: Any = None, **encode_kwargs: Any) -> JointChart

Joint-distribution plot with marginals.

Builds a JointChart composed of a central bivariate plot flanked by univariate marginals along the x (top) and y (right) axes.

Parameters:

Name Type Description Default
data DataFrame-like

Input data accepted by Chart(data).

required
x str

Column name for the horizontal axis (required).

required
y str

Column name for the vertical axis (required).

required
hue str or encoding

Column name to map to color in both the center and marginal charts.

None
kind ('scatter', 'kde', 'hist', 'hex', 'reg')

Mark to use for the central panel. "scatter" draws mark_point; "kde" draws mark_density; "hist" draws a 2-D histogram via Bin2D + mark_rect; "hex" draws mark_hex; "reg" layers mark_point + a mark_smooth(method="lm") fit line.

"scatter"
marginal_kind ('hist', 'kde', 'rug', 'box')

Mark to use for the marginal panels (same kind applied to both the top x-marginal and the right y-marginal).

"hist"
ratio int

Size ratio of the center panel to the marginal panels.

5
space float

Gap (in layout units) between the center and marginal panels.

0.05
xlim tuple

(min, max) domain override for the x-axis. Applied as an explicit scale domain on the center and top-marginal x encodings via X(field, scale={"domain": [min, max]}).

None
ylim tuple

(min, max) domain override for the y-axis. Applied as an explicit scale domain on the center and right-marginal y encodings via Y(field, scale={"domain": [min, max]}).

None
joint_kws dict

Extra keyword arguments forwarded to the center-panel mark call.

None
marginal_kws dict

Extra keyword arguments forwarded to the marginal mark calls.

None
height float or None

Height and width of the square central panel in pixels.

None
theme Theme

Visual theme applied to all three panels via Chart.theme().

None
mark dict

Per-layer mark overrides. For composite-mark charts, keys are layer names (e.g. {"scatter": {"opacity": 0.5}}); for single-mark charts, a flat dict of mark properties.

None
encode dict

Additional encoding kwargs merged via Chart.encode(**encode).

None
properties dict

Chart properties merged via Chart.properties(**properties) (e.g. {"width": 400, "title": "My chart"}).

None
layers list

Extra layers appended via Chart.layer(*layers).

None
**encode_kwargs Any

Additional keyword arguments forwarded to Chart.encode() on the center chart.

{}

Returns:

Type Description
JointChart

Compound view with center, top, and right sub-charts.

Raises:

Type Description
ValueError

If kind or marginal_kind is not one of the supported values.

Examples:

>>> import ferrum as fm
>>> fm.jointplot(df, x="sepal_length", y="sepal_width")

2-D histogram center with KDE marginals, colored by species:

>>> fm.jointplot(
...     df, x="sepal_length", y="petal_length",
...     kind="hist", marginal_kind="kde", hue="species",
... )
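The kind="hist" center panel is a 2-D histogram. As a rough illustration of the counting step behind such a panel, the sketch below bins two columns into an equal-width grid in plain Python. The function bin_2d and its bins argument are hypothetical and do not describe the Bin2D transform's actual parameters.

```python
def bin_2d(xs, ys, bins=4):
    """Count points in a bins x bins grid; result is indexed as grid[y_bin][x_bin]."""
    def bin_index(value, lo, hi):
        width = (hi - lo) / bins or 1.0  # guard against constant data
        # The maximum lands on the right edge, so clamp it into the last bin.
        return min(int((value - lo) / width), bins - 1)

    x_lo, x_hi = min(xs), max(xs)
    y_lo, y_hi = min(ys), max(ys)
    grid = [[0] * bins for _ in range(bins)]
    for x, y in zip(xs, ys):
        grid[bin_index(y, y_lo, y_hi)][bin_index(x, x_lo, x_hi)] += 1
    return grid


grid = bin_2d([0, 1, 2, 3], [0, 1, 2, 3], bins=2)
```

Each grid cell would become one rectangle whose color encodes the count.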