Workflows and meta-estimators
=============================

Beyond single-call estimators, ``xyz`` provides meta-estimators and workflows for model selection, uncertainty quantification, and multivariate source selection.

Bootstrap confidence intervals
------------------------------

``BootstrapEstimate`` wraps any estimator and returns a point estimate plus a bootstrap distribution and confidence interval.

- **method** ``"iid"``: resample rows with replacement (suitable for non–time-series or when trials are exchangeable).
- **method** ``"trial"``: resample whole trials with replacement (multi-trial data); requires at least two trials.
- **method** ``"block"``: block bootstrap along time within each trial; use ``block_length`` to control block size.

Fitted attributes: ``estimate_``, ``bootstrap_distribution_``, ``ci_low_``, ``ci_high_``, ``standard_error_``. Use ``n_jobs`` to parallelize bootstrap replicates.

Example:

.. code-block:: python

    from xyz import BootstrapEstimate, GaussianTransferEntropy

    bootstrap = BootstrapEstimate(
        GaussianTransferEntropy(driver_indices=[1], target_indices=[0], lags=1),
        n_bootstrap=200,
        method="trial",
        ci=0.95,
        n_jobs=2,
        random_state=0,
    ).fit(data)
    print(bootstrap.estimate_, bootstrap.ci_low_, bootstrap.ci_high_)

Greedy source selection (multivariate TE)
-----------------------------------------

``GreedySourceSelectionTransferEntropy`` performs forward selection of driver variables using a partial transfer entropy estimator. You provide a base estimator with ``driver_indices`` and ``conditioning_indices``, a list of ``candidate_sources`` (column indices), and optional ``max_sources`` and ``min_improvement``. The meta-estimator repeatedly adds the source that most improves the TE score, and stops when no candidate adds at least ``min_improvement``.

Fitted attributes: ``selected_sources_``, ``best_estimator_``, ``best_score_``, ``selection_history_``. Supports ``n_jobs`` for evaluating candidate sets in parallel.

Example:

.. code-block:: python

    from xyz import GaussianPartialTransferEntropy, GreedySourceSelectionTransferEntropy

    selector = GreedySourceSelectionTransferEntropy(
        GaussianPartialTransferEntropy(
            driver_indices=[1],
            target_indices=[0],
            conditioning_indices=[],
            lags=1,
        ),
        candidate_sources=[1, 2, 3],
        max_sources=3,
        min_improvement=0.01,
    ).fit(data)
    # selector.selected_sources_ = [1, 2]  # chosen drivers
    # selector.best_estimator_.driver_indices == [1, 2]

Multivariate drivers in TE estimators
-------------------------------------

All time-series TE estimators accept ``driver_indices`` as a list of column indices. When multiple indices are given, the embedded driver past is the concatenation of the embedded pasts of each driver variable (same ``lags``, ``tau``, ``delay`` for all). This allows a single TE model to include several source variables without running greedy selection.

Example:

.. code-block:: python

    from xyz import GaussianTransferEntropy

    # data columns: 0=target, 1=driver1, 2=driver2, 3=noise
    te = GaussianTransferEntropy(
        driver_indices=[1, 2],
        target_indices=[0],
        lags=1,
    ).fit(data)

Parallelization (n_jobs)
------------------------

The following support an ``n_jobs`` parameter for parallel execution (default 1; use ``-1`` for all cores with joblib):

- ``RagwitzEmbeddingSearchCV``: parallel over (dimension, tau) candidates.
- ``InteractionDelaySearchCV``: parallel over delay candidates.
- ``SurrogatePermutationTest``: parallel over surrogate fits.
- ``BootstrapEstimate``: parallel over bootstrap replicates.
- ``GreedySourceSelectionTransferEntropy``: parallel over candidate source sets.

Results are deterministic for a fixed ``random_state`` when applicable.
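
The forward-selection loop described under *Greedy source selection* can be illustrated independently of ``xyz``. The following is a minimal sketch and not the library's implementation: ``greedy_forward_select`` and ``toy_score`` are hypothetical names, the scoring callback stands in for fitting a partial TE estimator on a candidate driver set, and the baseline score of zero for an empty driver set is an assumption.

.. code-block:: python

    from typing import Callable, List, Optional, Sequence, Tuple


    def greedy_forward_select(
        candidates: Sequence[int],
        score_fn: Callable[[List[int]], float],
        max_sources: Optional[int] = None,
        min_improvement: float = 0.0,
    ) -> Tuple[List[int], float, List[Tuple[int, float]]]:
        """Hypothetical helper: add one source per round while the score improves."""
        selected: List[int] = []
        best_score = 0.0  # assumption: an empty driver set scores zero
        history: List[Tuple[int, float]] = []
        remaining = list(candidates)
        limit = len(remaining) if max_sources is None else max_sources

        while remaining and len(selected) < limit:
            # Score every one-variable extension of the current driver set.
            trials = [(score_fn(selected + [c]), c) for c in remaining]
            top_score, top_candidate = max(trials, key=lambda t: t[0])
            # Stop when the best extension gains less than min_improvement.
            if top_score - best_score < min_improvement:
                break
            selected.append(top_candidate)
            remaining.remove(top_candidate)
            best_score = top_score
            history.append((top_candidate, top_score))

        return selected, best_score, history


    def toy_score(sources: List[int]) -> float:
        """Toy additive score (made-up gains, not a real TE estimate)."""
        gains = {1: 0.30, 2: 0.10, 3: 0.001}
        return sum(gains[s] for s in sources)


    selected, score, history = greedy_forward_select(
        [1, 2, 3], toy_score, max_sources=3, min_improvement=0.01
    )
    print(selected)

With these toy gains the loop selects sources 1 and 2 and then stops, because adding source 3 improves the score by only 0.001, which is below ``min_improvement``.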