Backends: Caching and Distributed Computing#

Training multiple model instances for consistency evaluation can be computationally expensive. iTuna provides several backends to help:

  1. Disk caching - Avoid re-training identical models

  2. Distributed execution - Train models in parallel across multiple processes

  3. DataJoint integration - Database-backed caching for team collaboration

This tutorial covers how to configure and use these backends.

import numpy as np
from sklearn.decomposition import FastICA

import ituna
# Sample data for all examples
np.random.seed(42)
X = np.random.randn(1000, 20)

Default Backend: In-Memory#

By default, iTuna uses the in_memory backend, which trains all models fresh each time without caching.

# Check current configuration
print("Current config:", ituna.config.get_config())
Current config: {'DEFAULT_BACKEND': 'in_memory', 'BACKEND_KWARGS': {}, 'CACHE_DIR': 'backend_store', 'FILE_LOCK_TIMEOUT': 30}

Disk Cache Backend#

The disk_cache backend saves trained models to disk. If you run the same model on the same data again, it loads from cache instead of re-training.

This is extremely useful during exploratory analysis when you’re iterating on visualization or downstream analysis without changing the model.

# Enable disk caching globally
ituna.config.DEFAULT_BACKEND = "disk_cache"

print("Updated config:", ituna.config.get_config())
Updated config: {'DEFAULT_BACKEND': 'disk_cache', 'BACKEND_KWARGS': {}, 'CACHE_DIR': 'backend_store', 'FILE_LOCK_TIMEOUT': 30}
# Create and fit an ensemble - models will be cached
ensemble = ituna.ConsistencyEnsemble(
    estimator=FastICA(n_components=5, max_iter=500),
    consistency_transform=ituna.metrics.PairwiseConsistency(
        indeterminacy=ituna.metrics.Permutation(),
    ),
    random_states=3,
)

# First run: trains and caches models
print("First run (training):")
ensemble.fit(X)
print(f"Score: {ensemble.score(X):.4f}")
First run (training):
Score: 0.7096
# Second run: loads from cache (much faster)
print("\nSecond run (loading from cache):")
ensemble2 = ituna.ConsistencyEnsemble(
    estimator=FastICA(n_components=5, max_iter=500),
    consistency_transform=ituna.metrics.PairwiseConsistency(
        indeterminacy=ituna.metrics.Permutation(),
    ),
    random_states=3,
)
ensemble2.fit(X)
print(f"Score: {ensemble2.score(X):.4f}")
Second run (loading from cache):
Score: 0.7096

Cache Invalidation#

The cache key is computed from:

  • Model class and all hyperparameters

  • Data hash

  • Random state

If you change any hyperparameter, it’s treated as a new model and will be trained fresh.

# Changing max_iter creates a new cache entry
ensemble3 = ituna.ConsistencyEnsemble(
    estimator=FastICA(n_components=5, max_iter=501),  # Different max_iter!
    consistency_transform=ituna.metrics.PairwiseConsistency(
        indeterminacy=ituna.metrics.Permutation(),
    ),
    random_states=3,
)

print("New hyperparameter - trains fresh:")
ensemble3.fit(X)
print(f"Score: {ensemble3.score(X):.4f}")
New hyperparameter - trains fresh:
Score: 0.6284
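
The data hash works the same way: fitting an equivalent ensemble on different data creates a new cache entry instead of reusing the models above. A minimal sketch, assuming a hypothetical second dataset X2:

# Different data -> different data hash -> cache miss, so models train fresh
X2 = np.random.randn(1000, 20)  # hypothetical second dataset

ensemble4 = ituna.ConsistencyEnsemble(
    estimator=FastICA(n_components=5, max_iter=500),
    consistency_transform=ituna.metrics.PairwiseConsistency(
        indeterminacy=ituna.metrics.Permutation(),
    ),
    random_states=3,
)
ensemble4.fit(X2)  # trained fresh and cached under the new key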

Custom Cache Directory#

By default, models are cached in ./backend_store. You can customize this:

# Set custom cache directory
ituna.config.CACHE_DIR = "./my_model_cache"

print(f"Cache directory: {ituna.config.CACHE_DIR}")
Cache directory: ./my_model_cache

Shared Caching#

The disk cache is robust to concurrent access, so you can:

  • Share a cache directory across multiple notebooks

  • Share a cache with collaborators (e.g., on a network drive)

If someone has already trained a model with the same configuration on the same data, you’ll load their cached model instead of re-training.
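
To share, point the cache directory at the common location. A minimal sketch, kept commented out so it does not change the configuration used in the rest of this tutorial (the path is hypothetical):

# Cache directory on a network drive shared by the team (hypothetical path)
# ituna.config.CACHE_DIR = "/shared/projects/our_team/ituna_cache"
#
# With an identical estimator, hyperparameters, data, and random states,
# fit() will load models that a collaborator already cached here.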

Using Context Managers#

Instead of changing global config, you can use context managers for temporary settings:

# Reset to default
ituna.config.DEFAULT_BACKEND = "in_memory"

# Use disk cache only within this block
with ituna.config.config_context(DEFAULT_BACKEND="disk_cache"):
    print("Inside context:", ituna.config.get_config()["DEFAULT_BACKEND"])
    ensemble.fit(X)

print("Outside context:", ituna.config.get_config()["DEFAULT_BACKEND"])
Inside context: disk_cache
Outside context: in_memory

Distributed Backend#

The disk_cache_distributed backend trains models in parallel across multiple processes. This is useful when:

  • You have a multi-core machine and want to utilize all cores

  • You are training many models (a large random_states value)

Auto Mode#

In auto mode, iTuna automatically spawns worker processes:

# Configure distributed backend with auto workers
ituna.config.DEFAULT_BACKEND = "disk_cache_distributed"
ituna.config.BACKEND_KWARGS = {
    "trigger_type": "auto",
    "num_workers": 4,  # Number of parallel processes
}

print("Distributed config:", ituna.config.get_config())
Distributed config: {'DEFAULT_BACKEND': 'disk_cache_distributed', 'BACKEND_KWARGS': {'trigger_type': 'auto', 'num_workers': 4}, 'CACHE_DIR': './my_model_cache', 'FILE_LOCK_TIMEOUT': 30}
# Train with 10 random states in parallel
ensemble_parallel = ituna.ConsistencyEnsemble(
    estimator=FastICA(n_components=5, max_iter=500),
    consistency_transform=ituna.metrics.PairwiseConsistency(
        indeterminacy=ituna.metrics.Permutation(),
    ),
    random_states=10,
)

ensemble_parallel.fit(X)
print(f"Score: {ensemble_parallel.score(X):.4f}")
Fitting models: 100%|██████████| 10/10 [00:04<00:00,  2.46it/s, trained=10/10, errors=0, reserved=0, sweep_trained=10/10, sweep_errors=0, sweep_reserved=0]
Score: 0.6479

Manual Mode (for HPC clusters)#

In manual mode, iTuna prints a command that you can run on external compute nodes (e.g., SLURM jobs). This is ideal for HPC environments.

# Configure manual distributed backend
ituna.config.DEFAULT_BACKEND = "disk_cache_distributed"
ituna.config.BACKEND_KWARGS = {
    "trigger_type": "manual",
    "sweep_type": "constant",
    "sweep_name": "my_experiment_sweep",
}

# When you call fit(), it will print the worker command
# and wait for external workers to complete the training
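
A sketch of the resulting workflow under the configuration above (the fit call is commented out because it blocks until the external workers have finished):

# ensemble.fit(X)  # prints the worker command for this sweep, then waits
#
# Run the printed command on one or more compute nodes (e.g. as a SLURM job);
# it is of the kind shown in the CLI section below.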

CLI Worker Commands#

iTuna provides command-line tools for running workers:

# Local distributed backend
ituna-fit-distributed --sweep-name <uuid> --cache-dir ./backend_store

# With DataJoint backend
ituna-fit-distributed-datajoint --sweep-name <uuid> --schema-name myschema

These can be submitted as SLURM jobs or run on any machine with access to the cache.

DataJoint Backend#

For team collaboration with database-backed caching, use the DataJoint backend.

Setup#

  1. Install DataJoint support:

    pip install ituna[datajoint]
    
  2. Configure database credentials in .env (see .env.template):

    DJ_HOST=your-database-host
    DJ_USER=your-username
    DJ_PASS=your-password
    
  3. Use the backend:

# DataJoint backend configuration (requires setup)
# ituna.config.DEFAULT_BACKEND = "datajoint"
# ituna.config.BACKEND_KWARGS = {
#     "trigger_type": "auto",
#     "num_workers": 4,
#     "schema_name": "my_ituna_schema",
# }

Summary#

Backend                   Use Case
in_memory                 Quick experiments, no caching needed
disk_cache                Iterative analysis, avoid re-training
disk_cache_distributed    Large sweeps, multi-core machines
datajoint                 Team collaboration, shared database

Key configuration options:

import ituna

# Set backend globally
ituna.config.DEFAULT_BACKEND = "disk_cache"
ituna.config.CACHE_DIR = "./my_cache"

# Or use context manager
with ituna.config.config_context(DEFAULT_BACKEND="disk_cache"):
    ensemble.fit(X)

# Reset to defaults for clean state
ituna.config.DEFAULT_BACKEND = "in_memory"
ituna.config.BACKEND_KWARGS = {}
ituna.config.CACHE_DIR = "backend_store"