QML tools
ML Tools
This module implements a Trainer class for torch Modules and QuantumModel. It also implements the QNN class and callbacks that can be used with the trainer module.
Trainer(model, optimizer, config, loss_fn='mse', train_dataloader=None, val_dataloader=None, test_dataloader=None, optimize_step=optimize_step, max_batches=None)
Bases: BaseTrainer
Trainer class to manage and execute training, validation, and testing loops for a model (e.g. QNN).

This class handles the overall training process, including:
- Managing epochs and steps
- Handling data loading and batching
- Computing and updating gradients
- Logging and monitoring training metrics
ATTRIBUTE | DESCRIPTION |
---|---|
current_epoch | The current epoch number. TYPE: `int` |
global_step | The global step across all epochs. TYPE: `int` |
Inherited Attributes
- use_grad (bool): Indicates if gradients are used for optimization. Default is True.
- model (nn.Module): The neural network model.
- optimizer (optim.Optimizer | NGOptimizer | None): The optimizer for training.
- config (TrainConfig): The configuration settings for training.
- train_dataloader (DataLoader | DictDataLoader | None): DataLoader for training data.
- val_dataloader (DataLoader | DictDataLoader | None): DataLoader for validation data.
- test_dataloader (DataLoader | DictDataLoader | None): DataLoader for testing data.
- optimize_step (Callable): Function for performing an optimization step.
- loss_fn (Callable): Loss function to use.
- num_training_batches (int): Number of training batches.
- num_validation_batches (int): Number of validation batches.
- num_test_batches (int): Number of test batches.
- state (str): Current state in the training process.
Default training routine
```
for epoch in range(max_iter + 1):
    # Training
    for batch in train_batches:
        train model
    # Validation
    if epoch % val_every == 0:
        for batch in val_batches:
            validate model
```
Notes
- In case of InfiniteTensorDataset, the number of batches is 1 per epoch.
- In case of TensorDataset, the number of batches is determined by the dataset length and batch size.
- Training runs for max_iter + 1 epochs; epoch 0 logs the untrained model.
- See the CallbackManager initialize_callbacks method for the default logging behavior.
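The loop cadence above can be sketched in plain Python. This is a hedged illustration of the schedule only, not the Trainer's actual code; `training_cadence` is a hypothetical helper:

```python
def training_cadence(max_iter: int, val_every: int) -> list[str]:
    """Return the sequence of train/validation events for one run."""
    events = []
    for epoch in range(max_iter + 1):  # max_iter + 1 epochs; epoch 0 is untrained
        events.append(f"train_epoch_{epoch}")
        # Validation fires every `val_every` epochs (0 disables it)
        if val_every > 0 and epoch % val_every == 0:
            events.append(f"val_epoch_{epoch}")
    return events

print(training_cadence(max_iter=2, val_every=2))
```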
Examples:
```python
import torch
from torch.optim import SGD
from qadence import (
    feature_map,
    hamiltonian_factory,
    hea,
    QNN,
    QuantumCircuit,
    TrainConfig,
    Z,
)
from qadence.ml_tools.trainer import Trainer
from qadence.ml_tools.optimize_step import optimize_step
from qadence.ml_tools.data import to_dataloader

# Initialize the model
n_qubits = 2
fm = feature_map(n_qubits)
ansatz = hea(n_qubits=n_qubits, depth=2)
observable = hamiltonian_factory(n_qubits, detuning=Z)
circuit = QuantumCircuit(n_qubits, fm, ansatz)
model = QNN(circuit, observable, backend="pyqtorch", diff_mode="ad")

# Set up the optimizer
optimizer = SGD(model.parameters(), lr=0.001)

# Use TrainConfig for configuring the training process
config = TrainConfig(
    max_iter=100,
    print_every=10,
    write_every=10,
    checkpoint_every=10,
    val_every=10
)

# Create the Trainer instance with TrainConfig
trainer = Trainer(
    model=model,
    optimizer=optimizer,
    config=config,
    loss_fn="mse",
    optimize_step=optimize_step
)

batch_size = 25
x = torch.linspace(0, 1, 32).reshape(-1, 1)
y = torch.sin(x)
train_loader = to_dataloader(x, y, batch_size=batch_size, infinite=True)
val_loader = to_dataloader(x, y, batch_size=batch_size, infinite=False)

# Train the model
model, optimizer = trainer.fit(train_loader, val_loader)
```
The Trainer supports both gradient-based and gradient-free optimization; gradient-based optimization is the default.

Notes:
- set_use_grad() (class level): sets the global use_grad flag, controlling whether the trainer uses gradient-based optimization.
- Context managers (instance level): enable_grad_opt() and disable_grad_opt() temporarily switch the optimization mode for specific code blocks. This is useful to mix gradient-based and gradient-free optimization in the same training process.
Examples
Gradient-based optimization:

```python
from torch import optim

optimizer = optim.SGD(model.parameters(), lr=0.01)

Trainer.set_use_grad(True)
trainer = Trainer(
    model=model,
    optimizer=optimizer,
    config=config,
    loss_fn="mse"
)
trainer.fit(train_loader, val_loader)
```

Or with the context manager:

```python
trainer = Trainer(
    model=model,
    config=config,
    loss_fn="mse"
)
with trainer.enable_grad_opt(optimizer):
    trainer.fit(train_loader, val_loader)
```
Gradient-free optimization:

```python
import nevergrad as ng
from qadence.ml_tools.parameters import num_parameters

ng_optimizer = ng.optimizers.NGOpt(
    budget=config.max_iter, parametrization=num_parameters(model)
)

Trainer.set_use_grad(False)
trainer = Trainer(
    model=model,
    optimizer=ng_optimizer,
    config=config,
    loss_fn="mse"
)
trainer.fit(train_loader, val_loader)
```

Or with the context manager:

```python
import nevergrad as ng
from qadence.ml_tools.parameters import num_parameters

ng_optimizer = ng.optimizers.NGOpt(
    budget=config.max_iter, parametrization=num_parameters(model)
)

trainer = Trainer(
    model=model,
    config=config,
    loss_fn="mse"
)
with trainer.disable_grad_opt(ng_optimizer):
    trainer.fit(train_loader, val_loader)
```
Initializes the Trainer class.
PARAMETER | DESCRIPTION |
---|---|
model | The PyTorch model to train. TYPE: `nn.Module` |
optimizer | The optimizer for training. TYPE: `optim.Optimizer or NGOptimizer or None` |
config | Training configuration object. TYPE: `TrainConfig` |
loss_fn | Loss function used for training. If not specified, the default mse loss is used. TYPE: `str or Callable` |
train_dataloader | DataLoader for training data. TYPE: `DataLoader or DictDataLoader or None` |
val_dataloader | DataLoader for validation data. TYPE: `DataLoader or DictDataLoader or None` |
test_dataloader | DataLoader for test data. TYPE: `DataLoader or DictDataLoader or None` |
optimize_step | Function to execute an optimization step. TYPE: `Callable` |
max_batches | Maximum number of batches to process per epoch. Only valid for finite TensorDataset dataloaders. If max_batches is not None, the number of batches used will be min(max_batches, len(dataloader.dataset)). In case of InfiniteTensorDataset, only 1 batch per epoch is used. TYPE: `int or None` |
Source code in qadence/ml_tools/trainer.py
_aggregate_result(result)
Aggregates the loss and metrics using the Accelerator's all_reduce_dict method if aggregation is enabled.
PARAMETER | DESCRIPTION |
---|---|
result | The result consisting of loss and metrics. For more details, look at the signature of build_optimize_result. TYPE: `tuple[torch.Tensor, dict[str, Any]]` |

RETURNS | DESCRIPTION |
---|---|
tuple[torch.Tensor, dict[str, Any]] | The aggregated loss and metrics. |
Source code in qadence/ml_tools/trainer.py
_batch_iter(dataloader, num_batches)
Yields batches from the provided dataloader.
The batch of data is also moved to the correct device and dtype using accelerator.prepare.
PARAMETER | DESCRIPTION |
---|---|
dataloader | The dataloader to iterate over. TYPE: `DataLoader or DictDataLoader` |
num_batches | The maximum number of batches to yield. TYPE: `int` |

YIELDS | DESCRIPTION |
---|---|
Iterable[tuple[torch.Tensor, ...] or None] | A batch from the dataloader moved to the specified device and dtype. |
Source code in qadence/ml_tools/trainer.py
_fit_end()
_fit_setup()
Sets up the training environment, initializes configurations, and moves the model to the specified device and data type. The callback_manager.start_training takes care of loading the checkpoint and setting up the writer.
Source code in qadence/ml_tools/trainer.py
_modify_batch_end_loss_metrics(loss_metrics)
Modifies the loss and metrics at the end of batch for proper logging.
All metrics are prefixed with the current state of the training process ("train_", "val_", or "test_"), and a "{state}_loss" entry is added to the metrics.
PARAMETER | DESCRIPTION |
---|---|
loss_metrics | Original loss and metrics. TYPE: `tuple[torch.Tensor, dict[str, Any]]` |

RETURNS | DESCRIPTION |
---|---|
tuple[None or torch.Tensor, dict[str, Any]] | Modified loss and metrics. |
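The prefixing behavior can be illustrated with a minimal sketch (assumed logic for illustration only, not the actual implementation):

```python
from typing import Any

def prefix_metrics(state: str, loss: float, metrics: dict[str, Any]) -> dict[str, Any]:
    """Prefix every metric with the current state and add a '{state}_loss' entry."""
    out = {f"{state}_{k}": v for k, v in metrics.items()}
    out[f"{state}_loss"] = loss
    return out
```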
Source code in qadence/ml_tools/trainer.py
_train()
Runs the main training loop over multiple epochs.
This method sets up the training process by performing any necessary pre-training actions (via on_train_start), configuring progress tracking (if available), and then iteratively calling _train_epoch to run through the epochs.

RETURNS | DESCRIPTION |
---|---|
list[list[tuple[torch.Tensor, dict[str, Any]]]] | Training loss metrics for all epochs. Structure: Epochs -> Training Batches -> (loss, metrics). |
Source code in qadence/ml_tools/trainer.py
_train_epochs(epoch_start, epoch_end, train_task=None, val_task=None)
Executes the training loop for a series of epochs.
PARAMETER | DESCRIPTION |
---|---|
epoch_start | The starting epoch index. TYPE: `int` |
epoch_end | The ending epoch index (non-inclusive). TYPE: `int` |
train_task | The progress bar task ID for training updates. If provided, the progress bar is updated after each epoch. Defaults to None. |
val_task | The progress bar task ID for validation updates. If provided and validation is enabled, the progress bar is updated after each validation run. Defaults to None. |

RETURNS | DESCRIPTION |
---|---|
tuple[list[list[tuple[torch.Tensor, dict[str, Any]]]], list[list[tuple[torch.Tensor, dict[str, Any]]]]] | A tuple of training and validation loss metrics for all epochs. Structure: Epochs -> Batches -> (loss, metrics). |
Source code in qadence/ml_tools/trainer.py
build_optimize_result(result)
Builds and stores the optimization result by calculating the average loss and metrics.
Result (or loss_metrics) can have multiple formats:
- None: Indicates no loss or metrics data is provided.
- tuple[torch.Tensor, dict[str, Any]]: A single tuple containing the loss tensor and metrics dictionary, at the end of a batch.
- list[tuple[torch.Tensor, dict[str, Any]]]: A list of tuples for multiple batches.
- list[list[tuple[torch.Tensor, dict[str, Any]]]]: A list of lists of tuples, where each inner list represents metrics across multiple batches within an epoch.
PARAMETER | DESCRIPTION |
---|---|
result | The loss and metrics data, which can have multiple formats. TYPE: `None or tuple[torch.Tensor, dict[Any, Any]] or list[tuple[torch.Tensor, dict[Any, Any]]] or list[list[tuple[torch.Tensor, dict[Any, Any]]]]` |

RETURNS | DESCRIPTION |
---|---|
None | This method does not return anything. It stores the computed average loss and metrics. |
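A minimal sketch of how the nested formats above can be flattened and averaged (illustrative only; the real method also stores the result on the trainer rather than returning it):

```python
def average_loss_metrics(result):
    """Average (loss, metrics) pairs over any of the supported formats."""
    if result is None:
        return None, {}
    if isinstance(result, tuple):  # a single (loss, metrics) pair
        return result
    # list of tuples, or list of lists of tuples: flatten to one level
    flat = []
    for item in result:
        flat.extend(item if isinstance(item, list) else [item])
    n = len(flat)
    avg_loss = sum(loss for loss, _ in flat) / n
    avg_metrics = {k: sum(m[k] for _, m in flat) / n for k in flat[0][1]}
    return avg_loss, avg_metrics
```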
Source code in qadence/ml_tools/trainer.py
fit(train_dataloader=None, val_dataloader=None)
Fits the model using the specified training configuration.
The dataloaders can be provided to train on new datasets, or the default dataloaders provided in the trainer will be used.
PARAMETER | DESCRIPTION |
---|---|
train_dataloader | DataLoader for training data. TYPE: `DataLoader or DictDataLoader or None` |
val_dataloader | DataLoader for validation data. TYPE: `DataLoader or DictDataLoader or None` |

RETURNS | DESCRIPTION |
---|---|
tuple[nn.Module, optim.Optimizer] | The trained model and optimizer. |
Source code in qadence/ml_tools/trainer.py
get_ic_grad_bounds(eta, epsilons, variation_multiple=20, dataloader=None)
Calculate the bounds on the gradient norm of the loss using Information Content.
PARAMETER | DESCRIPTION |
---|---|
eta | The sensitivity IC. TYPE: `float` |
epsilons | The epsilons to use as thresholds for discretization of the finite derivatives. TYPE: `torch.Tensor` |
variation_multiple | The number of sets of variational parameters to generate per variational parameter. The number of variational parameters required for the statistical analysis scales linearly with the number present in the model; this is that linear factor. Defaults to 20. TYPE: `int` |
dataloader | The dataloader for training data. A new dataloader can be provided, or the dataloader provided in the trainer will be used. If no dataloader is provided in either place, the model is assumed to require no input data. TYPE: `DataLoader or DictDataLoader or None` |

RETURNS | DESCRIPTION |
---|---|
tuple[float, float, float] | The max IC lower bound, max IC upper bound, and sensitivity IC upper bound. |
Examples:
```python
import torch
from torch.optim.adam import Adam

from qadence.constructors import ObservableConfig
from qadence.ml_tools.config import AnsatzConfig, FeatureMapConfig, TrainConfig
from qadence.ml_tools.data import to_dataloader
from qadence.ml_tools.models import QNN
from qadence.ml_tools.optimize_step import optimize_step
from qadence.ml_tools.trainer import Trainer
from qadence.operations.primitive import Z

fm_config = FeatureMapConfig(num_features=1)
ansatz_config = AnsatzConfig(depth=4)
obs_config = ObservableConfig(detuning=Z)

qnn = QNN.from_configs(
    register=4,
    obs_config=obs_config,
    fm_config=fm_config,
    ansatz_config=ansatz_config,
)

optimizer = Adam(qnn.parameters(), lr=0.001)

batch_size = 25
x = torch.linspace(0, 1, 32).reshape(-1, 1)
y = torch.sin(x)
train_loader = to_dataloader(x, y, batch_size=batch_size, infinite=True)

train_config = TrainConfig(max_iter=100)

trainer = Trainer(
    model=qnn,
    optimizer=optimizer,
    config=train_config,
    loss_fn="mse",
    train_dataloader=train_loader,
    optimize_step=optimize_step,
)

# Perform exploratory landscape analysis with Information Content
ic_sensitivity_threshold = 1e-4
epsilons = torch.logspace(-2, 2, 10)

max_ic_lower_bound, max_ic_upper_bound, sensitivity_ic_upper_bound = (
    trainer.get_ic_grad_bounds(
        eta=ic_sensitivity_threshold,
        epsilons=epsilons,
    )
)

# Resume training as usual...
trainer.fit(train_loader)
```
Source code in qadence/ml_tools/trainer.py
run_test_batch(batch)
Runs a single test batch.
PARAMETER | DESCRIPTION |
---|---|
batch | Batch of data from the DataLoader. TYPE: `tuple[torch.Tensor, ...]` |

RETURNS | DESCRIPTION |
---|---|
tuple[torch.Tensor, dict[str, Any]] | Loss and metrics for the batch. |
Source code in qadence/ml_tools/trainer.py
run_train_batch(batch)
Runs a single training batch, performing optimization.
We use the step function to optimize the model based on use_grad. use_grad = True entails gradient-based optimization, which uses the optimize_step function; use_grad = False entails gradient-free optimization, which uses the update_ng_parameters function.

PARAMETER | DESCRIPTION |
---|---|
batch | Batch of data from the DataLoader. TYPE: `tuple[torch.Tensor, ...]` |

RETURNS | DESCRIPTION |
---|---|
tuple[torch.Tensor, dict[str, Any]] | Loss and metrics for the batch. |
Source code in qadence/ml_tools/trainer.py
run_training(dataloader)
Runs the training for a single epoch, iterating over multiple batches.
PARAMETER | DESCRIPTION |
---|---|
dataloader | DataLoader for training data. TYPE: `DataLoader` |

RETURNS | DESCRIPTION |
---|---|
list[tuple[torch.Tensor, dict[str, Any]]] | Loss and metrics for each batch. Structure: Training Batches -> (loss, metrics). |
Source code in qadence/ml_tools/trainer.py
run_val_batch(batch)
Runs a single validation batch.
PARAMETER | DESCRIPTION |
---|---|
batch | Batch of data from the DataLoader. TYPE: `tuple[torch.Tensor, ...]` |

RETURNS | DESCRIPTION |
---|---|
tuple[torch.Tensor, dict[str, Any]] | Loss and metrics for the batch. |
Source code in qadence/ml_tools/trainer.py
run_validation(dataloader)
Runs the validation loop for a single epoch, iterating over multiple batches.
PARAMETER | DESCRIPTION |
---|---|
dataloader | DataLoader for validation data. TYPE: `DataLoader` |

RETURNS | DESCRIPTION |
---|---|
list[tuple[torch.Tensor, dict[str, Any]]] | Loss and metrics for each batch. Structure: Validation Batches -> (loss, metrics). |
Source code in qadence/ml_tools/trainer.py
stop_training()
Helper function to indicate if the training should be stopped.
We all_reduce the indicator across all processes to ensure all processes are stopped.
Notes
The self._stop_training indicator signals whether training should be stopped: 0 means continue, 1 means stop.
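The all_reduce semantics can be mimicked without torch.distributed (hedged sketch; `any_process_stopping` is a hypothetical helper standing in for a MAX-reduce over per-process flags):

```python
def any_process_stopping(local_flags: list[int]) -> bool:
    """MAX-reduce over per-process stop indicators (0 = continue, 1 = stop).

    If any process sets its flag to 1, every process sees a stop signal.
    """
    return max(local_flags) == 1
```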
Source code in qadence/ml_tools/trainer.py
test(test_dataloader=None)
Runs the testing loop if a test DataLoader is provided.
If the test_dataloader is not provided, the default test_dataloader defined in the Trainer class is used.

PARAMETER | DESCRIPTION |
---|---|
test_dataloader | DataLoader for test data. TYPE: `DataLoader or None` |

RETURNS | DESCRIPTION |
---|---|
list[tuple[torch.Tensor, dict[str, Any]]] | Loss and metrics for each batch. Structure: Test Batches -> (loss, metrics). |
Source code in qadence/ml_tools/trainer.py
AnsatzConfig(depth=1, ansatz_type=AnsatzType.HEA, ansatz_strategy=Strategy.DIGITAL, strategy_args=dict(), m_block_qubits=None, param_prefix='theta', tag=None)
dataclass
ansatz_strategy = Strategy.DIGITAL
class-attribute
instance-attribute
Ansatz strategy.
- Strategy.DIGITAL for fully digital ansatz. Required if ansatz_type is AnsatzType.ALA.
- Strategy.SDAQC for analog entangling block. Only available for AnsatzType.HEA or AnsatzType.ALA.
- Strategy.RYDBERG for fully Rydberg HEA ansatz. Only available for AnsatzType.HEA.
ansatz_type = AnsatzType.HEA
class-attribute
instance-attribute
What type of ansatz.
- AnsatzType.HEA for Hardware Efficient Ansatz.
- AnsatzType.IIA for Identity Initialized Ansatz.
- AnsatzType.ALA for Alternating Layer Ansatz.
depth = 1
class-attribute
instance-attribute
Number of layers of the ansatz.
m_block_qubits = None
class-attribute
instance-attribute
The number of qubits in the local entangling block of an Alternating Layer Ansatz (ALA). Only used when ansatz_type is AnsatzType.ALA.
param_prefix = 'theta'
class-attribute
instance-attribute
The base name of the variational parameter.
strategy_args = field(default_factory=dict)
class-attribute
instance-attribute
A dictionary containing keyword arguments to the function creating the ansatz. Details for each strategy below.

For the Strategy.DIGITAL strategy, accepts the following:
- periodic (bool): whether the qubits should be linked periodically. periodic=False is not supported in emu-c.
- operations (list): list of operations to cycle through in the digital single-qubit rotations of each layer. Defaults to [RX, RY, RX] for hea and [RX, RY] for iia.
- entangler (AbstractBlock): 2-qubit entangling operation. Supports CNOT, CZ, CRX, CRY, CRZ, CPHASE. Controlled rotations will have variational parameters on the rotation angles. Defaults to CNOT.

For the Strategy.SDAQC strategy, accepts the following:
- operations (list): list of operations to cycle through in the digital single-qubit rotations of each layer. Defaults to [RX, RY, RX] for hea and [RX, RY] for iia.
- entangler (AbstractBlock): Hamiltonian generator for the analog entangling layer. The time parameter is considered variational. Defaults to an NN interaction.

For the Strategy.RYDBERG strategy, accepts the following:
- addressable_detuning: whether to turn on the trainable semi-local addressing pattern on the detuning (n_i terms in the Hamiltonian). Defaults to True.
- addressable_drive: whether to turn on the trainable semi-local addressing pattern on the drive (sigma_i^x terms in the Hamiltonian). Defaults to False.
- tunable_phase: whether to have a tunable phase to get both sigma^x and sigma^y rotations in the drive term. If False, only a sigma^x term will be included in the drive part of the Hamiltonian generator. Defaults to False.
tag = None
class-attribute
instance-attribute
String to indicate the name tag of the ansatz.
Defaults to None, in which case no tag will be applied.
FeatureMapConfig(num_features=0, basis_set=BasisSet.FOURIER, reupload_scaling=ReuploadScaling.CONSTANT, feature_range=None, target_range=None, multivariate_strategy=MultivariateStrategy.PARALLEL, feature_map_strategy=Strategy.DIGITAL, param_prefix=None, num_repeats=0, operation=None, inputs=None, tag=None)
dataclass
basis_set = BasisSet.FOURIER
class-attribute
instance-attribute
Basis set for feature encoding.
Takes qadence.BasisSet. Give a single BasisSet to use the same for all features, or a dict of (str, BasisSet) where the key is the name of the variable and the value is the BasisSet used to encode that feature.
- BasisSet.FOURIER for Fourier encoding.
- BasisSet.CHEBYSHEV for Chebyshev encoding.
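Several FeatureMapConfig fields (basis_set, feature_range, reupload_scaling, num_repeats) follow the same "single value or per-feature dict" convention, which can be sketched as follows (`resolve_per_feature` is a hypothetical helper for illustration, not qadence API):

```python
def resolve_per_feature(setting, feature_names):
    """Expand a single setting, or a per-feature dict, to one value per feature."""
    if isinstance(setting, dict):
        # A dict maps feature name -> value; every feature must be covered.
        return {name: setting[name] for name in feature_names}
    # A single value applies to all features.
    return {name: setting for name in feature_names}
```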
feature_map_strategy = Strategy.DIGITAL
class-attribute
instance-attribute
Strategy for feature map.
Accepts DIGITAL, ANALOG, or RYDBERG. Defaults to DIGITAL. If the strategy is incompatible with the chosen operation, then operation takes preference and the given strategy is ignored.
feature_range = None
class-attribute
instance-attribute
Range of data that the input data is assumed to come from.
Give a single tuple to use the same range for all features. Give a dict of (str, tuple) where the key is the name of the variable and the value is the feature range to use for that feature.
inputs = None
class-attribute
instance-attribute
List that indicates the order of variables in the tensors that are passed. Optional if a single feature is being encoded, required otherwise. Given input tensors xs = torch.rand(batch_size, input_size:=2), a QNN with inputs=["t", "x"] will assign t, x = xs[:,0], xs[:,1].
multivariate_strategy = MultivariateStrategy.PARALLEL
class-attribute
instance-attribute
The encoding strategy in case of a multivariate function. Takes qadence.MultivariateStrategy. If PARALLEL, the features are encoded in one block of rotation gates, with the register split into sub-registers for each feature. If SERIES, the features are encoded sequentially using the full register for each feature, with an ansatz block between them. PARALLEL is allowed only for the DIGITAL feature_map_strategy.
num_features = 0
class-attribute
instance-attribute
Number of feature parameters to be encoded.
Defaults to 0. Thus, no feature parameters are encoded.
num_repeats = 0
class-attribute
instance-attribute
Number of feature map layers repeated in the data reuploading step. If all features are to be repeated the same number of times, a single int can be given. For a different number of repetitions per feature, provide a dict of (str, int) where the key is the name of the variable and the value is the number of repetitions for that feature. This amounts to the number of additional reuploads: if num_repeats is N, the data gets uploaded N+1 times. Defaults to no repetition.
operation = None
class-attribute
instance-attribute
Type of operation. Choose among the analog or digital rotations, or a custom callable function returning an AnalogBlock instance. If the type of operation is incompatible with the chosen strategy, then operation takes preference and the given strategy is ignored.
param_prefix = None
class-attribute
instance-attribute
String prefix to create trainable parameters in Feature Map.
A string prefix to create trainable parameters multiplying the feature parameter inside the feature-encoding function. Note that currently this does not take into account the domain of the feature-encoding function. Defaults to None, in which case the feature map is not trainable. Note that this is separate from the name of the parameter: the user can provide a single prefix for all features, and the appropriate feature name will be appended automatically.
reupload_scaling = ReuploadScaling.CONSTANT
class-attribute
instance-attribute
Scaling for encoding the same feature on different qubits.
Scaling used to encode the same feature on different qubits in the same layer of the feature maps. Takes qadence.ReuploadScaling. Give a single ReuploadScaling to use the same for all features, or a dict of (str, ReuploadScaling) where the key is the name of the variable and the value is the ReuploadScaling used for that feature.
- ReuploadScaling.CONSTANT for constant scaling.
- ReuploadScaling.TOWER for linearly increasing scaling.
- ReuploadScaling.EXP for exponentially increasing scaling.
tag = None
class-attribute
instance-attribute
String to indicate the name tag of the feature map.
Defaults to None, in which case no tag will be applied.
target_range = None
class-attribute
instance-attribute
Range of data the data encoder assumes as natural range.
Give a single tuple to use the same range for all features. Give a dict of (str, tuple) where the key is the name of the variable and the value is the target range to use for that feature.
TrainConfig(max_iter=10000, print_every=0, write_every=0, checkpoint_every=0, plot_every=0, callbacks=lambda: list()(), log_model=False, root_folder=Path('./qml_logs'), create_subfolder_per_run=False, log_folder=Path('./'), checkpoint_best_only=False, val_every=0, val_epsilon=1e-05, validation_criterion=None, trainstop_criterion=None, batch_size=1, verbose=True, tracking_tool=ExperimentTrackingTool.TENSORBOARD, hyperparams=dict(), plotting_functions=tuple(), _subfolders=list(), nprocs=1, compute_setup='cpu', backend='gloo', log_setup='cpu', dtype=None, all_reduce_metrics=False)
dataclass
Default configuration for the training process.
This class provides default settings for various aspects of the training loop,
such as logging, checkpointing, and validation. The default values for these
fields can be customized when an instance of TrainConfig
is created.
Example:
TrainConfig(max_iter=10000, print_every=0, write_every=0, checkpoint_every=0, plot_every=0, callbacks=[], log_model=False, root_folder='/tmp/train', create_subfolder_per_run=False, log_folder=PosixPath('.'), checkpoint_best_only=False, val_every=0, val_epsilon=1e-05, validation_criterion=None, trainstop_criterion=None, batch_size=1, verbose=True, tracking_tool=<ExperimentTrackingTool.TENSORBOARD: 'tensorboard'>, hyperparams={}, plotting_functions=(), _subfolders=[], nprocs=1, compute_setup='cpu', backend='gloo', log_setup='cpu', dtype=None, all_reduce_metrics=False)
_subfolders = field(default_factory=list)
class-attribute
instance-attribute
List of subfolders used for logging different runs using the same config inside the root folder. Each subfolder has the structure <id>_<timestamp>_<PID>.
all_reduce_metrics = False
class-attribute
instance-attribute
Whether to aggregate metrics (e.g., loss, accuracy) across processes.
When True, metrics from different training processes are averaged to provide consolidated metrics. Note: since aggregation requires a synchronization/all_reduce operation, this can increase computation time significantly.
backend = 'gloo'
class-attribute
instance-attribute
Backend used for distributed training communication.
The default is "gloo". Other options may include "nccl" - which is optimized for GPU-based training or "mpi",
depending on your system and requirements.
It should be one of the backends supported by torch.distributed
. For further details, please look at
torch backends
batch_size = 1
class-attribute
instance-attribute
The batch size to use when processing a list or tuple of torch.Tensors.
This specifies how many samples are processed in each training iteration.
callbacks = field(default_factory=lambda: list())
class-attribute
instance-attribute
List of callbacks to execute during training.
Callbacks can be used for custom behaviors, such as early stopping, custom logging, or other actions triggered at specific events.
checkpoint_best_only = False
class-attribute
instance-attribute
If True, checkpoints are only saved if there is an improvement in the validation metric. This conserves storage by only keeping the best models. validation_criterion is required when this is set to True.
checkpoint_every = 0
class-attribute
instance-attribute
Frequency (in epochs) for saving model and optimizer checkpoints during training.
Set to 0 to disable checkpointing. This helps in resuming training or recovering models. Note that setting checkpoint_best_only = True will disable this and only best checkpoints will be saved.
compute_setup = 'cpu'
class-attribute
instance-attribute
Compute device setup; options are "auto", "gpu", or "cpu".
- "auto": Automatically uses GPU if available; otherwise, falls back to CPU.
- "gpu": Forces GPU usage, raising an error if no CUDA device is available.
- "cpu": Forces the use of CPU regardless of GPU availability.
create_subfolder_per_run = False
class-attribute
instance-attribute
Whether to create a subfolder for each run, named <id>_<timestamp>_<PID>. This ensures logs and checkpoints from different runs do not overwrite each other, which is helpful for rapid prototyping. If False, training will resume from the latest checkpoint if one exists in the specified log folder.
dtype = None
class-attribute
instance-attribute
Data type (precision) for computations.
Both model parameters and the dataset will be of the provided precision. If not specified or None, the default torch precision (usually torch.float32) is used. If the provided dtype is torch.complex128, model parameters will be torch.complex128 and data parameters will be torch.float64.
hyperparams = field(default_factory=dict)
class-attribute
instance-attribute
A dictionary of hyperparameters to be tracked.
This can include learning rates, regularization parameters, or any other training-related configurations.
log_folder = Path('./')
class-attribute
instance-attribute
The log folder for saving checkpoints and tensorboard logs.
This stores the path where all logs and checkpoints are saved for this training session. log_folder takes precedence over root_folder, but it is ignored if create_subfolders_per_run=True (in which case subfolders will be spawned in the root folder).
log_model = False
class-attribute
instance-attribute
Whether to log a serialized version of the model.
When set to True, the model's state will be logged, which is useful for model versioning and reproducibility.
log_setup = 'cpu'
class-attribute
instance-attribute
Logging device setup; options are "auto" or "cpu".
- "auto": Uses the same device for logging as for computation.
- "cpu": Forces logging to occur on the CPU. This can be useful to avoid potential conflicts with GPU processes.
max_iter = 10000
class-attribute
instance-attribute
Number of training iterations (epochs) to perform.
This defines the total number of times the model will be updated.
In case of InfiniteTensorDataset, each epoch will have 1 batch. In case of TensorDataset, each epoch will have len(dataloader) batches.
nprocs = 1
class-attribute
instance-attribute
The number of processes to use for training when spawning subprocesses.
For effective parallel processing, set this to a value greater than 1. If nprocs > 1, multiple processes will be spawned for training, and the training framework will launch additional processes (e.g., for distributed or parallel training).
- In case of Multi-GPU or Multi-Node Multi-GPU setups, nprocs should equal the total number of GPUs across all nodes (the world size), or the total number of GPUs to be used.
- For a CPU setup, this launches truly parallel processes.
- For a GPU setup, this launches a distributed training routine using PyTorch's DistributedDataParallel framework.
plot_every = 0
class-attribute
instance-attribute
Frequency (in epochs) for generating and saving figures during training.
Set to 0 to disable plotting.
plotting_functions = field(default_factory=tuple)
class-attribute
instance-attribute
Functions used for in-training plotting.
These are called to generate plots that are logged or saved at specified intervals.
print_every = 0
class-attribute
instance-attribute
Frequency (in epochs) for printing loss and metrics to the console during training.
Set to 0 to disable this output, meaning that metrics and loss will not be printed during training.
root_folder = Path('./qml_logs')
class-attribute
instance-attribute
The root folder for saving checkpoints and tensorboard logs.
The default path is "./qml_logs".
This can be set to a specific directory where training artifacts are to be stored.
Checkpoints will be saved inside a subfolder in this directory. Subfolders will be
created based on create_subfolder_per_run
argument.
tracking_tool = ExperimentTrackingTool.TENSORBOARD
class-attribute
instance-attribute
The tool used for tracking training progress and logging metrics.
Options include tools like TensorBoard, which help visualize and monitor model training.
trainstop_criterion = None
class-attribute
instance-attribute
A function to determine if the training process should stop based on a specific stopping metric. If None, training continues until max_iter is reached.
val_epsilon = 1e-05
class-attribute
instance-attribute
A small safety margin used to compare the current validation loss with the best previous validation loss. This is used to determine improvements in metrics.
val_every = 0
class-attribute
instance-attribute
Frequency (in epochs) for performing validation.
If set to 0, validation is not performed.
Note that metrics from validation are always written, regardless of the write_every setting.
Note that an initial validation happens at the start of training (when val_every > 0). For this initial validation:
- initial metrics are written.
- a checkpoint is saved (when checkpoint_best_only = False).
validation_criterion = None
class-attribute
instance-attribute
A function to evaluate whether a given validation metric meets a desired condition.
The validation_criterion has the following signature: def validation_criterion(val_loss: float, best_val_loss: float, val_epsilon: float) -> bool
If None, no custom validation criterion is applied.
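Following the signature above, a minimal criterion that counts a run as improved only when the loss drops by more than val_epsilon could look like this (an illustrative sketch, not the library default):

```python
def validation_criterion(val_loss: float, best_val_loss: float, val_epsilon: float) -> bool:
    # Count as an improvement only if the new loss beats the best one
    # by more than the safety margin val_epsilon.
    return val_loss < best_val_loss - val_epsilon
```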
verbose = True
class-attribute
instance-attribute
Whether to print metrics and status messages during training.
If True
, detailed metrics and status updates will be displayed in the console.
write_every = 0
class-attribute
instance-attribute
Frequency (in epochs) for writing loss and metrics using the tracking tool during training.
Set to 0 to disable this logging, which prevents metrics from being logged to the tracking tool. Note that the metrics will always be written at the end of training regardless of this setting.
get_parameters(model)
Retrieve all trainable model parameters in a single vector.
PARAMETER | DESCRIPTION |
---|---|
model
|
the input PyTorch model
TYPE:
|
RETURNS | DESCRIPTION |
---|---|
Tensor
|
a 1-dimensional tensor with the parameters
TYPE:
|
Source code in qadence/ml_tools/parameters.py
num_parameters(model)
Retrieve the total number of trainable parameters of the model.
set_parameters(model, theta)
Set all trainable parameters of a model from a single vector.
Note that this function assumes the vector contains the right number of parameters for the model.
PARAMETER | DESCRIPTION |
---|---|
model
|
the input PyTorch model
TYPE:
|
theta
|
the parameters to assign
TYPE:
|
Source code in qadence/ml_tools/parameters.py
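These helpers behave analogously to torch's built-in flattening utilities; a plain-torch sketch of the get/set round-trip (using a standard torch model rather than a quantum one, as an illustration of the concept only):

```python
import torch
from torch.nn.utils import parameters_to_vector, vector_to_parameters

model = torch.nn.Linear(2, 3)  # 2*3 weights + 3 biases = 9 parameters

# Flatten all trainable parameters into a single 1-dimensional vector.
theta = parameters_to_vector(model.parameters())

# Write a modified vector back; its length must match the model,
# mirroring the assumption stated for set_parameters above.
vector_to_parameters(theta * 0.0, model.parameters())
```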
optimize_step(model, optimizer, loss_fn, xs, device=None, dtype=None)
Default Torch optimize step with closure. This is the default optimization step used by the Trainer.
PARAMETER | DESCRIPTION |
---|---|
model
|
The input model to be optimized.
TYPE:
|
optimizer
|
The chosen Torch optimizer.
TYPE:
|
loss_fn
|
A custom loss function that returns the loss value and a dictionary of metrics.
TYPE:
|
xs
|
The input data. If None, it means the given model does not require any input data.
TYPE:
|
device
|
A target device to run computations on.
TYPE:
|
dtype
|
Data type for
TYPE:
|
RETURNS | DESCRIPTION |
---|---|
tuple[Tensor | float, dict | None]
|
tuple[Tensor | float, dict | None]: A tuple containing the computed loss value and a dictionary with collected metrics. |
Source code in qadence/ml_tools/optimize_step.py
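The closure pattern used here can be sketched in plain torch. This is a simplified illustration, not the library's actual implementation; the loss_fn below is a stand-in that returns a (loss, metrics) tuple as described above:

```python
import torch

def optimize_step_sketch(model, optimizer, loss_fn, xs):
    # The closure recomputes the loss so that optimizers which need to
    # re-evaluate it (e.g. LBFGS) can call it multiple times per step.
    def closure():
        optimizer.zero_grad()
        loss, metrics = loss_fn(model, xs)
        loss.backward()
        return loss
    return optimizer.step(closure)

# Tiny deterministic setup: fit a bias towards the target value 1.0.
model = torch.nn.Linear(1, 1)
torch.nn.init.zeros_(model.weight)
torch.nn.init.zeros_(model.bias)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

def mse_loss_fn(model, xs):
    x, y = xs
    return torch.mean((model(x) - y) ** 2), {}

xs = (torch.zeros(4, 1), torch.ones(4, 1))
losses = [optimize_step_sketch(model, optimizer, mse_loss_fn, xs).item() for _ in range(5)]
```

Each call returns the loss evaluated inside the closure, so `losses` decreases monotonically on this convex toy problem.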
update_ng_parameters(model, optimizer, loss_fn, data, ng_params)
Update the model parameters using Nevergrad.
This function integrates Nevergrad for derivative-free optimization.
PARAMETER | DESCRIPTION |
---|---|
model
|
The PyTorch model to be optimized.
TYPE:
|
optimizer
|
A Nevergrad optimizer instance.
TYPE:
|
loss_fn
|
A custom loss function that returns the loss value and a dictionary of metrics.
TYPE:
|
data
|
Input data for the model. If None, it means the model does not require input data.
TYPE:
|
ng_params
|
The current set of parameters managed by Nevergrad.
TYPE:
|
RETURNS | DESCRIPTION |
---|---|
tuple[float, dict, Array]
|
tuple[float, dict, ng.p.Array]: A tuple containing the computed loss value, a dictionary of metrics, and the updated Nevergrad parameters. |
Source code in qadence/ml_tools/optimize_step.py
DictDataLoader(dataloaders)
dataclass
This class only holds a dictionary of DataLoaders and samples from them.
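Conceptually, sampling from a dictionary of DataLoaders yields one batch per key at a time; a plain-torch sketch of the idea (illustrative only, not the DictDataLoader implementation):

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

loaders = {
    "task_a": DataLoader(TensorDataset(torch.rand(8, 2)), batch_size=4),
    "task_b": DataLoader(TensorDataset(torch.rand(8, 1)), batch_size=4),
}

# One sampling step: draw a batch from each dataloader, keyed by name.
batch = {key: next(iter(dl)) for key, dl in loaders.items()}
```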
InfiniteTensorDataset(*tensors)
Bases: IterableDataset
Randomly sample points from the first dimension of the given tensors.
Behaves like a normal torch Dataset, except that it can be sampled from as many times as desired.
Examples:
import torch
from qadence.ml_tools.data import InfiniteTensorDataset
x_data, y_data = torch.rand(5,2), torch.ones(5,1)
# The dataset accepts any number of tensors with the same batch dimension
ds = InfiniteTensorDataset(x_data, y_data)
# call `next` to get one sample from each tensor:
xs = next(iter(ds))
Source code in qadence/ml_tools/data.py
OptimizeResult(iteration, model, optimizer, loss=None, metrics=lambda: dict()(), extra=lambda: dict()(), rank=0, device='cpu')
dataclass
OptimizeResult stores intermediate values of an optimization.
At the current iteration, we store the model, optimizer, loss values, and metrics. An extra dict can be used for saving other information to be used in callbacks.
device = 'cpu'
class-attribute
instance-attribute
Device on which this result was calculated.
extra = field(default_factory=lambda: dict())
class-attribute
instance-attribute
Extra dict for saving anything else to be used in callbacks.
iteration
instance-attribute
Current iteration number.
loss = None
class-attribute
instance-attribute
Loss value.
metrics = field(default_factory=lambda: dict())
class-attribute
instance-attribute
Metrics that can be saved during training.
model
instance-attribute
Model at iteration.
optimizer
instance-attribute
Optimizer at iteration.
rank = 0
class-attribute
instance-attribute
Rank of the process for which this result was generated.
data_to_device(xs, *args, **kwargs)
Utility method to move arbitrary data to 'device'.
to_dataloader(*tensors, batch_size=1, infinite=False)
Convert torch tensors to an (infinite) DataLoader.
PARAMETER | DESCRIPTION |
---|---|
*tensors
|
Torch tensors to use in the dataloader.
TYPE:
|
batch_size
|
batch size of sampled tensors
TYPE:
|
infinite
|
if True, the dataloader samples from the tensors indefinitely
TYPE:
|
Examples:
import torch
from qadence.ml_tools import to_dataloader
(x, y, z) = [torch.rand(10) for _ in range(3)]
loader = iter(to_dataloader(x, y, z, batch_size=5, infinite=True))
print(next(loader))
print(next(loader))
print(next(loader))
[tensor([0.4483, 0.8226, 0.3765, 0.1782, 0.0435]), tensor([0.8529, 0.4461, 0.4437, 0.3194, 0.8547]), tensor([0.8719, 0.6678, 0.4718, 0.9445, 0.1859])]
[tensor([0.3387, 0.6516, 0.8960, 0.8684, 0.5089]), tensor([0.2403, 0.3733, 0.5831, 0.8063, 0.0280]), tensor([0.2211, 0.4503, 0.2892, 0.3290, 0.9851])]
[tensor([0.4483, 0.8226, 0.3765, 0.1782, 0.0435]), tensor([0.8529, 0.4461, 0.4437, 0.3194, 0.8547]), tensor([0.8719, 0.6678, 0.4718, 0.9445, 0.1859])]
Source code in qadence/ml_tools/data.py
QNN(circuit, observable, backend=BackendName.PYQTORCH, diff_mode=DiffMode.AD, measurement=None, noise=None, configuration=None, inputs=None, input_diff_mode=InputDiffMode.AD)
Bases: QuantumModel
Quantum neural network model for n-dimensional inputs.
Examples:
import torch
from qadence import QuantumCircuit, QNN, Z
from qadence import hea, feature_map, hamiltonian_factory, kron
# create the circuit
n_qubits, depth = 2, 4
fm = kron(
feature_map(1, support=(0,), param="x"),
feature_map(1, support=(1,), param="y")
)
ansatz = hea(n_qubits=n_qubits, depth=depth)
circuit = QuantumCircuit(n_qubits, fm, ansatz)
obs_base = hamiltonian_factory(n_qubits, detuning=Z)
# the QNN will yield two outputs
obs = [2.0 * obs_base, 4.0 * obs_base]
# initialize and use the model
qnn = QNN(circuit, obs, inputs=["x", "y"])
y = qnn(torch.rand(3, 2))
Initialize the QNN.
The number of inputs is determined by the feature parameters in the input quantum circuit, while the number of outputs is determined by how many observables are provided as input.
PARAMETER | DESCRIPTION |
---|---|
circuit
|
The quantum circuit to use for the QNN.
TYPE:
|
observable
|
The observable.
TYPE:
|
backend
|
The chosen quantum backend.
TYPE:
|
diff_mode
|
The differentiation engine to use. Choices 'gpsr' or 'ad'.
TYPE:
|
measurement
|
optional measurement protocol. If None, use exact expectation value with a statevector simulator
TYPE:
|
noise
|
A noise model to use.
TYPE:
|
configuration
|
optional configuration for the backend
TYPE:
|
inputs
|
List that indicates the order of variables of the tensors that are passed
to the model. Given input tensors
TYPE:
|
input_diff_mode
|
The differentiation mode for the input tensor.
TYPE:
|
Source code in qadence/ml_tools/models.py
__str__()
Return a string representation of a QNN.
When creating a QNN from a set of configurations, we print the configurations used. Otherwise, we use the default printing.
RETURNS | DESCRIPTION |
---|---|
str | Any
|
str | Any: A string representation of a QNN. |
Example:
from qadence import QNN
from qadence.constructors.hamiltonians import Interaction
from qadence.ml_tools.config import AnsatzConfig, FeatureMapConfig
from qadence.ml_tools.constructors import (
ObservableConfig,
)
from qadence.operations import Z
from qadence.types import BackendName
backend = BackendName.PYQTORCH
fm_config = FeatureMapConfig(num_features=1)
ansatz_config = AnsatzConfig()
observable_config = ObservableConfig(detuning=Z, interaction=Interaction.ZZ, scale=2)
qnn = QNN.from_configs(
register=2,
obs_config=observable_config,
fm_config=fm_config,
ansatz_config=ansatz_config,
backend=backend,
)
QNN(
ansatz_config = AnsatzConfig(depth=1, ansatz_type=<AnsatzType.HEA: 'hea'>, ansatz_strategy=<Strategy.DIGITAL: 'Digital'>, strategy_args={}, m_block_qubits=None, param_prefix='theta', tag=None)
fm_config = FeatureMapConfig(num_features=1, basis_set={'x': <BasisSet.FOURIER: 'Fourier'>}, reupload_scaling={'x': <ReuploadScaling.CONSTANT: 'Constant'>}, feature_range={'x': None}, target_range={'x': None}, multivariate_strategy=<MultivariateStrategy.PARALLEL: 'Parallel'>, feature_map_strategy=<Strategy.DIGITAL: 'Digital'>, param_prefix=None, num_repeats={'x': 0}, operation=<class 'qadence.operations.parametric.RX'>, inputs=['x'], tag=None)
register = 2
observable_config = {'Obs.': '(2.000 * (Z(0) + Z(1) + (Z(0) ⊗ Z(1))))'}
)
Source code in qadence/ml_tools/models.py
forward(values=None, state=None, measurement=None, noise=None, endianness=Endianness.BIG)
Forward pass of the model.
This returns the (differentiable) expectation value of the observable operator defined in the constructor. Unlike the base QuantumModel class, the QNN also accepts a tensor as input for the forward pass. The tensor is expected to have shape n_batches x in_features, where n_batches is the number of data points and in_features is the dimensionality of the problem.
The output of the forward pass is the expectation value of the input observable(s). If a single observable is given, the output shape is n_batches, while if multiple observables are given the output shape is n_batches x n_observables.
PARAMETER | DESCRIPTION |
---|---|
values
|
the values of the feature parameters
TYPE:
|
state
|
Initial state.
TYPE:
|
measurement
|
optional measurement protocol. If None, use exact expectation value with a statevector simulator
TYPE:
|
noise
|
A noise model to use.
TYPE:
|
endianness
|
Endianness of the resulting bit strings.
TYPE:
|
RETURNS | DESCRIPTION |
---|---|
Tensor
|
a tensor with the expectation value of the observables passed in the constructor of the model
TYPE:
|
Source code in qadence/ml_tools/models.py
from_configs(register, obs_config, fm_config=FeatureMapConfig(), ansatz_config=AnsatzConfig(), backend=BackendName.PYQTORCH, diff_mode=DiffMode.AD, measurement=None, noise=None, configuration=None, input_diff_mode=InputDiffMode.AD)
classmethod
Create a QNN from a set of configurations.
PARAMETER | DESCRIPTION |
---|---|
register
|
The number of qubits or a register object.
TYPE:
|
obs_config
|
The configuration(s) for the observable(s).
TYPE:
|
fm_config
|
The configuration for the feature map. Defaults to no feature encoding block.
TYPE:
|
ansatz_config
|
The configuration for the ansatz. Defaults to a single layer of hardware efficient ansatz.
TYPE:
|
backend
|
The chosen quantum backend.
TYPE:
|
diff_mode
|
The differentiation engine to use. Choices are 'gpsr' or 'ad'.
TYPE:
|
measurement
|
Optional measurement protocol. If None, use exact expectation value with a statevector simulator.
TYPE:
|
noise
|
A noise model to use.
TYPE:
|
configuration
|
Optional backend configuration.
TYPE:
|
input_diff_mode
|
The differentiation mode for the input tensor.
TYPE:
|
RETURNS | DESCRIPTION |
---|---|
QNN
|
A QNN object. |
RAISES | DESCRIPTION |
---|---|
ValueError
|
If the observable configuration is not provided. |
Example:
import torch
from qadence.ml_tools.config import AnsatzConfig, FeatureMapConfig
from qadence.ml_tools import QNN
from qadence.constructors import ObservableConfig
from qadence.operations import Z
from qadence.types import (
AnsatzType, BackendName, BasisSet, ReuploadScaling, Strategy
)
register = 4
obs_config = ObservableConfig(
detuning=Z,
scale=5.0,
shift=0.0,
trainable_transform=None,
)
fm_config = FeatureMapConfig(
num_features=2,
inputs=["x", "y"],
basis_set=BasisSet.FOURIER,
reupload_scaling=ReuploadScaling.CONSTANT,
feature_range={
"x": (-1.0, 1.0),
"y": (0.0, 1.0),
},
)
ansatz_config = AnsatzConfig(
depth=2,
ansatz_type=AnsatzType.HEA,
ansatz_strategy=Strategy.DIGITAL,
)
qnn = QNN.from_configs(
register, obs_config, fm_config, ansatz_config, backend=BackendName.PYQTORCH
)
x = torch.rand(2, 2)
y = qnn(x)
Source code in qadence/ml_tools/models.py
derivative(ufa, x, derivative_indices)
Compute derivatives of a UFA with a single output w.r.t. its inputs.
The derivative_indices specify which derivative(s) are computed. E.g. derivative_indices=(1,2) would compute a second-order derivative w.r.t. the indices 1 and 2 of the input tensor.
PARAMETER | DESCRIPTION |
---|---|
ufa
|
The model for which we want to compute the derivative.
TYPE:
|
x
|
(batch_size, input_size) input tensor.
TYPE:
|
derivative_indices
|
Define which derivatives to compute.
TYPE:
|
Examples:
If we create a UFA with three inputs and denote the first, second, and third
input with x
, y
, and z
we can compute the following derivatives w.r.t
to those inputs:
import torch
from qadence.ml_tools.models import derivative, QNN
from qadence.ml_tools.config import FeatureMapConfig, AnsatzConfig
from qadence.constructors.hamiltonians import ObservableConfig
from qadence.operations import Z
fm_config = FeatureMapConfig(num_features=3, inputs=["x", "y", "z"])
ansatz_config = AnsatzConfig()
obs_config = ObservableConfig(detuning=Z)
f = QNN.from_configs(
register=3, obs_config=obs_config, fm_config=fm_config, ansatz_config=ansatz_config,
)
inputs = torch.rand(5,3,requires_grad=True)
# df_dx
derivative(f, inputs, (0,))
# d2f_dydz
derivative(f, inputs, (1,2))
# d3fdy2dx
derivative(f, inputs, (1,1,0))
Source code in qadence/ml_tools/models.py
format_to_dict_fn(inputs=[])
Format an input tensor into the format required by the forward pass.
The tensor is assumed to have dimensions n_batches x in_features, where in_features corresponds to the number of input features of the QNN.
Source code in qadence/ml_tools/models.py
Callback(on='idle', called_every=1, callback=None, callback_condition=None, modify_optimize_result=None)
Base class for defining various training callbacks.
ATTRIBUTE | DESCRIPTION |
---|---|
on |
The event on which to trigger the callback. Must be a valid on value from: ["train_start", "train_end", "train_epoch_start", "train_epoch_end", "train_batch_start", "train_batch_end","val_epoch_start", "val_epoch_end", "val_batch_start", "val_batch_end", "test_batch_start", "test_batch_end"]
TYPE:
|
called_every |
Frequency of callback calls in terms of iterations.
TYPE:
|
callback |
The function to call if the condition is met.
TYPE:
|
callback_condition |
Condition to check before calling.
TYPE:
|
modify_optimize_result |
Function to modify
TYPE:
|
A callback can be defined in two ways:
- By providing a callback function directly in the base class: This is useful for simple callbacks that don't require subclassing.
Example:
from qadence.ml_tools.callbacks import Callback
def custom_callback_function(trainer, config, writer):
print("Custom callback executed.")
custom_callback = Callback(
on="train_end",
called_every=5,
callback=custom_callback_function
)
- By inheriting and implementing the
run_callback
method: This is suitable for more complex callbacks that require customization.
Example:
from qadence.ml_tools.callbacks import Callback
class CustomCallback(Callback):
def run_callback(self, trainer, config, writer):
print("Custom behavior in the inherited run_callback method.")
custom_callback = CustomCallback(on="train_end", called_every=10)
Source code in qadence/ml_tools/callbacks/callback.py
on
property
writable
Returns the TrainingStage.
RETURNS | DESCRIPTION |
---|---|
TrainingStage
|
TrainingStage for the callback
TYPE:
|
__call__(when, trainer, config, writer)
Executes the callback if conditions are met.
PARAMETER | DESCRIPTION |
---|---|
when
|
The event when the callback is triggered.
TYPE:
|
trainer
|
The training object.
TYPE:
|
config
|
The configuration object.
TYPE:
|
writer
|
The writer object for logging.
TYPE:
|
RETURNS | DESCRIPTION |
---|---|
Any
|
Result of the callback function if executed.
TYPE:
|
Source code in qadence/ml_tools/callbacks/callback.py
_should_call(when, opt_result)
Checks if the callback should be called.
PARAMETER | DESCRIPTION |
---|---|
when
|
The event when the callback is considered for execution.
TYPE:
|
opt_result
|
The current optimization results.
TYPE:
|
RETURNS | DESCRIPTION |
---|---|
bool
|
Whether the callback should be called.
TYPE:
|
Source code in qadence/ml_tools/callbacks/callback.py
run_callback(trainer, config, writer)
Executes the defined callback.
PARAMETER | DESCRIPTION |
---|---|
trainer
|
The training object.
TYPE:
|
config
|
The configuration object.
TYPE:
|
writer
|
The writer object for logging.
TYPE:
|
RETURNS | DESCRIPTION |
---|---|
Any
|
Result of the callback execution.
TYPE:
|
RAISES | DESCRIPTION |
---|---|
NotImplementedError
|
If not implemented in subclasses. |
Source code in qadence/ml_tools/callbacks/callback.py
EarlyStopping(on, called_every, monitor, patience=5, mode='min')
Bases: Callback
Stops training when a monitored metric has not improved for a specified number of epochs.
This callback monitors a specified metric (e.g., validation loss or accuracy). If the metric does not improve for a given patience period, training is stopped.
Example Usage in TrainConfig
:
To use EarlyStopping
, include it in the callbacks
list when setting up your TrainConfig
:
from qadence.ml_tools import TrainConfig
from qadence.ml_tools.callbacks import EarlyStopping
# Create an instance of the EarlyStopping callback
early_stopping = EarlyStopping(on="val_epoch_end",
called_every=1,
monitor="val_loss",
patience=5,
mode="min")
config = TrainConfig(
max_iter=10000,
print_every=1000,
callbacks=[early_stopping]
)
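The patience logic can be sketched in a few lines of plain Python (an illustrative mock of the behavior described above, not the callback's actual implementation):

```python
def should_stop(values, patience=5, mode="min"):
    # Stop once the monitored metric has gone `patience` steps
    # without improving on the best value seen so far.
    best = float("inf") if mode == "min" else float("-inf")
    counter = 0
    for value in values:
        improved = value < best if mode == "min" else value > best
        if improved:
            best, counter = value, 0
        else:
            counter += 1
            if counter >= patience:
                return True
    return False
```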
Initializes the EarlyStopping callback.
PARAMETER | DESCRIPTION |
---|---|
on
|
The event to trigger the callback (e.g., "val_epoch_end").
TYPE:
|
called_every
|
Frequency of callback calls in terms of iterations.
TYPE:
|
monitor
|
The metric to monitor (e.g., "val_loss" or "train_loss"). All metrics returned by the optimize step are available to monitor; prefix the metric name with "val_" or "train_" accordingly.
TYPE:
|
patience
|
Number of iterations to wait for improvement. Default is 5.
TYPE:
|
mode
|
Whether to minimize ("min") or maximize ("max") the metric. Default is "min".
TYPE:
|
Source code in qadence/ml_tools/callbacks/callback.py
run_callback(trainer, config, writer)
Monitors the metric and stops training if no improvement is observed.
PARAMETER | DESCRIPTION |
---|---|
trainer
|
The training object.
TYPE:
|
config
|
The configuration object.
TYPE:
|
writer
|
The writer object for logging.
TYPE:
|
Source code in qadence/ml_tools/callbacks/callback.py
GradientMonitoring(on, called_every=1)
Bases: Callback
Logs gradient statistics (e.g., mean, standard deviation, max) during training.
This callback monitors and logs statistics about the gradients of the model parameters to help debug or optimize the training process.
Example Usage in TrainConfig
:
To use GradientMonitoring
, include it in the callbacks
list when
setting up your TrainConfig
:
from qadence.ml_tools import TrainConfig
from qadence.ml_tools.callbacks import GradientMonitoring
# Create an instance of the GradientMonitoring callback
gradient_monitoring = GradientMonitoring(on="train_batch_end", called_every=10)
config = TrainConfig(
max_iter=10000,
print_every=1000,
callbacks=[gradient_monitoring]
)
Initializes the GradientMonitoring callback.
PARAMETER | DESCRIPTION |
---|---|
on
|
The event to trigger the callback (e.g., "train_batch_end").
TYPE:
|
called_every
|
Frequency of callback calls in terms of iterations.
TYPE:
|
Source code in qadence/ml_tools/callbacks/callback.py
run_callback(trainer, config, writer)
Logs gradient statistics.
PARAMETER | DESCRIPTION |
---|---|
trainer
|
The training object.
TYPE:
|
config
|
The configuration object.
TYPE:
|
writer
|
The writer object for logging.
TYPE:
|
Source code in qadence/ml_tools/callbacks/callback.py
LRSchedulerCosineAnnealing(on, called_every, t_max, min_lr=0.0)
Bases: Callback
Applies cosine annealing to the learning rate during training.
This callback decreases the learning rate following a cosine curve, starting from the initial learning rate and annealing to a minimum (min_lr).
Example Usage in TrainConfig
:
To use LRSchedulerCosineAnnealing
, include it in the callbacks
list
when setting up your TrainConfig
:
from qadence.ml_tools import TrainConfig
from qadence.ml_tools.callbacks import LRSchedulerCosineAnnealing
# Create an instance of the LRSchedulerCosineAnnealing callback
lr_cosine = LRSchedulerCosineAnnealing(on="train_batch_end",
called_every=1,
t_max=5000,
min_lr=1e-6)
config = TrainConfig(
max_iter=10000,
# Print metrics every 1000 training epochs
print_every=1000,
# Add the custom callback
callbacks=[lr_cosine]
)
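The annealing curve itself follows the standard cosine schedule; a small sketch of the formula (illustrative, with lr_0 standing in for the initial learning rate):

```python
import math

def cosine_annealed_lr(lr_0, min_lr, t, t_max):
    # Cosine decay from lr_0 at t=0 down to min_lr at t=t_max.
    return min_lr + 0.5 * (lr_0 - min_lr) * (1 + math.cos(math.pi * t / t_max))
```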
Initializes the LRSchedulerCosineAnnealing callback.
PARAMETER | DESCRIPTION |
---|---|
on
|
The event to trigger the callback.
TYPE:
|
called_every
|
Frequency of callback calls in terms of iterations.
TYPE:
|
t_max
|
The total number of iterations for one annealing cycle.
TYPE:
|
min_lr
|
The minimum learning rate. Default is 0.0.
TYPE:
|
Source code in qadence/ml_tools/callbacks/callback.py
run_callback(trainer, config, writer)
Adjusts the learning rate using cosine annealing.
PARAMETER | DESCRIPTION |
---|---|
trainer
|
The training object.
TYPE:
|
config
|
The configuration object.
TYPE:
|
writer
|
The writer object for logging.
TYPE:
|
Source code in qadence/ml_tools/callbacks/callback.py
LRSchedulerCyclic(on, called_every, base_lr, max_lr, step_size)
Bases: Callback
Applies a cyclic learning rate schedule during training.
This callback oscillates the learning rate between a minimum (base_lr) and a maximum (max_lr) over a defined cycle length (step_size). The learning rate follows a triangular wave pattern.
Example Usage in TrainConfig
:
To use LRSchedulerCyclic
, include it in the callbacks
list when setting
up your TrainConfig
:
from qadence.ml_tools import TrainConfig
from qadence.ml_tools.callbacks import LRSchedulerCyclic
# Create an instance of the LRSchedulerCyclic callback
lr_cyclic = LRSchedulerCyclic(on="train_batch_end",
called_every=1,
base_lr=0.001,
max_lr=0.01,
step_size=2000)
config = TrainConfig(
max_iter=10000,
# Print metrics every 1000 training epochs
print_every=1000,
# Add the custom callback
callbacks=[lr_cyclic]
)
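The triangular wave can be sketched as follows (an illustrative formula matching the standard cyclical-LR schedule, not the exact library code):

```python
import math

def cyclic_lr(iteration, base_lr, max_lr, step_size):
    # Triangular wave: rises from base_lr to max_lr over step_size
    # iterations, then falls back, repeating every 2 * step_size.
    cycle = math.floor(1 + iteration / (2 * step_size))
    x = abs(iteration / step_size - 2 * cycle + 1)
    return base_lr + (max_lr - base_lr) * max(0.0, 1 - x)
```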
Initializes the LRSchedulerCyclic callback.
PARAMETER | DESCRIPTION |
---|---|
on
|
The event to trigger the callback.
TYPE:
|
called_every
|
Frequency of callback calls in terms of iterations.
TYPE:
|
base_lr
|
The minimum learning rate.
TYPE:
|
max_lr
|
The maximum learning rate.
TYPE:
|
step_size
|
Number of iterations for half a cycle.
TYPE:
|
Source code in qadence/ml_tools/callbacks/callback.py
run_callback(trainer, config, writer)
Adjusts the learning rate cyclically.
PARAMETER | DESCRIPTION |
---|---|
trainer
|
The training object.
TYPE:
|
config
|
The configuration object.
TYPE:
|
writer
|
The writer object for logging.
TYPE:
|
Source code in qadence/ml_tools/callbacks/callback.py
LRSchedulerStepDecay(on, called_every, gamma=0.5)
Bases: Callback
Reduces the learning rate by a factor at regular intervals.
This callback adjusts the learning rate by multiplying it with a decay factor after a specified number of iterations. The learning rate is updated as: lr = lr * gamma
Example Usage in TrainConfig
:
To use LRSchedulerStepDecay
, include it in the callbacks
list when setting
up your TrainConfig
:
from qadence.ml_tools import TrainConfig
from qadence.ml_tools.callbacks import LRSchedulerStepDecay
# Create an instance of the LRSchedulerStepDecay callback
lr_step_decay = LRSchedulerStepDecay(on="train_epoch_end",
called_every=100,
gamma=0.5)
config = TrainConfig(
max_iter=10000,
# Print metrics every 1000 training epochs
print_every=1000,
# Add the custom callback
callbacks=[lr_step_decay]
)
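With called_every=100 and gamma=0.5 as above, the effective schedule is a stepwise exponential decay; a sketch of the resulting learning rate (hypothetical helper for illustration, not part of the library):

```python
def step_decayed_lr(lr_0, gamma, iteration, called_every):
    # lr is multiplied by gamma once every `called_every` iterations,
    # i.e. lr = lr_0 * gamma ** (number of decays so far).
    return lr_0 * gamma ** (iteration // called_every)
```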
Initializes the LRSchedulerStepDecay callback.
PARAMETER | DESCRIPTION |
---|---|
on
|
The event to trigger the callback.
TYPE:
|
called_every
|
Frequency of callback calls in terms of iterations.
TYPE:
|
gamma
|
The decay factor applied to the learning rate. A value < 1 reduces the learning rate over time. Default is 0.5.
TYPE:
|
Source code in qadence/ml_tools/callbacks/callback.py
run_callback(trainer, config, writer)
Runs the callback to apply step decay to the learning rate.
PARAMETER | DESCRIPTION |
---|---|
trainer
|
The training object.
TYPE:
|
config
|
The configuration object.
TYPE:
|
writer
|
The writer object for logging.
TYPE:
|
Source code in qadence/ml_tools/callbacks/callback.py
LoadCheckpoint(on='idle', called_every=1, callback=None, callback_condition=None, modify_optimize_result=None)
Bases: Callback
Callback to load a model checkpoint.
Source code in qadence/ml_tools/callbacks/callback.py
run_callback(trainer, config, writer)
Loads a model checkpoint.
PARAMETER | DESCRIPTION |
---|---|
trainer
|
The training object.
TYPE:
|
config
|
The configuration object.
TYPE:
|
writer
|
The writer object for logging.
TYPE:
|
RETURNS | DESCRIPTION |
---|---|
Any
|
The result of loading the checkpoint.
TYPE:
|
Source code in qadence/ml_tools/callbacks/callback.py
LogHyperparameters(on='idle', called_every=1, callback=None, callback_condition=None, modify_optimize_result=None)
Bases: Callback
Callback to log hyperparameters using the writer.
The LogHyperparameters
callback can be added to the TrainConfig
callbacks
as a custom user defined callback.
Example Usage in TrainConfig
:
To use LogHyperparameters
, include it in the callbacks
list when setting up your
TrainConfig
:
from qadence.ml_tools import TrainConfig
from qadence.ml_tools.callbacks import LogHyperparameters
# Create an instance of the LogHyperparameters callback
log_hyper_callback = LogHyperparameters(on = "val_batch_end", called_every = 100)
config = TrainConfig(
max_iter=10000,
# Print metrics every 1000 training epochs
print_every=1000,
# Add the custom callback that runs every 100 val_batch_end
callbacks=[log_hyper_callback]
)
Source code in qadence/ml_tools/callbacks/callback.py
run_callback(trainer, config, writer)
Logs hyperparameters using the writer.
PARAMETER | DESCRIPTION |
---|---|
trainer
|
The training object.
TYPE:
|
config
|
The configuration object.
TYPE:
|
writer
|
The writer object for logging.
TYPE:
|
Source code in qadence/ml_tools/callbacks/callback.py
LogModelTracker(on='idle', called_every=1, callback=None, callback_condition=None, modify_optimize_result=None)
Bases: Callback
Callback to log the model using the writer.
Source code in qadence/ml_tools/callbacks/callback.py
run_callback(trainer, config, writer)
Logs the model using the writer.
PARAMETER | DESCRIPTION |
---|---|
trainer
|
The training object.
TYPE:
|
config
|
The configuration object.
TYPE:
|
writer
|
The writer object for logging.
TYPE:
|
Source code in qadence/ml_tools/callbacks/callback.py
PlotMetrics(on='idle', called_every=1, callback=None, callback_condition=None, modify_optimize_result=None)
Bases: Callback
Callback to plot metrics using the writer.
The PlotMetrics
callback can be added to the TrainConfig
callbacks as
a custom user defined callback.
Example Usage in TrainConfig
:
To use PlotMetrics
, include it in the callbacks
list when setting up your
TrainConfig
:
from qadence.ml_tools import TrainConfig
from qadence.ml_tools.callbacks import PlotMetrics
# Create an instance of the PlotMetrics callback
plot_metrics_callback = PlotMetrics(on = "val_batch_end", called_every = 100)
config = TrainConfig(
max_iter=10000,
# Print metrics every 1000 training epochs
print_every=1000,
# Add the custom callback that runs every 100 val_batch_end
callbacks=[plot_metrics_callback]
)
Source code in qadence/ml_tools/callbacks/callback.py
run_callback(trainer, config, writer)
Plots metrics using the writer.
PARAMETER | DESCRIPTION |
---|---|
trainer
|
The training object.
TYPE:
|
config
|
The configuration object.
TYPE:
|
writer
|
The writer object for logging.
TYPE:
|
Source code in qadence/ml_tools/callbacks/callback.py
PrintMetrics(on='idle', called_every=1, callback=None, callback_condition=None, modify_optimize_result=None)
Bases: Callback
Callback to print metrics using the writer.
The `PrintMetrics` callback can be added to the `TrainConfig` callbacks as a custom user-defined callback.

Example Usage in `TrainConfig`:

To use `PrintMetrics`, include it in the `callbacks` list when setting up your `TrainConfig`:

```python
from qadence.ml_tools import TrainConfig
from qadence.ml_tools.callbacks import PrintMetrics

# Create an instance of the PrintMetrics callback
print_metrics_callback = PrintMetrics(on="val_batch_end", called_every=100)

config = TrainConfig(
    max_iter=10000,
    # Print metrics every 1000 training epochs
    print_every=1000,
    # Add the custom callback that runs every 100 val_batch_end events
    callbacks=[print_metrics_callback],
)
```

Source code in qadence/ml_tools/callbacks/callback.py

run_callback(trainer, config, writer)

Prints metrics using the writer.

| PARAMETER | DESCRIPTION |
|---|---|
| `trainer` | The training object. |
| `config` | The configuration object. |
| `writer` | The writer object for logging. |
Source code in qadence/ml_tools/callbacks/callback.py
SaveBestCheckpoint(on, called_every)
Bases: SaveCheckpoint
Callback to save the best model checkpoint based on a validation criterion.
Initializes the SaveBestCheckpoint callback.
| PARAMETER | DESCRIPTION |
|---|---|
| `on` | The event to trigger the callback. |
| `called_every` | Frequency of callback calls in terms of iterations. |

Source code in qadence/ml_tools/callbacks/callback.py

run_callback(trainer, config, writer)

Saves the checkpoint if the current loss is better than the best loss.

| PARAMETER | DESCRIPTION |
|---|---|
| `trainer` | The training object. |
| `config` | The configuration object. |
| `writer` | The writer object for logging. |
Source code in qadence/ml_tools/callbacks/callback.py
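The "best checkpoint" rule boils down to tracking the best validation loss seen so far and saving only on improvement. A hedged pure-Python sketch of that rule (an assumed simplification — the actual callback writes a model checkpoint via the trainer and config):

```python
import math

# Keep-best rule: save only when the monitored loss improves on the
# best value seen so far.
class BestCheckpointTracker:
    def __init__(self) -> None:
        self.best_loss = math.inf
        self.saved_at: list[int] = []

    def on_validation(self, iteration: int, val_loss: float) -> None:
        if val_loss < self.best_loss:
            self.best_loss = val_loss
            self.saved_at.append(iteration)  # a real callback saves a checkpoint here

tracker = BestCheckpointTracker()
for it, loss in enumerate([0.9, 0.7, 0.8, 0.5, 0.6]):
    tracker.on_validation(it, loss)

print(tracker.saved_at)  # [0, 1, 3]
```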
SaveCheckpoint(on='idle', called_every=1, callback=None, callback_condition=None, modify_optimize_result=None)
Bases: Callback
Callback to save a model checkpoint.
The `SaveCheckpoint` callback can be added to the `TrainConfig` callbacks as a custom user-defined callback.

Example Usage in `TrainConfig`:

To use `SaveCheckpoint`, include it in the `callbacks` list when setting up your `TrainConfig`:

```python
from qadence.ml_tools import TrainConfig
from qadence.ml_tools.callbacks import SaveCheckpoint

# Create an instance of the SaveCheckpoint callback
save_checkpoint_callback = SaveCheckpoint(on="val_batch_end", called_every=100)

config = TrainConfig(
    max_iter=10000,
    # Print metrics every 1000 training epochs
    print_every=1000,
    # Add the custom callback that runs every 100 val_batch_end events
    callbacks=[save_checkpoint_callback],
)
```

Source code in qadence/ml_tools/callbacks/callback.py

run_callback(trainer, config, writer)

Saves a model checkpoint.

| PARAMETER | DESCRIPTION |
|---|---|
| `trainer` | The training object. |
| `config` | The configuration object. |
| `writer` | The writer object for logging. |
Source code in qadence/ml_tools/callbacks/callback.py
WriteMetrics(on='idle', called_every=1, callback=None, callback_condition=None, modify_optimize_result=None)
Bases: Callback
Callback to write metrics using the writer.
The `WriteMetrics` callback can be added to the `TrainConfig` callbacks as a custom user-defined callback.

Example Usage in `TrainConfig`:

To use `WriteMetrics`, include it in the `callbacks` list when setting up your `TrainConfig`:

```python
from qadence.ml_tools import TrainConfig
from qadence.ml_tools.callbacks import WriteMetrics

# Create an instance of the WriteMetrics callback
write_metrics_callback = WriteMetrics(on="val_batch_end", called_every=100)

config = TrainConfig(
    max_iter=10000,
    # Print metrics every 1000 training epochs
    print_every=1000,
    # Add the custom callback that runs every 100 val_batch_end events
    callbacks=[write_metrics_callback],
)
```

Source code in qadence/ml_tools/callbacks/callback.py

run_callback(trainer, config, writer)

Writes metrics using the writer.

| PARAMETER | DESCRIPTION |
|---|---|
| `trainer` | The training object. |
| `config` | The configuration object. |
| `writer` | The writer object for logging. |
Source code in qadence/ml_tools/callbacks/callback.py
BaseTrainer(model, optimizer, config, loss_fn='mse', optimize_step=optimize_step, train_dataloader=None, val_dataloader=None, test_dataloader=None, max_batches=None)
Base class for training machine learning models using a given optimizer.
The base class implements context managers for gradient-based and gradient-free optimization, properties and property setters, input validation, a callback decorator generator, and empty hooks for the different training steps.

This class provides:
- Context managers for enabling/disabling gradient-based optimization
- Properties for managing models, optimizers, and dataloaders
- Input validation and a callback decorator generator
- Config and callback managers using the provided `TrainConfig`

| ATTRIBUTE | DESCRIPTION |
|---|---|
| `use_grad` | Indicates if gradients are used for optimization. Default is `True`. |
| `model` | The neural network model. |
| `optimizer` | The optimizer for training. |
| `config` | The configuration settings for training. |
| `train_dataloader` | DataLoader for training data. |
| `val_dataloader` | DataLoader for validation data. |
| `test_dataloader` | DataLoader for testing data. |
| `optimize_step` | Function for performing an optimization step. |
| `loss_fn` | Loss function to use. Default is `'mse'`. |
| `num_training_batches` | Number of training batches. In case of `InfiniteTensorDataset`, only 1 batch per epoch is used. |
| `num_validation_batches` | Number of validation batches. In case of `InfiniteTensorDataset`, only 1 batch per epoch is used. |
| `num_test_batches` | Number of test batches. In case of `InfiniteTensorDataset`, only 1 batch per epoch is used. |
| `state` | Current state in the training process. |

Initializes the BaseTrainer.

| PARAMETER | DESCRIPTION |
|---|---|
| `model` | The model to train. |
| `optimizer` | The optimizer for training. |
| `config` | The `TrainConfig` settings for training. |
| `loss_fn` | The loss function to use. A `str` can be given to select a default loss function; currently supported: `'mse'`, `'cross_entropy'`. If not specified, the default `'mse'` loss is used. |
| `train_dataloader` | DataLoader for training data. If the model does not need data to evaluate the loss, no dataset should be provided. |
| `val_dataloader` | DataLoader for validation data. |
| `test_dataloader` | DataLoader for testing data. |
| `max_batches` | Maximum number of batches to process per epoch. Only valid for finite `TensorDataset` dataloaders. If `max_batches` is not `None`, the number of batches used is `min(max_batches, len(dataloader.dataset))`. In case of `InfiniteTensorDataset`, only 1 batch per epoch is used. |
Source code in qadence/ml_tools/train_utils/base_trainer.py
config
property writable

Returns the training configuration.

| RETURNS | DESCRIPTION |
|---|---|
| `TrainConfig` | The configuration object. |

model
property writable

Returns the model if set, otherwise raises an error.

| RETURNS | DESCRIPTION |
|---|---|
| `nn.Module` | The model. |

optimizer
property writable

Returns the optimizer if set, otherwise raises an error.

| RETURNS | DESCRIPTION |
|---|---|
| `optim.Optimizer \| NGOptimizer \| None` | The optimizer. |

test_dataloader
property writable

Returns the test DataLoader, validating its type.

| RETURNS | DESCRIPTION |
|---|---|
| `DataLoader` | The DataLoader for testing data. |

train_dataloader
property writable

Returns the training DataLoader, validating its type.

| RETURNS | DESCRIPTION |
|---|---|
| `DataLoader` | The DataLoader for training data. |

use_grad
property writable

Returns the optimization framework for the trainer: `use_grad = True` for gradient-based optimization, `use_grad = False` for gradient-free optimization.

| RETURNS | DESCRIPTION |
|---|---|
| `bool` | Whether gradient-based optimization is used. |

val_dataloader
property writable

Returns the validation DataLoader, validating its type.

| RETURNS | DESCRIPTION |
|---|---|
| `DataLoader` | The DataLoader for validation data. |
_compute_num_batches(dataloader)
Computes the number of batches for the given DataLoader.
| PARAMETER | DESCRIPTION |
|---|---|
| `dataloader` | The DataLoader for which to compute the number of batches. |
Source code in qadence/ml_tools/train_utils/base_trainer.py
_validate_dataloader(dataloader, dataloader_type)
Validates the type of the DataLoader and raises errors for unsupported types.
| PARAMETER | DESCRIPTION |
|---|---|
| `dataloader` | The DataLoader to validate. |
| `dataloader_type` | The type of DataLoader (`"train"`, `"val"`, or `"test"`). |
Source code in qadence/ml_tools/train_utils/base_trainer.py
callback(phase)
staticmethod
Decorator for executing callbacks before and after a phase.
Phases are the different hooks during training; the list of valid phases is defined in `Callbacks`. The current state of the training process is also updated in the callback decorator.

| PARAMETER | DESCRIPTION |
|---|---|
| `phase` | The phase for which the callback is executed (e.g., `"train"`, `"train_epoch"`, `"train_batch"`). |

| RETURNS | DESCRIPTION |
|---|---|
| `Callable` | The decorated function. |
Source code in qadence/ml_tools/train_utils/base_trainer.py
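The decorator pattern described here — running hooks before and after a named phase — can be sketched in pure Python (in spirit only; the real decorator also updates the trainer state and dispatches to registered callbacks):

```python
from functools import wraps

events: list[str] = []

# Run "<phase>_start" before the wrapped function and "<phase>_end" after.
def callback(phase: str):
    def decorator(fn):
        @wraps(fn)
        def wrapper(*args, **kwargs):
            events.append(f"{phase}_start")
            result = fn(*args, **kwargs)
            events.append(f"{phase}_end")
            return result
        return wrapper
    return decorator

@callback("train_epoch")
def run_epoch() -> str:
    events.append("epoch_body")
    return "done"

run_epoch()
print(events)  # ['train_epoch_start', 'epoch_body', 'train_epoch_end']
```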
disable_grad_opt(optimizer=None)
Context manager to temporarily disable gradient-based optimization.
| PARAMETER | DESCRIPTION |
|---|---|
| `optimizer` | The Nevergrad optimizer to use. If no optimizer is provided, the trainer's default optimizer is used. |

Source code in qadence/ml_tools/train_utils/base_trainer.py

enable_grad_opt(optimizer=None)

Context manager to temporarily enable gradient-based optimization.

| PARAMETER | DESCRIPTION |
|---|---|
| `optimizer` | The PyTorch optimizer to use. If no optimizer is provided, the trainer's default optimizer is used. |
Source code in qadence/ml_tools/train_utils/base_trainer.py
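Both context managers amount to temporarily flipping the `use_grad` flag and restoring it on exit (qadence's versions additionally swap in the given optimizer). A minimal sketch of that toggle:

```python
from contextlib import contextmanager

class MiniTrainer:
    use_grad = True  # stands in for BaseTrainer's use_grad flag

@contextmanager
def disable_grad_opt(trainer: MiniTrainer):
    previous = trainer.use_grad
    trainer.use_grad = False  # gradient-free optimization inside the block
    try:
        yield trainer
    finally:
        trainer.use_grad = previous  # restored on exit

trainer = MiniTrainer()
with disable_grad_opt(trainer):
    inside = trainer.use_grad
outside = trainer.use_grad

print(inside, outside)  # False True
```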
on_test_batch_end(test_batch_loss_metrics)

Called at the end of each testing batch.

| PARAMETER | DESCRIPTION |
|---|---|
| `test_batch_loss_metrics` | Metrics for the testing batch loss; a tuple of `(loss, metrics)`. |

Source code in qadence/ml_tools/train_utils/base_trainer.py

on_test_batch_start(batch)

Called at the start of each testing batch.

| PARAMETER | DESCRIPTION |
|---|---|
| `batch` | A batch of data from the DataLoader. Typically a tuple containing input tensors and corresponding target tensors. |

Source code in qadence/ml_tools/train_utils/base_trainer.py

on_train_batch_end(train_batch_loss_metrics)

Called at the end of each training batch.

| PARAMETER | DESCRIPTION |
|---|---|
| `train_batch_loss_metrics` | Metrics for the training batch loss; a tuple of `(loss, metrics)`. |

Source code in qadence/ml_tools/train_utils/base_trainer.py

on_train_batch_start(batch)

Called at the start of each training batch.

| PARAMETER | DESCRIPTION |
|---|---|
| `batch` | A batch of data from the DataLoader. Typically a tuple containing input tensors and corresponding target tensors. |

Source code in qadence/ml_tools/train_utils/base_trainer.py

on_train_end(train_losses, val_losses=None)

Called at the end of training.

| PARAMETER | DESCRIPTION |
|---|---|
| `train_losses` | Metrics for the training losses, nested as epochs → training batches → `(loss, metrics)` tuples. |
| `val_losses` | Metrics for the validation losses, nested as epochs → validation batches → `(loss, metrics)` tuples. |

Source code in qadence/ml_tools/train_utils/base_trainer.py

on_train_epoch_end(train_epoch_loss_metrics)

Called at the end of each training epoch.

| PARAMETER | DESCRIPTION |
|---|---|
| `train_epoch_loss_metrics` | Metrics for the training epoch losses, a list over training batches of `(loss, metrics)` tuples. |

Source code in qadence/ml_tools/train_utils/base_trainer.py

on_train_epoch_start()

on_train_start()

on_val_batch_end(val_batch_loss_metrics)

Called at the end of each validation batch.

| PARAMETER | DESCRIPTION |
|---|---|
| `val_batch_loss_metrics` | Metrics for the validation batch loss; a tuple of `(loss, metrics)`. |

Source code in qadence/ml_tools/train_utils/base_trainer.py

on_val_batch_start(batch)

Called at the start of each validation batch.

| PARAMETER | DESCRIPTION |
|---|---|
| `batch` | A batch of data from the DataLoader. Typically a tuple containing input tensors and corresponding target tensors. |

Source code in qadence/ml_tools/train_utils/base_trainer.py

on_val_epoch_end(val_epoch_loss_metrics)

Called at the end of each validation epoch.

| PARAMETER | DESCRIPTION |
|---|---|
| `val_epoch_loss_metrics` | Metrics for the validation epoch loss, a list over validation batches of `(loss, metrics)` tuples. |

Source code in qadence/ml_tools/train_utils/base_trainer.py
on_val_epoch_start()
set_use_grad(value)
classmethod
Sets the global use_grad flag.
| PARAMETER | DESCRIPTION |
|---|---|
| `value` | Whether to use gradient-based optimization. |
Source code in qadence/ml_tools/train_utils/base_trainer.py
BaseWriter
Bases: ABC
Abstract base class for experiment tracking writers.
| METHOD | DESCRIPTION |
|---|---|
| `open` | Opens the writer and sets up the logging environment. |
| `close` | Closes the writer and finalizes any ongoing logging processes. |
| `print_metrics` | Prints metrics and loss in a formatted manner. |
| `write` | Writes the optimization results to the tracking tool. |
| `log_hyperparams` | Logs the hyperparameters to the tracking tool. |
| `plot` | Logs model plots using provided plotting functions. |
| `log_model` | Logs the model and any relevant information. |

close()
abstractmethod

log_hyperparams(hyperparams)
abstractmethod

Logs hyperparameters.

| PARAMETER | DESCRIPTION |
|---|---|
| `hyperparams` | A dictionary of hyperparameters to log. |

Source code in qadence/ml_tools/callbacks/writer_registry.py

log_model(model, train_dataloader=None, val_dataloader=None, test_dataloader=None)
abstractmethod

Logs the model and associated data.

| PARAMETER | DESCRIPTION |
|---|---|
| `model` | The model to log. |
| `train_dataloader` | DataLoader for training data. |
| `val_dataloader` | DataLoader for validation data. |
| `test_dataloader` | DataLoader for testing data. |

Source code in qadence/ml_tools/callbacks/writer_registry.py

open(config, iteration=None)
abstractmethod

Opens the writer and prepares it for logging.

| PARAMETER | DESCRIPTION |
|---|---|
| `config` | Configuration object containing settings for logging. |
| `iteration` | The iteration step to start logging from. Defaults to `None`. |

Source code in qadence/ml_tools/callbacks/writer_registry.py

plot(model, iteration, plotting_functions)
abstractmethod

Logs plots of the model using provided plotting functions.

| PARAMETER | DESCRIPTION |
|---|---|
| `model` | The model to plot. |
| `iteration` | The current iteration number. |
| `plotting_functions` | Functions used to generate plots. |

Source code in qadence/ml_tools/callbacks/writer_registry.py

print_metrics(result)

Prints the metrics and loss in a readable format.

| PARAMETER | DESCRIPTION |
|---|---|
| `result` | The optimization results to display. |

Source code in qadence/ml_tools/callbacks/writer_registry.py

write(iteration, metrics)
abstractmethod

Logs the results of the current iteration.

| PARAMETER | DESCRIPTION |
|---|---|
| `iteration` | The current training iteration. |
| `metrics` | A dictionary of metrics to log, where keys are metric names and values are the corresponding metric values. |
Source code in qadence/ml_tools/callbacks/writer_registry.py
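A concrete writer only has to implement this contract. As an illustration of the shape of such a subclass (an in-memory stub with abbreviated signatures, not one of qadence's writers):

```python
from abc import ABC, abstractmethod

class Writer(ABC):
    @abstractmethod
    def open(self, config): ...
    @abstractmethod
    def close(self): ...
    @abstractmethod
    def write(self, iteration: int, metrics: dict): ...

# In-memory writer: records every (iteration, metrics) pair it is given.
class MemoryWriter(Writer):
    def __init__(self):
        self.records: list[tuple[int, dict]] = []
        self.is_open = False
    def open(self, config):
        self.is_open = True
    def close(self):
        self.is_open = False
    def write(self, iteration: int, metrics: dict):
        self.records.append((iteration, metrics))

w = MemoryWriter()
w.open(config=None)
w.write(0, {"loss": 0.9})
w.write(100, {"loss": 0.4})
w.close()
print(w.records)  # [(0, {'loss': 0.9}), (100, {'loss': 0.4})]
```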
MLFlowWriter()
Bases: BaseWriter
Writer for logging to MLflow.
| ATTRIBUTE | DESCRIPTION |
|---|---|
| `run` | The active MLflow run. |
| `mlflow` | The MLflow module. |

Source code in qadence/ml_tools/callbacks/writer_registry.py

close()

get_signature_from_dataloader(model, dataloader)

Infers the signature of the model based on the input data from the dataloader.

| PARAMETER | DESCRIPTION |
|---|---|
| `model` | The model to use for inference. |
| `dataloader` | DataLoader for model inputs. |

| RETURNS | DESCRIPTION |
|---|---|
| `Optional[Any]` | The inferred signature, if available. |

Source code in qadence/ml_tools/callbacks/writer_registry.py

log_hyperparams(hyperparams)

Logs hyperparameters to MLflow.

| PARAMETER | DESCRIPTION |
|---|---|
| `hyperparams` | A dictionary of hyperparameters to log. |

Source code in qadence/ml_tools/callbacks/writer_registry.py

log_model(model, train_dataloader=None, val_dataloader=None, test_dataloader=None)

Logs the model and its signature to MLflow using the provided data loaders.

| PARAMETER | DESCRIPTION |
|---|---|
| `model` | The model to log. |
| `train_dataloader` | DataLoader for training data. |
| `val_dataloader` | DataLoader for validation data. |
| `test_dataloader` | DataLoader for testing data. |

Source code in qadence/ml_tools/callbacks/writer_registry.py

open(config, iteration=None)

Opens the MLflow writer and initializes an MLflow run.

| PARAMETER | DESCRIPTION |
|---|---|
| `config` | Configuration object containing settings for logging. |
| `iteration` | The iteration step to start logging from. Defaults to `None`. |

| RETURNS | DESCRIPTION |
|---|---|
| `mlflow` | The MLflow module instance. |

Source code in qadence/ml_tools/callbacks/writer_registry.py

plot(model, iteration, plotting_functions)

Logs plots of the model using provided plotting functions.

| PARAMETER | DESCRIPTION |
|---|---|
| `model` | The model to plot. |
| `iteration` | The current iteration number. |
| `plotting_functions` | Functions used to generate plots. |

Source code in qadence/ml_tools/callbacks/writer_registry.py

write(iteration, metrics)

Logs the results of the current iteration to MLflow.

| PARAMETER | DESCRIPTION |
|---|---|
| `iteration` | The current training iteration. |
| `metrics` | A dictionary of metrics to log, where keys are metric names and values are the corresponding metric values. |
Source code in qadence/ml_tools/callbacks/writer_registry.py
TensorBoardWriter()
Bases: BaseWriter
Writer for logging to TensorBoard.
| ATTRIBUTE | DESCRIPTION |
|---|---|
| `writer` | The TensorBoard SummaryWriter instance. |

Source code in qadence/ml_tools/callbacks/writer_registry.py

close()

log_hyperparams(hyperparams)

Logs hyperparameters to TensorBoard.

| PARAMETER | DESCRIPTION |
|---|---|
| `hyperparams` | A dictionary of hyperparameters to log. |

Source code in qadence/ml_tools/callbacks/writer_registry.py

log_model(model, train_dataloader=None, val_dataloader=None, test_dataloader=None)

Logs the model. Currently not supported by TensorBoard.

| PARAMETER | DESCRIPTION |
|---|---|
| `model` | The model to log. |
| `train_dataloader` | DataLoader for training data. |
| `val_dataloader` | DataLoader for validation data. |
| `test_dataloader` | DataLoader for testing data. |

Source code in qadence/ml_tools/callbacks/writer_registry.py

open(config, iteration=None)

Opens the TensorBoard writer.

| PARAMETER | DESCRIPTION |
|---|---|
| `config` | Configuration object containing settings for logging. |
| `iteration` | The iteration step to start logging from. Defaults to `None`. |

| RETURNS | DESCRIPTION |
|---|---|
| `SummaryWriter` | The initialized TensorBoard writer. |

Source code in qadence/ml_tools/callbacks/writer_registry.py

plot(model, iteration, plotting_functions)

Logs plots of the model using provided plotting functions.

| PARAMETER | DESCRIPTION |
|---|---|
| `model` | The model to plot. |
| `iteration` | The current iteration number. |
| `plotting_functions` | Functions used to generate plots. |

Source code in qadence/ml_tools/callbacks/writer_registry.py

write(iteration, metrics)

Logs the results of the current iteration to TensorBoard.

| PARAMETER | DESCRIPTION |
|---|---|
| `iteration` | The current training iteration. |
| `metrics` | A dictionary of metrics to log, where keys are metric names and values are the corresponding metric values. |
Source code in qadence/ml_tools/callbacks/writer_registry.py
get_writer(tracking_tool)
Factory method to get the appropriate writer based on the tracking tool.
| PARAMETER | DESCRIPTION |
|---|---|
| `tracking_tool` | The experiment tracking tool to use. |

| RETURNS | DESCRIPTION |
|---|---|
| `BaseWriter` | An instance of the appropriate writer. |
Source code in qadence/ml_tools/callbacks/writer_registry.py
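The factory boils down to a lookup from the tracking-tool identifier to a writer class. A sketch of that pattern (the string keys and stub classes here are illustrative, not qadence's actual registry):

```python
class TensorBoardWriterStub: ...
class MLFlowWriterStub: ...

WRITER_REGISTRY = {
    "tensorboard": TensorBoardWriterStub,
    "mlflow": MLFlowWriterStub,
}

# Map the tool name to a writer instance; fail loudly on unknown tools.
def get_writer(tracking_tool: str):
    try:
        return WRITER_REGISTRY[tracking_tool]()
    except KeyError:
        raise ValueError(f"Unsupported tracking tool: {tracking_tool!r}")

writer = get_writer("mlflow")
print(type(writer).__name__)  # MLFlowWriterStub
```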
InformationContent(model, loss_fn, xs, epsilons, variation_multiple=20)
Information Landscape class.
This class handles the study of loss landscape from information theoretic perspective and provides methods to get bounds on the norm of the gradient from the Information Content of the loss landscape.
| PARAMETER | DESCRIPTION |
|---|---|
| `model` | The quantum or classical model to analyze. |
| `loss_fn` | Loss function that takes the model output and calculates the loss. |
| `xs` | Input data to evaluate the model on. |
| `epsilons` | The thresholds to use for discretization of the finite derivatives. |
| `variation_multiple` | The number of sets of variational parameters to generate per variational parameter. The number of parameter sets required for the statistical analysis scales linearly with the number of variational parameters in the model; this is that linear factor. |
Notes
This class provides flexibility in terms of what the model, the loss function, and the xs are. The only requirement is that the loss_fn takes the model and xs as arguments and returns the loss, and another dictionary of other metrics.
Thus, assumed structure: loss_fn(model, xs) -> (loss, metrics, ...)
Example: A Classifier
```python
import torch
import torch.nn as nn

model = nn.Linear(10, 1)

def loss_fn(
    model: nn.Module,
    xs: tuple[torch.Tensor, torch.Tensor],
) -> tuple[torch.Tensor, dict[str, float]]:
    criterion = nn.MSELoss()
    inputs, labels = xs
    outputs = model(inputs)
    loss = criterion(outputs, labels)
    metrics = {"loss": loss.item()}
    return loss, metrics

xs = (torch.randn(10, 10), torch.randn(10, 1))
epsilons = torch.logspace(-2, 2, 10)  # example discretization thresholds
info_landscape = InformationContent(model, loss_fn, xs, epsilons)
```
Here, `xs` includes both the inputs and the target labels. The logic for calculating the loss from them lies entirely within the `loss_fn` function. This can then be used to obtain bounds on the average norm of the gradient of the loss function.
Example: A Physics Informed Neural Network
```python
class PhysicsInformedNN(nn.Module):
    # <initialization logic>

    def forward(self, xs: dict[str, torch.Tensor]) -> dict[str, torch.Tensor]:
        return {
            "pde_residual": pde_residual(xs["pde"]),
            "boundary_condition": bc_term(xs["bc"]),
        }

def loss_fn(
    model: PhysicsInformedNN,
    xs: dict[str, torch.Tensor],
) -> tuple[torch.Tensor, dict[str, torch.Tensor]]:
    outputs = model(xs)
    pde_residual = outputs["pde_residual"]
    bc_term = outputs["boundary_condition"]
    loss = (
        torch.mean(torch.sum(pde_residual**2, dim=1), dim=0)
        + torch.mean(torch.sum(bc_term**2, dim=1), dim=0)
    )
    return loss, {"pde_residual": pde_residual, "bc_term": bc_term}

xs = {
    "pde": torch.linspace(0, 1, 10),
    "bc": torch.tensor([0.0]),
}
info_landscape = InformationContent(model, loss_fn, xs, epsilons)
```
In this example, the model is a physics-informed neural network, and the `xs` are the inputs to the different residual components of the model. The logic for calculating the residuals lies within the `PhysicsInformedNN` class, and the loss function is defined to calculate the loss to be optimized from these residuals. This can then be used to obtain bounds on the average norm of the gradient of the loss function.

The first value that `loss_fn` returns is the loss value that is being optimized. The function is also expected to return other value(s), often the metrics used to calculate the loss. These values are ignored for the purposes of this class.
Source code in qadence/ml_tools/information/information_content.py
calculate_IC
cached property
Calculate Information Content for multiple epsilon values.
Returns: Tensor of IC values for each epsilon [n_epsilons]
batched_loss()
Calculate loss for all parameter variations in a batched manner.
Returns: Tensor of loss values for each parameter variation
Source code in qadence/ml_tools/information/information_content.py
calculate_transition_probabilities_batch()
Calculate transition probabilities for multiple epsilon values.
| RETURNS | DESCRIPTION |
|---|---|
| `Tensor` | Tensor of shape `[n_epsilons, 6]` containing the probabilities for each transition type, with columns ordered as `[+1→0, +1→−1, 0→+1, 0→−1, −1→0, −1→+1]`. |

Source code in qadence/ml_tools/information/information_content.py

discretize_derivatives()

Convert finite derivatives into discrete values.

| RETURNS | DESCRIPTION |
|---|---|
| `Tensor` | Tensor of discretized derivatives with shape `[n_epsilons, n_variations-2]`; each row contains values in `{-1, 0, 1}` for that epsilon. |

Source code in qadence/ml_tools/information/information_content.py

get_grad_norm_bounds_max_IC()

Compute the bounds on the average norm of the gradient.

| RETURNS | DESCRIPTION |
|---|---|
| `tuple[Tensor, Tensor]` | The lower and upper bounds. |

Source code in qadence/ml_tools/information/information_content.py

get_grad_norm_bounds_sensitivity_IC(eta)

Compute the bounds on the average norm of the gradient.

| PARAMETER | DESCRIPTION |
|---|---|
| `eta` | The sensitivity IC. |

| RETURNS | DESCRIPTION |
|---|---|
| `Tensor` | The lower bound. |
Source code in qadence/ml_tools/information/information_content.py
max_IC()
Get the maximum Information Content and its corresponding epsilon.
Returns: Tuple of (maximum IC value, optimal epsilon)
Source code in qadence/ml_tools/information/information_content.py
q_value(H_value)
cached staticmethod

Compute the q value.

q is the solution to the equation H(x) = 4h(x) + 2h(1/2 - 2x). It is the value of the probability of 4 of the 6 transitions such that the IC is the same as the IC of our system. This quantity is useful in calculating the bounds on the norms of the gradients.

| PARAMETER | DESCRIPTION |
|---|---|
| `H_value` | The information content. |

| RETURNS | DESCRIPTION |
|---|---|
| `float` | The q value. |
Source code in qadence/ml_tools/information/information_content.py
randomized_finite_der()
Calculate normalized finite difference of loss on doing random walk in the parameter space.
This serves as a proxy for the derivative of the loss with respect to parameters.
| RETURNS | DESCRIPTION |
|---|---|
| `Tensor` | Tensor of normalized finite differences (approximate directional derivatives) between consecutive points in the random walk, with shape `[n_variations - 1]`. |

Source code in qadence/ml_tools/information/information_content.py

reshape_param_variations()

Reshape variations of the model's variational parameters.

| RETURNS | DESCRIPTION |
|---|---|
| `dict[str, Tensor]` | Dictionary of parameter tensors, each with shape `[n_variations, *param_shape]`. |

Source code in qadence/ml_tools/information/information_content.py

sensitivity_IC(eta)

Find the minimum value of epsilon such that the information content is less than eta.

| PARAMETER | DESCRIPTION |
|---|---|
| `eta` | Threshold value, the sensitivity IC. |
Returns: The epsilon value that gives IC that is less than the sensitivity IC.
Source code in qadence/ml_tools/information/information_content.py
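The discretization step at the heart of the Information Content calculation maps each finite derivative to a symbol in {-1, 0, +1} by comparing it against a threshold epsilon. A sketch of that thresholding rule (assumed form, following the `discretize_derivatives` description above):

```python
# Map each finite difference to +1, -1, or 0 depending on whether it
# exceeds +epsilon, falls below -epsilon, or lies in between.
def discretize(derivatives: list[float], epsilon: float) -> list[int]:
    out = []
    for d in derivatives:
        if d > epsilon:
            out.append(1)
        elif d < -epsilon:
            out.append(-1)
        else:
            out.append(0)
    return out

derivs = [0.5, -0.01, 0.02, -0.8]
print(discretize(derivs, epsilon=0.1))  # [1, 0, 0, -1]
```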
Accelerator(nprocs=1, compute_setup='auto', log_setup='cpu', backend='gloo', dtype=None)
Bases: Distributor
A class for handling distributed training.

This class extends `Distributor` to manage distributed training using PyTorch's `torch.distributed` API. It supports spawning multiple processes and wrapping models with `DistributedDataParallel` (DDP) when required.

This class provides a head-level method, `distribute()`, which wraps a function at the head-process level before launching `nprocs` processes as required. It also provides process-level methods, such as `prepare()` and `prepare_batch()`, which can be run inside each process for the correct movement and preparation of the model, optimizers, and datasets.

Inherited Attributes

- nprocs (int): Number of processes to launch for distributed training.
- execution (BaseExecution): Detected execution instance for process launch (e.g., "torchrun", "default").
- execution_type (ExecutionType): Type of execution used.
- rank (int): Global rank of the process (set during environment setup).
- world_size (int): Total number of processes (set during environment setup).
- local_rank (int | None): Local rank on the node (set during environment setup).
- master_addr (str): Master node address (set during environment setup).
- master_port (str): Master node port (set during environment setup).
- node_rank (int): Rank of the node in the cluster setup.

There are three different indicators for the number of processes executed:

- `self._config_nprocs`: Number of processes specified by the user, provided at initialization of the Accelerator (`acc = Accelerator(nprocs=2)`).
- `self.nprocs`: Number of processes defined at the head level.
    - When the accelerator is used to spawn processes (e.g., default Python execution), `nprocs = _config_nprocs`.
    - When an external elastic launcher is used to spawn processes (e.g., torchrun), `nprocs = 1`, because the external launcher already spawns multiple processes and the accelerator init is called from each of them.
- `self.world_size`: Number of processes actually executed.
Initializes the Accelerator class.
| PARAMETER | DESCRIPTION |
|---|---|
| `nprocs` | Number of processes to launch. Default is 1. |
| `compute_setup` | Compute device setup; options are `"auto"` (default), `"gpu"`, or `"cpu"`. `"auto"` uses GPU if available, otherwise CPU; `"gpu"` forces GPU usage, raising an error if no CUDA device is available; `"cpu"` forces CPU usage. |
| `log_setup` | Logging device setup; options are `"auto"` and `"cpu"` (default). `"auto"` logs on the same device used for computation; `"cpu"` forces CPU logging. |
| `backend` | The backend for distributed communication. Default is `"gloo"`. |
| `dtype` | Data type for controlling numerical precision. Default is `None`. |
Source code in qadence/ml_tools/train_utils/accelerator.py
_prepare_data(dataloader)
Adjusts DataLoader(s) for distributed training.
| PARAMETER | DESCRIPTION |
|---|---|
| `dataloader` | The dataloader or dictionary of dataloaders to prepare. |

| RETURNS | DESCRIPTION |
|---|---|
| `DataLoader \| DictDataLoader` | The prepared dataloader(s) with the correct distributed sampling setup. |
Source code in qadence/ml_tools/train_utils/accelerator.py
_prepare_dataloader(dataloader)
Prepares a single DataLoader for distributed training.
When training in a distributed setting (i.e., when `self.world_size > 1`), data must be divided among multiple processes. This is achieved by creating a `DistributedSampler` that splits the dataset into distinct subsets for each process.

This method does the following:

- If distributed training is enabled:
    - Checks whether the dataset is an instance of `InfiniteTensorDataset`.
    - If not, creates a `DistributedSampler` for the dataset using the total number of replicas (`self.world_size`) and the current process's rank (`self.local_rank`).
    - Otherwise (i.e., for infinite datasets), no sampler is set (the sampler remains `None`).
    - Returns a new DataLoader configured with:
        - The same dataset and batch size as the original.
        - The distributed sampler (if applicable).
        - The number of workers and `pin_memory` settings from the original DataLoader.
- If not in a distributed setting (i.e., `self.world_size <= 1`), returns the original DataLoader unmodified.

| PARAMETER | DESCRIPTION |
|---|---|
| `dataloader` | The original DataLoader instance that loads the dataset. |

| RETURNS | DESCRIPTION |
|---|---|
| `DataLoader` | A new DataLoader prepared for distributed training in a multi-process environment; otherwise, the original DataLoader. |
Source code in qadence/ml_tools/train_utils/accelerator.py
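The effect of the distributed sampler is that each of the `world_size` processes sees a disjoint, near-equal slice of the dataset indices. A pure-Python sketch of that round-robin split (mirroring PyTorch's default assignment in spirit, without padding details):

```python
# Each rank keeps the indices congruent to its rank modulo world_size.
def shard_indices(dataset_len: int, world_size: int, rank: int) -> list[int]:
    return [i for i in range(dataset_len) if i % world_size == rank]

shards = [shard_indices(10, world_size=3, rank=r) for r in range(3)]
print(shards)  # [[0, 3, 6, 9], [1, 4, 7], [2, 5, 8]]
```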
_prepare_model(model)
Moves the model to the desired device and casts it to the specified dtype.
In a distributed setting, if more than one device is used (i.e., self.world_size > 1), the model is wrapped in DistributedDataParallel (DDP) to handle gradient synchronization across devices.
| PARAMETER | DESCRIPTION |
|---|---|
| `model` | The PyTorch model to prepare. |

| RETURNS | DESCRIPTION |
|---|---|
| `nn.Module` | The model moved to the correct device (and wrapped in DDP if applicable). |

Source code in qadence/ml_tools/train_utils/accelerator.py

_prepare_optimizer(optimizer)

Passes through the optimizer without modification.

| PARAMETER | DESCRIPTION |
|---|---|
| `optimizer` | The optimizer to prepare. |

| RETURNS | DESCRIPTION |
|---|---|
| `optim.Optimizer` | The unmodified optimizer. |
Source code in qadence/ml_tools/train_utils/accelerator.py
_spawn_method(instance, method, args, kwargs)
This method spawns the required number of processes:

- If the execution mode is `default`, it spawns `nprocs` processes across all nodes.
- Otherwise, it runs a single process.
PARAMETER | DESCRIPTION |
---|---|
`instance` | The object (Trainer) that contains the method to execute. This object is expected to have an |
`method` | The function of the method on the instance to be executed. |
`args` | Positional arguments to pass to the target method. |
`kwargs` | Keyword arguments to pass to the target method. |
Source code in qadence/ml_tools/train_utils/accelerator.py
all_reduce_dict(d, op='mean')
Performs an all-reduce operation on a dictionary of tensors, averaging values across all processes.
PARAMETER | DESCRIPTION |
---|---|
`d` | A dictionary where values are tensors to be reduced across processes. |
`op` | Operation method to all_reduce with. Available options include |

RETURNS | DESCRIPTION |
---|---|
`dict[str, Tensor]` | dict[str, torch.Tensor]: A dictionary with the reduced tensors, averaged over the world size. |
Source code in qadence/ml_tools/train_utils/accelerator.py
broadcast(obj, src)
Broadcasts an object from the source process to all processes. On non-source processes, the provided object is ignored and replaced by the broadcast value.
PARAMETER | DESCRIPTION |
---|---|
`obj` | The object to broadcast on the source process. |
`src` | The source process rank. |

RETURNS | DESCRIPTION |
---|---|
`Any` | The broadcasted object from the source process. |
Source code in qadence/ml_tools/train_utils/accelerator.py
distribute(fun)
Decorator to distribute the fit function across multiple processes.

This function is generic and can work with other methods as well, whether they are bound or unbound. When applied to a function (typically a fit function), this decorator will execute the function in a distributed fashion using torch.multiprocessing. The number of processes used is determined by `self.nprocs`, and if multiple nodes are involved (`self.num_nodes > 1`), the process count is adjusted accordingly. In single-process mode (`self.nprocs` is 1), the function is executed directly in the current process. After execution, the decorator returns the model stored in `instance.model`.
PARAMETER | DESCRIPTION |
---|---|
`fun` | The function to be decorated. This function usually implements a model fitting or training routine. |

RETURNS | DESCRIPTION |
---|---|
`callable` | The wrapped function. When called, it will execute in distributed mode (if configured) and return the value of |
Source code in qadence/ml_tools/train_utils/accelerator.py
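The single-process fallback behaviour can be illustrated with a stripped-down decorator (a pure-Python sketch; the real implementation spawns processes via torch.multiprocessing, which is omitted here):

```python
import functools

class MiniAccelerator:
    """Toy stand-in for the accelerator, keeping only the nprocs dispatch (sketch)."""

    def __init__(self, nprocs: int = 1):
        self.nprocs = nprocs

    def distribute(self, fun):
        @functools.wraps(fun)
        def wrapper(instance, *args, **kwargs):
            if self.nprocs == 1:
                # Single-process mode: run directly in the current process.
                fun(instance, *args, **kwargs)
            else:
                # Multi-process mode would call torch.multiprocessing.spawn(...) here.
                raise NotImplementedError("process spawning omitted in this sketch")
            # The decorator returns the model stored on the instance,
            # not the decorated function's own return value.
            return instance.model
        return wrapper

class MiniTrainer:
    def __init__(self):
        self.model = "untrained"

acc = MiniAccelerator(nprocs=1)

@acc.distribute
def fit(instance):
    instance.model = "trained"

trainer = MiniTrainer()
result = fit(trainer)  # runs in-process and returns trainer.model
```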
is_class_method(fun, args)
Determines if `fun` is a class method or a standalone function.

The first element of `args` indicates a class method when it is:

- an object that has a `__dict__` (making it a class instance), or
- an object that has a method named `fun` (making it a class that has this method).
PARAMETER | DESCRIPTION |
---|---|
`fun` | The function being checked. |
`args` | The arguments passed to the function. |

RETURNS | DESCRIPTION |
---|---|
`bool` | True if |
Source code in qadence/ml_tools/train_utils/accelerator.py
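The heuristic can be sketched in plain Python (an approximation of the check described above, not the exact source):

```python
def is_class_method(fun, args) -> bool:
    """Guess whether `fun` is a (bound) method by inspecting the first argument (sketch)."""
    if not args:
        return False
    first = args[0]
    # A class instance typically has a __dict__ and exposes a method
    # with the same name as `fun`.
    return hasattr(first, "__dict__") and hasattr(first, getattr(fun, "__name__", ""))

class Trainer:
    def fit(self):
        pass

def standalone():
    pass

t = Trainer()
```

Here `is_class_method(Trainer.fit, (t,))` holds because `t` has both a `__dict__` and a `fit` attribute, while a call like `is_class_method(standalone, (42,))` does not.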
prepare(*args)
Prepares models, optimizers, and dataloaders for distributed training.

This method iterates over the provided objects and:

- Moves models to the specified device (e.g., GPU or CPU) and casts them to the desired precision (specified by `self.dtype`). It then wraps models in DistributedDataParallel (DDP) if more than one device is used.
- Passes through optimizers unchanged.
- For dataloaders, adjusts them to use a distributed sampler (if applicable) by calling a helper method. Note that only the sampler is prepared; moving the actual batch data to the device is handled separately during training. Please use the `prepare_batch` method to move the batch to the correct device/dtype.
PARAMETER | DESCRIPTION |
---|---|
`*args` | A variable number of objects to be prepared. These can include: PyTorch models ( |

RETURNS | DESCRIPTION |
---|---|
`tuple[Any, ...]` | tuple[Any, ...]: A tuple containing the prepared objects, where each object has been modified as needed to support distributed training. |
Source code in qadence/ml_tools/train_utils/accelerator.py
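The type-based dispatch can be sketched as follows (a simplified stand-in: the real method delegates to the `_prepare_model`, `_prepare_optimizer`, and `_prepare_data` helpers documented above, and `device`/`dtype` are assumed attributes):

```python
import torch
import torch.nn as nn
from torch.optim import Optimizer
from torch.utils.data import DataLoader, TensorDataset

def prepare(*args, device="cpu", dtype=torch.float32):
    """Dispatch each object to the matching preparation step (sketch)."""
    prepared = []
    for obj in args:
        if isinstance(obj, nn.Module):
            # _prepare_model: move to device/dtype (DDP wrapping omitted here).
            prepared.append(obj.to(device=device, dtype=dtype))
        elif isinstance(obj, Optimizer):
            # _prepare_optimizer: pass through unchanged.
            prepared.append(obj)
        elif isinstance(obj, DataLoader):
            # _prepare_data would attach a DistributedSampler when world_size > 1.
            prepared.append(obj)
        else:
            raise TypeError(f"Cannot prepare object of type {type(obj)}")
    return tuple(prepared)

model = nn.Linear(2, 1)
opt = torch.optim.SGD(model.parameters(), lr=0.1)
loader = DataLoader(TensorDataset(torch.zeros(4, 2)), batch_size=2)
model, opt, loader = prepare(model, opt, loader, dtype=torch.float64)
```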
prepare_batch(batch)
Moves a batch of data to the target device and casts it to the desired data dtype.

This method is typically called within the optimization step of your training loop. It supports various batch formats:

- If the batch is a dictionary, each value is moved individually.
- If the batch is a tuple or list, each element is processed and returned as a tuple.
- Otherwise, the batch is processed directly.
PARAMETER | DESCRIPTION |
---|---|
`batch` | The batch of data to move to the device. This can be a dict, tuple, list, or any type compatible with |

RETURNS | DESCRIPTION |
---|---|
`Any` | The batch with all elements moved to |
Source code in qadence/ml_tools/train_utils/accelerator.py
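The recursive move over the supported batch formats can be sketched as (simplified; `device` and `dtype` stand in for the accelerator's attributes):

```python
import torch

def prepare_batch(batch, device="cpu", dtype=torch.float32):
    """Recursively move a batch (dict / tuple / list / tensor) to device and dtype (sketch)."""
    if isinstance(batch, dict):
        # Dictionaries: move each value individually, keep the keys.
        return {k: prepare_batch(v, device, dtype) for k, v in batch.items()}
    if isinstance(batch, (tuple, list)):
        # Tuples and lists: process each element, return as a tuple.
        return tuple(prepare_batch(v, device, dtype) for v in batch)
    # Anything else is assumed to support .to(device=..., dtype=...).
    return batch.to(device=device, dtype=dtype)

batch = {"x": torch.zeros(2, 3), "y": (torch.ones(2), torch.ones(2))}
moved = prepare_batch(batch, device="cpu", dtype=torch.float64)
```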
worker(rank, instance, fun, args, kwargs)
Worker function to be executed in each spawned process.

This function is called in every subprocess created by torch.multiprocessing (via `mp.spawn`). It performs the following tasks:

1. Sets up the accelerator for the given process rank. This typically involves configuring the GPU or other hardware resources for distributed training.
2. If the retrieved method has been decorated (i.e., it has a 'wrapped' attribute), the original, unwrapped function is invoked with the given arguments. Otherwise, the method is called directly.
PARAMETER | DESCRIPTION |
---|---|
`rank` | The rank (or identifier) of the spawned process. |
`instance` | The object (Trainer) that contains the method to execute. This object is expected to have an |
`fun` | The function of the method on the instance to be executed. |
`args` | Positional arguments to pass to the target method. |
`kwargs` | Keyword arguments to pass to the target method. |