ML Tools

This module implements gradient-free and gradient-based training loops for torch Modules and QuantumModel.

TrainConfig(max_iter=10000, print_every=1000, write_every=50, checkpoint_every=5000, folder=None, create_subfolder_per_run=False, checkpoint_best_only=False, val_every=None, val_epsilon=1e-05, validation_criterion=None, trainstop_criterion=None, batch_size=1, verbose=True) dataclass

Default config for the train function.

The default value of each field can be customized with the constructor:

from qadence.ml_tools import TrainConfig
c = TrainConfig(folder="/tmp/train")
TrainConfig(max_iter=10000, print_every=1000, write_every=50, checkpoint_every=5000, folder=PosixPath('/tmp/train'), create_subfolder_per_run=False, checkpoint_best_only=False, val_every=None, val_epsilon=1e-05, validation_criterion=<function TrainConfig.__post_init__.<locals>.<lambda> at 0x7f35cc675fc0>, trainstop_criterion=<function TrainConfig.__post_init__.<locals>.<lambda> at 0x7f35cc675bd0>, batch_size=1, verbose=True)

batch_size: int = 1 class-attribute instance-attribute

The batch_size to use when passing a list/tuple of torch.Tensors.

checkpoint_best_only: bool = False class-attribute instance-attribute

Write model/optimizer checkpoint only if a metric has improved.

checkpoint_every: int = 5000 class-attribute instance-attribute

Number of iterations between model/optimizer checkpoints.

create_subfolder_per_run: bool = False class-attribute instance-attribute

Checkpoint/tensorboard logs stored in subfolder with name <timestamp>_<PID>.

Prevents continuing from previous checkpoint, useful for fast prototyping.

folder: Optional[Path] = None class-attribute instance-attribute

Checkpoint/tensorboard logs folder.

max_iter: int = 10000 class-attribute instance-attribute

Number of training iterations.

print_every: int = 1000 class-attribute instance-attribute

Number of iterations between printouts of loss/metrics.

trainstop_criterion: Optional[Callable] = None class-attribute instance-attribute

A boolean function which evaluates whether a given training stopping metric is satisfied.

val_epsilon: float = 1e-05 class-attribute instance-attribute

Safety margin to check if validation loss is smaller than the lowest validation loss across previous iterations.

val_every: int | None = None class-attribute instance-attribute

Number of iterations between validation metric calculations.

If None, validation check is not performed.

validation_criterion: Optional[Callable] = None class-attribute instance-attribute

A boolean function which evaluates whether a given validation metric is satisfied.
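The gradient-based training loop on this page calls the criterion as `config.validation_criterion(val_loss, best_val_loss, config.val_epsilon)`. The default criterion is not shown here, so the following is only a sketch of one plausible custom criterion with that call shape; the name `improved_by_margin` is hypothetical.

```python
def improved_by_margin(val_loss: float, best_val_loss: float, val_epsilon: float) -> bool:
    """Hypothetical criterion: accept only a clear improvement over the best loss so far."""
    return val_loss < best_val_loss - val_epsilon


# Mimic how a training loop could track the best validation loss.
best = float("inf")
accepted = []
for val_loss in [0.50, 0.40, 0.399999, 0.30]:
    if improved_by_margin(val_loss, best, 1e-5):
        best = val_loss
        accepted.append(val_loss)

print(accepted)  # 0.399999 is rejected: it beats 0.40 by less than the margin
```

Under these assumptions, such a function could be passed as `TrainConfig(validation_criterion=improved_by_margin, val_every=...)`.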

verbose: bool = True class-attribute instance-attribute

Whether or not to print out metrics values during training.

write_every: int = 50 class-attribute instance-attribute

Number of iterations between tensorboard log writes.
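The `*_every` fields are applied in the training loops on this page through a modulo test of the form `iteration % config.print_every == 0`. The sketch below reproduces only that test with small illustrative values (the helper `fires` is a hypothetical name) to show which iterations trigger each action.

```python
def fires(iteration: int, every: int) -> bool:
    """Same modulo test the training loops apply to each `*_every` field."""
    return iteration % every == 0


# Small illustrative values instead of the defaults.
max_iter, print_every, write_every, checkpoint_every = 20, 10, 5, 20

schedule = {
    "print": [i for i in range(max_iter) if fires(i, print_every)],
    "write": [i for i in range(max_iter) if fires(i, write_every)],
    "checkpoint": [i for i in range(max_iter) if fires(i, checkpoint_every)],
}
print(schedule)
```

Note that iteration 0 always satisfies the test, so every action fires once at the very start of a fresh run.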

get_parameters(model)

Retrieve all trainable model parameters in a single vector.

PARAMETER DESCRIPTION
model

the input PyTorch model

TYPE: Module

RETURNS DESCRIPTION
Tensor

a 1-dimensional tensor with the parameters

TYPE: Tensor

Source code in qadence/ml_tools/parameters.py
def get_parameters(model: Module) -> Tensor:
    """Retrieve all trainable model parameters in a single vector.

    Args:
        model (Module): the input PyTorch model

    Returns:
        Tensor: a 1-dimensional tensor with the parameters
    """
    ps = [p.reshape(-1) for p in model.parameters() if p.requires_grad]
    return torch.concat(ps)

num_parameters(model)

Return the total number of parameters of the given model.

Source code in qadence/ml_tools/parameters.py
def num_parameters(model: Module) -> int:
    """Return the total number of parameters of the given model."""
    return len(get_parameters(model))

set_parameters(model, theta)

Set all trainable parameters of a model from a single vector.

Note that this function assumes prior knowledge of the right number of parameters in the model: theta is expected to hold one value per trainable parameter element.

PARAMETER DESCRIPTION
model

the input PyTorch model

TYPE: Module

theta

the parameters to assign

TYPE: Tensor

Source code in qadence/ml_tools/parameters.py
def set_parameters(model: Module, theta: Tensor) -> None:
    """Set all trainable parameters of a model from a single vector.

    Notice that this function assumes prior knowledge of right number
    of parameters in the model

    Args:
        model (Module): the input PyTorch model
        theta (Tensor): the parameters to assign
    """

    with torch.no_grad():
        idx = 0
        for ps in model.parameters():
            if ps.requires_grad:
                n = torch.numel(ps)
                if ps.ndim == 0:
                    ps[()] = theta[idx : idx + n]
                else:
                    ps[:] = theta[idx : idx + n].reshape(ps.size())
                idx += n
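
A round trip through both helpers can make the flattening order concrete. The block below is a self-contained sketch that re-declares condensed versions of the two functions (using `copy_` in place of the two indexing branches) so it runs without qadence installed; the `Linear(2, 1)` model is just an illustrative choice.

```python
import torch
from torch import Tensor
from torch.nn import Linear, Module


def get_parameters(model: Module) -> Tensor:
    # Flatten the trainable parameters only, in model.parameters() order.
    return torch.cat([p.reshape(-1) for p in model.parameters() if p.requires_grad])


def set_parameters(model: Module, theta: Tensor) -> None:
    # Walk the parameters in the same order and copy consecutive slices of theta.
    with torch.no_grad():
        idx = 0
        for p in model.parameters():
            if p.requires_grad:
                n = p.numel()
                p.copy_(theta[idx : idx + n].reshape(p.size()))
                idx += n


model = Linear(2, 1)        # weight has shape (1, 2), bias has shape (1,)
theta = torch.arange(3.0)   # one value per trainable parameter element
set_parameters(model, theta)
flat = get_parameters(model)
print(flat)
```

Because both functions iterate over `model.parameters()` in the same order, writing a vector and reading it back returns the same values.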

optimize_step(model, optimizer, loss_fn, xs, device=None, dtype=None)

Default Torch optimize step with closure.

This is the default optimization step, which should work for most standard use cases of optimizing Torch models.

PARAMETER DESCRIPTION
model

The input model

TYPE: Module

optimizer

The chosen Torch optimizer

TYPE: Optimizer

loss_fn

A custom loss function

TYPE: Callable

xs

the input data. If None it means that the given model does not require any input data

TYPE: dict | list | Tensor | None

device

A target device to run computation on.

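TYPE: device DEFAULT: None

dtype

Data type passed to data_to_device together with device when preparing xs.

TYPE: dtype DEFAULT: None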

RETURNS DESCRIPTION
tuple

tuple containing the computed loss value and a dictionary with the collected metrics

TYPE: tuple[Tensor | float, dict | None]

Source code in qadence/ml_tools/optimize_step.py
def optimize_step(
    model: Module,
    optimizer: Optimizer,
    loss_fn: Callable,
    xs: dict | list | torch.Tensor | None,
    device: torch.device = None,
    dtype: torch.dtype = None,
) -> tuple[torch.Tensor | float, dict | None]:
    """Default Torch optimize step with closure.

    This is the default optimization step which should work for most
    of the standard use cases of optimization of Torch models

    Args:
        model (Module): The input model
        optimizer (Optimizer): The chosen Torch optimizer
        loss_fn (Callable): A custom loss function
        xs (dict | list | torch.Tensor | None): the input data. If None it means
            that the given model does not require any input data
        device (torch.device): A target device to run computation on.

    Returns:
        tuple: tuple containing the computed loss value and a dictionary with
            the collected metrics
    """

    loss, metrics = None, {}
    xs_to_device = data_to_device(xs, device=device, dtype=dtype)

    def closure() -> Any:
        # NOTE: We need the nonlocal as we can't return a metric dict and
        # because e.g. LBFGS calls this closure multiple times but for some
        # reason the returned loss is always the first one...
        nonlocal metrics, loss
        optimizer.zero_grad()
        loss, metrics = loss_fn(model, xs_to_device)
        loss.backward(retain_graph=True)
        return loss.item()

    optimizer.step(closure)
    # return the loss/metrics that are being mutated inside the closure...
    return loss, metrics
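
The `nonlocal` trick in the closure can be illustrated without Torch. The sketch below uses a hypothetical `HalvingOptimizer` that, like some line-search optimizers, calls the closure more than once per step and returns only the first value; the outer variables end up holding the result of the last evaluation instead. All names here are assumptions made for illustration.

```python
from typing import Callable


class HalvingOptimizer:
    """Hypothetical optimizer that calls the closure twice per step."""

    def __init__(self, params: dict) -> None:
        self.params = params

    def step(self, closure: Callable[[], float]) -> float:
        first = closure()         # loss before the update
        self.params["x"] *= 0.5   # toy parameter update
        closure()                 # re-evaluate after the update
        return first              # only the first value is returned


def toy_optimize_step(optimizer: HalvingOptimizer, loss_fn: Callable, params: dict):
    loss, metrics = None, {}

    def closure() -> float:
        # Rebind the outer names so the latest evaluation survives the call.
        nonlocal loss, metrics
        loss, metrics = loss_fn(params)
        return loss

    returned = optimizer.step(closure)
    return returned, loss, metrics


def loss_fn(params: dict):
    return params["x"] ** 2, {"x": params["x"]}


params = {"x": 4.0}
returned, loss, metrics = toy_optimize_step(HalvingOptimizer(params), loss_fn, params)
print(returned, loss, metrics)
```

Here the value returned by `step` is the loss before the update, while `loss` and `metrics` reflect the evaluation after it, which is why the step function returns the captured variables.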

train(model, dataloader, optimizer, config, loss_fn, device=None, optimize_step=optimize_step, write_tensorboard=write_tensorboard, dtype=None)

Runs the training loop with gradient-based optimizer.

Assumes that loss_fn returns a tuple of (loss, metrics: dict), where metrics is a dict of scalars. Loss and metrics are written to tensorboard. Checkpoints are written every config.checkpoint_every steps (and after the last training step). If a checkpoint is found at config.folder we resume training from there. The tensorboard logs can be viewed via tensorboard --logdir /path/to/folder.
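The loop in the source listing iterates over `range(init_iter, init_iter + config.max_iter)`, where `init_iter` comes from the loaded checkpoint. The sketch below shows what that implies for a resumed run; the helper name and the restored value `5` are assumptions for illustration.

```python
def iteration_range(init_iter: int, max_iter: int) -> range:
    """Iterations covered by one call, mirroring range(init_iter, init_iter + max_iter)."""
    return range(init_iter, init_iter + max_iter)


fresh = list(iteration_range(0, 5))
resumed = list(iteration_range(5, 5))  # assuming a checkpoint restored init_iter = 5
print(fresh, resumed)
```

Under this reading, `max_iter` counts the iterations performed by the current call, not the total across resumed runs.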

PARAMETER DESCRIPTION
model

The model to train.

TYPE: Module

dataloader

A DataLoader or DictDataLoader supplying the training data. If None, no data is required by the model

TYPE: Union[None, DataLoader, DictDataLoader]

optimizer

The optimizer to use.

TYPE: Optimizer

config

TrainConfig with additional training options.

TYPE: TrainConfig

loss_fn

Loss function returning (loss: float, metrics: dict[str, float])

TYPE: Callable

device

String defining device to train on, pass 'cuda' for GPU.

TYPE: device DEFAULT: None

optimize_step

Customizable optimization callback which is called at every iteration. The function must have the signature optimize_step(model, optimizer, loss_fn, xs, device="cpu") (see the example below). Apart from the default, three other optimization functions are supplied: optimize_step_evo, optimize_step_grad_norm, and optimize_step_inv_dirichlet. Learn more about how to use this in the Advanced features tutorial of the documentation.

TYPE: Callable DEFAULT: optimize_step

write_tensorboard

Customizable tensorboard logging callback which is called every config.write_every iterations. The function must have the signature write_tensorboard(writer, loss, metrics, iteration) (see the example below).

TYPE: Callable DEFAULT: write_tensorboard

Example:

Source code in qadence/ml_tools/train_grad.py
def train(
    model: Module,
    dataloader: Union[None, DataLoader, DictDataLoader],
    optimizer: Optimizer,
    config: TrainConfig,
    loss_fn: Callable,
    device: torch_device = None,
    optimize_step: Callable = optimize_step,
    write_tensorboard: Callable = write_tensorboard,
    dtype: torch_dtype = None,
) -> tuple[Module, Optimizer]:
    """Runs the training loop with gradient-based optimizer.

    Assumes that `loss_fn` returns a tuple of (loss,
    metrics: dict), where `metrics` is a dict of scalars. Loss and metrics are
    written to tensorboard. Checkpoints are written every
    `config.checkpoint_every` steps (and after the last training step).  If a
    checkpoint is found at `config.folder` we resume training from there.  The
    tensorboard logs can be viewed via `tensorboard --logdir /path/to/folder`.

    Args:
        model: The model to train.
        dataloader: dataloader of different types. If None, no data is required by
            the model
        optimizer: The optimizer to use.
        config: `TrainConfig` with additional training options.
        loss_fn: Loss function returning (loss: float, metrics: dict[str, float])
        device: String defining device to train on, pass 'cuda' for GPU.
        optimize_step: Customizable optimization callback which is called at every iteration.
            The function must have the signature `optimize_step(model,
            optimizer, loss_fn, xs, device="cpu")` (see the example below).
            Apart from the default we already supply three other optimization
            functions `optimize_step_evo`, `optimize_step_grad_norm`, and
            `optimize_step_inv_dirichlet`. Learn more about how to use this in
            the [Advanced features](../../tutorials/advanced) tutorial of the
            documentation.
        write_tensorboard: Customizable tensorboard logging callback which is
            called every `config.write_every` iterations. The function must have
            the signature `write_tensorboard(writer, loss, metrics, iteration)`
            (see the example below).

    Example:
    ```python exec="on" source="material-block"
    from pathlib import Path
    import torch
    from itertools import count
    from qadence import Parameter, QuantumCircuit, Z
    from qadence import hamiltonian_factory, hea, feature_map, chain
    from qadence.models import QNN
    from qadence.ml_tools import TrainConfig, train_with_grad, to_dataloader

    n_qubits = 2
    fm = feature_map(n_qubits)
    ansatz = hea(n_qubits=n_qubits, depth=3)
    observable = hamiltonian_factory(n_qubits, detuning = Z)
    circuit = QuantumCircuit(n_qubits, fm, ansatz)

    model = QNN(circuit, observable, backend="pyqtorch", diff_mode="ad")
    batch_size = 1
    input_values = {"phi": torch.rand(batch_size, requires_grad=True)}
    pred = model(input_values)

    ## lets prepare the train routine

    cnt = count()
    criterion = torch.nn.MSELoss()
    optimizer = torch.optim.Adam(model.parameters(), lr=0.1)

    def loss_fn(model: torch.nn.Module, data: torch.Tensor) -> tuple[torch.Tensor, dict]:
        next(cnt)
        x, y = data[0], data[1]
        out = model(x)
        loss = criterion(out, y)
        return loss, {}

    tmp_path = Path("/tmp")
    n_epochs = 5
    batch_size = 25
    config = TrainConfig(
        folder=tmp_path,
        max_iter=n_epochs,
        checkpoint_every=100,
        write_every=100,
    )
    x = torch.linspace(0, 1, batch_size).reshape(-1, 1)
    y = torch.sin(x)
    data = to_dataloader(x, y, batch_size=batch_size, infinite=True)
    train_with_grad(model, data, optimizer, config, loss_fn=loss_fn)
    ```
    """
    # load available checkpoint
    init_iter = 0
    if config.folder:
        model, optimizer, init_iter = load_checkpoint(config.folder, model, optimizer)
        logger.debug(f"Loaded model and optimizer from {config.folder}")

    # Move model to device before optimizer is loaded
    if isinstance(model, DataParallel):
        model = model.module.to(device=device, dtype=dtype)
    else:
        model = model.to(device=device, dtype=dtype)
    # initialize tensorboard
    writer = SummaryWriter(config.folder, purge_step=init_iter)

    perform_val = isinstance(config.val_every, int)
    if perform_val:
        if not isinstance(dataloader, DictDataLoader):
            raise ValueError(
                "If `config.val_every` is provided as an integer, dataloader must "
                "be an instance of `DictDataLoader`."
            )
        iter_keys = dataloader.dataloaders.keys()
        if "train" not in iter_keys or "val" not in iter_keys:
            raise ValueError(
                "If `config.val_every` is provided as an integer, the dictdataloader "
                "must have `train` and `val` keys to access the respective dataloaders."
            )
        val_dataloader = dataloader.dataloaders["val"]
        dataloader = dataloader.dataloaders["train"]

    ## Training
    progress = Progress(
        TextColumn("[progress.description]{task.description}"),
        BarColumn(),
        TaskProgressColumn(),
        TimeRemainingColumn(elapsed_when_finished=True),
    )
    data_dtype = None
    if dtype:
        data_dtype = float64 if dtype == complex128 else float32

    best_val_loss = math.inf
    with progress:
        dl_iter = iter(dataloader) if dataloader is not None else None
        if perform_val:
            dl_iter_val = iter(val_dataloader) if val_dataloader is not None else None

        # outer epoch loop
        for iteration in progress.track(range(init_iter, init_iter + config.max_iter)):
            try:
                # in case there is not data needed by the model
                # this is the case, for example, of quantum models
                # which do not have classical input data (e.g. chemistry)
                if dataloader is None:
                    loss, metrics = optimize_step(
                        model=model,
                        optimizer=optimizer,
                        loss_fn=loss_fn,
                        xs=None,
                        device=device,
                        dtype=data_dtype,
                    )
                    loss = loss.item()

                elif isinstance(dataloader, (DictDataLoader, DataLoader)):
                    loss, metrics = optimize_step(
                        model=model,
                        optimizer=optimizer,
                        loss_fn=loss_fn,
                        xs=next(dl_iter),  # type: ignore[arg-type]
                        device=device,
                        dtype=data_dtype,
                    )

                else:
                    raise NotImplementedError(
                        f"Unsupported dataloader type: {type(dataloader)}. "
                        "You can use e.g. `qadence.ml_tools.to_dataloader` to build a dataloader."
                    )

                if iteration % config.print_every == 0 and config.verbose:
                    print_metrics(loss, metrics, iteration)

                if iteration % config.write_every == 0:
                    write_tensorboard(writer, loss, metrics, iteration)

                if perform_val:
                    if iteration % config.val_every == 0:
                        xs = next(dl_iter_val)
                        xs_to_device = data_to_device(xs, device=device, dtype=data_dtype)
                        val_loss, _ = loss_fn(model, xs_to_device)
                        if config.validation_criterion(val_loss, best_val_loss, config.val_epsilon):  # type: ignore[misc]
                            best_val_loss = val_loss
                            if config.folder and config.checkpoint_best_only:
                                write_checkpoint(config.folder, model, optimizer, iteration="best")
                            metrics["val_loss"] = val_loss
                            write_tensorboard(writer, math.nan, metrics, iteration)

                if config.folder:
                    if iteration % config.checkpoint_every == 0 and not config.checkpoint_best_only:
                        write_checkpoint(config.folder, model, optimizer, iteration)

            except KeyboardInterrupt:
                logger.info("Terminating training gracefully after the current iteration.")
                break

    # Final writing and checkpointing
    if config.folder and not config.checkpoint_best_only:
        write_checkpoint(config.folder, model, optimizer, iteration)
    write_tensorboard(writer, loss, metrics, iteration)
    writer.close()

    return model, optimizer

train(model, dataloader, optimizer, config, loss_fn)

Runs the training loop with a gradient-free optimizer.

Assumes that loss_fn returns a tuple of (loss, metrics: dict), where metrics is a dict of scalars. Loss and metrics are written to tensorboard. Checkpoints are written every config.checkpoint_every steps (and after the last training step). If a checkpoint is found at config.folder we resume training from there. The tensorboard logs can be viewed via tensorboard --logdir /path/to/folder.

PARAMETER DESCRIPTION
model

The model to train

TYPE: Module

dataloader

Dataloader constructed via dictdataloader

TYPE: DictDataLoader | DataLoader | None

optimizer

The optimizer to use, taken from the Nevergrad library. If this is not the case, the function will raise an AssertionError.

TYPE: Optimizer

loss_fn

Loss function returning (loss: float, metrics: dict[str, float])

TYPE: Callable
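
The gradient-free loop follows an ask/tell pattern: report the loss of the current parameters to the optimizer, then ask it for the next candidate. The sketch below imitates that pattern with a hypothetical `ToyAskTell` random-search class written in plain Python; it is a stand-in for illustration and not the Nevergrad API.

```python
import random


class ToyAskTell:
    """Hypothetical stand-in for an ask/tell optimizer (not Nevergrad)."""

    def __init__(self, init: list, sigma: float = 0.5, seed: int = 0) -> None:
        self.rng = random.Random(seed)
        self.best = list(init)
        self.best_loss = float("inf")
        self.sigma = sigma

    def ask(self) -> list:
        # Propose a random perturbation of the best point seen so far.
        return [v + self.rng.gauss(0.0, self.sigma) for v in self.best]

    def tell(self, candidate: list, loss: float) -> None:
        # Remember the candidate if it improves on the best loss.
        if loss < self.best_loss:
            self.best, self.best_loss = list(candidate), loss


def loss_fn(params: list):
    # Returns (loss, metrics), the same contract the training loops expect.
    return sum((p - 1.0) ** 2 for p in params), {}


params = [0.0, 0.0]
opt = ToyAskTell(params)
initial_loss, _ = loss_fn(params)

for _ in range(200):
    loss, metrics = loss_fn(params)  # evaluate current parameters
    opt.tell(params, float(loss))    # report the result
    params = opt.ask()               # fetch the next candidate

print(opt.best_loss)
```

No gradients are computed anywhere in this loop, which is the property that lets such optimizers work with non-differentiable losses.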

Source code in qadence/ml_tools/train_no_grad.py
def train(
    model: Module,
    dataloader: DictDataLoader | DataLoader | None,
    optimizer: NGOptimizer,
    config: TrainConfig,
    loss_fn: Callable,
) -> tuple[Module, NGOptimizer]:
    """Runs the training loop with a gradient-free optimizer.

    Assumes that `loss_fn` returns a tuple of (loss, metrics: dict), where
    `metrics` is a dict of scalars. Loss and metrics are written to
    tensorboard. Checkpoints are written every `config.checkpoint_every` steps
    (and after the last training step).  If a checkpoint is found at `config.folder`
    we resume training from there.  The tensorboard logs can be viewed via
    `tensorboard --logdir /path/to/folder`.

    Args:
        model: The model to train
        dataloader: Dataloader constructed via `dictdataloader`
        optimizer: The optimizer to use taken from the Nevergrad library. If this is not
            the case the function will raise an AssertionError
        loss_fn: Loss function returning (loss: float, metrics: dict[str, float])
    """
    init_iter = 0
    if config.folder:
        model, optimizer, init_iter = load_checkpoint(config.folder, model, optimizer)
        logger.debug(f"Loaded model and optimizer from {config.folder}")

    def _update_parameters(
        data: Tensor | None, ng_params: ng.p.Array
    ) -> tuple[float, dict, ng.p.Array]:
        loss, metrics = loss_fn(model, data)  # type: ignore[misc]
        optimizer.tell(ng_params, float(loss))
        ng_params = optimizer.ask()  # type: ignore [assignment]
        params = promote_to_tensor(ng_params.value, requires_grad=False)
        set_parameters(model, params)
        return loss, metrics, ng_params

    assert loss_fn is not None, "Provide a valid loss function"
    # TODO: support also Scipy optimizers
    assert isinstance(optimizer, NGOptimizer), "Use only optimizers from the Nevergrad library"

    # initialize tensorboard
    writer = SummaryWriter(config.folder, purge_step=init_iter)

    # set optimizer configuration and initial parameters
    optimizer.budget = config.max_iter
    optimizer.enable_pickling()

    # TODO: Make it GPU compatible if possible
    params = get_parameters(model).detach().numpy()
    ng_params = ng.p.Array(init=params)

    # serial training
    # TODO: Add a parallelization using the num_workers argument in Nevergrad
    progress = Progress(
        TextColumn("[progress.description]{task.description}"),
        BarColumn(),
        TaskProgressColumn(),
        TimeRemainingColumn(elapsed_when_finished=True),
    )
    with progress:
        dl_iter = iter(dataloader) if dataloader is not None else None

        for iteration in progress.track(range(init_iter, init_iter + config.max_iter)):
            if dataloader is None:
                loss, metrics, ng_params = _update_parameters(None, ng_params)

            elif isinstance(dataloader, (DictDataLoader, DataLoader)):
                data = next(dl_iter)  # type: ignore[arg-type]
                loss, metrics, ng_params = _update_parameters(data, ng_params)

            else:
                raise NotImplementedError("Unsupported dataloader type!")

            if iteration % config.print_every == 0 and config.verbose:
                print_metrics(loss, metrics, iteration)

            if iteration % config.write_every == 0:
                write_tensorboard(writer, loss, metrics, iteration)

            if config.folder:
                if iteration % config.checkpoint_every == 0:
                    write_checkpoint(config.folder, model, optimizer, iteration)

            if iteration >= init_iter + config.max_iter:
                break

    ## Final writing and stuff
    if config.folder:
        write_checkpoint(config.folder, model, optimizer, iteration)
    write_tensorboard(writer, loss, metrics, iteration)
    writer.close()

    return model, optimizer

DictDataLoader(dataloaders) dataclass

This class only holds a dictionary of DataLoaders and samples from them.

InfiniteTensorDataset(*tensors)

Bases: IterableDataset

Randomly sample points from the first dimension of the given tensors.

Behaves like a normal torch Dataset, except that we can sample from it as many times as we want.

Examples:

import torch
from qadence.ml_tools.data import InfiniteTensorDataset

x_data, y_data = torch.rand(5,2), torch.ones(5,1)
# The dataset accepts any number of tensors with the same batch dimension
ds = InfiniteTensorDataset(x_data, y_data)

# call `next` to get one sample from each tensor:
xs = next(iter(ds))
(tensor([0.0292, 0.4673]), tensor([1.]))
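
How the dataset iterates is not part of the listing on this page. One way to obtain the "sample as many times as we want" behaviour is a generator that never terminates and draws a random index on every step. The sketch below uses plain Python lists instead of tensors; `infinite_samples` is a hypothetical name and not the library implementation.

```python
import random
from itertools import islice


def infinite_samples(*columns, seed: int = 0):
    """Yield tuples of aligned entries forever, one random row at a time."""
    n = len(columns[0])
    if any(len(c) != n for c in columns):
        raise ValueError("All inputs must have the same first dimension.")
    rng = random.Random(seed)
    while True:
        i = rng.randrange(n)
        yield tuple(c[i] for c in columns)


xs = [0.1, 0.2, 0.3]
ys = [1, 2, 3]

# Draw more samples than there are rows; the generator never runs out.
samples = list(islice(infinite_samples(xs, ys), 10))
print(samples)
```

Each yielded tuple keeps the inputs aligned, so an `x` value is always paired with the `y` value from the same row.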

Source code in qadence/ml_tools/data.py
def __init__(self, *tensors: Tensor):
    """Randomly sample points from the first dimension of the given tensors.

    Behaves like a normal torch `Dataset` just that we can sample from it as
    many times as we want.

    Examples:
    ```python exec="on" source="above" result="json"
    import torch
    from qadence.ml_tools.data import InfiniteTensorDataset

    x_data, y_data = torch.rand(5,2), torch.ones(5,1)
    # The dataset accepts any number of tensors with the same batch dimension
    ds = InfiniteTensorDataset(x_data, y_data)

    # call `next` to get one sample from each tensor:
    xs = next(iter(ds))
    print(str(xs)) # markdown-exec: hide
    ```
    """
    self.tensors = tensors

data_to_device(xs, *args, **kwargs)

Utility method to move arbitrary data to 'device'.

Source code in qadence/ml_tools/data.py
@singledispatch
def data_to_device(xs: Any, *args: Any, **kwargs: Any) -> Any:
    """Utility method to move arbitrary data to 'device'."""
    raise ValueError(f"Unable to move {type(xs)} with input args: {args} and kwargs: {kwargs}.")
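
The base function above only handles the fallback case; with `functools.singledispatch`, type-specific behaviour comes from overloads registered on it, which are not shown on this page. The sketch below shows how such overloads could look for containers. `FakeTensor`, `move`, and the registered functions are hypothetical and avoid any Torch dependency.

```python
from dataclasses import dataclass
from functools import singledispatch
from typing import Any


@dataclass
class FakeTensor:
    """Hypothetical leaf type standing in for a tensor."""
    value: float
    device: str = "cpu"


@singledispatch
def move(xs: Any, device: str) -> Any:
    # Fallback for unregistered types, like the base function above.
    raise ValueError(f"Unable to move {type(xs)}.")


@move.register(FakeTensor)
def _move_leaf(xs: FakeTensor, device: str) -> FakeTensor:
    return FakeTensor(xs.value, device)


@move.register(list)
def _move_list(xs: list, device: str) -> list:
    # Recurse into each element; dispatch picks the right overload.
    return [move(x, device) for x in xs]


@move.register(dict)
def _move_dict(xs: dict, device: str) -> dict:
    return {key: move(val, device) for key, val in xs.items()}


@move.register(type(None))
def _move_none(xs: None, device: str) -> None:
    # Assumption: models without input data pass None straight through.
    return None


batch = {"x": [FakeTensor(1.0)], "y": FakeTensor(2.0)}
moved = move(batch, "cuda")
print(moved)
```

Dispatching on the type of the first argument keeps the recursion over nested containers short, and unsupported types still fail loudly through the fallback.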

to_dataloader(*tensors, batch_size=1, infinite=False)

Convert torch tensors to an (infinite) DataLoader.

PARAMETER DESCRIPTION
*tensors

Torch tensors to use in the dataloader.

TYPE: Tensor DEFAULT: ()

batch_size

batch size of sampled tensors

TYPE: int DEFAULT: 1

infinite

if True, the dataloader will keep sampling indefinitely even after the whole dataset was sampled once

TYPE: bool DEFAULT: False

Examples:

import torch
from qadence.ml_tools import to_dataloader

(x, y, z) = [torch.rand(10) for _ in range(3)]
loader = iter(to_dataloader(x, y, z, batch_size=5, infinite=True))
print(next(loader))
print(next(loader))
print(next(loader))
[tensor([0.4666, 0.4997, 0.6687, 0.8507, 0.1610]), tensor([0.1936, 0.0998, 0.5131, 0.6089, 0.2802]), tensor([0.7594, 0.1033, 0.1207, 0.1032, 0.4517])]
[tensor([0.4290, 0.8304, 0.3368, 0.8526, 0.3990]), tensor([0.7135, 0.6896, 0.9118, 0.5699, 0.2560]), tensor([0.3177, 0.3676, 0.8235, 0.4959, 0.2025])]
[tensor([0.4666, 0.4997, 0.6687, 0.8507, 0.1610]), tensor([0.1936, 0.0998, 0.5131, 0.6089, 0.2802]), tensor([0.7594, 0.1033, 0.1207, 0.1032, 0.4517])]

Source code in qadence/ml_tools/data.py
def to_dataloader(*tensors: Tensor, batch_size: int = 1, infinite: bool = False) -> DataLoader:
    """Convert torch tensors to an (infinite) DataLoader.

    Arguments:
        *tensors: Torch tensors to use in the dataloader.
        batch_size: batch size of sampled tensors
        infinite: if `True`, the dataloader will keep sampling indefinitely even after the whole
            dataset was sampled once

    Examples:

    ```python exec="on" source="above" result="json"
    import torch
    from qadence.ml_tools import to_dataloader

    (x, y, z) = [torch.rand(10) for _ in range(3)]
    loader = iter(to_dataloader(x, y, z, batch_size=5, infinite=True))
    print(next(loader))
    print(next(loader))
    print(next(loader))
    ```
    """
    ds = InfiniteTensorDataset(*tensors) if infinite else TensorDataset(*tensors)
    return DataLoader(ds, batch_size=batch_size)