Qadence Trainer Guide
The Trainer
class in qadence.ml_tools
is a versatile tool designed to streamline the training of quantum machine learning models.
It offers flexibility for both gradient-based and gradient-free optimization methods, supports custom loss functions, and integrates seamlessly with tracking tools like TensorBoard and MLflow.
Additionally, it provides hooks for implementing custom behaviors during the training process.
For training QML models, Qadence offers this out-of-the-box Trainer for optimizing differentiable models, e.g. QNNs and QuantumModel, containing trainable and/or non-trainable parameters (see the parameters tutorial for detailed information about parameter types).
1. Overview
The Trainer
class simplifies the training workflow by managing the training loop, handling data loading, and facilitating model evaluation.
It is compatible with various optimization strategies and allows for extensive customization to meet specific training requirements.
Example of initializing the Trainer
:
from qadence.ml_tools import Trainer, TrainConfig
from torch.optim import Adam
# Initialize model and optimizer
model = ... # Define or load a quantum model here
optimizer = Adam(model.parameters(), lr=0.01)
config = TrainConfig(max_iter=100, print_every=10)
# Initialize Trainer with model, optimizer, and configuration
trainer = Trainer(model=model, optimizer=optimizer, config=config)
Notes:
qadence versions prior to 1.9.0 provided train_with_grad and train_no_grad functions, which are being replaced with Trainer. Users can transition as follows.
2. Gradient-Based and Gradient-Free Optimization
The Trainer
supports both gradient-based and gradient-free optimization methods.
Default is gradient-based optimization.
- Gradient-Based Optimization: Utilizes optimizers from PyTorch's torch.optim module. This is the default behaviour of the Trainer, so setting it is not necessary, but it can be set explicitly as follows. Example of using gradient-based optimization:
from qadence.ml_tools import Trainer
# set_use_grad(True) to enable gradient based training. This is the default behaviour of Trainer.
Trainer.set_use_grad(True)
- Gradient-Free Optimization: Employs optimization algorithms from the Nevergrad library.
Example of using gradient-free optimization with Nevergrad:
from qadence.ml_tools import Trainer
# set_use_grad(False) to disable gradient based training.
Trainer.set_use_grad(False)
Using Context Managers for Mixed Optimization
For cases requiring both optimization methods in a single training session, the Trainer
class provides context managers to enable or disable gradients.
# Temporarily switch to gradient-based optimization
with trainer.enable_grad_opt(optimizer):
    print("Gradient Based Optimization")
    # trainer.fit(train_loader)
# Switch to gradient-free optimization for specific steps
with trainer.disable_grad_opt(ng_optimizer):
    print("Gradient Free Optimization")
    # trainer.fit(train_loader)
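The toggle-and-restore behaviour that such context managers typically provide can be sketched in plain Python. The `ToyTrainer` class, its attributes and the `_switch` helper below are hypothetical stand-ins used only to illustrate the pattern; this is not qadence's implementation.

```python
from contextlib import contextmanager


class ToyTrainer:
    """Hypothetical stand-in for a trainer; not the qadence Trainer."""

    def __init__(self, optimizer):
        self.use_grad = True
        self.optimizer = optimizer

    @contextmanager
    def _switch(self, use_grad, optimizer):
        # Remember the current settings so they can be restored on exit.
        previous = (self.use_grad, self.optimizer)
        self.use_grad, self.optimizer = use_grad, optimizer
        try:
            yield self
        finally:
            # Restore the previous mode even if the body raised an exception.
            self.use_grad, self.optimizer = previous

    def enable_grad_opt(self, optimizer):
        return self._switch(True, optimizer)

    def disable_grad_opt(self, optimizer):
        return self._switch(False, optimizer)


toy = ToyTrainer(optimizer="adam")
modes = []
with toy.disable_grad_opt("nevergrad"):
    # Inside the block the toy trainer is in gradient-free mode.
    modes.append((toy.use_grad, toy.optimizer))
# After the block the original mode and optimizer are back.
modes.append((toy.use_grad, toy.optimizer))
print(modes)
```

The point of the pattern is that the mode change is scoped to the `with` block, so code after the block does not need to reset anything by hand.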
3. Custom Loss Functions
Users can define custom loss functions tailored to their specific tasks.
The Trainer
accepts a loss_fn
parameter, which should be a callable that takes the model and data as inputs and returns a tuple containing the loss tensor and a dictionary of metrics.
Example of using a custom loss function:
import torch
from itertools import count
cnt = count()
criterion = torch.nn.MSELoss()
def loss_fn(model: torch.nn.Module, data: torch.Tensor) -> tuple[torch.Tensor, dict]:
    next(cnt)
    x, y = data
    out = model(x)
    loss = criterion(out, y)
    return loss, {}
This custom loss function can be used in the trainer:
from qadence.ml_tools import Trainer, TrainConfig
from torch.optim import Adam
# Initialize model and optimizer
model = ... # Define or load a quantum model here
optimizer = Adam(model.parameters(), lr=0.01)
config = TrainConfig(max_iter=100, print_every=10)
trainer = Trainer(model=model, optimizer=optimizer, config=config, loss_fn=loss_fn)
4. Hooks for Custom Behavior
The Trainer
class provides several hooks that enable users to inject custom behavior at different stages of the training process.
These hooks are methods that can be overridden in a subclass to execute custom code.
The available hooks include:
- on_train_start: Called at the beginning of the training process.
- on_train_end: Called at the end of the training process.
- on_train_epoch_start: Called at the start of each training epoch.
- on_train_epoch_end: Called at the end of each training epoch.
- on_train_batch_start: Called at the start of each training batch.
- on_train_batch_end: Called at the end of each training batch.
Each "start" and "end" hook receives data and loss metrics as arguments. The specific values provided for these arguments depend on the training stage associated with the hook. The context of the training stage (e.g., training, validation, or testing) determines which metrics are relevant and how they are populated. For details of inputs on each hook, please review the documentation of BaseTrainer
.
- Example of what inputs are provided to training hooks.
```
def on_train_batch_start(self, batch: Tuple[torch.Tensor, ...] | None) -> None:
    """
    Called at the start of each training batch.
    Args:
        batch: A batch of data from the DataLoader. Typically a tuple containing
            input tensors and corresponding target tensors.
    """
    pass
```
```
def on_train_batch_end(self, train_batch_loss_metrics: Tuple[torch.Tensor, Any]) -> None:
    """
    Called at the end of each training batch.
    Args:
        train_batch_loss_metrics: Metrics for the training batch loss.
            Tuple of (loss, metrics)
    """
    pass
```
Example of using a hook to log a message at the end of each epoch:
from qadence.ml_tools import Trainer
class CustomTrainer(Trainer):
    def on_train_epoch_end(self, train_epoch_loss_metrics):
        print(f"End of epoch - Loss and Metrics: {train_epoch_loss_metrics}")
Notes: Trainer offers inbuilt callbacks as well. Callbacks are mainly for logging/tracking purposes, but the above mentioned hooks are generic. The workflow for every train batch looks like: 1. perform on_train_batch_start callbacks, 2. call the on_train_batch_start hook, 3. do the batch training, 4. call the on_train_batch_end hook, and 5. perform on_train_batch_end callbacks.
The on_{phase}_start and on_{phase}_end hooks are not intended specifically for adding extra callbacks, but for any other generic pre- or post-processing, for example reshaping the input batch for RNNs/LSTMs, or post-processing the loss and adding an extra metric. They could also be used to add more callbacks, although this is not recommended, since TrainConfig provides methods for adding extra callbacks.
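The per-batch ordering of callbacks and hooks described in the notes can be sketched with a toy class that only records events. Everything here (`ToyTrainer`, `run_train_batch`, the callback dictionary and the event names) is hypothetical and chosen for illustration; it is not how qadence implements its Trainer.

```python
class ToyTrainer:
    """Hypothetical trainer that only records the order of events."""

    def __init__(self, callbacks):
        self.callbacks = callbacks  # maps an event name to a list of callables
        self.events = []

    def _run_callbacks(self, event):
        for callback in self.callbacks.get(event, []):
            callback(self)

    # Hooks: meant to be overridden in subclasses for pre/post processing.
    def on_train_batch_start(self, batch):
        self.events.append("hook:on_train_batch_start")

    def on_train_batch_end(self, loss_metrics):
        self.events.append("hook:on_train_batch_end")

    def run_train_batch(self, batch):
        self._run_callbacks("train_batch_start")  # 1. start callbacks
        self.on_train_batch_start(batch)          # 2. start hook
        self.events.append("train_batch")         # 3. the batch training itself
        loss_metrics = (0.0, {})                  # placeholder loss and metrics
        self.on_train_batch_end(loss_metrics)     # 4. end hook
        self._run_callbacks("train_batch_end")    # 5. end callbacks
        return loss_metrics


def log_start(trainer):
    trainer.events.append("callback:train_batch_start")


def log_end(trainer):
    trainer.events.append("callback:train_batch_end")


toy = ToyTrainer({"train_batch_start": [log_start], "train_batch_end": [log_end]})
toy.run_train_batch(batch=None)
print(toy.events)
```

In this sketch the callbacks wrap the hooks, which in turn wrap the training step, mirroring the five-step order listed in the notes.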
5. Experiment Tracking with TensorBoard and MLflow
The Trainer
integrates with TensorBoard and MLflow for experiment tracking:
- TensorBoard: Logs metrics and visualizations during training, allowing users to monitor the training process.
- MLflow: Tracks experiments, logs parameters, metrics, and artifacts, and provides a user-friendly interface for comparing different runs.
To utilize these tracking tools, the Trainer
can be configured with appropriate writers that handle the logging of metrics and other relevant information during training.
Example of using TensorBoard tracking:
from qadence.ml_tools import TrainConfig
from qadence.types import ExperimentTrackingTool
# Set up tracking with TensorBoard
config = TrainConfig(max_iter=100, tracking_tool=ExperimentTrackingTool.TENSORBOARD)
Example of using MLflow tracking:
from qadence.ml_tools import TrainConfig
from qadence.types import ExperimentTrackingTool
# Set up tracking with MLflow
config = TrainConfig(max_iter=100, tracking_tool=ExperimentTrackingTool.MLFLOW)
6. Examples
6.1. Training with Trainer
and TrainConfig
Setup
Let's do the necessary imports and declare a DataLoader
. We can already define some hyperparameters here, including the seed for random number generators. mlflow can log hyperparameters with arbitrary types, for example the observable that we want to monitor (Z
in this case, which has a qadence.Operation
type).
import random
from itertools import count
import numpy as np
import torch
from matplotlib import pyplot as plt
from matplotlib.figure import Figure
from torch.nn import Module
from torch.utils.data import DataLoader
from qadence import hea, QuantumCircuit, Z
from qadence.constructors import feature_map, hamiltonian_factory
from qadence.ml_tools import Trainer, TrainConfig
from qadence.ml_tools.data import to_dataloader
from qadence.ml_tools.utils import rand_featureparameters
from qadence.ml_tools.models import QNN, QuantumModel
from qadence.types import ExperimentTrackingTool
hyperparams = {
    "seed": 42,
    "batch_size": 10,
    "n_qubits": 2,
    "ansatz_depth": 1,
    "observable": Z,
}
np.random.seed(hyperparams["seed"])
torch.manual_seed(hyperparams["seed"])
random.seed(hyperparams["seed"])
def dataloader(batch_size: int = 25) -> DataLoader:
    x = torch.linspace(0, 1, batch_size).reshape(-1, 1)
    y = torch.cos(x)
    return to_dataloader(x, y, batch_size=batch_size, infinite=True)
We continue with the regular QNN definition, together with the loss function and optimizer.
obs = hamiltonian_factory(register=hyperparams["n_qubits"], detuning=hyperparams["observable"])
data = dataloader(hyperparams["batch_size"])
fm = feature_map(hyperparams["n_qubits"], param="x")
model = QNN(
    QuantumCircuit(
        hyperparams["n_qubits"], fm, hea(hyperparams["n_qubits"], hyperparams["ansatz_depth"])
    ),
    observable=obs,
    inputs=["x"],
)
cnt = count()
criterion = torch.nn.MSELoss()
optimizer = torch.optim.Adam(model.parameters(), lr=0.1)
inputs = rand_featureparameters(model, 1)
def loss_fn(model: QuantumModel, data: torch.Tensor) -> tuple[torch.Tensor, dict]:
    next(cnt)
    out = model.expectation(inputs)
    loss = criterion(out, torch.rand(1))
    return loss, {}
TrainConfig
specifications
Qadence offers different tracking options via TrainConfig
. Here we use the ExperimentTrackingTool
type to specify that we want to track the experiment with mlflow. Tracking with tensorboard is also possible. We can then indicate what and how often we want to track or log.
For Training
write_every
controls the number of epochs after which the loss values are logged. Thanks to the plotting_functions
and plot_every
arguments, we are also able to plot model-related quantities throughout training. Notice that arbitrary plotting functions can be passed, as long as the signature is the same as plot_fn
below. Finally, the trained model can be logged by setting log_model=True
. Here is an example of a plotting function and a training configuration:
def plot_fn(model: Module, iteration: int) -> tuple[str, Figure]:
    descr = f"ufa_prediction_epoch_{iteration}.png"
    fig, ax = plt.subplots()
    x = torch.linspace(0, 1, 100).reshape(-1, 1)
    out = model.expectation(x)
    ax.plot(x.detach().numpy(), out.detach().numpy())
    return descr, fig
config = TrainConfig(
    root_folder="mlflow_demonstration",
    max_iter=10,
    checkpoint_every=1,
    plot_every=2,
    write_every=1,
    log_model=True,
    tracking_tool=ExperimentTrackingTool.MLFLOW,
    hyperparams=hyperparams,
    plotting_functions=(plot_fn,),
)
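To get a feel for how the `*_every` settings interact, the sketch below lists the iterations at which each action would fire for the values used in this configuration. It assumes a simple rule, `iteration % every == 0` with zero-based iteration counting, similar to the simplified loop in section 6.4; the actual Trainer may count iterations differently. The helper `scheduled_iterations` is hypothetical and not part of qadence.

```python
def scheduled_iterations(max_iter: int, every: int) -> list[int]:
    """Iterations in range(max_iter) at which an action with period `every` fires.

    Assumption: the action fires whenever iteration % every == 0.
    """
    if every <= 0:
        return []
    return [it for it in range(max_iter) if it % every == 0]


# Values taken from the configuration: max_iter=10, write_every=1,
# plot_every=2, checkpoint_every=1.
schedule = {
    "write": scheduled_iterations(10, 1),
    "plot": scheduled_iterations(10, 2),
    "checkpoint": scheduled_iterations(10, 1),
}
for action, iterations in schedule.items():
    print(action, iterations)
```

Under this assumption, metrics are written and checkpoints saved at every iteration, while plots are produced at every second iteration.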
Training and inspecting
Model training happens as usual.
After training, we can inspect our experiment via the mlflow UI.
In this case, since we're running on a local server, we can access the mlflow UI by navigating to http://localhost:8080/.
6.2. Fitting a function with a QNN using ml_tools
In Quantum Machine Learning, the general consensus is to use complex128
precision for states and operators and float64
precision for parameters. This is also the convention which is used in qadence
.
However, for specific use cases, lower precision can greatly speed up training and reduce memory consumption. When using the pyqtorch
backend, qadence
offers the option to move a QuantumModel
instance to a specific precision using the torch to
syntax.
Let's look at a complete example of how to use Trainer
now. Here we perform a validation check during training and use a validation criterion that checks whether the validation loss in the current iteration has decreased compared to the lowest validation loss from all previous iterations. For demonstration, the train and validation data are kept the same here. In practice, however, it is beneficial and encouraged to keep them distinct in order to assess the model's generalization capabilities.
from pathlib import Path
import torch
from functools import reduce
from operator import add
from itertools import count
import matplotlib.pyplot as plt
from qadence import Parameter, QuantumCircuit, Z
from qadence import hamiltonian_factory, hea, feature_map, chain
from qadence import QNN
from qadence.ml_tools import TrainConfig, Trainer, to_dataloader
Trainer.set_use_grad(True)
DEVICE = torch.device('cuda') if torch.cuda.is_available() else torch.device('cpu')
DTYPE = torch.complex64
n_qubits = 4
fm = feature_map(n_qubits)
ansatz = hea(n_qubits=n_qubits, depth=3)
observable = hamiltonian_factory(n_qubits, detuning=Z)
circuit = QuantumCircuit(n_qubits, fm, ansatz)
model = QNN(circuit, observable, backend="pyqtorch", diff_mode="ad")
batch_size = 100
input_values = {"phi": torch.rand(batch_size, requires_grad=True)}
pred = model(input_values)
cnt = count()
criterion = torch.nn.MSELoss()
optimizer = torch.optim.Adam(model.parameters(), lr=0.1)
def loss_fn(model: torch.nn.Module, data: torch.Tensor) -> tuple[torch.Tensor, dict]:
    next(cnt)
    x, y = data[0], data[1]
    out = model(x)
    loss = criterion(out, y)
    return loss, {}
def validation_criterion(
    current_validation_loss: float, current_best_validation_loss: float, val_epsilon: float
) -> bool:
    return current_validation_loss <= current_best_validation_loss - val_epsilon
n_epochs = 300
config = TrainConfig(
    max_iter=n_epochs,
    batch_size=batch_size,
    checkpoint_best_only=True,
    val_every=10,  # The model will be run on the validation data after every `val_every` epochs.
    validation_criterion=validation_criterion
)
fn = lambda x, degree: .05 * reduce(add, (torch.cos(i*x) + torch.sin(i*x) for i in range(degree)), 0.)
x = torch.linspace(0, 10, batch_size, dtype=torch.float32).reshape(-1, 1)
y = fn(x, 5)
train_dataloader = to_dataloader(x, y, batch_size=batch_size, infinite=True)
val_dataloader = to_dataloader(x, y, batch_size=batch_size, infinite=True)
trainer = Trainer(model, optimizer, config, loss_fn=loss_fn,
train_dataloader = train_dataloader, val_dataloader = val_dataloader,
device=DEVICE, dtype=DTYPE)
trainer.fit()
plt.clf()
plt.plot(x.numpy(), y.numpy(), label='truth')
plt.plot(x.numpy(), model(x).detach().numpy(), "--", label="final", linewidth=3)
plt.legend()
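The validation criterion from this example can also be exercised on its own, without a model. The sketch below applies it to a made-up sequence of validation losses; the loss values, the `val_epsilon` value and the bookkeeping loop are illustrative assumptions, and how the Trainer itself consumes the criterion's result is not shown here.

```python
def validation_criterion(
    current_validation_loss: float, current_best_validation_loss: float, val_epsilon: float
) -> bool:
    # True only if the new loss beats the best loss so far by at least val_epsilon.
    return current_validation_loss <= current_best_validation_loss - val_epsilon


# Hypothetical validation losses observed at successive validation checks.
val_losses = [0.50, 0.40, 0.45, 0.399, 0.30]
val_epsilon = 0.01  # assumed margin, chosen for illustration

best = float("inf")
improved_at = []
for step, loss in enumerate(val_losses):
    if validation_criterion(loss, best, val_epsilon):
        best = loss
        improved_at.append(step)

print(improved_at, best)
```

Note that the loss of 0.399 does not count as an improvement over 0.40, because the gain is smaller than the `val_epsilon` margin.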
6.3. Fitting a function - Low-level API
For users who want to use the low-level API of qadence, here is an example written without Trainer.
from pathlib import Path
import torch
from itertools import count
from qadence.constructors import hamiltonian_factory, hea, feature_map
from qadence import chain, Parameter, QuantumCircuit, Z
from qadence import QNN
from qadence.ml_tools import TrainConfig
n_qubits = 2
fm = feature_map(n_qubits)
ansatz = hea(n_qubits=n_qubits, depth=3)
observable = hamiltonian_factory(n_qubits, detuning=Z)
circuit = QuantumCircuit(n_qubits, fm, ansatz)
model = QNN(circuit, observable, backend="pyqtorch", diff_mode="ad")
batch_size = 1
input_values = {"phi": torch.rand(batch_size, requires_grad=True)}
pred = model(input_values)
criterion = torch.nn.MSELoss()
optimizer = torch.optim.Adam(model.parameters(), lr=0.1)
n_epochs=50
cnt = count()
tmp_path = Path("/tmp")
config = TrainConfig(
    root_folder=tmp_path,
    max_iter=n_epochs,
    checkpoint_every=100,
    write_every=100,
    batch_size=batch_size,
)
x = torch.linspace(0, 1, batch_size).reshape(-1, 1)
y = torch.sin(x)
for i in range(n_epochs):
    optimizer.zero_grad()  # clear gradients accumulated in the previous iteration
    out = model(x)
    loss = criterion(out, y)
    loss.backward()
    optimizer.step()
6.4. Custom train loop
If you need custom training functionality that goes beyond what is available in
qadence.ml_tools.Trainer
you can write your own
training loop based on the building blocks that are available in Qadence.
A simplified version of Qadence's train loop is defined below. Feel free to copy it and modify at will.
For logging we can use the get_writer
from the Writer Registry
. This will set up the default writer based on the experiment tracking tool.
All writers from the Writer Registry
offer open
, close
, print_metrics
, write_metrics
, plot_metrics
, etc. methods.
from typing import Callable
from torch.nn import Module
from torch.optim import Optimizer
from torch.utils.data import DataLoader
from qadence.ml_tools.config import TrainConfig
from qadence.ml_tools.data import OptimizeResult, data_to_device
from qadence.ml_tools.optimize_step import optimize_step
from qadence.ml_tools.callbacks import get_writer
from qadence.ml_tools.callbacks.saveload import load_checkpoint, write_checkpoint
def train(
    model: Module,
    data: DataLoader,
    optimizer: Optimizer,
    config: TrainConfig,
    loss_fn: Callable,
    device: str = "cpu",
    optimize_step: Callable = optimize_step,
) -> tuple[Module, Optimizer]:
    # Move model to device before optimizer is loaded
    model = model.to(device)
    # Load available checkpoint
    init_iter = 0
    if config.log_folder:
        model, optimizer, init_iter = load_checkpoint(config.log_folder, model, optimizer)
    # Initialize writer based on the tracking tool specified in the configuration
    writer = get_writer(config.tracking_tool)  # Uses ExperimentTrackingTool to select writer
    writer.open(config, iteration=init_iter)
    dl_iter = iter(data)
    # Outer epoch loop
    for iteration in range(init_iter, init_iter + config.max_iter):
        # Fetch the next batch and move it to the target device
        batch = data_to_device(next(dl_iter), device)
        loss, metrics = optimize_step(model, optimizer, loss_fn, batch)
        if iteration % config.print_every == 0 and config.verbose:
            writer.print_metrics(OptimizeResult(iteration, model, optimizer, loss, metrics))
        if iteration % config.write_every == 0:
            writer.write(iteration, metrics)
        if config.log_folder:
            if iteration % config.checkpoint_every == 0:
                write_checkpoint(config.log_folder, model, optimizer, iteration)
    # Final writing and checkpointing
    if config.log_folder:
        write_checkpoint(config.log_folder, model, optimizer, iteration)
    writer.write(iteration, metrics)
    writer.close()
    return model, optimizer
6.5. Gradient-free optimization using Trainer
We can achieve gradient-free optimization with Trainer.set_use_grad(False) or trainer.disable_grad_opt(ng_optimizer). An example solving a QUBO using gradient-free optimization based on Nevergrad optimizers and Trainer is shown in the analog QUBO Tutorial.