Readout mitigation

Readout error mitigation

Readout errors are introduced during measurements in the computation basis via probabilistic bitflip operators characterized by the readout matrix (also known as confusion matrix) defined over the system of qubits of dimension \(2^n\times2^n\). The complete implementation of the mitigation technique involves using the characterized readout matrix for the system of qubits \((T)\) and classically applying an inversion \((T^{−1})\) to the measured probability distributions. However there are several limitations of this approach:

The complete implementation requires \(2^n\) characterization experiments (probability measurements), which is not scalable.
Classical overhead from full matrix inversion for large system of qubits is expensive
The matrix \(T\) may become singular for large \(n\), preventing direct inversion.
The inverse \(T^{−1}\) might not be a stochastic matrix, meaning that it can produce negative corrected probabilities.
The correction is not rigorously justified, so we cannot be sure that we are only removing SPAM errors and not otherwise corrupting an estimated probability distribution.

Qadence relies on the assumption of uncorrelated readout errors, this gives us:

\[ T=T_1\otimes T_2\otimes \dots \otimes T_n \]

for which the inversion is straightforward:

\[ T^{-1}=T_1^{-1}\otimes T_2^{-1}\otimes \dots \otimes T_n^{-1} \]

from qadence import QuantumModel, QuantumCircuit, hamiltonian_factory, kron, H, Z, I
from qadence import NoiseProtocol, NoiseHandler


# Simple circuit and observable construction.
block = kron(H(0), I(1))
circuit = QuantumCircuit(2, block)
n_shots = 10000

# Construct a quantum model and noise
model = QuantumModel(circuit=circuit)
error_probability = 0.2
noise = NoiseHandler(protocol=NoiseProtocol.READOUT.INDEPENDENT,options={"error_probability": error_probability})

noiseless_samples = model.sample(n_shots=n_shots)
noisy_samples = model.sample(noise=noise, n_shots=n_shots)

noiseless samples: [OrderedCounter({'00': 5027, '10': 4973})]
noisy samples: [OrderedCounter({'10': 4504, '00': 4477, '11': 526, '01': 493})]

Note that the noisy states have samples with the second qubit flipped. In the below protocols, we describe ways to reconstruct the noiseless distribution (untargeted mitigation). Besides this one might just be interrested in mitigating the expectation value (targeted mitigation).

Constrained optimization

However, even for a reduced \(n\) the forth limitation holds. This can be avoided by reformulating into a minimization problem¹:

\[ \lVert Tp_{\textrm{corr}}-p_{\textrm{raw}}\rVert_{2}^{2} \]

subjected to physicality constraints \(0 \leq p_{corr}(x) \leq 1\) and \(\lVert p_{corr} \rVert = 1\). At this point, two methods are implemented to solve this problem. The method involves solving a constrained optimization problem and can be computationally expensive.

from qadence_protocols.mitigations.protocols import Mitigations
from qadence_protocols.types import ReadOutOptimization


# Define the mitigation method solving the minimization problem:
options={"optimization_type": ReadOutOptimization.CONSTRAINED, "n_shots": n_shots}
mitigation = Mitigations(protocol=Mitigations.READOUT, options=options)

# Run noiseless, noisy and mitigated simulations.
mitigated_samples_opt = mitigation(model=model, noise=noise)

Optimization based mitigation: [Counter({'10': 5004, '00': 4962, '01': 34})]

Maximum Likelihood estimation (MLE)

This method replaces the constraints with additional post processing for correcting probability distributions with negative entries. The runtime of the method is linear in the size of the distribution and thus is very efficient. The optimality of the solution is however not always guaranteed. The method redistributes any negative probabilities on using the inverse operation equally and can be shown to maximize the likelihood with minimal effort².

# Define the mitigation method solving the minimization problem:
options={"optimization_type": ReadOutOptimization.MLE, "n_shots": n_shots}
mitigation = Mitigations(protocol=Mitigations.READOUT, options=options)
mitigated_samples_mle = mitigation(model=model, noise=noise)

MLE based mitigation [Counter({'10': 5010, '00': 4990})]

Matrix free measurement mitigation (MTHREE)

This method relies on inverting the probability distribution within a restricted subspace of measured bitstrings³. The method is better suited for computations that exceed 20 qubits where the corrected probability distribution would require a state in a unreasonably high dimensional Hilbert space. Thus, the idea here is to stick to the basis states that show up in the measurement alone. Additionally, one might want to include states that are \(k\) hamming distance away from it.

import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.linalg import gmres

n_qubits = 10
exact_prob = np.random.rand(2 ** (n_qubits))
exact_prob[2 ** (n_qubits//2):]=0
exact_prob = 0.90 * exact_prob + 0.1 * np.ones(2**n_qubits) / 2**n_qubits
exact_prob = exact_prob / sum(exact_prob)
np.random.shuffle(exact_prob)

observed_prob = np.array(exact_prob, copy=True)
observed_prob[exact_prob < 1 / 2 ** (n_qubits)] = 0

observed_prob = observed_prob / sum(observed_prob)

input_csr = csr_matrix(observed_prob, shape=(1, 2**n_qubits)).T

{'0001000000': 0.025, '0001001101': 0.049, '0001100100': 0.065, '0001100101': 0.019, '0001101000': 0.009, '0010001000': 0.066, '0010100011': 0.021, '0010101101': 0.034, '0100010111': 0.001, '0100011011': 0.05, '0101100010': 0.027, '0101101111': 0.035, '0110000101': 0.033, '0110100000': 0.051, '0110100110': 0.057, '0110101010': 0.019, '0111101100': 0.03, '1000110001': 0.021, '1000110010': 0.005, '1001000001': 0.028, '1001010100': 0.032, '1010000001': 0.014, '1010110000': 0.059, '1010110101': 0.027, '1010111111': 0.04, '1011100101': 0.018, '1101000010': 0.045, '1101011010': 0.004, '1110111100': 0.037, '1110111110': 0.066, '1111110011': 0.014}
Filling percentage 0.0302734375 %

We have generated a probability distribution within a small subspace of bitstrings being filled. We use csr_matrix for efficient representation and computation. Now we use MTHREE to do mitigation on the probability distribution

from scipy.stats import wasserstein_distance
from qadence_protocols.mitigations.readout import (
    normalized_subspace_kron,
    mle_solve,
    matrix_inv,
    tensor_rank_mult
    )

noise_matrices = []
for t in range(n_qubits):
    t_a, t_b = np.random.rand(2) / 8
    K = np.array([[1 - t_a, t_a], [t_b, 1 - t_b]]).transpose()  # column sum be 1
    noise_matrices.append(K)

confusion_matrix_subspace = normalized_subspace_kron(noise_matrices, observed_prob.nonzero()[0])

p_corr_mthree_gmres = gmres(confusion_matrix_subspace, input_csr.toarray())[0]
p_corr_mthree_gmres_mle = mle_solve(p_corr_mthree_gmres)

noise_matrices_inv = list(map(matrix_inv, noise_matrices))
p_corr_inv_mle = mle_solve(tensor_rank_mult(noise_matrices_inv, observed_prob))

distance = wasserstein_distance(p_corr_mthree_gmres_mle, p_corr_inv_mle)

Wasserstein distance between the 2 distributions: 0.0005305734227154717

We have used wasserstein_distance instead of kl_divergence as many of the bistrings have 0 probabilites. If the expected solution lies outside the space of observed bitstrings, MTHREE will fail. We next look at majority voting to circument this problem when the expected output is a single bitstring.

Majority Voting

Mitigation protocol to be used only when the circuit output has a single expected bitstring as the solution ⁴. The method votes on the likeliness of each qubit to be a 0 or 1 assuming a tensor product structure for the output. The method is valid only when the readout errors are not correlated.

from qadence import QuantumModel, QuantumCircuit,kron, H, Z, I
from qadence import NoiseHandler, NoiseProtocol
from qadence_protocols.mitigations.readout import majority_vote
import numpy as np

# Simple circuit and observable construction.
n_qubits = 4
block = kron(*[I(i) for i in range(n_qubits)])
circuit = QuantumCircuit(n_qubits, block)
n_shots = 1000

# Construct a quantum model.
model = QuantumModel(circuit=circuit)

# Sampling the noisy solution
error_p = 0.2
noise = NoiseHandler(protocol=NoiseProtocol.READOUT.INDEPENDENT,options={"error_probability": error_p})
noisy_samples = model.sample(noise=noise, n_shots=n_shots)[0]

# Removing samples that correspond to actual solution
noisy_samples['0000'] = 0

# Constructing the probability vector
ordered_bitstrings = [bin(k)[2:].zfill(n_qubits) for k in range(2**n_qubits)]
observed_prob = np.array([noisy_samples[bs] for bs in ordered_bitstrings]) / n_shots

noisy samples: OrderedCounter({'1000': 72, '0001': 71, '0100': 68, '0010': 51, '1100': 11, '1010': 9, '0011': 7, '1001': 7, '0101': 6, '0110': 6, '1011': 1, '0000': 0})
observed probability: [0.    0.071 0.051 0.007 0.068 0.006 0.006 0.    0.072 0.007 0.009 0.001
 0.011 0.    0.    0.   ]

We have removed the actual solution from the observed distribution and will use this as the observed probability.

noise_matrices = [np.array([[1 - error_p, error_p], [error_p, 1 - error_p]])]*n_qubits
result_index = majority_vote(noise_matrices, observed_prob).argmax()

mitigated solution index: 0

Model free mitigation

This protocol makes use of all possible so-called twirl operations to average out the effect of readout errors into an effective scaling. The twirl operation consists of using bit flip operators before the measurement and after the measurement is obtained⁵. The number of twirl operations can be reduced through random sampling. The method is exact in that it requires no calibration which might be prone to errors of modelling.

from qadence import NoiseHandler, NoiseProtocol
from qadence.measurements import Measurements
from qadence.operations import CNOT, RX, Z
from qadence_protocols import Mitigations

import torch
from qadence import (
    QuantumCircuit,
    QuantumModel,
    chain,
    kron,
)

error_probability=0.15
n_shots=10000
block= chain(kron(RX(0, torch.pi / 3), RX(1, torch.pi / 6)), CNOT(0, 1))
observable=[3 * kron(Z(0), Z(1)) + 2 * Z(0)]

circuit = QuantumCircuit(block.n_qubits, block)
noise = NoiseHandler(protocol=NoiseProtocol.READOUT.INDEPENDENT, options={"error_probability": error_probability})
tomo_measurement = Measurements(
    protocol=Measurements.TOMOGRAPHY,
    options={"n_shots": n_shots},
)

model = QuantumModel(
    circuit=circuit, observable=observable,
)

noisy_model = QuantumModel(
    circuit=circuit,
    observable=observable,
    measurement=tomo_measurement,
    noise=noise,
)

mitigate = Mitigations(protocol=Mitigations.TWIRL)
expectation_mitigated = mitigate(noisy_model)

noiseless expectation value tensor([[3.6110]], grad_fn=<TransposeBackward0>)
noisy expectation value tensor([[2.7140]], grad_fn=<TransposeBackward0>)
expected mitigation value tensor([3.6526])