Information Content
`InformationContent(model, loss_fn, xs, epsilons, variation_multiple=20)`
Information Landscape class.

This class supports studying the loss landscape from an information-theoretic perspective and provides methods to derive bounds on the norm of the gradient from the Information Content of the loss landscape.
PARAMETER | DESCRIPTION
---|---
`model` | The quantum or classical model to analyze.
`loss_fn` | Loss function that takes the model output and calculates the loss.
`xs` | Input data on which to evaluate the model.
`epsilons` | The thresholds used to discretize the finite derivatives.
`variation_multiple` | The number of sets of variational parameters to generate per variational parameter. The number of parameter sets required for the statistical analysis scales linearly with the number of variational parameters present in the model; this is that linear factor. Defaults to `20`.
Notes

This class provides flexibility in what the model, the loss function, and `xs` can be. The only requirement is that `loss_fn` takes the model and `xs` as arguments and returns the loss along with a dictionary of other metrics. The assumed structure is therefore:

`loss_fn(model, xs) -> (loss, metrics, ...)`
Example: A Classifier

```python
import torch
import torch.nn as nn

from perceptrain.information.information_content import InformationContent

model = nn.Linear(10, 1)

def loss_fn(
    model: nn.Module,
    xs: tuple[torch.Tensor, torch.Tensor],
) -> tuple[torch.Tensor, dict[str, float]]:
    criterion = nn.MSELoss()
    inputs, labels = xs
    outputs = model(inputs)
    loss = criterion(outputs, labels)
    metrics = {"loss": loss.item()}
    return loss, metrics

xs = (torch.randn(10, 10), torch.randn(10, 1))
epsilons = torch.logspace(-2, 0, 10)  # example discretization thresholds

info_content = InformationContent(model, loss_fn, xs, epsilons)
```
Note that `xs` includes both the inputs and the target labels; the logic for calculating the loss from them lies entirely within the `loss_fn` function. The resulting object can then be used to obtain bounds on the average norm of the gradient of the loss function, as sketched below.
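As a minimal usage sketch (reusing the `info_content` instance from the example above; these methods are documented later in this page):

```python
# Maximum Information Content and the epsilon at which it is attained.
max_ic, optimal_epsilon = info_content.max_IC()

# Lower and upper bounds on the average gradient norm at maximal IC.
lower, upper = info_content.get_grad_norm_bounds_max_IC()
print(f"{lower} <= average gradient norm <= {upper}")
```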
Example: A Physics Informed Neural Network

```python
import torch
import torch.nn as nn

from perceptrain.information.information_content import InformationContent

class PhysicsInformedNN(nn.Module):
    # <initialization logic>

    def forward(self, xs: dict[str, torch.Tensor]) -> dict[str, torch.Tensor]:
        # pde_residual(...) and bc_term(...) are placeholders for the
        # residual computations set up in the initialization logic.
        return {
            "pde_residual": pde_residual(xs["pde"]),
            "boundary_condition": bc_term(xs["bc"]),
        }

def loss_fn(
    model: PhysicsInformedNN,
    xs: dict[str, torch.Tensor],
) -> tuple[torch.Tensor, dict[str, torch.Tensor]]:
    outputs = model(xs)
    pde_residual = outputs["pde_residual"]
    bc_term = outputs["boundary_condition"]
    loss = (
        torch.mean(torch.sum(pde_residual**2, dim=1), dim=0)
        + torch.mean(torch.sum(bc_term**2, dim=1), dim=0)
    )
    return loss, {"pde_residual": pde_residual, "bc_term": bc_term}

xs = {
    "pde": torch.linspace(0, 1, 10),
    "bc": torch.tensor([0.0]),
}
epsilons = torch.logspace(-2, 0, 10)  # example discretization thresholds

model = PhysicsInformedNN()
info_content = InformationContent(model, loss_fn, xs, epsilons)
```
In this example, the model is a physics-informed neural network, and `xs` holds the inputs to the model's different residual components. The logic for calculating the residuals lies within the `PhysicsInformedNN` class, and the loss function combines these residuals into the loss that is to be optimized. This can then further be used to obtain bounds on the average norm of the gradient of the loss function.

The first value returned by `loss_fn` is the loss value that is being optimized. The function is also expected to return further value(s), often the metrics used to calculate the loss; these extra values are ignored for the purposes of this class.
Source code in perceptrain/information/information_content.py
calculate_IC (cached property)

Calculate Information Content for multiple epsilon values.

Returns: Tensor of IC values, one for each epsilon, with shape `[n_epsilons]`.
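For instance (reusing `info_content` and `epsilons` from the classifier example above):

```python
ic_values = info_content.calculate_IC  # cached property, shape [n_epsilons]
best_epsilon = epsilons[ic_values.argmax()]
```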
batched_loss()

Calculate the loss for all parameter variations in a batched manner.

Returns: Tensor of loss values, one for each parameter variation.
Source code in perceptrain/information/information_content.py
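A brief usage sketch (the variable name is illustrative):

```python
losses = info_content.batched_loss()  # one loss value per parameter variation
```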
calculate_transition_probabilities_batch()

Calculate transition probabilities for multiple epsilon values.

RETURNS | DESCRIPTION
---|---
`Tensor` | Tensor of shape `[n_epsilons, 6]` containing the probabilities for each transition type, with columns ordered `[+1 to 0, +1 to -1, 0 to +1, 0 to -1, -1 to 0, -1 to +1]`.

Source code in perceptrain/information/information_content.py
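For example (reusing `info_content` from above):

```python
probs = info_content.calculate_transition_probabilities_batch()
# probs[i, j] is the probability, at epsilons[i], of the j-th sign
# transition in the order [+1->0, +1->-1, 0->+1, 0->-1, -1->0, -1->+1].
```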
discretize_derivatives()

Convert finite derivatives into discrete values.

RETURNS | DESCRIPTION
---|---
`Tensor` | Tensor of discretized derivatives with shape `[n_epsilons, n_variations - 2]`; each row contains values in `{-1, 0, 1}` for the corresponding epsilon.

Source code in perceptrain/information/information_content.py
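As a conceptual sketch of such a discretization rule (dummy data, not the library implementation), each finite derivative is mapped to a sign symbol per threshold:

```python
import torch

derivatives = torch.tensor([0.30, -0.02, 0.01, -0.50])  # example finite differences
epsilons = torch.tensor([0.05, 0.10])                    # example thresholds

# +1 if derivative > epsilon, -1 if derivative < -epsilon, 0 otherwise.
signs = torch.zeros(len(epsilons), len(derivatives), dtype=torch.int8)
signs[derivatives[None, :] > epsilons[:, None]] = 1
signs[derivatives[None, :] < -epsilons[:, None]] = -1
print(signs)  # shape [n_epsilons, n_derivatives], values in {-1, 0, 1}
```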
get_grad_norm_bounds_max_IC()

Compute the bounds on the average norm of the gradient.

RETURNS | DESCRIPTION
---|---
`tuple[Tensor, Tensor]` | The lower and upper bounds.

Source code in perceptrain/information/information_content.py
get_grad_norm_bounds_sensitivity_IC(eta)

Compute the lower bound on the average norm of the gradient.

PARAMETER | DESCRIPTION
---|---
`eta` | The sensitivity IC.

RETURNS | DESCRIPTION
---|---
`Tensor` | The lower bound.

Source code in perceptrain/information/information_content.py
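For example (the value of `eta` is an illustrative assumption, not a library default):

```python
eta = 1e-3  # example sensitivity-IC threshold
lower_bound = info_content.get_grad_norm_bounds_sensitivity_IC(eta)
```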
max_IC()
Get the maximum Information Content and its corresponding epsilon.
Returns: Tuple of (maximum IC value, optimal epsilon)
Source code in perceptrain/information/information_content.py
q_value(H_value) (cached staticmethod)

Compute the q value.

q is the solution to the equation

H(x) = 4h(x) + 2h(1/2 - 2x)

It is the probability assigned to four of the six transition types such that the resulting IC equals the IC of our system. This quantity is useful in calculating the bounds on the norms of the gradients.
PARAMETER | DESCRIPTION
---|---
`H_value` | The information content.

RETURNS | DESCRIPTION
---|---
`float` | The q value.
Source code in perceptrain/information/information_content.py
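The equation can be solved numerically. A minimal sketch, assuming `h(x) = -x * log_6(x)` (the base-6 entropy term consistent with six transition types, for which the left-hand side is monotone in q on (0, 1/6]):

```python
import numpy as np
from scipy.optimize import brentq

def h(x: float) -> float:
    # Entropy term -x * log_6(x), extended by continuity with h(0) = 0.
    return 0.0 if x <= 0.0 else -x * np.log(x) / np.log(6.0)

def q_from_H(H_value: float) -> float:
    # Solve H = 4 h(q) + 2 h(1/2 - 2q) for q on (0, 1/6]; monotonicity
    # of the left-hand side makes a bracketing root-finder sufficient.
    return brentq(lambda q: 4 * h(q) + 2 * h(0.5 - 2 * q) - H_value, 1e-12, 1 / 6)

print(q_from_H(0.9))  # example: q for an information content of 0.9
```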
randomized_finite_der()

Calculate normalized finite differences of the loss along a random walk in the parameter space. These serve as a proxy for the derivative of the loss with respect to the parameters.

RETURNS | DESCRIPTION
---|---
`Tensor` | Normalized finite differences (approximate directional derivatives) between consecutive points in the random walk, with shape `[n_variations - 1]`.

Source code in perceptrain/information/information_content.py
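As a conceptual sketch of a finite difference along such a walk (dummy data, not the library implementation):

```python
import torch

losses = torch.tensor([0.90, 0.85, 0.87, 0.80])  # loss at each step of a walk
step_size = 0.01                                  # parameter-space distance per step

finite_der = (losses[1:] - losses[:-1]) / step_size
print(finite_der)  # approximate directional derivatives, shape [n_steps - 1]
```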
reshape_param_variations()

Reshape the variations of the model's variational parameters.

RETURNS | DESCRIPTION
---|---
`dict[str, Tensor]` | Dictionary of parameter tensors, each with shape `[n_variations, *param_shape]`.

Source code in perceptrain/information/information_content.py
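For example (reusing `info_content` from above):

```python
param_variations = info_content.reshape_param_variations()
for name, tensor in param_variations.items():
    print(name, tuple(tensor.shape))  # (n_variations, *param_shape)
```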
sensitivity_IC(eta)

Find the minimum value of epsilon such that the Information Content is less than eta.

PARAMETER | DESCRIPTION
---|---
`eta` | Threshold value, the sensitivity IC.

Returns: The smallest epsilon value whose IC is less than the sensitivity IC.
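For example (the `eta` value is an illustrative assumption):

```python
eta = 1e-3  # example threshold
eps_sensitivity = info_content.sensitivity_IC(eta)
```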