Apply parameter shift rules to train a quantum model using TorchQuantum


Tutorial Author: Zirui Li, Hanrui Wang

###Outline

1. Introduction to Parameter Shift Rules.
2. Train a model with parameter shift rules.
3. A simple 2 qubit model for a simple 2 classification task.

In this tutorial, you will learn what parameter shift rules are, how to use them to calculate gradients, and how to use those gradients to train a model.

##Introduction to Parameter Shift Rules

###Back Propagation

Previously, our quantum model was based on Qiskit and PyTorch. Whenever we run inference on the model, PyTorch automatically builds a computational graph. We can then calculate the gradient of each node in the computational graph in reverse order using the chain rule. This is called back propagation.


###Parameter Shift Rules

When executing a quantum circuit on a real quantum machine, we cannot observe the intermediate quantum state. So, calculating gradients by back propagation is impossible when our circuits run on real quantum machines. Parameter shift rules offer a technique to calculate gradients using inference only. For a circuit function \(f(\theta)\), we can calculate \(f'(\theta)\) by evaluating the circuit at two shifted values of \(\theta\), subtracting one result from the other, and multiplying by a constant factor. The figure below describes the workflow of how to calculate the gradient of a parameter in a 4-qubit circuit.


Suppose an \(m\)-qubit quantum circuit is parametrized by \(n\) parameters \(\theta=[\theta_1,\cdots,\theta_i,\cdots,\theta_n]\). The expectation values measured from this circuit can be represented by a circuit function,

\[f(\theta)=\langle\psi|U(\theta_i)^{\dagger}\widehat{Q}U(\theta_i)|\psi\rangle, \quad f(\theta)\in\mathbb{R}^{m}, \theta\in\mathbb{R}^n.\]

where \(\theta_i\) is the scalar parameter whose gradient is to be calculated, and \(U(\theta_i)\) is the gate that contains \(\theta_i\).

Here, for notation simplicity, we have already absorbed the unitaries before \(U(\theta_i)\) into \(\langle\psi|\), \(|\psi\rangle\). Unitaries after \(U(\theta_i)\) and observables are fused into \(\widehat{Q}\).

Usually, the rotation gates used in QNN can be written in the form \(U(\theta_i)=e^{-\frac{i}{2}\theta_i H}\). Here \(H\) is the Hermitian generator of \(U\) with only 2 unique eigenvalues +1 and -1.

In this way, the gradients of the circuit function \(f\) with respect to \(\theta_i\) are,

\[\begin{split}\begin{aligned} &\frac{\partial f(\theta)}{\partial \theta_i}=\frac{1}{2}\Big(f\big(\theta_+\big)-f\big(\theta_{-}\big)\Big), \\ &\theta_+=[\theta_1,\cdots,\theta_i+\frac{\pi}{2},\cdots,\theta_n], \theta_{-}=[\theta_1,\cdots,\theta_i-\frac{\pi}{2},\cdots,\theta_n], \end{aligned}\end{split}\]

where \(\theta_+\) and \(\theta_{-}\) are the positive shift and negative shift of \(\theta\).

Note that this parameter shift rule is fundamentally different from numerical finite-difference methods, which only approximate the derivative. The equation above gives the exact gradient with respect to \(\theta_i\), with no approximation error from a finite step size.
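To make the contrast with finite differences concrete, here is a minimal NumPy sketch. It assumes a single-qubit circuit that applies \(R_X(\theta)\) to \(|0\rangle\) and measures Pauli-Z, so \(f(\theta)=\cos\theta\); the circuit, the value of `theta`, and the step `h` are illustrative choices, not part of the tutorial's model.

```python
import numpy as np

X = np.array([[0, 1], [1, 0]], dtype=complex)   # Pauli-X generator
Z = np.array([[1, 0], [0, -1]], dtype=complex)  # Pauli-Z observable
I2 = np.eye(2, dtype=complex)

def rx(theta):
    """RX(theta) = cos(theta/2) I - i sin(theta/2) X."""
    return np.cos(theta / 2) * I2 - 1j * np.sin(theta / 2) * X

def f(theta):
    """Expectation <0| RX(theta)^dagger Z RX(theta) |0>, which equals cos(theta)."""
    psi = rx(theta) @ np.array([1, 0], dtype=complex)
    return float(np.real(np.conj(psi) @ Z @ psi))

theta = 0.7
exact = -np.sin(theta)  # analytic derivative of cos(theta)

# Parameter shift: two evaluations at theta +/- pi/2, scaled by 1/2.
shift = 0.5 * (f(theta + np.pi / 2) - f(theta - np.pi / 2))

# Central finite difference with a deliberately coarse step (illustrative choice).
h = 0.1
finite_diff = (f(theta + h) - f(theta - h)) / (2 * h)

shift_error = abs(shift - exact)
fd_error = abs(finite_diff - exact)
print(f"parameter shift error: {shift_error:.2e}")
print(f"finite difference error: {fd_error:.2e}")
```

The parameter shift result agrees with the analytic derivative up to floating-point rounding, while the finite-difference estimate carries an error that depends on the step size.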

We apply \(\text{softmax}\) on \(f(\theta)\) as the predicted probability for each class. Then we calculate the cross entropy between the predicted probability distribution \(p\) and the target distribution \(t\) as the classification loss \(\mathcal{L}\),

\[\mathcal{L}(\theta)=-t^T\cdot\log\big(\texttt{softmax}(f(\theta))\big)=-\sum_{j=1}^m t_j \log{p_j},\quad p_j=\frac{e^{f_j(\theta)}}{\sum_{k=1}^m e^{f_k(\theta)}}.\]

Then the gradient of the loss function with respect to \(\theta_i\) is \(\frac{\partial\mathcal{L}(\theta)}{\partial \theta_i}=\big(\frac{\partial\mathcal{L}(\theta)}{\partial f(\theta)}\big)^T\frac{\partial f(\theta)}{\partial \theta_i}\).

Here \(\frac{\partial f(\theta)}{\partial \theta_i}\) can be calculated on a real quantum computer using the parameter shift rules, and \(\frac{\partial\mathcal{L}(\theta)}{\partial f(\theta)}\) can be efficiently calculated on classical devices using back propagation, supported by automatic differentiation frameworks such as PyTorch and TensorFlow.
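The split between the quantum and classical parts of the gradient can be sketched numerically. The example below is an illustration under stated assumptions: `f` is a hypothetical two-output stand-in for a circuit, built from sines and cosines so that the parameter shift rule applies to each parameter, and the classical factor uses the standard result that the gradient of softmax cross entropy with respect to the logits is \(p - t\).

```python
import numpy as np

def f(theta):
    """Toy stand-in for a circuit with two expectation values (assumption)."""
    return np.array([np.cos(theta[0]) * np.cos(theta[1]),
                     np.sin(theta[0]) * np.cos(theta[1])])

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def loss(theta, t):
    """Cross entropy between the target t and softmax(f(theta))."""
    return float(-np.sum(t * np.log(softmax(f(theta)))))

def shift_jacobian(theta):
    """Row i holds df/dtheta_i, obtained from two shifted evaluations of f."""
    rows = []
    for i in range(len(theta)):
        plus, minus = theta.copy(), theta.copy()
        plus[i] += np.pi / 2
        minus[i] -= np.pi / 2
        rows.append(0.5 * (f(plus) - f(minus)))
    return np.array(rows)

theta = np.array([0.3, -1.1])
t = np.array([0.0, 1.0])  # one-hot target

dL_df = softmax(f(theta)) - t          # classical part: dL/df
grad = shift_jacobian(theta) @ dL_df   # chain rule: (dL/df)^T df/dtheta_i for every i

# Independent check: central finite differences applied to the loss itself.
eps = 1e-6
basis = np.eye(len(theta))
check = np.array([
    (loss(theta + eps * basis[i], t) - loss(theta - eps * basis[i], t)) / (2 * eps)
    for i in range(len(theta))
])
print(grad, check)
```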

Now we derive the parameter shift rules used in our QNN models.

Assume \(U(\theta_i)=R_X(\theta_i),R_X(\alpha)=e^{-\frac{i}{2}\alpha X}\), where \(X\) is the Pauli-X matrix.

Firstly, the RX gate is,

\[\begin{split} \begin{aligned} R_X(\alpha)&=e^{-\frac{i}{2}\alpha X}=\sum_{k=0}^{\infty}(-i\alpha/2)^kX^k/k!\\ &=\sum_{k=0}^{\infty}(-i\alpha/2)^{2k}X^{2k}/(2k)!+\sum_{k=0}^{\infty}(-i\alpha/2)^{2k+1}X^{2k+1}/(2k+1)!\\ &=\sum_{k=0}^{\infty}(-1)^k(\alpha/2)^{2k}I/(2k)!-i\sum_{k=0}^{\infty}(-1)^k(\alpha/2)^{2k+1}X/(2k+1)!\\ &=\cos(\alpha/2)I-i\sin(\alpha/2)X. \end{aligned}\end{split}\]

Let \(\alpha=\frac{\pi}{2}\), \(R_X(\pm\frac{\pi}{2})=\frac{1}{\sqrt{2}}(I\mp iX)\).
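As a quick sanity check of the series expansion, the sketch below sums a truncated power series of \(e^{-\frac{i}{2}\alpha X}\) and compares it with the closed form. The number of terms and the test angle are arbitrary choices.

```python
import numpy as np

X = np.array([[0, 1], [1, 0]], dtype=complex)
I2 = np.eye(2, dtype=complex)

def rx_series(alpha, n_terms=40):
    """Truncated power series of exp(-i * alpha / 2 * X)."""
    A = -0.5j * alpha * X
    term = I2.copy()   # A^0 / 0!
    total = I2.copy()
    for k in range(1, n_terms):
        term = term @ A / k   # A^k / k!, built from the previous term
        total = total + term
    return total

def rx_closed(alpha):
    """Closed form cos(alpha/2) I - i sin(alpha/2) X."""
    return np.cos(alpha / 2) * I2 - 1j * np.sin(alpha / 2) * X

alpha = 1.234
series_matches = np.allclose(rx_series(alpha), rx_closed(alpha))

# The two special cases used in the derivation: RX(+-pi/2) = (I -+ iX) / sqrt(2).
plus_matches = np.allclose(rx_closed(np.pi / 2), (I2 - 1j * X) / np.sqrt(2))
minus_matches = np.allclose(rx_closed(-np.pi / 2), (I2 + 1j * X) / np.sqrt(2))
print(series_matches, plus_matches, minus_matches)
```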

As \(f(\theta)=\langle\psi|R_X(\theta_i)^{\dagger}\widehat{Q}R_X(\theta_i)|\psi\rangle\), \(R_X(\alpha)R_X(\beta)=R_X(\alpha+\beta)\), and \(\frac{\partial}{\partial \alpha}R_X(\alpha)=-\frac{i}{2}XR_X(\alpha)\), we have

\[\begin{split}\begin{aligned} \frac{\partial f(\theta)}{\partial \theta_i} =&\langle\psi|R_X(\theta_i)^{\dagger}(-\frac{i}{2}X)^{\dagger}\widehat{Q}R_X(\theta_i)|\psi\rangle+\langle\psi|R_X(\theta_i)^{\dagger}\widehat{Q}(-\frac{i}{2}X)R_X(\theta_i)|\psi\rangle\\ =&\frac{1}{4}(\langle\psi|R_X(\theta_i)^{\dagger}(I-iX)^{\dagger}\widehat{Q}(I-iX)R_X(\theta_i)|\psi\rangle\\&-\langle\psi|R_X(\theta_i)^{\dagger}(I+iX)^{\dagger}\widehat{Q}(I+iX)R_X(\theta_i)|\psi\rangle)\\ =&\frac{1}{2}(\langle\psi|R_X(\theta_i)^{\dagger}R_X(\frac{\pi}{2})^{\dagger}\widehat{Q}R_X(\frac{\pi}{2})R_X(\theta_i)|\psi\rangle\\&-\langle\psi|R_X(\theta_i)^{\dagger}R_X(-\frac{\pi}{2})^{\dagger}\widehat{Q}R_X(-\frac{\pi}{2})R_X(\theta_i)|\psi\rangle)\\ =&\frac{1}{2}(f(\theta_+)-f(\theta_-)). \end{aligned}\end{split}\]

Without loss of generality, the derivation holds for all unitaries of the form \(e^{-\frac{i}{2}\alpha H}\), e.g., RX, RY, RZ, XX, YY, ZZ, where \(H\) is a Hermitian matrix with only 2 unique eigenvalues +1 and -1.
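The generalization to other generators can be checked numerically. The sketch below assumes a two-qubit ZZ rotation, whose generator \(H=Z\otimes Z\) satisfies \(H^2=I\), together with a randomly drawn state and observable that serve only as test data.

```python
import numpy as np

rng = np.random.default_rng(0)
Z = np.diag([1.0, -1.0]).astype(complex)
H = np.kron(Z, Z)  # ZZ generator with eigenvalues +1 and -1

def u(alpha):
    """exp(-i alpha/2 H) = cos(alpha/2) I - i sin(alpha/2) H, valid because H @ H = I."""
    return np.cos(alpha / 2) * np.eye(4) - 1j * np.sin(alpha / 2) * H

# Random normalized state and random Hermitian observable (arbitrary test data).
psi = rng.normal(size=4) + 1j * rng.normal(size=4)
psi /= np.linalg.norm(psi)
M = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
Q = (M + M.conj().T) / 2

def f(alpha):
    """Expectation value <psi| U(alpha)^dagger Q U(alpha) |psi>."""
    phi = u(alpha) @ psi
    return float(np.real(phi.conj() @ Q @ phi))

alpha = 0.42
shift_grad = 0.5 * (f(alpha + np.pi / 2) - f(alpha - np.pi / 2))

# Reference value from a central finite difference with a small step.
eps = 1e-6
fd_grad = (f(alpha + eps) - f(alpha - eps)) / (2 * eps)
print(shift_grad, fd_grad)
```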

##Train a model with parameter shift rules

###Installation

Firstly, install qiskit.


!ls
artifact  example2  example4  example6
example1       example3  example5  example7
!cp artifact/ ../../usr/local/lib/python3.7/dist-packages/qiskit/providers/aer/backends/ -r

import torch
import torch.nn.functional as F
import torch.optim as optim
import numpy as np

import torchquantum as tq
import torchquantum.functional as tqf
from torchquantum.layer.layers import SethLayer0

from torchquantum.dataset import MNIST
from torch.optim.lr_scheduler import CosineAnnealingLR

###Build a quantum model

Our 4-qubit quantum model contains an encoder that encodes a 4x4 image into a quantum state; a quantum layer RZZ+RY+RZZ+RY with 16 parameters in total; and a Pauli-Z measurement on each qubit.

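class QFCModel(tq.QuantumModule):
    def __init__(self):
        super().__init__()
        self.n_wires = 4
        # Encoder that maps the 16 pixel values of a 4x4 image to rotation angles.
        # Assumption: the '4x4_ryzxy' encoder preset; substitute the encoder you need.
        self.encoder = tq.GeneralEncoder(
            tq.encoder_op_list_name_dict['4x4_ryzxy'])

        self.arch = {'n_wires': self.n_wires, 'n_blocks': 2, 'n_layers_per_block': 2}
        self.q_layer = SethLayer0(self.arch)

        self.measure = tq.MeasureAll(tq.PauliZ)

    def forward(self, x, use_qiskit=False):
        bsz = x.shape[0]
        q_device = tq.QuantumDevice(n_wires=self.n_wires, bsz=bsz)
        # Downsample each image to 4x4 and flatten it to 16 features.
        x = F.avg_pool2d(x, 6).view(bsz, 16)

        if use_qiskit:
            # Run the circuit through the attached Qiskit processor.
            x = self.qiskit_processor.process_parameterized(
                q_device, self.encoder, self.q_layer, self.measure, x)
        else:
            # Run the circuit on the TorchQuantum simulator.
            self.encoder(q_device, x)
            self.q_layer(q_device)
            x = self.measure(q_device)

        x = x.reshape(bsz, 4)

        return x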
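
The downsampling step in `forward`, `F.avg_pool2d(x, 6)`, can be mirrored in plain NumPy to see where the 16 encoder inputs come from. This sketch assumes the usual 28x28 MNIST image size and the default pooling behaviour (stride equal to the window size, no padding); the image is synthetic.

```python
import numpy as np

def avg_pool_6(img):
    """Average-pool a 28x28 image with a 6x6 window and stride 6 (no padding)."""
    cropped = img[:24, :24]  # 28 // 6 = 4 full windows per axis; the rest is dropped
    return cropped.reshape(4, 6, 4, 6).mean(axis=(1, 3))

img = np.arange(28 * 28, dtype=float).reshape(28, 28)  # synthetic image
pooled = avg_pool_6(img)
features = pooled.reshape(16)  # one value per encoder input
print(pooled.shape, features.shape)
```

Under these assumptions the last four rows and columns of each image do not contribute to the encoded features.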

###Build the function of parameter shift rules

The function shifts each parameter in turn and calculates the gradient of each measured expectation value with respect to that parameter. It returns both the expectation values and the gradients for each parameter.

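def shift_and_run(model, inputs, use_qiskit=False):
    # collect the trainable parameters
    param_list = []
    for param in model.parameters():
        param_list.append(param)
    grad_list = []
    for param in param_list:
        # evaluate the circuit at theta + pi/2
        param.copy_(param + np.pi * 0.5)
        out1 = model(inputs, use_qiskit)
        # evaluate the circuit at theta - pi/2
        param.copy_(param - np.pi)
        out2 = model(inputs, use_qiskit)
        # restore the original value of the parameter
        param.copy_(param + np.pi * 0.5)
        grad = 0.5 * (out1 - out2)
        grad_list.append(grad)
    return model(inputs, use_qiskit), grad_list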
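
The same shift, evaluate, restore loop can be tried without TorchQuantum. The sketch below uses a hypothetical `CountingCircuit` class in place of the model so that the number of circuit evaluations is visible: two per parameter plus one for the unshifted output.

```python
import numpy as np

class CountingCircuit:
    """Hypothetical stand-in for the model that counts how often it is evaluated."""
    def __init__(self):
        self.calls = 0

    def __call__(self, p):
        self.calls += 1
        return np.array([np.cos(p[0]) * np.sin(p[1]), np.cos(p[1]) * np.cos(p[2])])

def shift_and_eval(f, params):
    """Return f(params) and one parameter shift gradient per entry of params."""
    grads = []
    for i in range(len(params)):
        params[i] += np.pi / 2
        out_plus = f(params)
        params[i] -= np.pi
        out_minus = f(params)
        params[i] += np.pi / 2  # restore the original value
        grads.append(0.5 * (out_plus - out_minus))
    return f(params), grads

circuit = CountingCircuit()
params = np.array([0.1, 0.5, -0.8])
original = params.copy()
outputs, grads = shift_and_eval(circuit, params)
print(circuit.calls, grads)
```

On real hardware each of those evaluations is a separate job, so the cost of one gradient step grows linearly with the number of parameters.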

Choose the device (GPU via CUDA if available, otherwise CPU), the number of epochs, the optimizer, and the learning rate scheduler. Then initialize the model and the MNIST-36 classification dataset, which contains only the digits 3 and 6.

use_cuda = torch.cuda.is_available()
device = torch.device("cuda" if use_cuda else "cpu")
model = QFCModel().to(device)
n_epochs = 15
optimizer = optim.Adam(model.parameters(), lr=5e-3, weight_decay=1e-4)
scheduler = CosineAnnealingLR(optimizer, T_max=n_epochs)

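dataset = MNIST(
    root='./mnist_data',   # assumed download directory
    train_valid_split_ratio=[0.9, 0.1],
    digits_of_interest=[3, 6],
    n_test_samples=3000,   # matches the log below
    n_train_samples=5000,  # matches the log below
)

dataflow = dict()
for split in dataset:
    sampler = torch.utils.data.RandomSampler(dataset[split])
    dataflow[split] = torch.utils.data.DataLoader(
        dataset[split],
        batch_size=256,    # assumed batch size
        sampler=sampler)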
[2023-09-12 07:53:28.866] Only use the front 5000 images as TRAIN set.
[2023-09-12 07:53:28.966] Only use the front 3000 images as TEST set.
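
For intuition about the `CosineAnnealingLR` scheduler configured above, the closed-form cosine schedule can be written in a few lines. This is a sketch that assumes the scheduler is stepped once per epoch and that the minimum learning rate is left at its default of 0; the helper name `cosine_annealing_lr` is hypothetical.

```python
import math

def cosine_annealing_lr(base_lr, epoch, t_max, eta_min=0.0):
    """Closed-form cosine schedule: base_lr at epoch 0, eta_min at epoch t_max."""
    return eta_min + 0.5 * (base_lr - eta_min) * (1 + math.cos(math.pi * epoch / t_max))

# Learning rate at the start of each of the 15 epochs, plus the final value.
schedule = [cosine_annealing_lr(5e-3, epoch, 15) for epoch in range(16)]
for epoch, lr in enumerate(schedule):
    print(epoch, f"{lr:.6f}")
```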

###Train the model.

During each training step, we calculate the gradients twice: first with back propagation, then with the parameter shift rules.

grads_bp = []
grads_ps = []

def train_and_return_grad(dataflow, model, device, optimizer):
    for feed_dict in dataflow['train']:
        inputs = feed_dict['image'].to(device)
        targets = feed_dict['digit'].to(device)

        # calculate gradients via back propagation
        outputs = model(inputs)
        prediction = outputs.reshape(-1, 2, 2).sum(-1).squeeze()
        loss = F.nll_loss(F.log_softmax(prediction, dim=1), targets)
        optimizer.zero_grad()
        loss.backward()
        grad_bp = []
        for i, param in enumerate(model.q_layer.parameters()):
            grad_bp.append(param.grad.item())

        # calculate gradients via parameter shift rules
        with torch.no_grad():
            outputs, grad_list = shift_and_run(model, inputs)
        # track gradients of the loss with respect to the circuit outputs only
        outputs.requires_grad = True
        prediction = outputs.reshape(-1, 2, 2).sum(-1).squeeze()
        loss = F.nll_loss(F.log_softmax(prediction, dim=1), targets)
        optimizer.zero_grad()
        loss.backward()
        grad_ps = []
        for i, param in enumerate(model.q_layer.parameters()):
            # chain rule: dL/dtheta_i = sum(df/dtheta_i * dL/df)
            param.grad = torch.sum(grad_list[i] * outputs.grad).to(dtype=torch.float32, device=param.device).view(param.shape)
            grad_ps.append(param.grad.item())

        optimizer.step()
        print(f"loss: {loss.item()}", end='\r')
        # record both sets of gradients for the comparison plot
        grads_bp.append(grad_bp)
        grads_ps.append(grad_ps)

def valid_test(dataflow, split, model, device, qiskit=False):
    target_all = []
    output_all = []
    with torch.no_grad():
        for feed_dict in dataflow[split]:
            inputs = feed_dict['image'].to(device)
            targets = feed_dict['digit'].to(device)

            outputs = model(inputs, use_qiskit=qiskit)
            prediction = F.log_softmax(outputs.reshape(-1, 2, 2).sum(-1).squeeze(), dim=1)

            # collect per-batch targets and predictions
            target_all.append(targets)
            output_all.append(prediction)
        target_all = torch.cat(target_all, dim=0)
        output_all = torch.cat(output_all, dim=0)

    _, indices = output_all.topk(1, dim=1)
    masks = indices.eq(target_all.view(-1, 1).expand_as(indices))
    size = target_all.shape[0]
    corrects = masks.sum().item()
    accuracy = corrects / size
    loss = F.nll_loss(output_all, target_all).item()

    print(f"{split} set accuracy: {accuracy}")
    print(f"{split} set loss: {loss}")

for epoch in range(1, n_epochs + 1):
    # train
    print(f"Epoch {epoch}:")
    train_and_return_grad(dataflow, model, device, optimizer)
    # valid
    valid_test(dataflow, 'valid', model, device)
    # advance the cosine learning rate schedule once per epoch
    scheduler.step()

# test
valid_test(dataflow, 'test', model, device, qiskit=False)

Epoch 1:
0.005 0.9950294494628906
valid set accuracy: 0.3219917012448133
valid set loss: 0.8985593914985657
Epoch 2:
valid set accuracy: 0.36016597510373444
valid set loss: 0.8457769155502319
Epoch 3:
valid set accuracy: 0.4464730290456432
valid set loss: 0.792057454586029
Epoch 4:
valid set accuracy: 0.537759336099585
valid set loss: 0.7392197251319885
Epoch 5:
valid set accuracy: 0.5892116182572614
valid set loss: 0.6949657797813416
Epoch 6:
valid set accuracy: 0.6190871369294606
valid set loss: 0.6624241471290588
Epoch 7:
valid set accuracy: 0.6423236514522822
valid set loss: 0.6419368386268616
Epoch 8:
valid set accuracy: 0.6572614107883817
valid set loss: 0.629072368144989
Epoch 9:
valid set accuracy: 0.6697095435684647
valid set loss: 0.620841920375824
Epoch 10:
valid set accuracy: 0.6730290456431536
valid set loss: 0.6155759692192078
Epoch 11:
valid set accuracy: 0.6771784232365146
valid set loss: 0.6119210124015808
Epoch 12:
valid set accuracy: 0.6796680497925311
valid set loss: 0.6096568703651428
Epoch 13:
valid set accuracy: 0.6804979253112033
valid set loss: 0.6084039807319641
Epoch 14:
valid set accuracy: 0.6821576763485477
valid set loss: 0.6078216433525085
Epoch 15:
valid set accuracy: 0.6821576763485477
valid set loss: 0.607674777507782
test set accuracy: 0.720020325203252
test set loss: 0.5852651000022888
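
In the training and validation code above, the four measured expectation values are folded into two class scores by `outputs.reshape(-1, 2, 2).sum(-1)`. The NumPy sketch below traces that step on a hypothetical batch of three samples; the numbers are made up for illustration.

```python
import numpy as np

# Hypothetical batch of 3 samples, each with 4 Pauli-Z expectation values in [-1, 1].
outputs = np.array([[ 0.9,  0.7, -0.2, -0.4],
                    [-0.5, -0.1,  0.6,  0.8],
                    [ 0.1,  0.2,  0.1,  0.3]])

# Group qubits (0, 1) and (2, 3), then add each pair to get one score per class.
scores = outputs.reshape(-1, 2, 2).sum(-1)

# Log-softmax over the two class scores, then pick the larger one.
log_probs = scores - np.log(np.exp(scores).sum(axis=1, keepdims=True))
predicted = log_probs.argmax(axis=1)
print(scores)
print(predicted)
```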

###Plot and compare the gradients

We have recorded two sets of gradients, calculated by back propagation and by the parameter shift rules respectively. Now let’s plot them to verify that the gradients calculated by the parameter shift rules match those calculated by back propagation.

%matplotlib inline
import matplotlib.pyplot as plt
import numpy as np
import matplotlib

grads_bp = np.array(grads_bp)
grads_ps = np.array(grads_ps)

n_steps = grads_bp.shape[0]
n_params = grads_bp.shape[1]

fig, ax_list = plt.subplots(n_params, 1, sharex=True, figsize=(15, 2 * n_params))

for i, ax in enumerate(ax_list):
  ax.plot(grads_bp[:, i], c="#1f77b4", label="back propagation")
  ax.scatter(range(n_steps), grads_ps[:, i], c="#ff7f0e", marker="^", label="parameters shift")
  ax.set_ylabel("grad of param{0}".format(i))
  ax.axhline(color='black', lw=0.5)
  ax.legend(loc="upper right")  # show the labels defined above



##A simple 2 qubit model for a simple 2 classification task

First we create a dataset: a simple binary classification dataset from Jiang et al. (2020).


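from torchpack.datasets.dataset import Dataset

class Classification2Dataset(torch.utils.data.Dataset):
    def __init__(self, num=11):
        self.data = []
        self.target = []
        sum0 = 0
        sum1 = 0
        # label every point of a num x num grid on the unit square
        for x in np.linspace(0, 1, num=num):
            for y in np.linspace(0, 1, num=num):
                self.data.append(torch.tensor([x, y]))
                # class 1: within radius 0.55 of the corner (0, 0) or (1, 1)
                if (x**2 + y**2 <= 0.55**2 or (x-1)**2 + (y-1)**2 <= 0.55**2):
                    self.target.append(1)
                    sum1 = sum1 + 1
                else:
                    self.target.append(0)
                    sum0 = sum0 + 1
            # print the labels of the row that was just generated
            print(self.target[-num:])

    def __getitem__(self, idx):
        return {'data': self.data[idx], 'target': self.target[idx]}

    def __len__(self):
        return len(self.target) - 1

class Simple2Class(Dataset):
    def __init__(self):
        train_dataset = Classification2Dataset()
        valid_dataset = Classification2Dataset(num=10)
        datasets = {'train': train_dataset, 'valid': valid_dataset, 'test': valid_dataset}
        super().__init__(datasets)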
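
The labelling rule of this dataset can be reproduced without PyTorch. The sketch below rebuilds the label grid for `num=11` with the same condition; the helper name `make_labels` is hypothetical.

```python
import numpy as np

def make_labels(num=11):
    """Label grid points on [0, 1]^2: 1 within 0.55 of corner (0, 0) or (1, 1), else 0."""
    grid = np.linspace(0, 1, num=num)
    rows = []
    for x in grid:
        rows.append([
            1 if (x**2 + y**2 <= 0.55**2 or (x - 1)**2 + (y - 1)**2 <= 0.55**2) else 0
            for y in grid
        ])
    return rows

labels = make_labels(11)
n_class1 = sum(sum(row) for row in labels)
for row in labels:
    print(row)
```

Each printed row corresponds to one value of `x`, so the two class-1 regions appear as quarter discs in opposite corners of the grid.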

Then we create our quantum circuit


The circuit contains only three trainable parameters. When executing the model, we first transform the input (x, y) to the phase \(2\arcsin(\sqrt{x+y-2xy})\), as in the code below, and feed the phase to an RY gate. This is the encoding. After the ansatz, the 2 expectation values from the 2 measurements are the circuit outputs. Outside the circuit, we apply a log-softmax function to the outputs to get the predictions for each class.
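
To see what this encoding does, the sketch below applies an RY gate to \(|0\rangle\) with the phase used in the model's `forward` method, `2 * arcsin(sqrt(x + y - 2xy))`, and checks the probability of measuring \(|1\rangle\). The input point is an arbitrary example, and the helper names are hypothetical.

```python
import numpy as np

def ry(phi):
    """RY(phi) = [[cos(phi/2), -sin(phi/2)], [sin(phi/2), cos(phi/2)]]."""
    c, s = np.cos(phi / 2), np.sin(phi / 2)
    return np.array([[c, -s], [s, c]])

def encode_phase(x, y):
    """Phase fed to the RY gate: 2 * arcsin(sqrt(x + y - 2xy))."""
    return 2 * np.arcsin(np.sqrt(x + y - 2 * x * y))

x, y = 0.2, 0.9
state = ry(encode_phase(x, y)) @ np.array([1.0, 0.0])  # RY applied to |0>
prob_one = state[1] ** 2                               # probability of measuring |1>
target = x + y - 2 * x * y
print(prob_one, target)
```

With this choice of phase, the probability of measuring \(|1\rangle\) on the encoded qubit equals \(x+y-2xy\), which is large when exactly one of the two inputs is large.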

class Q2Model(tq.QuantumModule):
    class Ansatz(tq.QuantumModule):
        def __init__(self):
            super().__init__()
            self.n_wires = 2
            # three trainable rotations followed by a fixed CNOT
            self.op1 = tq.RZ(has_params=True, trainable=True)
            self.op2 = tq.RY(has_params=True, trainable=True)
            self.op3 = tq.RY(has_params=True, trainable=True)
            self.op4 = tq.CNOT(has_params=False, trainable=False)

        def forward(self, q_device: tq.QuantumDevice):
            self.q_device = q_device
            self.op1(self.q_device, wires=0)
            self.op2(self.q_device, wires=1)
            self.op3(self.q_device, wires=0)
            self.op4(self.q_device, wires=[0, 1])

    def __init__(self):
        super().__init__()
        self.n_wires = 2
        self.q_device = tq.QuantumDevice(n_wires=self.n_wires)
        self.encoder = tq.GeneralEncoder([{'input_idx': [0], 'func': 'ry', 'wires': [0]}])

        self.ansatz = self.Ansatz()

        self.measure = tq.MeasureAll(tq.PauliZ)

    def forward(self, x, use_qiskit=False):
        bsz = x.shape[0]
        # encode the input (x, y) as a single RY phase
        data = 2 * torch.arcsin(torch.sqrt(x[:, 0] + x[:, 1] - 2 * x[:, 0] * x[:, 1])).reshape(bsz, 1)

        if use_qiskit:
            # run the circuit through the attached Qiskit processor
            data = self.qiskit_processor.process_parameterized(
                self.q_device, self.encoder, self.ansatz, self.measure, data)
        else:
            # run the circuit on the TorchQuantum simulator
            self.encoder(self.q_device, data)
            self.ansatz(self.q_device)
            data = self.measure(self.q_device)

        data = data.reshape(bsz, 2)

        return data

Load the dataset.

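dataset = Simple2Class()
dataflow = dict()
for split in dataset:
    sampler = torch.utils.data.RandomSampler(dataset[split])
    dataflow[split] = torch.utils.data.DataLoader(
        dataset[split],
        batch_size=10,   # assumed batch size
        sampler=sampler,
        num_workers=8)   # matches the DataLoader warning below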
[1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
[1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
[1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
[1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
[1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
[1, 1, 1, 0, 0, 0, 0, 0, 1, 1, 1]
[0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1]
[0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
[0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1]
[0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1]
[0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1]
[1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
[1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
[1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
[1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
[1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
[0, 0, 0, 0, 0, 0, 0, 1, 1, 1]
[0, 0, 0, 0, 0, 0, 1, 1, 1, 1]
[0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
[0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
[0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
/usr/local/lib/python3.7/dist-packages/torch/utils/data/ UserWarning: This DataLoader will create 8 worker processes in total. Our suggested max number of worker in current system is 2, which is smaller than what this DataLoader is going to create. Please be aware that excessive worker creation might get DataLoader running slow or even freeze, lower the worker number to avoid potential slowness/freeze if necessary.

Define the train and validation functions. Because the model has only 2 qubits, the circuit output is processed slightly differently: the two expectation values are passed directly to log-softmax.

def train_2qubit(dataflow, model, device, optimizer, qiskit=False, input_name = 'data', target_name = 'target'):
    for feed_dict in dataflow['train']:
        inputs = feed_dict[input_name].to(device)
        targets = feed_dict[target_name].to(device)

        # forward pass and parameter shift gradients of the circuit outputs
        with torch.no_grad():
            outputs, grad_list = shift_and_run(model, inputs, use_qiskit=qiskit)
        # back propagate only through the classical part (log-softmax and NLL loss)
        outputs.requires_grad = True
        prediction = F.log_softmax(outputs, dim=1)
        loss = F.nll_loss(prediction, targets)
        optimizer.zero_grad()
        loss.backward()
        # chain rule: dL/dtheta_i = sum(df/dtheta_i * dL/df)
        for i, param in enumerate(model.parameters()):
            param.grad = torch.sum(grad_list[i] * outputs.grad).to(dtype=torch.float32, device=param.device).view(param.shape)
        optimizer.step()
        print(f"loss: {loss.item()}", end='\r')

def valid_test_2qubit(dataflow, split, model, device, qiskit=False, input_name = 'data', target_name = 'target'):
    target_all = []
    output_all = []
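    with torch.no_grad():
        for feed_dict in dataflow[split]:
            inputs = feed_dict[input_name].to(device)
            targets = feed_dict[target_name].to(device)

            outputs = model(inputs, use_qiskit=qiskit)
            prediction = F.log_softmax(outputs, dim=1)

            # collect per-batch targets and predictions
            target_all.append(targets)
            output_all.append(prediction)
        target_all = torch.cat(target_all, dim=0)
        output_all = torch.cat(output_all, dim=0)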

    _, indices = output_all.topk(1, dim=1)
    masks = indices.eq(target_all.view(-1, 1).expand_as(indices))
    size = target_all.shape[0]
    corrects = masks.sum().item()
    accuracy = corrects / size
    loss = F.nll_loss(output_all, target_all).item()

    print(f"{split} set accuracy: {accuracy}")
    print(f"{split} set loss: {loss}")

Train and validate the model on ibmq_quito. Import QiskitProcessor from torchquantum.plugin to create a processor that manages your access to the real quantum computer. You can choose whether to use a real quantum computer or Qiskit’s noise model, and which backend to run on. Call model.set_qiskit_processor to attach the processor to your model.

from torchquantum.plugin import QiskitProcessor
model = Q2Model().to(device)
processor_real_qc = QiskitProcessor(use_real_qc=True, backend_name='ibmq_quito')
# attach the processor so that use_qiskit=True runs on the chosen backend
model.set_qiskit_processor(processor_real_qc)

n_epochs = 5
optimizer = optim.Adam(model.parameters(), lr=5e-2, weight_decay=1e-4)
scheduler = CosineAnnealingLR(optimizer, T_max=n_epochs)
for epoch in range(1, n_epochs + 1):
    # train
    print(f"Epoch {epoch}:")
    train_2qubit(dataflow, model, device, optimizer, qiskit=True)
    # valid
    valid_test_2qubit(dataflow, 'valid', model, device, qiskit=True)
    # advance the cosine learning rate schedule once per epoch
    scheduler.step()
# test
valid_test_2qubit(dataflow, 'test', model, device, qiskit=True)

Epoch 1:
/usr/local/lib/python3.7/dist-packages/qiskit/circuit/ DeprecationWarning: The QuantumCircuit.combine() method is being deprecated. Use the compose() method which is more flexible w.r.t circuit register compatibility.
  return self.combine(rhs)
[2022-03-02 05:03:08.183] Before transpile: {'depth': 5, 'size': 7, 'width': 4, 'n_single_gates': 4, 'n_two_gates': 1, 'n_three_more_gates': 0, 'n_gates_dict': {'ry': 3, 'rz': 1, 'cx': 1, 'measure': 2}}
[2022-03-02 05:03:08.472] After transpile: {'depth': 9, 'size': 14, 'width': 7, 'n_single_gates': 11, 'n_two_gates': 1, 'n_three_more_gates': 0, 'n_gates_dict': {'sx': 5, 'rz': 6, 'cx': 1, 'measure': 2}}

Job Status: job is being validated Job Status: job is queued (18) Job Status: job is queued (1) Job Status: job is actively running Job Status: job has successfully run


valid set accuracy: 0.9494949494949495
valid set loss: 0.47831726414734255
Epoch 2:

loss: 0.566702274514434
