Phase kickback is a useful concept in quantum algorithm design. It provides a framework to understand many famous quantum algorithms, such as Shor’s algorithm, the phase estimation algorithm, Deutsch algorithm, Simon’s algorithm, etc.

Its main idea can be demonstrated with two qubits. Suppose we are given a one-qubit unitary gate $$U$$ (the term ‘gate’ is interchangable with ‘time evolution’ or ‘operator’) and one of its eigenstates $$\left|\psi\right>$$, i.e.,

$U\left|\psi\right> = e^{i\phi}\left|\psi\right>$

and our task is to find out what $$\phi$$ is (up to $$2\pi$$ wraps).

Note that there are two ways to think of this unitary gate

• white box: $$U$$ is known but too difficult to diagonalize
• black box: $$U$$ is not known but we are given a button to click which applies it on the qubit

The word ‘known’ is a bit tricky. Does it mean knowing at high-level what $$U$$ is, or knowing every entry of the matrix if it has a matrix representation, or knowing every entry of the corresponding Hamiltonian? Fortunately, for our purposes, this detail can be omitted.

Recall that pure phase factor on a quantum state is not measurable (commonly known as $$U(1)$$ symmetry), only relative phase on different states is. Thus given only the unitary gate and the eigenstate, there is no hope to get $$\phi$$. With some extra resources, it becomes possible, and it is exactly the aim of phase kickback. The three extra resources in phase kickback are

• an extra qubit, commonly known as ancilla (the latin word for ‘maid’) qubit
• a way to do Hadamard gate on the ancilla qubit
• a way to do controlled-unitary gate on the two qubits

In the more general case where $$U$$ acts on multiple qubits, more ancilla qubits may be needed. If we have a general-purpose quantum computer, all these resources are available.

Before we proceed, we can also think about whether less resources can be used to extract $$\phi$$. For example, what if we only have all possible single-qubit gates at hand, i.e., we can render any unitary evolution on $$\left|\psi\right>$$? Depending on how much we know about $$\left|\psi\right>$$, it could be possible to get $$\phi$$. I will leave these details to you to think about.

The concept of controlled-unitary gate may also need some explanation. In our two-qubit example, it is a two-qubit gate whose action on the controlled qubit depends on the state of the control qubit. Usually, the ancilla qubit (or qubits) is used as control. If the control qubit is in state $$\left|0\right>$$, then nothing happens to the controlled qubit. If the control qubit is in state $$\left|1\right>$$, then the single-qubit unitary gate is applied to the controlled qubit. Also if the control qubit is in a superposition state, the superposition of the action happens. We are only interested in controlled-unitary gates instead of general controlled gates since the time evolution of quantum systems is unitary.

In our two-qubit example, the controlled-unitary gate has an explicit matrix form

$C(U) = \begin{bmatrix} 1& 0 & 0 & 0\\ 0& 1 & 0 & 0\\ 0& 0 & u_{00} & u_{01}\\ 0& 0 & u_{10} & u_{11} \end{bmatrix}$

where $$u_{ij}$$ are the matrix components of the single-qubit unitary gate. Note also that the concept of controlled gate is not new. For example, the well known XOR gate in classical logic is a controlled-NOT gate (with only one output bit though, the one being controlled). In fact, if we keep the control bit as well in the output, we get exactly the same two-qubit controlled-NOT (CNOT) gate with matrix form

$CNOT= \begin{bmatrix} 1& 0 & 0 & 0\\ 0& 1 & 0 & 0\\ 0& 0 & 0 & 1\\ 0& 0 & 1 & 0 \end{bmatrix}$

For classical bits, the input to this CNOT gate can only be one of the four unit vectors, corresponding to state 00, 01, 10, and 11.

Now we are ready to describe the phase kickback protocol: start with $$\left|0\right>\left|\psi\right>$$, apply the Hadamard gate on the ancilla qubit and then the controlled-unitary gate on the two qubits. One can easily verify the resultant state as

$C(U) H\otimes I\left|0\right>\left|\psi\right> =\frac{\left|0\right>+e^{i\phi}\left|1\right>}{\sqrt 2}\left|\psi\right>$

where $$I$$ is the identity matrix, $$\otimes$$ is the tensor product operation, and the one-qubit Hadamard gate is

$H = \frac{1}{\sqrt 2}\begin{bmatrix} 1& 1 \\ 1& -1 \end{bmatrix}.$

Note that the overall effect is to add a phase shift to the control (ancilla) qubit. This is opposite to the common sense that the control bit remains intact and the controlled bit changes. And this is why it is called phase kickback.

For example, the controlled-unitary gate could be a controlled-phase (CPhase) gate and the eigenstate could be $$\left|\psi\right> = \left|1\right>$$. Then we have

$C_\phi H\otimes I\left|0\right>\left|1\right> =\frac{\left|0\right>+e^{i\phi}\left|1\right>}{\sqrt 2}\left|1\right>$

where the quantum gates are given by

$C_\phi = \begin{bmatrix} 1& 0 & 0 & 0\\ 0& 1 & 0 & 0\\ 0& 0 & 1 & 0\\ 0& 0 & 0 & e^{i\phi} \end{bmatrix}$

To further extract the phase $$\phi$$ on the ancilla qubit, there are various options. The most straightforward one is to generate many copies of this state, and keep measuring the three physical observables

$\left<\sigma_x\right> = \cos\phi, \quad \left<\sigma_y\right> = \sin\phi, \quad \left<\sigma_z\right> = 0$

where $$\sigma_i$$’s are the Pauli matrices. This is basically the Bloch sphere representation of quantum two-level systems. Thus in principle $$\phi$$ can be determined as accurate as one wishes. However, it is not efficient to estimate $$\phi$$ this way (unless in special situations, say $$\phi$$ is known to be one of a few possible values) due to the cost of generating the copies. There are other efficient ways to measure $$\phi$$, for example, using quantum Fourier transform (see a future blog post). The details won’t be included here.

It turns out that many quantum algorithms boil down to somehow encode the answer in the phase of the ancilla qubits, with the help of controlled-unitary gates. Thus it is very helpful to think in this phase kickback framework.

To be more concrete, let’s see how the Deutch algorithm works. If it’s your first time to see it, it may blow your mind (actually all the famous quantum algorithms have this effect). The complete version of Deutch algorithm solves the following problem: given a binary function $$f$$ defined on binary strings of $$n$$ digits,

$f:\{0, 1\}^n \rightarrow \{0, 1\},$

with the guarantee that $$f$$ is either constant or balanced (meaning that half of the answer is 0), determine whether it is constant or balanced.

On a classical computer, we will have to evaluate $$2^{n-1}+1$$ inputs to know the answer. In contrast, Deutch algorithm needs only one measurement on $$n$$ qubits: an exponential speedup! For simplicity, I will only consider the $$n=1$$ case. Generalizing to the $$n$$-qubit case is somewhat straightforward. Again, in this simplified situation, classically we need to evaluate two inputs. With a quantum computer, we only need one readout on one qubit.

The first thing we need to worry about is how to implement $$f$$. Obviously it cannot be applied on quantum states directly since its domain is binary strings. Without much surprise, it is implemented as a controlled unitary gate, with the action

$\left|x\right>\left|y\right> \longrightarrow \left|x\right> \left|f(x)\oplus y\right>$

where $$\oplus$$ is the XOR operation. This action is obviously reversible, thus there must be a way to implement it on a quantum computer as a unitary time evolution. Let’s call it $$U_f$$. Note that its action on $$\left|y\right>$$ is controlled by the value of $$f(x)$$ instead of the ancilla qubit state directly. It is a CNOT gate controlled by the value of $$f(x)$$. Let me also assure you that in the $$n$$-qubit case, such $$U_f$$ can be implemented on a quantum computer without exponential trouble. Thus the exponential speed up is real.

With a hardware implementation of $$U_f$$, we can apply it on a special initial state, i.e.,

$C(U_f) H\otimes H\left|0\right>\left|1\right>$

where the first qubit is the ancilla qubit, and the controlled qubit is in the eigenstate of the NOT gate (or X-gate since it’s the Pauli x matrix) $$H\left|1\right>$$ with eigenvalue $$-1$$, or equivalently $$\phi=\pi$$. After simplification, the ancillar qubit is given by

$\frac{1}{\sqrt 2}\left(\left|0\right> + (-1)^{f(0) \oplus f(1)}\left|1\right>\right)$

Recall that $$f$$ is either constant or balanced. Thus the phase kickback is either $$0$$ or $$\pi$$, which is particularly easy to distinguish since they are eigenstates of $$\sigma_x$$. One can measure $$\sigma_x$$ just once to tell them apart, with only one copy of the input state. In the quantum computing framework, the equivalent procedure is to first apply Hadamard gate and then check whether the state is $$\left|0\right>$$ or $$\left|1\right>$$.

Admittedly, Deutch algorithm is quite artificial. But it opens up the possibility of exponential speedup on quantum computers. Many quantum algorithms more efficient than their classical counterparts are thus discovered and some of them are indeed practically useful.