In this tutorial, I will go over how to use the recently released TensorFlow Probability package to do Hamiltonian Monte Carlo (HMC) sampling; it makes HMC much easier for everybody. First, you need to install TensorFlow and TensorFlow Probability if you haven't done so. The task we want to achieve is to draw a sufficient number of samples \(\phi\) from a given probability distribution: \[P_{target}=\frac{\exp(-S[\phi])}{\mathcal{Z}}\]

where \(\mathcal{Z}\) is the partition function and, in physics, \(S[\phi]\) is called the action of the field \(\phi\). If you are not familiar with physics, that is fine; this tutorial is not about physics, only the notation comes from it. Note that any probability distribution can be written in this form. \(\phi=(\phi_{0}, \phi_{1}, \ldots)\) is a multi-dimensional vector, or a field configuration. From now on, \(\phi\) implicitly denotes a vector.

The only difference between HMC and traditional MCMC is how one configuration is updated to get the next one. In HMC, we view \(S[\phi]\) as a potential for \(\phi\). In analogy with classical mechanics, we add a kinetic energy term \(K(\pi)=\frac{1}{2}m \pi^{2}\) to write down the total energy, or Hamiltonian:

\[H = K(\pi)+S[\phi]\]

Usually we set \(m=1\); this just fixes the total energy scale, and TensorFlow Probability encodes this choice implicitly. To get an updated configuration, we evolve the current one in the phase space \((\phi, \pi)\) using Hamilton's equations:

\[\frac{d\phi}{dt} = \frac{\partial H}{\partial \pi}, \frac{d\pi}{dt}=-\frac{\partial H}{\partial \phi}\]
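
In practice these equations are integrated numerically, typically with the leapfrog scheme. The following is a minimal NumPy sketch of one leapfrog trajectory for a toy one-dimensional quadratic action \(S[\phi]=\phi^{2}/2\) (a standard Gaussian target); the names `grad_S`, `leapfrog`, the step size, and the step count are all illustrative choices, not TensorFlow Probability API:

```python
import numpy as np

def grad_S(phi):
    # Gradient of the toy action S[phi] = phi**2 / 2 (standard Gaussian target).
    return phi

def leapfrog(phi, pi, dt, n_steps):
    """Evolve (phi, pi) under Hamilton's equations with the leapfrog integrator."""
    pi = pi - 0.5 * dt * grad_S(phi)      # initial half step in momentum
    for _ in range(n_steps - 1):
        phi = phi + dt * pi               # full step in position
        pi = pi - dt * grad_S(phi)        # full step in momentum
    phi = phi + dt * pi
    pi = pi - 0.5 * dt * grad_S(phi)      # final half step in momentum
    return phi, pi

def hamiltonian(phi, pi):
    return 0.5 * pi**2 + 0.5 * phi**2     # K(pi) + S[phi], with m = 1

rng = np.random.default_rng(0)
phi0, pi0 = 1.0, rng.standard_normal()   # momentum drawn from a Gaussian
phi1, pi1 = leapfrog(phi0, pi0, dt=0.1, n_steps=10)
# Leapfrog nearly conserves H, which is why the Metropolis
# accept/reject step of HMC has a high acceptance rate.
dH = hamiltonian(phi1, pi1) - hamiltonian(phi0, pi0)
```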

We let the system evolve for a small amount of time \(\delta t\) to get the updated configuration \(\phi'\); the accept/reject step is then the same as in traditional MCMC. In the following, I will demonstrate how to use TensorFlow Probability to sample from a complex \(\phi^{4}\) theory. If you are not familiar with this theory, the essential point is that the probability density of \(\phi\) is determined by the action:

\[S[\phi]=-k\sum_{\langle i,j\rangle}\phi^{*}_{i}\phi_{j}+\sum_{i}m|\phi_{i}|^{2}+\sum_{i}\lambda |\phi_{i}|^{4}\]

where \(\phi_{i}\) is a complex field living on a two-dimensional lattice, and \(\langle i,j\rangle\) denotes that sites \(i\) and \(j\) are nearest neighbors.

First, we include necessary packages.

Then we define our two-dimensional lattice complex \(\phi^{4}\) theory and its parameters. (The parameters are chosen to create a deep Mexican-hat potential, so the model reduces to an XY model controlled by a temperature T.) This part is not essential for readers who are not interested in the physics and is safe to skip.
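
A sketch of how the action could be written with TensorFlow ops. The lattice size `L`, the coupling values, the choice of storing the complex field as two real channels (so HMC sees a real-valued state), and taking the real part \(\mathrm{Re}(\phi^{*}_{i}\phi_{j})\) of the hopping term (so the action is real) are all illustrative assumptions, not taken from the original post:

```python
import tensorflow as tf

L = 8                        # lattice size (illustrative)
k, m, lam = 1.0, -0.5, 1.0   # illustrative couplings

def action(phi):
    """S[phi] for a complex phi^4 field stored as two real channels.

    phi: real tensor of shape [batch, L, L, 2], the last axis holding
    (Re phi, Im phi), so HMC can treat the state as a real vector.
    Returns one action value per batch element.
    """
    re, im = phi[..., 0], phi[..., 1]
    phi2 = re**2 + im**2
    # Hopping term: Re(phi_i^* phi_j) summed over forward nearest
    # neighbours in both lattice directions, with periodic boundaries.
    hop = 0.0
    for axis in (1, 2):
        re_n = tf.roll(re, shift=-1, axis=axis)
        im_n = tf.roll(im, shift=-1, axis=axis)
        hop += tf.reduce_sum(re * re_n + im * im_n, axis=[1, 2])
    return (-k * hop
            + m * tf.reduce_sum(phi2, axis=[1, 2])
            + lam * tf.reduce_sum(phi2**2, axis=[1, 2]))
```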

Now we define the target_log_prob that we want to sample from. Notice that TensorFlow Probability expects x with shape [batch, vector dimension].
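
The unnormalized log-density is just minus the action, since \(\log P = -S[\phi] - \log\mathcal{Z}\) and the constant \(\log\mathcal{Z}\) cancels in HMC. A self-contained sketch with a stand-in quadratic action (in practice, substitute the \(\phi^{4}\) action defined above):

```python
import tensorflow as tf

def action(phi):
    # Stand-in quadratic action; replace with the phi^4 action in practice.
    return 0.5 * tf.reduce_sum(phi**2, axis=-1)

def target_log_prob(phi):
    # phi has shape [batch, vector dimension]; returns one
    # log-probability per batch element, up to the constant -log Z.
    return -action(phi)
```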

The important part is to define the following HMC kernel.

Now we can do HMC sampling from the defined target_log_prob.