For more info: Introduction to EBM_072514
***Peeking Into The Seminar***
The basic idea behind statistical models and machine learning is measuring the dependence between variables. Once the dependence between input and output values has been captured, the trained machine returns, for a new input, the output value that depends on it most strongly.
Energy-Based Models (EBMs) encode dependence by assigning an energy to each configuration of input and output values. If the input and output are strongly correlated, the energy of the configuration should be low; in the opposite case, the energy function should assign a high energy. For a given input, the EBM therefore outputs the value with the lowest energy.
In an EBM, the energy function is a function of the parameters, the input values, and the output values, so it is possible to define a loss function that measures the quality of the energy function. The loss function is usually a function of the parameters, written L(W). Finding the parameters that minimize the loss function yields an optimized energy function. This is the training process of an EBM.
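As a minimal sketch of inference by energy minimization, the snippet below scans a grid of candidate outputs and returns the one with the lowest energy. The energy function `(y - x**2)**2`, the candidate grid, and the names `energy` and `infer` are assumptions chosen for illustration; they are not taken from the seminar.

```python
import numpy as np

def energy(x, y):
    # Hypothetical energy: low when y is close to x**2, high otherwise.
    return (y - x ** 2) ** 2

def infer(x, candidates):
    # EBM inference: pick the candidate output with the lowest energy.
    energies = energy(x, candidates)
    return candidates[np.argmin(energies)]

candidates = np.linspace(-1.0, 1.0, 201)  # assumed grid of candidate outputs
y_star = infer(0.5, candidates)
print(y_star)
```

A grid search only works for low-dimensional outputs; for larger output spaces some other minimization procedure would be needed.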
Training on y = x^2 is a simple test for implementing a learning algorithm. 200 samples satisfying y = x^2, with x between -1 and 1, are drawn and used to train the machine. Here, how the flatness of the energy changes can be tested.
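The y = x^2 experiment might be sketched as follows. The seminar summary does not say which architecture or loss was used, so everything here is an assumption: a hypothetical feature map `[1, x, x^2]`, an energy equal to the squared distance between y and the model's prediction, the average energy of the training pairs as the loss L(W), and plain gradient descent.

```python
import numpy as np

def features(x):
    # Hypothetical feature map [1, x, x^2] (an assumption, not from the seminar).
    return np.stack([np.ones_like(x), x, x ** 2], axis=1)

def energy(W, x, y):
    # E(W, x, y): squared distance between y and the model's prediction.
    return (y - features(x) @ W) ** 2

def loss(W, x, y):
    # L(W): average energy over the training pairs.
    return energy(W, x, y).mean()

def train(x, y, lr=0.5, steps=2000):
    # Plain gradient descent on L(W).
    W = np.zeros(3)
    phi = features(x)
    for _ in range(steps):
        residual = y - phi @ W
        grad = -2.0 * (phi * residual[:, None]).mean(axis=0)  # dL/dW
        W -= lr * grad
    return W

rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, size=200)  # 200 samples between -1 and 1
y = x ** 2
W = train(x, y)
print(W, loss(W, x, y))
```

With noise-free samples, the learned weights should approach `[0, 0, 1]`, meaning the energy is lowest along the curve y = x^2.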
Classical thermodynamics can only deal with macrostates, which are described by quantities such as temperature, volume, and the number of molecules in an object. However, the same macrostate can correspond to many different microstates, which differ in the arrangement and positions of the molecules, their quantum states, their phase-space states, or their energy states. The average over the microstates should equal the measured value of the macrostate, and this is where statistical mechanics comes in.
The most basic concept is the ensemble: a virtual collection of the myriad microstates that are consistent with a given macrostate. In the most common one, the canonical ensemble, the volume, the temperature, and the number of molecules are fixed and define the macrostate. In this case the energies of the microstates can differ, and each state is described by a probability distribution.
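In the canonical ensemble, the probability of a microstate with energy E is proportional to exp(-E / kT), the Boltzmann distribution. The sketch below computes these probabilities for a made-up set of three energy levels and then averages the energy over the microstates; the energy values, the choice `kT=1.0`, and the function name are illustrative assumptions.

```python
import numpy as np

def boltzmann_probabilities(energies, kT=1.0):
    # P(state) = exp(-E / kT) / Z, where Z sums the weights over all states.
    energies = np.asarray(energies, dtype=float)
    # Subtracting the minimum energy leaves the probabilities unchanged
    # but avoids overflow in the exponential.
    weights = np.exp(-(energies - energies.min()) / kT)
    return weights / weights.sum()

energies = np.array([0.0, 1.0, 2.0])  # hypothetical microstate energies
probs = boltzmann_probabilities(energies)
mean_energy = np.sum(probs * energies)  # ensemble average of the energy
print(probs, mean_energy)
```

Lower-energy microstates receive higher probability, and the ensemble average plays the role of the macroscopic measurement.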
Similarly, in an EBM, when there is an output value yk for an input xi, let the pair (xi, yk) be a microstate. Then the many microstates (xi,y1), (xi,y2) …(xi,y)… constitute an ensemble. The probability of each microstate then follows the Boltzmann distribution determined by the energy defined earlier. An RBM (Restricted Boltzmann Machine) is a probabilistic model that uses the negative log-likelihood (NLL) loss function within the EBM framework: it defines an energy function and also a loss function. As a result, minimizing the loss function serves as the learning method for the RBM.
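To illustrate how the NLL loss connects energies to probabilities, the sketch below treats a small set of candidate outputs for one input as the ensemble and computes the negative log of the Boltzmann probability of the correct output. It is a generic NLL over a discrete set of outputs, not an RBM implementation; the energy values, `beta`, and the function name are assumptions.

```python
import numpy as np

def nll_loss(energies, target_index, beta=1.0):
    # energies[k] is E(x_i, y_k) for each candidate output y_k of one input x_i.
    energies = np.asarray(energies, dtype=float)
    # log Z = log sum_k exp(-beta * E_k), computed with a stable log-sum-exp.
    shifted = -beta * energies
    m = shifted.max()
    log_z = m + np.log(np.exp(shifted - m).sum())
    # -log P(y_target | x_i) = beta * E_target + log Z
    return beta * energies[target_index] + log_z

energies = np.array([0.0, 2.0, 2.0])  # hypothetical energies of (x_i, y_k)
loss_when_correct_is_low = nll_loss(energies, target_index=0)
loss_when_correct_is_high = nll_loss(energies, target_index=1)
print(loss_when_correct_is_low, loss_when_correct_is_high)
```

The loss is small when the correct output has the lowest energy and grows when another output has lower energy, so minimizing it pushes the energy of correct pairs down relative to the rest.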