Bayesian Inference and I Know First’s Application

This article was written by Kwon Sok Oh, a Financial Analyst at I Know First.

In this article, we will examine the formal mathematical definitions and statistical intuitions behind Bayesian inference. The explanations are illustrated step-by-step through an example using changes in Bank of America (BAC) stock prices. A brief description of I Know First’s Bayesian Neural Networks is given at the end.

Summary

  1. Introduction of Bayesian Inference and BAC Example
  2. Prior Distribution: Concepts and BAC Example
  3. Likelihood Function: Concepts and BAC Example
  4. Posterior Distribution: Concepts and BAC Example
  5. Posterior Predictive Distribution: Concepts and BAC Example
  6. Conclusion of Discussion
  7. I Know First and Bayesian Neural Networks

1. Introduction of Bayesian Inference and BAC Example

Given a population distribution with unknown parameters and a prior distribution over those parameters, Bayesian inference incorporates new observations to update the parameter distribution, and then builds a predictive population distribution from the updated parameters. We will illustrate this process step-by-step through an example.

Consider the data on Bank of America stock closing prices from June 26th, 2009 to June 26th, 2018. Weekends and holidays are excluded, since the stock market is closed on those days. We want to apply Bayesian inference to the change in stock price, so let’s consider the first 700 observations of stock price changes (from June 26th, 2009 to April 5th, 2012). Graphing these observations, we have the histogram below.

Hence, we will assume that the population distribution is approximately normal. The mean is -0.00502857142857143 and the standard deviation is 0.330974536976335. We will use these figures in the next step for setting up the prior distribution.
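Computing these summary statistics from a price series takes only a few lines; here is a minimal sketch in Python, using synthetic prices as a stand-in for the actual BAC closes (the mean and standard deviation arguments merely echo the figures above — the data are not real):

```python
import numpy as np

# Hypothetical stand-in for the BAC closing-price series; in practice
# these would be the actual closes from June 26th, 2009 onward
# (701 closes yield 700 daily changes).
rng = np.random.default_rng(0)
closes = 12.0 + np.cumsum(rng.normal(-0.005, 0.331, size=701))

changes = np.diff(closes)        # day-over-day price changes
mu_hat = changes.mean()          # sample mean of the changes
sigma_hat = changes.std(ddof=1)  # sample standard deviation

print(changes.shape[0])  # 700
```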

2. Prior Distribution: Concepts and BAC Example

The prior distribution is the original distribution that the parameter is assumed to have. Inferring from the above 700 observations, let’s assume that the prior distribution for the unknown mean μ of the population distribution is a normal distribution with known mean η = -0.00502857142857143 and known standard deviation δ = 0.01. Let’s also assume that the population is normally distributed with mean μ and standard deviation σ = 0.330974536976335.

3. Likelihood Function: Concepts and BAC Example

Now, let’s make another 700 new observations, denoted by Y₁, Y₂, …, Yₙ. (In this case, n = 700, but for simplicity, let’s keep n symbolic until all derivations have been made.) These are the next 700 observations, from April 5th, 2012 to January 21st, 2015. Using these new observations, we want to find the likelihood function for the mean μ.

Let U = ∑ Yᵢ. From elementary statistical inference, we know that U is a sufficient statistic for μ, so it suffices to work with U instead of Y₁, Y₂, …, Yₙ. Also from elementary statistical inference, U is normally distributed with mean nμ and variance nσ², which gives us the likelihood function L(u|μ).
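The claim that U is normal with mean nμ and variance nσ² can be checked numerically; a quick sketch with simulated data (n, μ, and σ echo the example, but the draws are synthetic):

```python
import numpy as np

n, mu, sigma = 700, -0.005, 0.331
rng = np.random.default_rng(1)

# 10,000 replications of n normal observations; sum each one to get U.
U = rng.normal(mu, sigma, size=(10_000, n)).sum(axis=1)

print(U.mean())  # close to n * mu       = -3.5
print(U.var())   # close to n * sigma**2 ≈ 76.69
```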

4. Posterior Distribution: Concepts and BAC Example

From the prior distribution and the likelihood function of the mean μ, we can now derive the posterior distribution of μ. The posterior distribution incorporates the information from the new observations into the prior distribution for an improved distribution model of the mean μ.

The joint density of U and μ is f(u,μ) = L(u|μ) × g(μ), where g(μ) is the prior density function for μ.

Now, we calculate the posterior density of μ given U = u.

Let the posterior density be denoted by h(μ|u), so that h(μ|u) ∝ L(u|μ) × g(μ). Completing the square in μ shows that the posterior is again normal, with mean η′ and standard deviation δ′ given by

η′ = (δ²u + σ²η) / (nδ² + σ²),  δ′ = √(σ²δ² / (nδ² + σ²)).

Now, let’s plug in n = 700, δ = 0.01, u = 6.03, σ = 0.330974536976335, and η = -0.00502857142857143.

We have η′ = 0.00029045475 and δ′ = 0.00781104026.
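These figures can be reproduced with a few lines of Python; the update used below is the standard normal–normal conjugate result, written in terms of U rather than the sample mean:

```python
import math

n = 700
delta = 0.01                  # prior standard deviation of mu
u = 6.03                      # observed value of U = sum of the Y_i
sigma = 0.330974536976335     # population standard deviation
eta = -0.00502857142857143    # prior mean of mu

# Posterior of mu given U = u is normal with these parameters:
eta_post = (delta**2 * u + sigma**2 * eta) / (n * delta**2 + sigma**2)
delta_post = math.sqrt(sigma**2 * delta**2 / (n * delta**2 + sigma**2))

print(eta_post)    # ≈ 0.00029045475
print(delta_post)  # ≈ 0.00781104026
```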

Please refer to Figure 1 and Figure 2 for a graphical representation. We can see that the posterior distribution incorporates information from the likelihood function to improve on the prior distribution. Note that the standard deviation of the posterior distribution remains relatively high, reflecting the fact that it is very difficult to predict future changes in stock prices.

5. Posterior Predictive Distribution: Concepts and BAC Example

The last part of this discussion is to derive the posterior predictive distribution. The posterior predictive distribution incorporates information from the posterior distribution of the parameter μ into the population distribution, improving the population distribution model so that it better represents reality.

Let the posterior predictive distribution be denoted by p(y|u).

We have p(y|u) = ∫ p(y|u,μ)h(μ|u)dμ, but in this example, y is independent of u given μ, so p(y|u) = ∫ p(y|μ)h(μ|u)dμ.
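For this normal–normal model the integral has a closed form: the predictive distribution is itself normal, with mean η′ and variance σ² + δ′². A quick sketch using the posterior values computed above:

```python
import math

sigma = 0.330974536976335   # population standard deviation
eta_post = 0.00029045475    # posterior mean of mu
delta_post = 0.00781104026  # posterior standard deviation of mu

# Predictive uncertainty combines population noise with parameter noise.
pred_mean = eta_post
pred_sd = math.sqrt(sigma**2 + delta_post**2)

print(pred_sd)  # ≈ 0.33107, barely wider than sigma itself
```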

Now, let’s go back to the data and check whether the posterior predictive distribution is a better model of reality than the original population distribution.

For the observations from January 21st, 2015 to June 26th, 2018, the mean was 0.0153526023121387 with standard deviation 0.328048626253475. Let’s denote this population as the “new population.” The mean of the predictive distribution, η′ = 0.00029045475, is substantially closer to the new population mean than the mean of the original population distribution, -0.00502857142857143, was. Please refer to Figure 3 for a graphical representation.
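The improvement can be made concrete by comparing absolute distances to the new population mean, using the figures quoted above:

```python
new_mean = 0.0153526023121387     # mean of changes, Jan 2015 - Jun 2018
orig_mean = -0.00502857142857143  # mean of the original 700 changes
pred_mean = 0.00029045475         # mean of the posterior predictive

print(abs(new_mean - pred_mean))  # ≈ 0.01506
print(abs(new_mean - orig_mean))  # ≈ 0.02038
```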

Hence, this is a good example of Bayesian inference incorporating information from new observations to turn the old model into a more accurate one.

6. Conclusion of Discussion

In this discussion, we examined the definitions, the statistical meanings, and an example of Bayesian inference.

We first assumed a distribution model for the population and a prior distribution for the parameter μ. We then used the likelihood function to create the posterior distribution for the parameter, which incorporated information from new observations to improve the prior distribution. Finally, we used the posterior distribution of the parameter to create the posterior predictive distribution of the population, which again incorporated information from new observations to improve the original population distribution model. The posterior predictive distribution of the population turned out to be a more accurate representation of the changes in Bank of America stock prices than the original population distribution model.

7. I Know First and Bayesian Neural Networks

Since its founding, I Know First has used probabilistic approaches to forecasting. In Bayesian terms, the historical data represents our prior knowledge of market behavior, with each variable (asset) having its own statistical properties that vary over time. Each asset interacts with other assets in complex ways. Through machine learning, multiple competing models of these interactions are created. Each model has a certain view of the markets based on what it has learned from past history.

Combining that view with the present values of the market gives a conditional distribution of future values. Together, these independent competing models create a “cloud” of predictions with measurable statistical parameters. The end result is a signal that is a weighted average of these predictions according to each model’s past performance, which I Know First denotes as “predictability.” After each trading day, new data is added to the data set and learned, resulting in an updated view of the markets. Thus, each new forecast is based on the latest updated model of the market, which is, in essence, the Bayesian posterior.
