# Bayesian Analysis: Some useful results from probability theory

#### Doug I. Jones

## Some useful results from probability theory

We assume the reader is familiar with elementary manipulations involving probabilities and probability distributions. In particular, basic probability background that must be well understood for key parts of the book includes the manipulation of joint densities, the definition of simple moments, the transformation of variables, and methods of simulation. In this section we briefly review these assumed prerequisites and clarify some further notational conventions used in the remainder of the book. Appendix A provides information on some commonly used probability distributions.

As introduced in Section 1.3, we generally represent joint distributions by their joint probability mass or density function, with dummy arguments reflecting the name given to each variable being considered. Thus for two quantities $u$ and $v$, we write the joint density as $p(u, v)$; if specific values need to be referenced, this notation will be further abused as with, for example, $p(u, v=1)$.

In Bayesian calculations relating to a joint density $p(u, v)$, we will often refer to a conditional distribution or density function such as $p(u \mid v)$ and a marginal density such as $p(u)=\int p(u, v) d v$. In this notation, either or both $u$ and $v$ can be vectors. Typically it will be clear from the context that the range of integration in the latter expression refers to the entire range of the variable being integrated out. It is also often useful to factor a joint density as a product of marginal and conditional densities; for example, $p(u, v, w)=p(u \mid v, w) p(v \mid w) p(w)$.
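For discrete variables the integral in the marginal density becomes a sum, which makes these manipulations easy to check by hand. The following is a minimal sketch, assuming a made-up joint probability table; the numbers and the helper names `marginal` and `conditional_u_given_v` are illustrative and not part of the original text.

```python
from collections import defaultdict

# Hypothetical joint pmf p(u, v) with u in {0, 1} and v in {0, 1, 2}.
# The probabilities are invented for illustration and sum to 1.
joint = {
    (0, 0): 0.10, (0, 1): 0.20, (0, 2): 0.10,
    (1, 0): 0.30, (1, 1): 0.10, (1, 2): 0.20,
}


def marginal(joint, axis):
    """Sum the joint pmf over the other variable; axis 0 gives p(u), axis 1 gives p(v)."""
    out = defaultdict(float)
    for key, prob in joint.items():
        out[key[axis]] += prob
    return dict(out)


def conditional_u_given_v(joint, v):
    """Return p(u | v) = p(u, v) / p(v) as a dict keyed by u."""
    p_v = marginal(joint, axis=1)[v]
    return {u: prob / p_v for (u, vv), prob in joint.items() if vv == v}


p_u = marginal(joint, axis=0)  # discrete analogue of p(u) = ∫ p(u, v) dv
p_v = marginal(joint, axis=1)

# Check the factorization p(u, v) = p(u | v) p(v) for every cell of the table.
factorization_holds = all(
    abs(conditional_u_given_v(joint, v)[u] * p_v[v] - prob) < 1e-12
    for (u, v), prob in joint.items()
)
```

The same pattern extends to three variables, where $p(u, v, w)$ is recovered by multiplying $p(u \mid v, w)$, $p(v \mid w)$, and $p(w)$.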

Some authors use different notations for distributions on parameters and observables (for example, $\pi(\theta)$ and $f(y \mid \theta)$), but this obscures the fact that all probability distributions have the same logical status in Bayesian inference. We must always be careful, though, to indicate appropriate conditioning; for example, $p(y \mid \theta)$ is different from $p(y)$. In the interests of conciseness, however, our notation hides the conditioning on hypotheses that hold throughout (no probability judgments can be made in a vacuum), and to be more explicit one might use a notation such as the following:
$$p(\theta, y \mid H)=p(\theta \mid H) p(y \mid \theta, H)$$
where $H$ refers to the set of hypotheses or assumptions used to define the model. Also, we sometimes suppress explicit conditioning on known explanatory variables, $x$.

We use the standard notations, $\mathrm{E}(\cdot)$ and $\operatorname{var}(\cdot)$, for mean and variance, respectively:
$$\mathrm{E}(u)=\int u p(u) d u, \quad \operatorname{var}(u)=\int(u-\mathrm{E}(u))^2 p(u) d u .$$
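As a quick numerical check of these two definitions, the integrals can be approximated on a grid. The sketch below assumes a normal density with mean 2 and standard deviation 3, chosen purely for illustration, and uses a simple midpoint rule; `normal_pdf` and `integrate` are hypothetical helper names.

```python
import math


def normal_pdf(u, mu=2.0, sigma=3.0):
    """Density of a normal distribution; mu and sigma are illustrative assumptions."""
    z = (u - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def integrate(f, lo, hi, n=20000):
    """Midpoint-rule approximation of the integral of f over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(f(lo + (i + 0.5) * h) for i in range(n))


# Integrate over mu +/- 14 sigma, where the neglected tail mass is negligible.
lo, hi = -40.0, 44.0

# E(u) = ∫ u p(u) du
mean_u = integrate(lambda u: u * normal_pdf(u), lo, hi)

# var(u) = ∫ (u - E(u))^2 p(u) du
var_u = integrate(lambda u: (u - mean_u) ** 2 * normal_pdf(u), lo, hi)
```

Under these assumptions `mean_u` should come out close to 2 and `var_u` close to 9, matching the mean and squared standard deviation of the assumed density.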

## Modeling using conditional probability

Useful probability models often express the distribution of observables conditionally or hierarchically rather than through more complicated unconditional distributions. For example, suppose $y$ is the height of a university student selected at random. The marginal distribution $p(y)$ is (essentially) a mixture of two approximately normal distributions centered around 160 and 175 centimeters. A more useful description of the distribution of $y$ would be based on the joint distribution of height and sex: $p(\text{male}) \approx p(\text{female}) \approx \frac{1}{2}$, along with the conditional specifications that $p(y \mid \text{female})$ and $p(y \mid \text{male})$ are each approximately normal with means 160 and 175 cm, respectively. If the conditional variances are not too large, the marginal distribution of $y$ is bimodal. In general, we prefer to model complexity with a hierarchical structure using additional variables rather than with complicated marginal distributions, even when the additional variables are unobserved or even unobservable; this theme underlies mixture models, as discussed in Chapter 22. We repeatedly return to the theme of conditional modeling throughout the book.
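The height example can be sketched as a two-stage simulation: first draw sex, then draw height from the corresponding conditional normal distribution. The text gives only the conditional means, so the common standard deviation of 5 cm below is an assumption made for illustration, as are the function names.

```python
import math
import random

P_FEMALE = 0.5                            # p(female) ≈ p(male) ≈ 1/2, as in the text
MEAN = {"female": 160.0, "male": 175.0}   # conditional means in cm, as in the text
SD = 5.0                                  # assumed conditional sd; not given in the text


def draw_height(rng):
    """Draw (sex, height) hierarchically: sex first, then height given sex."""
    sex = "female" if rng.random() < P_FEMALE else "male"
    return sex, rng.gauss(MEAN[sex], SD)


def normal_pdf(y, mu, sigma):
    z = (y - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def marginal_pdf(y):
    """Marginal density p(y) = p(y | female) p(female) + p(y | male) p(male)."""
    return (P_FEMALE * normal_pdf(y, MEAN["female"], SD)
            + (1.0 - P_FEMALE) * normal_pdf(y, MEAN["male"], SD))


rng = random.Random(0)  # fixed seed so the sketch is reproducible
heights = [draw_height(rng)[1] for _ in range(20000)]
sample_mean = sum(heights) / len(heights)

# With this assumed sd, the marginal density dips between the two conditional
# means, which is the bimodality described in the text.
dips_in_middle = (marginal_pdf(167.5) < marginal_pdf(160.0)
                  and marginal_pdf(167.5) < marginal_pdf(175.0))
```

Raising `SD` makes the two components overlap more; for an equal mixture of two normals with a common standard deviation, the dip disappears once the standard deviation reaches half the distance between the means (7.5 cm here).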

### Means and variances of conditional distributions

It is often useful to express the mean and variance of a random variable $u$ in terms of the conditional mean and variance given some related quantity $v$. The mean of $u$ can be obtained by averaging the conditional mean over the marginal distribution of $v$,
$$\mathrm{E}(u)=\mathrm{E}(\mathrm{E}(u \mid v)) \tag{1.8}$$
where the inner expectation averages over $u$, conditional on $v$, and the outer expectation averages over $v$. Identity (1.8) is easy to derive by writing the expectation in terms of the joint distribution of $u$ and $v$ and then factoring the joint distribution:
$$\mathrm{E}(u)=\iint u p(u, v) d u d v=\iint u p(u \mid v) d u p(v) d v=\int \mathrm{E}(u \mid v) p(v) d v$$
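The derivation can be checked on a small discrete example, where the integrals become sums. This is a minimal sketch with an invented joint probability table; the values and variable names are assumptions for illustration only.

```python
# Hypothetical joint pmf p(u, v) with u in {1, 2} and v in {"a", "b"}.
joint = {
    (1, "a"): 0.1, (2, "a"): 0.3,
    (1, "b"): 0.4, (2, "b"): 0.2,
}

# Left-hand side: E(u) computed directly from the joint distribution.
mean_direct = sum(u * prob for (u, _), prob in joint.items())

# Marginal p(v), obtained by summing the joint over u.
p_v = {}
for (_, v), prob in joint.items():
    p_v[v] = p_v.get(v, 0.0) + prob

# Inner expectation: E(u | v) = sum over u of u * p(u, v) / p(v).
cond_mean = {
    v: sum(u * prob for (u, vv), prob in joint.items() if vv == v) / p_v[v]
    for v in p_v
}

# Outer expectation: average E(u | v) over the marginal distribution of v.
mean_iterated = sum(cond_mean[v] * p_v[v] for v in p_v)
```

Here `cond_mean` is 1.75 for `"a"` and about 1.33 for `"b"`, and weighting these by $p(v)$ reproduces the direct value of 1.5, as the identity requires.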

