# Statistical Inference | STAT3923

Doug I. Jones

• Statistical Inference
• Statistical Computing
• (Generalized) Linear Models
• Statistical Machine Learning
• Longitudinal Data Analysis
• Foundations of Data Science
## Statistical Inference | Motivation: Election polls

Let us consider the following “practical” question.
One of $L$ candidates for an office is about to be selected by a population-wide vote. How do we predict the winner via an opinion poll?
A (naive) model of the situation could be as follows. Let us represent the preference of a particular voter by his preference vector, a basic orth $e$ in $\mathbf{R}^{L}$ with unit entry in position $\ell$ meaning that the voter is about to vote for the $\ell$-th candidate. The entries $\mu_{\ell}$ of the average $\mu$ of these vectors over the population are the fractions of votes in favor of the $\ell$-th candidate, and the elected candidate is the one “indexing” the largest of the $\mu_{\ell}$'s. Now assume that we select a member of the population at random, from the uniform distribution, and observe his preference vector. Our observation $\omega$ is a realization of a discrete random variable taking values in the set $\Omega=\left\{e_{1}, \ldots, e_{L}\right\}$ of basic orths in $\mathbf{R}^{L}$, and $\mu$ is the distribution of $\omega$ (technically, the density of this distribution w.r.t. the counting measure $\Pi$ on $\Omega$). Selecting a small threshold $\delta$ and assuming that the true $\mu$, unknown to us, is such that its largest entry exceeds every other entry by at least $\delta$ and that $\mu_{\ell} \geq \frac{1}{N}$ for all $\ell$, $N$ being the population size, we can model the population preference for the $\ell$-th candidate with the hypothesis
$$
\begin{aligned}
\mu \in M_{\ell} &=\left\{\mu \in \mathbf{R}^{L}: \mu_{i} \geq \frac{1}{N}, \sum_{i} \mu_{i}=1, \mu_{\ell} \geq \mu_{i}+\delta \ \forall(i \neq \ell)\right\} \\
& \subset \mathcal{M}=\left\{\mu \in \mathbf{R}^{L}: \mu>0, \sum_{i} \mu_{i}=1\right\}
\end{aligned}
$$
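To make the model concrete, here is a minimal Python sketch of a naive poll. It checks whether a given $\mu$ satisfies the constraints defining $M_{\ell}$, draws preference indices from $\mu$, and predicts the winner from the empirical frequencies. The function names (`in_M_ell`, `simulate_poll`, `predict_winner`), the example values of $\mu$, $\delta$, $N$, and the poll size `K` are illustrative assumptions, as is the simplification that voters are polled independently (with replacement). Candidates are indexed from 0 in the code.

```python
import random


def in_M_ell(mu, ell, delta, N, tol=1e-9):
    """Check the constraints defining M_ell for a probability vector mu.

    mu    : list of vote fractions, one per candidate
    ell   : 0-based index of the hypothesized winner
    delta : required margin of the winner over every other candidate
    N     : population size (every fraction must be at least 1/N)
    tol   : numerical tolerance for the sum-to-one check (our assumption)
    """
    sums_to_one = abs(sum(mu) - 1.0) <= tol
    above_floor = all(m >= 1.0 / N for m in mu)
    leads_by_delta = all(
        mu[ell] >= m + delta for i, m in enumerate(mu) if i != ell
    )
    return sums_to_one and above_floor and leads_by_delta


def simulate_poll(mu, K, rng):
    """Poll K voters independently; return the empirical vote fractions."""
    L = len(mu)
    votes = rng.choices(range(L), weights=mu, k=K)
    counts = [0] * L
    for v in votes:
        counts[v] += 1
    return [c / K for c in counts]


def predict_winner(freq):
    """Return the index of the largest empirical fraction."""
    return max(range(len(freq)), key=lambda i: freq[i])


if __name__ == "__main__":
    mu = [0.5, 0.3, 0.2]          # hypothetical true vote fractions
    rng = random.Random(0)        # fixed seed for reproducibility
    freq = simulate_poll(mu, K=1000, rng=rng)
    print("mu in M_0:", in_M_ell(mu, 0, delta=0.1, N=10**6))
    print("empirical fractions:", freq)
    print("predicted winner:", predict_winner(freq))
```

With a wide margin such as the one above, a modest poll identifies the leader easily; the interesting regime is when $\delta$ is small and the empirical fractions of the top candidates are hard to separate.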

## Statistical Inference | Sequential hypothesis testing

In view of the above analysis, predicting the outcomes of “close run” elections requires huge poll sizes. This does not mean, however, that nothing can be done to build more reasonable opinion polls. The classical related statistical idea, going back to Wald [236], is to pass to sequential tests, where the observations are processed one by one, and at every instant we either accept some of our hypotheses and terminate, or conclude that the observations obtained so far are insufficient to make a reliable inference and pass to the next observation. The idea is that a properly built sequential test, while still ensuring a desired risk, will be able to make “early decisions” when the distribution underlying the observations is “well inside” the true hypothesis and thus is far from the alternatives.

[Figure: $\mathcal{C}_{s}$-closeness. Hypotheses in the tuple $\left\{G_{2 \ell-1}^{s}: \mu \in M_{\ell},\ G_{2 \ell}^{s}: \mu \in M_{\ell}^{s},\ 1 \leq \ell \leq 3\right\}$ are not $\mathcal{C}_{s}$-close to each other if the corresponding $M$-sets belong to different areas and at least one of the sets is painted dark, like $M_{1}^{s}$ and $M_{2}$, but not $M_{1}$ and $M_{2}$.]

Let us show how our machinery can be utilized to conceive a sequential test for the problem of predicting the outcome of $L$-candidate elections. Thus, our goal is, given a small threshold $\delta$, to decide upon $L$ hypotheses (2.94). Let us act as follows.

1. We select a factor $\theta \in(0,1)$, say, $\theta=10^{-1 / 4}$, and consider thresholds $\delta_{1}=\theta$, $\delta_{2}=\theta \delta_{1}, \delta_{3}=\theta \delta_{2}$, and so on, until for the first time we get a threshold $\leq \delta$; to save notation, we assume that this threshold is exactly $\delta$, and let the number of the thresholds be $S$.
2. We split the risk $\epsilon$ that we want to guarantee into $S$ portions $\epsilon_{s}, 1 \leq s \leq S$ (e.g., equal ones), so that all $\epsilon_{s}$ are positive and
$$\sum_{s=1}^{S} \epsilon_{s}=\epsilon .$$
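The two preparatory steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `threshold_schedule` and `split_risk` are ours, the risk is split equally (one of the options the text allows), and, since the text assumes for notational convenience that the last threshold equals $\delta$ exactly, the sketch simply sets the final threshold to $\delta$. A small relative tolerance `rel_tol` is also our addition, to keep floating-point rounding from adding a spurious extra stage.

```python
def threshold_schedule(delta, theta=10 ** (-0.25), rel_tol=1e-9):
    """Return thresholds delta_1 = theta, delta_2 = theta * delta_1, ...

    The sequence stops at the first threshold that is <= delta; that last
    threshold is then replaced by delta itself (our reading of the text's
    "we assume that this threshold is exactly delta").
    """
    if not 0.0 < theta < 1.0:
        raise ValueError("theta must lie in (0, 1)")
    if not 0.0 < delta < 1.0:
        raise ValueError("delta must lie in (0, 1)")
    thresholds = []
    current = theta
    # Keep every threshold that is still strictly above delta
    # (up to the rounding tolerance).
    while current > delta * (1.0 + rel_tol):
        thresholds.append(current)
        current *= theta
    # The first threshold <= delta closes the schedule; store delta itself.
    thresholds.append(delta)
    return thresholds


def split_risk(eps, S):
    """Split the total risk eps into S equal positive portions."""
    if S < 1:
        raise ValueError("S must be at least 1")
    if eps <= 0.0:
        raise ValueError("eps must be positive")
    return [eps / S] * S


if __name__ == "__main__":
    deltas = threshold_schedule(delta=0.05)
    eps_parts = split_risk(eps=0.01, S=len(deltas))
    for s, (d, e) in enumerate(zip(deltas, eps_parts), start=1):
        print(f"stage {s}: delta_s = {d:.4f}, eps_s = {e:.5f}")
```

Because the thresholds shrink geometrically, the number of stages $S$ grows only logarithmically in $1/\delta$, so even an equal split leaves each stage a reasonable share of the risk.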


