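Machine learning is a field of study devoted to understanding and building methods that "learn," that is, methods that use data to improve performance on some set of tasks. Machine learning algorithms build a model from sample data (known as training data) in order to make predictions or decisions without being explicitly programmed to do so. They are used in a wide range of applications, such as medicine, email filtering, speech recognition, and computer vision, where developing a conventional algorithm for the task is difficult or infeasible.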
Classification and the Learning Pipeline
So far, we have considered supervised learning tasks in which the output variable $y$ is a real number, that is, $y \in \mathbb{R}$. Often, however, we deal with problems whose output variables are binary or categorical. For example, we might be interested in questions such as:
- Will a user click on a product or advertisement? (binary outcome)
- What category of object does an image contain? (multiclass)
- What product is a user most likely to purchase next? (multiclass)
- Which of the two products would a user prefer? (binary)
In this chapter, we will explore how to design classification algorithms for tasks like those above, and in particular explore a classifier that extends the ideas behind regression from Chapter 2 to classification problems.
Logistic regression sets up classification in a probabilistic framework by transforming the predictions $X \cdot \theta$ that we used when building regressors into probabilities associated with observing a particular label $y$. By associating a probability with a particular label, and thereby with all of the labels in a dataset, we can again develop prediction frameworks that are differentiable and can be optimized using gradient-based approaches, much as we saw in Section 2.5.
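The idea above can be illustrated with a minimal sketch. It assumes the transformation is the standard sigmoid function and that the parameters are fit by gradient ascent on the log-likelihood of the observed labels. The function names (`sigmoid`, `dot`, `log_likelihood`, `fit`), the learning rate, and the toy data are all hypothetical choices made for illustration, not taken from the text.

```python
import math

def sigmoid(z):
    """Map a real-valued score to a probability in (0, 1), in a numerically stable way."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def dot(x, theta):
    """Inner product x . theta for one feature vector."""
    return sum(xi * ti for xi, ti in zip(x, theta))

def log_likelihood(X, y, theta):
    """Sum of log-probabilities that the model assigns to the observed binary labels."""
    total = 0.0
    for x, label in zip(X, y):
        p = sigmoid(dot(x, theta))
        total += math.log(p) if label == 1 else math.log(1.0 - p)
    return total

def fit(X, y, lr=0.1, steps=500):
    """Plain gradient ascent on the log-likelihood (assumed step size and iteration count)."""
    theta = [0.0] * len(X[0])
    for _ in range(steps):
        grad = [0.0] * len(theta)
        for x, label in zip(X, y):
            err = label - sigmoid(dot(x, theta))  # gradient contribution of one sample
            for j, xj in enumerate(x):
                grad[j] += err * xj
        theta = [t + lr * g for t, g in zip(theta, grad)]
    return theta

# Toy data (hypothetical): a constant feature for the intercept plus one real feature.
# The labels overlap, so the classes are not perfectly separable.
X = [[1.0, -2.0], [1.0, -1.0], [1.0, -0.5], [1.0, 0.5], [1.0, 1.0], [1.0, 2.0]]
y = [0, 0, 1, 0, 1, 1]

theta = fit(X, y)
print("theta:", theta)
print("P(y=1 | x=2):", sigmoid(dot([1.0, 2.0], theta)))
```

Because the objective is a smooth function of $\theta$, each update only needs the gradient, which is the property the text relies on when it calls the framework differentiable.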
Ultimately, logistic regression is just one of dozens of classification schemes. We describe it here rather than alternatives (such as Support Vector Machines (Cortes and Vapnik, 1995) or Random Forest Classifiers (Ho, 1995)) mainly because it more closely matches the approaches we will develop in later chapters. The same type of modeling approach will be used throughout this book, for example when building recommender systems in Chapter 5 or generating fashionable outfits in Chapter 9. We will briefly discuss the merits of alternative classification approaches in Section 3.2.
After exploring classification techniques in Section 3.1, we will explore evaluation strategies for classification models in Section 3.3, much as we did for regression models in Chapter 2.
Finally, we will explore the idea of the learning pipeline. Having developed techniques for regression (Chapter 2) and classification, along with evaluation strategies for both, in Section 3.4 we will explore how to compare models, how to ensure that our results are significant, and how to ensure that our models generalize well to unseen data. This type of end-to-end strategy for model training will be used whenever we train supervised learning models throughout the remainder of the book.
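As a rough illustration of what checking generalization to unseen data can involve, the sketch below partitions a dataset into training, validation, and test portions. This is a common convention rather than something the passage specifies: the function name `split_dataset`, the 60/20/20 proportions, and the fixed seed are all assumptions.

```python
import random

def split_dataset(data, train_frac=0.6, valid_frac=0.2, seed=0):
    """Shuffle a copy of the data and cut it into train / validation / test parts."""
    rng = random.Random(seed)      # fixed seed so the split is reproducible
    shuffled = list(data)
    rng.shuffle(shuffled)
    n_train = round(len(shuffled) * train_frac)
    n_valid = round(len(shuffled) * valid_frac)
    train = shuffled[:n_train]
    valid = shuffled[n_train:n_train + n_valid]
    test = shuffled[n_train + n_valid:]   # whatever remains is held out for testing
    return train, valid, test

# Hypothetical dataset of 100 example identifiers.
data = list(range(100))
train, valid, test = split_dataset(data)
print(len(train), len(valid), len(test))
```

Under this convention, models would be fit on the training portion, compared on the validation portion, and reported once on the test portion, so that the reported figure reflects data the model never influenced.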
Other Classification Techniques
In our introduction to classification, we have only discussed a single classification technique: Logistic Regression. Our choice to explore this particular technique was largely a practical one: the idea of associating a probability with a particular outcome (as in eq. (3.5)) and estimating that probability via a differentiable function (to facilitate gradient ascent) will appear repeatedly as we develop more and more complex models.
However, the technique we have explored is only one approach to building classifiers. The specific choice to map binary labels to continuous probabilities via a smooth function carries hidden assumptions and limitations, meaning that logistic regression is not the ideal classifier for every situation. Below we present a few alternatives, largely as further reading and to highlight specific situations where logistic regression may not be the preferable choice.
Support Vector Machines: While logistic regressors optimize a probability associated with a set of observed labels, they do not explicitly minimize the number of mistakes made by the classifier. Support Vector Machines (SVMs) (Cortes and Vapnik, 1995) replace the sigmoid function in Figure 3.1 with an expression that assigns zero cost to correctly classified examples and a positive cost to incorrectly classified examples (in proportion to the confidence of the prediction $x \cdot \theta$). This distinction is fairly subtle: while every sample influences the optimal value of $\theta$ for a logistic regressor, the solution found by an SVM is entirely determined by the few samples closest to the classification boundary, or those that are misclassified. Conceptually, it is appealing for a classifier to focus on the most "difficult" samples in this way. Note, however, that in many cases (and notably when building recommender systems) our goal is to optimize ranking performance rather than classification accuracy (as we will discuss in Section 3.3.3), so giving special attention to the most ambiguous examples is not necessarily desirable.
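The contrast between the two cost functions can be made concrete with a small sketch. It assumes labels encoded as $-1$ and $+1$ and the standard hinge loss with a margin of 1, a detail the passage does not spell out. The function names and the example scores are hypothetical.

```python
import math

def logistic_loss(label, score):
    """Negative log-likelihood for a label in {-1, +1}: log(1 + exp(-label * score))."""
    m = label * score
    if m >= 0:
        return math.log1p(math.exp(-m))
    return -m + math.log1p(math.exp(m))   # stable form for large negative margins

def hinge_loss(label, score):
    """Hinge loss (assumed margin of 1): zero once an example is confidently correct."""
    return max(0.0, 1.0 - label * score)

# Scores play the role of x . theta for a positive example (label = +1).
for score in [-2.0, 0.5, 3.0]:
    print(score, round(logistic_loss(1, score), 4), hinge_loss(1, score))
```

For the confidently correct score the hinge loss is exactly zero, so that example has no influence on the solution, whereas the logistic loss stays positive for every finite score. For the misclassified score the hinge loss grows linearly with how confidently wrong the prediction is.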