September 22, 2022
Economics Assignment Help | Development Economics Assignment and Exam Help | Inclusiveness of Access and Use Affects the Representativeness of Big Data

Access to and use of mobiles and the Internet affect not only more traditional surveys that use these technologies but also the representativeness of data captured for big data analytics. Data captured may not reflect what a person actually thinks and does. A person may not have a data trail or data exhaust, or may be sharing or borrowing a phone or an account listed under someone else’s name. Additionally, big data sources are representative of different people and groups. For example, Twitter users tend to be younger, wealthier, more educated and more likely to live in urban areas than Facebook users, and the Twitter platform only represents a small proportion of the population, especially in low-income countries (Abreu Lopes, Bailur and Barton-Owen, 2018).
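One rough way to check the representativeness concern described above is to compare each group's share in a big data source with its share in the population of interest. The sketch below is illustrative only: the function names, the 0.8 threshold, and all of the shares are hypothetical assumptions rather than figures from the cited sources.

```python
def representation_ratios(sample_shares, population_shares):
    """Return sample share divided by population share for each group.

    A ratio below 1 means the group is under-represented in the data
    source; a ratio above 1 means it is over-represented.
    """
    ratios = {}
    for group, pop_share in population_shares.items():
        if pop_share <= 0:
            raise ValueError(f"population share for {group!r} must be positive")
        # Groups absent from the data source count as a share of zero.
        ratios[group] = sample_shares.get(group, 0.0) / pop_share
    return ratios


def underrepresented(ratios, threshold=0.8):
    """List groups whose ratio falls below an (assumed) threshold."""
    return sorted(group for group, ratio in ratios.items() if ratio < threshold)


# Hypothetical shares, invented purely for illustration.
population = {"urban": 0.40, "rural": 0.60}       # e.g. from a census
platform_sample = {"urban": 0.85, "rural": 0.15}  # e.g. from platform users

ratios = representation_ratios(platform_sample, population)
missing = underrepresented(ratios)
print(ratios)
print(missing)
```

A check like this only flags who is missing along the dimensions an evaluator thinks to measure. It cannot detect shared or borrowed phones and accounts, which still require contextual knowledge.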

If evaluators use big data, they need to use evaluation methods that ensure that the most marginalized or vulnerable are fairly represented. In addition to considering individual access and use of mobiles and the Internet, it may be helpful to think of data as coming from four different “buckets”. This way the type and source of data can be reviewed to determine whether the data is inclusive and, if not, who is missing and how those voices can be included (Raftree, 2017). Different kinds of data present more or less stark choices for the organizations using them and for the end evaluands and users.
1. Traditional data. In this case, researchers, evaluators and/or enumerators are in control of the process. They design a questionnaire or data gathering process and go out and collect qualitative or quantitative data; they send out a survey and request feedback; they do focus group discussions or interviews; they collect data on digital devices or on paper and digitize it later for analysis and decision-making. The sampling process is tightly controlled and is deliberately constructed to fit a predetermined criterion of quality. However, such control of the quality and soundness of the data means that it is resource-intensive and of limited size. This kind of data represents the voice of those precisely selected by the agency and those who are intended to be heard for the purpose of the evaluation.

Bias in Big Data, Artificial Intelligence and Machine Learning

The application of big data and big data analytics to development evaluation is still in its infancy. It is only in the past ten years that development and UN agencies have begun thinking about these data sources and their predictive potential, and even more recently that their role in evaluation has been examined (see Chapter 3). Though impressive capacity to process data exists, this capacity has advanced far more quickly than has human capacity to understand its implications, and ethical and legal frameworks have not yet caught up.

In her book Weapons of Math Destruction, Cathy O’Neil details several recent cases in which big data algorithms have directly caused harm, including the financial crash of the late 2000s, school ranking, private universities and policing. Though some big data algorithms can be healthy (baseball managers use them to devise plays), O’Neil says this is only possible if algorithms are open and transparently created, if they can be scrutinized and unpacked, if unintended consequences are tracked and adjusted for when they are negative, and if the algorithms are not causing damage or harm. Unfortunately, in many cases, those creating algorithms purposefully target and/or take advantage of more vulnerable people.

In the case of development, assuming that there is good intent, the question becomes one of the unintended consequences that could arise from creating algorithms where there is insufficient data. O’Neil notes that proxy indicators often stand in where there is an absence of hard data, and that they can lead to perverse incentives and to distortions of monitoring and evaluation systems that cause harm. If algorithms are not continuously tested and adjusted using fresh data, they can easily become stale. And if they are created by people with little contextual or cultural awareness of how a system actually works,

Predictive capabilities could go a long way towards improving development approaches and outcomes. But humans design the algorithms used to make these predictions, so the algorithms contain persistent and historical biases. The attractive claim of big data is that it can turn the qualitative into the quantitative. Yet its claims to objectivity and accuracy are misleading. As Boyd and Crawford (2011) note, “working with Big Data is still subjective, and what it quantifies does not necessarily have a closer claim on objective truth – particularly when considering messages from social media sites”.


