Systematizing Audit in Algorithmic Recruitment
Abstract
1. Introduction
- Principles: A large number of academic groups, governmental bodies, private companies, and non-governmental organizations have published their own principles for the ethical use of AI (European Commission 2021; Kazim et al. 2021; Piano 2020);
- Processes: The second phase, typified by an ethical-by-design approach, was an engineering-focused problem-solving exercise. An important dimension of this phase was the recognition that factors such as governance, social impact, legal compliance, and engineering standards should be considered during the processes involved in the creation and introduction of AI technologies (Leslie 2019);
- Assurance and audit: We believe the third (and current) phase is concerned with the need to standardize the use of AI, where such standardization can take the form of legal compliance, meeting industry best practice, or sector-specific regulation (Kazim et al. 2021).
2. Auditing AI-Driven Recruitment Systems
2.1. Algorithmic Recruitment
2.2. Discrimination in Algorithmic Hiring
3. Assessing AI-Driven Recruitment Systems through Audit
3.1. Risk
- Compliance risk: A system should not contravene the laws of the jurisdictions in which it operates. Several bodies of law may be applicable, including labor laws, antidiscrimination legislation (e.g., the Civil Rights Act of 1964), and rights to redress (e.g., the EU GDPR). Failure to comply with these laws risks litigation;
- Reputational: As evidenced by the fallout and public concern following high-profile cases of harm (cf. the bias found in Amazon’s AI recruitment tool; Dastin 2018), companies that are seen to have unethical recruitment practices are liable to suffer reputational damage;
- Financial: Financial loss can be incurred through fines or lawsuits initiated by customers and regulators, loss of commercial earnings as customers leave, and the costs of poor recruitment decisions, such as hiring the wrong candidate or missing out on top performers;
- Governance: System developers and deployers should always remain in control of their systems and be able to monitor and report on them. The use of such technologies increases the risk of loss of control; hence, good accountability measures, such as reporting and documentation, are required.
3.2. Auditing Stages
- Data: Input data are analyzed to ensure that they do not include any identifiable or protected characteristics (or proxies for them). Output data are used to compute performance and fairness metrics for different disadvantaged groups, as well as to assess the robustness of the outputs under different perturbations (e.g., adding noise, removing features); see the proxy-screening sketch after this list;
- Model: The choice of model, its parameters, and its objective function influence the explainability and robustness of the process. The auditor will also look for signs of overfitting, which could hamper the generalization capabilities of the model and limit its applicability outside the population it was trained on;
- Development: The design, building process, and training routine of the algorithm, together with the associated documentation, are audited. This draws on information about the anonymization process, the reasons why certain features are used, and whether any adverse impact analysis has been conducted to identify whether the model results in group differences, and at what stage this analysis takes place.
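As a minimal illustration of the data-stage checks described above (a sketch under assumed names and an assumed threshold, not the audit procedure itself), one can screen input features for correlation with a protected attribute to flag potential proxies:

```python
# Minimal proxy-screening sketch for the data stage. The dataset, column
# names, and the 0.3 threshold are illustrative assumptions, not a standard.
import pandas as pd

def flag_proxy_features(X: pd.DataFrame, protected: pd.Series, threshold: float = 0.3):
    """Return input features whose absolute correlation with the
    protected attribute exceeds the threshold (candidate proxies)."""
    corr = X.corrwith(protected.astype(float)).abs()
    return corr[corr > threshold].sort_values(ascending=False)

# Hypothetical usage:
# proxies = flag_proxy_features(features_df, applicants["gender"].eq("female"))
```

Correlation only captures linear association; in practice an auditor might complement this with predictive tests, such as how well the features jointly predict the protected attribute.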
3.3. Verticals
- Bias: Ensuring systems act fairly across protected characteristics. This is perhaps the most acute vertical of concern, as recruitment is a sector that impacts an individual’s life prospects. Indeed, that a system is fair is not only an ethical imperative but also a legal one. A number of bias metrics can be used to assess how models impact unprivileged or minority groups (see IBM Research 2021; Cohen 1988; Morris and Lobsenz 2000); the first sketch after this list illustrates how they can be computed:
  ○ Disparate Impact: The ratio of the selection rate of the unprivileged group to that of the privileged group, where the selection rate is the proportion of candidates belonging to a specific group who are hired;
  ○ Statistical Parity: Compares the selection rates of the unprivileged and privileged groups to determine whether a prediction is independent of a protected characteristic. If statistical parity holds, the selection rates of the two groups are the same;
  ○ Cohen’s d: An effect size indicating the standardized difference between the selection rates of the unprivileged and privileged groups. Similar to Statistical Parity, but divided by the pooled standard deviation of the selection rates of the two groups;
  ○ 2-SD Rule: If the difference between the expected and observed selection rates for a specific group exceeds two standard deviations (SDs), the outcome is considered unfair;
  ○ Equal Opportunity Difference: The difference between the true positive rates of the unprivileged and privileged groups. The true positive rate measures the proportion of candidates who are correctly allocated to a favorable outcome by the model, as determined by checking against an alternative model or process, such as manual applicant screening;
  ○ Average Odds Difference: The average of the differences in the false positive rates and true positive rates of the unprivileged and privileged groups. The false positive rate measures the proportion of candidates who should receive an unfavorable outcome, as determined by an alternative measure, but are instead given a favorable outcome by the model.
- Transparency: This vertical is bidimensional, encompassing transparency in governance and decision-making procedures, and system explainability. With respect to the former, relevant areas for transparency include documentation and standardization of assessments; the latter is concerned with the extent to which the algorithmic system can be explained. In the engineering literature, the degree of transparency is described as black-box or white-box, where black-box models lack transparency and their internal workings cannot be explained. This relates to the broad area known as explainable AI (often referred to as XAI) (Adadi and Berrada 2018; Arrieta et al. 2020). In the recruitment context, good practice entails being able to provide a reason why a candidate was not hired, enabling the candidate to understand the decision and potentially improve in that dimension (see the reason-codes sketch after this list). Furthermore, users of the system can then understand why it has made a given recommendation, and thus whether the reasoning was legitimate (and fair);
- Safety (also referred to as Robustness): A safe and robust model is one that remains accurate when applied in different contexts or to different datasets. While a motivating factor in the use of such technologies is increased efficiency, it is critical that models can be applied fairly across all candidates, to avoid unfair decisions and filtering out the best candidates. For a system to be trustworthy, it should perform at an acceptable level of accuracy and be robust to changes in the dataset, such that a small change will not radically alter its performance (see the perturbation sketch after this list). For recruitment, it is paramount that users are confident that systems are robust in this way;
- Privacy: Aside from issues of data stewardship, algorithms can process data in ways that reveal the nature of the data they were trained on or that they utilize when providing an output, such as a decision or recommendation. Here, the concerns are data leakage and, more generally, whether data minimization principles are being respected (see the re-identification sketch after this list). Privacy concerns are particularly relevant in the context of recruitment, given that sensitive individual data, such as race and gender, are being processed.
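To make the bias metrics above concrete, the following is a minimal sketch (the function names are illustrative, not from the paper or a specific library; binary favorable/unfavorable outcomes are assumed, with group membership and ground-truth labels supplied by the auditor):

```python
# All inputs are NumPy 0/1 arrays: y_pred holds model decisions,
# y_true holds outcomes from an alternative measure (e.g., manual screening).
import numpy as np

def selection_rate(y_pred):
    """Proportion of a group's candidates given the favorable outcome (1)."""
    return float(np.mean(y_pred))

def disparate_impact(y_pred_unpriv, y_pred_priv):
    """Ratio of unprivileged to privileged selection rates; the US
    'four-fifths rule' flags ratios below 0.8 as adverse impact."""
    return selection_rate(y_pred_unpriv) / selection_rate(y_pred_priv)

def statistical_parity_difference(y_pred_unpriv, y_pred_priv):
    """Difference in selection rates; 0 indicates statistical parity."""
    return selection_rate(y_pred_unpriv) - selection_rate(y_pred_priv)

def cohens_d(y_pred_unpriv, y_pred_priv):
    """Selection-rate difference standardized by the pooled SD."""
    n_u, n_p = len(y_pred_unpriv), len(y_pred_priv)
    var_u = np.var(y_pred_unpriv, ddof=1)
    var_p = np.var(y_pred_priv, ddof=1)
    pooled_sd = np.sqrt(((n_u - 1) * var_u + (n_p - 1) * var_p) / (n_u + n_p - 2))
    return statistical_parity_difference(y_pred_unpriv, y_pred_priv) / pooled_sd

def tpr(y_true, y_pred):
    """True positive rate: share of deserving candidates selected."""
    return float(np.mean(y_pred[y_true == 1]))

def fpr(y_true, y_pred):
    """False positive rate: share of undeserving candidates selected."""
    return float(np.mean(y_pred[y_true == 0]))

def equal_opportunity_difference(yt_u, yp_u, yt_p, yp_p):
    """TPR of the unprivileged group minus TPR of the privileged group."""
    return tpr(yt_u, yp_u) - tpr(yt_p, yp_p)

def average_odds_difference(yt_u, yp_u, yt_p, yp_p):
    """Mean of the FPR and TPR differences between the two groups."""
    fpr_diff = fpr(yt_u, yp_u) - fpr(yt_p, yp_p)
    return 0.5 * (fpr_diff + equal_opportunity_difference(yt_u, yp_u, yt_p, yp_p))
```

Libraries such as AI Fairness 360 (IBM Research 2021) provide production-grade versions of these metrics; the sketch is only meant to pin down the definitions.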
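For the explainability dimension, the reason-codes sketch below uses a linear model whose coefficient-times-value contributions are directly interpretable. The features and data are hypothetical, and the paper does not prescribe this method; non-linear systems would need model-agnostic XAI tools from the literature cited above.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

FEATURES = ["numeracy_score", "verbal_score", "structured_interview"]

# Hypothetical standardized assessment data, for illustration only.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, len(FEATURES)))
y = (X @ np.array([1.0, 0.8, 1.2]) + rng.normal(size=500) > 0).astype(int)

model = LogisticRegression().fit(X, y)

def reason_codes(x):
    """Rank features by their signed contribution to this candidate's
    score; valid as an explanation only because the model is linear."""
    contributions = model.coef_[0] * x
    order = np.argsort(contributions)  # most negative (hurtful) first
    return [(FEATURES[i], round(float(contributions[i]), 2)) for i in order]

rejected = X[model.predict(X) == 0][0]
print(reason_codes(rejected))  # the candidate's weakest dimensions first
```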
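For safety/robustness, one simple perturbation probe (a sketch assuming a fitted classifier with a scikit-learn-style predict method; the noise scale and any pass/fail tolerance are auditor choices, not standards) is to perturb inputs slightly and measure how often hiring decisions flip:

```python
import numpy as np

def decision_flip_rate(model, X, noise_scale=0.05, n_trials=20, seed=0):
    """Average share of decisions that change when small Gaussian noise
    is added to the (standardized) input features."""
    rng = np.random.default_rng(seed)
    base = model.predict(X)
    flip_rates = []
    for _ in range(n_trials):
        X_noisy = X + rng.normal(scale=noise_scale, size=X.shape)
        flip_rates.append(np.mean(model.predict(X_noisy) != base))
    return float(np.mean(flip_rates))

# Hypothetical usage: flag the system if decision_flip_rate(model, X_test)
# exceeds an agreed tolerance (e.g., 0.05).
```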
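Finally, for privacy, a re-identification check in the spirit of k-anonymity can reveal whether combinations of quasi-identifiers single out small groups of applicants. This is a sketch with hypothetical column names and an assumed k; it complements rather than replaces legal data protection review.

```python
import pandas as pd

def small_groups(df: pd.DataFrame, quasi_identifiers: list, k: int = 5):
    """Return quasi-identifier combinations shared by fewer than k
    applicants; such records risk being re-identifiable."""
    sizes = df.groupby(quasi_identifiers).size()
    return sizes[sizes < k]

# Hypothetical usage:
# risky = small_groups(applicants, ["postcode", "age_band", "job_family"])
```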
3.4. Levels of Access
3.5. Assurance
4. Contextual Factors
4.1. Human Oversight
- Human-in-the-loop: Involves a human capable of intervention in every decision cycle;
- Human-on-the-loop: Involves a human who is capable of intervening during the design cycle or of monitoring the system’s operation;
- Human-in-command: Involves a human capable of overseeing the overall activity of the system, with the ability to decide when and how to intervene in any particular situation.
4.2. Where in the Pipeline?
4.3. What Is Explained and to Whom?
4.4. What Is Being Assessed?
5. Conclusions
Author Contributions
Funding
Conflicts of Interest
1. Note that we use the term ‘algorithms’ to encompass a range of systems. At its most basic, an algorithm is simply a set of instructions designed to perform a specific task. Our main referent in this article is data science algorithms, which include those that fall under AI paradigms such as machine learning or knowledge-based systems, used in human resources applications such as psychometric testing and recruitment (Koshiyama et al. 2020).
2. AI is also used to automate other human resources (HR) processes where decisions are being made about people (e.g., posting job listings, matching to job roles, performance measurement). We will not explore these dimensions in this paper, but will rather touch upon them tangentially at points.
References
- Adadi, Amina, and Mohammed Berrada. 2018. Peeking inside the black-box: A survey on explainable artificial intelligence (XAI). IEEE Access 6: 52138–60.
- Ajunwa, Ifeoma, Kate Crawford, and Joel S. Ford. 2016. Health and big data: An ethical framework for health information collection by corporate wellness programs. Journal of Law, Medicine and Ethics 44: 474–80.
- Arrieta, Alejandro Barredo, Natalia Díaz-Rodríguez, Javier Del Ser, Adrien Bennetot, Siham Tabik, Alberto Barbado, Salvador García, Sergio Gil-López, Daniel Molina, Richard Benjamins, and et al. 2020. Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI. Information Fusion 58: 82–115.
- Arslan, Ayse Kok. 2020. A Design Framework for Auditing AI. Journal of Multidisciplinary Engineering Science and Technology (JMEST) 7: 12768–76.
- Bartneck, Christoph, Christoph Lütge, Alan Wagner, and Sean Welsh. 2021. An Introduction to Ethics in Robotics and AI. Cham: Springer Nature, p. 117.
- Bender, Silke, and Alan Fish. 2000. The transfer of knowledge and the retention of expertise: The continuing need for global assignments. Journal of Knowledge Management 4: 125–37.
- Cedefop. 2020. Available online: https://www.cedefop.europa.eu/en/news-and-press/news/artificial-intelligence-post-pandemic-world-work-and-skills (accessed on 19 August 2021).
- Cohen, Jacob. 1988. Statistical Power Analysis for the Behavioral Sciences, 2nd ed. London: Routledge.
- Dastin, Jeffrey. 2018. Amazon Scraps Secret AI Recruiting Tool That Showed Bias against Women. Available online: https://www.reuters.com/article/us-amazon-com-jobs-automation-insight-idUSKCN1MK08G (accessed on 13 September 2021).
- Davenport, Thomas H. 2018. From analytics to artificial intelligence. Journal of Business Analytics 1: 73–80.
- Davenport, Thomas, and Ravi Kalakota. 2019. The potential for artificial intelligence in healthcare. Future Healthcare Journal 6: 94–98.
- Dignum, Virginia. 2018. Ethics in artificial intelligence: Introduction to the special issue. Ethics and Information Technology 20: 1–3.
- European Commission. 2020. White Paper on Artificial Intelligence: A European Approach to Excellence and Trust. Available online: https://ec.europa.eu/commission/sites/beta-political/files/political-guidelines-next-commission_en.pdf (accessed on 13 September 2021).
- European Commission. 2021. Proposal for a Regulation Laying down Harmonised Rules on Artificial Intelligence. Available online: https://digital-strategy.ec.europa.eu/en/library/proposal-regulation-laying-down-harmonised-rules-artificial-intelligence (accessed on 13 September 2021).
- German Data Ethics Commission. 2018. Opinion of the Data Ethics Commission. Available online: https://www.bmjv.de/SharedDocs/Downloads/DE/Themen/Fokusthemen/Gutachten_DEK_EN_lang.pdf;jsessionid=765C0C06EB1D627F1FDA363CDE73F4EC.2_cid297?__blob=publicationFile&v=3 (accessed on 13 September 2021).
- Hadjimichael, Demetris, and Haridimos Tsoukas. 2019. Toward a better understanding of tacit knowledge in organizations: Taking stock and moving forward. Academy of Management Annals 13: 672–703.
- Hagendorff, Thilo. 2020. The ethics of AI ethics: An evaluation of guidelines. Minds and Machines 30: 99–120.
- Hannák, Aniko, Claudia Wagner, David Garcia, Alan Mislove, Markus Strohmaier, and Christo Wilson. 2017. Bias in online freelance marketplaces: Evidence from TaskRabbit and Fiverr. In Proceedings of the ACM Conference on Computer Supported Cooperative Work. New York: Association for Computing Machinery, pp. 1914–33.
- IBM Research. 2021. AI Fairness 360. Armonk: IBM Research.
- Int. 1894–2020. Sale of Automated Employment Decision Tools. The New York City Council, Committee on Technology (27 February 2020). Available online: https://legistar.council.nyc.gov/LegislationDetail.aspx?ID=4344524&GUID=B051915D-A9AC-451E-81F8-6596032FA3F9&Options=Advanced&Search (accessed on 13 September 2021).
- Jobin, Anna, Marcello Ienca, and Effy Vayena. 2019. The global landscape of AI ethics guidelines. Nature Machine Intelligence 1: 389–99.
- Kazim, Emre, and Adriano Soares Koshiyama. 2020a. A high-level overview of AI ethics. Patterns 2: 100314.
- Kazim, Emre, and Adriano Koshiyama. 2020b. AI assurance processes. SSRN Electronic Journal, 1–9.
- Kazim, Emre, and Adriano Soares Koshiyama. 2021. EU proposed AI legal framework. SSRN Electronic Journal, 1–9.
- Kazim, Emre, Danielle Mendes Thame Denny, and Adriano Koshiyama. 2021. AI auditing and impact assessment: According to the UK information commissioner’s office. AI and Ethics 1: 301–10.
- Koshiyama, Adriano, Emre Kazim, Philip Treleaven, Pete Rai, Lukasz Szpruch, Giles Pavey, Ghazi Ahamat, Franziska Leutner, Randy Goebel, Andrew Knight, and et al. 2021. Towards algorithm auditing: A survey on managing legal, ethical and technological risks of AI, ML and associated algorithms. SSRN Electronic Journal.
- Koshiyama, Adriano, Nick Firoozye, and Philip Treleaven. 2020. Algorithms in future capital markets. SSRN Electronic Journal.
- Leslie, David. 2019. Understanding artificial intelligence ethics and safety. London: The Alan Turing Institute.
- Mehrabi, Ninareh, Fred Morstatter, Nripsuta Saxena, Kristina Lerman, and Aram Galstyan. 2021. A survey on bias and fairness in machine learning. ACM Computing Surveys 54: 1–35.
- Mokander, Jakob, and Luciano Floridi. 2021. Ethics-based auditing to develop trustworthy AI. arXiv preprint arXiv:2105.00002.
- Morris, Scott B., and Russell E. Lobsenz. 2000. Significance tests and confidence intervals for the adverse impact ratio. Personnel Psychology 53: 89–111.
- Munoko, Ivy, Helen L. Brown-Liburd, and Miklos Vasarhelyi. 2020. The ethical implications of using artificial intelligence in auditing. Journal of Business Ethics 167: 209–34.
- Pasquale, Frank. 2019. Data-informed duties in AI development. Columbia Law Review 119: 1917. Available online: https://heinonline.org/HOL/Page?handle=hein.journals/clr119&div=59&g_sent=1&casa_token=8cLSvOz1eWwAAAAA:K2IW3PgIJxZiklfvoYg99zqtSbq-gommj8eILC028Wpo-Ow9rb95UZVpWyG_g25LimPyploK (accessed on 13 September 2021).
- Pedreshi, Dino, Salvatore Ruggieri, and Franco Turini. 2008. Discrimination-aware data mining. Paper presented at the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Las Vegas, NV, USA, August 24–27; pp. 560–68.
- Piano, Samuele Lo. 2020. Ethical principles in machine learning and artificial intelligence: Cases from the field and possible ways forward. Humanities and Social Sciences Communications 7: 9.
- Raji, Inioluwa Deborah, Andrew Smart, Rebecca N. White, Margaret Mitchell, Timnit Gebru, Ben Hutchinson, Jamila Smith-Loud, Daniel Theron, and Parker Barnes. 2020. Closing the AI accountability gap: Defining an end-to-end framework for internal algorithmic auditing. Paper presented at the 2020 Conference on Fairness, Accountability, and Transparency, Barcelona, Spain, January 27–30; pp. 33–44.
- Real-Time Talent. 2016. IT Fact Sheet. Available online: http://www.realtimetalent.org/wp-content/uploads/2016/07/RTT_2016_April_TC_IT_Factsheet.pdf (accessed on 13 September 2021).
- Rieke, Aaron, Miranda Bogen, and David G. Robinson. 2018. Public Scrutiny of Automated Decisions: Early Lessons and Emerging Methods. Available online: https://apo.org.au/sites/default/files/resource-files/2018-02/apo-nid210086.pdf (accessed on 13 September 2021).
- Robertson, Ronald E., David Lazer, and Christo Wilson. 2018. Auditing the personalization and composition of politically-related search engine results pages. Paper presented at the 2018 World Wide Web Conference (WWW ’18), Lyon, France, April 23–27; pp. 955–65.
- Rushby, John. 1988. Quality Measures and Assurance for AI Software. NASA Contractor Report 4187. Available online: https://ntrs.nasa.gov/search.jsp?R=19880020920 (accessed on 13 September 2021).
- Ryan, John R. 1982. Software product quality assurance. Paper presented at the AFIPS 1982 National Computer Conference, Houston, TX, USA, June 7–10; pp. 393–98.
- Schmidt, Frank L., and John E. Hunter. 2016. The Validity and Utility of Selection Methods in Personnel Psychology: Practical and Theoretical Implications of 100 Years of Research Findings. Working Paper. Available online: https://home.ubalt.edu/tmitch/645/session%204/Schmidt%20&%20Oh%20MKUP%20validity%20and%20util%20100%20yrs%20of%20research%20Wk%20PPR%202016.pdf (accessed on 13 September 2021).
- Shneiderman, Ben. 2016. Opinion: The dangers of faulty, biased, or malicious algorithms requires independent oversight. Proceedings of the National Academy of Sciences 113: 13538–40.
- Umbrello, Steven, and Ibo van de Poel. 2021. Mapping value sensitive design onto AI for social good principles. AI and Ethics 1: 1–14.
- Voas, Jeffrey, and Keith Miller. 2006. Software certification services: Encouraging trust and reasonable expectations. IT Professional 8: 39–44.
- Woolley, Anita Williams, Ishani Aggarwal, and Thomas W. Malone. 2015. Collective intelligence and group performance. Current Directions in Psychological Science 24: 420–24.
- Wright, James, and David Atkinson. 2019. The Impact of Artificial Intelligence within the Recruitment Industry: Defining a New Way of Recruiting. Available online: https://www.cfsearch.com/wp-content/uploads/2019/10/James-Wright-The-impact-of-artificial-intelligence-within-the-recruitment-industry-Defining-a-new-way-of-recruiting.pdf (accessed on 13 September 2021).