The talk will be given by Erik Derner, postdoctoral researcher at the ELLIS Alicante Foundation, on Wednesday, 8 November at 11:30 a.m. in the Salón de Actos of Politécnica IV.
Erik Derner, postdoctoral researcher at ELLIS Alicante, will give a talk on biases in the text corpora used to train LLMs on 8 November at 11:30 a.m. in the Salón de Actos of Politécnica IV.
Speaker:
Dr. Erik Derner, ELLIS Alicante
Title:
Towards Unbiased LLMs from the Roots: Exploring Biases in Language Corpora
Abstract:
In the rapidly advancing field of Natural Language Processing (NLP), driven by the widespread adoption of Large Language Models (LLMs), the biases inherent in these models mirror the broader societal biases present in the textual data they are trained on. A typical example is gender bias: LLMs inadvertently perpetuate stereotypes by linking certain professions or characteristics more strongly with a particular gender. This talk will examine the entire pipeline associated with biases in textual datasets (language corpora). We will discuss the most prevalent types of bias, investigate methods for measuring them, and suggest techniques for debiasing the data. The talk will present related work and offer insights into practical strategies for identifying and mitigating biases in textual data.
Short bio:
Erik Derner received his Ph.D. in Robotics and Artificial Intelligence from the Czech Technical University in Prague, Czech Republic, in 2022. He is currently an ELLIS postdoctoral researcher at the ELLIS Alicante Unit, working on human-centric AI research in the team of Dr. Nuria Oliver. The main objective of his research is to contribute to the development of fair and safe language models, with a specific focus on low-resource languages. His areas of interest include human-centric AI, large language models, robotics, computer vision, reinforcement learning, and genetic algorithms. He is a member of the ELLIS network.