Issue #2 | August 2022 

THE TURING P01NT

Curiosity. Creativity. Courage. Community. 

It's Time To Talk About Bias

Welcoming words

Welcome to the second issue of the CRT-AI Newsletter "The Turing Point". This time we talk about bias in AI. Even though the topic has received quite a lot of attention in recent years, the level of awareness remains low. Knowing what tools a scientist can use to spot, evaluate and mitigate any form of bias is of great importance. For this issue, our team prepared a list of interesting articles, tools and datasets that can help you on your PhD journey.

Find Your CRT-AI Science Buddy

Fancy a collaboration with your CRT-AI mates? You have created that amazing tool but nobody knows about it? You have spent hundreds of hours gathering this incredible database but you are the only one using it? We have your back!

The CRT-AI’s Padlet now provides you with a nice place to advertise your work and find your AI soulmate <3. Whether you are looking for a collaboration or have a nice database to share, you can publish a post with everything you want people to know. If you haven’t done so yet, remember to save this Padlet to your favorites; you could be surprised by how inspiring this mood board will soon be! We welcome every single one of our members here and want everyone to feel comfortable, so you can either post by yourself or fill in our special form and we will post for you!

Find your buddy, or fill in the form and we will do the rest

Turing Point Podcast

Episode 2 of the Turing Point Podcast! CRT-AI PhD students Anais Murat and Sharmi Dev Gupta moderate a lively conversation with Dr. Begum Genc and CRT-AI student Buvana Ganesh about the impacts of gender bias in Artificial Intelligence.

Artificial Intelligence Dictionary

Coming to your AID (AI Dictionary)

by Cathy Roche

CRT-AI Events: Diversity Driving Innovation

by Naa Korkoi Addo

    The meet-up was held in the BNY Mellon offices in Dublin on 25 May 2022. Researchers from third-level institutions and guests from industry were in attendance. During the event, the panelists shared their career stories, from where they started to where they are now. They did not shy away from the good, the bad and the ugly, in keeping with the event's theme of Diversity Driving Innovation. This is a very topical issue, and the aim was to increase awareness around women's participation in the IT sector.
    Dr Suzanne Little opened with a discussion of her career and research to date and mentioned that she had no intention of becoming a lecturer at the start of her career. She also touched upon how she seized opportunities and how her ambition and curiosity positively impacted her life.
    Joanna Murphy also shared her story of how she switched between jobs and ultimately reached her current position, stressing that it was not always plain sailing. During the meet-up, participants had the opportunity to ask questions about how industries are adapting culturally to include more women in their organisations. Dr Little noted that simply reaching out to other women to encourage them to enter the field is not enough.
    Eoin Lane shared a story about a female colleague in his department who was performing remarkably well and yet left the industry. He asked the panel and the audience what factors could affect women's decision-making when considering a career change or a move out of the industry.

Gender Bias in Automated Speech Recognition

by Kislay Raj and Cathy Roche

     In 1878, Emma Nutt became the first woman telephone operator. Her voice was so well received that she soon became the standard for all other companies, and by the 1880s all telephone operators were women. We now have over a hundred years of female voice recordings that can be used to create new automated voice assistants. Gender bias in voice automation reflects this lack of male voice data, as well as accepted assumptions about the female voice [1].
     Developers rely on female voices because of the lack of an equivalent male voice training set [1]. As a result, the first speech assistant was made using a female voice, as it was easier to use existing data than develop a new dataset. Furthermore, even if a new male voice dataset were created, there is a risk that it might not be as well received as the female version. It was simply more expedient for companies to use existing data known to have general public acceptance, even if this meant perpetuating gender bias [1].
      In Machine Learning (ML), a model makes predictions based on the data provided as a training set. Natural Language Processing (NLP) has enabled ML models to recognise the gender of voices. If the training data is imbalanced and uses more samples from one class, then there will be a bias towards that class. The model can make more accurate predictions for the data it has seen most frequently [2].
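
To make this concrete, here is a minimal Python sketch (scikit-learn on purely synthetic "acoustic features", so every number and name is illustrative rather than drawn from a real voice dataset) showing how a classifier trained on imbalanced data tends to recognise the majority class more reliably than the minority class.

# Minimal sketch: class imbalance skews per-class accuracy.
# All data below is synthetic; no real voice recordings are involved.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic "acoustic features": 900 samples of class 0 and 100 of class 1,
# i.e. a 9:1 imbalance between the two voice classes.
n_majority, n_minority = 900, 100
X = np.vstack([rng.normal(loc=0.0, scale=1.0, size=(n_majority, 5)),
               rng.normal(loc=1.0, scale=1.0, size=(n_minority, 5))])
y = np.array([0] * n_majority + [1] * n_minority)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0)

model = LogisticRegression().fit(X_train, y_train)
pred = model.predict(X_test)

# Per-class accuracy: the under-represented class is usually recognised
# less reliably than the over-represented one.
for label in (0, 1):
    mask = y_test == label
    print(f"class {label}: accuracy = {(pred[mask] == label).mean():.2f}")

On a typical run, the over-represented class is classified noticeably more accurately than the under-represented one, which is exactly the imbalance effect described above.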

     As humans, we may assume that when we communicate, we do so without bias towards a particular group. We interpret different languages and perform specific tasks by making meaning from the language. When machines replicate this process, a computer first 'recognises' the speech, and then a natural language unit, often referred to as an interpreter, performs the 'understanding' of the words [2]. This is one of AI's most complex tasks, as the system attempts to generate output that is as 'natural' as human speech. Gender bias appears because of various ambiguities in ML models, including lexical ambiguity. Because AI bots may be unable to 'recognise' words correctly, they make predictions based on limited datasets, and any bias in the dataset will be reflected in those predictions. Recognition errors can also result from pragmatic ambiguity, where the same words and sentences take on different meanings depending on context [4].
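
As a toy illustration of lexical ambiguity (not a description of how any production ASR system is built), the snippet below disambiguates the homophones "there" and "their" using invented bigram counts; when the surrounding context never appeared in the training data, the choice is made without evidence, which is how sparse or skewed data turns into recognition errors.

# Toy homophone disambiguation with made-up bigram counts.
bigram_counts = {
    ("over", "there"): 120,
    ("over", "their"): 3,
    ("lost", "their"): 95,
    ("lost", "there"): 2,
}

def pick_homophone(previous_word, candidates=("there", "their")):
    """Choose whichever candidate was seen more often after previous_word."""
    return max(candidates,
               key=lambda w: bigram_counts.get((previous_word, w), 0))

print(pick_homophone("over"))   # -> "there"
print(pick_homophone("lost"))   # -> "their"
print(pick_homophone("paint"))  # unseen context: no evidence, falls back to the first candidate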

    Researchers at Stanford University found that individuals deal with machines in the same manner they treat humans. A lack of diversity amongst developers has resulted in algorithms with significant bias in AI models, as the models do not reflect the broader population [5].  Due to the nature of the data, AI bots replicate and can reinforce gender assumptions built into the original data, for example, the association of female-voiced assistants with a submissive and malleable nature [5].

How to Make Automatic Speech Recognition (ASR) Systems Less Biased?

    Due to the complexity and richness of human speech, it is unrealistic to believe that bias (in the ML sense) will ever be completely eliminated from ASR systems. It is more realistic to expect gradual improvement as AI and ML techniques mature. After all, as humans, we occasionally have difficulty comprehending speakers of other languages or those with different accents.

    What can be done when it is clear that a model is not working well for a particular set of people? There are several approaches to help improve model performance. 

Proposed solution

    One solution involves examining the whole ML pipeline: (i) the dataset, (ii) training (the model) and (iii) the results. Within the dataset, a balanced distribution across all subgroups can be ensured when pre-processing the data. During training, one can include fairness constraints such as demographic parity (statistical independence between the outcome and demographic attributes), so that the model optimises for both accuracy and fairness. Finally, one can make post-hoc changes to the outputs so that the demographic distribution is balanced. Techniques for addressing bias at each stage are surveyed in [6,7].
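
To make the three stages concrete, the sketch below (Python with scikit-learn on synthetic data, with an assumed binary group attribute and illustrative thresholds) applies one simple technique per stage: oversampling for balance before training, a demographic-parity check on the trained model, and a per-group threshold adjustment afterwards. It is a minimal demonstration of the idea, not a substitute for the dedicated techniques surveyed in [6,7].

# Illustrative sketch: one bias-mitigation technique per pipeline stage.
# Features, labels, group attribute and thresholds are all synthetic/assumed.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)

# Synthetic data: features X, labels y and a demographic group attribute g
# (group 1 is deliberately under-represented, roughly 30% of samples).
n = 1000
g = (rng.random(n) < 0.3).astype(int)
X = rng.normal(size=(n, 4)) + g[:, None] * 0.5   # mild group-dependent shift
y = (X[:, 0] + rng.normal(scale=0.5, size=n) > 0.5).astype(int)

# (i) Pre-processing: oversample so both groups are equally represented.
idx0, idx1 = np.where(g == 0)[0], np.where(g == 1)[0]
target = max(len(idx0), len(idx1))
balanced = np.concatenate([rng.choice(idx0, target, replace=True),
                           rng.choice(idx1, target, replace=True)])
model = LogisticRegression().fit(X[balanced], y[balanced])

# (ii) In-processing, approximated here as a parity check rather than a
# constrained optimiser: demographic parity asks for similar
# positive-prediction rates across groups.
scores = model.predict_proba(X)[:, 1]

def positive_rate(group, threshold=0.5):
    return float((scores[g == group] >= threshold).mean())

print("positive rate, group 0:", round(positive_rate(0), 2))
print("positive rate, group 1:", round(positive_rate(1), 2))

# (iii) Post-processing: shift group 1's decision threshold so its
# positive-prediction rate roughly matches group 0's.
adjusted = np.quantile(scores[g == 1], 1 - positive_rate(0))
print("group 1 rate after adjustment:",
      round(float((scores[g == 1] >= adjusted).mean()), 2))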

Conclusion

    Within ML, biased data produces biased models: whether intentional or unintentional, any data that is not reflective of the range of potential outcomes will result in bias. Sampling bias can lead to inaccurate models, particularly when they are built on historical datasets with built-in biases. For example, if a company with a poor track record of promoting women trains a model to assist in decision-making about promotions, the model will likely make the same biased decisions because of the nature of the training data. Similarly, if a model is trained on speech that was simple for its developer to collect (from themselves, their family and friends), who might all speak with similar accents or inflections, the resulting ASR model may reflect a preference for such voices and may not recognise those with a different tone or accent [7].

    While there is no easy, universal method for identifying and addressing bias in ASR systems, it is important that all data is examined for potential bias before models are developed and deployed. Observe patterns in the data, anticipate the populations affected by the model’s decisions and be aware of what is missing from the dataset. 

Bibliography

[1] Adapt, 2022. Gender Bias in AI: Why Voice Assistants Are Female. [online] Available at: [Accessed 22 May 2022].

[2] Robison, C., 2022. How AI bots and voice assistants reinforce gender bias. [online] Brookings. Available at: [Accessed 20 May 2022].

[3] Harvard Business Review, 2022. Voice Recognition Still Has Significant Race and Gender Biases. [online] Available at: [Accessed 20 May 2022].

[4] Zhang, L., Wu, Y. and Wu, X., 2016. A causal framework for discovering and removing direct and indirect discrimination. arXiv preprint arXiv:1611.07509.

[5] IBM Research Blog. Female IBM Researchers are helping AI overcome bias and find its voice. [online] Available at: [Accessed 25 May 2022].

[6] Mehrabi, N., Morstatter, F., Saxena, N., Lerman, K. and Galstyan, A., 2021. A survey on bias and fairness in machine learning. ACM Computing Surveys (CSUR), 54(6), pp.1-35.

[7] Yuan, M., Kumar, V., Ahmad, M.A. and Teredesai, A., 2021. Assessing Fairness in Classification Parity of Machine Learning Models in Healthcare. arXiv preprint arXiv:2102.03717.