- Instructor: Akhilesh Kakolu Ramarao
- Instructor: Kevin Tang
Ethics, Bias and Natural Language Processing (Prof. Tang, SS 2024, Tues: 16:30--18:00)
**Description:**
Is technology really as innocent and as objective as
it is said to be? As machine learning (ML) and Artificial
Intelligence (AI) become more prominent in our lives, from speech and
voice recognition by Alexa to automatic fake news warnings on social
media posts, issues of social bias and fairness in language technology
become more pertinent than ever before. This course examines the negative
impacts that biased ML and AI can have on various social identities, such
as race, gender and culture.
We first introduce the concept of bias in language
technology and the different types of bias, such as racial, gender and
cultural bias. To begin to understand the causes of these biases, we
will cover the basic underlying structure of some of the technologies,
such as Automatic Speech Recognition, hate speech detection and word
association. To evaluate these biases, we will learn to generate test
cases that can be used to evaluate trained systems, and study the metrics
that are used for measuring bias and fairness. Finally, we will cover the
basics of bias mitigation techniques.
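
As an illustration of what measuring bias in a trained system can look like, the following is a minimal sketch, not the course's own material. It assumes that a speech recognizer's output is compared against reference transcripts and that word error rate (WER) is averaged per speaker group. The function names, group labels and transcripts are all hypothetical and invented for this example.

```python
from collections import defaultdict


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must not be empty")
    # prev[j] is the edit distance between the reference words seen so far
    # (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1] / len(ref)


def wer_by_group(samples):
    """Mean WER per group; samples are (group, reference, hypothesis) triples."""
    scores = defaultdict(list)
    for group, ref, hyp in samples:
        scores[group].append(word_error_rate(ref, hyp))
    return {group: sum(v) / len(v) for group, v in scores.items()}


# Invented toy transcripts, for illustration only (not real system output).
toy_samples = [
    ("group_a", "turn on the kitchen light", "turn on the kitchen light"),
    ("group_a", "what is the weather today", "what is the weather to day"),
    ("group_b", "turn on the kitchen light", "turn on the chicken light"),
    ("group_b", "what is the weather today", "what is whether today"),
]

rates = wer_by_group(toy_samples)
# One simple disparity measure: the gap between the worst and best group.
gap = max(rates.values()) - min(rates.values())
print(rates, gap)
```

A larger gap between groups would suggest that the system serves some speakers worse than others; whether a given gap is meaningful depends on sample size and on how the test cases were constructed.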
**Audience:** Those interested
in social factors (e.g., sociolinguistics), digital humanities,
computational ethics, and challenges in AI. Students who are interested
in Artificial Intelligence.
**Literature:**
Given the rapidly developing nature of this topic, there is no single
textbook; instead, we will sample from existing research papers and
handbook chapters, for example:
Barocas, S., Hardt, M., & Narayanan, A. (2019). Fairness and Machine Learning. URL: http://www.fairmlbook.org.
Feng, S., Kudina, O., Halpern, B. M., & Scharenborg, O. (2021). Quantifying bias in automatic speech recognition. arXiv preprint arXiv:2103.15122.
Garg, N., Schiebinger, L., Jurafsky, D., & Zou, J. (2018). Word embeddings quantify 100 years of gender and ethnic stereotypes. Proceedings of the National Academy of Sciences, 115(16), E3635-E3644.
**Requirements:**
Requirements for 2 CPs are a set of assignments plus active participation
in all in-class activities. Requirements for 3 CPs are similar to those
for 2 CPs but include more assignments. All of this will be described in
the course syllabus, which will be provided and discussed in the first session.
If you miss more than 2 sessions, you will have to compensate for the missed participation by handing in extra written work.
- Instructor: Kevin Tang
Colloquium (Prof. Tang, SS 2024, Wed: 16:30--18:00)
This colloquium is for all students who want to discuss their project for a Bachelor's, Master's or doctoral thesis and who wish to receive feedback and support. The colloquium takes place every second week in person. In the other weeks, you will be required to work as a group. We will use the first session to decide on the presentation topics, which will then become part of the colloquium's program. In the in-person weeks, we will also cover research-related skills, such as time management, hypothesis generation, critical reading and more.
Requirements for 2 CPs are a set of assignments plus active participation in all in-class activities. Requirements for 3 CPs are similar to those for 2 CPs but include more assignments. All of this will be described in the course syllabus, which will be provided and discussed in the first session.
If you miss more than 2 sessions, you will have to compensate for the missed participation by handing in extra written work.
- Instructor: Kevin Tang
Introduction to Corpus Phonetics (Prof. Tang, SS 2024, Wed: 14:30--16:00)
**Audience:** Students who would like to improve their employability by
learning a highly desirable skill. Students who would like to do any
English Linguistics courses with a quantitative component in the future,
especially in the area of phonetics and phonology. It can also be
beneficial to those who are more literature-based but would like to do
more digital humanities. Students who are interested in Artificial
Intelligence.
**Keywords:**
quantitative analysis, R, phonetics, phonology, language, linguistics
**Description:**
This course aims to fill the gap between students’ knowledge of phonetics
and phonology and their ability to apply that knowledge to ask
non-trivial research questions using a large amount of speech and lexical
data. It will cover corpus compilation, semi-automatic annotation
(phonetic transcription and forced alignment), extraction of phonetic
and phonological variables, and the basics of statistical analysis of
corpus data. It complements other courses such as advanced phonetics,
quantitative and experimental methods, and corpus/computational
linguistics. The course will involve the use of programming languages
and tools (such as Python, R and Unix commands), which will be introduced
as needed.
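
To give a flavour of what extracting a phonetic variable from a forced alignment can involve, here is a minimal sketch, not the course's own code. It assumes the alignment has already been read into (label, start, end) tuples with times in seconds; the vowel label set, the function name and the toy alignment are hypothetical and invented for illustration.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical vowel labels (ARPABET-style); a real corpus defines its own set.
VOWELS = {"AA", "AE", "IY", "UW"}


def mean_vowel_durations(intervals):
    """Return the mean duration in milliseconds for each vowel label.

    intervals: iterable of (label, start_seconds, end_seconds) tuples,
    as might be read from the phone tier of a forced alignment.
    """
    durations = defaultdict(list)
    for label, start, end in intervals:
        if label in VOWELS:
            durations[label].append((end - start) * 1000.0)
    return {label: mean(values) for label, values in durations.items()}


# Invented alignment, for illustration only.
toy_alignment = [
    ("K", 0.00, 0.06), ("AE", 0.06, 0.18), ("T", 0.18, 0.25),
    ("S", 0.25, 0.33), ("IY", 0.33, 0.45),
    ("B", 0.45, 0.50), ("AE", 0.50, 0.66), ("T", 0.66, 0.72),
]

vowel_means = mean_vowel_durations(toy_alignment)
print(vowel_means)
```

The same pattern (filter intervals by label, derive a measurement, aggregate) extends to other variables, for instance consonant durations or pause lengths.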
**Textbook:**
While we won't be using a single
textbook, we will likely sample from the following: Harrington,
J. (2010). Phonetic analysis of speech corpora. John Wiley & Sons.
**Requirements:**
Requirements for 2 CPs are a set of assignments plus active participation
in all in-class activities. Requirements for 3 CPs are similar to those
for 2 CPs but include more assignments. All of this will be described in
the course syllabus, which will be provided and discussed in the first session.
If you miss more than 2 sessions, you will have to compensate for the missed participation by handing in extra written work.
- Instructor: Kevin Tang
Quantitative Methods for Linguistic Data: An Introduction to Statistics using R (Prof. Tang, SS 2024, Wed: 12:30--14:00)
**Audience:** Students who would like to improve their employability
by learning a highly desirable skill. Students who would like to do any
English Linguistics courses with a quantitative component in the future.
It can also be beneficial to those who are more literature-based but
would like to do more digital humanities. Students who are interested in
Artificial Intelligence.
**Keywords:**
statistics, quantitative analysis, R, phonetics, phonology, language, linguistics
**Description:**
It
is as necessary to be numerate as it is to be literate, but students in
the field of humanities are often not as numerate as they are literate.
They will need to evaluate evidence that is based on probability-based
models or statistical results in many of the courses that they take at
university, as they consider the efficacy of vaccination and the
severity of the pandemic, as they begin to vote in local and national
elections, as they search for employment on the job market after
graduating, and so on. With an increasingly digital world filled with
big data, a command of statistical reasoning is more important than
ever. In this course, we will learn numeracy through linguistics,
specifically through phonetics and phonology by learning to analyse the
sounds of languages quantitatively.
How do we analyse the sounds
of languages quantitatively? This course, Analysing the sounds of
languages, covers the basics of quantitative methods using real data
taken from the field of phonetics and phonology. We will provide a
gentle introduction to the statistical program R (www.r-project.org), a
program that is used by data scientists in the tech industry and by
academic researchers. The course will consist of a combination of
lectures and plenty of hands-on exercises. We introduce research
questions, such as “Do Southerners in the US really talk more slowly?”
or “Why do we expect scholarly words to be longer than familiar words?”,
as a framework for introducing the numerical concepts required to answer
research questions such as these. In this course, statistical methods
are introduced with a research question and a solid understanding of the
data, which is why we use real data and questions that are relevant to
anyone who commands a spoken language. A good amount of space is also
devoted to illustrating how to formulate and answer a research question,
and to hypothesis development and testing.
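
As a rough idea of how a research question such as the speech-rate one turns into a hypothesis test, here is a minimal sketch. The course itself works in R; this Python version only illustrates the logic and is not the course's method. It assumes speech rate is measured in syllables per second for two groups of speakers and uses an exact permutation test on the difference in means. The function name and all values are hypothetical and invented for this example.

```python
from itertools import combinations
from statistics import mean


def permutation_p_value(group_a, group_b):
    """Two-sided exact permutation test for a difference in group means.

    Enumerates every way of splitting the pooled values into groups of the
    original sizes and returns the share of splits whose absolute mean
    difference is at least as large as the observed one.
    """
    pooled = list(group_a) + list(group_b)
    n_a, n_b = len(group_a), len(group_b)
    observed = abs(mean(group_a) - mean(group_b))
    total = sum(pooled)
    extreme = count = 0
    for idx in combinations(range(len(pooled)), n_a):
        sum_a = sum(pooled[i] for i in idx)
        diff = abs(sum_a / n_a - (total - sum_a) / n_b)
        count += 1
        if diff >= observed - 1e-12:  # small tolerance for rounding error
            extreme += 1
    return extreme / count


# Invented speech rates (syllables per second), for illustration only.
southern = [4.6, 4.9, 5.1, 4.7, 5.0]
northern = [5.2, 5.0, 5.4, 4.8, 5.3]

p_value = permutation_p_value(southern, northern)
print(mean(southern), mean(northern), p_value)
```

Under the null hypothesis that group membership does not matter, a small p-value means that a difference as large as the observed one would rarely arise from relabelling the speakers at random.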
**Textbook:**
To get a
sense of what we will do in this course, do check out the main textbook
that we will be using, which is freely available:
https://kb.osu.edu/handle/1811/77848. I look forward to numerating with
you on phonetics and phonology.
Smith, Bridget J., Beckman, Mary E., and Foltz, Anouschka (2016). Analyzing the sounds of languages. Ohio State University. http://hdl.handle.net/1811/77848
**Requirements:**
Requirements for 2 CPs are a set of assignments plus active
participation in all in-class activities. Requirements for 3 CPs are
similar to those for 2 CPs but include more assignments. All of this will
be described in the course syllabus, which will be provided and discussed
in the first session.
If you miss more than 2 sessions, you will have to compensate for the missed participation by handing in extra written work.