Hadas Kotek
office location: Cambridge, MA
email: hkotek at alum.mit.edu
I am a linguist working at Apple, where I am a member of the Natural Language data science team for Siri Understanding. I have expertise in managing all aspects of data collection and analysis efforts for various NLP tasks at scale, including onboarding and training annotators; designing annotation tasks and tools; data sampling; analyzing data for accuracy, consistency, and efficiency; error analysis; implementing improvements to ontology, task design, and annotator training; and reporting to stakeholders. Most recently, my focus has been on data generation, robustness, and safety of Large Language Models. In Fall 2023, I am teaching a seminar on LLMs at MIT.
I continue to engage in research as a Research Affiliate at MIT Linguistics, with a focus on LLM safety and efficient data collection. My academic research focused on different aspects of the syntax-semantics interface, using both traditional and experimental methods. I mainly worked on A-bar phenomena, including wh-questions, focus constructions, relative clauses and free relatives, ellipsis, wh-indefinites, (focus) intervention effects, and comparatives and superlatives. I additionally have an ongoing interest in studying and contributing to equity in the field.
I received my PhD in Linguistics from MIT in 2014, with a dissertation on the syntax, semantics, and processing of questions. Prior to joining Apple, I was a Lecturer in Semantics at Yale and a Visiting Assistant Professor in Syntax at NYU, and I held a Mellon Postdoctoral Fellowship at McGill University.
Please visit the about page for more details concerning my research interests and my academic history. See my resume, my academic CV, my MIT Linguistics user page, or my LinkedIn page for additional details.
✨ NEW ✨ Blog
AI ethics
- Text-to-image models are shallow in more ways than one (part 2): a discussion of interesting aspects of the images generated in part 1 with respect to gender, race and ethnicity, age, attractiveness, lexical choices, and image style. In short: the models exhibit biases along all these dimensions, in fairly predictable and yet concerning ways.
- Text-to-image models are shallow in more ways than one (part 1): a summary of some basic experiments I carried out for my seminar on LLMs, focusing on syntactic and semantic ambiguity. We find that text-to-image models engage in very shallow parsing, likely not much more than a bag-of-words approach. Many sometimes-surprising frequency effects follow.
- Stereotypical Gender Effects in 2016: a summary of a co-authored study I conducted in 2016 exploring stereotypes associated with occupation-denoting nouns. Full dataset here (csv).
- Doctors can’t get pregnant and other gender biases in ChatGPT: a writeup of my Twitter post that went viral, illustrating gender biases in ChatGPT.
Informational posts on the non-academic job market
- Job application materials: in this post I share two versions of my academic job application materials, one from my first post-PhD cycle and one from the last cycle. I also share the resume that got me my alt-ac job at the end of that last cycle.
- A guided self-reflection for getting started with Alt-Ac jobs: a post with lots of questions you can ask yourself to start figuring out what jobs are right for you.
- Job titles and job descriptions for linguists (and other social scientists): a compilation of (a) job titles, (b) informal job descriptions, (c) sample job ads, and (d) interviews with job holders for a diverse set of non-technical roles for linguists. This post is over 5k words long and divided into 10 categories, since there’s a lot to cover.
- Transferable skills (and how to talk about them): a compilation of transferable skills for Alt-Ac jobs, including sample resume bullet points using my own experience in academia to illustrate.
- Learning about Alt-Ac opportunities (aka how to get started): on getting started on the journey by asking yourself some questions and gathering some information.
- Prepping for Alt-Ac jobs (aka taking action): on getting started on the journey by taking some active steps to learn or expand relevant skills.
- Alt-Ac informational interviews: on informational interviews for alt-ac careers, including a suggested list of questions.
- Do you need a graduate degree to get an Alt-Ac job?: answering this FAQ. Short answer: probably not.
- Let’s talk about terminology: an ongoing master list of industry terminology.
Academia
- Job application materials: in this post I share two versions of my academic job application materials, one from my first post-PhD cycle and one from the last cycle. I also share the resume that got me my alt-ac job at the end of that last cycle.
- On the emotional toll of the academic job market: a reflection on how emotionally difficult it was to be on the academic job market.
- My academic job market journey: a short post about my own experience on the academic job market, mainly focusing on the numbers and concrete facts.
- On leaving academia: a three-part post (part 2, part 3) that gets a bit navel-gazy.
- Academic job interview questions: a compilation of 25 sets of interview questions I was asked in interviews between 2014 and 2019.
Science and other outreach
- LSA Summer Institute at UMass: Careers in Language Technologies, a 4-lecture series, summer 2023.
- Superlinguo jobs series post.
- The Vocal Fries podcast appearance (“John Mary Bill Sue”).
- Because Language podcast appearance.
- @Science_Is_US #PeopleofScience campaign on Twitter and Instagram by scienceisus.org.
- Linguistic Society of America: January 2021 Member Spotlight.
- Resources on Equity and Inclusivity in Linguistics (REIL) Guidebook: a joint effort of the LSA’s COGEL (Committee on Gender Equity in Linguistics, formerly COSWL) and SALT’s SALTED (Semantics and Linguistic Theory: Equity and Diversity); with Melissa Baese-Berk, Michael Yoshitaka Erlewine, Ivona Kucerova, Elin McCready, Mary Moroney, Jessica Rett, Carly Sommerlot, and Susi Wurmbrand.
- Pop-Up Mentoring Program; originally with Melissa Baese-Berk, Paola Cepeda, Kristen Syrett, Jessica Rett, Ivona Kucerova. Now a continued LSA COGEL effort. Recipient of the 2019 LSA Linguistic Service Award.
Newest work
Peer-reviewed conference papers:
- Kotek, Hadas, Rikker Dockum, and David Q. Sun. 2023. Gender bias and stereotypes in Large Language Models. ACM Collective Intelligence Conference.
- Xiu, Zidi, Kai-Chen Cheng, David Q. Sun, Jiannan Lu, Hadas Kotek, Yuhan Zhang, Paul McCarthy, Christopher Klein, Stephen Pulman, and Jason D. Williams. 2023. Feedback Effect in User Interaction with Intelligent Assistants: Delayed Engagement, Adaption and Drop-out. The 27th meeting of the Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD).
- Patel, Alkesh, Joel Moniz, Roman Nguyen, Hadas Kotek, Nick Tzou, and Vincent Renkens. 2021. MMIU: Dataset for Intent Understanding in Multimodal Assistant. West Coast NLP (WeCNLP).
- Sun, David Q., Hadas Kotek, Christopher Klein, Mayank Gupta, William Li, and Jason D. Williams. 2020. Improving Human-Labeled Data through Dynamic Automatic Conflict Resolution. The 28th International Conference on Computational Linguistics (COLING).
- Patel, Alkesh, Akanksha Bindal, Hadas Kotek, Christopher Klein, and Jason D. Williams. Generating Natural Questions from Images for Multimodal Assistant.
  - West Coast NLP (WeCNLP), October 2020.
  - IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), May 2021.
Journal papers:
- Cepeda, Paola, Hadas Kotek, Katharina Pabst, and Kristen Syrett. 2021. Gender bias in linguistics textbooks: Has anything changed since Macaulay & Brice (1997)? Language 97(4): 678–702.
- Kotek, Hadas, Rikker Dockum, Sarah Babinski, and Christopher Geissler. 2021. Gender bias and stereotypes in linguistic example sentences. Language 97(4): 653–677.
- Kotek, Hadas, Sarah Babinski, Rikker Dockum, and Christopher Geissler. 2021. Gender stereotypes and inclusion in language teaching. Babylonia 1: 66–70.
- Kastner, Itamar, Hadas Kotek, Rikker Dockum, Michael Dow, Maria Esipova, Caitlin Green, and Todd Snider. Who speaks for us? Lessons from the Pinker letter. Manuscript.
Books:
- Kotek, Hadas. 2019. Composing Questions. Linguistic Inquiry Monograph series. Cambridge, MA: MIT Press.
- Halpert, Claire, Hadas Kotek, and Coppe van Urk (eds.). 2017. A Pesky Set: Papers for David Pesetsky. MIT Working Papers in Linguistics 80. Cambridge, MA: MITWPL.
Recorded talks
- Gender bias in constructed example sentences (with Rikker Dockum, Sarah Babinski, and Christopher Geissler); slides, YouTube recording.
  - Webinar, SOAS, March 2021.
  - Colloquium talk, University of Connecticut, March 2021.
  - Colloquium talk, University of Oregon, October 2020.
- Ellipsis licensing in sluicing: A QuD account (with Matthew Barros; handout, slides, video recording).
  - Chicago Linguistic Society (CLS) 53, University of Chicago, May 2017.
  - Multiple questions about sluicing, Yale University, April 2017.
  - GLOW workshop on compositionality at the interfaces, Leiden University, March 2017.
- Diagnosing covert movement. Panel on questions, workshop for David Pesetsky. MIT Department of Linguistics, February 2017. (handout, slides, video recording, starting at 1:15:30).
Additional details about these and other papers and presentations can be found on my publications page. To read more about my various projects, visit the research page.
For the most up-to-date list of my presentations and publications, please consult my CV.