Hannah YoungEun An

URCS Doctoral Student

About Me

I’m a PhD candidate in Computer Science at the University of Rochester. My research spans Natural Language Understanding/Processing, Computational Linguistics, and Artificial Intelligence. Specifically, I work with Len Schubert on mining commonsense knowledge and organizing world knowledge into schemas that capture relations between entities, as well as between knowledge and action. I also work with Aaron Steven White on detecting and recovering semantically compressed forms, including ellipsis.

Projects

Lexical knowledge and object schema acquisition

Jo wanted to hang up the frame. Should we expect him to go looking for a nail or a TV?

I work on obtaining commonly assumed knowledge about objects from various sources and representing it in a formal structure called a schema. My work focuses on capturing lexical knowledge, such as hypernyms, parts, materials, and usage information, to enable human-like reasoning in AI.

Commonsense reasoning dataset

Which is less likely to be made, at least partially, of a material that is a constituent of a ping pong paddle: a computer mouse or a tuning fork?

I am building a manually curated dataset of binary-choice questions about materials shared between objects. Each entry is carefully reviewed so that answering it requires language models to consider detailed information about artifact parts and material composition.

Semantically compressed forms

What do you mean by “you think so”?

I also work on making the underlying meaning of semantically compressed forms explicit in text. The scope of the project extends beyond elliptical utterances, such as verb phrase ellipsis, to include non-elliptical referring expressions, such as null complement anaphora.

Jo doesn’t think that Bo left. Does Jo think that Bo didn't leave?

I was also involved in FACTS.lab’s MegaAttitude Project, building lexicon-scale datasets through crowdsourcing and studying the semantic contributions of lexical items and syntactic structures.

Papers

2024

Jiacan Yu, Hannah Y. An, and Lenhart K. Schubert. (2024). Language Models Benefit from Preparation with Elicited Knowledge. Under Review.

William Gantt, Shabnam Behzad, Hannah YoungEun An, Yunmo Chen, Aaron Steven White, Benjamin Van Durme, and Mahsa Yarmohammadi. (2024). MultiMUC: Multilingual Template Filling on MUC-4. In Proceedings of the 18th Conference of the European Chapter of the Association for Computational Linguistics (Volume 1: Long Papers), pages 349–368, St. Julian’s, Malta. Association for Computational Linguistics. paper

2023

James Allen, Hannah An, Ritwik Bose, Will de Beaumont, and Choh Man Teng. (2023). COLLIE: a broad-coverage ontology and lexicon of verbs in English. Language Resources and Evaluation, 57(1), pages 57–86. Springer Nature. paper

2020

James Allen, Hannah An, Ritwik Bose, Will de Beaumont, and Choh Man Teng. (2020). A Broad-Coverage Deep Semantic Lexicon for Verbs. In Proceedings of the Twelfth Language Resources and Evaluation Conference, pages 3243–3251, Marseille, France. European Language Resources Association. paper

Hannah Youngeun An and Aaron Steven White. (2020). The lexical and grammatical sources of neg-raising inferences. In Proceedings of the Society for Computation in Linguistics, pages 386–399, New York, New York. Association for Computational Linguistics. paper poster

Education

University of Rochester

PhD Computer Science

2019 - Present

Specializing in Artificial Intelligence, I am currently working on making implicit content explicit and acquiring lexical knowledge for AI systems to support reasoning.

University of Rochester

MS Computer Science

2019 - 2021

I worked on aligning and developing ontologies with a deep semantic lexicon to support reasoning about commonsense knowledge, supervised by James Allen.

University of Rochester

MS Computational Linguistics

2017 - 2019

During my first two years in Rochester, I took courses that gave me key skills in NLP and ML. As part of the master’s degree, I completed a final project, ‘Neg-raising inference and its syntactic contexts’, supervised by Aaron Steven White.

University of Washington

MA Linguistics

2012 - 2017

I graduated with Departmental Honors in Linguistics, earning an unweighted 4.0 GPA in courses offered by the department and completing an honors thesis, ‘Characteristic determiners vs. adjectives in Korean’, supervised by Toshiyuki Ogihara.

University of Washington

BS Applied and Computational Mathematical Sciences

2012 - 2017

My undergraduate degree, with a specialization in Scientific Computing and Numerical Algorithms, gave me a strong background in mathematics. Independent of my major program, I also completed a minor in Philosophy.

Teaching

  • Teaching Assistant: CSC 247/447 Natural Language Processing (Spring 2021, University of Rochester)
  • Teaching Assistant: CSC 442 Introduction to Artificial Intelligence (Fall 2020, University of Rochester)
  • Teaching Assistant: CSC 247/447 Natural Language Processing (Spring 2020, University of Rochester)
  • Teaching Assistant: LIN 110 Introduction to Linguistic Analysis (Spring 2018, University of Rochester)
  • Korean Tutor @ CLUE Academic Support Program (Fall 2016 - Spring 2017, University of Washington)