Hello! I am a senior studying Computer Science at Cornell University. As a researcher, I am interested in the fields of natural language processing, machine learning, and computational social science.
At Cornell, I have been fortunate to work with Professor Lillian Lee
and Ana Smith in the NLP group,
as well as Professor Thomas Cleland in the Computational Physiology Lab.
My interests in computer science are interdisciplinary, and I hope to focus my research on extracting insights from large amounts of data, utilizing NLP & ML to understand social behavior, and incorporating commonsense and bias detection in NLP systems.
Thanks for visiting!
Professional Experience
Facebook Software Engineer Intern (Summer 2020)
Cisco Data Science Intern (Summer 2019)
Sanofi Data Science Intern (Summer 2018)
Teaching Experience
CS 4740: Natural Language Processing (fall 2020)
CS 4850: Mathematical Foundations for the Information Age (spring 2020)
CS 2800: Discrete Structures (fall 2018, spring 2019)
I am exploring the roles of language and cognition in the making of moral judgments, using datasets of ethical dilemmas scraped from a Reddit forum.
I annotated a corpus of oral conversation alignments with expert analysis, trained a baseline Transformer-Ranker model for SCOTUS Justice identification, and tracked US Supreme Court reporting activity in news streams.
I created a Python GUI using the Kivy package and implemented biochemical calculations and visualizations with NumPy and matplotlib/seaborn. The application was packaged into a Windows executable and distributed to lab members for internal use.
I processed unstructured text data and built a Named Entity Recognition model to extract top Cisco security product names. The model achieved ~95% accuracy in identifying product names within the text. The results of this project would help with analysis of product trends for Cisco and its competitors.
I performed ETL, feature engineering, and analytics work using SQL and PySpark on 200 million rows of Cisco.com session activity data for integration with a causal analysis pipeline. The goal was to identify the types of content customers were interacting with and their respective impact on Annual Recurring Revenue, in order to recommend and enhance web content with positive ARR impact.
I parallelized the execution of a disease prediction algorithm in R and developed scripts to automate the verification of the predictions for a test set of 1000 genes. The use cases of this project included target discovery/prioritization and disease indication selection during clinical trials.
Inspired by the Tamagotchis from our youth, we implemented a virtual pet with an 8x8 bi-color LED matrix and an FRDM board. The pet had basic functions such as eating and pooping, and the user interacted with it through the buttons by feeding it and cleaning up after it. The pet's health status was displayed as well. Our project covered class material such as I/O interfacing and buses, and the board communicated with the LED matrix over I2C.
I helped prototype a showerhead attachment (that looked like Iron Man’s Arc Reactor) using an Arduino board, CAD software, and 3D printing. The attachment was designed to regulate water temperature for consumers with sensitive skin. I then pitched a business plan for it with 5 team members.