Arjun Subramonian's Website

Selected Publications

On the Discrimination Risk of Mean Aggregation Feature Imputation in Graphs
In human networks, nodes belonging to a marginalized group often have a disproportionate rate of unknown or missing features. This, in conjunction with graph structure and known feature biases, can cause graph feature imputation algorithms to predict values for unknown features that make the marginalized group's feature values more distinct from the dominant group's feature values than they are in reality. We call this distinction the discrimination risk. We prove that a higher discrimination risk can amplify the unfairness of a machine learning model applied to the imputed data. We then formalize a general graph feature imputation framework called mean aggregation imputation and theoretically and empirically characterize graphs in which applying this framework can yield feature values with a high discrimination risk. We propose a simple algorithm to ensure mean aggregation-imputed features provably have a low discrimination risk, while minimally sacrificing reconstruction error (with respect to the imputation objective). We evaluate the fairness and accuracy of our solution on synthetic and real-world credit networks.
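Mean aggregation imputation, as described above, fills in a node's unknown feature by averaging its neighbors' feature values. The sketch below is a minimal illustrative simplification (a single scalar feature per node, hypothetical node names, iterative averaging to a fixed point), not the paper's algorithm or its fairness-aware variant:

```python
def mean_aggregation_impute(adj, features, n_iters=10):
    """adj: {node: set of neighbor nodes}; features: {node: float or None}.
    Returns a dict where each originally-missing feature is replaced by
    the mean of its neighbors' (known or imputed) values."""
    imputed = dict(features)
    # Initialize unknown features to the global mean of known features.
    known = [v for v in features.values() if v is not None]
    global_mean = sum(known) / len(known)
    for node, val in features.items():
        if val is None:
            imputed[node] = global_mean
    # Iteratively replace each originally-missing feature with the
    # mean of its neighbors' current values.
    for _ in range(n_iters):
        updated = dict(imputed)
        for node, val in features.items():
            if val is None and adj[node]:
                updated[node] = sum(imputed[n] for n in adj[node]) / len(adj[node])
        imputed = updated
    return imputed
```

For example, a node whose only neighbors carry values 1.0 and 3.0 is imputed to their mean, 2.0; known features are left untouched. The paper's point is that when group membership correlates with graph structure and with which features are missing, such averaged values can drift systematically away from the truth for the marginalized group.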
Group Excess Risk Bound of Overparameterized Linear Regression with Constant-Stepsize SGD
It has been observed that machine learning models trained using stochastic gradient descent (SGD) exhibit poor generalization to certain groups within and outside the population from which training instances are sampled. This has serious ramifications for the fairness, privacy, robustness, and out-of-distribution (OOD) generalization of machine learning. Hence, we theoretically characterize the inherent generalization of SGD-learned overparameterized linear regression to intra- and extra-population groups. We do this by proving an excess risk bound for an arbitrary group in terms of the full eigenspectra of the data covariance matrices of the group and population. We additionally provide a novel interpretation of the bound in terms of how the group and population data distributions differ and the group effective dimension of SGD, as well as connect these factors to real-world challenges in practicing trustworthy machine learning. We further empirically study our bound on simulated data.
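The setting analyzed above can be illustrated with a toy simulation: train linear regression with constant-stepsize SGD on population data, then evaluate the risk of the learned weights on any group's own samples. This is a low-dimensional sketch for illustration only (the paper's regime is overparameterized, and its bound is analytical, not simulated); all dimensions and step sizes here are assumptions.

```python
def sgd_linear_regression(samples, dim, stepsize=0.01, passes=1):
    """One-sample-at-a-time SGD with a constant stepsize on squared loss.
    samples: list of (x, y) pairs with x a length-dim list of floats."""
    w = [0.0] * dim
    for _ in range(passes):
        for x, y in samples:
            pred = sum(wi * xi for wi, xi in zip(w, x))
            scale = stepsize * (pred - y)
            w = [wi - scale * xi for wi, xi in zip(w, x)]
    return w

def risk(w, samples):
    """Mean squared error of w on a given group's samples."""
    errs = [(sum(wi * xi for wi, xi in zip(w, x)) - y) ** 2
            for x, y in samples]
    return sum(errs) / len(errs)
```

Calling `risk` with the population-trained `w` but a particular group's samples gives an empirical analogue of the group excess risk the paper bounds in terms of the group and population covariance eigenspectra.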
You Reap What You Sow: On the Challenges of Bias Evaluation Under Multilingual Settings
Evaluating bias, fairness, and social impact in monolingual language models is a difficult task. This challenge is further compounded when language modeling occurs in a multilingual context. Considering the implication of evaluation biases for large multilingual language models, we situate the discussion of bias evaluation within a wider context of social scientific research with computational work. We highlight three dimensions of developing multilingual bias evaluation frameworks: (1) increasing transparency through documentation, (2) expanding targets of bias beyond gender, and (3) addressing cultural differences that exist between languages. We further discuss the power dynamics and consequences of training large language models and recommend that researchers remain cognizant of the ramifications of developing such technologies.
On Dyadic Fairness: Exploring and Mitigating Bias in Graph Connections
This blog post discusses the ICLR 2021 paper "On Dyadic Fairness: Exploring and Mitigating Bias in Graph Connections" by Li et al., highlighting the importance of its theoretical results while critically examining the notions and applications of dyadic fairness provided.
Harms of Gender Exclusivity and Challenges in Non-Binary Representation in Language Technologies
We explain the complexity of gender and language around it, and survey non-binary persons to understand harms associated with the treatment of gender as binary in English language technologies. We also detail how current language representations (e.g., GloVe, BERT) capture and perpetuate these harms and related challenges that need to be acknowledged and addressed for representations to equitably encode gender information.
Fairness and Bias Mitigation: A practical guide into the AllenNLP Fairness module
As models and datasets become increasingly large and complex, it is critical to evaluate the fairness of models according to multiple definitions of fairness and mitigate biases in learned representations. allennlp.fairness aims to make fairness metrics, fairness training tools, and bias mitigation algorithms extremely easy to use and accessible to researchers and practitioners of all levels.
Motif-Driven Contrastive Learning of Graph Representations
Our framework MotIf-driven Contrastive leaRning Of Graph representations (MICRO-Graph) can: 1) use GNNs to extract motifs from large graph datasets; 2) leverage learned motifs to sample informative subgraphs for contrastive learning of GNNs.
MOTIF-Driven Contrastive Learning of Graph Representations
We propose a MOTIF-driven contrastive framework to pretrain a graph neural network in a self-supervised manner so that it can automatically mine motifs from large graph datasets. Our framework achieves state-of-the-art results on various graph-level downstream tasks with few labels, like molecular property prediction.
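The contrastive objective underlying frameworks like this can be sketched with a standard NT-Xent-style loss: each anchor embedding (e.g. a subgraph) should score high against its own positive (e.g. its motif) and low against the other positives in the batch. The function names and temperature below are illustrative assumptions, not MICRO-Graph's implementation.

```python
import math

def cosine(u, v):
    # Cosine similarity between two embedding vectors.
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def contrastive_loss(anchors, positives, temperature=0.5):
    """NT-Xent-style loss: anchors[i] is pulled toward positives[i]
    and pushed away from positives[j] for j != i."""
    loss = 0.0
    for i, a in enumerate(anchors):
        sims = [math.exp(cosine(a, p) / temperature) for p in positives]
        loss += -math.log(sims[i] / sum(sims))
    return loss / len(anchors)
```

When each anchor is matched with its own positive the loss is small; shuffling the positives so pairs no longer correspond drives the loss up, which is the signal the pretraining uses.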
Automated, Cost-Effective Optical System for Accelerated Antimicrobial Susceptibility Testing (AST) Using Deep Learning
We demonstrate an automated, cost-effective optical system that delivers early AST results, minimizing incubation time and eliminating human errors, while remaining compatible with standard phenotypic assay workflow. The system is composed of cost-effective components and eliminates the need for optomechanical scanning. A neural network processes the captured optical intensity information from an array of fiber optic cables to determine whether bacterial growth has occurred in each well of a 96-well microplate.
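The per-well decision described above can be caricatured as follows: map each well's time-ordered intensity readings to a growth probability. This toy sketch uses a logistic model on two hand-picked features with made-up weights, purely to illustrate the pipeline shape; it is not the paper's trained neural network.

```python
import math

def well_growth_probability(intensities, w_final=4.0, w_delta=6.0, bias=-5.0):
    """intensities: time-ordered optical readings for one well, scaled to [0, 1].
    The weights here are illustrative, not learned parameters."""
    final = intensities[-1]
    delta = intensities[-1] - intensities[0]  # net intensity change over time
    z = w_final * final + w_delta * delta + bias
    return 1.0 / (1.0 + math.exp(-z))  # logistic squashing to a probability

def classify_plate(plate, threshold=0.5):
    """plate: dict mapping well IDs (e.g. 'A1') to intensity series.
    Returns a growth/no-growth call per well."""
    return {well: well_growth_probability(series) >= threshold
            for well, series in plate.items()}
```

A well whose intensity climbs over the incubation period is flagged as growth, while a flat trace is not; in the actual system this decision is made by a neural network over the fiber-optic intensity array for all 96 wells.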
Estimating the Ages of FGK Dwarf Stars Through the Use of GALEX FUV Magnitudes
We utilized far-ultraviolet (FUV) photometry acquired by the Galaxy Evolution Explorer (GALEX) space telescope as an indicator of chromospheric activity to infer ages of late-F, G, and K type dwarf stars. We derived a purely empirical correlation between FUV magnitudes and stellar age in conjunction with (B − V) color. Such a calibration has utility in population studies of FGK dwarfs for further understanding of the chemical evolution of the Milky Way.
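An empirical calibration of this kind can be illustrated with a generic least-squares fit of an age proxy against FUV magnitude and (B − V) color. The linear functional form, the choice of log-age as target, and all coefficients below are assumptions for illustration, not the paper's published relation.

```python
def fit_linear(xs, ys):
    """Least-squares fit of y ≈ a + b*fuv + c*bv via the normal equations.
    xs: list of (fuv, bv) pairs; ys: list of target values (e.g. log age)."""
    # Accumulate X^T X (3x3) and X^T y (3) with design rows [1, fuv, bv].
    A = [[0.0] * 3 for _ in range(3)]
    v = [0.0] * 3
    for (fuv, bv), y in zip(xs, ys):
        row = [1.0, fuv, bv]
        for i in range(3):
            v[i] += row[i] * y
            for j in range(3):
                A[i][j] += row[i] * row[j]
    # Solve A w = v by Gauss-Jordan elimination with partial pivoting.
    for col in range(3):
        piv = max(range(col, 3), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        v[col], v[piv] = v[piv], v[col]
        for r in range(3):
            if r != col:
                f = A[r][col] / A[col][col]
                for j in range(3):
                    A[r][j] -= f * A[col][j]
                v[r] -= f * v[col]
    return [v[i] / A[i][i] for i in range(3)]
```

Given photometry and ages for a calibration sample of FGK dwarfs, the returned coefficients define an age estimator that can then be applied to stars with only FUV and (B − V) measurements.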
Queer in AI
Queer in AI is an organization that aims to combat the harms faced by queer researchers within AI. Several inclusion initiatives are outlined, including those centered on policy and financial aid.
Rebuilding Trust: Queer in AI Approach to Artificial Intelligence Risk Management
We argue that any AI development, deployment, and monitoring framework that aspires to trust must incorporate both feminist, non-exploitative participatory design principles and strong, outside, and continual monitoring and testing. We additionally explain the importance of considering aspects of trustworthiness beyond just transparency, fairness, and accountability, specifically, to consider justice and shifting power to the disempowered as core values to any trustworthy AI system. Creating trustworthy AI starts by funding, supporting, and empowering grassroots organizations like Queer in AI so the field of AI has the diversity and inclusion to credibly and effectively develop trustworthy AI. We leverage the expert knowledge Queer in AI has developed through its years of work and advocacy to discuss if and how gender, sexuality, and other aspects of queer identity should be used in datasets and AI systems and how harms along these lines should be mitigated.
How to Make Virtual Conferences Queer-Friendly: A Guide
Queer in AI frequently receives inquiries from both conference organizers and queer community organizers about making virtual conferences more inclusive. The purpose of this document is to provide a tutorial for D&I organizers on how to make virtual conferences queer-friendly.
Queer | Inclusive | Badass
Using speculative future ML capabilities and ML-generated artifacts as a proxy, my poster envisions how the tech community, by 2025, could prioritize the creation of fair, intersectional, and ethical technology.
BLOOM: A 176B-Parameter Open-Access Multilingual Language Model

Awards


  • NSF MENTOR '22 Fellowship (2022)
  • AI2 2021 Outstanding Intern of the Year Award (2022) ~ 1 of 3 interns awarded for going above and beyond as a researcher and as a colleague while at AI2, receiving $10,000 and an invitation to return to AI2 for another internship
  • MLH Top 50 Class of 2021 ~ out of 135,000 students who participated in hackathons, my story was one of 50 recognized for my projects and impact on other students in the community
  • UCLA Samueli School-Wide Outstanding Bachelor of Science (2021)
  • UCLA Chancellor's Service Award (2021)
  • UCLA Samueli Engineering Achievement Award in Student Welfare (2021)
  • UCLA Eugene V. Cota-Robles Fellowship (2021) ~ one of the most prestigious graduate fellowships awarded by UCLA
  • UCLA Graduate Research Assistantship (2021)
  • Boeing Company Scholarship (2021)
  • Brian J. Lewis Endowment (2021)
  • Computing Research Association Outstanding Undergraduate Researcher Honorable Mention (2020)
  • AAAI Undergraduate Consortium (2020) ~ presenting at AAAI Undergraduate Research Symposium and receiving mentorship from leading researchers in AI; 1 of 14 accepted out of 82 applicants for inspiring personal statement and exemplary service and research in self-supervised methods for learning graph-level representations
  • IBM Quantum Challenge (2020) ~ decomposed a large unitary gate for a minimal gate set with Qiskit; 1 of 574 winners out of 1745 participants
  • Out for Undergrad Tech Conference (2020) ~ 1 of 300 applicants accepted for superb academics, exemplary leadership, and work experiences, as well as diverse and unique viewpoints
  • Google Queer Tech Voices Conference (2020) ~ 1 of 32 accepted out of hundreds of applicants
  • 3rd Place Award for Best Hack @ Rose Hack, Major League Hacking (2019) ~ developed application that produces mashups of songs and evaluates which two songs form the best mashup
  • Siemens Competition Regional Finalist (2017) ~ 1 of 101 finalists selected from 4092 entrants
  • Award of Achievement, Association for Computing Machinery, San Francisco Bay Area Professional Chapter (2016) ~ developed automated digital music transposer
  • Dean's Honors List (2018-2021)

Invited Talks and Panels


  • Arjun Subramonian on Queer Approaches to AI and Computing (The Good Robot, 2023)
  • Trans Researchers Want Google Scholar to Stop Deadnaming Them (WIRED, 2022)
  • Taking The TamBram Out of Pride Month (gaysi, 2022)
  • Asian-Americans, we must resign from our role as Silicon Valley's model minority mascot (XRDS: Crossroads, The ACM Magazine for Students, Volume 28, Issue 4, 2022)
  • .Tech Domains x Major League Hacking: 24 Student Programmers Share Their #MyStartInTech Stories (.Tech Domains, 2021)
  • UCLA Engineering Outstanding Bachelor Awardee Champions Equity for LGBTQ+ Community (UCLA Samueli Newsroom, 2021)
  • Queer in AI with Arjun Subramonian (500 Queer Scientists, 2021)
  • UCLA Samueli Announces 2021 Commencement Awards (UCLA Samueli Newsroom, 2021)
  • QWER Hacks: A Case Study on How to Build an Inclusive Hackathon (UCLA Samueli Newsroom, 2021)
  • UCLA's ACM AI Podcast Addresses AI and Diversity, Featuring Guests from Underrepresented Communities (UCLA Samueli Newsroom, 2021)
  • Student-run tech podcast aims to make computer science more diverse, accessible (Daily Bruin, 2021)
  • ACM AI at UCLA, Outreach + Events Feature (A.I. For Anyone, 2020)
  • Students code software to help underrepresented groups in LGBTQ+ hackathon (Daily Bruin, 2020)
  • Equality in America Town Hall with Tom Steyer (CNN, 2019)
  • Washington, California Students Win Regional Siemens Competition at California Institute of Technology (citybizlist, 2017)
  • Indian American STEM Whiz Kids Named 2017 Siemens Regional Finalists (IndiaWest, 2017)
  • Three MVHS students make it to semifinal round of Siemens competition (El Estoque, 2017)
  • Local Charity Map of Bay Area (ArcGIS, 2016)

Service


  • (2023) I am a teaching assistant for Computer Science 32 at UCLA, which covers object-oriented programming, data structures, and algorithms.

  • (2022-Present) I have been a reviewer for: LoG 2022, FAccT 2022, GLFrontiers @ NeurIPS 2022, TSRML @ NeurIPS 2022, TrustNLP @ NAACL 2022, Challenges & Perspectives in Creating Large Language Models @ ACL 2022, NAACL Student Research Workshop (SRW) 2022, Workshop on Online Abuse and Harms @ NAACL 2022

  • (2022) I am serving as an Affinity Workshops Chair for NeurIPS 2022.

  • (2021-Present) I am a core organizer with Queer in AI, hosting workshops and socials (AAAI 2021, ICML 2021, NeurIPS 2021, FAccT 2022, NAACL 2022, ICML 2022) at AI conferences to build a strong community of queer and trans researchers. Furthermore, I organized the undergraduate mentoring program, which gets junior queer and trans folks involved with AI research and aids them in applying to graduate school. Additionally, I advise AI conferences on diversity, inclusion, and accessibility issues, and I help shape AI policy as it concerns queer and trans communities. Misc: guide on gender and pronouns for instructors, stats from Queer in AI's graduate application financial aid program.

  • (2021-2022) I am serving as an Accessibility Chair on NAACL 2022's Diversity and Inclusion committee, ensuring in-person and digital accessibility for the conference. I authored guidelines on Publication Accessibility, Quality, and Inclusivity, and on Poster and Talk Accessibility, Quality, and Inclusivity.

  • (2021-Present) I serve on the UCLA Samueli Standing Committee on Diversity, on behalf of Queer and Trans in STEM. I am working towards dropping the GRE requirement and eliminating application fees for graduate school admissions.

  • (2021) I reviewed scholarship applications for UCLA Engineering.

  • (2021) I helped organize AllenNLP Hacks, a hackathon to connect with marginalized students, welcome them into AllenNLP's open-source community, bring their perspectives to AllenNLP's research, and encourage them to apply to intern and work with AllenNLP.

  • (2021-2022) As an organizer of the UCLA Computer Science Summer Institute (CSSI), I have interviewed and recruited a diverse group of Undergraduate Learning Assistants to lead interactive coding and problem-solving sessions with the high school students.

  • (2020-2021) I led JEDI initiatives within ACM at UCLA, employing actionable goal-setting and reflection to take concrete steps towards making the organization more inclusive of everyone.

  • (2019-2021) I co-founded QWER Hacks, Major League Hacking's first-ever LGBTQIA+ event and the first collegiate LGBTQIA+ hackathon in the nation, which increases the visibility of and celebrates the queer and trans community in STEM.

  • (2019-2021) I advocate for making AI education accessible to everyone. With the prevalence of AI in modern society and the harms it poses to marginalized communities, it is paramount that we empower individuals from these communities to have conversations about AI and fight against algorithmic injustices. As Outreach Director of ACM AI at UCLA, I created, led, and taught machine learning and AI ethics classes at Title I schools in LA, through in-person visits, virtual sessions, and educational technology, covering topics such as mean-squared error, convolutional filters, and biases in machine learning.

  • (2020) I created and produced the "You Belong in AI!" podcast, which inspires youth and college students of all identities and backgrounds, especially those who are marginalized, to pursue AI opportunities.