Zoher Kachwala

I am a PhD candidate at Indiana University’s Luddy School of Informatics, Computing and Engineering, advised by Professor Filippo Menczer. I am a member of the Observatory on Social Media and the NaN research group. I also collaborate with Professor Jisun An and Professor Haewoon Kwak.

My research focuses on improving frontier Large Language Models so that they can address large-scale, real-world problems. I work on foundational AI systems for practical applications such as semantic search and safe content generation, in three areas:

Scalable LLM Systems: Designing decoding strategies and training methods that improve reasoning, with a focus on robust deployment across diverse applications and large-scale inference.
Multimodal AI & Evaluation: Developing prompt-based methods for Vision-Language Models and evaluation frameworks for AI systems.
AI Safety at Scale: Building production-ready moderation systems by fine-tuning and deploying specialized LLMs, addressing real-world safety challenges across hundreds of online communities.

This collaborative research contributes to building foundational AI systems that advance scientific understanding while solving problems at the scale needed to benefit billions of people.

news

May 15, 2025	Excited to share that our paper `Task-Aligned Prompting Improves Detection of AI-Generated Images in VLMs` has been submitted to NeurIPS 2025! Our zero-shot-s² method improves AI-generated image detection by up to 29% without fine-tuning. 🚀
Oct 17, 2024	The results of the first CNetS Chocolate Tasting Workshop are finally live! 🎉 We had a panel of expert taste-testers rate 15 different chocolates on a -5 to 5 scale, and the results are full of surprises. Which chocolate reigned supreme? Which one got the cold shoulder? You’ll have to click through to find out!😏
Jun 06, 2024	My virtual NAACL24 presentation for `Rematch` is now live on YouTube! In this video, I delve into: 🔍 The significance of graphical representations in language, or “local knowledge graphs.” ⚖️ The critical aspects we aim to optimize while keeping computational costs low. 🏆 How our algorithm, REMATCH, outperforms state-of-the-art methods in these areas.
Mar 14, 2024	Our paper `REMATCH: Robust and Efficient Knowledge Graph Matching` was accepted to NAACL24!
Mar 01, 2024	My research was awarded computing resources worth $160,550 by NSF’s Jetstream2 Project!

selected publications

Task-aligned prompting improves zero-shot detection of AI-generated images by Vision-Language Models

Zoher Kachwala, Danishjeet Singh, Danielle Yang, and Filippo Menczer

2025

As image generators produce increasingly realistic images, concerns about potential misuse continue to grow. Supervised detection relies on large, curated datasets and struggles to generalize across diverse generators. In this work, we investigate the use of pre-trained Vision-Language Models (VLMs) for zero-shot detection of AI-generated images. While off-the-shelf VLMs exhibit some task-specific reasoning and chain-of-thought prompting offers gains, we show that task-aligned prompting elicits more focused reasoning and significantly improves performance without fine-tuning. Specifically, prefixing the model’s response with the phrase "Let’s examine the style and the synthesis artifacts" — a method we call zero-shot-s² — boosts Macro F1 scores by 8%–29%. These gains are consistent for two widely used open-source models and across three recent, diverse datasets spanning human faces, objects, and animals with images generated by 16 different models — demonstrating strong generalization.
@misc{kachwala2025taskalignedpromptingimproveszeroshot,
  title = {Task-aligned prompting improves zero-shot detection of AI-generated images by Vision-Language Models},
  author = {Kachwala, Zoher and Singh, Danishjeet and Yang, Danielle and Menczer, Filippo},
  year = {2025},
  eprint = {2506.11031},
  archiveprefix = {arXiv},
  primaryclass = {cs.LG},
  url = {https://arxiv.org/abs/2506.11031},
  status = {Under Review at NeurIPS 2025},
}
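
The response-prefix idea in the abstract above can be illustrated with a minimal Python sketch. Only the prefix phrase is quoted from the abstract. The function name `build_prefixed_prompt`, the `USER:`/`ASSISTANT:` layout, and the `<image>` placeholder are hypothetical and are not taken from the paper's code. A real Vision-Language Model would need its own chat template.

```python
# Prefix phrase quoted from the abstract (straight apostrophe used here).
S2_PREFIX = "Let's examine the style and the synthesis artifacts"


def build_prefixed_prompt(question: str, response_prefix: str = S2_PREFIX) -> str:
    """Build a prompt whose assistant turn already begins with response_prefix.

    The USER/ASSISTANT layout and the <image> placeholder are assumptions
    made for illustration, not the template of any specific model.
    """
    user_turn = f"USER: <image>\n{question}"
    # rstrip() avoids a dangling space when the prefix is empty.
    assistant_turn = f"ASSISTANT: {response_prefix}".rstrip()
    return f"{user_turn}\n{assistant_turn}"


if __name__ == "__main__":
    print(build_prefixed_prompt("Is this image real or AI-generated?"))
```

The model would then continue generating from the end of the prefix, so its reasoning starts from style and synthesis artifacts rather than from an open-ended answer. Passing an empty `response_prefix` gives the plain zero-shot prompt for comparison.
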
Advanced Heuristics for LLM Decoding Improve Chain-of-Thought Reasoning

Zoher Kachwala and Filippo Menczer

In progress, 2024

Proposes novel decoding strategies guided by answer labels to improve chain-of-thought reasoning in LLMs, boosting interpretability and task performance in complex generation scenarios.
@inproceedings{kachwala2024heuristics,
  title = {Advanced Heuristics for LLM Decoding Improve Chain-of-Thought Reasoning},
  author = {Kachwala, Zoher and Menczer, Filippo},
  booktitle = {In Progress},
  year = {2024},
  status = {In Progress},
}
Fine-Tuning Specialized LLMs for Large-Scale Community Content Moderation

Zoher Kachwala, Jisun An, Haewoon Kwak, and Filippo Menczer

In progress, 2024

Develops a framework to fine-tune and deploy specialized LLMs as a real-world moderation system; this work supports healthier online communities by predicting nuanced rule violations across 500+ subreddits.
@inproceedings{kachwala2024moderation,
  title = {Fine-Tuning Specialized LLMs for Large-Scale Community Content Moderation},
  author = {Kachwala, Zoher and An, Jisun and Kwak, Haewoon and Menczer, Filippo},
  booktitle = {In Progress},
  year = {2024},
  status = {In Progress},
}
REMATCH: Robust and Efficient Knowledge Graph Matching

Zoher Kachwala, Jisun An, Haewoon Kwak, and Filippo Menczer

In Findings of the Association for Computational Linguistics: NAACL 2024, 2024

Introduces rematch, a novel Abstract Meaning Representation (AMR) similarity metric that is 5x faster than the state of the art and improves semantic similarity by up to 5%, along with RARE, a new benchmark for evaluating structural similarity in knowledge graphs.
@inproceedings{kachwala2024rematch,
  title = {REMATCH: Robust and Efficient Knowledge Graph Matching},
  author = {Kachwala, Zoher and An, Jisun and Kwak, Haewoon and Menczer, Filippo},
  booktitle = {Findings of the Association for Computational Linguistics: NAACL 2024},
  year = {2024},
  url = {https://aclanthology.org/2024.findings-naacl.64},
}
A multi-platform collection of social media posts about the 2022 US midterm elections

Rachith Aiyappa, Matthew R DeVerna, Manita Pote, and 8 more authors

In Proceedings of the International AAAI Conference on Web and Social Media, 2023

Social media are utilized by millions of citizens to discuss important political issues. Politicians use these platforms to connect with the public and broadcast policy positions. Therefore, data from social media has enabled many studies of political discussion. While most analyses are limited to data from individual platforms, people are embedded in a larger information ecosystem spanning multiple social networks. Here we describe and provide access to the Indiana University 2022 U.S. Midterms Multi-Platform Social Media Dataset (MEIU22), a collection of social media posts from Twitter, Facebook, Instagram, Reddit, and 4chan. MEIU22 links to posts about the midterm elections based on a comprehensive list of keywords and tracks the social media accounts of 1,011 candidates from October 1 to December 25, 2022. We also publish the source code of our pipeline to enable similar multi-platform research projects.
@inproceedings{aiyappa2023multi,
  title = {A multi-platform collection of social media posts about the 2022 US midterm elections},
  author = {Aiyappa, Rachith and DeVerna, Matthew R and Pote, Manita and Truong, Bao Tran and Zhao, Wanying and Axelrod, David and Pessianzadeh, Aria and Kachwala, Zoher and Kim, Munjung and Seckin, Ozgur Can and others},
  booktitle = {Proceedings of the International AAAI Conference on Web and Social Media},
  volume = {17},
  pages = {981--989},
  year = {2023},
  doi = {10.1609/icwsm.v17i1.22205},
  url = {https://ojs.aaai.org/index.php/ICWSM/article/view/22205},
}

The Inexplicable Efficacy of Language Models

Rachith Aiyappa and Zoher Kachwala

XRDS: Crossroads, The ACM Magazine for Students, Apr 2023

@article{aiyappa_inexplicable_2023,
  title = {The {Inexplicable} {Efficacy} of {Language} {Models}},
  volume = {29},
  issn = {1528-4972},
  url = {https://dl.acm.org/doi/10.1145/3589654},
  doi = {10.1145/3589654},
  number = {3},
  urldate = {2024-03-18},
  journal = {XRDS: Crossroads, The ACM Magazine for Students},
  author = {Aiyappa, Rachith and Kachwala, Zoher},
  month = apr,
  year = {2023},
  pages = {60--62},
}