Zoher Kachwala

I am a PhD candidate at Indiana University, in the Luddy School of Informatics, Computing and Engineering, advised by Professor Filippo Menczer. I am an active member of the Observatory on Social Media and the NaN research group. I also collaborate with Professor Jisun An and Professor Haewoon Kwak.

My research specializes in LLM post-training and evaluation:

Evaluation & Benchmarking: Designed REMATCH (NAACL’24), a novel AMR graph evaluation metric that achieves a 5x speedup while ranking first in semantic similarity. Building multimodal benchmarks for community-aware content moderation.
Post-Training Methods: Developed Prefill-Guided Thinking (NeurIPS’25 Workshop), improving F1 by up to 24% for zero-shot AI-generated image detection. Researching structured fine-tuning for cross-domain generalization.
Research to Production: Build and deploy systems using PyTorch, vLLM, and multi-GPU infrastructure. Experience scaling LLM training and evaluation pipelines on GPU clusters.

I also contributed to large-scale social media research (MEIU22, ICWSM 2023), releasing multi-platform datasets for political discourse analysis. Currently exploring heuristic-guided decoding for improved reasoning and optimization landscapes of prefills versus prompts.

news

May 15, 2025	Excited to share that our paper `Prefilled responses enhance zero-shot detection of AI-generated images` has been accepted to the NeurIPS 2025 Workshop on Generative and Protective AI for Content Creation! The paper has also been submitted to ACL ARR. Our Prefill-Guided Thinking (PGT) method improves AI-generated image detection by up to 24% without training data. 🚀
Oct 17, 2024	The results of the first CNetS Chocolate Tasting Workshop are finally live! 🎉 We had a panel of expert taste-testers rate 15 different chocolates on a -5 to 5 scale, and the results are full of surprises. Which chocolate reigned supreme? Which one got the cold shoulder? You’ll have to click through to find out!😏
Jun 06, 2024	My virtual NAACL24 presentation for `REMATCH` is now live on YouTube! In this video, I delve into: 🔍 The significance of graphical representations in language, or “local knowledge graphs.” ⚖️ The critical aspects we aim to optimize while keeping computational costs low. 🏆 How our algorithm, REMATCH, outperforms state-of-the-art methods in these areas.
Mar 14, 2024	Our paper `REMATCH: Robust and Efficient Knowledge Graph Matching` was accepted to NAACL24!
Mar 01, 2024	My research was awarded computing resources worth $160,550 by NSF’s Jetstream2 Project!

selected publications

Prefilled responses enhance zero-shot detection of AI-generated images

Zoher Kachwala, Danishjeet Singh, Danielle Yang, and Filippo Menczer

2025

@misc{kachwala2025prefilledresponsesenhancezeroshot,
  title = {Prefilled responses enhance zero-shot detection of AI-generated images},
  author = {Kachwala, Zoher and Singh, Danishjeet and Yang, Danielle and Menczer, Filippo},
  year = {2025},
  eprint = {2506.11031},
  archiveprefix = {arXiv},
  primaryclass = {cs.LG},
  url = {https://arxiv.org/abs/2506.11031},
  status = {NeurIPS 2025 Workshop; Under Review – ACL ARR},
}

MultiModReddit: A Benchmark for Community-Aware Content Moderation

Zoher Kachwala, Jisun An, Haewoon Kwak, and Filippo Menczer

2025

Under Review – ACL ARR

Multimodal benchmark with 100K discussion threads across 100 Reddit communities with distinct rule sets, enabling evaluation of LLM moderation systems that respect community-specific norms.
@unpublished{kachwala2025multimodreddit,
  title = {MultiModReddit: A Benchmark for Community-Aware Content Moderation},
  author = {Kachwala, Zoher and An, Jisun and Kwak, Haewoon and Menczer, Filippo},
  year = {2025},
  note = {Under Review – ACL ARR},
  status = {Under Review},
}

Cross-Community Generalization through Structured Supervised Fine-Tuning

Zoher Kachwala, Jisun An, Haewoon Kwak, and Filippo Menczer

2025

In Preparation

SFT methodology using structured chain-of-thought distillation, demonstrating that systematic context analysis enables single-community training to generalize across 100+ diverse communities.
@unpublished{kachwala2025crosscommunity,
  title = {Cross-Community Generalization through Structured Supervised Fine-Tuning},
  author = {Kachwala, Zoher and An, Jisun and Kwak, Haewoon and Menczer, Filippo},
  year = {2025},
  note = {In Preparation},
  status = {In Preparation},
}

REMATCH: Robust and Efficient Matching of Local Knowledge Graphs to Improve Structural and Semantic Similarity

Zoher Kachwala, Jisun An, Haewoon Kwak, and Filippo Menczer

In Findings of the Association for Computational Linguistics: NAACL 2024, Jun 2024

@inproceedings{kachwala-etal-2024-rematch,
  title = {{REMATCH}: Robust and Efficient Matching of Local Knowledge Graphs to Improve Structural and Semantic Similarity},
  author = {Kachwala, Zoher and An, Jisun and Kwak, Haewoon and Menczer, Filippo},
  editor = {Duh, Kevin and Gomez, Helena and Bethard, Steven},
  booktitle = {Findings of the Association for Computational Linguistics: NAACL 2024},
  month = jun,
  year = {2024},
  address = {Mexico City, Mexico},
  publisher = {Association for Computational Linguistics},
  url = {https://aclanthology.org/2024.findings-naacl.64/},
  doi = {10.18653/v1/2024.findings-naacl.64},
  pages = {1018--1028},
}

A multi-platform collection of social media posts about the 2022 US midterm elections

Rachith Aiyappa, Matthew R DeVerna, Manita Pote, and 8 more authors

In Proceedings of the International AAAI Conference on Web and Social Media, Jun 2023

@inproceedings{aiyappa2023multi,
  title = {A multi-platform collection of social media posts about the 2022 US midterm elections},
  author = {Aiyappa, Rachith and DeVerna, Matthew R and Pote, Manita and Truong, Bao Tran and Zhao, Wanying and Axelrod, David and Pessianzadeh, Aria and Kachwala, Zoher and Kim, Munjung and Seckin, Ozgur Can and others},
  booktitle = {Proceedings of the International AAAI Conference on Web and Social Media},
  volume = {17},
  pages = {981--989},
  year = {2023},
  doi = {10.1609/icwsm.v17i1.22205},
  url = {https://ojs.aaai.org/index.php/ICWSM/article/view/22205},
}

The Inexplicable Efficacy of Language Models

Rachith Aiyappa and Zoher Kachwala

XRDS: Crossroads, The ACM Magazine for Students, Apr 2023

@article{aiyappa_inexplicable_2023,
  title = {The {Inexplicable} {Efficacy} of {Language} {Models}},
  volume = {29},
  issn = {1528-4972},
  url = {https://dl.acm.org/doi/10.1145/3589654},
  doi = {10.1145/3589654},
  number = {3},
  urldate = {2024-03-18},
  journal = {XRDS: Crossroads, The ACM Magazine for Students},
  author = {Aiyappa, Rachith and Kachwala, Zoher},
  month = apr,
  year = {2023},
  pages = {60--62},
}