Gaurav Kumar

PhD Researcher in Natural Language Processing

Indian Institute of Science Education and Research (IISER) Bhopal

I build reliable NLP systems for misinformation detection, multimodal verification, and LLM-based reasoning. My work spans general fact-checking, multilingual NLP, and vision-language grounding, with a strong focus on measurable research impact.


Research Focus

Fact-checking, Multimodal NLP, LLM Reasoning

Recent Venue

WWW 2026 (Web4Good Track)

Current Affiliation

IISER Bhopal

Research Focus

Reliable NLP for Fact Verification

Designing trustworthy NLP pipelines for fact verification across text and multimodal settings.

Research Interests

My research targets robust evidence-aware reasoning in LLMs, with emphasis on entailment-driven verification, multilingual transfer, and multimodal fusion for misinformation-heavy environments.

General, multilingual, and multimodal fact-checking
Agentic AI and autonomous LLM systems
Prompt engineering and Model Context Protocol workflows
LLM reasoning, inference optimization, and evaluation
Biomedical NLP and adverse drug reaction mining

Methods and Tooling

LoRA and PEFT fine-tuning
Natural Language Inference-based verification
Retrieval-grounded generation and evidence routing
Contrastive multimodal representation learning
Benchmark-driven error analysis and ablations

Publications

Selected Papers

Detailed publication entries with venue, year, status, and direct action links.

ACL Rolling Review (ARR) · 2025

Under Review

Improving the Fact-Checking Performance of Language Models by Relying on Their Entailment Ability

Gaurav Kumar et al.

Introduces an entailment-centered fact-checking strategy that improves macro-F1 across RAW-FC and LIAR-RAW by strengthening claim-evidence consistency scoring.

Paper: Coming soon

The Web Conference (WWW) 2026, Web4Good Track · 2026

Accepted

Multimodal Fact Checking with Unified Visual, Textual, and Contextual Representations

Aditya Kishore*, Gaurav Kumar* et al.

Presents a unified multimodal architecture with relational fusion and contrastive objectives, reporting state-of-the-art weighted F1 (0.84) on Factify 2.

Paper: Coming soon

DHOW-MiLLA Workshop, co-located with The Web Conference (WWW) 2026 · 2026

Accepted

On VLMs for Diverse Tasks in Multimodal Meme Classification

Deepesh Gavit, Gaurav Kumar et al.

Evaluates vision-language models across diverse multimodal meme classification tasks and highlights transfer behavior under varying annotation regimes.

Paper: Coming soon

Work Experience

Research and Thesis Work

Academic research experience with supervisor context and outcome-focused highlights.

PhD Researcher (NLP)

IISER Bhopal

Bhopal, India

2022 - Present

Supervisor: Dr. Jasabanta Patro, Assistant Professor, Department of Data Science and Engineering

Developed an entailment-driven fact-checking framework, achieving macro-F1 gains of up to 28.6% on LIAR-RAW and 44.3% on RAW-FC.
Built MultiCheck, a unified multimodal verification pipeline integrating text and image encoders with relational fusion.
Designed reproducible evaluation pipelines for claim verification in multilingual and multimodal settings.

Teaching Assistant

Indian Institute of Science Education and Research (IISER) Bhopal

Bhopal, India

2023 - 2026

Deep Learning (DSE316) — Jan–Apr 2023
Introduction to Programming (ECS102) — Aug–Nov 2023
Database Management Systems (DSE310) — Jan–Apr 2024
Natural Language Processing (DSE407/607) — Aug–Nov 2024
Natural Language Processing (DSE407/607) — Jan–Apr 2025
Advanced Natural Language Processing (DSE418/618) — Aug–Nov 2025
Data Structures and Algorithms (ECS202) — Jan–Apr 2026

M.Tech Thesis Researcher

Jawaharlal Nehru University

New Delhi, India

2022

Supervisor: Dr. Aditi Sharan, Associate Professor, School of Computer and Systems Sciences

Researched deep learning methods for adverse drug reaction extraction from clinical text.
Focused on entity extraction for discontinuous mentions in biomedical NLP workflows.

Projects

Selected Research Projects

Applied systems and prototypes aligned with NLP research objectives.

Fake News Detection and Link Prediction

Research prototype combining linguistic signals and graph structure to detect misinformation and predict propagation links in social media ecosystems.

NLP, Graph Neural Networks, Misinformation Analysis


Adverse Drug Reaction Entity Extraction

Entity extraction framework inspired by maximal-clique style reasoning for complex and discontinuous biomedical mentions.

Biomedical NLP, Named Entity Recognition, Deep Learning


NLP Processing Toolkit

Interactive toolkit for tokenization, POS tagging, chunking, stemming, lemmatization, NER, parsing, and feature extraction with practical visual outputs.

Python, NLTK, spaCy, Applied NLP


Gallery

Research Artifacts

Figures and screenshots from ongoing NLP and fact-checking work.

Fact-Checking System Artifact

Visualization from misinformation detection and verification workflows used for claim-level analysis.

Thesis Research Snapshot

Representative figure from M.Tech thesis work on adverse drug reaction information extraction.

Applied NLP Toolkit Interface

Practical NLP toolkit UI demonstrating classical preprocessing and feature extraction tasks.

Outreach

Talks, Workshops, and Service

Academic engagement through symposium participation, workshops, and community contribution.

Talks and Workshops

Symposium

Indian Symposium on Machine Learning (IndoML 2025)

Participant

BITS Pilani Hyderabad, India · 2025

Symposium

Indian Symposium on Machine Learning (IndoML 2024)

Participant

BITS Pilani Goa, India · 2024

Symposium

Indian Symposium on Machine Learning (IndoML 2023)

Participant

IIT Bombay, India · 2023

Workshop

CVIT Summer School

Summer School Participant

IIIT Hyderabad, India · 2023

Service and Community

Academic Community Participation

IndoML Community

2023 - 2025

Contributed to technical discussions and research exchange sessions around current ML and NLP directions.

Open Research and Reproducibility

NLP Research Projects

Ongoing

Maintains transparent experiment reporting and benchmark-focused reproducibility for fact-checking systems.

Volunteer

EMNLP 2025

2025

Contributed as a conference volunteer supporting organization and coordination activities.

Volunteer and Participant

Engineer's Day, IISER Bhopal

13 September 2025

Participated in technical activities and helped organize events during the Engineer's Day celebrations at IISER Bhopal.

Recognition

Awards and Certifications

Competitive achievements and formal training milestones.

Honors

GATE Qualification (Computer Science and Engineering)

Graduate Aptitude Test in Engineering

2021

Qualified, with a focus on CS fundamentals and analytical aptitude.

GATE Qualification (Computer Science and Engineering)

Graduate Aptitude Test in Engineering

2022

Repeated national-level qualification in Computer Science and Engineering.

JNU Entrance Examination

Jawaharlal Nehru University

2020

Selected for M.Tech in Statistical Computing (Data Science).

Best Teaching Assistant Award

IISER Bhopal

2025

Awarded for outstanding teaching assistance during the Aug–Nov 2025 semester.

Certifications and Training

Research Internship

IIT Kanpur

2018

57-day research internship program.

Python Training

Internshala

2018

6-week structured training in Python programming.

Web Development Training

M.S.M.E

2019

4-week training on web development fundamentals.

Skills

Technical Capability Matrix

Core NLP, ML, and engineering tools used in day-to-day research and experimentation.

NLP and LLM Skills

LLM fine-tuning (LoRA, PEFT)
BERT and RoBERTa
Natural Language Inference
Prompt engineering
Knowledge-grounded fact-checking
Multilingual and multimodal NLP

ML and Data Science

Supervised and unsupervised learning
Feature engineering
Model evaluation and optimization
Statistical modeling
Cross-validation and regularization

Programming and Tools

Python
SQL
PyTorch
Scikit-learn
NumPy and Pandas
Git and GitHub
Docker
Bash and Linux
Jupyter

Education

Academic Training

PhD · Data Science and Engineering

Indian Institute of Science Education and Research

Bhopal, India

2022 - Present

CPI: 8.00/10

M.Tech · Statistical Computing (Data Science)

Jawaharlal Nehru University

New Delhi, India

2020 - 2022

CPI: 8.2/9

B.Tech · Computer Science and Engineering

Aryabhatta Knowledge University

Patna, India

2015 - 2019

CPI: 8.11/10

Contact

Let's Collaborate

Open to research collaboration, NLP engineering opportunities, and academic discussions.

Email

gaurav22@iiserb.ac.in

gauravkumar62024@gmail.com

Bhopal, India

Professional Profiles

LinkedIn, GitHub