Md Tanvirul Alam

I am Tanvir, a PhD candidate at Rochester Institute of Technology working on reasoning in language and vision-language models, with a particular focus on reinforcement learning with verifiable rewards. My research studies when these systems genuinely acquire new reasoning ability versus when they rely on shortcuts, semantic priors, or reward-driven heuristics, spanning multimodal benchmarks, self-play, and structured real-world domains such as cyber threat intelligence. More broadly, I am interested in building robust AI systems that can reason reliably under distribution shift and in high-stakes settings. Looking ahead, I am especially drawn to world models and to how language, vision, and action can be integrated into more grounded reasoning systems. If you would like to collaborate, feel free to email me.

Research Interests

Reasoning, Reinforcement Learning with Verifiable Rewards, Large Language Models, Vision-Language Models, Robustness and Generalization, World Models, Benchmarking and Evaluation, Cyber Threat Intelligence

Connect

GitHub, Google Scholar, LinkedIn, Stack Overflow

Email: ma8235 [at] rit [dot] edu