About
I'm Abhinav Mohanty — a GenAI researcher and red teamer with a background in systems security and responsible AI evaluation. My interests span the full responsible AI spectrum: adversarial robustness and red teaming, safety and alignment, deception and scheming behaviors, fairness and bias, societal risk, and the evaluation infrastructure that makes any of this measurable. The common thread is the same question — how do you actually know whether an AI system is behaving the way it should?
This blog is where I think in public about responsible AI evaluation — the hard measurement problems, failure modes, and open questions that don't often get written down clearly. The goal is signal, not volume — posts that a practitioner would find worth reading, grounded in evidence and firsthand experiment.
Disclaimer
The analyses published on this site are based exclusively on publicly available information, open literature, and independent reasoning.
They do not reference, reflect, or rely upon any non-public systems, data, processes, or internal deliberations of any organization.
Any views expressed here are the author's own and are not intended to represent the positions, policies, or perspectives of any current or former employer or affiliated institution.
Scope & methodology
This blog focuses on:
- System-level analysis of responsible AI practices
- Evaluation methodologies and their limitations
- Cross-industry patterns and failure modes
- Long-horizon and adversarial risk dynamics
This blog does not:
- Analyze or comment on the internal practices of specific organizations
- Discuss ongoing or recent incidents involving organizations the author is affiliated with
- Provide operational details that could enable misuse
- Offer legal, policy, or compliance advice
All analyses are framed at an abstract or generalized level to avoid attribution to any single system or organization.
Research background
My doctoral work was on Inline Reference Monitors (IRMs) — runtime policy enforcement mechanisms embedded directly into code. I applied this across several security domains: Adobe Flash applications, hybrid mobile apps (where web and native code share the same runtime), hybrid IoT companion apps, and control-flow hijacking vulnerabilities in IoT firmware. The thread across all of it is the same problem: how do you enforce security policy on a system you don't fully control, at runtime, without rewriting it from scratch?
That question maps cleanly onto what's now called agentic AI security — how do you constrain an autonomous system's behavior over time, across tool calls and memory, in ways that hold even when the system is doing things its designers didn't anticipate? The technical context has changed; the problem structure hasn't.
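To make the analogy concrete, here is a minimal conceptual sketch (not drawn from any real system) of what IRM-style runtime enforcement might look like when transplanted to an agent's tool calls. The `Policy` class, `guarded` wrapper, and tool names are hypothetical illustrations, not an actual framework.

```python
from dataclasses import dataclass, field
from typing import Any, Callable


@dataclass
class Policy:
    """Hypothetical runtime policy: which tools an agent may call, and how often."""
    allowed_tools: set[str]
    max_calls_per_tool: int = 5
    call_counts: dict[str, int] = field(default_factory=dict)

    def check(self, tool_name: str, args: dict[str, Any]) -> None:
        # Reference-monitor check: runs before every tool invocation,
        # regardless of what the underlying agent "intended".
        if tool_name not in self.allowed_tools:
            raise PermissionError(f"policy violation: tool '{tool_name}' is not allowed")
        count = self.call_counts.get(tool_name, 0)
        if count >= self.max_calls_per_tool:
            raise PermissionError(f"policy violation: call budget exhausted for '{tool_name}'")
        self.call_counts[tool_name] = count + 1


def guarded(policy: Policy, tool_name: str, fn: Callable[..., Any]) -> Callable[..., Any]:
    """Wrap a tool so every call passes through the policy check first (the 'inline' part)."""
    def wrapper(**kwargs: Any) -> Any:
        policy.check(tool_name, kwargs)
        return fn(**kwargs)
    return wrapper


# Usage sketch: the agent only ever sees guarded versions of its tools.
policy = Policy(allowed_tools={"search"}, max_calls_per_tool=2)
search = guarded(policy, "search", lambda **kw: f"results for {kw['query']}")
print(search(query="IRMs"))    # allowed
print(search(query="agents"))  # allowed (second and final call in the budget)
# A third call, or a call to any unlisted tool, raises PermissionError.
```

The point of the analogy is that the check sits outside the agent's own reasoning: the monitor mediates every call, so the constraint holds even for behavior the designers didn't anticipate.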
On the GenAI side, I contributed to the Amazon Nova technical report and to work on responsible AI evaluation, red teaming, and adversarial testing methodology. I've also published on using gamification to teach IoT cybersecurity — building interactive, CTF-style experiences to improve student engagement with security concepts.
Selected publications
| Paper | Venue | Year | Citations |
|---|---|---|---|
| The Amazon Nova Family of Models: Technical Report and Model Card | Amazon / arXiv | 2024 | 54 |
| HybriGuard: IRM-Based Policy Enforcement for Hybrid Mobile Apps | IEEE S&P Workshops | 2017 | 31 |
| Control-Hijacking Vulnerabilities in IoT Firmware: A Brief Survey | IoT Security & Privacy Workshop | 2018 | 28 |
| HybriDiagnostics: Security Issues in Hybrid SmartHome Companion Apps | IEEE S&P Workshops | 2021 | 14 |
| Criminal Investigations: Gamification for IoT Cybersecurity Education | ACM SIGCSE | 2022 | 12 |
Full list on Google Scholar · 177 citations · h-index 6