Notes & preprints
Writing on trustworthy AI in practice
Technical content is everywhere, but it's still hard to find work that connects security, evaluation, and production systems. This space is where I share experiments, benchmarks, and lessons from building and defending real pipelines.
Expect themes like guardrails & prompt injection, hybrid defense architectures, observability & reliability, and scalable backends for AI workloads—always with an eye toward what actually ships.
Interested in collaborating or contributing? Reach out via the contact page.
Evaluating Hybrid Guardrail Architectures for Prompt Injection Defense in LLMs
A systematic evaluation of baseline, regex-only, and hybrid guardrails on 625 prompts. The hybrid regex + LLM classifier achieves strong recall on both standard and adversarial benchmarks; the article includes full metrics and failure analysis.
Read article