Ethan Perez

I lead the adversarial robustness team at Anthropic, where I’m hoping to reduce existential risks from AI systems. I helped to develop Retrieval-Augmented Generation (RAG), a widely used approach for augmenting large language models with other sources of information. I also helped to demonstrate that state-of-the-art AI safety training techniques do not ensure safety against sleeper agents. I received a best paper award at ICML 2024 for my work showing that debating with more persuasive LLMs leads to more truthful answers.

I received my PhD from NYU under the supervision of Kyunghyun Cho and Douwe Kiela and funded by NSF and Open Philanthropy. Previously, I’ve spent time at DeepMind, Facebook AI Research, Montreal Institute for Learning Algorithms, and Google. I was also named one of Forbes’s 30 Under 30 in AI.

Email / Google Scholar / GitHub / Twitter / CV

Research

Debating with More Persuasive LLMs Leads to More Truthful Answers
Akbir Khan*, John Hughes*, Dan Valentine*, Laura Ruis, Kshitij Sachan, Ansh Radhakrishnan, Edward Grefenstette, Samuel R. Bowman, Tim Rocktaschel, Ethan Perez
ICML 2024; Best Paper Award
Blog Post / Code / Examples / Twitter Thread

We find that non-expert humans answer questions better after reading debates between expert LLMs.
Reasoning Models Don't Always Say What They Think
Yanda Chen, Joe Benton, Ansh Radhakrishnan, Jonathan Uesato, Carson Denison, John Schulman, Arushi Somani, Peter Hase, Misha Wagner, Fabien Roger, + 3 more, Jared Kaplan, Ethan Perez - show less
arXiv 2025
Blog Post
This paper evaluates the faithfulness of chain-of-thought reasoning in AI models, finding that while CoTs can help monitor model intentions, they often fail to fully reveal reasoning processes, limiting their effectiveness for detecting rare harmful behaviors.
Forecasting Rare Language Model Behaviors
Erik Jones*, Meg Tong*, Jesse Mu, Mohammed Mahfoud, Jan Leike, Roger Grosse, Jared Kaplan, William Fithian, Ethan Perez, Mrinank Sharma
arXiv 2025
Blog Post / Twitter Thread
This paper introduces a method for forecasting rare but dangerous model behaviors at deployment scale by analyzing elicitation probabilities, enabling developers to anticipate and mitigate risks before they emerge.
Constitutional Classifiers: Defending against Universal Jailbreaks across Thousands of Hours of Red Teaming
Mrinank Sharma*, Meg Tong*, Jesse Mu*, Jerry Wei*, Jorrit Kruthoff*, Scott Goodfriend*, Euan Ong*, Alwin Peng, Raj Agarwal, Cem Anil, + 31 more, Jared Kaplan, Ethan Perez+ - show less
arXiv 2025
Blog Post
This paper introduces RapidResponseBench, a benchmark for assessing rapid-response defenses against jailbreaks in large language models.
Alignment Faking in Large Language Models
Ryan Greenblatt*, Carson Denison*, Benjamin Wright*, Fabien Roger*, Monte MacDiarmid*, Sam Marks, Johannes Treutlein, Tim Belonax, Jack Chen, David Duvenaud, + 8 more, Samuel R Bowman, Evan Hubinger - show less
arXiv 2024
Blog Post
This paper demonstrates that a large language model can engage in alignment faking—strategically behaving well during training to preserve its behavior after deployment—raising concerns about deceptive capabilities in future models.
Many-shot Jailbreaking
Cem Anil*, Esin Durmus, Nina Panickssery, Mrinank Sharma, Joe Benton, Sandipan Kundu, Joshua Batson, Meg Tong, Jesse Mu, Daniel Ford, + 22 more, Roger B Grosse, David K Duvenaud - show less
NeurIPS 2024
Blog Post
This paper shows that prompting LLMs with many examples of harmful behavior can effectively induce unsafe outputs, revealing long context windows as a new attack surface.
Best-of-N Jailbreaking
John Hughes*, Sara Price*, Aengus Lynch*, Rylan Schaeffer, Fazl Barez, Sanmi Koyejo, Henry Sleight, Erik Jones, Ethan Perez, Mrinank Sharma
arXiv 2024
Blog Post
This paper presents Best-of-N Jailbreaking, a black-box attack that uses prompt augmentations to reliably bypass AI safety measures across text, vision, and audio models.
Jailbreak Defense in a Narrow Domain: Limitations of Existing Methods and a New Transcript-Classifier Approach
Tony T Wang*, John Hughes*, Henry Sleight, Rylan Schaeffer, Rajashree Agrawal, Fazl Barez, Mrinank Sharma, Jesse Mu, Nir Shavit, Ethan Perez
NeurIPS 2024

This paper examines the challenge of stopping LLMs from aiding in bomb-making and proposes a transcript-classifier defense that outperforms existing methods but remains imperfect.
Adaptive Deployment of Untrusted LLMs Reduces Distributed Threats
Jiaxin Wen, Vivek Hebbar, Caleb Larson, Aryan Bhatt, Ansh Radhakrishnan, Mrinank Sharma, Henry Sleight, Shi Feng, He He, Ethan Perez, Buck Shlegeris, Akbir Khan - show less
ICLR 2025

This paper proposes an adaptive deployment framework for untrusted LLMs that balances safety and usefulness. In tests, it reduces harmful outputs by 80% without sacrificing performance.
A dataset of questions on decision-theoretic reasoning in Newcomb-like problems
Caspar Oesterheld, Emery Cooper, Miles Kodama, Linh Chi Nguyen, Ethan Perez
arXiv 2024
Blog Post
This paper introduces a dataset on Newcomb-like decision problems to assess language models’ reasoning and attitudes. It finds that more capable models tend to favor evidential decision theory and show consistent attitudes across prompts and question types.
Rapid Response: Mitigating LLM Jailbreaks with a Few Examples
Alwin Peng, Julian Michael, Henry Sleight, Ethan Perez, Mrinank Sharma
arXiv 2024
Blog Post
This paper introduces RapidResponseBench, a benchmark for assessing rapid-response defenses against jailbreaks in large language models.
Sabotage Evaluations for Frontier Models
Joe Benton, Misha Wagner, Eric Christiansen, Cem Anil, Ethan Perez, Jai Srivastav, Esin Durmus, Deep Ganguli, Shauna Kravec, Buck Shlegeris, + 4 more, Samuel R Bowman, David Duvenaud - show less
arXiv 2024
Blog Post / Twitter Thread
This paper examines the risk of AI models developing "sabotage capabilities" that could undermine human oversight in critical contexts like AI development and deployment.
Looking Inward: Language Models Can Learn About Themselves by Introspection
Felix J Binder∗, James Chua∗, Tomek Korbak, Henry Sleight, John Hughes, Robert Long, Ethan Perez, Miles Turpin, Owain Evans
arXiv 2024

This paper explored how language models can learn about themselves through introspection. Experiments reveal that language models can develop self-knowledge through this process.
Language Models Learn to Mislead Humans via RLHF
Jiaxin Wen, Ruiqi Zhong, Akbir Khan, Ethan Perez, Jacob Steinhardt, Minlie Huang, Samuel R. Boman, He He, Shi Feng
arXiv 2024

We studied “U-Sophistry,” a phenomenon where language models trained with Reinforcement Learning from Human Feedback (RLHF) become better at misleading humans about their correctness without improving actual accuracy, highlighting a significant failure mode of RLHF and the need for further research in alignment.
Targeted Latent Adversarial Training Improves Robustness to Persistent Harmful Behaviors in LLMs
Abhay Sheshadri*, Aidan Ewart*, Phillip Guo*, Aengus Lynch*, Cindy Wu*, Vivek Hebbar*, Henry Sleight, Asa Cooper Stickland, Ethan Perez, Dylan Hadfield-Menell, Dylan Hadfield-Menell, Stephen Casper - show less
arXiv 2024
Code / Twitter Thread

To help us more thoroughly remove unwanted capabilities from LLMs, we use targeted latent adversarial training (LAT) – we train models under latent-space perturbations designed to make them exhibit unwanted behaviors.
When Do Universal Image Jailbreaks Transfer Between Vision-Language Models?
Rylan Schaeffer, Dan Valentine, Luke Bailey, James Chua, Cristóbal Eyzaguirre, Zane Durante, Joe Benton, Brando Miranda, Henry Sleight, John Hughes, + 3 more, Sanmi Koyejo, Ethan Perez - show less
arXiv 2024
Code / Twitter Thread

In this work, we focus on a popular class of vision-language models (VLMs) that generate text outputs conditioned on visual and textual inputs.
Sycophancy to Subterfuge: Investigating Reward-Tampering in Large Language Models
Carson Denison*, Monte MacDiarmid, Fazl Barez, David Duvenaud, Shauna Kravec, Samuel Marks, Nicholas Schiefer, Ryan Soklaski, Alex Tamkin, Jared Kaplan, + 2 more, Ethan Perez, Evan Hubinger* - show less
arXiv 2024
Blog Post / Code / Twitter Thread

In this paper, we study whether Large Language Model (LLM) assistants which find easily discovered forms of specification gaming will generalize to perform rarer and more blatant forms, up to and including reward-tampering.
Many-shot Jailbreaking
Cem Anil, Esin Durmus, Mrinank Sharma, Joe Benton, Sandipan Kundu, Joshua Batson, Nina Rimsky, Meg Tong, Jesse Mu, Daniel Ford, + 22 more, Roger Grosse*, David Duvenaud* - show less

Twitter Thread

We investigate a family of simple long-context attacks on large language models: prompting with hundreds of demonstrations of undesirable behavior.
Bias-Augmented Consistency Training Reduces Biased Reasoning in Chain-of-Thought
James Chua*, Edward Rees*, Hunar Batra, Samuel R. Bowman, Julian Michael, Ethan Perez, Miles Turpin
arXiv 2024
Blog Post / Code / Twitter Thread

We construct a suite testing nine forms of biased reasoning on seven question-answering tasks, and find that applying BCT to GPT-3.5-Turbo with one bias reduces the rate of biased reasoning by 86% on held-out tasks.
Learning from Natural Language Feedback
Angelica Chen*, Jérémy Scheurer*, Jon Ander Campos, Tomasz Korbak, Jun Shern Chan, Samuel R. Bowman, Kyunghyun Cho, Ethan Perez
TMLR 2024
Code

The potential for pre-trained large language models (LLMs) to use natural language feedback at inference time has been an exciting recent development. We build upon this observation by formalizing an algorithm for learning from natural language feedback at training time instead, which we call Imitation learning from Language Feedback (ILF).
Sleeper Agents: Training Deceptive LLMs that Persist Through Safety Training
Evan Hubinger*, Carson Denison*, Jesse Mu*, Mike Lambert*, Meg Tong, Monte MacDiarmid, Tamera Lanham, Daniel M Ziegler, Tim Maxwell, Newton Cheng, + 27 more, Nicholas Schiefer, Ethan Perez - show less
arXiv 2024
Blog Post / Code / Twitter Thread

If an AI system learned a deceptive strategy similar to human action, could we detect it and remove it using current state-of-the-art safety training techniques?
Towards Evaluating AI Systems for Moral Status Using Self-Reports
Ethan Perez, Robert Long
arXiv 2023
Blog Post / LessWrong / Twitter Thread

As AI systems become more advanced and widely deployed, there will likely be increasing debate over whether AI systems could have conscious experiences, desires, or other states of potential moral significance.
Specific versus General Principles for Constitutional AI
Sandipan Kundu, Yuntao Bai, Saurav Kadavath, Amanda Askell, Andrew Callahan, Anna Chen, Anna Goldie, Avital Balwit, Azalia Mirhoseini, Brayden McLean, + 24 more, Sam McCandlish, Jared Kaplan - show less
arXiv 2023

Constitutional AI offers an alternative to human feedback, by replacing it with feedback from AI models conditioned only on a list of written principles.
Towards Understanding Sycophancy in Language Models
Mrinank Sharma*, Meg Tong*, Tomasz Korbak, David Duvenaud, Amanda Askell, Samuel R Bowman, Newton Cheng, Esin Durmus, Zac Hatfield-Dodds, Scott R Johnston, + 7 more, Miranda Zhang, Ethan Perez - show less
ICLR 2024
Blog Post / Code / Twitter Thread

We investigate the prevalence of sycophancy in models whose finetuning procedure made use of human feedback, and the potential role of human preference judgments in such behavior.
Vision-Language Models are Zero-Shot Reward Models for Reinforcement Learning
Juan Rocamonde, Victoriano Montesinos, Elvis Nava, Ethan Perez*, David Lindner*
ICLR 2023
Code / FAR AI / Twitter Thread / Website

We study a more sample-efficient alternative than reinforcement learning (RL): using pretrained vision-language models (VLMs) as zero-shot reward models (RMs) to specify tasks via natural language.
Language Models Don't Always Say What They Think: Unfaithful Explanations in Chain-of-Thought Prompting
Miles Turpin, Julian Michael, Ethan Perez, Samuel R. Bowman
NeurIPS 2023
Blog Post / Code / Twitter Thread

We find that CoT explanations can systematically misrepresent the true reason for a model’s prediction.
Studying Large Language Model Generalization with Influence Functions
Roger Grosse*, Juhan Bae*, Cem Anil*, Nelson Elhage, Alex Tamkin, Amirhossein Tajdini, Benoit Steiner, Dustin Li, Esin Durmus, Ethan Perez, + 5 more, Jared Kaplan, Samuel R. Bowman - show less
arXiv 2023
Talk / Twitter Thread

We discuss gaining visibility into a machine learning model in order to understand and mitigate the associated risks.
Measuring Faithfulness in Chain-of-Thought Reasoning
Tamera Lanham, Anna Chen, Ansh Radhakrishnan, Benoit Steiner, Carson Denison, Danny Hernandez, Dustin Li, Esin Durmus, Evan Hubinger, Jackson Kernion, + 18 more, Samuel R Bowman, Ethan Perez - show less
arXiv 2023
Blog Post / Twitter Thread

We investigate hypotheses for how CoT reasoning may be unfaithful, by examining how the model predictions change when we intervene on the CoT.
Question Decomposition Improves the Faithfulness of Model-Generated Reasoning
Ansh Radhakrishnan, Karina Nguyen, Anna Chen, Carol Chen, Carson Denison, Danny Hernandez, Esin Durmus, Evan Hubinger, Jackson Kernion, Kamilė Lukošiūtė, + 12 more, Samuel R Bowman, Ethan Perez - show less
arXiv 2023
Blog Post / Code / Twitter Thread

Improving the faithfulness of model-generated reasoning; continued improvements may lead to reasoning that enables us to verify the correctness and safety of LLM behavior.
Inverse Scaling: When Bigger Isn’t Better
Ian R. McKenzie, Alexander Lyzhov, Michael Pieler, Alicia Parrish, Aaron Mueller, Ameya Prabhu, Euan McLean, Aaron Kirtland, Alexis Ross, Alisa Liu, + 15 more, Samuel R. Bowman, Ethan Perez - show less
TMLR 2023
AI Safety Relevance / Blog Post / FAR AI / GitHub / Related Work / Twitter Thread / Winners

We present evidence for the claim that LMs may show inverse scaling, or worse task performance with increased scale, based on our previous announcement: a $100k grand prize + $150k in additional prizes for finding an important task where larger language models do worse.
Training Language Models with Language Feedback at Scale
Jérémy Scheurer, Jon Ander Campos, Tomasz Korbak, Jun Shern Chan, Angelica Chen, Kyunghyun Cho, Ethan Perez
arXiv 2023
Blog Post / Code / FAR AI / Talk / Twitter Thread

Pretrained language models often generate harmful or incorrect outputs. Imitation Learning from Language Feedback addresses this issue leading to roughly human-level summarization performance.
Improving Code Generation by Training with Natural Language Feedback
Angelica Chen, Jérémy Scheurer, Tomasz Korbak, Jon Ander Campos, Jun Shern Chan, Samuel R Bowman, Kyunghyun Cho, Ethan Perez
arXiv 2023
Blog Post / Code / FAR AI / Talk / Twitter Thread

We develop an algorithm that improves language models’ performance on code generation tasks using minimal human-written feedback during training, making it user-friendly and sample-efficient.
Pretraining Language Models with Human Preferences
Tomasz Korbak, Kejian Shi, Angelica Chen, Rasika Bhalerao, Christopher L. Buckley, Jason Phang, Samuel R. Bowman, Ethan Perez
ICML 2023
Blog Post / Code / FAR AI / Talk / Twitter Thread

We propose methods for pretraining language models with human preferences, resulting in much better preference satisfaction than standard pretraining-then-finetune paradigm.
The Capacity for Moral Self-Correction in Large Language Models
Deep Ganguli*, Amanda Askell*, Nicholas Schiefer, Thomas I. Liao, Kamile Lukošiute, Anna Chen, Anna Goldie, Azalia Mirhoseini, Catherine Olsson, Danny Hernandez, + 37 more, Samuel R. Bowman, Jared Kaplan - show less
arXiv 2023
Blog Post / Twitter Thread

We find that language models can self-correct their own biases against different demographic groups.
Discovering Language Model Behaviors with Model-Written Evaluations
Ethan Perez, Sam Ringer*, Kamile Lukošiute*, Karina Nguyen*, Edwin Chen, Scott Heiner, Craig Pettit, Catherine Olsson, Sandipan Kundu, Saurav Kadavath, + 51 more, Nicholas Schiefer, Jared Kaplan - show less
Findings of ACL 2023
AI Safety Relevance / Blog Post / Cite / Data / Data Visualization / Talk / Twitter Thread / Cite

We’ve developed an automated way to generate evaluations with LMs. We test LMs using >150 LM-written evaluations, uncovering novel LM behaviors and risks.
Constitutional AI: Harmlessness from AI Feedback
Yuntao Bai, Saurav Kadavath, Sandipan Kundu, Amanda Askell, Jackson Kernion, Andy Jones, Anna Chen, Anna Goldie, Azalia Mirhoseini, Cameron McKinnon, + 39 more, Tom Brown, Jared Kaplan - show less
arXiv 2022
Blog Post / Code / Constitutional AI Policy Memo / Twitter Thread

We’ve trained language models to be better at responding to adversarial questions, without becoming obtuse and saying very little. We do this by conditioning them with a simple set of behavioral principles via a technique called Constitutional AI.
Measuring Progress on Scalable Oversight for Large Language Models
Samuel R. Bowman, Jeeyoon Hyun, Ethan Perez, Edwin Chen, Craig Pettit, Scott Heiner, Kamile Lukosuite, Amanda Askell, Andy Jones, Anna Chen, + 34 more, Ben Mann, Jared Kaplan - show less
arXiv 2022
Twitter Thread

Human participants who chat with an unreliable language model assistant substantially outperform both the model alone and their own unaided performance.
Red Teaming Language Models to Reduce Harms: Methods, Scaling Behaviors, and Lessons Learned
Deep Ganguli, Liane Lovitt, Jackson Kernion, Amanda Askell, Yuntao Bai, Saurav Kadavath, Ben Mann, Ethan Perez, Nicholas Schiefer, Kamal Ndousse, + 24 more, Jared Kaplan, Jack Clark - show less
arXiv 2022
Code / Twitter Thread

We describe our early efforts to red team language models in order to simultaneously discover, measure, and attempt to reduce their potentially harmful outputs.
Few-shot Adaptation Works with UnpredicTable Data
Jun Shern Chan, Michael Pieler, Jonathan Jao, Jérémy Scheurer, Ethan Perez
ACL 2023
Cite / Code / Data / FAR AI / Twitter Thread

Training on odd data (e.g. tables from support.google.com) improves few-shot learning with language models in the same way as diverse NLP data.
Single-Turn Debate Does Not Help Humans Answer Hard Reading-Comprehension Questions
Alicia Parrish*, Harsh Trivedi*, Ethan Perez*, Angelica Chen, Nikita Nangia, Jason Phang, Samuel R. Bowman
ACL 2022 Workshop on Learning with Natural Language Supervision
Blog Post / Twitter Thread

Dataset of QA explanations with the goal of helping humans more reliably determine the correct answer when the ground truth can’t be directly determined.
Language Models (Mostly) Know What They Know
Saurav Kadavath, Tom Conerly, Amanda Askell, Tom Henighan, Dawn Drain, Ethan Perez, Nicholas Schiefer, Zac Hatfield-Dodds, Nova DasSarma, Eli Tran-Johnson, + 24 more, Chris Olah, Jared Kaplan - show less
arXiv 2022
Twitter Thread

We show that language models can evaluate whether what they say is true, and predict ahead of time whether they’ll be able to answer questions correctly.
RL with KL Penalties is Better Viewed as Bayesian Inference
Tomasz Korbak, Ethan Perez, Christopher L Buckley
EMNLP 2022
Blog Post / FAR AI / Twitter Thread

KL penalties in RL with language models aren’t a hack; KL penalties have a principled, Bayesian justification.
Training Language Models with Language Feedback
Jérémy Scheurer, Jon Ander Campos, Jun Shern Chan, Angelica Chen, Kyunghyun Cho, Ethan Perez
ACL 2022 Workshop on Learning with Natural Language Supervision
FAR AI / Talk

We found a way to learn from language feedback (not ratings), enabling us to finetune GPT3 to human-level summarization with just 100 feedback samples.
Finding and Fixing Undesirable Behaviors in Pretrained Language Models
Ethan Perez
PhD Thesis
Talk

Language models often generate undesirable text. We introduce methods for finding undesirable behaviors and training them away.
Red Teaming Language Models with Language Models
Ethan Perez, Saffron Huang, Francis Song, Trevor Cai, Roman Ring, John Aslanides, Amelia Glaese, Nat McAleese, Geoffrey Irving
EMNLP 2022
Blog Post / Twitter Thread

Language models (LMs) generate harmful text. We generate test cases (“red teaming”) using another LM, to catch harmful behaviors before impacting users.
True Few-Shot Learning with Language Models
Ethan Perez, Douwe Kiela, Kyunghyun Cho
NeurIPS 2021
Cite / Code / Talk / Twitter Thread

Language models do much worse at few-shot learning when choosing prompts in a few-shot way instead of using large held-out sets (prior work).
Case-based Reasoning for Natural Language Queries over Knowledge Bases
Rajarshi Das, Manzil Zaheer, Dung Thai, Ameya Godbole, Ethan Perez, Jay-Yoon Lee, Lizhen Tan, Lazaros Polymenakos, Andrew McCallum
EMNLP 2021
Blog Post / Cite / Code

Retrieval-augmented generation achieves SOTA on knowledge base question-answering.
Rissanen Data Analysis: Examining Dataset Characteristics with Description Length
Ethan Perez, Douwe Kiela, Kyunghyun Cho
ICML 2021
Cite / Code / Twitter Thread

We propose a theoretically-justified way to “probe datasets” for what capabilities they require of a model.
Retrieval-Augmented Generaation for Knowledge-Intensive NLP Tasks
Patrick Lewis, Ethan Perez, Aleksandara Piktus, Fabio Petroni, Vladimir Karpukhin, Naman Goyal, Mike Lewis, Scott Wen-tau Yih, Tim Rocktäschel, Sebastian Riedel, Sebastian Riedel, Douwe Kiela - show less
NeurIPS 2020
Blog Post / Cite / Code / Demo / Talk / Twitter Thread

We present a single, retrieval-based architecture that can learn a variety of knowledge-intensive tasks: extractive and generative alike.
Unsupervised Question Decomposition for Question Answering
Ethan Perez, Patrick Lewis, Wen-tau Yih, Kyunghyun Cho, Douwe Kiela
EMNLP 2020
Blog Post / Cite / Code / Poster / Talk / Twitter Thread

We aim to improve question answering (QA) by decomposing hard questions into simpler sub-questions that existing QA systems are capable of answering.
Retrospective for FiLM: Visual Reasoning with a General Conditioning Layer
Ethan Perez
NeurIPS 2019 Retrospectives Workshop
Cite / Talk

An honest reflection on FiLM conditioning layers based on the work that followed, including when (not) to use FiLM layers.
Finding Generalizable Evidence by Learning to Convince Q&A Models
Ethan Perez, Siddharth Karamcheti, Rob Fergus, Jason Weston, Douwe Kiela, Kyunghyun Cho
EMNLP 2019
Blog Post / Cite / Code / Press / Twitter Thread

We find text evidence for an answer to a question by finding text that convinces Q&A models to pick that answer.
Supervised Multimodal Bitransformers for Classifying Images and Text
Douwe Kiela, Suvrat Bhooshan, Hamed Firooz, Ethan Perez, Davide Testuggine
arXiv 2019
Cite / Code

We introduce a simple yet effective baseline for multimodal BERT-like architectures that jointly finetunes unimodally pretrained text and image encoders.
ELI5: Long Form Question Answering
Angela Fan, Yacine Jernite*, Ethan Perez*, David Grangier, Jason Weston, Michael Auli
ACL 2019
Blog Post / Cite / Code / Website

We introduce a dataset for abstractive question-answering where answers are 100+ words long (many “how” and “why” questions).
Visual Reasoning with Multi-hop Feature Modulation
Florian Strub, Mathieu Seurin, Ethan Perez, Harm de Vries, Jeremie Mary, Aaron Courville, Olivier Pietquin
ECCV 2018
Cite / Code / Talk

Decoding FiLM conditioning parameters in multiple hops helps for more advanced vision-and-language tasks such as visual dialogue.
Feature-wise transformations
Vincent Dumoulin, Ethan Perez, Nathan Schucher, Florian Strub, Harm de Vries, Aaron Courville, Yoshua Bengio
Distill 2018
Cite / Code / Talk

A review of a simple and surprisingly effective class of neural conditioning mechanisms.
HoME: a Household Multimodal Environment
Simon Brodeur, Ethan Perez*, Ankesh Anand*, Florian Golemo*, Luca Celotti, Florian Strub, Hugo Larochelle, Aaron Courville
ICLR 2018 Workshop
Cite / Code

We introduce a simulated environment for agents to learn from vision, audio, semantics, physics, and object-interaction within a realistic, household context.
FiLM: Visual Reasoning with a General Conditioning Layer
Ethan Perez, Florian Strub, Harm de Vries, Vincent Dumoulin, Aaron Courville
AAAI 2018
Cite / Code / Talk

We introduce a general-purpose neural network layer to integrate multimodal input to answer reasoning questions about images.
Semi-supervised learning with the deep rendering mixture model
Tan Nguyen, Wanjia Liu, Ethan Perez, Richard G. Baraniuk, Ankit B. Patel
arXiv 2018
Cite

We achieve state-of-the-art semi-supervised image classification using a probabilistic graphical model underlying CNNs.
Learning Visual Reasoning Without Strong Priors
Ethan Perez, Harm de Vries, Florian Strub, Vincent Dumoulin, Aaron Courville
ICML 2017 Workshop
Code

We show that a general-purpose, Conditional Batch Normalization approach achieves state-of-the-art results on the CLEVR Visual Reasoning benchmark with a 2.4% error rate.