Stack Neural Module Network Demo

This is a small, barebones demo of the Stack Neural Module Network (SNMN) from this paper, running my own reproduction of the model from this repository. To use it, choose an image from the list, then either pick one of the suggested questions or type your own. You can also choose between a model trained with ground-truth layout supervision and one trained without it. Check out my blog post for more details!

If a prediction is taking a long time to run, reload the page and try again (requests sometimes time out because the servers are slow).
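If you would rather script the retry than reload by hand, the same idea can be sketched in a few lines of Python. This is only an illustration: `predict_with_retry` and the `predict` callable are hypothetical names, not part of this demo, and the sketch assumes a slow request surfaces as Python's built-in `TimeoutError`.

```python
import time


def predict_with_retry(predict, retries=3, delay=1.0):
    """Call predict() and retry when it times out.

    `predict` is a hypothetical zero-argument callable standing in for
    whatever sends the question to the model. It is assumed to raise
    TimeoutError when the server is too slow.
    """
    if retries < 1:
        raise ValueError("retries must be at least 1")
    last_error = None
    for attempt in range(retries):
        try:
            return predict()
        except TimeoutError as err:
            last_error = err
            # Wait longer after each failure, but not after the last one.
            if attempt < retries - 1:
                time.sleep(delay * (2 ** attempt))
    # Every attempt timed out, so surface the most recent error.
    raise last_error
```

Doubling the wait between attempts gives a slow server some room to recover instead of hitting it again immediately.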
CLEVR image
The model is only trained to answer compositional-style questions; see the default questions for each image for examples. The model trained with ground-truth layouts produces more interpretable module layouts, but it requires extra supervision during training.