Visual Question Answering

This is a series of projects we are working on, with the broader goal of building AI-complete systems. Visual Question Answering is a task that requires combining insights from the vision and NLP domains in ways that are deceptively nontrivial.

We are working on three ideas:

  • Improving parsing

  • Employing additional information such as captions to improve the models

  • Encouraging agents to learn by asking questions