Multiple choice tests allow test takers to compare answers in order to eliminate less promising options. Each choice can also be weighed against the question to infer patterns that might have been missed. The ability to narrow down sets of choices in order to come up with an answer is arguably the true comprehension test.
Researchers at Tel Aviv University and Facebook were inspired by this process to develop a machine learning model that generates answers to Raven's Progressive Matrices (RPM). RPM is a type of intelligence test that requires exam takers to fill in the missing location in a grid of abstract images. The coauthors claim that their algorithm is not only able to generate a plausible set of answers competitive with state-of-the-art methods, but that it could also be used to build an automatic tutoring system that adjusts to the proficiencies of individual students.
RPM is a nonverbal test typically used in educational settings. It’s usually a 60-item exam given to measure abstract reasoning, which is regarded as a nonverbal estimate of fluid intelligence (i.e., the ability to solve novel reasoning problems). Each question consists of eight images placed on a 3 x 3 grid, with one cell left empty. The task is to generate the missing ninth image in the third row and third column, such that it matches the patterns of the rows and columns of the grid.
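To make the grid structure concrete, here is a toy sketch in Python. It is not the researchers' method: it replaces the abstract images with integers and assumes a single hypothetical rule (each row and each column is an arithmetic progression). The function name `complete_grid` and the puzzle values are illustrative assumptions.

```python
from typing import List, Optional

# A 3 x 3 puzzle; None marks the missing cell in the third row, third column.
Grid = List[List[Optional[int]]]


def complete_grid(grid: Grid) -> int:
    """Fill the bottom-right cell of a toy 3 x 3 puzzle.

    Assumption (hypothetical rule): every row and every column is an
    arithmetic progression, so the missing value can be extrapolated
    from the two known cells in its row and in its column.
    """
    # Extrapolate along the third row from its two known cells.
    row = grid[2]
    from_row = row[1] + (row[1] - row[0])

    # Extrapolate along the third column from its two known cells.
    col = [grid[r][2] for r in range(2)]
    from_col = col[1] + (col[1] - col[0])

    # A valid answer must satisfy the row pattern and the column pattern.
    if from_row != from_col:
        raise ValueError("row and column patterns disagree")
    return from_row


puzzle: Grid = [
    [1, 2, 3],
    [2, 3, 4],
    [3, 4, None],
]
answer = complete_grid(puzzle)
print(answer)
```

Real RPM items combine several rules over visual attributes such as shape, size, and count, which is why generating an answer image is much harder than this numeric stand-in suggests.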
The model combines what the researchers describe as three pathways: reconstruction, recognition, and generation. The reconstruction pathway provides supervision so each image is encoded into a numerical representation and aggregated along rows and columns. The recognition pathway shapes the representations in a way that makes the semantic information more explicit. And the generation pathway combines the visual representation from the first pathway with the semantic embedding obtained with the help of the second, mapping the semantic representation of a question to an image.
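The "encode, then aggregate along rows and columns" step can be illustrated with a minimal sketch. This is an assumption-laden stand-in, not the researchers' architecture: the real model uses learned neural networks, whereas the hypothetical `encode` below just flattens pixels and `aggregate_context` simply averages. All names and the tiny 1 x 2 "images" are invented for illustration.

```python
from typing import Dict, List, Sequence, Tuple

Vector = List[float]
Position = Tuple[int, int]  # (row, column) in the 3 x 3 grid


def encode(image: Sequence[Sequence[float]]) -> Vector:
    """Hypothetical encoder: flatten the pixel grid into a vector.
    (A stand-in for a learned image encoder.)"""
    return [float(p) for row in image for p in row]


def mean(vectors: Sequence[Vector]) -> Vector:
    """Element-wise mean of equally sized vectors."""
    n = len(vectors)
    return [sum(vals) / n for vals in zip(*vectors)]


def aggregate_context(cells: Dict[Position, Vector]) -> Vector:
    """Summarize the known cells along each row and each column,
    then average those six summaries into one context vector."""
    rows = [mean([v for (r, _), v in cells.items() if r == i]) for i in range(3)]
    cols = [mean([v for (_, c), v in cells.items() if c == j]) for j in range(3)]
    return mean(rows + cols)


# Eight known cells; the bottom-right cell (2, 2) is the one to generate.
images = {
    (r, c): [[r, c]]  # toy 1 x 2 "image" whose pixels are its coordinates
    for r in range(3)
    for c in range(3)
    if (r, c) != (2, 2)
}
embeddings = {pos: encode(img) for pos, img in images.items()}
context = aggregate_context(embeddings)
print(context)
```

In a generative setup of the kind described above, a vector like `context` would be the input from which a decoder produces the missing image; the decoder is omitted here because the article does not describe it in detail.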
In an experiment involving a dataset of matrix problems called RAVEN-FAIR, the researchers report that their model attained 60.8% accuracy overall. “Our method presents very convincing generation results. The state of the art recognition methods regard the generated answer as the right one in a probability that approaches that of the ground truth answer,” they wrote. “This is despite the non-deterministic nature of the problem, which means that the generated answer is often completely different … from the ground truth image. In addition, we demonstrate that the generation capability captures most rules, with little neglect of specific ones.”
Beyond potential applications in education, the researchers assert that the shift from selecting an answer from a closed set to generating an answer could lead to more interpretable machine learning methods. Because the generated output may reveal information about the underlying inference process, they say models like theirs could be useful in validating machine logic through the implementation of AI systems.