
gpt-4v-enade-cs-2021's Introduction

ChatGPT-4 Vision Evaluation on ENADE 2021 Bachelor in Computer Science Exam

This repository contains supplementary materials for the study reported in the paper "Evaluating ChatGPT-4 Vision on Brazil's National Undergraduate Computer Science Exam", which has been accepted for publication in the ACM Transactions on Computing Education.

The tables below give an overview of the open and multiple-choice questions, respectively, of the ENADE 2021 Bachelor in Computer Science exam. Each table lists every question's subject, modality, reasoning strategy, and scoring status, along with ChatGPT-4 Vision's accuracy and the main challenges or errors it encountered when answering. Click the links in the rightmost column of each table to view the model's full conversations and the expert assessments (when available) for each question, in English and Portuguese.

Open Questions

| # | Subject | Modality | Reasoning Strategy | Status | Model Accuracy | Model Challenges / Error Categories | Model Conversations and Expert Assessments |
|---|---------|----------|--------------------|--------|----------------|-------------------------------------|--------------------------------------------|
| 03 | Theory of Computing | Visual | Direct | Scored | Partly correct | Logical Reasoning / Incorrect Multi-step Reasoning, Visual Acuity / Misidentification of Visual Elements | English version, Portuguese version |
| 04 | Computer Architecture | Visual | Direct | Scored | Partly correct | Visual Acuity / Lack of Domain-specific Visual Output | English version, Portuguese version |
| 05 | Algorithms | Visual | Direct | Scored | Partly correct | Logical Reasoning / Incorrect Algorithmic Reasoning | English version, Portuguese version |

Multiple-choice Questions

| # | Subject | Modality | Reasoning Strategy | Status | Model Accuracy | Model Challenges / Error Categories | Model Conversations and Expert Assessments |
|---|---------|----------|--------------------|--------|----------------|-------------------------------------|--------------------------------------------|
| 09 | Operating Systems | Text | Direct | Scored | Incorrect | Logical Reasoning / Insufficient Domain Knowledge | English version, Portuguese version |
| 10 | Programming | Text | Direct | Scored | Correct | | English version, Portuguese version |
| 11 | Artificial Intelligence | Visual | Indirect | Scored | Correct | | English version, Portuguese version |
| 12 | Computers in Society | Text | Indirect | Not scored | Incorrect | Logical Reasoning / Evaluation Leniency, Logical Reasoning / Incorrect Multi-step Reasoning | English version, Portuguese version |
| 13 | Software Engineering | Text | Direct | Not scored | Incorrect | Question Interpretation / Inconsistent Responses, Logical Reasoning / Insufficient Domain Knowledge | English version, Portuguese version |
| 14 | Computer Architecture | Text | Indirect | Scored | Incorrect | Logical Reasoning / Evaluation Leniency, Logical Reasoning / Insufficient Domain Knowledge | English version, Portuguese version |
| 15 | Software Engineering | Text | Indirect | Scored | Correct | | English version, Portuguese version |
| 16 | Computer Architecture | Visual | Direct | Scored | Correct | | English version, Portuguese version |
| 17 | Operating Systems | Visual | Indirect | Not scored | Correct | | English version, Portuguese version |
| 18 | Artificial Intelligence | Text | Indirect | Scored | Incorrect | Question Interpretation / Inconsistent Responses, Logical Reasoning / Incorrect Multi-step Reasoning | English version, Portuguese version |
| 19 | Human-Computer Interaction | Text | Indirect | Scored | Correct | | English version, Portuguese version |
| 20 | Programming | Text | Indirect | Scored | Correct | | English version, Portuguese version |
| 21 | Computer Networks | Visual | Direct | Not scored | Incorrect | Logical Reasoning / Evaluation Leniency | English version, Portuguese version |
| 22 | Information Systems | Visual | Direct | Scored | Correct | | English version, Portuguese version |
| 23 | Programming | Visual | Direct | Scored | Correct | | English version, Portuguese version |
| 24 | Computer Security | Text | Indirect | Scored | Correct | | English version, Portuguese version |
| 25 | Distributed Systems | Text | Indirect | Not scored | Incorrect | Question Interpretation / Inconsistent Responses, Logical Reasoning / Incorrect Multi-step Reasoning | English version, Portuguese version |
| 26 | Web Development | Text | Indirect | Scored | Incorrect | Logical Reasoning / Insufficient Domain Knowledge, Logical Reasoning / Incorrect Multi-step Reasoning | English version, Portuguese version |
| 27 | Performance Analysis | Visual | Direct | Scored | Correct | | English version, Portuguese version |
| 28 | Computer Architecture | Visual | Indirect | Scored | Correct | | English version, Portuguese version |
| 29 | Image Processing | Visual | Indirect | Invalid | | | English version, Portuguese version |
| 30 | Compilers | Text | Indirect | Scored | Correct | | English version, Portuguese version |
| 31 | Theory of Computing | Visual | Indirect | Scored | Incorrect | Question Interpretation / Non-Compliance with Guidelines, Logical Reasoning / Incorrect Algorithmic Reasoning, Logical Reasoning / Incorrect Multi-step Reasoning, Visual Acuity / Misidentification of Visual Elements | English version, Portuguese version |
| 32 | Algorithms | Text | Indirect | Not scored | Correct | | English version, Portuguese version |
| 33 | Programming | Text | Direct | Invalid | | | English version, Portuguese version |
| 34 | Graph Theory | Visual | Direct | Scored | Incorrect | Visual Acuity / Misidentification of Visual Elements | English version, Portuguese version |
| 35 | Distributed Systems | Text | Indirect | Scored | Correct | | English version, Portuguese version |
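
For readers who want to summarize the multiple-choice table programmatically, the sketch below tallies the model's accuracy on scored questions by modality. It is only an illustration: the `ROWS` list and the `tally` helper are hypothetical names introduced here, the data is transcribed by hand from the multiple-choice table above, and the figures reported in the paper take precedence over anything this script prints.

```python
from collections import Counter

# (question number, modality, status, model accuracy), transcribed by hand
# from the multiple-choice table. Invalid questions have no accuracy entry,
# so None is used for them. ROWS is a hypothetical name, not part of the repo.
ROWS = [
    (9, "Text", "Scored", "Incorrect"),
    (10, "Text", "Scored", "Correct"),
    (11, "Visual", "Scored", "Correct"),
    (12, "Text", "Not scored", "Incorrect"),
    (13, "Text", "Not scored", "Incorrect"),
    (14, "Text", "Scored", "Incorrect"),
    (15, "Text", "Scored", "Correct"),
    (16, "Visual", "Scored", "Correct"),
    (17, "Visual", "Not scored", "Correct"),
    (18, "Text", "Scored", "Incorrect"),
    (19, "Text", "Scored", "Correct"),
    (20, "Text", "Scored", "Correct"),
    (21, "Visual", "Not scored", "Incorrect"),
    (22, "Visual", "Scored", "Correct"),
    (23, "Visual", "Scored", "Correct"),
    (24, "Text", "Scored", "Correct"),
    (25, "Text", "Not scored", "Incorrect"),
    (26, "Text", "Scored", "Incorrect"),
    (27, "Visual", "Scored", "Correct"),
    (28, "Visual", "Scored", "Correct"),
    (29, "Visual", "Invalid", None),
    (30, "Text", "Scored", "Correct"),
    (31, "Visual", "Scored", "Incorrect"),
    (32, "Text", "Not scored", "Correct"),
    (33, "Text", "Invalid", None),
    (34, "Visual", "Scored", "Incorrect"),
    (35, "Text", "Scored", "Correct"),
]


def tally(rows, status="Scored"):
    """Count (modality, accuracy) pairs among rows with the given status."""
    return Counter(
        (modality, accuracy)
        for _, modality, row_status, accuracy in rows
        if row_status == status
    )


scored = tally(ROWS)
for modality in ("Text", "Visual"):
    correct = scored[(modality, "Correct")]
    total = correct + scored[(modality, "Incorrect")]
    print(f"{modality}: {correct}/{total} scored questions answered correctly")
```

Passing `status="Not scored"` to `tally` gives the same breakdown for the questions that were excluded from scoring.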

The following materials are also available:
