This bachelor thesis addresses the challenge of verifying students' individual understanding of programming submissions in an era where large language models are widely used. Instructors traditionally rely on targeted questions during tutorials to confirm that students truly comprehend the code they submit, but this process is time-consuming and difficult to scale.
The goal of this thesis is to design and implement a standalone web application that simulates this tutorial setting independently of any existing programming platform. Teachers create programming tasks with deadlines, grading criteria, a ground-truth solution, and unit tests. Students upload their source code, run the unit tests, and then answer multiple-choice and open questions that an LLM-powered chatbot generates from their own submission. The system scores the answers automatically and combines them with the unit test results into an overall grade for the task.
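The combination of unit test results and question scores into one grade could be sketched as a simple weighted sum. The weighting below is a hypothetical illustration, not a decision taken from the thesis, and all names are invented for this sketch:

```python
from dataclasses import dataclass


@dataclass
class SubmissionResult:
    tests_passed: int       # number of unit tests that passed
    tests_total: int        # number of unit tests run
    question_score: float   # chatbot answer score, normalized to [0, 1]


def final_grade(result: SubmissionResult, test_weight: float = 0.6) -> float:
    """Combine unit test results and question scores into a grade in [0, 1].

    The 60/40 split is an assumed example weighting; the actual criteria
    would be configured by the teacher per task.
    """
    test_ratio = (
        result.tests_passed / result.tests_total if result.tests_total else 0.0
    )
    return test_weight * test_ratio + (1 - test_weight) * result.question_score


# 8 of 10 tests passed, chatbot answers scored 0.75
print(round(final_grade(SubmissionResult(8, 10, 0.75)), 2))
```

A scheme like this keeps the two signals separable, so a submission that passes all tests but whose author cannot answer questions about it is still penalized.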
The thesis will cover the design and implementation of the web application itself, the task creation and submission workflow, the LLM-based question generation and answer scoring, and the automatic computation of the final grade.
The final concept and implementation will be evaluated with example programming exercises, incorporating feedback from instructors to assess usability, pedagogical value, and the reliability of LLM-generated assessments.