Generating Diverse Code Explanations using the GPT-3 Large Language Model | AI Paper Digest

TL;DR Highlight

A study analyzing how GPT-3 can automatically generate multiple natural-language explanations of a single code snippet from different perspectives.

Who Should Read

Developers and researchers interested in automated code documentation, code comprehension tooling, and using LLMs for developer education.

Core Mechanics

GPT-3 can generate diverse natural language explanations for the same code snippet by varying the prompt perspective (what it does, why it exists, how it works)
Different prompting strategies yield explanations that vary in quality and in how well they suit the target audience
Higher-quality explanations are produced when prompts specify the intended audience (beginner vs. expert) and explanation type
Multi-perspective explanations improve code comprehension more than single explanations in user studies
Automatic evaluation metrics correlate moderately with human judgments of explanation quality
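One way to picture the idea of varying perspective and audience is to build a small grid of prompts, one per combination. The sketch below is illustrative only: the wording of each perspective and audience instruction, and the names `PERSPECTIVES`, `AUDIENCES`, and `build_prompts`, are assumptions made for this example and are not taken from the paper.

```python
from itertools import product

# Hypothetical instruction wordings; none of these come from the paper.
PERSPECTIVES = {
    "what": "Explain what this code does.",
    "why": "Explain why the code is written this way.",
    "how": "Explain how the code works, step by step.",
}
AUDIENCES = {
    "beginner": "Write for someone in their first programming course.",
    "expert": "Write for an experienced developer; skip the basics.",
}


def build_prompts(code: str) -> dict[tuple[str, str], str]:
    """Return one prompt per (perspective, audience) pair."""
    prompts = {}
    for (p_key, p_text), (a_key, a_text) in product(
        PERSPECTIVES.items(), AUDIENCES.items()
    ):
        prompts[(p_key, a_key)] = f"{p_text} {a_text}\n\n{code.strip()}"
    return prompts


prompts = build_prompts("for i in range(5):\n    print(i * 2)")
print(len(prompts))  # prints 6 (3 perspectives x 2 audiences)
```

Each prompt can then be sent to the model separately, which keeps the perspective and the audience as independent knobs.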

Evidence

Human evaluation study comparing single vs. multi-perspective explanations on comprehension tasks
Tested across multiple GPT-3 variants with different prompt templates
Correlation analysis between automatic metrics (BLEU, BERTScore) and human ratings
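A correlation analysis of this kind can be sketched with a plain Pearson coefficient between metric scores and human ratings. The numbers below are made-up placeholders, not results from the paper, and the choice of Pearson (rather than a rank correlation) is an assumption for illustration.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


# Placeholder data: one automatic score and one human rating per explanation.
metric_scores = [0.31, 0.45, 0.52, 0.70, 0.64]
human_ratings = [2, 3, 3, 5, 4]
print(f"Pearson r = {pearson(metric_scores, human_ratings):.2f}")
```

A value near 1 would mean the metric tracks human judgment closely; a moderate value suggests the metric is only a rough proxy.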

How to Apply

When auto-generating code comments or documentation, prompt the LLM with specific explanation angles: 'explain what this does', 'explain why this approach was chosen', 'explain this for a junior dev'.
Generate multiple explanation candidates and use a ranker or human review to select the best for your documentation.
Tailor the prompt's target audience specification to match your actual readers for better output quality.
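The "generate several candidates, then pick one" step can be sketched as follows. The scorer here is a deliberately crude stand-in for a real ranker or human review: it only counts how many identifier-like tokens from the code the explanation mentions. The names `identifier_coverage` and `rank_candidates` are hypothetical, and the candidate strings are hand-written rather than model output.

```python
import re

TOKEN = re.compile(r"[A-Za-z_]\w*")


def identifier_coverage(code: str, explanation: str) -> float:
    """Fraction of identifier-like tokens in the code that the explanation mentions."""
    code_tokens = set(TOKEN.findall(code))
    if not code_tokens:
        return 0.0
    explanation_tokens = set(TOKEN.findall(explanation))
    return len(code_tokens & explanation_tokens) / len(code_tokens)


def rank_candidates(code, candidates, scorer=identifier_coverage):
    """Return candidates sorted from highest to lowest score."""
    return sorted(candidates, key=lambda c: scorer(code, c), reverse=True)


code = "for i in range(5):\n    print(i * 2)"
candidates = [
    "It prints numbers.",
    "The for loop uses range to set i, then calls print on i times 2.",
]
ranked = rank_candidates(code, candidates)
print(ranked[0])
```

Swapping in a different `scorer` (a learned ranker, a second LLM call, or reviewer votes) leaves the selection logic unchanged.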

Code Example

# Requires the openai Python package, version 1.0 or later.
from openai import OpenAI

# The client reads the API key from the OPENAI_API_KEY environment variable.
client = OpenAI()

code_snippet = """
for i in range(5):
    print(i * 2)
"""

# One prompt template per explanation type.
explanation_types = {
    "execution_trace": "Explain step-by-step what happens when this Python code runs, including the value of each variable at each step:",
    "term_definition": "Define the key programming terms and concepts used in this Python code in simple language for a beginner:",
    "hint": "Give a helpful hint about what this Python code does without giving away the full answer, suitable for a student learning to code:"
}

def generate_code_explanation(code, explanation_type):
    """Request one explanation of `code` using the template for `explanation_type`."""
    prompt = f"{explanation_types[explanation_type]}\n\n{code}"
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",  # or gpt-4
        messages=[
            {"role": "system", "content": "You are a helpful programming tutor."},
            {"role": "user", "content": prompt}
        ],
        max_tokens=300
    )
    return response.choices[0].message.content

# Generate explanations from three perspectives
for etype in explanation_types:
    print(f"=== {etype} ===")
    print(generate_code_explanation(code_snippet, etype))
    print()

Terminology

Code Explanation: Natural language text describing what a piece of code does, aimed at helping developers understand it without running it.

Multi-perspective explanation: Generating multiple descriptions of the same code from different angles (functionality, rationale, implementation details).

BERTScore: An automatic text evaluation metric that measures semantic similarity between generated and reference text using BERT embeddings.

BLEU: Bilingual Evaluation Understudy. An automatic metric comparing generated text to reference text based on n-gram overlap.
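To make "n-gram overlap" concrete, here is a minimal sketch of the clipped n-gram precision that BLEU is built on, computed against a single reference with whitespace tokenization. It is not a full BLEU implementation: full BLEU combines precisions for several n-gram sizes with a geometric mean and applies a brevity penalty. The function names and example sentences are made up for illustration.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def clipped_precision(candidate: str, reference: str, n: int) -> float:
    """Share of candidate n-grams found in the reference, with counts clipped."""
    cand = ngrams(candidate.split(), n)
    ref = ngrams(reference.split(), n)
    total = sum(cand.values())
    if total == 0:
        return 0.0
    # Clip each n-gram's count at the number of times it occurs in the reference.
    overlap = sum(min(count, ref[gram]) for gram, count in cand.items())
    return overlap / total


candidate = "the loop prints even numbers"
reference = "the loop prints each doubled value"
print(clipped_precision(candidate, reference, 1))  # 0.6
print(clipped_precision(candidate, reference, 2))  # 0.5
```

Clipping keeps a candidate from scoring well by repeating one matching word: "the the the" against "the cat" gets a unigram precision of 1/3, not 1.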

Original Abstract

Good explanations are essential to efficiently learning introductory programming concepts [10]. To provide high-quality explanations at scale, numerous systems automate the process by tracing the execution of code [8, 12], defining terms [9], giving hints [16], and providing error-specific feedback [10, 16]. However, these approaches often require manual effort to configure and only explain a single aspect of a given code segment. Large language models (LLMs) are also changing how students interact with code [7]. For example, Github's Copilot can generate code for programmers [4], leading researchers to raise concerns about cheating [7]. Instead, our work focuses on LLMs' potential to support learning by explaining numerous aspects of a given code snippet. This poster features a systematic analysis of the diverse natural language explanations that GPT-3 can generate automatically for a given code snippet. We present a subset of three use cases from our evolving design space of AI Explanations of Code.