Scenario I: Full Control (Chat-Based Coding)

Questions

  • How can I use generative AI chatbots effectively for coding tasks?

  • What are the risks and benefits of manual copy/paste workflows?

  • What prompting strategies work best for research code?

Objectives

  • Learn effective prompting techniques for code generation

  • Understand what information is shared when using chatbots

  • Develop a modular approach to building research pipelines with AI assistance

  • Practice critical evaluation of AI-generated code

The full control scenario

In this scenario, you interact with an AI assistant through a web interface (like ChatGPT, Claude, or Gemini) and manually copy code between the chatbot and your development environment.

+------------------+                    +------------------+
|                  |   Your prompts     |                  |
|    AI Chatbot    | <----------------- |       You        |
|    (Web UI)      |                    |                  |
|                  | -----------------> |                  |
+------------------+   Generated code   +--------+---------+
                                                 |
                                        manual   |
                                       copy/paste|
                                                 v
                                        +--------+---------+
                                        |    Your IDE      |
                                        | (VS Code/Jupyter)|
                                        |                  |
                                        | - Review code    |
                                        | - Run manually   |
                                        | - Debug yourself |
                                        +------------------+

ASCII chart generated with Claude; it should be replaced with a block diagram.

Why this is the lowest-risk approach

  1. You see everything: Every piece of code goes through your eyes and clipboard

  2. Nothing runs automatically: You decide when and how to execute code

  3. Clear boundaries: The AI cannot access your files, run commands, or modify anything

  4. Explicit data sharing: You control exactly what context the AI receives

What information leaves your machine?

When you use a chat interface, the following is typically sent to the provider’s servers:

What you send       Considerations
------------------  -------------------------------------------------------------------
Your prompts        May contain sensitive project details
Code you paste      Could include proprietary algorithms
Error messages      May reveal system configuration, usernames, local paths, filenames
Data samples        Never paste real research data!

Data handling policies vary

Different providers have different policies:

  • Some use your conversations to train future models

  • Some offer enterprise tiers with data isolation (e.g. Microsoft Azure)

  • Read the terms of service for your chosen tool

  • When in doubt, assume your input may be retained

  • Many providers retain your conversations for a limited period (often around 30 days), even if you opt out of training.

What NOT to share

  • Real research data (use synthetic examples instead)

  • API keys, passwords, or credentials

  • Proprietary algorithms, unpublished methods, trade secrets, or other protected intellectual property

  • Patient/participant identifiable information

  • Unpublished results or embargoed data

Three modes of AI-assisted coding

Effective AI-assisted coding involves switching between three distinct modes depending on where you are in the development process.

Exploration mode: Ask for options

When starting a project or evaluating approaches, use the AI for research:

"What are options for parsing XML in Python? Include pros/cons of each."

"What approaches could I use for parallelizing this computation?"

"What are some useful drag-and-drop libraries in JavaScript?
Build me an example demonstrating each one."

This helps you understand the landscape before committing to an approach. The AI’s training cut-off means newer libraries won’t be suggested, but for established options this is often fine—you want stability anyway.

Planning mode: Design the workflow

Once you have explored your options, good planning ensures the workflow is sketched out, dependencies are considered, and your system is set up for the actual work (folder structure, version control, environment, and other dependencies).

If the first stage was about brainstorming, this one is about project management. A good, well-written plan makes it easier to add or remove components in your workflow later. This is the equivalent of writing down all the comments before writing your code.

Example of useful prompts:

"Help me plan this pipeline: it needs to load CSV files, compute summary
statistics per condition, and produce figures. Suggest a folder structure,
the dependencies I should record, and the order in which to build and test
each component."

"Before we write any code, list the steps of this workflow, the inputs and
outputs of each step, and what I should set up first (virtual environment,
version control, test data)."

Production mode: Tell them exactly what to do

Once you’ve decided on an approach, switch to authoritarian prompting. Provide precise specifications:

Write a Python function with this signature:

def validate_participant_id(pid: str) -> bool:
    """
    Validate a participant ID matches our format: P followed by 3 digits.

    Parameters
    ----------
    pid : str
        The participant ID to validate

    Returns
    -------
    bool
        True if valid, False otherwise

    Raises
    ------
    TypeError
        If pid is not a string
    """

You act as the function designer; the AI implements to your specification.
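
For illustration, the response to such a prompt might look something like this (the exact implementation will vary by model; the full docstring from the prompt would normally be carried over and is abbreviated here):

import re

def validate_participant_id(pid: str) -> bool:
    """Validate a participant ID: 'P' followed by exactly three digits."""
    if not isinstance(pid, str):
        raise TypeError(f"Expected a string, got {type(pid).__name__}")
    # Accept exactly 'P' followed by three digits and nothing else
    return re.fullmatch(r"P\d{3}", pid) is not None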

Practitioner’s perspective: Function signatures as prompts

“I find LLMs respond extremely well to function signatures. I get to act as the function designer, the LLM does the work of building the body to my specification.

If your reaction to this is ‘surely typing out the code is faster than typing out an English instruction of it’, all I can tell you is that it really isn’t for me any more. Code needs to be correct. English has enormous room for shortcuts, and vagaries, and typos, and saying things like ‘use that popular HTTP library’ if you can’t remember the name off the top of your head.

The good coding LLMs are excellent at filling in the gaps. They’re also much less lazy than me—they’ll remember to catch likely exceptions, add accurate docstrings, and annotate code with the relevant types.”

— Simon Willison

Managing context effectively

The key skill in AI-assisted coding is managing context—the accumulated text in your conversation that influences the AI’s responses.

What goes into context

Element                     Impact
--------------------------  --------------------------------
Your prompts                Directly shape the response
AI’s previous responses     Become part of the “memory”
Code you paste              Provides examples and patterns
Error messages shared       Help with debugging

TODO insert a box about context windows in popular AI systems, also talk about “memory” features which can influence the context

Practical context management

Tips for context management:

  1. Start fresh when stuck: If a conversation isn’t productive, begin a new one

  2. Build iteratively: Get simple versions working, then add complexity

  3. Seed with examples: Paste working code as context for new but similar tasks

  4. Provide multiple examples: “Here are three similar functions I’ve written. Use them as inspiration for this new one.”

Effective prompting strategies

The quality of AI-generated code depends heavily on how you ask for it.

TODO: this is a collection of prompts in no particular order from an initial list expanded by Claude. It needs refining, citations, sorting, what works well…

Strategy 1: Start with architecture, not implementation

Instead of asking for 500 lines of code at once, begin with structure:

Poor approach:

“Write a complete Python script to analyze my fMRI data, including preprocessing, statistical analysis, and visualization.”

Better approach:

“I need to build a pipeline for fMRI analysis. What would be a good modular structure? I’m thinking separate functions for:

  1. Loading data

  2. Preprocessing

  3. Statistical analysis

  4. Visualization

Can you outline what each module should do and what the interfaces between them should look like?”
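
A helpful answer to the better prompt is often a skeleton you can implement and test one function at a time. A minimal sketch of what that outline might translate to (module and function names here are illustrative, not prescribed):

# analysis_pipeline.py -- structure only; each function is implemented and tested separately

def load_data(filepath):
    """Read the raw imaging data from disk and return it in memory."""
    ...

def preprocess(raw_data):
    """Clean the raw data (e.g. motion correction, filtering) and return it."""
    ...

def run_statistics(clean_data):
    """Fit the statistical model and return a results object."""
    ...

def visualize(results, output_dir):
    """Save summary figures for the results to output_dir."""
    ...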

Strategy 2: Provide context about your environment

Help the AI understand your constraints:

I'm working on:
- Python 3.11
- Ubuntu 22.04
- Using NumPy, SciPy, and Matplotlib (prefer not to add new dependencies)
- Data is in HDF5 format

I need a function to...

Strategy 3: Ask for explanations, not just code

Write a function to perform bootstrap resampling. Please:
1. Explain the statistical principle first
2. Show the implementation
3. Include comments explaining each step
4. Note any assumptions the code makes
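
For comparison, the implementation portion of a typical response might resemble this sketch (assuming NumPy; the explanation of the statistical principle that the prompt also asks for is omitted here):

import numpy as np

def bootstrap_mean_ci(data, n_resamples=10_000, ci=95, rng=None):
    """Bootstrap confidence interval for the mean of a one-dimensional sample."""
    # Assumption: observations are independent and identically distributed
    rng = np.random.default_rng() if rng is None else rng
    data = np.asarray(data)
    # Resample with replacement many times and record the statistic each time
    means = np.empty(n_resamples)
    for i in range(n_resamples):
        sample = rng.choice(data, size=data.size, replace=True)
        means[i] = sample.mean()
    # The spread of the resampled means gives the confidence interval
    alpha = (100 - ci) / 2
    lower, upper = np.percentile(means, [alpha, 100 - alpha])
    return lower, upper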

Strategy 4: Iterate incrementally

Session flow:
1. "Write a function to load CSV data with error handling"
2. [Test it, find an issue]
3. "This fails when the file has a BOM. Can you fix that?"
4. [Works now]
5. "Now let's add support for gzipped files..."

Discussion

Why modular development matters even more with AI

When you generate code in small, testable pieces:

  • You can verify each piece works before moving on

  • Errors are easier to localize and fix

  • You maintain understanding of what each part does

  • The AI has clearer context for each request

Strategy 5: Remember it’s a conversation

The first response is rarely the final answer. Use follow-up prompts freely:

  • “Break that repetitive code out into a function”

  • “Use string manipulation methods rather than a regular expression”

  • “Write that better!”

  • “Now write tests for this using pytest”

  • “Can you explain what line 15 does?”

Building a research pipeline: Worked example

Let’s walk through how to build a simple analysis pipeline using the chat-based approach.

Step 1: Define the task

“I have a CSV file with columns: participant_id, condition, reaction_time, accuracy. I need to:

  1. Load and validate the data

  2. Calculate summary statistics per condition

  3. Run a statistical test comparing conditions

  4. Create a visualization

Let’s start with step 1. Write a Python function to load the CSV and validate that all expected columns are present.”

Step 2: Review and test

The AI might respond with:

import pandas as pd

def load_and_validate_data(filepath):
    """
    Load experimental data and validate required columns.

    Parameters
    ----------
    filepath : str
        Path to the CSV file

    Returns
    -------
    pd.DataFrame
        Validated dataframe

    Raises
    ------
    ValueError
        If required columns are missing
    """
    required_columns = {'participant_id', 'condition', 'reaction_time', 'accuracy'}

    df = pd.read_csv(filepath)

    missing = required_columns - set(df.columns)
    if missing:
        raise ValueError(f"Missing required columns: {missing}")

    return df

Your review checklist:

  • Does it handle the file not existing?

  • What about encoding issues?

  • Is the error message helpful?
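
One way to answer those questions is to exercise the function yourself before asking for changes. A quick, throwaway check might look like this (the filenames are hypothetical; load_and_validate_data is the function above):

import pandas as pd

# Does it handle a file that does not exist? (pandas raises FileNotFoundError)
try:
    load_and_validate_data("does_not_exist.csv")
except FileNotFoundError as exc:
    print("Missing file:", exc)

# Is the error message helpful when a column is absent?
pd.DataFrame({"participant_id": ["P001"]}).to_csv("incomplete.csv", index=False)
try:
    load_and_validate_data("incomplete.csv")
except ValueError as exc:
    print("Validation error:", exc)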

Step 3: Iterate based on testing

“This works, but I’d like to add:

  1. Check for missing values in critical columns

  2. Validate that reaction_time is positive

  3. Log warnings instead of failing for recoverable issues”
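
A revised version incorporating those three requests might look roughly like this (a sketch of one plausible response, not the only correct answer):

import logging

import pandas as pd

logger = logging.getLogger(__name__)

def load_and_validate_data(filepath):
    """Load experimental data, validate it, and warn about recoverable issues."""
    required_columns = {'participant_id', 'condition', 'reaction_time', 'accuracy'}

    df = pd.read_csv(filepath)

    # Hard failure: missing columns make the data unusable
    missing = required_columns - set(df.columns)
    if missing:
        raise ValueError(f"Missing required columns: {missing}")

    # Recoverable issues: log warnings instead of failing
    n_missing = int(df[list(required_columns)].isna().sum().sum())
    if n_missing:
        logger.warning("Found %d missing values in critical columns", n_missing)

    if (df['reaction_time'] <= 0).any():
        logger.warning("Some reaction_time values are not positive")

    return df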

Step 4: Continue to next module

Once satisfied with data loading, move to summary statistics:

“Great, that function is working. Now write a function that takes the validated dataframe and returns summary statistics (mean, std, n) for reaction_time and accuracy, grouped by condition.”
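
Again for illustration, the response might resemble the following (the function name is illustrative; the exact shape of the returned table will vary):

def summarize_by_condition(df):
    """Return mean, std, and n for reaction_time and accuracy, grouped by condition."""
    summary = (
        df.groupby('condition')[['reaction_time', 'accuracy']]
          .agg(['mean', 'std', 'count'])
    )
    return summary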

Common pitfalls and how to avoid them

Pitfall 1: Accepting code without understanding it

Problem: You paste code and it works, but you don’t understand why.

Solution: Always ask “Can you explain how this works?” If you can’t explain the code to a colleague, you shouldn’t use it.

Pitfall 2: Not testing edge cases

Problem: Code works for your test case but fails on real data.

Solution: Explicitly ask about edge cases:

“What edge cases should I test for this function? What inputs might break it?”

Pitfall 3: Accumulating technical debt

Problem: Quick solutions that are hard to maintain long-term.

Solution: Ask for production-quality code:

“Write this function following best practices: include docstrings, type hints, and error handling.”

Pitfall 4: Hallucinated packages or functions

Problem: AI suggests importing a package that doesn’t exist.

Solution: Before running pip install, verify the package exists:

  • Check on PyPI

  • Search GitHub for the repository

  • Be especially suspicious of packages you’ve never heard of

When to relax the rules: Learning and exploration

The careful, verify-everything approach described above is essential for production code. But for learning and exploration, a looser approach can accelerate your understanding of what AI can do.

Practitioner’s perspective: Vibe-coding for learning

Andrej Karpathy coined the term “vibe-coding” for a more relaxed approach:

“There’s a new kind of coding I call ‘vibe coding’, where you fully give in to the vibes, embrace exponentials, and forget that the code even exists. […] I ask for the dumbest things like ‘decrease the padding on the sidebar by half’ because I’m too lazy to find it. I ‘Accept All’ always, I don’t read the diffs anymore.”

Simon Willison’s take:

“Andrej suggests this is ‘not too bad for throwaway weekend projects’. It’s also a fantastic way to explore the capabilities of these models—and really fun. The best way to learn LLMs is to play with them. Throwing absurd ideas at them and vibe-coding until they almost sort-of work is a genuinely useful way to accelerate the rate at which you build intuition for what works and what doesn’t.”

When vibe-coding is appropriate:

  • Learning a new library or framework

  • Prototyping ideas quickly

  • Building throwaway demonstrations

  • Exploring what’s possible

When to be rigorous:

  • Research code that will be published

  • Code that processes real data

  • Anything that will be shared or maintained

  • Security-sensitive applications

Warning

The distinction matters: vibe-coding is for learning, not for production. Don’t let the ease of accepting suggestions lead you to deploy code you don’t understand.

Using AI to understand existing code

One of the lowest-risk, highest-value use cases for AI: asking it to explain code you didn’t write.

Why this is valuable for researchers

  • Onboarding to inherited codebases

  • Understanding library internals

  • Code review preparation

  • Learning from others’ implementations

Workflow for code understanding

  1. Paste the code (or relevant portions) into a chat

  2. Ask for an overview: “Give me an architectural overview of this code”

  3. Drill down: “Explain what this specific function does”

  4. Identify issues: “What could go wrong with this implementation?”

Practitioner’s perspective: Questions are low-stakes

“If the idea of using LLMs to write code for you still feels deeply unappealing, there’s another use-case for them which you may find more compelling. Good LLMs are great at answering questions about code.

This is also very low stakes: the worst that can happen is they might get something wrong, which may take you a tiny bit longer to figure out. It’s still likely to save you time compared to digging through thousands of lines of code entirely by yourself.

The trick here is to dump the code into a long context model and start asking questions.”

— Simon Willison

Example prompts for code understanding

"What is the main purpose of this module?"

"Trace the data flow from input to output."

"What external dependencies does this code have?"

"Where might this code fail with unexpected input?"

"Explain the algorithm in step 3 of this function."

This is a great starting point for researchers who are hesitant about AI-generated code—you remain in full control, and you’re using the AI to augment your understanding rather than replace your work.

Exercises

Exercise Chat-1: Build a data loader

Using duck.ai (or your preferred AI chatbot):

  1. Ask it to write a function that loads JSON files containing experimental results with this structure:

    {"participant": "P01", "scores": [85, 90, 78], "completed": true}
    
  2. Ask it to add validation for:

    • Required fields

    • Score values between 0 and 100

    • Proper data types

  3. Test the resulting function with valid and invalid inputs

  4. Ask the AI: “What could still go wrong with this function?”

Exercise Chat-2: Debug with AI assistance

Take this buggy code and use duck.ai to help fix it:

def calculate_statistics(data):
    mean = sum(data) / len(data)
    variance = sum((x - mean) ** 2 for x in data) / len(data)
    std = variance ** 0.5
    return {"mean": mean, "std": std, "variance": variance}

# This crashes:
result = calculate_statistics([])

Practice:

  1. Describe the error to the chatbot

  2. Ask it to fix the bug

  3. Ask what other edge cases might cause problems

Best practices checklist

When using chat-based AI coding:

  • [ ] Never paste sensitive data - use synthetic examples

  • [ ] Start with architecture - outline before implementation

  • [ ] Work modularly - small functions you can test

  • [ ] Verify packages exist - check PyPI before installing

  • [ ] Test thoroughly - include edge cases

  • [ ] Understand the code - if you can’t explain it, don’t use it

  • [ ] Document your process - note which parts were AI-generated

The real benefit: Enabling more ambitious projects

It’s tempting to think of AI coding assistants purely in terms of speed. But the deeper benefit is different.

Practitioner’s perspective: Speed enables ambition

“This is why I care so much about the productivity boost I get from LLMs: it’s not about getting work done faster, it’s about being able to ship projects that I wouldn’t have been able to justify spending time on at all.

The fact that LLMs let me execute my ideas faster means I can implement more of them, which means I can learn even more.”

— Simon Willison

For researchers, this might mean:

  • Building a quick visualization tool you wouldn’t have time to code manually

  • Prototyping analysis approaches before committing to one

  • Creating small utilities that improve your workflow

  • Exploring “what if” questions through rapid implementation

Warning

Managing expectations: LLMs amplify existing expertise. An experienced developer will get better results than a beginner because they:

  • Know what to ask for

  • Can evaluate whether the output is correct

  • Understand which follow-up questions to ask

  • Recognize when the AI is wrong

This doesn’t mean beginners can’t benefit—they can. But don’t expect AI to instantly give you expert-level results if you’re still learning the fundamentals.

Summary

The chat-based approach offers maximum control over AI-assisted coding:

  • You explicitly choose what information to share

  • You review all code before it runs

  • You maintain full understanding of your codebase

The trade-off is speed: manually copying code and context takes time. For many research applications, this trade-off is worthwhile, as the control and transparency support reproducibility and scientific integrity.

Keypoints

  • Chat-based AI coding gives you maximum control over data and execution

  • Use exploration mode (ask for options) when researching, production mode (precise specs) when implementing

  • Context management is the key skill: start fresh when stuck, build iteratively, provide examples

  • Work in small, testable modules rather than generating large code blocks

  • Iterate freely: the first response is a starting point, not the final answer

  • AI is also valuable for understanding existing code, not just generating new code

  • The real benefit is enabling projects you wouldn’t otherwise have time for

  • Never share sensitive research data with AI chatbots