Scenario III: Full Agentic Code Development
TODO this changed again in early february and needs updating…
Questions
What are agentic coding tools and how do they differ from other approaches?
What risks come with giving an AI agent access to your system?
How can I use these tools while maintaining appropriate oversight?
Objectives
Understand how agentic AI coding tools work
Recognize the risks of autonomous code modification and command execution
Learn about sandboxing and permission controls
Develop strategies for supervising AI agents in coding tasks
The agentic scenario
In this scenario, you give an AI agent a high-level task, and it autonomously writes code, executes commands, runs tests, and modifies files to accomplish the goal. You basically become a project manager rather than a coder; depending on the level of automation, you might not even see any of the code that is generated and run.
+--------+ +---------------------+
| You | "Add user authentication | |
| | to this Flask app" | AI Agent |
| | --------------------------> | |
+--------+ | Plans approach |
^ | Reads your files |
| | Writes new code |
| "Done. I've added login, | Runs tests |
| registration, and session | Installs packages |
| management. Here's what | Makes git commits |
| I changed..." | |
+---------------------------------- +---------------------+
|
+--------------------------------+--------------------------------+
| | | | |
v v v v v
+---------+ +---------+ +---------+ +---------+ +---------+
| Read | | Write | | Execute | | Install | | Git |
| Files | | Files | | Commands| | Packages| | Commit |
+---------+ +---------+ +---------+ +---------+ +---------+
ASCII chart generated with Claude, let’s make a nice block diagram
Why this is the highest-risk approach
Autonomous execution: Agent runs commands without per-command approval
System access: May have access to your shell, files, network
Opaque decision-making: Hard to predict what the agent will do
Irreversible actions: Deleted files, pushed commits, installed packages
Landscape of agentic coding tools
Claude Code (Anthropic)
TODO: rewrite with the /plan command and mention skills and /learn commands
A terminal-based agent that lives in your shell:
# Install
npm install -g @anthropic-ai/claude-code
# Run
claude
Capabilities:
Reads and writes files in your project
Executes shell commands
Understands git workflows
Can work across multiple files
Safety features:
Sandboxing via Bubblewrap (Linux) or Seatbelt (macOS)
Asks for permission before potentially dangerous operations
Shows proposed changes before applying
Practitioner’s perspective: A real Claude Code session
Simon Willison documented a real project built with Claude Code—a colophon page showing the commit history of tools he’d built. The session took 17 minutes and cost $0.61 in API calls.
His prompting sequence:
“Build a Python script that checks the commit history for each HTML file and extracts any URLs from those commit messages…”
[Inspected output, found issues] “It looks like it just got the start of the URLs, it should be getting the whole URLs…”
“Update the script—I want to capture the full commit messages AND the URLs…”
“This is working great. Write me a new script called build_colophon.py…”
[Got wrong output] “It’s not working right. ocr.html had a bunch of commits but there is only one link…”
“It’s almost perfect, but each page should have the commits displayed in the opposite order—oldest first”
“One last change—the pages are currently listed alphabetically, let’s instead list them with the most recently modified at the top”
Key observations:
Started with vibe-coding (didn’t look at the code it wrote)
Iterated based on observable output
Gave specific, concrete feedback when things were wrong
Let the agent handle implementation details
“I was expecting that ‘Test’ job, but why were there two separate deploys? […] It was time to ditch the LLMs and read some documentation!”
Even experienced practitioners reach points where they need to take over.
OpenAI Codex
REDO Erased old stuff, need to check the updates.
Cursor (AI-native IDE)
TODO: this below is AI generated, I don’t have cursor to test this
A VS Code-based editor with deep AI integration:
Agent mode:
Plans multi-file changes
Executes changes autonomously
Runs up to 8 parallel agents
Shows combined diff views
Safety features:
Sandboxed terminal by default
Shows changes before committing
SOC 2 compliant privacy mode
Other tools
Tool |
Type |
Key Characteristic |
|---|---|---|
Aider |
Terminal |
Open source, multiple model support |
GitHub Copilot Workspace |
Web |
PR-centric workflow |
Continue |
IDE extension |
Open source, customizable |
What can go wrong?
TODO: add links to reddit and other horror stories
Agentic tools can cause serious problems if not properly supervised.
File system risks
Potential agent actions:
- Delete important files (rm, unlink)
- Overwrite configurations
- Modify files outside project directory
- Fill disk with generated content
Real example: Agents have been observed attempting commands like
rm -rf / when confused about cleanup tasks.
Command execution risks
Dangerous commands an agent might run:
- curl ... | bash (arbitrary code execution)
- pip install <package> (supply chain risk)
- chmod 777 ... (security weakening)
- ssh, scp (network access)
Data exfiltration risks
Ways agents could leak data:
- Include in API requests to the AI service
- Execute curl/wget to external URLs
- Write to network-accessible locations
- Include in git commits pushed to public repos
Package installation risks (Hallucination attacks)
AI models sometimes suggest packages that don’t exist. Attackers can:
Monitor AI suggestions for hallucinated package names
Register those packages on PyPI/npm with malicious code
Wait for developers to install them
This is called “slopsquatting”:
AI suggests: pip install flask-security-utils
Reality: This package doesn't exist... or does it now?
An attacker may have registered it with malware
Supply chain attacks are real
The 2024 xz-utils backdoor showed how sophisticated supply chain attacks can be. Agentic tools that install packages without verification increase this attack surface significantly.
Permission models and controls
Different tools offer different levels of control:
Claude Code permission system
TODO: this needs to be updated
Claude Code asks for permission before certain operations:
Claude wants to run: pip install pandas
Allow? [y/n/always/never]
You can configure default permissions:
// .claude/settings.json
{
"permissions": {
"allow_file_read": true,
"allow_file_write": "ask",
"allow_shell_commands": "ask",
"allow_network": false
}
}
Sandboxing levels
Level |
What it restricts |
Tools |
|---|---|---|
None |
Full system access |
Dangerous! |
Process |
Network, some syscalls |
Bubblewrap, Seatbelt |
Container |
Filesystem, network |
Docker |
VM |
Everything |
Firecracker microVM |
Practical sandboxing with Docker
Run your agent inside a container:
# Create a Dockerfile for your project
FROM python:3.11-slim
WORKDIR /project
COPY . .
# Install dependencies
RUN pip install -r requirements.txt
# Run agent inside container
# Run with limited access
docker run -it \
--network none \ # No network access
--read-only \ # Read-only root filesystem
--tmpfs /tmp \ # Writable tmp only
-v $(pwd):/project \ # Mount project directory
my-agent-container
Supervision strategies
If you choose to use agentic tools, implement these safeguards:
1. Start with read-only mode
Many tools have modes where they can only suggest, not execute:
# Claude Code in plan-only mode
claude --plan-only "Add error handling to main.py"
2. Review all proposed changes
Agent proposes changes to 3 files:
[M] src/main.py (+15, -3)
[M] src/utils.py (+8, -0)
[A] tests/test_main.py (+45)
Review changes? [y/n]
Never skip this step. Read every change.
3. Use version control religiously
# Before any agentic session
git status # Check clean state
git stash # Stash any changes
git checkout -b ai-experiment # Work on a branch
# After agent finishes
git diff # Review all changes
git add -p # Stage selectively
git commit # Or git reset --hard to undo
4. Set up monitoring
Watch what the agent is doing:
# In another terminal, watch file changes
watch -n 1 'git status'
# Or use a file system watcher
fswatch -r . | while read f; do echo "Modified: $f"; done
5. Limit scope
Give agents minimal access:
Good: "Fix the bug in parse_data() function in src/parser.py"
Bad: "Fix all the bugs in the project"
Good: "Add a unit test for the validate_email function"
Bad: "Set up the entire test infrastructure"
When agentic tools make sense (and when they don’t)
Appropriate use cases
Boilerplate generation: “Create a Flask app skeleton”
Test generation: “Write unit tests for this module”
Documentation: “Generate docstrings for these functions”
Refactoring: “Rename this variable across all files”
Exploration: “Explain how this codebase handles authentication”
Inappropriate use cases
Production deployments: Too much can go wrong
Database migrations: Need human review
Security-critical code: Don’t trust without verification
Code with sensitive data: Risk of exposure
Unfamiliar codebases: You can’t verify what you don’t understand
Be ready to take over
Even with excellent tools, there comes a point where human expertise is needed.
Practitioner’s perspective: Know when to step in
“LLMs are no replacement for human intuition and experience. I’ve spent enough time with GitHub Actions that I know what kind of things to look for, and in this case it was faster for me to step in and finish the project rather than keep on trying to get there with prompts.”
— Simon Willison
His colophon project worked, but something wasn’t right: there were two deploy jobs running. Rather than continue prompting, he read the documentation and found a settings switch that needed changing.
The skill is knowing when:
More prompting won’t help
You need to understand the underlying system
Reading documentation is faster than iterating
The AI is confidently going in the wrong direction
The most effective approach combines AI speed with human oversight—not blind trust in either direction.
Exercises
Exercise Agent-1: Explore an agent (read-only)
If you have access to an agentic tool (Claude Code, Cursor, etc.):
Start it in read-only or plan-only mode
Give it a task: “Explain the structure of this project”
Observe:
What files does it try to read?
What is its reasoning process?
How does it present findings?
Do NOT allow it to make any changes yet
This helps you understand how the agent “thinks” before giving it write access.
Solution
Key observations to make:
The agent typically starts by reading project structure (ls, tree)
It then reads key files (README, main entry points, config)
It may make assumptions that are wrong about your project
Its explanation reveals its understanding (which may have gaps)
This exploration builds trust incrementally before allowing modifications.
Exercise Agent-2: Sandboxed experiment
Set up a sandboxed environment to safely experiment with agentic tools:
Create a test project directory with some simple Python files
Initialize a git repository
If using Docker:
docker run -it --network none -v $(pwd):/project python:3.11 bash
If not using Docker, at minimum:
Work in a dedicated directory
Keep version control active
Don’t give the agent access to sensitive directories
Try giving the agent a simple task and observe its behavior
Solution
The exercise teaches:
How to isolate experiments from important data
The importance of version control as a safety net
How agents behave when they have limited access
The difficulty of truly sandboxing modern tools (they often need network access to function)
Solution
This exercise builds security awareness. Key findings typically include:
Most agentic tools require network access to the AI provider
Few offer fully local operation
Audit logs are often incomplete
Credential handling varies widely
Documentation may not cover all data flows
Safeguards should include version control, network monitoring, and credential isolation.
Comparison: When to use each scenario
Factor |
Scenario I (Chat) |
Scenario II (IDE) |
Scenario III (Agent) |
|---|---|---|---|
User control |
Full |
Moderate |
Limited |
Speed |
Slow |
Fast |
Fastest |
Visibility |
Complete |
Partial |
Low |
Risk level |
Low |
Medium |
High |
Best for |
Learning, design |
Routine coding |
Boilerplate, refactoring |
Avoid for |
Large volumes |
Sensitive code |
Production systems |
Summary
Agentic coding tools offer the most automation but also the highest risk:
They can read files, execute commands, and modify your codebase
Risks include data exposure, malicious packages, and destructive actions
Sandboxing and permission controls provide some protection
Version control is your essential safety net
Always review all changes before accepting them
The question isn’t “Should I ever use agentic tools?” but rather “For which tasks, with what safeguards, and how much supervision?”
See also
Simon Willison: Claude Code example with transcript - Real-world agentic session
DataNorth: Claude Code Complete Guide - Comprehensive comparison with other tools
Keypoints
Agentic tools can read, write, and execute without per-action approval
Risks include file deletion, data exposure, and supply chain attacks
Use sandboxing (Docker, Bubblewrap) to limit agent access
Always use version control and review all changes
Match tool autonomy to task risk, use agents for low-stakes tasks