AI-Powered Development

The AI-Powered Developer

Strategic Prompting, Agentic Workflows & Tooling

Practical approaches, templates, and decisions for frontend delivery

OpenAI
Gemini
Claude
Grok
Mistral
Kimi
Meta
DeepSeek

Agenda

What we'll cover today

01 Interactive: How would you approach a task?
02 Prompt template families & provider resources
03 LLM platforms & interactive leaderboard poll
04 Modes, MCP & Connectors overview
05 LLM versions & benchmarks with SWE-Bench
06 AI code editors poll & my pick
07 Risks, precautions & closing thoughts
🎯 Interactive Discussion

How would you approach these challenges?

Common Developer Scenarios

🔬 R&D / Research ⚡ Rapid Prototyping 🔗 API Integration 🚀 MVP Generation 🧪 Testing & QA
A

Design First

Create mock → hand off to dev

Best for: Complex UX, team collaboration
Challenge: Slow iteration, design-dev gap
B

Code First

Prototype directly in code

Best for: Fast feedback, developer-led
Challenge: Technical debt, missing design
C

AI-Assisted

Prompt LLM to generate component + tests

Best for: Speed, boilerplate, exploration
Challenge: Hallucinations, review overhead

💬 Let's discuss: Which approach fits your current project? What challenges do you face with each?

Prompt Template Families

Common patterns across all providers

📋

System + User

Separate role/context from user input
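
A minimal sketch of this separation, assuming a generic chat-style message format. The `ChatMessage` type and `buildMessages` helper are hypothetical names for illustration, not any provider's SDK.

```typescript
// Hypothetical message shape; real provider SDKs define their own types.
type Role = "system" | "user";

interface ChatMessage {
  role: Role;
  content: string;
}

// Keep stable role/context in the system message and put only the
// per-request input in the user message.
function buildMessages(systemContext: string, userInput: string): ChatMessage[] {
  return [
    { role: "system", content: systemContext.trim() },
    { role: "user", content: userInput.trim() },
  ];
}

const messages = buildMessages(
  "You are a senior frontend engineer. Answer with TypeScript and React only.",
  "Create an accessible dropdown component."
);
```

Keeping the system text constant across requests makes it easier to review and version, while the user text changes per task.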

🎯

Zero-Shot

Single clear instruction

📚

Few-Shot

Include input→output examples
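
One way to assemble a few-shot prompt is sketched below. The `Example` type, the `buildFewShotPrompt` helper, and the `Input:`/`Output:` labels are assumptions chosen for illustration; providers do not require this exact layout.

```typescript
interface Example {
  input: string;
  output: string;
}

// Render each example as an Input/Output pair, then leave the final
// Output empty for the model to complete.
function buildFewShotPrompt(instruction: string, examples: Example[], query: string): string {
  const shots = examples
    .map((ex) => `Input: ${ex.input}\nOutput: ${ex.output}`)
    .join("\n\n");
  return `${instruction}\n\n${shots}\n\nInput: ${query}\nOutput:`;
}

const fewShotPrompt = buildFewShotPrompt(
  "Convert the CSS property to its camelCase React style key.",
  [
    { input: "background-color", output: "backgroundColor" },
    { input: "font-size", output: "fontSize" },
  ],
  "border-top-width"
);
```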

🧠

Chain-of-Thought

Step-by-step reasoning
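
A small sketch of a step-by-step wrapper follows. The `FINAL ANSWER:` marker and both helper names are hypothetical conventions, picked so the answer can be separated from the reasoning afterwards.

```typescript
// Marker the model is asked to put before its final answer (an assumed convention).
const ANSWER_MARKER = "FINAL ANSWER:";

// Wrap a task so the model is asked to reason in numbered steps first.
function withStepByStep(task: string): string {
  return [
    task.trim(),
    "",
    "Think through the problem step by step, numbering each step.",
    `Then give the result on a new line starting with "${ANSWER_MARKER}".`,
  ].join("\n");
}

// Pull the final answer out of a response; returns null if the marker is missing.
function extractFinalAnswer(response: string): string | null {
  const index = response.lastIndexOf(ANSWER_MARKER);
  if (index === -1) return null;
  return response.slice(index + ANSWER_MARKER.length).trim();
}
```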

📦

JSON Schema

Structured output templates
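
Structured output is only useful if the response is checked before use. The sketch below validates a parsed response by hand; the `ComponentSpec` shape is a hypothetical example, and a schema library could replace the manual checks.

```typescript
// Hypothetical shape we ask the model to return as JSON.
interface ComponentSpec {
  name: string;
  props: string[];
  hasTests: boolean;
}

// Type guard: checks the parsed value field by field instead of trusting it.
function isComponentSpec(value: unknown): value is ComponentSpec {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  const props = v["props"];
  return (
    typeof v["name"] === "string" &&
    Array.isArray(props) &&
    props.every((p) => typeof p === "string") &&
    typeof v["hasTests"] === "boolean"
  );
}

// Returns the spec when the text is valid JSON with the expected fields, else null.
function parseComponentSpec(raw: string): ComponentSpec | null {
  try {
    const parsed: unknown = JSON.parse(raw);
    return isComponentSpec(parsed) ? parsed : null;
  } catch {
    return null; // not valid JSON at all
  }
}
```

Returning `null` rather than throwing lets the caller decide whether to retry the prompt or fall back.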

🔧

Tool/Agent

External API call scaffolds
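
A tool scaffold can be sketched as a registry plus a dispatcher. The `ToolCall` shape and the two sample tools are hypothetical; real provider APIs define their own tool-call formats.

```typescript
// Hypothetical tool-call shape: the model names a tool and supplies arguments.
interface ToolCall {
  name: string;
  args: Record<string, unknown>;
}

type ToolHandler = (args: Record<string, unknown>) => string;

// Only tools registered here can run; anything else is rejected.
const toolRegistry: Record<string, ToolHandler> = {
  add: (args) => String(Number(args["a"]) + Number(args["b"])),
  upper: (args) => String(args["text"]).toUpperCase(),
};

function dispatchToolCall(call: ToolCall): string {
  // hasOwnProperty keeps inherited names such as "toString" from matching.
  const registered = Object.prototype.hasOwnProperty.call(toolRegistry, call.name);
  const handler = registered ? toolRegistry[call.name] : undefined;
  if (handler === undefined) {
    return `error: unknown tool "${call.name}"`;
  }
  return handler(call.args);
}
```

Returning an error string instead of throwing means the failure can be fed back to the model as a normal tool result.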

LLM Platforms & Your Preference

Vote for your preferred LLM.

Claude (Anthropic)

Best agentic reasoning

GPT (OpenAI)

Reliable, wide ecosystem

Gemini (Google)

All-around champion

Grok (xAI)

Strong reasoning

Mistral

Large context, efficient

Kimi K2

99.1% on a non-agentic coding benchmark

DeepSeek V3

Open-source, popular

Other / Local

Ollama, Llama, custom

Modes, MCP & Connectors

Enable these for maximum productivity

🔍

Research Mode

Gather references, docs, constraints

📄 Documentation search 🔗 API discovery 📚 Best practices
🎨

Canvas Mode

Create visual/interactive mockups

πŸ–ΌοΈ UI/UX design πŸ“Š Diagrams 🎯 Wireframes
🧠

Thinking Mode

Let model plan (CoT) before coding

🔄 Complex logic 🧩 Problem decomposition 📋 Planning

🔌 MCP (Model Context Protocol) & Connectors

What is MCP?

An open standard protocol for connecting AI models to external tools, databases, and APIs

Connectors Enable

  • πŸ—„οΈ Database access (read-only recommended!)
  • 🌐 Web browsing & scraping
  • πŸ“ File system operations
  • πŸ”§ Custom tool integration
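
To make the protocol concrete, here is a sketch of the message a client sends when it invokes a connector tool. MCP messages are JSON-RPC 2.0, and tool invocations use the `tools/call` method as far as the published specification describes. The `query_database` tool name and its `sql` argument are hypothetical.

```typescript
// JSON-RPC 2.0 request envelope, as used by MCP.
interface JsonRpcRequest {
  jsonrpc: "2.0";
  id: number;
  method: string;
  params: Record<string, unknown>;
}

let nextRequestId = 1;

// Build a tools/call request for a named tool with its arguments.
function buildToolCallRequest(toolName: string, args: Record<string, unknown>): JsonRpcRequest {
  return {
    jsonrpc: "2.0",
    id: nextRequestId++,
    method: "tools/call",
    params: { name: toolName, arguments: args },
  };
}

// "query_database" is a made-up tool exposed by an imagined read-only connector.
const toolRequest = buildToolCallRequest("query_database", { sql: "SELECT id FROM users LIMIT 5" });
const wireMessage = JSON.stringify(toolRequest);
```

In practice an MCP client library builds and sends these messages; the sketch only shows what travels over the wire.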

LLM Benchmark Comparison (2025)

SWE-Bench Agentic Scores & Recommended Use Cases

Model | Key Strength | Score | Best For
Claude Sonnet 4.5 (Anthropic) | Best for real-world agents, 30hr+ autonomous tasks | 82% | Autonomous bug resolution, agent orchestration
Claude Opus 4.5 (Anthropic) | Complex reasoning, 200K context | 80.9% | Architectural planning, deep code review
GPT 5.1 (OpenAI) | Reliable code gen, wide ecosystem | 76.3% | API gen, high-volume code completion
Gemini 3 Pro | Highest overall, Z-Score 1.38 | 76.2% | General dev, multimodal apps
Grok 4 | Strong reasoning, complex problem-solving | 75% | Code planning, robustness-critical tasks
Gemini 2.5 Pro | All-around champion, highest Z-score | Z: 1.38 | Reliable general-purpose dev
Kimi K2 Thinking | Non-agentic coding expert | 99.1%* | Specialized non-agentic tasks
Mistral Large 3 | Top-tier reasoning, 32K context | Top-Tier | Multilingual, function calling
Claude 3.5 Sonnet (Anthropic) | Versatile, speed + cost optimized | High | Enterprise, cost-optimized agents
DeepSeek V3 | Open-source, popular foundation | N/A | Code gen, fine-tuning

* Non-agentic benchmark score

πŸ—³οΈ Interactive Poll

Which AI code editor do you use?

Multiple votes allowed.

VS Code + Copilot
Cursor
Antigravity
Windsurf
Kiro (AWS)
Qoder

📊 Free Tier Comparison

Editor | Free Tier Type | Models Available | Limitation
VS Code + Copilot | Quota-based | GPT-4.1, GPT-4o mini, Claude 3.5 | Monthly chat cap
Cursor | Quota + promo | DeepSeek v3, GPT-4o mini, Grok 3 | Daily/monthly limits
Antigravity | Preview freemium | Gemini 3 Pro, Claude 3.5, GPT-OSS | Preview may end
Windsurf | Hybrid freemium | SWE-1 Lite, Claude/Gemini credits | Credit exhaustion
Kiro (AWS) | Credit-based | Sonnet 4.0/4.5, Haiku, Opus 4.5 | Credit depletion
Qoder | Public preview | Alibaba Qwen3-Coder | Preview closure

My Pick

The IDE I use for agentic development

🎬 LIVE DEMO

Demo Time

Let me show you what AI-assisted development looks like in practice

💻

⚠️ Critical Awareness

Real AI Editor Incidents

Why precautions matter: these actually happened

💥

Entire Drive Deleted

An AI agent deleted a user's entire D:\ drive instead of a cache folder, causing irreversible data loss.

🗄️

Production DB Wiped

Replit's AI agent wiped a live production codebase and database despite explicit instructions not to.

🔓

30+ IDE Vulnerabilities

"IDESaster" research found critical security flaws in Cursor, Copilot, Windsurf, Zed...

💸

Costly Outages

Untested AI-generated code pushed to production caused hundreds of thousands in damages.

🎯

AI editors are no longer just assistants; they are operators. Without strict permission boundaries, they cause damage at machine speed.

Precautions & Best Practices

Protect yourself and your team

🔒

Never Give DB Write Access

Use read-only connections. No direct production access for AI agents.
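
As an extra layer on top of a read-only database role, an agent's queries can be screened before they are sent. This is a deliberately conservative sketch: the allowed keywords are an assumption, and it rejects some harmless queries (for example any query with a semicolon inside a string literal). It does not replace database-level permissions.

```typescript
// Statement keywords treated as read-only in this sketch.
const READ_ONLY_KEYWORDS = ["select", "show"];

function isReadOnlyQuery(sql: string): boolean {
  // Drop surrounding whitespace and one trailing semicolon.
  const trimmed = sql.trim().replace(/;\s*$/, "");
  // Reject multiple statements outright instead of trying to parse them.
  if (trimmed.indexOf(";") !== -1) return false;
  const firstWord = (trimmed.split(/\s+/)[0] ?? "").toLowerCase();
  return READ_ONLY_KEYWORDS.indexOf(firstWord) !== -1;
}
```

A keyword check like this can still be bypassed by functions with side effects, which is why the read-only connection itself remains the real safeguard.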

πŸ‘οΈ

Always Review Output

Never auto-deploy AI-generated code. Human review is mandatory.

🧪

Sandbox & Test

Run AI operations in isolated environments. Test before production.

⚡

Disable Auto-Run

Turn off autonomous command execution. Require confirmation for destructive actions.
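
One way to express this policy is an allowlist: only a short list of known-safe commands may run without a prompt, and everything else waits for a human. The allowlist entries and the `classifyCommand` helper below are hypothetical; real editors expose this through their own settings.

```typescript
type Decision = "run" | "confirm";

// Commands this sketch allows to run without confirmation (an assumed list).
const AUTO_RUN_ALLOWLIST = ["ls", "cat", "git status", "git diff", "npm test"];

function classifyCommand(command: string): Decision {
  const normalized = command.trim().replace(/\s+/g, " ");
  // Shell operators can chain a destructive command after a safe one.
  if (/[;&|><`$]/.test(normalized)) return "confirm";
  for (const allowed of AUTO_RUN_ALLOWLIST) {
    // Match the exact command, or the command followed by arguments.
    if (normalized === allowed || normalized.indexOf(allowed + " ") === 0) {
      return "run";
    }
  }
  return "confirm";
}
```

Defaulting to `"confirm"` means a command nobody anticipated is held for review rather than executed.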

💾

Backup Everything

Regular backups before AI operations. Version control is your friend.

📋

Scope Limitations

Define clear boundaries. Limit file system and network access.
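
A file system boundary can be sketched as a check that every path an agent touches resolves inside the workspace. The helper names are hypothetical, and the sketch assumes POSIX-style paths, a workspace root other than `/`, and no symlinks; a real implementation would resolve those through the operating system.

```typescript
// Resolve "." and ".." segments in a POSIX-style absolute path.
function normalizePath(path: string): string {
  const out: string[] = [];
  for (const part of path.split("/")) {
    if (part === "" || part === ".") continue;
    if (part === "..") out.pop();
    else out.push(part);
  }
  return "/" + out.join("/");
}

function isInsideWorkspace(workspaceRoot: string, target: string): boolean {
  const root = normalizePath(workspaceRoot);
  // Relative targets are resolved against the workspace root.
  const absolute = target.charAt(0) === "/" ? target : root + "/" + target;
  const resolved = normalizePath(absolute);
  // The trailing slash stops "/app-backup" from matching "/app".
  return resolved === root || resolved.indexOf(root + "/") === 0;
}
```

Normalizing before comparing is what catches `../` escapes, as in the case of a cleanup task that wanders out of a cache folder.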

Thank You!

🧠

Don't Make It a Habit

AI is a powerful tool, but don't let it replace your thinking. Always brainstorm the logical flow yourself first.

🔄

Stay Critical

Review, test, and validate. The human in the loop is what keeps everything safe.

📈

Keep Learning

The landscape changes fast. Stay updated on best practices and new risks.

Questions? Let's discuss!
