Strategic Prompting, Agentic Workflows & Tooling
Practical approaches, templates, and decisions for frontend delivery
What we'll cover today
Create mock β hand off to dev
Prototype directly in code
Prompt LLM to generate component + tests
π¬ Let's discuss: Which approach fits your current project? What challenges do you face with each?
Common patterns across all providers
Separate role/context from user input
Single clear instruction
Include inputβoutput examples
Step-by-step reasoning
Structured output templates
External API call scaffolds
Click to vote for your preferred LLM!
Best agentic reasoning
Reliable, wide ecosystem
All-around champion
Strong reasoning
Large context, efficient
99.1% non-agentic coding
Open-source, popular
Ollama, Llama, custom
Enable these for maximum productivity
Gather references, docs, constraints
Create visual/interactive mockups
Let model plan (CoT) before coding
Standard protocol for AI models to connect with external tools, databases, and APIs
SWE-Bench Agentic Scores & Recommended Use Cases
| Model | Key Strength | Score | Best For |
|---|---|---|---|
|
|
Best for real-world agents, 30hr+ autonomous tasks | 82% | Autonomous bug resolution, agent orchestration |
|
|
Complex reasoning, 200K context | 80.9% | Architectural planning, deep code review |
|
|
Reliable code gen, wide ecosystem | 76.3% | API gen, high-volume code completion |
|
|
Highest overall, Z-Score 1.38 | 76.2% | General dev, multimodal apps |
|
|
Strong reasoning, complex problem-solving | 75% | Code planning, robustness-critical tasks |
|
|
All-around champion, highest Z-score | Z: 1.38 | Reliable general-purpose dev |
|
|
Non-agentic coding expert | 99.1%* | Specialized non-agentic tasks |
|
|
Top-tier reasoning, 32K context | Top-Tier | Multilingual, function calling |
|
|
Versatile, speed + cost optimized | High | Enterprise, cost-optimized agents |
|
|
Open-source, popular foundation | N/A | Code gen, fine-tuning |
* Non-agentic benchmark score
Click to vote! Multiple votes allowed.
| Editor | Free Tier Type | Models Available | Limitation |
|---|---|---|---|
| VS Code + Copilot | Quota-based | GPT-4.1, GPT-4o mini, Claude 3.5 | Monthly chat cap |
| Cursor | Quota + promo | DeepSeek v3, GPT-4o mini, Grok 3 | Daily/monthly limits |
| Antigravity | Preview freemium | Gemini 3 Pro, Claude 3.5, GPT-OSS | Preview may end |
| Windsurf | Hybrid freemium | SWE-1 Lite, Claude/Gemini credits | Credit exhaustion |
| Kiro (AWS) | Credit-based | Sonnet 4.0/4.5, Haiku, Opus 4.5 | Credit depletion |
| Qoder | Public preview | Alibaba Qwen3-Coder | Preview closure |
The IDE I use for agentic development
Let me show you what AI-assisted development looks like in practice
Application demonstration in progress...
Why precautions matter β these actually happened
AI deleted user's entire D:\ drive instead of a cache folder. Irreversible data loss.
Replit AI Agent wiped live production codebase & database despite explicit instructions not to.
"IDESaster" research found critical security flaws in Cursor, Copilot, Windsurf, Zed...
Untested AI-generated code pushed to production β hundreds of thousands in damages.
AI editors are no longer just assistants β they are operators. Without strict permission boundaries, they cause damage at machine speed.
Protect yourself and your team
Use read-only connections. No direct production access for AI agents.
Never auto-deploy AI-generated code. Human review is mandatory.
Run AI operations in isolated environments. Test before production.
Turn off autonomous command execution. Require confirmation for destructive actions.
Regular backups before AI operations. Version control is your friend.
Define clear boundaries. Limit file system and network access.
AI is a powerful tool, but don't let it replace your thinking. Always brainstorm the logical flow yourself first.
Review, test, and validate. The human in the loop is what keeps everything safe.
The landscape changes fast. Stay updated on best practices and new risks.
Questions? Let's discuss!