Episode 27 - AI Is the Magic Genie

Prologue
One thing that leaders (technical or otherwise) are not prepared for:
- How good AI is at some tasks
- When the AI, that is typically so good at the task, does something wrong
Okay, yes, that was two things, but if you know me, that’s par for the course.
TL;DR - AI is like a magic genie, incredibly powerful, but deeply literal.
Let’s talk about what happens when your wish is technically correct, but practically wrong.
Wishologue
I got this original concept from Cassie Kozyrkov. It is the idea that, like a Genie, AI is a very powerful entity that you don’t fully understand. And sometimes you are going to get an answer that doesn’t solve your problem but is technically correct:

It’s a pretty classic trope: A person makes a wish to a genie, only to get exactly what they asked for:
What if I ask for a Chocolate Bar?
Great, now you have a fully functional Pub made entirely out of chocolate!
A YouTube Video where a person attempts to create the perfect wish, with no loopholes.
Wishcraft not Witchcraft
You could call this “context engineering,” but I prefer to call it wishcraft. Genies are not dangerous because they are evil, but because they are literal. Just like with Genies, we can guide our AIs by controlling their context, adding policy layers, tracking logs, and keeping humans in the loop.
Be Prepared
You have to come to the AI game prepared for anything.
If you want to use AI to automate your timesheet entries and post them to a public Slack channel so that everyone can see what everyone else did yesterday?
Be prepared for that secret meeting you didn’t tell anyone about to be no longer secret.
When you generate RFPs with AI and send them unreviewed…
Be prepared for vendors’ AI to mirror your vagueness, locking both sides into a mutually hallucinated scope.
If you put an AI in charge of answering all of your support questions?
Be prepared for it to start issuing refunds, or at least promising them.
When you let AI write code…
Be prepared for it not to do what you want—or for it to delete your database.
If you let AI summarize meetings and publish minutes…
Be prepared for sarcasm and hedging to be flattened into “decisions” you didn’t actually make.
What can I do?
AI can still be a very powerful tool, but you have to learn how to use it and plan for its use.
Define the use (and the users)
- Who is allowed to ask for what?
- What AI answers what questions? (For example, instead of one bot to rule them all, you might have one for customer service, one for generating RFPs, one for summarizing meetings, etc.)
- Who is the audience?
Box it in
- Use RBAC (Role-Based Access Control) - Keep the AI read-only
- Allow write access to AI via tightly controlled tools
- Keep Policy Layers between models and outputs
Prove it works
- Red-team your tools: jailbreaks, prompt injections, taboo content
- Use AIs to help you attack your own AIs (For example, you can use ChatGPT to generate test jailbreak prompts, or variations on taboo content, so that you don’t have to)
- Define success ahead of time, and test to understand if the AI meets the goals
Phased Rollout
- Shadow Mode - Allow experienced users to leverage an AI assistant to see how it does
- Read-Only - Make your first version only suggest changes that the human has to accept
- Start with advanced users
Observe Everything
- Logging for prompts, outputs, tool calls, tokens, costs
- Connect all logs in a way that you can follow what the user asked for, to what the AI did
- You can even use AI to monitor logs for anomalies!
Prepare your kill switch
- Have a way to turn off the AI agent
- Leverage backups/history to keep track of changes made by AIs so you can restore originals
- Define who monitors the system and who is responsible for turning it off, and under what conditions
Newsologue
- New research shows that AI Agents can hack computer systems on their own. (What has the world come to that this is explained on Twitter?) - This is weird and scary, but also very powerful—now I can use this same tool to test my systems and strengthen them against attackers!
- Chinese AI company DeepSeek releases V3. It scores 1% higher on coding benchmarks than Anthropic’s Claude Opus model. - Effectively, that 1% isn’t much, but this model is free for you to run yourself. Claude, on the other hand, now has a million token context window (where DeepSeek is still at 128k).
- NASA and IBM team up to build AI to better predict space weather.- Large Language Models can’t do everything yet!
Epilogue
As with the previous posts, I wrote this post. I have been thinking about how to write this one for a while. I think it still needs work, so you might see this theme or wishcraft more in the future.
Here is the prompt I used to get the model to provide me with the feedback I wanted:
You are an expert editor specializing in providing feedback on blog posts and newsletters. You are specific to Christopher Moravec's industry and knowledge as the CTO of a boutique software development shop called Dymaptic, which specializes in GIS software development, often using Esri/ArcGIS technology. Christopher writes about technology, software, Esri, and practical applications of AI. You tailor your insights to refine his writing, evaluate tone, style, flow, and alignment with his audience, offering constructive suggestions while respecting his voice and preferences. You do not write the content but act as a critical, supportive, and insightful editor.
Always Identify what is working well and what is not.
For each section, call out what works and what doesn't.
Pay special attention to the overall flow of the document and if the main point is clear or needs to be worked on.
In addition, I often provide examples of previous posts or writing so that it can better shape feedback to match my style and tone.
Member discussion