16 May 2025 7 min read Almost Entirely Human

Episode 13 - How Many Calories are in this Cat?

An AI generated image of judging the calories of a cat (my cat is not orange, but this was the best image ChatGPT created)

Prologue

Last week, I got AI to help me cook dinner. This week, I'm getting it to judge my dinner.

Let's talk about counting calories. You've probably done it before—maybe to lose weight, gain muscle, or just figure out why your pants "shrunk" (spoiler: they didn't). When you're staring down a burrito bowl, trying to guess whether it's 500 or 1,500 calories, you know how absurd this guessing game can be.

Sure, there are apps. You can search and endlessly scroll through databases for something vaguely resembling your plate. But ultimately, it's still just guessing.

What if we could make these estimates better—faster, more consistent, and maybe with a little less self-judgment? Let's ask AI.

Countologe

This all started a few months ago while counting calories with some friends. I hated even the idea of using the apps. They were clunky, slow, and somehow made me feel worse about every snack. I knew that I wasn’t going to be able to stick with it if I didn’t reduce the friction!

So, naturally, I wondered: Could I find a way to make this easier with AI?

Turns out, yes. Let me show you how.

💡

It is these novel uses for AI that really excite me about what we can do with these tools!

Building the Prompt

So, how do we get AI to guess calories from a photo? As usual, building a prompt is part formula and part Jedi Mind Trick.

Here’s how I built mine:

🧠 Step 1: Teach It to “Think”

At first, I just asked Claude to look at the image and tell me how many calories were in it. The results were… uninspired. Like, “this might be a sandwich” level of helpful.

So I gave it instructions to think—literally.

Before you answer, always think. Enclose your thinking in <thinking> tags. Then provide your final answer in <output> tags.

Some models (Claude, in particular) respond really well when you explicitly tell them to think things through step-by-step. It’s like a mini chain-of-thought prompt. The XML tags don’t matter to the model, but they make it easy for me to parse the response programmatically later on.

Newer models like Claude 3.7 or GPT o3 have thinking built-in to their processes, but I still like this mini chain-of-thought prompt for simple tasks and it works on the cheaper models too!

💡

Asking the model to output in XML tags is a quick/easy way to parse the results with some code. In my case, I was using some JavaScript to automate this process. It’s a good place to use a Disposable App!

🥘 Step 2: Tell It What It’s Getting

Next, I told the model what to expect: an image of food.

Analyze this image and tell me the calorie content of the food shown. The image will always be of food. Try really hard to estimate the calorie content.

That last sentence might sound silly—“try really hard”?—but it helps. AI models are weirdly literal, and sometimes you have to cheerlead them into effort.

🤣

Side Quest: I once forgot to give it a food image. It tried to estimate the calories in a photo of my cat. 0/10 for accuracy, 10/10 for entertainment.

If I were turning this into a product, I’d add guardrails like, “If the image doesn’t contain food, say so politely.” But for this experiment, I just assumed good input.

🚨 Step 3: Add a Fallback

Sometimes the model doesn’t know, but it’ll still try to guess—confidently, and wrong. That’s bad for calorie tracking and code parsing.

So I added:

If you do not know, return -1.

That way, I know to skip or retry the request if the model gets confused.

🔢 Step 4: Define the Output

To avoid messy responses like “I think this is 750 calories, give or take,” I pinned it down:

In the output tag, include only the number. Nothing else.

If you’ve ever tried to parse model responses in code, you know why this matters. The cleaner the output, the fewer headaches later.

🧩 Step 5: Account for a Grid of Meals

I’m human (almost entirely). Sometimes, I forget to track meals in real time, but I take pictures. When that happens, I group multiple food photos into a grid and send that in instead.

To handle that, I updated the prompt with:

The image may be a grid of several images of food for the day. In this case, provide the total calories for all food items present.

That one line makes the model way more versatile.

🧮 Averaging Answers

Sometimes the model would confidently say “550” calories, then turn around and claim “900” for the same image. That’s the thing about AI: it always wants to answer, even when it’s not sure.

So here’s the fix:

Ask three times. Average the results.

💡

This is like The Wisdom of Crowds—the average of 17,000 people guessing the weight of a cow is much better than any single guess. We can leverage this with AI, too!

Same prompt, same image, but run in three separate threads (use a temporary chat so that previous answers don’t pollute the AI’s process). The results are usually close. If they’re not, that’s your signal the model’s confused—maybe the food’s ambiguous, or the photo is blurry.

🔑

Ask the model three (or more) separate times to get a better set of results that you can average together. If you want even more accuracy, a few words about the food can go a long way!

This isn’t just a calorie hack—it’s a general AI pattern:

When in doubt, ask more than once. Compare, then trust the consensus.

Bonus trick: give it a hint like “vegetable stew” alongside the image. Even one or two words can dramatically improve accuracy.

🤔 Can We Tell When AI Is Wrong?

If we are talking about numbers, averaging works most of the time. But what if it doesn’t? Or what if the response is more complex, a description of an image instead of just a calorie estimate?

It turns out that you can ask AI to judge itself.

All you need to do is run a second prompt asking it to evaluate all three of your previous answers, something like: “Which of these seems the most accurate, and why? Provide three pieces of evidence.”

It is kind of like asking AI to be a panel of judges (although sometimes it’s more like a panel of overconfident interns 😜).

You can even use this method to compare outputs from different models.

I'll break that down in another post, but for now, just remember:

AI can help check AI. You just have to tell it how.

💡

Use AI to verify the output of itself or other AI models. This can help reduce errors and even reduce bias in the outputs.

✅ Recap: The Prompt Ingredients

Here’s how the final prompt shaped up:

Before you answer, always think. Enclosing your thinking in XML <thinking> tags. Then provide your final answer in <output> tags. Analyze this image and tell me the calorie content of the food shown. The image will always be of food. Try really hard to estimate the calorie content. If you do not know, return -1. The image may be a grid of several images of food for the day, in this case provide the total calories for all food items present. In the output tag only include the number, nothing else.

That prompt might seem like overkill, but a good prompt makes the difference between a shaky calorie guesser and a helpful tool. Going through this thought process also helps build your intuition for consistently getting useful results from the models, not just using them as glorified search engines.

Newsologue

Google released AlphaEvolve this week. This is an AI that they used to solve some interesting math problems, which on the surface don’t seem particularly useful. I think what’s interesting here is:

It’s another use of an LLM. Again, we are using an LLM to do work that we didn’t think it could do (like moving robot arms)
It isn’t just an LLM; it is a system where the LLM writes code to solve a problem, the solution is judged, and the LLM works to improve the solution. This framework is a fascinating use of AI, and I think it will be replicated to solve different types of problems.

If you want a fun, very light-on-math video about some of the discoveries made, check out Stand Up Maths.

Epilogue

For this post, I leaned on my AI editor a bit more than normal, having it help me break up my prompt description text into a step by step guide, and also giving me commentary on AI judging itself. It didn't make up any of the content, just help me organize it; the ideas all came from my brain and experience.

I used the same feedback prompt as before to make edits and generally clean up the post before I had a couple of humans read it and give me feedback.

Here is the prompt I used to get the model to provide me with the feedback I wanted:

You are an expert editor specializing in providing feedback on blog posts and newsletters. You are specific to Christopher Moravec's industry and knowledge as the CTO of a boutique software development shop called Dymaptic, which specializes in GIS software development, often using Esri/ArcGIS technology. Christopher writes about technology, software, Esri, and practical applications of AI. You tailor your insights to refine his writing, evaluate tone, style, flow, and alignment with his audience, offering constructive suggestions while respecting his voice and preferences. You do not write the content but act as a critical, supportive, and insightful editor.

In addition, I often provide examples of previous posts or writing so that it can better shape feedback to match my style and tone.