Context Engineering at Work
Why shaping what the model sees is more important than the model itself
You build an agent. It can talk to tools. It responds to questions. You test it. It feels promising.
Then it breaks.
It forgets things. It hallucinates. It picks the wrong tool. It gives answers that don't make sense. Suddenly, it feels unreliable.
Most people assume the model is the problem. But in most cases it’s not the model; it’s the context.
What Is Context Engineering?
Large Language Models (LLMs) don’t think, and they don’t remember. They’re stateless. That means every time you call them, you have to give them everything they need again. That "everything" is context.
Context engineering is the process of carefully shaping that input: the task, the goal, the tool outputs, the history, the instructions, and the data all compacted and delivered in a way that the model can work with.
Prompting is giving a command.
Context engineering is giving a full briefing.
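Because the model is stateless, every call has to carry that full briefing. Here is a minimal sketch of what assembling it might look like; the function and field names are illustrative, not a prescribed schema:

```python
# Minimal sketch: packing everything a stateless model needs into one
# prompt string. Field names are illustrative, not a prescribed schema.

def build_context(goal: str, history: list[str], tool_outputs: list[str],
                  data: str, instructions: str) -> str:
    """Assemble the full briefing the model sees on this call."""
    sections = [
        f"Goal:\n{goal}",
        "Conversation so far:\n" + "\n".join(history),
        "Tool outputs:\n" + "\n".join(tool_outputs),
        f"Data:\n{data}",
        f"Instructions:\n{instructions}",
    ]
    return "\n\n".join(sections)

prompt = build_context(
    goal="Summarize the Q3 update for leadership",
    history=["User: How did Q3 go?"],
    tool_outputs=["CRM report: enterprise sales up, churn down"],
    data=('"We closed Q3 with 12% growth in revenue, mainly driven by strong '
          'performance in enterprise sales and a reduction in churn."'),
    instructions="Keep it under three sentences.",
)
```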
This becomes even more important when you move beyond single-turn tasks and start building workflows or products around LLMs.
You need precision.
You need consistency.
You need the model to “understand” what’s going on. It never actually does, but if the context is right, it behaves as if it does.
Prompt Engineering vs Context Engineering
Prompt engineering is just one part of the picture. It’s about wording. Context engineering is about what information reaches the model and how.
A Few Examples
These examples were tested on Google Gemini.
1. Zero-shot Prompt
This is the most basic version. You just give an instruction and one input. Useful for quick summaries, but doesn’t guide the model much.
Write a summary of this internal update:
"We closed Q3 with 12% growth in revenue, mainly driven by strong performance in enterprise sales and a reduction in churn."
What to expect: The model may restate the original sentence or shorten it without focusing on what matters most to a leader (e.g., what drove the growth).
What we saw: The model accurately captured the main facts and added a confident, upbeat tone ("We're excited to announce..."), even though that wasn’t in the original prompt.
Zero-shot prompting works well for factual rewriting, but models may add tone or framing on their own. Larger models like Gemini often produce fluent, polished outputs, but may still require context control for consistency in business settings.
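The article doesn’t show the calling code, but a zero-shot call like the one above might look like this with the google-generativeai Python SDK; the model name and key handling are assumptions:

```python
# Zero-shot call to Gemini, assuming the google-generativeai SDK.
# The model name is illustrative; use whichever Gemini model you have access to.
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")  # replace with a real key
model = genai.GenerativeModel("gemini-1.5-flash")

prompt = (
    "Write a summary of this internal update:\n"
    '"We closed Q3 with 12% growth in revenue, mainly driven by strong '
    'performance in enterprise sales and a reduction in churn."'
)
response = model.generate_content(prompt)
print(response.text)
```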
2. Few-shot Prompt
Now we add an example first. This helps the model learn the format or tone before doing the task.
Example:
Update: "The IT team completed the system upgrade with no major issues."
Summary: "System upgrade completed smoothly by IT."
Write a summary for the following update:
Update: "We closed Q3 with 12% growth in revenue, mainly driven by strong performance in enterprise sales and a reduction in churn."
What to expect: The model will match the structure of the example summary, which makes outputs more consistent. Use this if you’re building a reporting assistant.
What we saw: The summary is concise, well structured, and closely follows the tone and format of the example provided.
Few-shot prompting works effectively with larger models like Gemini. When the example is clear and relevant, the model adapts well to the desired style and structure, making it ideal for standardized business outputs.
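If you are templating this pattern, one way (names are illustrative) is to keep the example pairs in a list and prepend them to every request:

```python
# Few-shot prompt assembly: example pairs are prepended to the real task.
# Helper and variable names are illustrative.

EXAMPLES = [
    ("The IT team completed the system upgrade with no major issues.",
     "System upgrade completed smoothly by IT."),
]

def few_shot_prompt(update: str) -> str:
    shots = "\n\n".join(
        f'Update: "{u}"\nSummary: "{s}"' for u, s in EXAMPLES
    )
    return (f"{shots}\n\n"
            "Write a summary for the following update:\n"
            f'Update: "{update}"')

print(few_shot_prompt(
    "We closed Q3 with 12% growth in revenue, mainly driven by strong "
    "performance in enterprise sales and a reduction in churn."
))
```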
3. Chain-of-thought Prompt
We now guide the model to reason out its response before summarizing.
Write a summary for the following update:
"We closed Q3 with 12% growth in revenue, mainly driven by strong performance in enterprise sales and a reduction in churn."
Let’s think step-by-step:
1. What is the main achievement?
2. What contributed to it?
3. Why is it important?
What to expect: This helps extract a deeper explanation. It’s useful for junior analysts, or if you want the model to show its thought process for transparency.
What we saw: The model broke down the update step by step, identified the key achievement and drivers, and added a final sentence to highlight business significance.
Chain-of-thought prompting guides the model to reason through the context before answering. This produces richer summaries that surface implied meaning and relevance, which is ideal when clarity, impact, or rationale matters in communication.
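The guiding questions can live in one place so every summary request reasons through the same steps; a sketch, with an illustrative helper name:

```python
# Chain-of-thought prompt template; the guiding questions mirror the
# ones above, and the helper name is illustrative.

COT_STEPS = [
    "What is the main achievement?",
    "What contributed to it?",
    "Why is it important?",
]

def cot_prompt(update: str) -> str:
    steps = "\n".join(f"{i}. {q}" for i, q in enumerate(COT_STEPS, 1))
    return (f'Write a summary for the following update:\n"{update}"\n\n'
            f"Let's think step-by-step:\n{steps}")

print(cot_prompt(
    "We closed Q3 with 12% growth in revenue, mainly driven by strong "
    "performance in enterprise sales and a reduction in churn."
))
```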
4. Context-Engineered Prompt
This version simulates a real-world scenario with instructions and formatting guidance. It frames the task more fully.
Context:
You are writing a leadership summary for an internal company newsletter. This newsletter is read by 1,000 employees, who keenly await it for the latest company updates. Leaders are empathetic and always direct.
Here is the raw update:
"We closed Q3 with 12% growth in revenue, mainly driven by strong performance in enterprise sales and a reduction in churn."
Instruction:
Write the summary for the newsletter.
Use this format:
- Quarter:
- Key Outcome:
- What Drove It:
- What’s Next:
What to expect: This gives you the most useful output for a business setting. Structured, clear, and aligned with internal communication needs.
What we saw: The model followed the given structure, adapted to the audience tone, and produced a leadership-friendly summary with a clear forward-looking statement.
Context engineering combines instructions, tone, formatting, and task framing. This gives the model everything it needs to generate aligned, consistent, purpose-driven outputs, ideal for high-stakes communication like internal leadership updates.
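In code, the same framing can be captured as a reusable template so every update gets the same role, audience, and format; a sketch with illustrative field names:

```python
# Context-engineered prompt as a reusable template. Field names are
# illustrative; the sections mirror the prompt above.

TEMPLATE = """Context:
{context}

Here is the raw update:
"{update}"

Instruction:
{instruction}
Use this format:
{format_spec}"""

prompt = TEMPLATE.format(
    context=("You are writing a leadership summary for an internal company "
             "newsletter read by 1,000 employees. Leaders are empathetic "
             "and always direct."),
    update=("We closed Q3 with 12% growth in revenue, mainly driven by "
            "strong performance in enterprise sales and a reduction in churn."),
    instruction="Write the summary for the newsletter.",
    format_spec="- Quarter:\n- Key Outcome:\n- What Drove It:\n- What's Next:",
)
```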
Observations Across Models
We also tested these prompts with local models, such as Gemma 3, Microsoft Phi-4-reasoning, and DeepSeek R1, via LM Studio. While these models could follow basic instructions, they struggled to maintain tone, structure, and reasoning depth for the few-shot and context-engineered prompts.
The quality of your results depends heavily on the model you're using.
Smaller or instruction-tuned models may work for simple tasks, but for enterprise-grade use cases involving workflows, summaries, or tool use, larger and more capable models from OpenAI, Google, and Anthropic generally produce more coherent, reliable outputs.
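For reference, LM Studio serves local models through an OpenAI-compatible endpoint, so the same prompts can be replayed against them with the openai SDK. The port below is LM Studio’s usual default, and the model identifier is whatever you have loaded:

```python
# Replaying a prompt against a local model served by LM Studio, which
# exposes an OpenAI-compatible API. The port is LM Studio's default and
# the model name is illustrative; the API key can be any placeholder.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:1234/v1", api_key="lm-studio")

prompt = (
    "Write a summary of this internal update:\n"
    '"We closed Q3 with 12% growth in revenue, mainly driven by strong '
    'performance in enterprise sales and a reduction in churn."'
)
response = client.chat.completions.create(
    model="gemma-3",  # use the identifier shown in LM Studio
    messages=[{"role": "user", "content": prompt}],
)
print(response.choices[0].message.content)
```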
It’s Not Prompting, It’s Framing
You don’t need to fine-tune a model. You don’t need to build your own LLM.
You don’t need 50-shot prompting. But you do need to control what the model sees.
That’s context engineering. It’s the difference between throwing a question over the wall and briefing a smart assistant who’s ready to help.
It’s not a trick. It’s a habit.
When you start shaping what the model sees (your goal, your past steps, your tools, your constraints), it starts behaving much closer to how you’d expect.
The better the view you give it, the better the answers get.
That’s not magic. That’s context.