GitHub Copilot Context Injection Mechanism Explained [Diagrams]¶
Target audience:
Developers using GitHub Copilot or Claude Code instruction settings who want to understand the technical rationale behind "why you should set it this way."
Key Takeaways¶
- According to VS Code team implementation documentation, Copilot's custom instructions are added to the chat context (system prompt side). Claude Code's CLAUDE.md is injected as a User Message per official spec. This difference decisively determines how persistent rules are.
- The main reason to keep Copilot instructions concise is context consumption per request — but "context limits" (instructions being overlooked when too long) is a separate issue that also exists.
- Understanding how both tools work gives you clear criteria for deciding what to write where.
Background: "Why Keep It Concise?" — A Rationale That Got Lost¶
Best practices for GitHub Copilot custom instructions widely say "keep the file concise" and "use references to external documents." However, in-depth explanations of the technical mechanism behind why conciseness is necessary are rarely found.
Rather than "because the official docs recommend it," understanding how LLM prompt injection works fundamentally changes how you approach instruction design.
This article compares the internal mechanisms of GitHub Copilot and Claude Code to explain the technical rationale for "what to write where."
Related article
For specific setup steps and best practices for custom instructions, see GitHub Copilot Custom Instructions Complete Guide.
GitHub Copilot's Internal Prompt Structure¶
The Full Prompt Sent to the LLM¶
When GitHub Copilot (VS Code) sends a request to the LLM, the prompt is structured around System Prompt and User Message areas. According to a VS Code team member's implementation documentation1, custom instructions are added to the system prompt side — though the official VS Code docs use the term "added to the chat context" without specifying the exact internal area2.
graph TB
subgraph SP["🔒 System Prompt / Chat Context (Foundation for model behavior)"]
direction TB
A["Core Identity & Global Rules<br><i>e.g., You are an expert AI programming assistant...</i>"]
B["General Instructions<br><i>Changes dynamically per model</i>"]
C["Tool Use Instructions<br><i>Guidelines for terminal execution, file editing</i>"]
D["Output Format Instructions<br><i>Rules for response formatting</i>"]
E["📄 .instructions.md content<br><i>Added via applyTo glob / semantic matching</i>"]:::inst
F["📄 copilot-instructions.md content<br><i>Added to chat context (personal > repo > org priority)</i>"]:::inst
G["📄 Custom Agent instructions<br><i>Appended only when in use</i>"]:::inst
A --> B --> C --> D --> E --> F --> G
end
subgraph UM["💬 User Message (Part of the conversation)"]
direction TB
H["Prompt Files content<br><i>Only when explicitly referenced</i>"]
I["Environment / Workspace Info<br><i>OS info, directory structure</i>"]
J["Editor Context<br><i>Open files, selected text</i>"]
K["The user's actual prompt"]
H --> I --> J --> K
end
SP --> UM
classDef inst fill:#0288d1,color:#ffffff,stroke:#01579b,stroke-width:2pxAbout the source
The structure above is based on VS Code team member Burke Holland's implementation documentation1. The official VS Code docs describe this as "combined and added to the chat context" — the exact internal area and ordering are not guaranteed as specification2.
What Instruction Injection Means¶
Instructions are included in the chat context on each request. Per VS Code team documentation, they are added to the system prompt side. The key characteristics are:
Persistence: Unlike the "position-based attenuation" seen in Claude Code's CLAUDE.md (which becomes an older message as turns progress), Copilot's instructions are re-assembled fresh in the context at each request. The weight of rules doesn't diminish over session turns in principle.
Cost: Instruction tokens are consumed on every request. 100 lines of instructions reduce the context window available for code generation and responses by that many tokens. If instructions are very long, some may be overlooked (context limits3).
In other words, the main reason to "keep Copilot instructions concise" is per-request context consumption, not positional attenuation over turns. However, the "context limits" problem (instructions being overlooked when too long) exists separately and is worth noting.
When Path-Specific Instructions Apply¶
When you specify applyTo in the YAML Frontmatter of a .instructions.md file, those instructions are added to the chat context conditionally2. There are two evaluation paths:
- applyTo glob match: The open file's path matches the
applyTopattern - Semantic matching: The agent determines relevance from the task content
sequenceDiagram
participant Dev as Developer
participant VSCode as VS Code
participant LLM as LLM
Dev->>VSCode: Opens src/api/users.ts and asks in Chat
VSCode->>VSCode: Evaluates via applyTo glob / semantic matching
VSCode->>LLM: Adds backend.instructions.md<br>to chat context
Note over LLM: ※Re-evaluated per request.<br>This request's context:<br>copilot-instructions.md +<br>backend.instructions.md
Dev->>VSCode: Opens src/components/Button.tsx and asks in Chat
VSCode->>VSCode: Evaluates via applyTo glob / semantic matching
VSCode->>LLM: Adds frontend.instructions.md<br>to chat context
Note over LLM: ※Re-evaluated per request.<br>This request's context:<br>copilot-instructions.md +<br>frontend.instructions.mdThis mechanism enables adding only the rules needed, only when needed. There is no need to cram all rules into copilot-instructions.md. Note that path-specific instructions from a previous request are not automatically carried over — context is re-assembled fresh on each request.
Copilot Code Review Constraints¶
In Copilot Code Review, only the first 4,000 characters of custom instruction files are read3. This limitation does not apply to Copilot Chat or Coding Agent, but for review purposes, prioritizing what appears at the top of your instructions becomes especially important. At the recommended size target for copilot-instructions.md (≈ 50 lines ≈ 1,500–2,000 characters), the entire file falls well within the Code Review limit.
Three Limitations the Official Docs Acknowledge¶
GitHub's official documentation explicitly states the following limitations of custom instructions3.
Non-deterministic behavior: Copilot may not always perfectly follow every instruction on every request. This stems from the probabilistic nature of LLMs.
Context limits: Very long instruction files may result in some instructions being ignored.
Importance of specificity: Clear, specific instructions are more effective than vague ones.
Agent Skills On-Demand Injection: A Third Mechanism¶
Agent Skills have a third injection mechanism distinct from Custom Instructions and applyTo.
According to the VS Code official documentation, Skills operate with 3-level Progressive Disclosure7.
| Level | What gets loaded | When |
|---|---|---|
| Level 1 | name and description only | Always (used for skill discovery/relevance matching; how it's included in input is implementation-dependent7) |
| Level 2 | SKILL.md body (instructions, guidelines) | When the agent determines relevance, or when the user explicitly invokes it via / |
| Level 3 | Bundled resources (scripts/, references/, templates/) | When the Skill's instructions reference them |
sequenceDiagram
participant User as User
participant Agent as Copilot Agent
participant LLM as LLM
Note over Agent,LLM: Level 1: metadata for skill<br>discovery/relevance matching<br>(how it's included is implementation-dependent)
Agent->>LLM: name + description (discovery metadata)
User->>Agent: "Review this code"
Agent->>Agent: Determines relevant Skill from description
Agent->>LLM: Level 2: Inject SKILL.md body
Note over LLM: Only relevant Skill instructions added
LLM->>Agent: Resource reference needed within Skill
Agent->>LLM: Level 3: Inject scripts/ referenced filesThis mechanism allows injecting detailed instructions on demand. However, the tradeoff is that Level 2+ injection depends on the agent's judgment, meaning there is no guarantee it will always be loaded.
Registering large numbers of Skills can hit visibility limits in practice8.
Claude Code (CLAUDE.md) Internal Mechanism¶
Injected as a User Message¶
In Claude Code, the content of CLAUDE.md is not injected into the System Prompt — it is injected as the first User Message at the start of the session4. This is the structural contrast to Copilot, which holds instructions in the chat context (system prompt side).
CLAUDE.md adds the contents as a user message following Claude Code's default system prompt.4
graph TB
subgraph SP2["🔒 System Prompt"]
direction TB
A2["Claude Code's default System Prompt"]
end
subgraph UM2["💬 User Messages (chronological order)"]
direction TB
B2["📄 CLAUDE.md content<br><i>Injected at session start</i>"]:::claude
C2["User's prompt #1"]
D2["Claude's response #1"]
E2["...(CLAUDE.md gets older as conversation continues)"]
F2["User's prompt #N"]
B2 --> C2 --> D2 --> E2 --> F2
end
SP2 --> UM2
classDef claude fill:#e65100,color:#ffffff,stroke:#bf360c,stroke-width:2pxBecause CLAUDE.md is injected as a User Message, its influence on rules tends to weaken later in a session (lost-in-the-middle effect6). This is the key difference from Copilot — the comparison table below shows the full picture.
Comparison: The Full Picture of Five Injection Mechanisms¶
Placing the mechanisms of all three tools side by side reveals the complete tradeoff structure of "context consumption vs. rule persistence."
| Mechanism | Added to | Context consumption | Rule persistence | When loaded |
|---|---|---|---|---|
copilot-instructions.md | chat context (system prompt side1) | Always consumed (every request) | High | Always |
.instructions.md + applyTo | chat context (system prompt side1) | Conditional | High | applyTo glob match or semantic matching |
| Agent Skills | chat context (L1: discovery metadata / L2+: on-demand) | Minimal (L1 always, placement implementation-dependent) | Medium (reliability concern) | Agent judgment or / invocation |
CLAUDE.md | User Message (per official spec4) | Always consumed | Low–Medium (attenuates, but re-injected after /compact5) | Session start (re-injected after /compact) |
.claude/rules/ | Context (injection format unspecified5) | Conditional | Medium | First interaction with matching file (lazy load) |
graph LR
subgraph Copilot["GitHub Copilot"]
direction TB
CopSP["System Prompt / Chat Context<br>─────────<br>copilot-instructions.md<br>.instructions.md<br>Skills (L1: discovery metadata / L2+: on-demand)"]:::sp
CopUM["User Messages<br>─────────<br>Prompt Files<br>User's questions"]:::um
CopSP --- CopUM
end
subgraph ClaudeCode["Claude Code"]
direction TB
CCSP["System Prompt<br>─────────<br>Default only<br><i>(can append with<br>--append-system-prompt)</i>"]:::sp
CCUM["User Messages<br>─────────<br>CLAUDE.md<br>.claude/rules/<br>User's questions"]:::um
CCSP --- CCUM
end
classDef sp fill:#2e7d32,color:#ffffff,stroke:#1b5e20,stroke-width:2px
classDef um fill:#b71c1c,color:#ffffff,stroke:#880e4f,stroke-width:2pxImpact on Design Decisions¶
From this structural difference, the following design decisions follow.
For Copilot: Since instructions are included in the chat context on every request (system prompt side per VS Code team explanation), putting "rules to enforce throughout the entire session" in copilot-instructions.md is rational. However, to minimize token consumption, keep content concise and handle details with the external document reference pattern. Separating frequently-invoked specialized instructions (code generation templates, debugging procedures, etc.) into Skills significantly improves context efficiency.
For Claude Code: Since CLAUDE.md is a User Message (per official spec4), its persistence characteristics differ from Copilot's chat context injection. Claude Code design best practices are covered separately in detail.
Practice: Instruction Design Based on Mechanism¶
Recommended Structure for Copilot¶
.github/
├── copilot-instructions.md # Always applied (resident in System Prompt)
│ → Tech stack, naming conventions, required rules
│ → Keep concise (aim for 50 lines or fewer)
│ → Handle details with references to docs/
│
└── instructions/
├── frontend.instructions.md # applyTo for frontend only
├── backend.instructions.md # applyTo for backend only
└── testing.instructions.md # applyTo for tests only
→ Injected into System Prompt only when needed
→ Conserves space in copilot-instructions.md
Design principle: Be mindful of per-request context consumption costs; use "is this needed in every context?" as your decision criterion. Separate rules needed only for specific paths into .instructions.md to avoid unnecessary token consumption.
Summary¶
The optimal design for instructions only becomes clear once you understand the internal mechanism.
The essence of "keep it concise" is per-request context consumption for Copilot (added to the chat context, system prompt side per VS Code team documentation), and positional attenuation as a User Message for Claude Code (per official spec). The same conclusion (keep it concise) but different rationales. Note that for Claude Code, /compact re-injects CLAUDE.md fresh, so the attenuation effect is primarily a concern in very long, uncompacted sessions.
Common to both tools is the principle of adding the right rules at the right time in the right place. The applyTo / semantic matching in Copilot and the paths in .claude/rules/ are precisely the features that realize this principle.
Next steps
For specific setup steps and templates for custom instructions, see the following articles.
🔗 Related Articles¶
- GitHub Copilot Custom Instructions Complete Guide Practical guide for setup and best practices
- applyTo Pattern Guide Managing conditional instructions by file path
- CLAUDE.md Design Guide Instruction design for Claude Code
Burke Holland (VS Code team), "Prompt Files vs Custom Instructions vs Custom Agents" — Implementation documentation by a VS Code team member. Explains that Custom Instructions are "appended to the system prompt" and "always last." Note: this is an implementation explanation, not an official specification document. ↩↩↩↩
VS Code official documentation, "Use custom instructions in VS Code" — Official docs use "combined and added to the chat context." Specifies instruction priority (personal > repository > organization) and that application is determined by applyTo glob matching + semantic matching. File concatenation order is not guaranteed. ↩↩↩
GitHub Docs, "Using custom instructions to unlock the power of Copilot code review" — Explicitly states: 4,000-character limit for Code Review, non-deterministic behavior (Copilot may not always follow all instructions), and context limits (very long files may result in some instructions being ignored). ↩↩↩
Claude Code official documentation, "Output styles" — Explicitly states: "CLAUDE.md adds the contents as a user message following Claude Code's default system prompt." The
--append-system-promptoption appends to the system prompt instead. ↩↩↩↩Claude Code official documentation, "Memory management" — Explicitly states: "Claude treats them as context, not enforced configuration." Rules in
.claude/rules/are "loaded into context" when matching files are worked on. The exact injection format (User Message vs. otherwise) is not specified. ↩↩CureApp Tech Blog (7tsuno), "What Should You Actually Write in CLAUDE.md" — Secondary commentary on the Session Start Hook framing and weakening influence over time. Referenced as supplementary material. ↩
VS Code official documentation, "Use agent skills in VS Code" — The 3-level Progressive Disclosure mechanism for Skills (name/description → SKILL.md body → resources). ↩↩
GitHub copilot-cli Issue #1130, "Skills token limit impact on context window" — A reported case where only 31 of 49 Skills were visible in Copilot CLI. This is a Copilot CLI implementation-specific example and should not be generalized as a VS Code specification. ↩