Skip to content

GitHub Copilot Complete Guide

GitHub Copilot Context Injection Mechanism Explained [Diagrams]

Target audience:

Developers using GitHub Copilot or Claude Code instruction settings who want to understand the technical rationale behind "why you should set it this way."

Key Takeaways

  • According to VS Code team implementation documentation, Copilot's custom instructions are added to the chat context (system prompt side). Claude Code's CLAUDE.md is injected as a User Message per official spec. This difference decisively determines how persistent rules are.
  • The main reason to keep Copilot instructions concise is context consumption per request — but "context limits" (instructions being overlooked when too long) is a separate issue that also exists.
  • Understanding how both tools work gives you clear criteria for deciding what to write where.

Background: "Why Keep It Concise?" — A Rationale That Got Lost

Best practices for GitHub Copilot custom instructions widely say "keep the file concise" and "use references to external documents." However, in-depth explanations of the technical mechanism behind why conciseness is necessary are rarely found.

Rather than "because the official docs recommend it," understanding how LLM prompt injection works fundamentally changes how you approach instruction design.

This article compares the internal mechanisms of GitHub Copilot and Claude Code to explain the technical rationale for "what to write where."

Related article

For specific setup steps and best practices for custom instructions, see GitHub Copilot Custom Instructions Complete Guide.


GitHub Copilot's Internal Prompt Structure

The Full Prompt Sent to the LLM

When GitHub Copilot (VS Code) sends a request to the LLM, the prompt is structured around System Prompt and User Message areas. According to a VS Code team member's implementation documentation1, custom instructions are added to the system prompt side — though the official VS Code docs use the term "added to the chat context" without specifying the exact internal area2.

graph TB
    subgraph SP["🔒 System Prompt / Chat Context (Foundation for model behavior)"]
        direction TB
        A["Core Identity & Global Rules<br><i>e.g., You are an expert AI programming assistant...</i>"]
        B["General Instructions<br><i>Changes dynamically per model</i>"]
        C["Tool Use Instructions<br><i>Guidelines for terminal execution, file editing</i>"]
        D["Output Format Instructions<br><i>Rules for response formatting</i>"]
        E["📄 .instructions.md content<br><i>Added via applyTo glob / semantic matching</i>"]:::inst
        F["📄 copilot-instructions.md content<br><i>Added to chat context (personal > repo > org priority)</i>"]:::inst
        G["📄 Custom Agent instructions<br><i>Appended only when in use</i>"]:::inst
        A --> B --> C --> D --> E --> F --> G
    end

    subgraph UM["💬 User Message (Part of the conversation)"]
        direction TB
        H["Prompt Files content<br><i>Only when explicitly referenced</i>"]
        I["Environment / Workspace Info<br><i>OS info, directory structure</i>"]
        J["Editor Context<br><i>Open files, selected text</i>"]
        K["The user's actual prompt"]
        H --> I --> J --> K
    end

    SP --> UM

    classDef inst fill:#0288d1,color:#ffffff,stroke:#01579b,stroke-width:2px

About the source

The structure above is based on VS Code team member Burke Holland's implementation documentation1. The official VS Code docs describe this as "combined and added to the chat context" — the exact internal area and ordering are not guaranteed as specification2.

What Instruction Injection Means

Instructions are included in the chat context on each request. Per VS Code team documentation, they are added to the system prompt side. The key characteristics are:

Persistence: Unlike the "position-based attenuation" seen in Claude Code's CLAUDE.md (which becomes an older message as turns progress), Copilot's instructions are re-assembled fresh in the context at each request. The weight of rules doesn't diminish over session turns in principle.

Cost: Instruction tokens are consumed on every request. 100 lines of instructions reduce the context window available for code generation and responses by that many tokens. If instructions are very long, some may be overlooked (context limits3).

In other words, the main reason to "keep Copilot instructions concise" is per-request context consumption, not positional attenuation over turns. However, the "context limits" problem (instructions being overlooked when too long) exists separately and is worth noting.

When Path-Specific Instructions Apply

When you specify applyTo in the YAML Frontmatter of a .instructions.md file, those instructions are added to the chat context conditionally2. There are two evaluation paths:

  1. applyTo glob match: The open file's path matches the applyTo pattern
  2. Semantic matching: The agent determines relevance from the task content
sequenceDiagram
    participant Dev as Developer
    participant VSCode as VS Code
    participant LLM as LLM

    Dev->>VSCode: Opens src/api/users.ts and asks in Chat
    VSCode->>VSCode: Evaluates via applyTo glob / semantic matching
    VSCode->>LLM: Adds backend.instructions.md<br>to chat context
    Note over LLM: ※Re-evaluated per request.<br>This request's context:<br>copilot-instructions.md +<br>backend.instructions.md

    Dev->>VSCode: Opens src/components/Button.tsx and asks in Chat
    VSCode->>VSCode: Evaluates via applyTo glob / semantic matching
    VSCode->>LLM: Adds frontend.instructions.md<br>to chat context
    Note over LLM: ※Re-evaluated per request.<br>This request's context:<br>copilot-instructions.md +<br>frontend.instructions.md

This mechanism enables adding only the rules needed, only when needed. There is no need to cram all rules into copilot-instructions.md. Note that path-specific instructions from a previous request are not automatically carried over — context is re-assembled fresh on each request.

Copilot Code Review Constraints

In Copilot Code Review, only the first 4,000 characters of custom instruction files are read3. This limitation does not apply to Copilot Chat or Coding Agent, but for review purposes, prioritizing what appears at the top of your instructions becomes especially important. At the recommended size target for copilot-instructions.md (≈ 50 lines ≈ 1,500–2,000 characters), the entire file falls well within the Code Review limit.

Three Limitations the Official Docs Acknowledge

GitHub's official documentation explicitly states the following limitations of custom instructions3.

Non-deterministic behavior: Copilot may not always perfectly follow every instruction on every request. This stems from the probabilistic nature of LLMs.

Context limits: Very long instruction files may result in some instructions being ignored.

Importance of specificity: Clear, specific instructions are more effective than vague ones.

Agent Skills On-Demand Injection: A Third Mechanism

Agent Skills have a third injection mechanism distinct from Custom Instructions and applyTo.

According to the VS Code official documentation, Skills operate with 3-level Progressive Disclosure7.

LevelWhat gets loadedWhen
Level 1name and description onlyAlways (used for skill discovery/relevance matching; how it's included in input is implementation-dependent7)
Level 2SKILL.md body (instructions, guidelines)When the agent determines relevance, or when the user explicitly invokes it via /
Level 3Bundled resources (scripts/, references/, templates/)When the Skill's instructions reference them
sequenceDiagram
    participant User as User
    participant Agent as Copilot Agent
    participant LLM as LLM

    Note over Agent,LLM: Level 1: metadata for skill<br>discovery/relevance matching<br>(how it's included is implementation-dependent)
    Agent->>LLM: name + description (discovery metadata)

    User->>Agent: "Review this code"
    Agent->>Agent: Determines relevant Skill from description
    Agent->>LLM: Level 2: Inject SKILL.md body
    Note over LLM: Only relevant Skill instructions added

    LLM->>Agent: Resource reference needed within Skill
    Agent->>LLM: Level 3: Inject scripts/ referenced files

This mechanism allows injecting detailed instructions on demand. However, the tradeoff is that Level 2+ injection depends on the agent's judgment, meaning there is no guarantee it will always be loaded.

Registering large numbers of Skills can hit visibility limits in practice8.


Claude Code (CLAUDE.md) Internal Mechanism

Injected as a User Message

In Claude Code, the content of CLAUDE.md is not injected into the System Prompt — it is injected as the first User Message at the start of the session4. This is the structural contrast to Copilot, which holds instructions in the chat context (system prompt side).

CLAUDE.md adds the contents as a user message following Claude Code's default system prompt.4

graph TB
    subgraph SP2["🔒 System Prompt"]
        direction TB
        A2["Claude Code's default System Prompt"]
    end

    subgraph UM2["💬 User Messages (chronological order)"]
        direction TB
        B2["📄 CLAUDE.md content<br><i>Injected at session start</i>"]:::claude
        C2["User's prompt #1"]
        D2["Claude's response #1"]
        E2["...(CLAUDE.md gets older as conversation continues)"]
        F2["User's prompt #N"]
        B2 --> C2 --> D2 --> E2 --> F2
    end

    SP2 --> UM2

    classDef claude fill:#e65100,color:#ffffff,stroke:#bf360c,stroke-width:2px

Because CLAUDE.md is injected as a User Message, its influence on rules tends to weaken later in a session (lost-in-the-middle effect6). This is the key difference from Copilot — the comparison table below shows the full picture.


Comparison: The Full Picture of Five Injection Mechanisms

Placing the mechanisms of all three tools side by side reveals the complete tradeoff structure of "context consumption vs. rule persistence."

MechanismAdded toContext consumptionRule persistenceWhen loaded
copilot-instructions.mdchat context (system prompt side1)Always consumed (every request)HighAlways
.instructions.md + applyTochat context (system prompt side1)ConditionalHighapplyTo glob match or semantic matching
Agent Skillschat context (L1: discovery metadata / L2+: on-demand)Minimal (L1 always, placement implementation-dependent)Medium (reliability concern)Agent judgment or / invocation
CLAUDE.mdUser Message (per official spec4)Always consumedLow–Medium (attenuates, but re-injected after /compact5)Session start (re-injected after /compact)
.claude/rules/Context (injection format unspecified5)ConditionalMediumFirst interaction with matching file (lazy load)
graph LR
    subgraph Copilot["GitHub Copilot"]
        direction TB
        CopSP["System Prompt / Chat Context<br>─────────<br>copilot-instructions.md<br>.instructions.md<br>Skills (L1: discovery metadata / L2+: on-demand)"]:::sp
        CopUM["User Messages<br>─────────<br>Prompt Files<br>User's questions"]:::um
        CopSP --- CopUM
    end

    subgraph ClaudeCode["Claude Code"]
        direction TB
        CCSP["System Prompt<br>─────────<br>Default only<br><i>(can append with<br>--append-system-prompt)</i>"]:::sp
        CCUM["User Messages<br>─────────<br>CLAUDE.md<br>.claude/rules/<br>User's questions"]:::um
        CCSP --- CCUM
    end

    classDef sp fill:#2e7d32,color:#ffffff,stroke:#1b5e20,stroke-width:2px
    classDef um fill:#b71c1c,color:#ffffff,stroke:#880e4f,stroke-width:2px

Impact on Design Decisions

From this structural difference, the following design decisions follow.

For Copilot: Since instructions are included in the chat context on every request (system prompt side per VS Code team explanation), putting "rules to enforce throughout the entire session" in copilot-instructions.md is rational. However, to minimize token consumption, keep content concise and handle details with the external document reference pattern. Separating frequently-invoked specialized instructions (code generation templates, debugging procedures, etc.) into Skills significantly improves context efficiency.

For Claude Code: Since CLAUDE.md is a User Message (per official spec4), its persistence characteristics differ from Copilot's chat context injection. Claude Code design best practices are covered separately in detail.


Practice: Instruction Design Based on Mechanism

.github/
├── copilot-instructions.md          # Always applied (resident in System Prompt)
│   → Tech stack, naming conventions, required rules
│   → Keep concise (aim for 50 lines or fewer)
│   → Handle details with references to docs/
│
└── instructions/
    ├── frontend.instructions.md     # applyTo for frontend only
    ├── backend.instructions.md      # applyTo for backend only
    └── testing.instructions.md      # applyTo for tests only
        → Injected into System Prompt only when needed
        → Conserves space in copilot-instructions.md

Design principle: Be mindful of per-request context consumption costs; use "is this needed in every context?" as your decision criterion. Separate rules needed only for specific paths into .instructions.md to avoid unnecessary token consumption.


Summary

The optimal design for instructions only becomes clear once you understand the internal mechanism.

The essence of "keep it concise" is per-request context consumption for Copilot (added to the chat context, system prompt side per VS Code team documentation), and positional attenuation as a User Message for Claude Code (per official spec). The same conclusion (keep it concise) but different rationales. Note that for Claude Code, /compact re-injects CLAUDE.md fresh, so the attenuation effect is primarily a concern in very long, uncompacted sessions.

Common to both tools is the principle of adding the right rules at the right time in the right place. The applyTo / semantic matching in Copilot and the paths in .claude/rules/ are precisely the features that realize this principle.

Next steps

For specific setup steps and templates for custom instructions, see the following articles.



  1. Burke Holland (VS Code team), "Prompt Files vs Custom Instructions vs Custom Agents" — Implementation documentation by a VS Code team member. Explains that Custom Instructions are "appended to the system prompt" and "always last." Note: this is an implementation explanation, not an official specification document. 

  2. VS Code official documentation, "Use custom instructions in VS Code" — Official docs use "combined and added to the chat context." Specifies instruction priority (personal > repository > organization) and that application is determined by applyTo glob matching + semantic matching. File concatenation order is not guaranteed. 

  3. GitHub Docs, "Using custom instructions to unlock the power of Copilot code review" — Explicitly states: 4,000-character limit for Code Review, non-deterministic behavior (Copilot may not always follow all instructions), and context limits (very long files may result in some instructions being ignored). 

  4. Claude Code official documentation, "Output styles" — Explicitly states: "CLAUDE.md adds the contents as a user message following Claude Code's default system prompt." The --append-system-prompt option appends to the system prompt instead. 

  5. Claude Code official documentation, "Memory management" — Explicitly states: "Claude treats them as context, not enforced configuration." Rules in .claude/rules/ are "loaded into context" when matching files are worked on. The exact injection format (User Message vs. otherwise) is not specified. 

  6. CureApp Tech Blog (7tsuno), "What Should You Actually Write in CLAUDE.md" — Secondary commentary on the Session Start Hook framing and weakening influence over time. Referenced as supplementary material. 

  7. VS Code official documentation, "Use agent skills in VS Code" — The 3-level Progressive Disclosure mechanism for Skills (name/description → SKILL.md body → resources). 

  8. GitHub copilot-cli Issue #1130, "Skills token limit impact on context window" — A reported case where only 31 of 49 Skills were visible in Copilot CLI. This is a Copilot CLI implementation-specific example and should not be generalized as a VS Code specification.