Data Flow
End-to-End Flow
1) Source Change
The user edits .tex files in VS Code.
The editor sends incremental text synchronization events to the server.
2) Source Analysis (Immediate)
On didOpen/didChange/didSave, the server updates the document store:
- snapshot text
- update CST/AST incrementally
- update per-document index entries
The server then updates workspace indices and publishes:
- source diagnostics (parse recovery, unresolved references)
- language features (completion, definition, references, outline, semantic tokens)
3) Build Trigger
One of:
- explicit command (compile)
- save-triggered compilation
- file watcher-triggered compilation
The VS Code extension triggers server commands; the server triggers engine execution.
4) Engine Execution
The server uses an adapter (latexmk / tectonic / direct engine).
Outputs are ingested through one or both channels:
- stdout/stderr capture (low latency)
- log file tailing (canonical reference)
5) Build Parsing + Reconstruction
The parser produces:
- a typed event stream
- a reconstructed file context stack timeline
The reconstruction layer associates diagnostics with:
- file path (best-effort)
- line reference (when present)
- confidence score
6) Diagnostics Publication
The server emits LSP diagnostics.
Diagnostics may be produced by:
- Source pipeline (CST/AST + indexing)
- Build pipeline (engine/log observability)
Diagnostics are stable under incremental updates:
- newly appended bytes should only affect downstream diagnostics if they complete a previously partial structure or introduce new diagnostics
7) User Interaction
The user clicks diagnostics, opens log excerpts, or triggers code actions.
Observability Goals
- Ability to show “why this diagnostic is mapped here” using:
- log byte spans (for build diagnostics)
- file stack at time of build diagnostic emission
- CST node identity / source range (for source diagnostics)
Failure Modes
- Ambiguous file transitions (parenthesis-in-filename)
- Wrapped lines breaking path recognition
- Engine outputs mimicking file stack tokens
In these cases, outputs MUST degrade gracefully:
- emit diagnostics with reduced confidence
- preserve raw log excerpts for user inspection