Adding Language Support via Config (No Code Required)
lang-check supports two no-code ways to recognize additional file formats:
map a new extension onto an existing built-in language, or
define a Simplified Language Schema (SLS) in YAML for a format that does not have a built-in tree-sitter extractor.
This page focuses on the SLS workflow.
When to use an SLS schema
Use an SLS schema when:
the format is mostly line-oriented markup or config,
you can describe prose lines with regular expressions,
code blocks or metadata can be skipped with simple line or block rules, and
you do not want to write Rust or add a tree-sitter grammar.
Common examples include AsciiDoc, TOML-with-comments, INI-like formats, and project-specific note formats.
Where schema files live
Create schema files in your workspace at:
.langcheck/schemas/
lang-check loads every .yaml and .yml file from that directory in both the
CLI and the language-check server.
Example layout:
    my-project/
      .languagecheck.yaml
      .langcheck/
        schemas/
          asciidoc.yaml
          toml-notes.yml
Schema file format
Each schema file is a YAML object with these fields:
name: a human-readable language ID for the schema.
extensions: file extensions handled by the schema, without leading dots.
prose_patterns: regex rules for lines that count as prose.
skip_patterns: regex rules for lines that should be ignored.
skip_blocks: start/end regex pairs for multi-line blocks that should be ignored entirely.
Minimal shape:
    name: my-format
    extensions:
      - myfmt
    prose_patterns:
      - pattern: "..."
    skip_patterns:
      - pattern: "..."
    skip_blocks:
      - start: "..."
        end: "..."
prose_patterns
prose_patterns is a list of line-level regexes. A non-empty line is treated
as prose only if it matches at least one pattern.
If you leave prose_patterns empty, every non-empty line is treated as prose
unless it is excluded by skip_patterns or skip_blocks.
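As an illustration of a non-empty prose_patterns list, the sketch below checks only lines that begin with a NOTE: or TODO: marker. The schema name, the worklog extension, and the markers are hypothetical, and the pattern assumes the regex dialect supports \s and \S.

```yaml
# Hypothetical schema: only marker lines count as prose.
name: worklog
extensions:
  - worklog
prose_patterns:
  # Matches lines such as "NOTE: remember to rotate the keys"
  - pattern: "^\\s*(NOTE|TODO):\\s+\\S"
skip_patterns: []
skip_blocks: []
```

With this list in place, a line such as status = done matches no prose pattern and is therefore not checked, even though no skip rule mentions it.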
skip_patterns
skip_patterns is a list of line-level regexes. If a line matches any of them,
that line is excluded from checking.
Use this for:
headings,
directive lines,
comments,
key/value assignments,
delimiter lines.
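The categories above might translate into patterns like the following. This is a sketch for an AsciiDoc-like format, not taken from a shipped schema; the exact regexes are assumptions and should be adjusted to the format at hand.

```yaml
skip_patterns:
  - pattern: "^=+\\s"                # headings, e.g. "== Section"
  - pattern: "^:[\\w-]+:"            # directive/attribute lines, e.g. ":toc: left"
  - pattern: "^//"                   # line comments
  - pattern: "^\\s*[\\w.-]+\\s*="    # key/value assignments, e.g. "name = value"
  - pattern: "^[*_']{3,}\\s*$"       # delimiter lines, e.g. "***" or "'''"
```

The backslashes are doubled because the patterns sit inside double-quoted YAML strings, where \\ yields a single backslash for the regex.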
skip_blocks
skip_blocks defines multi-line regions to exclude. Each entry has:
start: regex for the opening line,
end: regex for the closing line.
Once a start line matches, lang-check skips that line, everything inside the
block, and the closing delimiter line.
Use this for:
fenced code blocks,
literal blocks,
embedded scripts,
heredoc-style regions.
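A few start/end pairs for regions of this kind could look as follows. The delimiters (a line of four dots, a script tag, and BEGIN RAW / END RAW marker lines) are illustrative assumptions rather than formats lang-check is known to ship rules for.

```yaml
skip_blocks:
  # Literal block delimited by a line of four dots
  - start: "^\\.\\.\\.\\.\\s*$"
    end: "^\\.\\.\\.\\.\\s*$"
  # Embedded script, from the opening tag to the closing tag
  - start: "^\\s*<script\\b"
    end: "</script>"
  # Region between hypothetical BEGIN RAW / END RAW marker lines
  - start: "^BEGIN RAW\\s*$"
    end: "^END RAW\\s*$"
```

How a single line containing both delimiters is handled is not covered here, so the sketch assumes the opening and closing delimiters sit on separate lines. A heredoc whose terminator changes from block to block cannot be captured by one fixed end pattern; a rule like the last one only works when the terminator is known in advance.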
Worked Example: AsciiDoc
This schema treats normal text as prose, skips headings, and ignores fenced
listing blocks delimited by ----.
Create .langcheck/schemas/asciidoc.yaml in your workspace with:
    name: asciidoc
    extensions:
      - adoc
      - asciidoc
    prose_patterns: []
    skip_patterns:
      - pattern: "^=+\\s"
    skip_blocks:
      - start: "^----\\s*$"
        end: "^----\\s*$"
Sample file:
    = Document Title
    This is an test.
    ----
    This is an test in code.
    ----
With the schema above:
the heading line is skipped by skip_patterns,
the fenced block is skipped by skip_blocks,
This is an test. is treated as prose and checked.
Verify it with:
language-check check path/to/file.adoc
Worked Example: TOML Notes
For a TOML-like file where table headers and key/value assignments should be skipped and the remaining free-form lines checked:
    name: toml-notes
    extensions:
      - toml
    prose_patterns: []
    skip_patterns:
      - pattern: "^\\s*\\["
      - pattern: "^\\s*\\w+\\s*="
    skip_blocks: []
This skips table headers and assignments while still checking every other non-empty line, such as free-form comment lines, because nothing in skip_patterns matches them.
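If only comment lines should be checked, one option is to allow them explicitly through prose_patterns instead of listing everything to skip. The variant below is a sketch meant to replace the schema above rather than sit alongside it. It assumes # starts a comment, and the schema name toml-comments is hypothetical.

```yaml
# Hypothetical alternative: only full-line "#" comments count as prose.
name: toml-comments
extensions:
  - toml
prose_patterns:
  - pattern: "^\\s*#"
skip_patterns: []
skip_blocks: []
```

Because matching is line-oriented, a trailing comment after a value (for example key = 1 # note) does not start with # and is not picked up by this pattern. Whether the leading # itself is passed to the checker is not specified on this page.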
Built-in languages still win
SLS schemas are a fallback only. If a file extension already has a built-in tree-sitter extractor, lang-check keeps using the built-in extractor for that extension.
That means:
a custom .rst schema will not replace the built-in reStructuredText extractor,
a custom .md schema will not replace Markdown,
schemas are primarily for new extensions that do not already have first-class support.
When to use extension aliases instead
If the new format is truly the same as an existing built-in format, a simple
extension alias in .languagecheck.yaml is usually better:
    languages:
      extensions:
        markdown:
          - mdx
That reuses the target language’s full tree-sitter extractor instead of the regex-based SLS fallback.
Limitations
SLS extraction is regex-based, not AST-aware.
Matching is line-oriented; nested syntax is not understood.
Block skipping depends entirely on your start/end patterns.
If a format needs structural parsing, use the plugin workflow instead.
For a dedicated tree-sitter-based language integration, see guide-plugin-language.md.