Problem:
The codebase inconsistently binds `vim.api` to either `a` or `api`.
Solution:
Use `api` everywhere. `a` is too short an identifier to use at the
module level.
Problem:
Help tags like vim.treesitter.language.add() are confusing because
`vim.treesitter.language` is (thankfully) not a user-facing module.
Solution:
Ignore the "fstem" when generating "treesitter" tags.
Problem:
Treesitter injections are slow because all injected trees are invalidated on every change.
Solution:
Implement smarter invalidation to avoid reparsing injected regions.
- In on_bytes, try to update self._regions as best we can. This PR just offsets any regions after the change.
- Add valid flags for each region in self._regions.
- Call on_bytes recursively for all children.
- We still need to run the query every time for the top level tree. I don't know how to avoid this. However, if the new injection ranges don't change, then we re-use the old trees and avoid reparsing children.
This should result in roughly a 2-3x reduction in tree parsing when the comment injections are enabled.
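The offsetting and valid-flag steps above can be sketched in plain Lua. This is a simplified model, not the actual LanguageTree code: the region shape (`start_byte`, `end_byte`, `valid`) and the function signature are hypothetical, and marking overlapping regions invalid is an assumption about how the flags would be used.

```lua
-- Hypothetical model: each region is { start_byte, end_byte, valid }.
-- The edit replaces bytes [start_byte, old_end_byte) with
-- [start_byte, new_end_byte).
local function on_bytes(regions, start_byte, old_end_byte, new_end_byte)
  local delta = new_end_byte - old_end_byte
  for _, region in ipairs(regions) do
    if region.start_byte >= old_end_byte then
      -- Region lies entirely after the edit: shift it, keep it valid.
      region.start_byte = region.start_byte + delta
      region.end_byte = region.end_byte + delta
    elseif region.end_byte > start_byte then
      -- Region overlaps the edit (assumption): it must be reparsed.
      region.valid = false
    end
    -- Regions entirely before the edit are left untouched.
  end
  return regions
end

local regions = {
  { start_byte = 0, end_byte = 10, valid = true },
  { start_byte = 20, end_byte = 30, valid = true },
  { start_byte = 40, end_byte = 50, valid = true },
}
-- Replace bytes [22, 25) with 5 bytes, so the edit now ends at byte 27.
on_bytes(regions, 22, 25, 27)
```

In this model only the second region loses its valid flag; the third is shifted by two bytes and its tree could be reused.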
Problem:
vim.treesitter does not know how to map a specific filetype to a parser.
This creates problems since in a few places (including in vim.treesitter itself), the filetype is incorrectly used in place of lang.
Solution:
Add an API to enable this:
- Add vim.treesitter.language.add() as a replacement for vim.treesitter.language.require_language().
- Optional arguments are now passed via an opts table.
- Also takes a filetype (or list of filetypes) so we can keep track of what filetypes are associated with which langs.
- Deprecate vim.treesitter.language.require_language().
- Add vim.treesitter.language.get_lang() which returns the associated lang for a given filetype.
- Add vim.treesitter.language.register() to associate filetypes to a lang without loading the parser.
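The filetype-to-lang bookkeeping behind register() and get_lang() can be modelled roughly as below. This is a standalone sketch, not Neovim's implementation: the `ft_to_lang` table and both local functions are hypothetical stand-ins that only mirror the behaviour described in the list above.

```lua
-- Hypothetical stand-in for the filetype -> lang mapping.
local ft_to_lang = {}

-- Associate one filetype, or a list of filetypes, with a lang.
local function register(lang, filetype)
  local filetypes = type(filetype) == 'table' and filetype or { filetype }
  for _, ft in ipairs(filetypes) do
    ft_to_lang[ft] = lang
  end
end

-- Return the lang associated with a filetype, or nil if none is known.
local function get_lang(filetype)
  return ft_to_lang[filetype]
end

register('latex', { 'tex', 'plaintex' })
register('bash', 'sh')
```

Callers can then look up the parser language from a buffer's filetype instead of assuming the two names are the same.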
Use the first, not last, query for a language on runtimepath. Typically,
this implies that a user query will override a site plugin query, which
will override a bundled runtime query.
Problem: Treesitter queries for a given language in runtime were merged together,
leading to errors if they targeted different parser versions (e.g., bundled viml queries
and those shipped by nvim-treesitter).
Solution: Runtime queries now work as follows:
* The last query in the rtp without `; extends` in the header will be used as the base query
* All queries (without a specific order) with `; extends` are concatenated with the base query
BREAKING CHANGE: queries need to be updated if they are meant to extend other queries
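The two rules above can be sketched as follows. This is an illustrative model only: `resolve_queries` and its input shape (a list of query file contents in runtimepath order) are hypothetical, and the header check is a simplification.

```lua
-- queries: query file contents in runtimepath order (hypothetical shape).
local function resolve_queries(queries)
  local base
  local extensions = {}
  for _, text in ipairs(queries) do
    if text:match('^;+%s*extends') then
      -- Queries marked `; extends` are always kept.
      extensions[#extensions + 1] = text
    else
      -- Later entries overwrite earlier ones, so the last one wins.
      base = text
    end
  end
  local parts = {}
  if base then
    parts[1] = base
  end
  for _, ext in ipairs(extensions) do
    parts[#parts + 1] = ext
  end
  return table.concat(parts, '\n')
end

local merged = resolve_queries({
  '(comment) @comment',
  '; extends\n(string) @string',
  '(identifier) @variable',
})
```

Here the first query is dropped entirely, because a later query without `; extends` replaces it as the base.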
As part of upstreaming utility functions from nvim-treesitter, this
option, when set to false, makes the function return a table (the
downstream behavior). This makes switching from the downstream function
to the upstream one much easier.
Previously the `offset!` directive populated the metadata in such a way
that the new range could be attributed to a specific capture. #14046
made the directive store only the new range in the metadata, so the
information about which capture the range is based on is lost.
This change reverts that whilst also correcting the docs.
Based on https://github.com/neovim/neovim/pull/14445
This extends `vim.treesitter.query.get_node_text` to return the text
that spans a node's range even if start_row ~= end_row.
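A minimal sketch of extracting text across several rows, assuming the buffer is available as a Lua list of lines. `get_range_text` is a hypothetical standalone helper, not the `get_node_text` implementation; rows and columns are 0-indexed and `end_col` is exclusive.

```lua
-- lines: buffer lines as a 1-indexed Lua table (assumed input shape).
local function get_range_text(lines, start_row, start_col, end_row, end_col)
  if start_row == end_row then
    -- Single-row case: slice one line.
    return lines[start_row + 1]:sub(start_col + 1, end_col)
  end
  -- First row: from start_col to the end of the line.
  local parts = { lines[start_row + 1]:sub(start_col + 1) }
  -- Middle rows: whole lines.
  for row = start_row + 1, end_row - 1 do
    parts[#parts + 1] = lines[row + 1]
  end
  -- Last row: from the start of the line up to end_col.
  -- A range may end at column 0 of the row past the last line.
  parts[#parts + 1] = (lines[end_row + 1] or ''):sub(1, end_col)
  return table.concat(parts, '\n')
end

local lines = { 'local x = {', '  1,', '}' }
local text = get_range_text(lines, 0, 10, 2, 1)
```

The rows are joined with newlines, so a caller that needs a list of lines would have to split the result again.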
The official developer documentation in :h dev-lua-doc specifies to
use "--@" for special/magic tokens. However, this format is not
consistent with EmmyLua notation (used by some Lua language servers) nor
with the C version of the magic docstring tokens which use three comment
characters.
Further, the code base is currently split between usage of "--@",
"---@", and "--- @". In an effort to remain consistent, change all Lua
magic tokens to use "---@" and update the developer documentation
accordingly.
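For illustration, a docstring using the "---@" form might look like the following. The `clamp` function and its parameter descriptions are hypothetical, and the exact tag layout expected by the doc generator may differ from this EmmyLua-style sketch.

```lua
--- Clamps a number to a closed interval.
---@param value number The number to clamp
---@param min number Lower bound
---@param max number Upper bound
---@return number The clamped value
local function clamp(value, min, max)
  return math.max(min, math.min(max, value))
end
```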
For the case of Clojure and other Lisp syntax highlighting, it is
necessary to create huge regexps consisting of hundreds of symbols with
the pipe (|) character. To make things more difficult, these Lisp
symbols sometimes contain characters, such as '*', that are themselves
special regexp characters. In addition to being difficult to maintain,
such regexps perform suboptimally.
This patch introduces a new predicate to perform 'source' matching in
amortized constant time. This is accomplished by compiling a hash table
on the first use.
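The idea can be sketched in plain Lua as below. This is not the patch's code: `matches_any` and the `compiled` cache are hypothetical names, and keying the cache on the candidate list itself is an assumption made to keep the example short.

```lua
-- Cache of compiled sets, keyed by the candidate list (weak keys so that
-- unused lists can be garbage collected).
local compiled = setmetatable({}, { __mode = 'k' })

-- Returns true if `source` is exactly one of the strings in `candidates`.
local function matches_any(source, candidates)
  local set = compiled[candidates]
  if not set then
    -- First use: build the hash table once, in time linear in the list.
    set = {}
    for _, s in ipairs(candidates) do
      set[s] = true
    end
    compiled[candidates] = set
  end
  -- Every later lookup is a single table access.
  return set[source] == true
end

local builtins = { 'defn', 'let*', '->>', 'swap!' }
```

Because the strings are compared literally, symbols such as `let*` need no escaping, unlike in a regexp alternation.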