Define a protocol for syntax tokens #1063
That is what Visual Studio Code does. I am not sure about other LSP clients.
About merging the tokens (via applying them on top) or replacing the tokens (by discarding the client information from the grammar), that has been discussed in the past (#18 (comment)) but it seems to have stalled a bit.
All I can currently think of is to make this a client capability. I guess we will not be able to force consistent behavior across all clients.
Was it ever an ambition of semantic tokens to replace the existing (static, often regex-based) grammars? E.g. do you foresee any VS Code extension using exclusively semantic tokens and having no TM grammar? Or is there a reason why using semantic tokens as the exclusive way of highlighting files would be a bad idea? Personally I think it would be great if the wider community could agree on a single protocol/format for syntax highlighting, and adding this to LSIF would probably help too. There are just way too many different solutions to the same problem (TextMate is the most popular one, but really it's one of many). I can see that being a long journey though, as I reckon editors would still want to provide some highlighting without LSP. Assuming language servers today usually aren't built into the editors, having to take any extra step just to get syntax highlighting to work seems like a potential source of friction in that context.
Added a capability.
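To make the merge-vs-replace decision concrete, here is an illustrative TypeScript sketch of how a server could branch on such a client capability. This is not spec text: the field name `augmentsSyntaxTokens` matches the client capability that later appeared in LSP 3.17, and `emitKeywordTokens` is a hypothetical server-side helper invented for this example.

```typescript
// Sketch: branching on a client capability that says whether the client
// layers (augments) semantic tokens on top of its own syntactic
// highlighting, or relies on semantic tokens alone.
interface SemanticTokensClientCapabilities {
  /** True if the client merges semantic tokens onto existing syntax tokens. */
  augmentsSyntaxTokens?: boolean;
}

function emitKeywordTokens(caps: SemanticTokensClientCapabilities): boolean {
  // If the client only augments an existing grammar, the server can skip
  // low-value tokens (e.g. keywords) that the grammar already colors.
  // If the client has no grammar to fall back on, emit everything.
  return !(caps.augmentsSyntaxTokens ?? false);
}
```

The point of the sketch is that the capability lets one server binary serve both kinds of clients without forcing a single rendering behavior on everyone.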
I'd love to see a clarification on the intended/expected scope of semantic tokens. Even within just the few language servers I use regularly, there is huge variety in what highlighting tokens they provide. I agree with radeksimko above that it would be great if LSP could serve as a unified provider for syntax+semantic highlighting.
FWIW we have recently implemented custom token types and modifiers while still keeping the predefined values from the spec as a fallback, so with the right use of capability negotiation this doesn't need to be a "mutually exclusive" choice. This of course assumes that if you want to make use of these custom token types and modifiers, the client needs to implement them, which practically couples it a bit more with a particular server. That seems okay to me: highlighting should work just as before for those clients which do not support the custom types and modifiers, as long as both sides do the capability negotiation right.

However, the whole highlighting chain in practice is more involved than that. There is also ambiguity in the handling of conflicts between extensions/themes which claim the same files on the client. The VS Code extension API allows you to define a mapping of your custom types to TM scopes.

Perhaps none of the above are strictly LSP problems, but they are problems client maintainers will likely run into when implementing semantic-token-based highlighting.
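The fallback described above can be sketched roughly as follows. This is an assumed shape, not the commenter's actual code: the server builds its semantic tokens legend from the token types the client advertised, keeping custom types only when the client declared support for them. The `hcl-*` names are hypothetical examples.

```typescript
// Predefined token types from the LSP spec (a subset, for illustration).
const PREDEFINED = ["namespace", "type", "function", "variable", "keyword"];

// Hypothetical server-specific custom token types.
const SERVER_CUSTOM = ["hcl-attribute", "hcl-block-type"];

// Build the server's legend from the token types the client advertised
// in its capabilities: custom types are included only if the client
// knows them, so unaware clients still get the predefined fallback.
function buildLegend(clientTokenTypes: string[]): string[] {
  const known = new Set(clientTokenTypes);
  const custom = SERVER_CUSTOM.filter((t) => known.has(t));
  return [...PREDEFINED, ...custom];
}
```

A client that never lists `hcl-attribute` simply gets a legend of predefined types, so its highlighting looks the same as before the custom types existed.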
There are some languages (e.g. XQuery) where a state-based tokenizer is necessary in order to determine the correct tokens. This is partially because it functions as a templating language, like PHP or the Liquid templating engine, where you can mix XML and XPath/XQuery code. Additionally, XQuery can use most of its keywords as identifiers, with different versions of the language (and vendor extensions) having different sets of reserved keywords. This means that any regex-based keyword syntax highlighting will need to be able to remove keyword highlighting and specify a semantics-based highlighting (variable, function, etc.). Having to maintain two different grammars or hand-written lexers/parsers, one for the syntax highlighting and one for everything else, generally defeats the point of having the LSP separate from the editor. It also means that there can be differences in the highlighting, especially for complex languages or language features like string interpolation and language injection (e.g. CSS in HTML style elements).
Is my reading correct that semantic tokens only add additional information to tokens? Can we trust that clients will render keywords correctly if they already have a syntax highlighter (frequently regexp-based)?

If that's the case, should there be an option to "unset" a type? Just as an example, if you have `#ifdef`s, you could think of giving the LSP server the option of removing syntax coloring from inactive code segments by giving it the token type `None`.
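For reference, the "add information to tokens" reading matches how semantic tokens travel over the wire: the response's `data` array packs five integers per token (deltaLine, deltaStartChar, length, token type index, modifier bitset), and the client overlays the decoded tokens on whatever highlighting it already has. The decoder below is a minimal sketch of that encoding; the `"none"` legend entry is hypothetical, showing what an "unset" type could look like, and is not part of the spec.

```typescript
interface Token { line: number; startChar: number; length: number; type: string; }

// Decode the LSP semantic tokens `data` array into absolute positions.
function decode(data: number[], legend: string[]): Token[] {
  const out: Token[] = [];
  let line = 0;
  let startChar = 0;
  for (let i = 0; i < data.length; i += 5) {
    line += data[i];
    // deltaStartChar is relative to the previous token only when the
    // token stays on the same line; otherwise it is line-absolute.
    startChar = data[i] === 0 ? startChar + data[i + 1] : data[i + 1];
    out.push({ line, startChar, length: data[i + 2], type: legend[data[i + 3]] });
  }
  return out;
}

const legend = ["keyword", "variable", "none"]; // "none" is hypothetical
// Two tokens: a 6-char keyword at 0:0, then a 3-char "none" token at 1:2.
const tokens = decode([0, 0, 6, 0, 0, 1, 2, 3, 2, 0], legend);
```

Under this reading, an "unset" would still be a token the client has to interpret, which is exactly why the question of trusting the client's existing highlighter matters.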