Can I get scope / scopeRange at a position? #580

seanmcbreen · 2015-11-24T20:49:47Z

From @billti on November 1, 2015 6:10

The API call document.getWordRangeAtPosition(position) appears to use its own definition of a word. For example, my tmLanguage defines attrib-name as a token/scope, yet getWordRangeAtPosition appears to break this into 2 words on the - character.

How can I get token ranges at a position based on my custom syntax? (And it would be really useful if I could get the scope name that goes along with it too).

Copied from original issue: Microsoft/vscode-extensionbuilders#76

seanmcbreen · 2015-11-24T20:49:48Z

From @vilic on November 1, 2015 15:19

👍

seanmcbreen · 2015-11-24T20:49:49Z

From @egamma on November 2, 2015 8:16

Exposing the scope names in the API is on the backlog, but will not make it into the November update.

seanmcbreen · 2015-11-24T20:49:49Z

From @jrieken on November 2, 2015 18:16

@billti despite the lack of access to scopes, you can define a custom word definition so that it will be picked up by document.getWordRangeAtPosition. You can register an ITokenTypeClassificationSupport, which can contribute a regex to classify words.
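
A minimal sketch of why the word definition matters, written as a standalone function rather than against the editor API. The helper `wordRangeAt` and the two patterns are hypothetical and only imitate the general idea of a regex-driven word range; they are not VS Code's implementation. In current versions of the API, getWordRangeAtPosition also accepts a regex as an optional second argument, which may be simpler than registering anything.

```typescript
// Hypothetical helper: columns of the "word" that contains `column`,
// where a word is any non-empty match of `wordPattern` on the line.
interface ColumnRange {
  start: number; // inclusive
  end: number;   // exclusive
}

function wordRangeAt(lineText: string, column: number, wordPattern: RegExp): ColumnRange | undefined {
  // Scan with a fresh global copy so the caller's lastIndex is never touched.
  const scanner = new RegExp(wordPattern.source, wordPattern.ignoreCase ? 'gi' : 'g');
  let match = scanner.exec(lineText);
  while (match !== null) {
    const start = match.index;
    const end = start + match[0].length;
    if (match[0].length === 0) {
      scanner.lastIndex++; // avoid looping forever on empty matches
    } else if (start <= column && column <= end) {
      return { start, end };
    } else if (start > column) {
      return undefined; // already past the requested column
    }
    match = scanner.exec(lineText);
  }
  return undefined;
}

const sampleLine = '<div attrib-name="x">';
console.log(wordRangeAt(sampleLine, 7, /\w+/));    // { start: 5, end: 11 }  -> "attrib"
console.log(wordRangeAt(sampleLine, 7, /[\w-]+/)); // { start: 5, end: 16 }  -> "attrib-name"
```

With the second pattern the hyphen is part of the word, so the whole attribute name comes back as one range.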

seanmcbreen · 2015-11-24T20:49:50Z

From @billti on November 2, 2015 19:3

Thanks @jrieken, I spotted that, and it may be a useful interim solution. But generally for now, if I want to know the classification accurately for a position in a CFG, it seems I'll need to call document.getText() and run my own parser over it - is that right?

seanmcbreen · 2015-11-24T20:49:52Z

From @jrieken on November 3, 2015 9:59

unfortunately yes

hoovercj · 2016-05-14T14:51:29Z

@egamma on November 2, 2015 8:16
Exposing the scope names in the API is on the backlog, but will not make it into the November update.

Is there any update on if/when we can expect a way to get the scopes at a position or offset?

egamma · 2016-05-16T20:36:09Z

@hoovercj all I can currently say is that this is still on the backlog, sorry.

TimonVS · 2016-10-13T21:08:01Z

@egamma Any progress on this? Is there any way I can contribute? :)

siegebell · 2016-10-14T02:26:44Z

Would it be trivial to provide a command that returns a url to the TextMate grammar file being used for a particular document/languageId (or return the contents of the file to keep them read-only)? Then we could use vscode-textmate ourselves to get the token info at a particular location.

hoovercj · 2016-10-14T07:56:17Z

@siegebell -- As a short-term solution, I have successfully included a TextMate grammar with my extension, referenced that, and referenced the built-in vscode-textmate package to get token scopes in an extension.

It's a pain and it really should be part of the API, but it's definitely possible to do today.

I was given the advice to use: var tm = require(path.join(require.main.filename, '../../node_modules/vscode-textmate/release/main.js')); to access vscode-textmate, but since I have a language server I had to evaluate require.main.filename in the language client and pass it over with the initializationOptions to get the right value in my server.

egamma · 2016-10-14T09:57:53Z

@TimonVS exposing the scopes in the API requires that we revisit the internal representation of scopes. This requires major re-architecting, which makes it challenging to open up for contributions.

siegebell · 2016-12-14T18:17:11Z

In the meantime, I've published an extension, scope-info, that provides an API to query the scope at a particular position. It works by querying the installed extensions for language definitions and grammars, and then maintains a parse-state for each open document using vscode-textmate. Only one instance will exist per vscode instance, regardless of how many other extensions depend on it.

Example usage:

import * as vscode from 'vscode';
import * as api from 'scope-info';

async function example(doc: vscode.TextDocument, pos: vscode.Position) {
    // getExtension returns undefined when scope-info is not installed.
    const siExt = vscode.extensions.getExtension<api.ScopeInfoAPI>('siegebell.scope-info');
    if (!siExt) {
        return;
    }
    const si = await siExt.activate();
    const t1: api.Token = si.getScopeAt(doc, pos);
}

Notes:

For typings, refer to scope-info.d.ts.
You can also query the vscode-textmate IGrammar and scope name of a language.
Your extension should list 'siegebell.scope-info' as an extensionDependency.
If multiple extensions contribute to the same language, scope-info may pick the wrong one.
Scope-info might return a scope corresponding to a slightly newer or older document version than what your extension thinks is current.
Pull requests are welcome.

ramya-rao-a · 2017-01-16T19:03:14Z

exposing the scopes in the API requires that we revisit the internal representation of scopes. This requires major re-architecting, which makes it challenging to open up for contributions.

@alexandrudima I believe the above was done as part of #18317

@aeschli Will #18068 cover the feature requested in this issue, or are we suggesting that extension authors use https://marketplace.visualstudio.com/items?itemName=siegebell.scope-info?

aeschli · 2017-01-19T10:40:31Z

Alex added a developer tool that lets you see the tokens at a location. See #17933 (comment)

There's still no extension API that returns TextMate scopes. There are several reasons for that; one of them is that we don't want extensions to start depending on a particular tokenizer grammar.

ImUrX · 2021-05-06T15:30:37Z

this feature would be so useful, and this is one of the oldest issues that's still open

ghost · 2021-10-21T08:48:34Z

I recently wrote vscode-textmate-languageservice precisely to exploit TextMate tokens in providers such as folding, outline/TOC, etc. Unfortunately the performance leaves much to be desired because the code is tokenized again - Gimly/vscode-matlab#142

universemaster · 2021-10-21T13:45:51Z

I appreciate this issue is about a vscode API.

However, are you familiar with https://github.com/draivin/hscopes ?

A meta-extension for vscode that provides TextMate scope information. Its intended usage is as a library for other extensions to query scope information.

and

This extension provides an API by which your extension can query scope & token information.

ghost · 2021-11-11T10:27:27Z

I need a dump of all the tokens in a document tbh. The information is there, it just needs to be exposed in a sane manner.

ghost · 2021-12-10T10:25:49Z

For what it's worth I have hooked into the native module using Microsoft's getCoreNodeModule trick.
It works! But is also slow and retokenizes the entire document - https://github.com/SNDST00M/vscode-textmate-languageservice/blob/v0.2.2/src/util/getCoreNodeModule.ts

savetheclocktower · 2022-02-01T21:01:13Z

There's still no extension API that returns TextMate scopes. There are several reasons for that; one of them is that we don't want extensions to start depending on a particular tokenizer grammar.

I feel silly responding to a four-year-old comment, but it is the last “official” word on this issue, so here I go.

VSCode is the third major code editor to borrow TextMate's grammar system, and I wonder if they all thought that its scope names were simply an implementation detail of its grammars. Quite the opposite — a lot of thought went into this system, since it was also used as the basis for much of TextMate's customizability.

Scope names aren't just hooks for syntax highlighting. TextMate commands are tightly woven to the semantics of scope names. You can have the same key combination perform different commands based on scope, so that your command doesn't monopolize a hotkey for something that (say) only works when the cursor is within a string. Conversely, you can define a command that behaves identically across different languages because it hooks into the presence of a generic scope name. This is how TextMate recognizes URLs across multiple contexts — inside HTML files, inside Markdown files, inside code comments regardless of language — and implements a single “open this URL in my browser” command that works identically across all of them.

Even after moving from TextMate to Atom several years ago, I was able to keep almost all of my ornery customizations because Atom allowed me to inspect scopes at the cursor. I define a command in my Atom init-file whose only purpose is to interpret Enter on my num pad and delegate it to one of three other commands based on what scope I'm in. If I migrate to VSCode in the future, it'll be a reluctant migration if this issue is still open.
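
A rough sketch of that kind of scope-based dispatch, assuming some API handed back the scope stack at the cursor (which is exactly what this issue asks for). The rule table and command names are made up for illustration, and only simple prefix matching is shown, not the full scope selector syntax.

```typescript
// A selector such as "string" matches the scope "string" itself and any
// dot-separated refinement of it, e.g. "string.quoted.double.js".
function scopeMatches(selector: string, scope: string): boolean {
  return scope === selector || scope.indexOf(selector + '.') === 0;
}

// Hypothetical rule: run `command` when the cursor is inside `selector`.
interface ScopeRule {
  selector: string;
  command: string;
}

// Picks a command for one key press from the scope stack at the cursor.
// The stack is assumed to be ordered outermost first, innermost last.
function pickCommand(scopeStack: string[], rules: ScopeRule[], fallback: string): string {
  // Walk from the innermost scope outward so the most specific context wins.
  for (const scope of scopeStack.slice().reverse()) {
    for (const rule of rules) {
      if (scopeMatches(rule.selector, scope)) {
        return rule.command;
      }
    }
  }
  return fallback;
}

// Made-up command names, standing in for whatever the editor provides.
const enterKeyRules: ScopeRule[] = [
  { selector: 'string', command: 'example.splitString' },
  { selector: 'comment', command: 'example.continueComment' },
];

console.log(pickCommand(['source.js', 'string.quoted.double.js'], enterKeyRules, 'example.newline'));
// example.splitString
```

The same rule table works for any language whose grammar uses the conventional `string` and `comment` scope names, which is the portability argument made above.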

TextMate's scope naming conventions are middleware. VSCode could move to tree-sitter grammars tomorrow without breaking anyone's syntax coloring themes; it'd just need to map tree-sitter token names to existing scope names. If a “get all scopes at position X” API existed in VSCode, and I relied upon it when writing an extension, that extension would keep working in a future version of VSCode that no longer supported TM-style grammars but kept TM's scope naming conventions.

You may think, “Yes, but we don't want to make these naming conventions permanent! That's the whole point!” To which I'd ask: what would you replace them with, and why? Is there something that the existing naming conventions can't do? Is there a compelling reason to invent something new that would justify the amount of community effort it would take to adopt a new scheme? Would it make a migration toward tree-sitter grammars easier or harder if syntax coloring themes had to support two different naming systems at once?

As the comments on this issue illustrate, not having this functionality doesn't remove the need for extensions to know this information; it just means those extensions have to use imperfect workarounds. And it results in tighter coupling to TextMate grammars than would otherwise be necessary, since those workarounds need to know the grammar's implementation details to reproduce the result.

I hope you'll consider this feature request sometime soon; it'd be a huge customizability win.

tjx666 · 2022-05-16T02:32:25Z

If I understand correctly, an extension author could use this to implement features such as hover tips without AST parsing. AST parsing can be really expensive.

sandipchitale · 2022-07-01T03:55:09Z

I have implemented an extension:

Show scopes at cursor in active editor

showing how to use API exposed by:

HyperScopes

Hope this helps someone.

m-paternostro · 2022-07-03T17:04:05Z

Another informal vote for this feature, if I may...

In our case, we want to correlate the content from the editor to an extremely rich repository of runtime-produced information. Without the minimum understanding of the code (precisely what symbols/scope/tokens would provide), such a correlation is faulty way more often than acceptable.

Zxynine · 2022-07-15T20:59:16Z

I would love this feature too. I keep ending up here while trying to find a way to know for sure whether a given position is in a comment. It seems that, short of regexes and reading a language configuration file, there is no clear way to know that. Even if I weren't just looking for a way to detect comments, this is a feature I want to see implemented, and it would make a world of difference to many extensions.

pelmers · 2022-08-13T13:28:29Z

Like @Zxynine, I am also looking for a way to tell whether a selection in the editor is a comment. Apparently we are not the only ones. I found this commit in Better Comments which defines the line comment format for many languages: aaron-bond/better-comments@47717e7

So that's one way to implement this (at least for line comments), though not my favorite.
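
A minimal sketch of that lookup-table approach, to make its limits concrete. The table contents and the helper name are assumptions for illustration and are not taken from Better Comments.

```typescript
// Hypothetical per-language table of line comment tokens.
const lineCommentTokens: { [languageId: string]: string | undefined } = {
  javascript: '//',
  python: '#',
  lua: '--',
};

// Returns true if `column` lies at or after the first line comment token on
// the line, false if not, and undefined when the language is not in the table.
function isInLineComment(lineText: string, column: number, languageId: string): boolean | undefined {
  const token = lineCommentTokens[languageId];
  if (token === undefined) {
    return undefined;
  }
  const start = lineText.indexOf(token);
  return start !== -1 && column >= start;
}

console.log(isInLineComment('x = 1  # set x', 10, 'python')); // true
console.log(isInLineComment('x = 1  # set x', 2, 'python'));  // false

// Known weakness: a comment token inside a string literal is misread.
console.log(isInLineComment('const url = "http://x";', 20, 'javascript')); // true (wrong)
```

The last call shows why this is only a workaround: without tokenization there is no way to tell that the `//` sits inside a string.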

I have also tried the API exposed by Hyperscopes (https://github.com/draivin/hscopes), but I experienced multi-second freezes of the extension host even when editing very modestly sized files.

Perhaps tree-sitter would be fast enough to parse files without delay. I see the extensions https://github.com/georgewfraser/vscode-tree-sitter and https://github.com/EvgeniyPeshkov/syntax-highlighter import web-tree-sitter (wasm-compiled tree sitter modules) to provide syntax highlighting.

Of course these are all workarounds, and I think VS Code should provide access to this information. It knows it already, after all!

I agree with @savetheclocktower that the only given justification doesn't seem adequate. Can we quantify the risk? How big of a change has happened to Textmate grammars in the last 7 years?

lukstafi · 2022-11-13T17:24:52Z

If vscode.provideDocumentRangeSemanticTokens output all tokens for most languages, it would satisfy many of the needs discussed here. But in my limited experiments it only outputs "interesting" tokens; it doesn't output tokens for comments, string literals, or operators.
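
For anyone repeating that experiment: semantic tokens come back as a flat integer array with five numbers per token (line delta, start character delta, length, token type index, modifier bit set), and the type index refers to a legend obtained separately. A minimal decoder sketch follows; the function name and the sample legend are assumptions, and it takes plain arrays so it does not depend on the vscode module.

```typescript
interface DecodedToken {
  line: number;
  startChar: number;
  length: number;
  tokenType: string;
  modifierBits: number;
}

// Turns the relative, five-integers-per-token encoding into absolute positions.
function decodeSemanticTokens(data: ArrayLike<number>, tokenTypes: string[]): DecodedToken[] {
  const decoded: DecodedToken[] = [];
  let line = 0;
  let startChar = 0;
  for (let i = 0; i + 4 < data.length; i += 5) {
    // The `?? 0` fallbacks only satisfy strict index checking; the loop
    // condition already guarantees that all five values exist.
    const deltaLine = data[i] ?? 0;
    const deltaStart = data[i + 1] ?? 0;
    line += deltaLine;
    // The start delta is relative to the previous token only on the same line.
    startChar = deltaLine === 0 ? startChar + deltaStart : deltaStart;
    decoded.push({
      line,
      startChar,
      length: data[i + 2] ?? 0,
      tokenType: tokenTypes[data[i + 3] ?? -1] ?? 'unknown',
      modifierBits: data[i + 4] ?? 0,
    });
  }
  return decoded;
}

// Made-up legend and data: two tokens on line 0, one token on line 2.
const sampleLegend = ['variable', 'function'];
const sampleData = [0, 4, 3, 0, 0, 0, 6, 2, 1, 0, 2, 1, 5, 0, 0];
console.log(decodeSemanticTokens(sampleData, sampleLegend).map(t => `${t.line}:${t.startChar} ${t.tokenType}`));
// [ '0:4 variable', '0:10 function', '2:1 variable' ]
```

Decoding this way makes it easy to list which token types a language actually reports, and so to confirm whether comments and strings are missing.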

Fred-Vatin · 2022-11-22T12:55:01Z

Seven years this issue is open.

SEVEN YEARS !!!

I don’t expect this will be fixed soon. We’ll have to learn to live with it or move to another IDE.

PoetaKodu · 2022-12-23T18:08:51Z

Great. I cannot access information about my documents that are there, hidden behind VS Code wall. How is this a thing in 2022? Admins, do not ignore that please.

zm-cttae-archive · 2023-01-28T23:34:41Z

How to turn language & grammar contributions into IGrammar parser from context and language-id alone:
- 97af23cd/src/index.ts#L36-51
- 97af23cd/src/services/resolver.ts
How to tokenize a document in a performant, async way and then cache it: tokenizer.ts snippet

Just to be clear - if you want a one-line scope and scopeRange, there is HyperScopes.
If you need the full document, use this.
The code is significantly more complex because we need browser support, promise caching and cross-env resource hashing for that.

Also, I chose not to use the browser streaming compiler for onig.wasm, I used webpack instead.
It's a very different approach from the monaco-tm repo.

If you use fetch you still need ${vscode.env.appRoot}/blablabla

zm-cttae-archive · 2023-01-29T09:01:03Z

I need to update my code examples to work on web because only fetch works for wasm on web

iCSawyer · 2023-02-13T11:52:06Z

EIGHT years! Eight years after eight years, do you know how I've spent the last eight years? Why is it so difficult to provide them?

zm-cttae-archive · 2023-02-24T06:40:18Z

I have solved this issue for myself and any language extension authors that can pass vscode.ExtensionContext.

EDIT: Some folk at the extension development slack want TypeScript support - I just released it this week.

There is a quite fast full-document tokenization API in vscode-textmate-languageservice - I haven't put JSDoc on it, but it's stable and I'll never want/need to change the API shape.

You need to set up your contributes to wire up a language and its grammar:

Language contribution:
vscode-textmate-languageservice@e71fd80fbda0108ed4b6fda89a3450a902fa7397/package.json • line 44 to 48
Grammar contribution:
vscode-textmate-languageservice@e71fd80fbda0108ed4b6fda89a3450a902fa7397/package.json • line 31 to 40

Then get our tokens:

import * as vscode from 'vscode';
import TextmateLanguageService from 'vscode-textmate-languageservice';

export async function activate(context: vscode.ExtensionContext) {
    const selector: vscode.DocumentSelector = 'matlab';
    // The language ID must match the language contribution shown above.
    const lsp = new TextmateLanguageService('matlab', context);
    const tokenService = await lsp.initTokenService();
    // Assumes an editor is open; activeTextEditor is undefined otherwise.
    const activeTextDocument = vscode.window.activeTextEditor!.document;
    const tokens = tokenService.fetch(activeTextDocument);
}

It works in the browser and can do huge files quite quickly too. File hashing + caching is built in also.

There is a compulsory configuration file, which serves to enhance the results and generate folding level data.
If you are lazy, make it {} at ./textmate-configuration.json and it'll still work.

You can write your own scope and scopeRange functions by using the startIndex, endIndex and line properties.
The line property is zero-indexed FWIW (~~the way real API line numbers should be 😉~~)
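
A minimal sketch of such functions over a token list. The `TokenLike` shape is an assumption: it keeps only the three properties named above plus a `scopes` array, and it treats `endIndex` as exclusive. The scope names in the sample data are made up.

```typescript
// Assumed minimal token shape; the real token type may carry more fields.
interface TokenLike {
  line: number;        // zero-indexed line number
  startIndex: number;  // first column of the token
  endIndex: number;    // assumed exclusive
  scopes: string[];    // assumed ordered outermost first
}

// Finds the token covering the given zero-indexed line and character.
function tokenAt(tokens: TokenLike[], line: number, character: number): TokenLike | undefined {
  for (const token of tokens) {
    if (token.line === line && token.startIndex <= character && character < token.endIndex) {
      return token;
    }
  }
  return undefined;
}

// Innermost scope name at the position, if any token covers it.
function scopeAt(tokens: TokenLike[], line: number, character: number): string | undefined {
  const token = tokenAt(tokens, line, character);
  return token ? token.scopes[token.scopes.length - 1] : undefined;
}

// Column range of the token at the position, if any token covers it.
function scopeRangeAt(tokens: TokenLike[], line: number, character: number): { line: number; start: number; end: number } | undefined {
  const token = tokenAt(tokens, line, character);
  return token ? { line: token.line, start: token.startIndex, end: token.endIndex } : undefined;
}

const sampleTokens: TokenLike[] = [
  { line: 0, startIndex: 0, endIndex: 5, scopes: ['source.example', 'keyword.control.example'] },
  { line: 0, startIndex: 6, endIndex: 20, scopes: ['source.example', 'comment.line.example'] },
];
console.log(scopeAt(sampleTokens, 0, 8));      // comment.line.example
console.log(scopeRangeAt(sampleTokens, 0, 8)); // { line: 0, start: 6, end: 20 }
```

A linear scan is fine for a sketch; for large documents the tokens could be grouped by line first so each lookup only touches one line.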

Enjoy!

zm-cttae-archive · 2023-03-29T04:09:48Z

Tokenization of TypeScript and any other grammar (without having to set up configuration) is now available!

I used textmate-languageservice-contributes key to replace contributes so we don't override existing language contribution.

vsce-toolroom/vscode-textmate-languageservice@v1.2.1/README.md #tokenization

zm-cttae · 2023-09-08T23:30:01Z

https://github.com/vsce-toolroom/vscode-textmate-languageservice/releases/tag/v2.0.0

Add getTokenInformationAtPosition method for fast positional token polyfill: vscode.TokenInformation.
Add getScopeInformationAtPosition method to get Textmate token data: TextmateToken.
Add getScopeRangeAtPosition method to get token range: vscode.Range.
Add getLanguageConfiguration method for language configuration: LanguageDefinition.
Add getGrammarConfiguration method to get language grammar wiring: GrammarLanguageDefinition.
Add getContributorExtension method to get extension source of language ID: vscode.Extension.

Please star the project on GitHub if you think there is further use you could make of it.

zm-cttae · 2023-09-18T08:34:10Z

@alexdima seeing as this has been solved by an external library and the internal proposed API, will this be closed?

seanmcbreen assigned alexdima Nov 24, 2015

alexdima added feature-request Request for new features or functionality api labels Nov 25, 2015

jrieken mentioned this issue Dec 3, 2015

Use default Format options in Extensions #940

Closed

egamma modified the milestone: Backlog Dec 10, 2015

This was referenced Jan 13, 2016

Expose a proper tokenization/colorization API to extensions #1967

Closed

Don't try to content assist in comments #1657

Closed

alexdima added the editor label Mar 17, 2016

alexdima removed their assignment Mar 17, 2016

alexdima added the tokenization Text tokenization label Aug 9, 2016

hoovercj mentioned this issue Aug 12, 2016

API to get the scopes applied to the current position in the document #7027

Closed

jrieken mentioned this issue Sep 8, 2016

Make TextMate scopes available for extensions. #11649

Closed

siegebell mentioned this issue Oct 22, 2016

[Feature Request] View the TMLanguage scope that is called in the editor. #14204

Closed

siegebell mentioned this issue Nov 30, 2016

Allow scoped snippet #15287

Closed

siegebell mentioned this issue Jan 15, 2017

API for extensions to know the scope as per grammar of the cursor position #18485

Closed

ryanfitzer mentioned this issue Mar 2, 2021

Spellcheck only comments streetsidesoftware/vscode-spell-checker#107

Open

This was referenced Nov 22, 2022

VSCode language specific settings Gruntfuggly/todo-tree#365

Closed

A way to get an overview, filter and jump to the relevant comment Zxynine/EvenBetterComments#2

Open

alexdima removed their assignment Dec 6, 2022

joelday mentioned this issue Dec 16, 2022

Grammar check of comments znck/grammarly#17

Open

zm-cttae-archive mentioned this issue Feb 25, 2023

Give access to the AST and custom TS queries for other extension developpers microsoft/vscode-anycode#15

Open

saurabharch mentioned this issue Nov 27, 2023

[Snyk] Security upgrade chokidar from 1.0.5 to 2.0.0 saurabharch/vscode#15

Open

agup006 mentioned this issue Nov 30, 2023

[Snyk] Security upgrade chokidar from 1.0.5 to 2.0.0 agup006/vscode#8

Open

eric-wieser mentioned this issue Jan 2, 2024

[themes] Themes don't support background styling #3429

Open
