Question: segmentation #1

Open
jakubMitura14 opened this issue Sep 7, 2023 · 2 comments

Comments

@jakubMitura14

Hello, thanks for publishing your fantastic work. I was wondering whether I can use your encoding as the positional encoding step in a Vision Transformer. The task is segmentation of 3D medical images. Or was the algorithm created specifically for image reconstruction?

@osiriszjq
Owner

Thanks for your interest. It is a general positional encoding method. It is the most 'ideal' one, so the computation is expensive. In this paper, we use the Kronecker property to speed it up. However, this only works when the encoding is followed by a linear layer, which is the case in a transformer if you work through the math. I think the problem is the output dimension. Here, for image reconstruction, it is only 3, but for a transformer it is much larger, so it may be slow or run out of memory (this also depends on the image size). What's more, from my understanding, positional encoding is not that important in vision transformers (I guess?). But if your method needs very good positional information, our method would be a good choice to try.
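
For readers following along, here is a minimal NumPy sketch of the general Kronecker identity that makes this kind of speed-up possible when a linear layer follows the encoding. It is not the repository's implementation: the per-axis encoding matrices `enc_x` and `enc_y`, the function names, and all sizes are hypothetical, and random matrices stand in for the real encodings. The sketch only shows that applying a linear layer to the full Kronecker-product encoding gives the same result as contracting each axis separately, without ever building the full encoding matrix.

```python
import numpy as np


def dense_encode_then_linear(enc_x, enc_y, weight):
    """Reference path: materialize the full Kronecker encoding, then apply the linear layer."""
    full = np.kron(enc_x, enc_y)  # shape (Nx*Ny, Dx*Dy), grows quickly with grid size
    return full @ weight          # shape (Nx*Ny, C)


def factored_encode_then_linear(enc_x, enc_y, weight):
    """Same result, computed axis by axis without building the Kronecker product."""
    nx, dx = enc_x.shape
    ny, dy = enc_y.shape
    c = weight.shape[1]
    w = weight.reshape(dx, dy, c)                     # split the input dim into (Dx, Dy)
    tmp = np.tensordot(enc_x, w, axes=([1], [0]))     # contract the x axis: (Nx, Dy, C)
    out = np.einsum("iec,je->ijc", tmp, enc_y)        # contract the y axis: (Nx, Ny, C)
    return out.reshape(nx * ny, c)


# Hypothetical sizes: an 8 x 6 grid, per-axis encodings of width 5 and 4, 3 output channels.
rng = np.random.default_rng(0)
enc_x = rng.standard_normal((8, 5))
enc_y = rng.standard_normal((6, 4))
weight = rng.standard_normal((5 * 4, 3))

dense = dense_encode_then_linear(enc_x, enc_y, weight)
fast = factored_encode_then_linear(enc_x, enc_y, weight)
print(np.allclose(dense, fast))
```

The factored path still produces an output of size (number of positions) x (output channels), which is consistent with the concern above: with a transformer-sized output dimension instead of 3, that output and the intermediate tensor become much larger.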

@jakubMitura14
Author

Thanks! Yes, it is not critical for the vision transformer, although some improvement can generally be observed. Here I am working on 3D medical images, so in principle relative location is critical. OK, thanks, I will experiment with it!
