Generalize tl.dendrogram function to either axis #2771

Open
wants to merge 17 commits into
base: main

@dburkhardt commented Nov 28, 2023

  • Release notes not necessary because:

@dburkhardt mentioned this pull request Nov 28, 2023
3 tasks

codecov bot commented Nov 28, 2023

Codecov Report

Attention: 6 lines in your changes are missing coverage. Please review.

Comparison is base (ddeb820) 74.56% compared to head (521a590) 74.65%.
Report is 7 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #2771      +/-   ##
==========================================
+ Coverage   74.56%   74.65%   +0.08%     
==========================================
  Files         115      115              
  Lines       12713    12783      +70     
==========================================
+ Hits         9480     9543      +63     
- Misses       3233     3240       +7     
| Files | Coverage Δ |
|---|---|
| scanpy/plotting/_anndata.py | 84.93% <100.00%> (-0.06%) ⬇️ |
| scanpy/plotting/_baseplot_class.py | 89.88% <100.00%> (ø) |
| scanpy/tools/_utils.py | 70.58% <85.71%> (+0.96%) ⬆️ |
| scanpy/tools/_dendrogram.py | 88.40% <89.58%> (+1.73%) ⬆️ |

... and 9 files with indirect coverage changes

scanpy/tools/_dendrogram.py (outdated)
dim
Dimension (obs or var) along which to calculate the dendrogram. Pass either `dim` or `axis` but not both.
axis
Axis (0 or 1) along which to calculate the dendrogram. Pass either `dim` or `axis` but not both.
{n_pcs}
{use_rep}
var_names
Contributor

@ilan-gold Jan 12, 2024

Is it overkill to make this also generalize i.e., you can groupby on either axis and you can subset (i.e., what var_names does now) on the opposite axis?

Contributor

I will say this part kind of threw me given the title of the PR. I see what is going on now, but would be curious to hear your thoughts, @flying-sheep. I would expect identical operations to be possible on either axis, but again, that may be overkill.
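
For reference, the "pass either `dim` or `axis` but not both" rule from the docstring under review could be enforced roughly as follows. This is only a sketch: `resolve_axis` is a hypothetical helper name, and falling back to 0 (obs) when neither argument is given is an assumption, not something taken from the PR.

```python
from __future__ import annotations


def resolve_axis(dim: str | None = None, axis: int | None = None) -> int:
    """Hypothetical helper: map `dim`/`axis` to an integer axis (0 = obs, 1 = var)."""
    if dim is not None and axis is not None:
        raise ValueError("Pass either `dim` or `axis`, but not both.")
    if dim is not None:
        try:
            return {"obs": 0, "var": 1}[dim]
        except KeyError:
            raise ValueError(f"`dim` must be 'obs' or 'var', got {dim!r}.") from None
    if axis is None:
        return 0  # assumed default: cluster along observations
    if axis not in (0, 1):
        raise ValueError(f"`axis` must be 0 or 1, got {axis!r}.")
    return axis


print(resolve_axis(dim="var"))  # resolves the name to its integer axis
print(resolve_axis(axis=0))
```

Keeping the check in one place would let both spellings share the same error messages.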

@flying-sheep added this to the 1.10.0 milestone Jan 15, 2024
@flying-sheep
Member

OK, updated this so it follows the decision implemented in #1244

Member

@flying-sheep left a comment

OK, I checked this out and did some more stylistic changes myself.

The big things for me are:

  1. Documentation:

    The obs column(s) to use to group observations. Default is None.

    What does None mean?

  2. The corresponding .pl function: Examples look like this:

    >>> sc.tl.dendrogram(adata, groupby='bulk_labels')
    >>> sc.pl.dendrogram(adata, groupby='bulk_labels')

    and there’s no guidance or example for leaving out pl.
    If people try this, it will throw an error:

    >>> sc.tl.dendrogram(adata)
    >>> sc.pl.dendrogram(adata)

    So will this

    >>> sc.tl.dendrogram(adata, axis='var')
    >>> sc.pl.dendrogram(adata, axis='var')

    I think these things should work, or at least be very well documented
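
One way the tl and pl functions could stay in sync is to derive the `.uns` key in a single shared helper. The sketch below is an assumption, not the PR's implementation: `dendrogram_key` is a hypothetical name, the `dendrogram_<axis>` fallback for a missing `groupby` is invented for illustration, and a plain dict stands in for `adata.uns`. Only the `dendrogram_<groupby>` pattern follows scanpy's existing default key.

```python
from __future__ import annotations

from collections.abc import Sequence


def dendrogram_key(groupby: str | Sequence[str] | None = None, axis: str = "obs") -> str:
    """Hypothetical: derive one storage key from (groupby, axis) for both tl and pl."""
    if groupby is None:
        # Assumed naming for an ungrouped dendrogram; not taken from the PR.
        return f"dendrogram_{axis}"
    if isinstance(groupby, str):
        groupby = [groupby]
    return "dendrogram_" + "_".join(groupby)


# A plain dict stands in for `adata.uns` in this sketch.
uns: dict[str, dict] = {}

# "tl" side: store the result under the shared key.
uns[dendrogram_key(axis="var")] = {"axis": "var"}

# "pl" side: look the result up with the same arguments.
result = uns[dendrogram_key(axis="var")]
print(result["axis"])
```

With a helper like this, calling the plotting function with the same arguments as the tool function would always find the stored result.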

@dburkhardt
Author

@flying-sheep the first is easy to fix. For 2., it's not clear to me what you're asking for here. It's been a while since I worked on this. Am I supposed to import this function in the __init__.py for tl and pl?

@flying-sheep
Member

Hi Dan, about 1.: I’m asking you what the semantic meaning is 😄

about 2.: there are two functions called dendrogram, and they have compatible signatures. Each computed dendrogram can be plotted. So what I’m saying is that the plotting version hasn’t been adapted.

Also an important question: in tl.dendrogram, we call _choose_representation, which will compute a PCA for the .obs axis. When specifying axis='var', should it compute a PCA for the var axis instead?

@flying-sheep removed this from the 1.10.0 milestone Feb 22, 2024
@flying-sheep added this to the 1.11.0 milestone Feb 22, 2024
Successfully merging this pull request may close these issues.

sc.tl.dendrogram doesn't use var_names
3 participants