
Description of "legend_loc" option in scanpy.pl.embedding is inaccurate or not implemented yet. #2229


Closed
3 tasks done
adkinsrs opened this issue Apr 8, 2022 · 2 comments · Fixed by #3163


adkinsrs commented Apr 8, 2022

  • I have checked that this issue has not already been reported.
  • I have confirmed this bug exists on the latest version of scanpy.
  • (optional) I have confirmed this bug exists on the master branch of scanpy.

In the documentation at https://scanpy.readthedocs.io/en/stable/generated/scanpy.pl.embedding.html#scanpy.pl.embedding, the description of the "legend_loc" param is as follows:

legend_loc : str (default: 'right margin')
Location of legend, either 'on data', 'right margin' or a valid keyword for the loc parameter of Legend.

The last part, "or a valid keyword for the loc parameter of Legend" (shown in bold in the original post), does not seem to be implemented: the code handling "legend_loc" only has if/elif branches for "right margin" and "on data", and nothing for the matplotlib "loc" keywords. I am not sure whether it is best to implement this to match the description or to change the description to omit that part.

Minimal code sample (that we can copy&paste without having any data)

@_doc_params(
    adata_color_etc=doc_adata_color_etc,
    edges_arrows=doc_edges_arrows,
    scatter_bulk=doc_scatter_embedding,
    show_save_ax=doc_show_save_ax,
)
def embedding(
    adata: AnnData,
    basis: str,
    *,
    color: Union[str, Sequence[str], None] = None,
    gene_symbols: Optional[str] = None,
    use_raw: Optional[bool] = None,
    sort_order: bool = True,
    edges: bool = False,
    edges_width: float = 0.1,
    edges_color: Union[str, Sequence[float], Sequence[str]] = 'grey',
    neighbors_key: Optional[str] = None,
    arrows: bool = False,
    arrows_kwds: Optional[Mapping[str, Any]] = None,
    groups: Optional[str] = None,
    components: Union[str, Sequence[str]] = None,
    dimensions: Optional[Union[Tuple[int, int], Sequence[Tuple[int, int]]]] = None,
    layer: Optional[str] = None,
    projection: Literal['2d', '3d'] = '2d',
    scale_factor: Optional[float] = None,
    color_map: Union[Colormap, str, None] = None,
    cmap: Union[Colormap, str, None] = None,
    palette: Union[str, Sequence[str], Cycler, None] = None,
    na_color: ColorLike = "lightgray",
    na_in_legend: bool = True,
    size: Union[float, Sequence[float], None] = None,
    frameon: Optional[bool] = None,
    legend_fontsize: Union[int, float, _FontSize, None] = None,
    legend_fontweight: Union[int, _FontWeight] = 'bold',
    legend_loc: str = 'right margin',
    legend_fontoutline: Optional[int] = None,
    colorbar_loc: Optional[str] = "right",
    vmax: Union[VBound, Sequence[VBound], None] = None,
    vmin: Union[VBound, Sequence[VBound], None] = None,
    vcenter: Union[VBound, Sequence[VBound], None] = None,
    norm: Union[Normalize, Sequence[Normalize], None] = None,
    add_outline: Optional[bool] = False,
    outline_width: Tuple[float, float] = (0.3, 0.05),
    outline_color: Tuple[str, str] = ('black', 'white'),
    ncols: int = 4,
    hspace: float = 0.25,
    wspace: Optional[float] = None,
    title: Union[str, Sequence[str], None] = None,
    show: Optional[bool] = None,
    save: Union[bool, str, None] = None,
    ax: Optional[Axes] = None,
    return_fig: Optional[bool] = None,
    **kwargs,
) -> Union[Figure, Axes, None]:
    """\
    Scatter plot for user specified embedding basis (e.g. umap, pca, etc)
    Parameters
    ----------
    basis
        Name of the `obsm` basis to use.
    {adata_color_etc}
    {edges_arrows}
    {scatter_bulk}
    {show_save_ax}
    Returns
    -------
    If `show==False` a :class:`~matplotlib.axes.Axes` or a list of it.
    """
    #####################
    # Argument handling #
    #####################
    check_projection(projection)
    sanitize_anndata(adata)
    basis_values = _get_basis(adata, basis)
    dimensions = _components_to_dimensions(
        components, dimensions, projection=projection, total_dims=basis_values.shape[1]
    )
    args_3d = dict(projection='3d') if projection == '3d' else {}
    # Figure out if we're using raw
    if use_raw is None:
        # check if adata.raw is set
        use_raw = layer is None and adata.raw is not None
    if use_raw and layer is not None:
        raise ValueError(
            "Cannot use both a layer and the raw representation. Was passed:"
            f"use_raw={use_raw}, layer={layer}."
        )
    if use_raw and adata.raw is None:
        raise ValueError(
            "`use_raw` is set to True but AnnData object does not have raw. "
            "Please check."
        )
    if isinstance(groups, str):
        groups = [groups]
    # Color map
    if color_map is not None:
        if cmap is not None:
            raise ValueError("Cannot specify both `color_map` and `cmap`.")
        else:
            cmap = color_map
    cmap = copy(get_cmap(cmap))
    cmap.set_bad(na_color)
    kwargs["cmap"] = cmap
    # Prevents warnings during legend creation
    na_color = colors.to_hex(na_color, keep_alpha=True)
    if 'edgecolor' not in kwargs:
        # by default turn off edge color. Otherwise, for
        # very small sizes the edge will not reduce its size
        # (https://github.com/scverse/scanpy/issues/293)
        kwargs['edgecolor'] = 'none'
    # Vectorized arguments
    # turn color into a python list
    color = [color] if isinstance(color, str) or color is None else list(color)
    if title is not None:
        # turn title into a python list if not None
        title = [title] if isinstance(title, str) else list(title)
    # turn vmax and vmin into a sequence
    if isinstance(vmax, str) or not isinstance(vmax, cabc.Sequence):
        vmax = [vmax]
    if isinstance(vmin, str) or not isinstance(vmin, cabc.Sequence):
        vmin = [vmin]
    if isinstance(vcenter, str) or not isinstance(vcenter, cabc.Sequence):
        vcenter = [vcenter]
    if isinstance(norm, Normalize) or not isinstance(norm, cabc.Sequence):
        norm = [norm]
    # Size
    if 's' in kwargs and size is None:
        size = kwargs.pop('s')
    if size is not None:
        # check if size is any type of sequence, and if so
        # set as ndarray
        if (
            size is not None
            and isinstance(size, (cabc.Sequence, pd.Series, np.ndarray))
            and len(size) == adata.shape[0]
        ):
            size = np.array(size, dtype=float)
    else:
        size = 120000 / adata.shape[0]
    ##########
    # Layout #
    ##########
    # Most of the code is for the case when multiple plots are required
    if wspace is None:
        # try to set a wspace that is not too large or too small given the
        # current figure size
        wspace = 0.75 / rcParams['figure.figsize'][0] + 0.02
    if components is not None:
        color, dimensions = list(zip(*product(color, dimensions)))
    color, dimensions = _broadcast_args(color, dimensions)
    # 'color' is a list of names that want to be plotted.
    # Eg. ['Gene1', 'louvain', 'Gene2'].
    # component_list is a list of components [[0,1], [1,2]]
    if (
        not isinstance(color, str)
        and isinstance(color, cabc.Sequence)
        and len(color) > 1
    ) or len(dimensions) > 1:
        if ax is not None:
            raise ValueError(
                "Cannot specify `ax` when plotting multiple panels "
                "(each for a given value of 'color')."
            )
        # each plot needs to be its own panel
        fig, grid = _panel_grid(hspace, wspace, ncols, len(color))
    else:
        grid = None
        if ax is None:
            fig = pl.figure()
            ax = fig.add_subplot(111, **args_3d)
    ############
    # Plotting #
    ############
    axs = []
    # use itertools.product to make a plot for each color and for each component
    # For example if color=[gene1, gene2] and components=['1,2, '2,3'].
    # The plots are: [
    # color=gene1, components=[1,2], color=gene1, components=[2,3],
    # color=gene2, components = [1, 2], color=gene2, components=[2,3],
    # ]
    for count, (value_to_plot, dims) in enumerate(zip(color, dimensions)):
        color_source_vector = _get_color_source_vector(
            adata,
            value_to_plot,
            layer=layer,
            use_raw=use_raw,
            gene_symbols=gene_symbols,
            groups=groups,
        )
        color_vector, categorical = _color_vector(
            adata,
            value_to_plot,
            color_source_vector,
            palette=palette,
            na_color=na_color,
        )
        # Order points
        order = slice(None)
        if sort_order is True and value_to_plot is not None and categorical is False:
            # Higher values plotted on top, null values on bottom
            order = np.argsort(-color_vector, kind="stable")[::-1]
        elif sort_order and categorical:
            # Null points go on bottom
            order = np.argsort(~pd.isnull(color_source_vector), kind="stable")
        # Set orders
        if isinstance(size, np.ndarray):
            size = np.array(size)[order]
        color_source_vector = color_source_vector[order]
        color_vector = color_vector[order]
        coords = basis_values[:, dims][order, :]
        # if plotting multiple panels, get the ax from the grid spec
        # else use the ax value (either user given or created previously)
        if grid:
            ax = pl.subplot(grid[count], **args_3d)
            axs.append(ax)
        if not (settings._frameon if frameon is None else frameon):
            ax.axis('off')
        if title is None:
            if value_to_plot is not None:
                ax.set_title(value_to_plot)
            else:
                ax.set_title('')
        else:
            try:
                ax.set_title(title[count])
            except IndexError:
                logg.warning(
                    "The title list is shorter than the number of panels. "
                    "Using 'color' value instead for some plots."
                )
                ax.set_title(value_to_plot)
        if not categorical:
            vmin_float, vmax_float, vcenter_float, norm_obj = _get_vboundnorm(
                vmin, vmax, vcenter, norm, count, color_vector
            )
            normalize = check_colornorm(
                vmin_float,
                vmax_float,
                vcenter_float,
                norm_obj,
            )
        else:
            normalize = None
        # make the scatter plot
        if projection == '3d':
            cax = ax.scatter(
                coords[:, 0],
                coords[:, 1],
                coords[:, 2],
                marker=".",
                c=color_vector,
                rasterized=settings._vector_friendly,
                norm=normalize,
                **kwargs,
            )
        else:
            scatter = (
                partial(ax.scatter, s=size, plotnonfinite=True)
                if scale_factor is None
                else partial(
                    circles, s=size, ax=ax, scale_factor=scale_factor
                )  # size in circles is radius
            )
            if add_outline:
                # the default outline is a black edge followed by a
                # thin white edged added around connected clusters.
                # To add an outline
                # three overlapping scatter plots are drawn:
                # First black dots with slightly larger size,
                # then, white dots a bit smaller, but still larger
                # than the final dots. Then the final dots are drawn
                # with some transparency.
                bg_width, gap_width = outline_width
                point = np.sqrt(size)
                gap_size = (point + (point * gap_width) * 2) ** 2
                bg_size = (np.sqrt(gap_size) + (point * bg_width) * 2) ** 2
                # the default black and white colors can be changes using
                # the contour_config parameter
                bg_color, gap_color = outline_color
                # remove edge from kwargs if present
                # because edge needs to be set to None
                kwargs['edgecolor'] = 'none'
                # remove alpha for outline
                alpha = kwargs.pop('alpha') if 'alpha' in kwargs else None
                ax.scatter(
                    coords[:, 0],
                    coords[:, 1],
                    s=bg_size,
                    marker=".",
                    c=bg_color,
                    rasterized=settings._vector_friendly,
                    norm=normalize,
                    **kwargs,
                )
                ax.scatter(
                    coords[:, 0],
                    coords[:, 1],
                    s=gap_size,
                    marker=".",
                    c=gap_color,
                    rasterized=settings._vector_friendly,
                    norm=normalize,
                    **kwargs,
                )
                # if user did not set alpha, set alpha to 0.7
                kwargs['alpha'] = 0.7 if alpha is None else alpha
            cax = scatter(
                coords[:, 0],
                coords[:, 1],
                marker=".",
                c=color_vector,
                rasterized=settings._vector_friendly,
                norm=normalize,
                **kwargs,
            )
        # remove y and x ticks
        ax.set_yticks([])
        ax.set_xticks([])
        if projection == '3d':
            ax.set_zticks([])
        # set default axis_labels
        name = _basis2name(basis)
        axis_labels = [name + str(d + 1) for d in dims]
        ax.set_xlabel(axis_labels[0])
        ax.set_ylabel(axis_labels[1])
        if projection == '3d':
            # shift the label closer to the axis
            ax.set_zlabel(axis_labels[2], labelpad=-7)
        ax.autoscale_view()
        if edges:
            _utils.plot_edges(ax, adata, basis, edges_width, edges_color, neighbors_key)
        if arrows:
            _utils.plot_arrows(ax, adata, basis, arrows_kwds)
        if value_to_plot is None:
            # if only dots were plotted without an associated value
            # there is not need to plot a legend or a colorbar
            continue
        if legend_fontoutline is not None:
            path_effect = [
                patheffects.withStroke(linewidth=legend_fontoutline, foreground='w')
            ]
        else:
            path_effect = None
        # Adding legends
        if categorical:
            _add_categorical_legend(
                ax,
                color_source_vector,
                palette=_get_palette(adata, value_to_plot),
                scatter_array=coords,
                legend_loc=legend_loc,
                legend_fontweight=legend_fontweight,
                legend_fontsize=legend_fontsize,
                legend_fontoutline=path_effect,
                na_color=na_color,
                na_in_legend=na_in_legend,
                multi_panel=bool(grid),
            )
        elif colorbar_loc is not None:
            pl.colorbar(
                cax, ax=ax, pad=0.01, fraction=0.08, aspect=30, location=colorbar_loc
            )
    if return_fig is True:
        return fig
    axs = axs if grid else ax
    _utils.savefig_or_show(basis, show=show, save=save)
    if show is False:
        return axs

if legend_loc == 'right margin':
    for label in cats:
        ax.scatter([], [], c=palette[label], label=label)
    ax.legend(
        frameon=False,
        loc='center left',
        bbox_to_anchor=(1, 0.5),
        ncol=(1 if len(cats) <= 14 else 2 if len(cats) <= 30 else 3),
        fontsize=legend_fontsize,
    )
elif legend_loc == 'on data':
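
To make the gap concrete, here is a rough sketch of how the branch for matplotlib "loc" keywords might look if the documented behaviour were implemented. It is not scanpy code: `legend_kwargs` and `MATPLOTLIB_LOCS` are hypothetical names, and the helper only builds the keyword arguments that would be passed on to `ax.legend`, so it can be read and run without any data or plotting backend.

```python
from typing import Optional

# Location strings accepted by the `loc` parameter of matplotlib's Legend.
MATPLOTLIB_LOCS = frozenset({
    "best", "upper right", "upper left", "lower left", "lower right",
    "right", "center left", "center right", "lower center", "upper center",
    "center",
})


def legend_kwargs(legend_loc: str, n_categories: int) -> Optional[dict]:
    """Hypothetical helper: keyword arguments for ``ax.legend``.

    Returns ``None`` when no ``ax.legend`` call should be made.
    """
    if legend_loc == "right margin":
        # Same column heuristic as the existing 'right margin' branch.
        ncol = 1 if n_categories <= 14 else 2 if n_categories <= 30 else 3
        return dict(
            frameon=False, loc="center left", bbox_to_anchor=(1, 0.5), ncol=ncol
        )
    if legend_loc == "on data":
        # Handled by its own branch in the real code, not through these kwargs.
        return None
    if legend_loc in MATPLOTLIB_LOCS:
        # The branch that appears to be missing: pass the keyword through.
        return dict(frameon=False, loc=legend_loc)
    raise ValueError(f"Unsupported legend_loc: {legend_loc!r}")
```

With something like this in place, a call such as `ax.legend(**legend_kwargs('upper left', len(cats)))` would place the legend inside the axes, which is what the docstring currently promises.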

Versions

Not applicable

Contributor

LisaSikkema commented Sep 24, 2022

Yeah agree this is inconsistent, I'd say we should either correct the documentation + throw an error when the wrong argument is passed, or implement the matplotlib options. @ivirshup what do you think?

In the _anndata.py plotting script, VALID_LEGENDLOCS is set (not sure which plotting functions eventually use it). Could we use something similar for the above example?

VALID_LEGENDLOCS = {
    'none',
    'right margin',
    'on data',
    'on data export',
    'best',
    'upper right',
    'upper left',
    'lower left',
    'lower right',
    'right',
    'center left',
    'center right',
    'lower center',
    'upper center',
    'center',
}
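
For illustration, a minimal sketch of the "correct the documentation + throw an error" option, reusing the set above. `check_legend_loc` is a hypothetical helper, not an existing scanpy function, and the error wording is only a suggestion.

```python
# Copied from the set quoted above.
VALID_LEGENDLOCS = {
    'none',
    'right margin',
    'on data',
    'on data export',
    'best',
    'upper right',
    'upper left',
    'lower left',
    'lower right',
    'right',
    'center left',
    'center right',
    'lower center',
    'upper center',
    'center',
}


def check_legend_loc(legend_loc: str) -> str:
    """Hypothetical validator: return legend_loc unchanged or raise ValueError."""
    if legend_loc not in VALID_LEGENDLOCS:
        raise ValueError(
            f"Invalid `legend_loc`: {legend_loc!r}. "
            f"Valid options are: {sorted(VALID_LEGENDLOCS)}."
        )
    return legend_loc
```

Calling this at the top of the plotting function would turn a silently ignored value such as `legend_loc='top'` into an immediate, explicit error.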

Also related to issue #2322

@LisaSikkema
Contributor

Oh this might actually be addressed in #2267
