v0.11.0 (September 2020)#

This is a major release with several important new features, enhancements to existing functions, and changes to the library. Highlights include an overhaul and modernization of the distributions plotting functions, more flexible data specification, new colormaps, and better narrative documentation.

For an overview of the new features and a guide to updating, see this Medium post.

Required keyword arguments#

API

Most plotting functions now require all of their parameters to be specified using keyword arguments. To ease adaptation, code without keyword arguments will trigger a FutureWarning in v0.11. In a future release (v0.12 or v0.13, depending on release cadence), this will become an error. Once keyword arguments are fully enforced, the signature of the plotting functions will be reorganized to accept data as the first and only positional argument (#2052, #2081).

Modernization of distribution functions#

The distribution module has been completely overhauled, modernizing the API and introducing several new functions and features within existing functions. Some new features are explained here; the tutorial documentation has also been rewritten and serves as a good introduction to the functions.

New plotting functions#

Feature Enhancement

First, three new functions, displot(), histplot() and ecdfplot() have been added (#2157, #2125, #2141).

The figure-level displot() function is an interface to the various distribution plots (analogous to relplot() or catplot()). It can draw univariate or bivariate histograms, density curves, ECDFs, and rug plots on a FacetGrid.

The axes-level histplot() function draws univariate or bivariate histograms with a number of features, including:

  • mapping multiple distributions with a hue semantic

  • normalization to show density, probability, or frequency statistics

  • flexible parameterization of bin size, including proper bins for discrete variables

  • adding a KDE fit to show a smoothed distribution over all bin statistics

  • experimental support for histograms over categorical and datetime variables.

The axes-level ecdfplot() function draws univariate empirical cumulative distribution functions, using a similar interface.

Changes to existing functions#

API Feature Enhancement Defaults

Second, the existing functions kdeplot() and rugplot() have been completely overhauled (#2060, #2104).

The overhauled functions now share a common API with the rest of seaborn, they can show conditional distributions by mapping a third variable with a hue semantic, and they have been improved in numerous other ways. The github pull request (#2104) has a longer explanation of the changes and the motivation behind them.

This is a necessarily API-breaking change. The parameter names for the positional variables are now x and y, and the old names have been deprecated. Efforts were made to handle and warn when using the deprecated API, but it is strongly suggested to check your plots carefully.

Additionally, the statsmodels-based computation of the KDE has been removed. Because there were some inconsistencies between the way different parameters (specifically, bw, clip, and cut) were implemented by each backend, this may cause plots to look different with non-default parameters. Support for using non-Gaussian kernels, which was available only in the statsmodels backend, has been removed.

Other new features include:

  • several options for representing multiple densities (using the multiple and common_norm parameters)

  • weighted density estimation (using the new weights parameter)

  • better control over the smoothing bandwidth (using the new bw_adjust parameter)

  • more meaningful parameterization of the contours that represent a bivariate density (using the thresh and levels parameters)

  • log-space density estimation (using the new log_scale parameter, or by scaling the data axis before plotting)

  • “bivariate” rug plots with a single function call (by assigning both x and y)

Deprecations#

API

Finally, the distplot() function is now formally deprecated. Its features have been subsumed by displot() and histplot(). Some effort was made to gradually transition distplot() by adding the features in displot() and handling backwards compatibility, but this proved to be too difficult. The similarity in the names will likely cause some confusion during the transition, which is regrettable.

Standardization and enhancements of data ingest#

Feature Enhancement Docs

The code that processes input data has been refactored and enhanced. In v0.11, this new code takes effect for the relational and distribution modules; other modules will be refactored to use it in future releases (#2071).

These changes should be transparent for most use-cases, although they allow a few new features:

  • Named variables for long-form data can refer to the named index of a pandas.DataFrame or to levels in the case of a multi-index. Previously, it was necessary to call pandas.DataFrame.reset_index() before using index variables (e.g., after a groupby operation).

  • relplot() now has the same flexibility as the axes-level functions to accept data in long- or wide-format and to accept data vectors (rather than named variables) in long-form mode.

  • The data parameter can now be a Python dict or an object that implements that interface. This is a new feature for wide-form data. For long-form data, it was previously supported but not documented.

  • A wide-form data object can have a mixture of types; the non-numeric types will be removed before plotting. Previously, this caused an error.

  • There are better error messages for other instances of data mis-specification.

See the new user guide chapter on data formats for more information about what is supported.

Other changes#

Documentation improvements#

  • Docs Added two new chapters to the user guide, one giving an overview of the types of functions in seaborn, and one discussing the different data formats that seaborn understands.

  • Docs Expanded the color palette tutorial to give more background on color theory and better motivate the use of color in statistical graphics.

  • Docs Added more information to the installation guidelines and streamlined the introduction page.

  • Docs Improved cross-linking within the seaborn docs and between the seaborn and matplotlib docs.

Theming#

  • API The set() function has been renamed to set_theme() for more clarity about what it does. For the foreseeable future, set() will remain as an alias, but it is recommended to update your code.

Relational plots#

  • Enhancement Defaults Reduced some of the surprising behavior of relational plot legends when using a numeric hue or size mapping (#2229):

    • Added an “auto” mode (the new default) that chooses between “brief” and “full” legends based on the number of unique levels of each variable.

    • Modified the ticking algorithm for a “brief” legend to show up to 6 values and not to show values outside the limits of the data.

    • Changed the approach to the legend title: the normal matplotlib legend title is used when only one variable is assigned a semantic mapping, whereas the old approach of adding an invisible legend artist with a subtitle label is used only when multiple semantic variables are defined.

    • Modified legend subtitles to be left-aligned and to be drawn in the default legend title font size.

  • Enhancement Defaults Changed how functions that use different representations for numeric and categorical data handle vectors with an object data type. Previously, data was considered numeric if it could be coerced to a float representation without error. Now, object-typed vectors are considered numeric only when their contents are themselves numeric. As a consequence, numbers that are encoded as strings will now be treated as categorical data (#2084).

  • Enhancement Defaults Plots with a style semantic can now generate an infinite number of unique dashes and/or markers by default. Previously, an error would be raised if the style variable had more levels than could be mapped using the default lists. The existing defaults were slightly modified as part of this change; if you need to exactly reproduce plots from earlier versions, refer to the old defaults (#2075).

  • Defaults Changed how scatterplot() sets the default linewidth for the edges of the scatter points. New behavior is to scale with the point sizes themselves (on a plot-wise, not point-wise basis). This change also slightly reduces the default width when point sizes are not varied. Set linewidth=0.75 to reproduce the previous behavior. (#2708).

  • Enhancement Improved support for datetime variables in scatterplot() and lineplot() (#2138).

  • Fix Fixed a bug where lineplot() did not pass the linestyle parameter down to matplotlib (#2095).

  • Fix Adapted to a change in matplotlib that prevented passing vectors of literal values to c and s in scatterplot() (#2079).

Categorical plots#

  • Enhancement Defaults Fix Fixed a few computational issues in boxenplot() and improved its visual appearance (#2086):

    • Changed the default method for computing the number of boxes to``k_depth=”tukey”, as the previous default (``k_depth="proportion") is based on a heuristic that produces too many boxes for small datasets.

    • Added the option to specify the specific number of boxes (e.g. k_depth=6) or to plot boxes that will cover most of the data points (k_depth="full").

    • Added a new parameter, trust_alpha, to control the number of boxes when k_depth="trustworthy".

    • Changed the visual appearance of boxenplot() to more closely resemble boxplot(). Notably, thin boxes will remain visible when the edges are white.

  • Enhancement Allowed catplot() to use different values on the categorical axis of each facet when axis sharing is turned off (e.g. by specifying sharex=False) (#2196).

  • Enhancement Improved the error messages produced when categorical plots process the orientation parameter.

  • Enhancement Added an explicit warning in swarmplot() when more than 5% of the points overlap in the “gutters” of the swarm (#2045).

Multi-plot grids#

  • Feature Enhancement Defaults A few small changes to make life easier when using PairGrid (#2234):

    • Added public access to the legend object through the legend attribute (also affects FacetGrid).

    • The color and label parameters are no longer passed to the plotting functions when hue is not used.

    • The data is no longer converted to a numpy object before plotting on the marginal axes.

    • It is possible to specify only one of x_vars or y_vars, using all variables for the unspecified dimension.

    • The layout_pad parameter is stored and used every time you call the PairGrid.tight_layout() method.

  • Feature Added a tight_layout method to FacetGrid and PairGrid, which runs the matplotlib.pyplot.tight_layout() algorithm without interference from the external legend (#2073).

  • Feature Added the axes_dict attribute to FacetGrid for named access to the component axes (#2046).

  • Enhancement Made FacetGrid.set_axis_labels() clear labels from “interior” axes (#2046).

  • Feature Added the marginal_ticks parameter to JointGrid which, if set to True, will show ticks on the count/density axis of the marginal plots (#2210).

  • Enhancement Improved FacetGrid.set_titles() with margin_titles=True, such that texts representing the original row titles are removed before adding new ones (#2083).

  • Defaults Changed the default value for dropna to False in FacetGrid, PairGrid, JointGrid, and corresponding functions. As all or nearly all seaborn and matplotlib plotting functions handle missing data well, this option is no longer useful, but it causes problems in some edge cases. It may be deprecated in the future. (#2204).

  • Fix Fixed a bug in PairGrid that appeared when setting corner=True and despine=False (#2203).

Color palettes#

  • Docs Improved and modernized the color palettes chapter of the seaborn tutorial.

  • Feature Added two new perceptually-uniform colormaps: “flare” and “crest”. The new colormaps are similar to “rocket” and “mako”, but their luminance range is reduced. This makes them well suited to numeric mappings of line or scatter plots, which need contrast with the axes background at the extremes (#2237).

  • Enhancement Defaults Enhanced numeric colormap functionality in several ways (#2237):

  • Enhancement Added a rich HTML representation to the object returned by color_palette() (#2225).

  • Fix Fixed the "{palette}_d" logic to modify reversed colormaps and to use the correct direction of the luminance ramp in both cases.

Deprecations and removals#

  • Enhancement Removed an optional (and undocumented) dependency on BeautifulSoup (#2190) in get_dataset_names().

  • API Deprecated the axlabel function; use ax.set(xlabel=, ylabel=) instead.

  • API Deprecated the iqr function; use scipy.stats.iqr() instead.

  • API Final removal of the previously-deprecated annotate method on JointGrid, along with related parameters.

  • API Final removal of the lvplot function (the previously-deprecated name for boxenplot()).