v0.6.0 (June 2015)
This is a major release from 0.5. The main objective of this release was to unify the API for categorical plots, which means that there are some relatively large API changes in some of the older functions. See below for details of those changes, which may break code written for older versions of seaborn. There are also some new functions (stripplot() and countplot()), numerous enhancements to existing functions, and bug fixes.
Additionally, the documentation has been completely revamped and expanded for the 0.6 release. Now, the API docs page for each function has multiple examples with embedded plots showing how to use the various options. These pages should be considered the most comprehensive resource for examples, and the tutorial pages are now streamlined and oriented towards a higher-level overview of the various features.
Changes and updates to categorical plots
In version 0.6, the “categorical” plots have been unified with a common API. This new category of functions groups together plots that show the relationship between one numeric variable and one or two categorical variables. This includes plots that show the distribution of the numeric variable in each bin (boxplot(), violinplot(), and stripplot()) and plots that apply a statistical estimation within each bin (pointplot(), barplot(), and countplot()). There is a new tutorial chapter that introduces these functions.
The categorical functions now each accept the same formats of input data and can be invoked in the same way. They can plot using long- or wide-form data, and can be drawn vertically or horizontally. When long-form data is used, the orientation of the plots is inferred from the types of the input data. Additionally, all functions natively take a hue variable to add a second layer of categorization.
With the (in some cases new) API, these functions can all be drawn correctly by FacetGrid. However, factorplot can also now create faceted versions of any of these kinds of plots, so in most cases it will be unnecessary to use FacetGrid directly. By default, factorplot draws a point plot, but this is controlled by the kind parameter.
Here are details on what has changed in the process of unifying these APIs:
- Changes to boxplot() and violinplot() will probably be the most disruptive. Both functions maintain backwards-compatibility in terms of the kind of data they can accept, but the syntax has changed to be more similar to other seaborn functions. These functions are now invoked with x and/or y parameters that are either vectors of data or names of variables in a long-form DataFrame passed to the new data parameter. You can still pass wide-form DataFrames or arrays to data, but it is no longer the first positional argument. See the GitHub pull request (#410) for more information on these changes and the logic behind them.
- As pointplot() and barplot() can now plot with the major categorical variable on the y axis, the x_order parameter has been renamed to order.
- Added a hue argument to boxplot() and violinplot(), which allows for nested grouping of the plot elements by a third categorical variable. For violinplot(), this nesting can also be accomplished by splitting the violins when there are two levels of the hue variable (using split=True). To make this functionality feasible, the ability to specify where the plots will be drawn in data coordinates has been removed. These plots now are drawn at set positions, like (and identical to) barplot() and pointplot().
- Added a palette parameter to boxplot()/violinplot(). The color parameter still exists, but no longer does double-duty in accepting the name of a seaborn palette. palette supersedes color so that it can be used with a FacetGrid.
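A minimal sketch of the hue, palette, and split options described in the list above. The data are synthetic and the column names (day, sex, tip) are placeholders chosen for the example; treat it as an illustration of the call shape rather than a reference.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Build a small synthetic long-form dataset: 2 days x 2 hue levels x 6 values.
rows = []
for i, day in enumerate(["Thu", "Fri"]):
    for j, sex in enumerate(["F", "M"]):
        for k in range(6):
            rows.append({"day": day, "sex": sex, "tip": 1.0 + i + 0.5 * j + 0.3 * k})
df = pd.DataFrame(rows)

fig, (ax_box, ax_violin) = plt.subplots(1, 2, figsize=(8, 3))

# hue nests a second categorical variable within each x position;
# palette (rather than color) takes the name of a seaborn palette.
sns.boxplot(x="day", y="tip", hue="sex", data=df, palette="muted", ax=ax_box)

# With exactly two hue levels, split=True draws one half-violin per level.
sns.violinplot(x="day", y="tip", hue="sex", data=df, palette="muted",
               split=True, ax=ax_violin)

legend_labels = [text.get_text() for text in ax_box.get_legend().get_texts()]
```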
Along with these API changes, the following changes/enhancements were made to the plotting functions:
- The default rules for ordering the categories have changed. Instead of automatically sorting the category levels, the plots now show the levels in the order they appear in the input data (i.e., the order given by Series.unique()). Order can be specified when plotting with the order and hue_order parameters. Additionally, when variables are pandas objects with a “categorical” dtype, the category order is inferred from the data object. This change also affects FacetGrid and PairGrid.
- Added the scale and scale_hue parameters to violinplot(). These control how the width of the violins is scaled. The default is area, which is different from how the violins used to be drawn. Use scale='width' to get the old behavior.
- Used a different style for the box kind of interior plot in violinplot(), which shows the whisker range in addition to the quartiles. Use inner='quartile' to get the old style.
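The ordering rule in the list above can be previewed with pandas alone. The sketch below uses a made-up Series to compare the order of appearance (what Series.unique() returns), the sorted order that was used previously, and the order carried by a categorical dtype. Any of these lists could then be passed as the order or hue_order argument.

```python
import pandas as pd

# Toy data: the levels appear in the order Sun, Sat, Thu, Fri.
days = pd.Series(["Sun", "Sat", "Thu", "Sat", "Fri", "Sun"])

# New default: levels in the order they first appear in the data.
appearance_order = list(days.unique())

# Old default: levels sorted.
sorted_order = sorted(days.unique())

# A categorical dtype carries its own level order, which takes precedence.
weekday_dtype = pd.CategoricalDtype(["Thu", "Fri", "Sat", "Sun"], ordered=True)
categorical_order = list(days.astype(weekday_dtype).cat.categories)

print(appearance_order)   # ['Sun', 'Sat', 'Thu', 'Fri']
print(sorted_order)       # ['Fri', 'Sat', 'Sun', 'Thu']
print(categorical_order)  # ['Thu', 'Fri', 'Sat', 'Sun']
```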
New plotting functions
- Added the stripplot() function, which draws a scatterplot where one of the variables is categorical. This plot has the same API as boxplot() and violinplot(). It is useful both on its own and when composed with one of these other plot kinds to show both the observations and underlying distribution.
- Added the countplot() function, which uses a bar plot representation to show counts of variables in one or more categorical bins. This replaces the old approach of calling barplot() without a numeric variable.
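A short sketch of the two new functions, using a made-up DataFrame: stripplot() is layered over boxplot() on one Axes to show the raw observations alongside the summary, and countplot() draws one bar per category whose height is the number of rows in that category.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Toy data: three rows for Thu, two for Fri, one for Sat.
df = pd.DataFrame({
    "day": ["Thu", "Thu", "Thu", "Fri", "Fri", "Sat"],
    "tip": [1.0, 2.0, 3.0, 2.5, 3.5, 4.0],
})

fig, (ax_dist, ax_count) = plt.subplots(1, 2, figsize=(8, 3))

# Compose the two plot kinds: boxes for the distribution, points for the data.
sns.boxplot(x="day", y="tip", data=df, color="lightgray", ax=ax_dist)
sns.stripplot(x="day", y="tip", data=df, color="black", ax=ax_dist)

# countplot() needs only the categorical variable.
sns.countplot(x="day", data=df, ax=ax_count)

# Bar heights should match the per-category row counts.
bar_heights = sorted(patch.get_height() for patch in ax_count.patches)
```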
Other additions and changes
- The corrplot() and underlying symmatplot() functions have been deprecated in favor of heatmap(), which is much more flexible and robust. These two functions are still available in version 0.6, but they will be removed in a future version.
- Added the set_color_codes() function and the color_codes argument to set() and set_palette(). This changes the interpretation of shorthand color codes (e.g. “b”, “g”, “k”) within matplotlib to use the values from one of the named seaborn palettes (e.g. “deep”, “muted”). That makes it easier to have a more uniform look when using matplotlib functions directly with seaborn imported. This could be disruptive to existing plots, so it does not happen by default. It is possible this could change in the future.
- The color_palette() function no longer trims palettes that are longer than 6 colors when passed into it.
- Added the as_hex method to color palette objects, to return a list of hex codes rather than rgb tuples.
- jointplot() now passes additional keyword arguments to the function used to draw the plot on the joint axes.
- Changed the default linewidths in heatmap() and clustermap() to 0 so that larger matrices plot correctly. This parameter still exists and can be used to get the old effect of lines demarcating each cell in the heatmap (the old default linewidths was 0.5).
- heatmap() and clustermap() now automatically use a mask for missing values, which previously were shown with the “under” value of the colormap per default plt.pcolormesh behavior.
- Added the seaborn.crayons dictionary and the crayon_palette() function to define colors from the 120 box (!) of Crayola crayons.
- Added the line_kws parameter to residplot() to change the style of the lowess line, when used.
- Added open-ended **kwargs to the add_legend method on FacetGrid and PairGrid, which will pass additional keyword arguments through when calling the legend function on the Figure or Axes.
- Added the gridspec_kws parameter to FacetGrid, which allows for control over the size of individual facets in the grid to emphasize certain plots or account for differences in variable ranges.
- The interactive palette widgets now show a continuous colorbar, rather than a discrete palette, when as_cmap is True.
- The default Axes size for pairplot() and PairGrid is now slightly smaller.
- Added the shade_lowest parameter to kdeplot() which will set the alpha for the lowest contour level to 0, making it easier to plot multiple bivariate distributions on the same axes.
- The height parameter of rugplot() is now interpreted as a function of the axis size and is invariant to changes in the data scale on that axis. The rug lines are also slightly narrower by default.
- Added a catch in distplot() when calculating a default number of bins. For highly skewed data it will now use sqrt(n) bins, where previously the reference rule would return “infinite” bins and cause an exception in matplotlib.
- Added a ceiling (50) to the default number of bins used for distplot() histograms. This will help avoid confusing errors with certain kinds of datasets that heavily violate the assumptions of the reference rule used to get a default number of bins. The ceiling is not applied when passing a specific number of bins.
- The various property dictionaries that can be passed to plt.boxplot are now applied after the seaborn restyling to allow for full customizability.
- Added a savefig method to JointGrid that defaults to a tight bounding box to make it easier to save figures using this class, and set a tight bbox as the default for the savefig method on other Grid objects.
- You can now pass an integer to the xticklabels and yticklabels parameters of heatmap() (and, by extension, clustermap()). This will make the plot use the ticklabels inferred from the data, but only plot every nth label, where n is the number you pass. This can help when visualizing larger matrices with some sensible ordering to the rows or columns of the dataframe.
- Added "figure.facecolor" to the style parameters and set the default to white.
- The load_dataset() function now caches datasets locally after downloading them, and uses the local copy on subsequent calls.
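To make the two distplot() bin entries in the list above concrete, here is a standalone sketch of the logic they describe. The notes do not name the reference rule; this sketch assumes the Freedman-Diaconis rule (bin width 2 * IQR / n^(1/3)), and the function names are hypothetical. It is not seaborn's own implementation, and quartile conventions may differ from what the library uses.

```python
import math
import statistics


def reference_rule_bins(values):
    """Bin count from the (assumed) Freedman-Diaconis rule.

    Falls back to sqrt(n) bins when the interquartile range is zero,
    which would otherwise give a zero bin width and an infinite count.
    """
    n = len(values)
    q1, _, q3 = statistics.quantiles(values, n=4)
    width = 2 * (q3 - q1) / n ** (1 / 3)
    if width == 0:
        return int(math.sqrt(n))
    return math.ceil((max(values) - min(values)) / width)


def default_bins(values, ceiling=50):
    """Apply the ceiling that is used only when no bin count is given."""
    return min(reference_rule_bins(values), ceiling)


# Highly skewed data: 90% zeros, so the IQR is zero and sqrt(n) is used.
skewed = [0] * 90 + list(range(1, 11))

# One extreme outlier: the rule alone asks for far more than 50 bins.
long_tailed = list(range(100)) + [10_000_000]

print(default_bins(skewed))       # 10
print(default_bins(long_tailed))  # 50
```

Passing an explicit number of bins would bypass default_bins() entirely, which matches the note that the ceiling is not applied in that case.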
Bug fixes
- Fixed bugs in clustermap() where the mask and specified ticklabels were not being reorganized using the dendrograms.
- Fixed a bug in FacetGrid and PairGrid that led to incorrect legend labels when levels of the hue variable appeared in hue_order but not in the data.
- Fixed a bug in FacetGrid.set_xticklabels() or FacetGrid.set_yticklabels() when col_wrap is being used.
- Fixed a bug in PairGrid where the hue_order parameter was ignored.
- Fixed two bugs in despine() that caused errors when trying to trim the spines on plots that had inverted axes or no ticks.
- Improved support for the margin_titles option in FacetGrid, which can now be used with a legend.