seaborn.lvplot

seaborn.lvplot(x=None, y=None, hue=None, data=None, order=None, hue_order=None, orient=None, color=None, palette=None, saturation=0.75, width=0.8, dodge=True, k_depth='proportion', linewidth=None, scale='exponential', outlier_prop=None, ax=None, **kwargs)

Draw a letter value plot to show distributions of large datasets.

Letter value (LV) plots are non-parametric estimates of the distribution of a dataset, similar to boxplots. LV plots are also similar to violin plots but without the need to fit a kernel density estimate. Thus, LV plots are fast to generate, directly interpretable in terms of the distribution of data, and easy to understand. For a more extensive explanation of letter value plots and their properties, see Hadley Wickham’s excellent paper on the topic:

http://vita.had.co.nz/papers/letter-value-plot.html

Input data can be passed in a variety of formats, including:

  • Vectors of data represented as lists, numpy arrays, or pandas Series objects passed directly to the x, y, and/or hue parameters.
  • A “long-form” DataFrame, in which case the x, y, and hue variables will determine how the data are plotted.
  • A “wide-form” DataFrame, such that each numeric column will be plotted.
  • Anything accepted by plt.boxplot (e.g. a 2d array or list of vectors).

In most cases, it is possible to use numpy or Python objects, but pandas objects are preferable because the associated names will be used to annotate the axes. Additionally, you can use Categorical types for the grouping variables to control the order of plot elements.
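To make the wide-form versus long-form distinction concrete, the sketch below reshapes a small wide-form table into long-form records using only the standard library. The column names and values are made up for illustration, and wide_to_long is a hypothetical helper, not part of seaborn. With pandas, DataFrame.melt performs the same reshaping.

```python
# Hypothetical wide-form data: one column per numeric variable.
wide = {
    "sepal_length": [5.1, 4.9, 4.7],
    "sepal_width": [3.5, 3.0, 3.2],
}


def wide_to_long(table):
    """Return one (variable, value) record per observation."""
    return [
        (name, value)
        for name, values in table.items()
        for value in values
    ]


# Long-form: the first field plays the role of the categorical
# variable (x), the second the role of the numeric variable (y).
long_records = wide_to_long(wide)
for record in long_records:
    print(record)
```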

Parameters:

x, y, hue : names of variables in data or vector data, optional

Inputs for plotting long-form data. See examples for interpretation.

data : DataFrame, array, or list of arrays, optional

Dataset for plotting. If x and y are absent, this is interpreted as wide-form. Otherwise it is expected to be long-form.

order, hue_order : lists of strings, optional

Order to plot the categorical levels in, otherwise the levels are inferred from the data objects.

orient : “v” | “h”, optional

Orientation of the plot (vertical or horizontal). This is usually inferred from the dtype of the input variables, but can be used to specify the orientation when the “categorical” variable is numeric or when plotting wide-form data.

color : matplotlib color, optional

Color for all of the elements, or seed for light_palette() when using hue nesting.

palette : seaborn color palette or dict, optional

Colors to use for the different levels of the hue variable. Should be something that can be interpreted by color_palette(), or a dictionary mapping hue levels to matplotlib colors.

saturation : float, optional

Proportion of the original saturation to draw colors at. Large patches often look better with slightly desaturated colors, but set this to 1 if you want the plot colors to perfectly match the input color spec.

width : float, optional

Width of a full element when not using hue nesting, or width of all the elements for one level of the major grouping variable.

dodge : bool, optional

When hue nesting is used, whether elements should be shifted along the categorical axis.

k_depth : “proportion” | “tukey” | “trustworthy”, optional

The number of boxes, and by extension number of percentiles, to draw. All methods are detailed in Wickham’s paper. Each makes different assumptions about the number of outliers and leverages different statistical properties.
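As a rough illustration of how the number of boxes maps to percentiles, the sketch below assumes the “tukey” and “proportion” stopping rules take the forms k = floor(log2(n)) - 3 and k = floor(log2(n)) - floor(log2(n * p)) + 1. These formulas are an assumption about the rules described in the paper, not code taken from seaborn, and the helper names n_boxes and box_percentiles are hypothetical. Each successive box halves the tail proportion left outside it, so the boxes end at the 25th/75th, 12.5th/87.5th, 6.25th/93.75th percentiles, and so on.

```python
import math


def n_boxes(n, k_depth="tukey", outlier_prop=0.007):
    """Illustrative number of boxes for n observations.

    The formulas are assumed forms of the stopping rules; consult the
    paper for the authoritative definitions.
    """
    if not 0 < outlier_prop <= 1:
        raise ValueError("outlier_prop must be in (0, 1] for this sketch")
    log_n = math.floor(math.log2(n))
    if k_depth == "tukey":
        k = log_n - 3
    elif k_depth == "proportion":
        k = log_n - math.floor(math.log2(n * outlier_prop)) + 1
    else:
        raise ValueError("only 'tukey' and 'proportion' are sketched here")
    return max(k, 1)  # always draw at least one box


def box_percentiles(k):
    """Return (lower, upper) percentile pairs, innermost box first."""
    return [
        (100 * 0.5 ** (i + 1), 100 * (1 - 0.5 ** (i + 1)))
        for i in range(1, k + 1)
    ]


print(n_boxes(1000, "tukey"))
print(n_boxes(1000, "proportion"))
print(box_percentiles(3))
```

Under these assumed rules, “proportion” draws more boxes than “tukey” for the same sample size when outlier_prop is small, because it keeps splitting the tails until only about that proportion of points remains outside the outermost box.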

linewidth : float, optional

Width of the gray lines that frame the plot elements.

scale : “linear” | “exponential” | “area”, optional

Method to use for the width of the letter value boxes. All give similar results visually. “linear” reduces the width by a constant linear factor, “exponential” uses the proportion of data not covered, “area” is proportional to the percentage of data covered.
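The sketch below shows one possible reading of the “linear” and “exponential” descriptions above. It is an interpretation for illustration only; seaborn's actual width functions may differ, and box_widths and full_width are hypothetical names. The “area” method is omitted because it also depends on the height of each box.

```python
def box_widths(k, scale="exponential", full_width=0.8):
    """Return widths for k nested boxes, innermost (widest) first."""
    if scale == "linear":
        # Shrink by the same amount, full_width / k, at every step.
        return [full_width * (k - i) / k for i in range(k)]
    if scale == "exponential":
        # Halve the width at every step, mirroring the halving of the
        # proportion of data left outside each successive box.
        return [full_width * 0.5 ** i for i in range(k)]
    raise ValueError("only 'linear' and 'exponential' are sketched here")


print(box_widths(4, "linear", full_width=1.0))
print(box_widths(4, "exponential", full_width=1.0))
```

With the exponential reading, the outer boxes become narrow quickly, which keeps visual emphasis on the central bulk of the data.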

outlier_prop : float, optional

Proportion of data believed to be outliers. Used in conjunction with k_depth to determine the number of percentiles to draw. Defaults to 0.007. Should be in the range [0, 1].

ax : matplotlib Axes, optional

Axes object to draw the plot onto, otherwise uses the current Axes.

kwargs : key, value mappings

Other keyword arguments are passed through to plt.plot and plt.scatter at draw time.

Returns:

ax : matplotlib Axes

Returns the Axes object with the letter value plot drawn onto it.

See also

violinplot
A combination of boxplot and kernel density estimation.
boxplot
A traditional box-and-whisker plot with a similar API.

Examples

Draw a single horizontal letter value plot:

>>> import seaborn as sns
>>> sns.set_style("whitegrid")
>>> tips = sns.load_dataset("tips")
>>> ax = sns.lvplot(x=tips["total_bill"])

Draw a vertical letter value plot grouped by a categorical variable:

>>> ax = sns.lvplot(x="day", y="total_bill", data=tips)

Draw a letter value plot with nested grouping by two categorical variables:

>>> ax = sns.lvplot(x="day", y="total_bill", hue="smoker",
...                 data=tips, palette="Set3")

Draw a letter value plot with nested grouping when some bins are empty:

>>> ax = sns.lvplot(x="day", y="total_bill", hue="time",
...                 data=tips, linewidth=2.5)

Control box order by passing an explicit order:

>>> ax = sns.lvplot(x="time", y="tip", data=tips,
...                 order=["Dinner", "Lunch"])

Draw a letter value plot for each numeric variable in a DataFrame:

>>> iris = sns.load_dataset("iris")
>>> ax = sns.lvplot(data=iris, orient="h", palette="Set2")

Use stripplot() to show the datapoints on top of the boxes:

>>> ax = sns.lvplot(x="day", y="total_bill", data=tips)
>>> ax = sns.stripplot(x="day", y="total_bill", data=tips,
...                    size=4, jitter=True, color="gray")

Use factorplot() to combine an lvplot() and a FacetGrid. This allows grouping within additional categorical variables. Using factorplot() is safer than using FacetGrid directly, as it ensures that the variable order is synchronized across facets:

>>> g = sns.factorplot(x="sex", y="total_bill",
...                    hue="smoker", col="time",
...                    data=tips, kind="lv",
...                    size=4, aspect=.7);