# Demonstration of next-generation seaborn interface

Warning

This API is **experimental** and **unstable**. Please try it out and provide feedback, but expect it to change without warning prior to an official release.

## The basic interface

The new interface exists as a set of classes that can be accessed through a single namespace import:

```
import seaborn.objects as so
```

This is a clean namespace, and I’m leaning towards recommending
`from seaborn.objects import *` for interactive use cases. But let’s
not go so far just yet.

Let’s also import the main namespace so we can load our trusty example datasets.

```
from seaborn import load_dataset
tips = load_dataset("tips")
```

The main object is `seaborn.objects.Plot`. You instantiate it by
passing data and some assignments from columns in the data to roles in
the plot:

```
so.Plot(tips, x="total_bill", y="tip")
```

But instantiating the `Plot` object doesn’t actually plot anything.
For that you need to add some layers:

```
so.Plot(tips, x="total_bill", y="tip").add(so.Dots())
```

Variables can be defined globally, or for a specific layer:

```
so.Plot(tips).add(so.Dots(), x="total_bill", y="tip")
```

Each layer can also have its own data:

```
(
    so.Plot(tips, x="total_bill", y="tip")
    .add(so.Dots(color=".6"), data=tips.query("size != 2"))
    .add(so.Dots(), data=tips.query("size == 2"))
)
```

As in the existing interface, variables can be keys to the `data`
object or vectors of various kinds:

```
(
    so.Plot(tips.to_dict(), x="total_bill")
    .add(so.Dots(), y=tips["tip"].to_numpy())
)
```

The interface also supports semantic mappings between data and plot variables. But the specification of those mappings uses more explicit parameter names:

```
so.Plot(tips, x="total_bill", y="tip", color="time").add(so.Dots())
```

It also offers a wider range of mappable features:

```
(
    so.Plot(tips, x="total_bill", y="tip", color="day", fill="time")
    .add(so.Dots(fillalpha=.8))
)
```

## Core components

### Visual representation: the Mark

Each layer needs a `Mark` object, which defines how to draw the plot.
There will be marks corresponding to existing seaborn functions and ones
offering new functionality. But not many have been implemented yet:

```
fmri = load_dataset("fmri").query("region == 'parietal'")
so.Plot(fmri, x="timepoint", y="signal").add(so.Line())
```

`Mark` objects will expose an API to set features directly, rather
than mapping them:

```
so.Plot(tips, y="day", x="total_bill").add(so.Dot(color="#698", alpha=.5))
```

### Data transformations: the Stat

Built-in statistical transformations are one of seaborn’s key features.
But currently, they are tied up with the different visual
representations. E.g., you can aggregate data in `lineplot`, but not
in `scatterplot`.

In the new interface, these concerns are separated. Each layer can
accept a `Stat` object that applies a data transformation:

```
so.Plot(fmri, x="timepoint", y="signal").add(so.Line(), so.Agg())
```

A `Stat` is computed on subsets of data defined by the semantic
mappings:

```
so.Plot(fmri, x="timepoint", y="signal", color="event").add(so.Line(), so.Agg())
```
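
To make the grouping concrete, here is a plain-Python sketch of what an aggregating stat conceptually does when a `color` mapping is present: it averages `y` separately within each combination of color level and `x` value. This is not seaborn’s implementation; the `aggregate` helper and the toy records are hypothetical, and a mean is assumed as the aggregation function.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical long-form records: (event, timepoint, signal)
rows = [
    ("stim", 0, 1.0), ("stim", 0, 3.0), ("stim", 1, 4.0),
    ("cue", 0, 0.0), ("cue", 0, 2.0), ("cue", 1, 2.0), ("cue", 1, 4.0),
]


def aggregate(rows):
    """Average y within each (group, x) cell (hypothetical helper)."""
    cells = defaultdict(list)
    for group, x, y in rows:
        cells[(group, x)].append(y)
    return {key: mean(values) for key, values in cells.items()}


# One aggregated value per (event, timepoint) combination
lines = aggregate(rows)
```

Each color level ends up with its own set of aggregated points, which is why mapping `color` yields one line per event rather than a single pooled line.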

Each mark also accepts a `group` mapping that creates subsets without
altering visual properties:

```
(
    so.Plot(fmri, x="timepoint", y="signal", color="event")
    .add(so.Line(), so.Agg(), group="subject")
)
```

The `Mark` and `Stat` objects allow for more compositionality and
customization. There will be guidelines for how to define your own
objects to plug into the broader system:

```
class PeakAnnotation(so.Mark):
    def _plot(self, split_generator, scales, orient):
        for keys, data, ax in split_generator():
            ix = data["y"].idxmax()
            ax.annotate(
                "The peak", data.loc[ix, ["x", "y"]],
                xytext=(10, -100), textcoords="offset points",
                va="top", ha="center",
                arrowprops=dict(arrowstyle="->", color=".2"),
            )

(
    so.Plot(fmri, x="timepoint", y="signal")
    .add(so.Line(), so.Agg())
    .add(PeakAnnotation(), so.Agg())
)
```

The new interface understands not just `x` and `y`, but also range
specifiers; some `Stat` objects will output ranges, and some `Mark`
objects will accept them. (This means that it will finally be possible
to pass pre-defined error-bars into seaborn):

```
(
    fmri
    .groupby("timepoint")
    .signal
    .describe()
    .pipe(so.Plot, x="timepoint")
    .add(so.Line(), y="mean")
    .add(so.Band(alpha=.2), ymin="min", ymax="max")
)
```

### Overplotting resolution: the Move

Existing seaborn functions have parameters that allow adjustments for
overplotting, such as `dodge=` in several categorical functions,
`jitter=` in several functions based on scatter plots, and the
`multiple=` parameter in distribution functions. In the new interface,
those adjustments are abstracted away from the particular visual
representation into the concept of a `Move`:

```
(
    so.Plot(tips, "day", "total_bill", color="time")
    .add(so.Dot(), so.Dodge())
)
```
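
As a rough illustration of what a dodge does numerically, the sketch below splits the space around one categorical position evenly among the groups that overlap there. It is a simplified model, not seaborn’s `Dodge` implementation: the `dodge_offsets` and `dodge` helpers and the default `width` of 0.8 are assumptions made for the example.

```python
def dodge_offsets(n_groups, width=0.8):
    """Center offsets that split one slot of `width` among n_groups marks.

    Hypothetical helper; the 0.8 default is an assumption for illustration.
    """
    step = width / n_groups
    # Offsets are symmetric around 0, so the groups stay centered on the tick.
    return [(i - (n_groups - 1) / 2) * step for i in range(n_groups)]


def dodge(x, group_index, n_groups, width=0.8):
    """Shift a categorical position x for the group at group_index."""
    return x + dodge_offsets(n_groups, width)[group_index]


two = dodge_offsets(2)    # two overlapping groups, e.g. two levels of "time"
three = dodge_offsets(3)  # three overlapping groups
```

With two groups the marks land 0.2 units either side of the category position; with three, the middle group stays on the position itself.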

Separating out the positional adjustment makes it possible to add flexibility without overwhelming the signature of a single function. For example, there will be more options for handling missing levels when dodging and for fine-tuning the adjustment:

```
(
    so.Plot(tips, "day", "total_bill", color="time")
    .add(so.Bar(), so.Agg(), so.Dodge(empty="fill", gap=.1))
)
```

By default, the `Move` will resolve all overlapping semantic mappings:

```
(
    so.Plot(tips, "day", "total_bill", color="time", alpha="sex")
    .add(so.Bar(), so.Agg(), so.Dodge())
)
```

But you can specify a subset:

```
(
    so.Plot(tips, "day", "total_bill", color="time", alpha="smoker")
    .add(so.Dot(), so.Dodge(by=["color"]))
)
```

It’s also possible to stack multiple moves or kinds of moves:

```
(
    so.Plot(tips, "day", "total_bill", color="time", alpha="smoker")
    .add(so.Dot(), so.Dodge(by=["color"]), so.Jitter(.5))
)
```
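
Stacking moves can be thought of as function composition: each move receives the positions produced by the previous one. The sketch below models that idea in plain Python with hypothetical `make_dodge`, `make_jitter`, and `apply_moves` helpers; the uniform noise and its units are assumptions and may differ from what `so.Jitter` actually does.

```python
import random


def make_dodge(n_groups, width=0.8):
    """Return a move that shifts each group to its own sub-slot (hypothetical)."""
    step = width / n_groups

    def move(x, group_index):
        return x + (group_index - (n_groups - 1) / 2) * step

    return move


def make_jitter(amount, rng):
    """Return a move that adds uniform noise spanning `amount` (an assumption)."""

    def move(x, group_index):
        return x + rng.uniform(-amount / 2, amount / 2)

    return move


def apply_moves(x, group_index, moves):
    """Apply the moves in order; each one sees the previous result."""
    for move in moves:
        x = move(x, group_index)
    return x


rng = random.Random(0)  # seeded so the sketch is reproducible
moves = [make_dodge(n_groups=2), make_jitter(0.1, rng)]
shifted = apply_moves(1.0, 0, moves)  # dodged to 0.8, then jittered around it
```

Because the jitter is applied after the dodge, the noise is centered on the dodged position rather than on the original category position.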

Separating the `Stat` and `Move` from the visual representation
affords more flexibility, greatly expanding the space of graphics that
can be created.

### Semantic mapping: the Scale

The declarative interface allows users to represent dataset variables
with visual properties such as position, color or size. A complete plot
can be made without doing anything more than defining the mappings: users
need not be concerned with converting their data into units that
matplotlib understands. But what if one wants to alter the mapping that
seaborn chooses? This is accomplished through the concept of a
`Scale`.

The notion of scaling will probably not be unfamiliar; as in matplotlib,
seaborn allows one to apply a mathematical transformation, such as
`log`, to the coordinate variables:

```
planets = load_dataset("planets").query("distance < 1000")
```

```
(
    so.Plot(planets, x="mass", y="distance")
    .scale(x="log", y="log")
    .add(so.Dots())
)
```

But the `Scale` concept is much more general in seaborn: a scale can
be provided for any mappable property. For example, it is how you
specify the palette used for color variables:

```
(
    so.Plot(planets, x="mass", y="distance", color="orbital_period")
    .scale(x="log", y="log", color="rocket")
    .add(so.Dots())
)
```

While there are a number of short-hand “magic” arguments you can provide
for each scale, it is also possible to be more explicit by passing a
`Scale` object. There are several distinct `Scale` classes,
corresponding to the fundamental scale types (nominal, ordinal,
continuous, etc.). Each class exposes a number of relevant parameters
that control the details of the mapping:

```
(
    so.Plot(planets, x="mass", y="distance", color="orbital_period")
    .scale(
        x="log",
        y=so.Continuous(trans="log").tick(at=[3, 10, 30, 100, 300]),
        color=so.Continuous("rocket", trans="log"),
    )
    .add(so.Dots())
)
```
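
One way to picture a continuous color scale with a log transform: values are first placed on a 0 to 1 range that is evenly spaced in log units, and the palette is then sampled at those positions. The following sketch covers only the first step, using a hypothetical `log_normalize` helper; it assumes strictly positive values with at least two distinct levels and is not taken from seaborn’s source.

```python
import math


def log_normalize(values):
    """Map positive values onto [0, 1], evenly spaced in log10 units.

    Hypothetical helper; assumes all values > 0 and not all identical.
    """
    logs = [math.log10(v) for v in values]
    lo, hi = min(logs), max(logs)
    return [(v - lo) / (hi - lo) for v in logs]


periods = [1, 10, 100, 1000]  # made-up orbital periods
positions = log_normalize(periods)
```

Each factor of ten moves a value the same distance along the palette, so a period of 10 sits a third of the way along rather than being squeezed near the low end as it would be on a linear scale.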

There are several different kinds of scales, including scales appropriate for categorical data:

```
(
    so.Plot(planets, x="year", y="distance", color="method")
    .scale(
        y="log",
        color=so.Nominal(["b", "g"], order=["Radial Velocity", "Transit"])
    )
    .add(so.Dots())
)
```

It’s also possible to disable scaling for a variable so that the literal values in the dataset are passed directly through to matplotlib:

```
(
    so.Plot(planets, x="distance", y="orbital_period", pointsize="mass")
    .scale(x="log", y="log", pointsize=None)
    .add(so.Dots())
)
```

Scaling interacts with the `Stat` and `Move` transformations. When
an axis has a nonlinear scale, any statistical transformations or
adjustments take place in the appropriate space:

```
so.Plot(planets, x="distance").add(so.Bars(), so.Hist()).scale(x="log")
```

This is also true of the `Move` transformations:

```
(
    so.Plot(
        planets, x="distance",
        color=(planets["number"] > 1).rename("multiple")
    )
    .add(so.Bars(), so.Hist(), so.Dodge())
    .scale(x="log", color=so.Nominal())
)
```
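
To see what “the appropriate space” means for a histogram, the sketch below builds bin edges that are equally wide in log10 units rather than in raw units, then counts observations per bin. The helper names and the sample distances are hypothetical, and the binning rule is a simplification rather than the rule `so.Hist` uses.

```python
import math
from bisect import bisect_right


def log_bin_edges(values, n_bins):
    """Bin edges that are equally wide in log10 space (hypothetical helper)."""
    lo, hi = math.log10(min(values)), math.log10(max(values))
    step = (hi - lo) / n_bins
    return [10 ** (lo + i * step) for i in range(n_bins + 1)]


def histogram(values, edges):
    """Count values per bin; the last bin also includes its right edge."""
    counts = [0] * (len(edges) - 1)
    for v in values:
        i = min(bisect_right(edges, v) - 1, len(counts) - 1)
        counts[i] += 1
    return counts


distances = [1, 2, 5, 20, 50, 200, 500, 1000]  # made-up positive values
edges = log_bin_edges(distances, 3)
counts = histogram(distances, edges)
```

With three bins spanning 1 to 1000, the edges fall at 1, 10, 100, and 1000, so each bin covers one decade and would appear equally wide on a log axis.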

## Defining subplot structure

Seaborn’s faceting functionality (drawing subsets of the data on
distinct subplots) is built into the `Plot` object and works
interchangeably with any `Mark`/`Stat`/`Move`/`Scale` spec:

```
(
    so.Plot(tips, x="total_bill", y="tip")
    .facet("time", order=["Dinner", "Lunch"])
    .add(so.Dots())
)
```

Unlike the existing `FacetGrid`, it is simple to *not* facet a layer,
so that a plot is simply replicated across each column (or row):

```
(
    so.Plot(tips, x="total_bill", y="tip")
    .facet(col="day")
    .add(so.Dots(color=".75"), col=None)
    .add(so.Dots(), color="day")
    .layout(size=(7, 3))
)
```
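
The difference between a faceted layer and a layer added with `col=None` can be modeled as two ways of assigning rows to subplots. The sketch below is a conceptual illustration with a hypothetical `facet` helper and made-up records, not seaborn code.

```python
from collections import defaultdict

# Made-up records: (day, total_bill, tip)
rows = [("Thur", 10.0, 1.5), ("Fri", 20.0, 3.0), ("Fri", 15.0, 2.0)]


def facet(rows, by_column=True):
    """Return {facet level: rows drawn in that subplot} for one layer.

    Hypothetical helper; the facet variable is the first field of each row.
    """
    levels = sorted({row[0] for row in rows})
    if not by_column:
        # Layer opted out of faceting: every subplot gets every row.
        return {level: list(rows) for level in levels}
    panels = defaultdict(list)
    for row in rows:
        panels[row[0]].append(row)
    return dict(panels)


background = facet(rows, by_column=False)  # analogous to col=None
foreground = facet(rows)                   # faceted as usual
```

Drawing the `background` assignment in a muted color and the `foreground` assignment on top gives the familiar effect of highlighting one subset against the full dataset in each subplot.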

The `Plot` object *also* subsumes the `PairGrid` functionality:

```
(
    so.Plot(tips, y="day")
    .pair(x=["total_bill", "tip"])
    .add(so.Dot())
)
```

Pairing and faceting can be combined in the same plot:

```
(
    so.Plot(tips, x="day")
    .facet("sex")
    .pair(y=["total_bill", "tip"])
    .add(so.Dot())
)
```

Or the `Plot.pair` functionality can be used to define unique pairings
between variables:

```
(
    so.Plot(tips)
    .pair(x=["day", "time"], y=["total_bill", "tip"], cross=False)
    .add(so.Dot())
)
```
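
The effect of `cross=False` can be summarized as pairing the variable lists by position instead of forming every combination. A minimal sketch of that distinction, using a hypothetical `pairings` helper and assuming both lists have the same length when `cross=False`:

```python
from itertools import product

x_vars = ["day", "time"]
y_vars = ["total_bill", "tip"]


def pairings(x_vars, y_vars, cross=True):
    """List the (x, y) variable pair shown in each subplot (hypothetical)."""
    if cross:
        # Every x variable against every y variable
        return list(product(x_vars, y_vars))
    # Position-wise pairs: first x with first y, second with second, ...
    return list(zip(x_vars, y_vars))


crossed = pairings(x_vars, y_vars)               # four subplots
unique = pairings(x_vars, y_vars, cross=False)   # two subplots
```

The order of the crossed list here says nothing about how the subplots are arranged in the grid; it only enumerates which combinations exist.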

It’s additionally possible to “pair” with a single variable, for univariate plots like histograms.

Both faceted and paired plots with subplots along a single dimension can be “wrapped”, and this works both columnwise and rowwise:

```
(
    so.Plot(tips)
    .pair(x=tips.columns, wrap=3)
    .share(y=False)
    .add(so.Bar(), so.Hist())
)
```
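
Wrapping amounts to filling a grid in reading order and starting a new row after `wrap` subplots. The arithmetic can be sketched as follows; `wrapped_grid` is a hypothetical helper, and the sketch covers only the columnwise case.

```python
import math


def wrapped_grid(n_subplots, wrap):
    """Grid shape and (row, col) slot of each subplot when wrapping columns.

    Hypothetical helper for illustration only.
    """
    n_cols = min(n_subplots, wrap)
    n_rows = math.ceil(n_subplots / wrap)
    # divmod(i, wrap) gives the row and the column of the i-th subplot
    slots = [divmod(i, wrap) for i in range(n_subplots)]
    return (n_rows, n_cols), slots


shape, slots = wrapped_grid(7, wrap=3)  # e.g. seven paired variables
```

Seven subplots wrapped at three produce a 3 by 3 grid whose last row holds a single subplot, leaving two slots empty.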

Importantly, there’s no distinction between “axes-level” and
“figure-level” here. Any kind of plot can be faceted or paired by adding
a method call to the `Plot` definition, without changing anything else
about how you are creating the figure.

## Customization

This API is less developed than other aspects of the new interface, but it will be possible to customize various aspects of the plot through the seaborn interface, without dropping down to matplotlib:

```
(
    so.Plot(tips, "day", "total_bill", color="sex")
    .add(so.Bar(), so.Agg(), so.Dodge())
    .scale(y=so.Continuous().label(like="${x:.0f}"))
    .label(x=str.capitalize, y="Total bill", color=None)
    .limit(y=(0, 28))
)
```

## Iterating and displaying

It is possible (and in fact the default behavior) to be completely pyplot-free, and all the drawing is done by directly hooking into Jupyter’s rich display system. Unlike in normal usage of the inline backend, writing code in a cell to define a plot is independent from showing it:

```
p = so.Plot(fmri, x="timepoint", y="signal").add(so.Line(), so.Agg())
```

```
p
```

By default, the methods on `Plot` do *not* mutate the object they are
called on. This means that you can define a common base specification
and then iterate on different versions of it.

```
p = (
    so.Plot(fmri, x="timepoint", y="signal", color="event")
    .scale(color="crest")
)
```

```
p.add(so.Line())
```

```
p.add(so.Line(), group="subject")
```

```
p.add(so.Line(), so.Agg())
```

```
(
    p
    .add(so.Line(linewidth=.5, alpha=.5), group="subject")
    .add(so.Line(linewidth=3), so.Agg())
)
```
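
The non-mutating behavior follows a common pattern: each method copies the specification, modifies the copy, and returns it. Below is a toy sketch of that pattern; `Spec` is a hypothetical stand-in and says nothing about how `Plot` is implemented internally.

```python
import copy


class Spec:
    """Toy stand-in for a plot specification (not seaborn's Plot)."""

    def __init__(self):
        self.layers = []

    def add(self, layer):
        new = copy.deepcopy(self)  # work on a copy...
        new.layers.append(layer)   # ...so the original is left untouched
        return new


base = Spec()
with_line = base.add("line")
with_both = with_line.add("band")
```

Because `add` returns a new object each time, `base` can be reused as the starting point for any number of variants, and method calls can be chained without side effects on earlier results.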

It’s also possible to hook into the `pyplot` system by calling
`Plot.show`. (You might do this in a terminal interface, or to use a GUI.)
Notice how this looks lower-res: that’s because `Plot` is generating
“high-DPI” figures internally!

```
(
    p
    .add(so.Line(linewidth=.5, alpha=.5), group="subject")
    .add(so.Line(linewidth=3), so.Agg())
    .show()
)
```

## Matplotlib integration

It’s always been a design aim in seaborn to allow complicated seaborn
plots to coexist within the context of a larger matplotlib figure. This
is achieved within the “axes-level” functions, which accept an `ax=`
parameter. The `Plot` object *will* provide a similar functionality:

```
import matplotlib as mpl
# Unpacking keeps only the second of the two Axes as `ax`
_, ax = mpl.figure.Figure().subplots(1, 2)
(
    so.Plot(tips, x="total_bill", y="tip")
    .on(ax)
    .add(so.Dots())
)
```

But a limitation has been that the “figure-level” functions, which can
produce multiple subplots, cannot be directed towards an existing
figure. That is no longer the case; `Plot.on()` also accepts a
`Figure` (created either with or without `pyplot`) object:

```
f = mpl.figure.Figure()
(
    so.Plot(tips, x="total_bill", y="tip")
    .on(f)
    .add(so.Dots())
    .facet("time")
)
```

Providing an existing figure is perhaps only marginally useful. While it
will ease the integration of seaborn with GUI frameworks, seaborn is
still using up the whole figure canvas. But with the introduction of the
`SubFigure` concept in matplotlib 3.4, it becomes possible to place a
small-multiples plot *within* a larger set of subplots:

```
f = mpl.figure.Figure(constrained_layout=True, figsize=(8, 4))
sf1, sf2 = f.subfigures(1, 2)
(
    so.Plot(tips, x="total_bill", y="tip", color="day")
    .layout(algo=None)
    .add(so.Dots(), legend=None)
    .on(sf1)
    .plot()
)
(
    so.Plot(tips, x="total_bill", y="tip", color="day")
    .layout(algo=None)
    .facet("day", wrap=2)
    .add(so.Dots())
    .on(sf2)
    .plot()
)
```

Note that there may be some rough edges around this concept in the first couple of releases, especially relating to the legend positioning.