
2.6 Line Plots


Line plots are typically used to track the value of a numeric variable over time.

Most often, line plots will feature a time variable on the x-axis, with the variable whose measurement is being tracked on the y-axis.  The “line” in line plot refers to the fact that the points are connected to one another.  Below, we will create one with seaborn’s lineplot() function, in order to see the number of season passes that Lobster Land sold for the following season, by date.

An interesting pattern emerges from this line plot: at some point midway through the season, daily sales of next year’s passes spiked.  They remained steady at that new plateau before spiking again just prior to the end of the season.  That x-axis looks like a huge mess, though!  We can improve on this by instructing pandas to treat these x-axis values as dates.

More information about working with dates and times in Python can be found in Chapter 11: Forecasting.  

Now that the ‘Date’ column’s data type has been converted, the graph appears in a much more reader-friendly way.

Line plots can be modified with different line styles, colors, and markers.  Seaborn’s line plot documentation contains more details about these modifications.

Note that by default, seaborn gave us a y-axis that started at 0 and spanned all of our data.  Let’s see what happens if we greatly expand the limits of that y-axis.

After moving to a range of 0 to 800, the graph becomes slightly harder to interpret, but we can still identify the way the numbers sort of “jump” at some point in late July, and then stay steady until that spike at the end.  What if we tried an even larger y-axis range?

With an exaggerated y-axis range such as the one used here, the plot becomes totally unreadable.  We demonstrate this here only to point out that sometimes, a decision made by the modeler – even one that does not impact the underlying data at all – can greatly alter a plot’s interpretability.  

So what’s the right way to set up your axes?  Ultimately, that’s a judgment call on the part of the person building the graph. In the vast majority of cases, you will not need to alter the default settings of the plotting library that you are using, but there may be situations – like the presence of extreme outliers – that require you to consider adjustments.  

Some sources might state that the y-axis of any line plot should always start at 0.  However, there is no hard-and-fast requirement for this.  To think about why, just imagine that you are plotting something whose true value never comes anywhere near 0 – now, adherence to this rule could force you to stretch your axis in a way that greatly reduces readability.  

The only ironclad rule that we believe in for axis adjustments is this:  Never intentionally deceive your audience.  In other words, do not distort your axis in order to make your data “tell the story” in the way that you want it to.