
18.1 Slow is Smooth, and Smooth is Fast: So Take it Slow!


Our Fundamental Rule:  Slow is Smooth, and Smooth is Fast

The single biggest mistake that new learners make – regardless of the topic – is to rush.  Ironically, when we take the time to learn something deeply, even though that involves more of an up-front time investment, we come away with a stronger conceptual grasp, and we actually learn it faster.  

Rushing is natural.  We all do it.  After all, when we are eager to learn something, we prefer to acquire this knowledge now, rather than later on.  If you truly wish to master a concept, though, train yourself to fight this urge.  Take things as slowly as humanly possible.  Stop and ask as many questions as you need to.  Heed the maxim that dates back to the Roman Empire:  Festina lente (make haste slowly).  

So what does this mean, in practice?  

It means that when you first encounter a new concept, you should dig your heels in.  Read about it in your textbook.  If there’s a code example, don’t just skim it visually – actually enter and run the code yourself.  If the answers in one source don’t take things deep enough for your liking, look at alternative sources.  At times, these searches will frustrate you.  The data science community is awash in “pop” material that only goes an inch or two beneath the surface.  By digging a bit deeper, you will find the answers.  

When it comes to coding, by all means, use existing scripts as templates to solve the problem at hand. When you do, do not settle for “I’m running this line of code, but I have no idea what it does.”  Instead, use that wonderful help() function to learn more about it, and to dive into the nitty-gritty details.  Which parameters are required?  Which ones are optional?  How can the defaults be changed?  
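To make that habit concrete, here is a short sketch using only built-in functions (nothing here is specific to any one library):

```python
import inspect

# help() prints the full docstring, including required and optional parameters.
help(round)

# inspect.signature() exposes the same information programmatically, so you can
# see each parameter's name and its default value at a glance.
sig = inspect.signature(sorted)
print(sig)  # e.g. (iterable, /, *, key=None, reverse=False)
for name, param in sig.parameters.items():
    print(name, "default =", param.default)
```

Running `help()` on any unfamiliar function you have borrowed from a template is a thirty-second investment that answers all three questions above.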

Suppose you found an example (perhaps in this book!) that included a lambda function, or a list comprehension, or a user-defined function, and you felt like the explanation accompanying the syntax was too light.  Your assessment may be right – perhaps it could have been explained more fully – but regardless, it’s now up to you to take ownership of the situation.  Explore it independently, find the answers you need, and drive on.  
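In that spirit, here is a small self-contained exercise showing one task solved three ways – an explicit user-defined function, a list comprehension, and lambda functions – so you can compare the syntaxes side by side (the numbers and function names are ours):

```python
nums = [1, 2, 3, 4, 5, 6]

# 1. A user-defined function with an explicit loop
def square_evens(values):
    result = []
    for v in values:
        if v % 2 == 0:          # keep only the even numbers...
            result.append(v ** 2)  # ...and square them
    return result

# 2. The same logic as a list comprehension
squares_comp = [v ** 2 for v in nums if v % 2 == 0]

# 3. The same logic with map/filter and lambda functions
squares_lambda = list(map(lambda v: v ** 2, filter(lambda v: v % 2 == 0, nums)))

print(square_evens(nums))  # [4, 16, 36]
print(squares_comp)        # [4, 16, 36]
print(squares_lambda)      # [4, 16, 36]
```

All three produce the same result; seeing that for yourself is exactly the kind of slow, independent exploration we are advocating.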

Next, challenge yourself a bit – can you code up another, similar example that solves a slightly different problem?  Putting the code aside for a bit, and then coming back to it, with your own homemade example, is often the best way to reinforce the important underlying concepts.  

As for exactly the right pace, that will vary based on the situation and on your goals.  You may be up against a deadline.  You may be okay with a general, thematic understanding of an idea from statistics.  Whenever possible, though, just bear in mind that slow is smooth, and smooth is fast.  This rule is so essential, in fact, that it comes with three separate corollaries, each of which is shown below.  

Corollary #1:  Stop to Smell the Roses.  And to Read the Python Error Messages 

Python error messages can be intimidating!  Sometimes, they can be tough to decipher.  But if we take a good look at these messages and break them down, we can usually identify the type of error, the details about what went wrong, and where it occurred.  Was it a type error, an indentation error, or a name error?  Did the mistake happen on line 10, or on line 3?  These are handy clues that tell us where to look to solve the problem, and what the mistake is.  Familiarizing yourself with these common error types saves time when debugging code.
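To see two of those clues in a message you can generate safely, here is a tiny sketch (the broken line is deliberately contrived):

```python
# Deliberately mix incompatible types, then inspect the error Python raises.
try:
    total = "10" + 5  # a classic beginner slip: str + int
except TypeError as err:
    print(type(err).__name__)  # the type of error: TypeError
    print(err)                 # the detail about what went wrong
```

When the error is not caught, Python also prints a traceback showing the file name and line number – which answers the “line 10 or line 3?” question directly.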

We once had a student tell us that he spent eight hours the previous day attempting to generate a prediction using his logistic regression model.  The crux of the issue was that in the dataframe he used to generate a hypothetical record for the prediction, he used one too many variables.  When he showed us his code the next day, we read the error message:  “Expected 24 variables but received 23.”  We don’t know how long he had really spent debugging this on the day before, but his frustration that morning was evident.  Working through the issue with him, though, we learned that he had never actually read the error message.    
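We do not know exactly what that student’s code looked like, but the failure mode is easy to reproduce in miniature.  Here is a purely hypothetical sketch – the predict_proba name, the coefficient counts, and the error wording are ours, not any real library’s:

```python
import math

def predict_proba(coefficients, record):
    """A toy stand-in for a fitted logistic model's prediction step."""
    if len(record) != len(coefficients):
        raise ValueError(
            f"Expected {len(coefficients)} variables but received {len(record)}"
        )
    # A logistic prediction: 1 / (1 + e^-z), where z is the weighted sum.
    z = sum(c * x for c, x in zip(coefficients, record))
    return 1 / (1 + math.exp(-z))

coefs = [0.1] * 24    # the "model" expects 24 inputs
record = [1.0] * 23   # ...but the hypothetical record supplies only 23

try:
    predict_proba(coefs, record)
except ValueError as err:
    print(err)  # Expected 24 variables but received 23
```

Reading that one line tells you everything: the model and the record disagree about the number of variables, so the fix is in how the record was built – no eight-hour debugging session required.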

Corollary #2:   Avoid the Rush to Automation

Of course, automation has its place, especially in technology.  Manual procedures are not only more tedious and time-consuming, but they are more error-prone, too.  However, when starting out, they can sometimes be more effective for learning.  Simply put, it’s very easy – perhaps too easy – to run a script and generate a result without any syntax errors, all without really knowing what happened, or why.  

To illustrate this point, we’ll make an analogy to agriculture.  If you were running a large-scale agribusiness, with many thousands of acres of farmland, you would of course need to automate some of your processes.  While there might be some satisfaction gleaned from a day’s worth of tough labor in the field, a purely manual process simply would not scale well.  

But what if you were teaching someone to grow a tomato garden, using a two meter by two meter plot of soil?  By the end of the day, would you expect that person’s hands to be a bit dirty?  We should hope so!  And we would also hope that the lessons that someone can learn in that tiny plot are illustrative of bigger-picture principles that would apply on a large scale.  Having really seen the effects of sunlight, water, soil treatment, and crop rotation in a detailed, micro-level way will enable someone to more deeply appreciate those things when they are applied in a macro-level way, too.  

To take an example from this book, let’s think about the process we use to find the prediction for the next period in a time series with a linear trend.  It is, admittedly, a bit painful, and a bit messy.  There are two separate equations that we must update, and enough B-sub-this and L-sub-that variables to juggle that it could drive a person crazy.  But working through it a few times by hand helps someone to see how the trend and level are updated – along with the way that larger alpha and beta coefficients would speed up the adjustments as new data flows in.  
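For readers who want to trace those updates in code as well as by hand, here is a minimal sketch of the two update equations (this is the standard linear trend method, often called Holt’s method; the starting values below are invented for illustration):

```python
def holt_update(level, trend, y_new, alpha, beta):
    """One update of the level and trend after observing a new value y_new."""
    # Equation 1: the new level blends the new observation with the old forecast.
    new_level = alpha * y_new + (1 - alpha) * (level + trend)
    # Equation 2: the new trend blends the observed change with the old trend.
    new_trend = beta * (new_level - level) + (1 - beta) * trend
    return new_level, new_trend

# Invented starting values: level 100, trend +2 per period.
level, trend = 100.0, 2.0

# A new observation arrives; larger alpha/beta would react to it more quickly.
level, trend = holt_update(level, trend, y_new=105.0, alpha=0.5, beta=0.3)
print(level, trend)    # level becomes 103.5, trend becomes about 2.45

# The next-period forecast is simply level + trend: about 105.95.
print(level + trend)
```

Stepping through this by hand first, then confirming with the code, is exactly the slow-then-smooth workflow this corollary recommends.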

If we had to generate many such predictions, especially in a short timeframe?  Of course, we would just use a function.  But knowing how the function works means that if something goes wrong – or we just see a result that’s very unexpected – we can start to pick it apart.  We can troubleshoot the issue without feeling like we are stabbing in the dark.  Our goal as analytics professionals is always to know the data, to know the models, and to know how the input values and the parameters work together.  

If analytics were an Olympic sport, how would its gold medals be awarded?  They would go to people, or to teams, who found useful takeaways from data.  

For a real-life example, let’s take SoFi, a fintech company whose name comes from “Social Finance.”  SoFi’s founders got started after poring over data about student loan defaults.  After analyzing the data, they found something remarkable – while the default rate on all types of student loans was approximately nine percent, that rate among MBA holders was closer to just one percent.  At Stanford’s Graduate School of Business (GSB), where SoFi’s founders were studying at the time, the rate over the past 25 years was just 0.58%.1  However, federal lending programs were charging the same interest rates to all students, at all programs and schools.  Meanwhile, traditional banks were still using outdated credit scoring and lending models that ignored potentially important data points about borrowers.  SoFi got started by acting on what its founders learned – they made loans to GSB students and refinanced loans held by recent alumni.  

Today, SoFi is a publicly-traded, multi-billion dollar entity that operates across the financial spectrum – and it all started from some great analytics work.  Note that:

  • No one cares whether that analysis was conducted in Python, R, Excel…or, for that matter, a piece of notebook paper;
  • No one cares how many milliseconds it took to run the credit scoring model that led SoFi towards its first round of investors;
  • No one cares how many lines of code its founders used to come to their conclusions about loan refinancing.  

What people do care about are the conclusions that came from that initial analysis, which have quite literally revolutionized traditional finance models.  

Corollary #3:   Got the Need for Speed?  Slow Down, Buckeroo! 

Coding beginners (and even quite a few textbooks and tutorials) often overstate the importance of processing speed when comparing various types of algorithms and functions.  

For instance, let’s say we are comparing the processing time of an exhaustive grid search with that of a randomized grid search.  Suppose the exhaustive search takes 230 seconds (admittedly, that feels like an eternity in the computing world!) whereas the randomized search takes only 34 seconds.  
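The reason for the gap is simply the number of candidate fits.  A bare-bones sketch, with made-up hyperparameter names and values, makes the arithmetic obvious:

```python
import itertools
import random

# A hypothetical hyperparameter grid (the names are invented for illustration).
param_grid = {
    "max_depth":   [2, 4, 6, 8],
    "min_samples": [1, 5, 10],
    "criterion":   ["gini", "entropy"],
}

# An exhaustive search fits a model for every combination: 4 * 3 * 2 = 24 fits.
all_combos = list(itertools.product(*param_grid.values()))
print(len(all_combos))   # 24

# A randomized search samples a fixed budget of combinations instead.
random.seed(42)
sampled = random.sample(all_combos, k=6)
print(len(sampled))      # 6
```

With a bigger grid the exhaustive count multiplies quickly, while the randomized budget stays fixed – hence the 230-versus-34-second gap in our example.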

We could say the same for a question about k-means vs. hierarchical clustering.  Especially when working with large datasets, the calculations required for hierarchical clustering mean that the model fitting process takes longer.  We would stop short of calling that a meaningful disadvantage, though. 
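The reason is the amount of distance arithmetic each method requires.  A rough back-of-the-envelope sketch (the counts below ignore constant factors and the number of iterations):

```python
def hierarchical_distance_count(n):
    # Agglomerative clustering starts from the full pairwise distance matrix:
    # n * (n - 1) / 2 distances before any merging even begins.
    return n * (n - 1) // 2

def kmeans_distance_count(n, k):
    # k-means computes one distance per point per centroid, per iteration.
    return n * k

n = 10_000
print(hierarchical_distance_count(n))   # 49,995,000 pairwise distances
print(kmeans_distance_count(n, k=5))    # 50,000 distances per iteration
```

With 10,000 records, the hierarchical approach needs roughly a thousand times more distance calculations up front – real, but hardly a reason to rule the method out for a one-off analysis.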

How important is this difference, really?  Sure, if you were planning to make thousands of different clustering models every single day, for weeks and months on end, the distinction would matter.  We are not aware of anyone, anywhere, who builds that many clustering models (but if you are, please let us know – we would appreciate the mutual introduction, and we will edit this paragraph with an update).  

Far more realistically, an analytics professional will be asked by someone in her organization, or by an external client, to segment a consumer dataset, with the goal of generating unique clusters that the company would use to deliver distinct marketing approaches, tailored to each group.  This only requires one model (or perhaps a small handful of models, if you wish to present the client with some alternative choices).  

If Version B of a segmentation model is even slightly better than Version A, then Version B is the one you should build – and whatever method you used to build it was the right one to pick.  The impact of a successful model can be enormous, and it could pretty quickly drown out any traumatic memories caused by the extra 112 seconds of processing time that were needed to build it.  

1 https://www.forbes.com/sites/petercohan/2012/05/15/sofis-mike-cagney-wants-to-fix-student-loans/?sh=50db34f81eab