18.4 Don’t be a Donkey!
The story of Buridan’s Donkey is sometimes attributed to the 14th-century French philosopher Jean Buridan, but may actually have roots going back much further. Regardless of its origins, Buridan’s Donkey is a cautionary tale about indecision – and its message is valuable for anyone seeking to expand their analytics knowledge through self-learning.
Hungry and thirsty at the same time, this donkey wandered and wandered in the fields before stumbling upon what should have been a miraculous discovery – a delicious bale of hay and a bucket of clean, fresh water. However, he now had a new problem, as he did not know which one to choose first. The hay looked wonderfully tasty, and the water looked wonderfully refreshing. What’s more, they were equidistant from where the donkey stood, so he could not simply go to the nearest one first.
For hours, the donkey stood there, as his eyes darted back and forth between the hay and the water. Just when he thought he had nearly made up his mind, he would stop to think, “Wait, what if I should go to the other one first?” Finally, paralyzed by indecision, the donkey dropped dead from both hunger and thirst, simultaneously.
Do not fall prey to the same problem! Rather than agonize endlessly over what to study, pick something, and run with it. Most YouTube series on data analytics are informative. Most DataCamp courses are high quality.
Indecision can hurt you not only during the learning phase of analytics, but also during the doing phase.
When modeling data, the “secret sauce” hiding so often in plain sight is iteration. Every dataset is unique, and every problem statement is different. Sometimes the only way to know which subset of features will work best for a particular model is to check.
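To make "just check" concrete, here is a minimal sketch in plain Python. Everything in it is an assumption made for illustration: the toy dataset is invented, the helper names `loo_accuracy` and `rank_feature_subsets` are hypothetical, and the model is a bare-bones 1-nearest-neighbor classifier scored with leave-one-out accuracy. The point is only the loop: try every subset, score it, and compare.

```python
from itertools import combinations

# Toy data, made up for illustration: each row is (features, label).
# Feature 0 separates the two classes; features 1 and 2 are noise.
ROWS = [
    ((1.0, 5.0, 0.3), 0),
    ((1.2, 1.0, 0.9), 0),
    ((0.8, 9.0, 0.1), 0),
    ((1.1, 3.0, 0.7), 0),
    ((3.0, 4.0, 0.2), 1),
    ((3.2, 8.0, 0.8), 1),
    ((2.9, 2.0, 0.4), 1),
    ((3.1, 6.0, 0.6), 1),
]


def loo_accuracy(rows, cols):
    """Leave-one-out accuracy of a 1-nearest-neighbor model using only `cols`."""
    correct = 0
    for i, (x, y) in enumerate(rows):
        # Hold out row i and find its nearest neighbor among the rest.
        others = [r for j, r in enumerate(rows) if j != i]
        nearest = min(
            others,
            key=lambda r: sum((r[0][c] - x[c]) ** 2 for c in cols),
        )
        correct += nearest[1] == y
    return correct / len(rows)


def rank_feature_subsets(rows):
    """Score every non-empty feature subset and return them best-first."""
    n_features = len(rows[0][0])
    results = []
    for size in range(1, n_features + 1):
        for cols in combinations(range(n_features), size):
            results.append((loo_accuracy(rows, cols), cols))
    # Highest accuracy first; ties go to the subset with fewer features.
    results.sort(key=lambda r: (-r[0], len(r[1])))
    return results


for accuracy, cols in rank_feature_subsets(ROWS):
    print(cols, round(accuracy, 2))
```

Checking every subset is only practical when there are few features, since the number of subsets doubles with each feature added. With wider datasets, the same habit applies to a smaller list of candidate subsets.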
If you need some inspiration based on a topic covered in this book, try being more like the k-means algorithm. While k-means uses some fairly sophisticated math to determine when to stop adjusting its centroid positions, it has no clue at the outset about the best place to start. Rather than risk meeting the fate of Buridan’s Donkey, k-means just picks those initial positions randomly, knowing that it will iteratively improve on them.
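The "pick something, then improve" idea can be sketched in a few lines of plain Python. This is a simplified illustration rather than the implementation any particular library uses: the function names `kmeans`, `sq_dist`, and `mean_point` are hypothetical, the starting centroids are k data points chosen at random, and the stopping rule assumed here is simply "stop when the centroids no longer move."

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def mean_point(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def kmeans(points, k, max_iter=100, seed=None):
    rng = random.Random(seed)
    # No agonizing over the start: use k randomly chosen data points.
    centroids = rng.sample(points, k)
    clusters = [[] for _ in range(k)]
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: sq_dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        # An empty cluster keeps its old centroid.
        new_centroids = [
            mean_point(cluster) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break  # Nothing moved, so further passes would change nothing.
        centroids = new_centroids
    return centroids, clusters


# Invented example data: two loose groups of 2-D points.
data = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
        (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
centroids, clusters = kmeans(data, k=2, seed=1)
print(centroids)
```

Because the start is random, different seeds can take different paths and, on messier data, can settle on different final clusters. That is one reason practitioners often run k-means several times and keep the best result.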
Once you have something, you can always improve on it incrementally. But a percentage increase on a baseline of zero is, well…nothing!