Why use Cookiecutter-Spatial-Data-Science?

When people hear data analysis, data science, or data engineering, the first thing that comes to mind is usually the results: interesting reports, infographics, and maps. Getting to those insightful products requires creative exploration, and that exploration is unstructured by nature. It is nearly always a bit messy.

Reproducing the results of that exploration, however, requires exactly the opposite: organized structure, clean code, and good documentation.

Trying to impose structure mid-exploration kills the creativity that makes discovery possible. Trying to untangle and add structure after the fact — once you've already landed on results — is difficult at best and, more often than not, nearly impossible.

That's why you should use Cookiecutter-Spatial-Data-Science. Once you're familiar with the structure, its opinionated best practices let you explore, discover, and communicate insights creatively, in a form that both you and others can reproduce. That makes your work far more valuable.

Others Will Thank You

Your projects should be clear enough that colleagues can dive in, contribute, and trust the results. A standardized structure lets newcomers understand an analysis without wading through pages of documentation or reading every line of code.

Well-organized code is often self-documenting: the structure itself provides context. Collaboration gets easier, learning gets smoother, and conclusions become more trustworthy.

You Will Thank You

If you've ever come back to a project after a few weeks, months, or years, you've probably asked yourself questions like these:

  • Do I need to intersect the states with the points before running make_data.py?
  • Do I run clean_data.py first, or evaluate_data.py?
  • Where do the Zip Code polygons come from?
  • Do I use make_figures.py.old, make_figures_working.py, or new_make_figures01.py?

These questions are symptoms of a disorganized project: they are what creative exploration leaves behind when there's no structure to make the results reproducible.

A consistent structure means you can return to your projects easily, and you'll spend far less time answering these questions for the people using your work (including future-you).

Opinions Are Not Binding

A foolish consistency is the hobgoblin of little minds.

— Ralph Waldo Emerson (and PEP 8)

Cookiecutter-Spatial-Data-Science is built on opinions, notably PEP 8 and the idea of analysis as a directed acyclic graph (DAG), in which each step depends only on the outputs of earlier steps. These are best practices, but they're still opinions. You don't have to follow them.

If something else works better for you, do it. Just please do it consistently, for all of the reasons above. Others will understand how to use your projects, and you'll remember how to use them too.
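
To make the "analysis as a DAG" idea concrete, here is a minimal sketch using Python's standard-library graphlib module (Python 3.9+). The script names echo the questions listed earlier on this page, but the dependencies between them are assumptions made purely for illustration. They are not part of the template itself.

```python
from graphlib import TopologicalSorter

# Hypothetical pipeline: each step maps to the set of steps that must
# finish before it can run. The ordering is assumed for illustration.
PIPELINE = {
    "make_data.py": set(),
    "clean_data.py": {"make_data.py"},
    "evaluate_data.py": {"clean_data.py"},
    "make_figures.py": {"evaluate_data.py"},
}


def run_order(pipeline):
    """Return the steps in an order that respects every dependency."""
    # static_order() raises graphlib.CycleError if the steps depend on
    # each other in a loop, i.e. if the graph is not actually a DAG.
    return list(TopologicalSorter(pipeline).static_order())


if __name__ == "__main__":
    for step in run_order(PIPELINE):
        print(step)
```

Once the dependencies are written down like this, "Do I run clean_data.py first, or evaluate_data.py?" is answered by the project itself, not by memory.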

In Short

Use Cookiecutter-Spatial-Data-Science to keep the messy, creative part of your work messy and creative — while making the result reproducible by default. Explore freely, ship clearly, and thank yourself later.