Defensive Data Science

White Paper

Overview

The data scientist’s skill set combines coding and analytic talents, usually emphasizing predictive analytics knowledge over coding acumen. Josh Wills of Cloudera characterizes data scientists as people who are “better at coding than the average statistician,” which is probably accurate but also sets the programming bar fairly low.

Software engineers put a lot of thought and energy into the minutiae of their projects, such as code structure and organization, and this attention is a major factor in the success of those projects.

Predictive analytics projects share many of the pain points of software development projects (with some additional aggravations unique to analytics), so the data science community can benefit significantly from adopting strategies and philosophies pioneered in software engineering.

Explore five such strategies in this white paper:

  • Version control
  • Code readability
  • Documentation
  • Semantic folder structure
  • Pipelines
