A Data Science Workflow Canvas to Kickstart Your Projects

Use this guide to help you complete your data science projects.

Jasmine Vasandani, May 5

Learning from my own mistakes and best practices, I designed a Data Science Workflow Canvas* to help others achieve their own data science projects.

This canvas helps you prioritize your goals first and then work backward to achieve them.

You can think about it this way: instead of following steps in a recipe to cook a predetermined meal, you first envision what the meal looks and tastes like and then you start developing a recipe.

When working on a data science project, you usually don’t have a set of instructions to achieve a predetermined outcome.

Instead, you have to determine the outcomes and the steps to achieve those outcomes.

This Data Science Workflow Canvas was designed with that process in mind.

Here, I’ll walk you through how to use this canvas, and I’ll share examples of how I implemented this canvas in my own projects.

The Data Science Workflow Canvas.

Download the Data Science Workflow Canvas.

How to Use the Data Science Workflow Canvas

Step 1: Identify your problem statement

What problem are you trying to solve? And what larger issues does that problem address? This section helps you address the “why” of your project.

Step 2: State your intended outcomes/predictions

Yes, you won’t know what your outcomes are until after you’re done with your project, but you should at least have an idea of what you think they should look like.

Identify potential predictor (X) and/or target (y) variables.
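As a minimal sketch of what separating predictors from a target can look like, assume your data is a list of records. The column names and values below are invented for illustration and are not taken from either project in this article.

```python
# Hypothetical records; names and values are made up for illustration.
records = [
    {"points": 18.2, "rebounds": 7.1, "assists": 3.4, "all_star": 1},
    {"points": 6.5, "rebounds": 2.0, "assists": 1.1, "all_star": 0},
    {"points": 12.9, "rebounds": 4.8, "assists": 5.6, "all_star": 0},
]

PREDICTORS = ["points", "rebounds", "assists"]  # candidate X variables
TARGET = "all_star"                             # candidate y variable


def split_features_target(rows, predictors, target):
    """Return (X, y): a list of feature rows and a list of target values."""
    X = [[row[name] for name in predictors] for row in rows]
    y = [row[target] for row in rows]
    return X, y


X, y = split_features_target(records, PREDICTORS, TARGET)
print(X[0], y[0])
```

Writing the predictor and target names down this early makes it easier to check later whether your data sources actually contain them.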

Step 3: Determine your data sources

Where are you sourcing your data from? Is there enough data? And can you actually work with it? Sometimes you might have access to ready-made datasets, or you might need to scrape your data.
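One rough way to answer "is there enough data, and can I work with it?" is to count rows and missing values before committing to a source. The sketch below uses only the standard library; the function name `summarize_csv` and the sample data are assumptions made for this example.

```python
import csv
import io


def summarize_csv(text):
    """Count rows and empty cells per column in CSV text."""
    reader = csv.DictReader(io.StringIO(text))
    rows = list(reader)
    missing = {name: 0 for name in (reader.fieldnames or [])}
    for row in rows:
        for name in missing:
            # Short rows yield None for absent cells, so treat None as empty.
            if (row[name] or "").strip() == "":
                missing[name] += 1
    return {"n_rows": len(rows), "missing": missing}


# Hypothetical sample; a real project would read this from a file or a scrape.
sample = "player,points,team\nA,12.5,X\nB,,Y\nC,9.1,\n"
summary = summarize_csv(sample)
print(summary)
```

If a column you listed as a predictor or target turns out to be mostly empty, that is a signal to revisit Step 2 or look for another source.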

Step 4: Choose your model(s)

Choose your model(s) depending on your answers to these questions: Are your outcomes discrete or continuous? Do you have labeled or unlabeled datasets? Are you concerned with outliers? How interpretable do your results need to be? The list of questions can vary depending on your project.
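Two of these questions (labeled or unlabeled, discrete or continuous) can be sketched as a small decision helper. This is a rough heuristic under simplified assumptions, not a rule, and the function name `suggest_model_family` is hypothetical; questions about outliers and interpretability still need your own judgment.

```python
def suggest_model_family(labeled: bool, continuous_outcome: bool = False) -> str:
    """Map two of the Step 4 questions to a broad family of models."""
    if not labeled:
        # No target variable is available, so look for structure instead.
        return "unsupervised (e.g. clustering)"
    if continuous_outcome:
        return "supervised regression"
    return "supervised classification"


print(suggest_model_family(labeled=False))
print(suggest_model_family(labeled=True, continuous_outcome=True))
print(suggest_model_family(labeled=True, continuous_outcome=False))
```

The helper only narrows the search to a family; picking a specific model within that family depends on the rest of your canvas.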

Step 5: Identify model evaluation metrics

Identify corresponding model evaluation metrics to interpret your outcomes.

Every model will have its own set of evaluation metrics.
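As one illustration, here is a minimal sketch of three common metrics for a binary classifier (accuracy, precision, and recall), written from scratch so the definitions are visible. The labels below are made up, and other model types such as regression or clustering would call for different metrics.

```python
def binary_classification_metrics(y_true, y_pred):
    """Compute accuracy, precision, and recall for labels encoded as 0/1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    correct = sum(1 for t, p in pairs if t == p)
    return {
        "accuracy": correct / len(pairs),
        # Guard against division by zero when nothing is predicted or present.
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# Hypothetical labels and predictions for illustration only.
metrics = binary_classification_metrics([1, 0, 1, 1, 0], [1, 0, 0, 1, 1])
print(metrics)
```

Deciding which of these numbers matters most before you train anything keeps the evaluation tied to your problem statement.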

Step 6: Create a data preparation plan

What do you need to do to your data in order to run your model and achieve your outcomes? Data preparation includes data cleaning, feature selection, feature engineering, exploratory data analysis, and so on.
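Two small examples of what a preparation plan might contain are sketched below: one cleaning step (dropping incomplete records) and one feature engineering step (min-max scaling). Both helper names and the sample records are assumptions for illustration; your own plan will depend on your data and model.

```python
def drop_incomplete(rows, required):
    """Cleaning: keep only records that have a value for every required key."""
    return [row for row in rows if all(row.get(key) is not None for key in required)]


def min_max_scale(values):
    """Feature engineering: rescale numbers to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values are identical, so there is no spread to rescale.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical records; the second one is missing a value.
raw = [
    {"points": 2.0, "minutes": 10},
    {"points": None, "minutes": 25},
    {"points": 6.0, "minutes": 30},
    {"points": 4.0, "minutes": 20},
]
clean = drop_incomplete(raw, required=["points", "minutes"])
scaled_points = min_max_scale([row["points"] for row in clean])
print(len(clean), scaled_points)
```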

Bringing it all together

Once you finish brainstorming your ideas on the canvas, it’s time to bring it all together and activate your project! Refer to the order listed in the “Activation” section of the canvas as you bring your project to life.
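If you prefer to keep the canvas next to your code, one possible representation is a small dataclass with one field per step. This is only a sketch: the class name, field names, and placeholder text are assumptions, and it does not encode the order from the “Activation” section, which lives on the canvas itself.

```python
from dataclasses import dataclass, fields


@dataclass
class WorkflowCanvas:
    """One field per step of the canvas; an empty string means not filled in yet."""
    problem_statement: str = ""
    outcomes_predictions: str = ""
    data_sources: str = ""
    models: str = ""
    evaluation_metrics: str = ""
    data_preparation: str = ""

    def unfilled_sections(self):
        """List the steps that still need brainstorming."""
        return [f.name for f in fields(self) if not getattr(self, f.name).strip()]


# Placeholder text only; fill in your own answers for each step.
canvas = WorkflowCanvas(
    problem_statement="Placeholder: the problem and why it matters",
    data_sources="Placeholder: where the data comes from",
)
print(canvas.unfilled_sections())
```

Checking for unfilled sections before you start building is a quick way to confirm the brainstorming pass is complete.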

Data Science Workflow Canvas Example #1

Here’s an example of how I implemented the Data Science Workflow Canvas while working on my WNBA project.

Read the article about this project or follow along with the GitHub repository while referring to this canvas.

Data Science Workflow Canvas for the WNBA Clustering project.

Data Science Workflow Canvas Example #2

Here’s another example of how I implemented the Data Science Workflow Canvas while working on my fake news detector project.

Read the article about this project or follow along with the GitHub repository while referring to this canvas.

Data Science Workflow Canvas for the fake news detector project.

Conclusion

To implement this Data Science Workflow Canvas well, keep these three things in mind:

Don’t be afraid to make mistakes.

Use this canvas as a space to brainstorm your initial ideas, and come back to it to refine your process.

Keep what works and remove what doesn’t.

Focus on what you want to accomplish.

Even if your initial goals change and you have to go back and update them, stay focused on what you want to accomplish.

Data science is a nonlinear and iterative process.

There is no single correct or linear way to complete a data science project.

You can use this canvas as a resource to help you get started on a project, but it’s okay if you develop another process that works better for you.

Good luck on your data science projects, and may the canvas be with you!

*Content from the Data Science Workflow Canvas was inspired by notes taken during my time at General Assembly’s Data Science Immersive.

The structure of the canvas was inspired by the Business Model Canvas.

Jasmine Vasandani is a data scientist, strategist, and researcher.

You can learn more about her here: www.jasminev.co/
