Manceps


The Ideal Phases of Machine Learning Projects


Phase 1: Define ML use case

  • Identify and define the ML use case and problems to be solved
  • Define hypothesis
    • “Hypothesis” = potential pattern we expect to see in data
  • Define experiment(s) to validate hypothesis
  • Identify data source(s)
  • Agree on metrics to evaluate experiment(s)

 

Phase 2: Explore data

  • Describe data
  • Determine quality and cleanliness
  • Explore data through queries and visualization
  • Identify patterns, outliers in data
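
The describe-and-find-outliers steps above can be sketched with the Python standard library alone. This is a minimal illustration, not a prescribed method: the `daily_orders` sample is hypothetical, and Tukey's 1.5 × IQR rule is just one common convention for flagging outliers.

```python
import statistics


def describe(values):
    """Return basic summary statistics for a list of numbers."""
    return {
        "count": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),
        "min": min(values),
        "max": max(values),
    }


def iqr_outliers(values, k=1.5):
    """Return values outside [Q1 - k*IQR, Q3 + k*IQR] (Tukey's rule)."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


# Hypothetical sample: daily order counts with one suspicious spike.
daily_orders = [12, 15, 14, 13, 16, 15, 14, 95, 13, 12]

summary = describe(daily_orders)
outliers = iqr_outliers(daily_orders)

print(summary)
print("Outliers:", outliers)
```

On real projects the same checks are usually run with a dataframe library and plotted, but the logic stays the same: summarize first, then look at the values that fall far from the bulk of the data.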

 

Phase 3: Select algorithm

  • Research existing strategies and white papers
  • Select an algorithm based on hypothesis, type of features, patterns in data
    • Classification vs. regression
    • Supervised vs. unsupervised learning
    • Univariate or multivariate 
    • Time series
    • Deep neural network (DNN) vs. non-DNN
  • Assets: 
    • Algorithm cheat sheet: show algorithms for different use cases
    • Document selection and reason
    • Link farm to research papers and relevant external resources
  • Deliverables: document decisions related to algorithms 

 

Phase 4: Do feature engineering

  • Use domain knowledge to identify features
  • Transform raw data into features
  • Craft new features as needed
  • Remove redundant/duplicate features
  • Remove highly correlated features
  • Reduce dimensionality as required
  • Check for class imbalance
  • Check for data leakage
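
Two of the checks above, removing highly correlated features and checking for class imbalance, can be sketched as follows. The feature names, the sample values, and the 0.95 correlation threshold are assumptions chosen for illustration; the right threshold depends on the use case.

```python
import math
from collections import Counter


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        return 0.0  # a constant column has no defined correlation
    return cov / (spread_x * spread_y)


def drop_correlated(features, threshold=0.95):
    """Keep features in order, skipping any that correlate strongly
    (|r| > threshold) with a feature that was already kept."""
    kept = []
    for name, column in features.items():
        if all(abs(pearson(column, features[k])) <= threshold for k in kept):
            kept.append(name)
    return kept


def class_ratio(labels):
    """Ratio of the rarest class count to the most common class count."""
    counts = Counter(labels)
    return min(counts.values()) / max(counts.values())


# Hypothetical data: height appears twice in different units.
features = {
    "height_cm": [150, 160, 170, 180, 190],
    "height_in": [59.1, 63.0, 66.9, 70.9, 74.8],
    "age": [34, 21, 45, 29, 52],
}
labels = [0, 0, 0, 0, 1]

kept = drop_correlated(features)
ratio = class_ratio(labels)

print("Kept features:", kept)
print("Minority/majority ratio:", ratio)
```

A low ratio is a signal to consider resampling, class weights, or a metric other than accuracy before training.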

 

Phase 5: Build ML model

  • Select the datasets for training and testing
  • Write code for experiment
  • Build a model
  • Determine duration and the amount of data for the initial experiment
  • Determine whether the model meets ROI requirements and risk requirements
  • Tools: TensorFlow, Python libraries
  • Deliverable: TensorFlow code and trained model
  • Assets:
    • Code template for use cases
    • Reference architecture for IaaS solution
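
As a rough sketch of the first steps in this phase, the example below splits a dataset into training and test sets and fits a first model. It deliberately uses a closed-form straight-line fit as a stand-in for a TensorFlow model so that it runs with no dependencies; the synthetic `rows`, the 20% test fraction, and the fixed seed are all assumptions for illustration.

```python
import random


def train_test_split(rows, test_fraction=0.2, seed=42):
    """Shuffle a copy of the rows with a fixed seed, then split them."""
    if not 0 < test_fraction < 1:
        raise ValueError("test_fraction must be between 0 and 1")
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def fit_line(points):
    """Least-squares fit of y = slope * x + intercept."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Synthetic (x, y) pairs that follow y = 2x + 1 exactly.
rows = [(x, 2 * x + 1) for x in range(10)]

train, test = train_test_split(rows)
slope, intercept = fit_line(train)

# Mean absolute error on the held-out test rows.
test_mae = sum(abs(y - (slope * x + intercept)) for x, y in test) / len(test)

print(f"slope={slope:.2f} intercept={intercept:.2f} test MAE={test_mae:.4f}")
```

Fixing the seed keeps the split reproducible, which makes later iterations comparable with the initial experiment.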

 

Phase 6: Iterate to improve model performance

  • Evaluate the model result
  • Visualize the model result
  • Iterate and improve the result
  • Assets:
    • Troubleshooting guide for performance and testing techniques
    • TensorBoard internal asset
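
Evaluating a model result usually starts with a few standard metrics. The sketch below computes accuracy, precision, recall, and F1 for a binary classifier; the label and prediction lists are hypothetical, and the metrics that matter for a given project are the ones agreed on in Phase 1.

```python
def binary_metrics(y_true, y_pred):
    """Compute accuracy, precision, recall and F1 for 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if precision + recall
        else 0.0
    )
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical ground truth and model predictions.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

Recording these numbers after every iteration makes it easy to see whether a change actually improved the model.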

 

Phase 7: Present results, tell a story from the data

  • Present result: use data + visualization + narrative to tell a story
  • Tools: Slides, TensorBoard
  • Deliverables: Results report

 

Phase 8: Plan for deployment

  • Make a prediction on production data and build a business case for operationalizing it
  • Prepare performance and scale requirements for production
  • Prepare operationalization requirements for training and scoring
  • Prepare architecture for model training and retraining
  • Prepare architecture for prediction
  • Prepare work breakdown structure
  • Develop proposed timelines for training and retraining the model
  • Prepare a plan for rollout and the success criteria for increasing traffic
  • Tools: Dataflow, BigQuery, Google Cloud Storage
  • Deliverables: Architecture design document, work breakdown structure (WBS)
  • Assets:
    • Deployment plan
    • Testing scripts, guide

 

Phase 9: Deploy and operationalize the model

  • Convert the model into an API
  • Build dataset training and scoring architecture
  • Consume the model in business application(s)
  • Build an automated test
  • Build the feedback loop
  • Assets:
    • Operations Guide
    • Configuration scripts for API
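
One way to picture "convert the model into an API" is a request handler that accepts JSON, calls the model, and returns JSON. The sketch below is framework-agnostic and only illustrative: `predict` is a hypothetical stand-in for a trained model, and the payload shape and feature names (`age`, `income`) are assumptions.

```python
import json


def predict(features):
    """Hypothetical stand-in for a trained model: a fixed linear score."""
    weights = {"age": 0.02, "income": 0.00001}
    return sum(weights[name] * float(features[name]) for name in weights)


def handle_request(body):
    """Take a JSON request body and return (status_code, json_response)."""
    try:
        payload = json.loads(body)
        score = predict(payload["features"])
    except (ValueError, KeyError, TypeError):
        # Malformed JSON, missing fields, or non-numeric values.
        error = {"error": "body must be JSON with numeric 'age' and 'income' features"}
        return 400, json.dumps(error)
    return 200, json.dumps({"score": score})


ok_status, ok_body = handle_request(
    '{"features": {"age": 40, "income": 50000}}'
)
bad_status, bad_body = handle_request("not json")

print(ok_status, ok_body)
print(bad_status, bad_body)
```

Keeping the handler a plain function makes the automated test straightforward: feed it known request bodies and assert on the status code and score, independent of whichever web framework or serving platform wraps it in production.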

 

Phase 10: Integrate with business, and monitor

  • Business processes rely on the ML model
  • Data analysis and feedback loop
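
Monitoring can start with a simple check that production inputs still resemble the training data. The sketch below measures how far the mean of a live feature has moved from the training mean, in units of the training standard deviation. The sample values and the alert threshold of 3 are assumptions for illustration; production systems typically use richer drift tests.

```python
import statistics


def mean_shift(train_values, live_values):
    """Absolute shift of the live mean from the training mean,
    expressed in training standard deviations."""
    mu = statistics.mean(train_values)
    sigma = statistics.stdev(train_values)
    diff = abs(statistics.mean(live_values) - mu)
    if sigma == 0:
        return 0.0 if diff == 0 else float("inf")
    return diff / sigma


# Hypothetical feature values seen in training and in production.
train_values = [10, 12, 11, 13, 12, 11, 10, 13]
live_stable = [11, 12, 11, 12]
live_drifted = [18, 19, 17, 20]

ALERT_THRESHOLD = 3.0  # assumed cut-off for raising an alert

shift_stable = mean_shift(train_values, live_stable)
shift_drifted = mean_shift(train_values, live_drifted)

for name, shift in [("stable", shift_stable), ("drifted", shift_drifted)]:
    status = "ALERT" if shift > ALERT_THRESHOLD else "ok"
    print(f"{name}: shift={shift:.2f} -> {status}")
```

An alert from a check like this is a natural trigger for the feedback loop: inspect the new data, then decide whether the model needs retraining.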

03.06.2020
