1. Introduction

AI/ML is not a futuristic topic anymore. For example, the company Icons8 uses AI/ML to generate headshots, which put stock photo companies on notice (here).

Developers and BI analysts can use AI/ML tools now; they are no longer only for data scientists.

This blog post introduces ML.NET, a collection of cross-platform libraries and a CLI for creating and training ML models.

This post is a summary of the following video and this tutorial.

2. Important Concepts

  • ML helps solve challenging programming problems.
  • Examples of challenging programming problems:
    • Write a function that takes an image and tells whether there is a face in it.
    • Write a function that tells the price of a shirt based on its textual description.
    • Write a function that plays rock paper scissors against a machine (think about different skin tones, different hand sizes, etc.).
  • Machine Learning creates a function (called a model) from sample data. You train your own custom ML models to infuse machine learning into your applications.
    • The model is the function that a training algorithm produces from the data.
  • Interpreting the results is tricky. What do you do if you're told you're 82% healthy?
    A more useful answer would be, «here are the characteristics you can improve, and so on».
  • There is supervised and unsupervised training.
    • In supervised learning you use algorithms such as:
      • Regression.
      • Classification.
    • In unsupervised learning you use algorithms such as:
      • Clustering.
  • There are many out-of-the-box ML Trainers (described in the ML Tools section below).
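
To make the idea of "a function created from sample data" concrete, here is a minimal sketch in plain Python (not ML.NET). It assumes a single hypothetical numeric feature (fabric weight) instead of a textual description, and it uses ordinary least squares as the training algorithm. The data and names are invented for illustration only.

```python
from statistics import mean


def train(samples):
    """Fit y = slope * x + intercept by ordinary least squares.

    Returns the learned function, i.e. the "model".
    """
    xs = [x for x, _ in samples]
    ys = [y for _, y in samples]
    x_bar, y_bar = mean(xs), mean(ys)
    slope = (
        sum((x - x_bar) * (y - y_bar) for x, y in samples)
        / sum((x - x_bar) ** 2 for x in xs)
    )
    intercept = y_bar - slope * x_bar

    def model(x):
        return slope * x + intercept

    return model


# Hypothetical sample data: (fabric weight in grams, shirt price in dollars).
samples = [(100, 12.0), (200, 22.0), (300, 32.0), (400, 42.0)]

price_model = train(samples)   # training: data in, function out
print(price_model(250))        # running: ask the function about an unseen shirt
```

Nobody wrote the pricing rule by hand; the rule was derived from the samples. That is the difference between programming a function and training a model.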

3. Machine Learning Stages

There are two distinct stages: training the model and running it (using the trained model to make predictions).

ml-workflow.png
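
The split between the two stages can be sketched as follows. This is plain Python under simplifying assumptions, not ML.NET code: the "model" is just one learned average per label, it is stored as JSON, and the sample data is hypothetical.

```python
import json
import os
import tempfile


def train(samples):
    """Training stage: learn one average value per label (nearest-centroid model)."""
    totals = {}
    for value, label in samples:
        running_sum, count = totals.get(label, (0.0, 0))
        totals[label] = (running_sum + value, count + 1)
    return {label: s / n for label, (s, n) in totals.items()}


def save_model(model, path):
    with open(path, "w", encoding="utf-8") as f:
        json.dump(model, f)


def load_model(path):
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def predict(model, value):
    """Running stage: pick the label whose learned average is closest."""
    return min(model, key=lambda label: abs(model[label] - value))


# Training: done once, offline. Hypothetical data: (message length, label).
samples = [(12, "ham"), (18, "ham"), (90, "spam"), (110, "spam")]
model_path = os.path.join(tempfile.mkdtemp(), "model.json")
save_model(train(samples), model_path)

# Running: done many times, inside the application, with no training data around.
loaded = load_model(model_path)
print(predict(loaded, 95))
```

The running side only needs the saved file and the `predict` function, which is why the trained model can be shipped separately from the data it was trained on.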

4. Working Samples for Developers

Clone the ML.NET sample git repo (link here). Some facts about the code:

    • About Data Inputs:
      • In this example there are two datasets:
        • Training data (wikipedia-detox-250-line-data.tsv, the larger file), used to train your model.
        • Test data (wikipedia-detox-250-line-test.tsv), used to determine how well the model is performing.
      • There is also an input class called SentimentIssue that models the data in the file wikipedia-detox-250-line-test.tsv.
    • About Data Processing:
      • Then you need to transform the data (a.k.a. Featurize Text) so that ML algorithms can understand it.
        • In AI/ML, images, text, etc. are represented as arrays of numbers (more on that here).
        • transformers
      • Next, you apply an ML algorithm (a.k.a. a Trainer, see here).
      • ml_algos
      • How can you tell which one performs best for your scenario? ML.NET offers Model Builder, which can do it for you (more on that here).
      • auto_ml
    • Time for Predictions:
      • The whole point of building a model is to use it for predictions.
      • Finally, you evaluate the accuracy of the model using the test data.
      • For the record, a model is a zip file. The zip file is what you deploy to PROD.
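
The steps above (load data, featurize text, train, evaluate, package the model as a zip file) can be sketched end to end. This is a rough illustration in plain Python, not the ML.NET sample itself: the rows are hypothetical stand-ins for the TSV files, the featurizer is a simple word count, the trainer is a deliberately naive word-scoring rule, and the zip layout (`model.json`) is an assumption made for this sketch.

```python
import io
import json
import zipfile

# Hypothetical stand-ins for the train and test files: (label, text), 1 = toxic.
TRAIN = [
    (1, "you are a stupid idiot"),
    (1, "stupid awful comment"),
    (0, "thanks for the helpful edit"),
    (0, "great article thanks"),
]
TEST = [
    (1, "what an idiot"),
    (0, "helpful and great"),
]


def build_vocabulary(rows):
    words = sorted({w for _, text in rows for w in text.lower().split()})
    return {w: i for i, w in enumerate(words)}


def featurize(text, vocab):
    """Turn text into a fixed-length array of word counts."""
    vector = [0] * len(vocab)
    for w in text.lower().split():
        if w in vocab:
            vector[vocab[w]] += 1
    return vector


def train(rows, vocab):
    """Naive trainer: word weight = count in toxic rows minus count in the others."""
    weights = [0] * len(vocab)
    for label, text in rows:
        sign = 1 if label == 1 else -1
        for i, count in enumerate(featurize(text, vocab)):
            weights[i] += sign * count
    return weights


def predict(weights, vocab, text):
    score = sum(w * x for w, x in zip(weights, featurize(text, vocab)))
    return 1 if score > 0 else 0


def accuracy(weights, vocab, rows):
    hits = sum(predict(weights, vocab, text) == label for label, text in rows)
    return hits / len(rows)


def save_model_zip(weights, vocab):
    """Package everything needed for prediction into a single zip archive."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("model.json", json.dumps({"vocab": vocab, "weights": weights}))
    return buffer.getvalue()


vocab = build_vocabulary(TRAIN)
weights = train(TRAIN, vocab)
print("accuracy on test data:", accuracy(weights, vocab, TEST))
model_bytes = save_model_zip(weights, vocab)
```

Note that the test rows are never used for training. Evaluating on rows the model has already seen would overstate how well it performs.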

5. ML Tools

There are lots of ML/AI tools. This post only provides a general introduction to two of them: ML.NET and BigQuery ML.

5.a ML.NET

There is a complete post about ML.NET [here]. In general, with ML.NET you can do the following tasks:

  • Automated Model Generation
    • Classification
      • Binary Classification
        • Sentiment Analysis
        • Spam Detection
        • Credit Card Fraud Detection
        • Heart Disease Prediction
      • Multi-class Classification
        • GitHub Labelers
        • Iris Flowers Classification
        • MNIST
        • Support Text Classification
    • Models for Regression (to answer questions like how much? how many?)
      • Price Prediction
      • Sales Forecasting
      • Demand Prediction
  • Recommendation
    • Product Recommendation
    • Movie Recommendation (Matrix Factorization)
    • Movie Recommendation (Field Aware Factorization Machines)
  • Clustering
    • Customer Segmentation
    • IRIS Flowers Clustering
  • Time Series Forecasting
  • Ranking, etc.
    • Rank Search Engine Results
  • Anomaly Detection
    • Sales Spike Detection
    • Power Anomaly Detection
    • Credit Card Fraud Detection
  • Computer Vision
    • Training
      • High Level API
      • Featurizer Estimator
      • Predictions (Pretrained TensorFlow model scoring)
    • Object Detection
  • Cross Cutting Scenarios

5.b BigQuery ML

In BigQuery ML, only two steps are needed: create a model, then make predictions with it. The syntax is based on SQL.

big_query_ML
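
As a rough sketch, the two steps look like the statements below, held here as Python strings so they can be passed to whatever BigQuery client you use. `CREATE MODEL` and `ML.PREDICT` are BigQuery ML constructs; every dataset, table, model, and column name is hypothetical, and actually running the statements requires a BigQuery project, which is outside the scope of this sketch.

```python
# Step 1: create (train) a model from a query result.
# All dataset, table, model, and column names below are hypothetical.
CREATE_MODEL_SQL = """
CREATE OR REPLACE MODEL `my_dataset.sentiment_model`
OPTIONS (model_type = 'logistic_reg', input_label_cols = ['label']) AS
SELECT label, comment_length
FROM `my_dataset.training_data`
"""

# Step 2: make predictions by querying the trained model.
PREDICT_SQL = """
SELECT *
FROM ML.PREDICT(
  MODEL `my_dataset.sentiment_model`,
  (SELECT comment_length FROM `my_dataset.new_comments`)
)
"""

for step, sql in enumerate((CREATE_MODEL_SQL, PREDICT_SQL), start=1):
    print(f"-- step {step}{sql}")
```

Both training and prediction are expressed as queries, so the data never has to leave the warehouse to be used for ML.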

Thanks,

Javier Caceres