11 Machine Learning System Design

Prioritizing What to Work On

How to strategize putting together a complex machine learning system.

It’s hard to tell which of the many options is the best use of your time.

In fact, if you even get to the stage where you brainstorm a list of different options to try, you’re probably already ahead of the curve.

We need a more systematic way to choose among the many options.

Error Analysis

If you’re starting work on a machine learning product or application, it is often considered very good practice to start not by building a very complicated system with lots of complex features, but by building a very simple algorithm that you can implement quickly.

It’s often by implementing even a very quick and dirty version and plotting its learning curves that you learn enough to make these decisions.

This process often inspires you to design new features. It also shows you the current shortcomings of the system and gives you the inspiration you need to come up with improvements to it.
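As a rough illustration of what goes into a learning curve, the sketch below tabulates training error and cross validation error as the training set grows. The one-feature least-squares model, the toy data, and the helper names (`fit_line`, `mse`, `learning_curve`) are all assumptions made for this example, not part of the original notes.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = intercept + slope * x (one feature, closed form)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs)
    if var_x == 0:
        # A single point (or identical x values): fall back to a constant model.
        return mean_y, 0.0
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / var_x
    return mean_y - slope * mean_x, slope


def mse(params, xs, ys):
    """Mean squared error of the fitted line on the given examples."""
    intercept, slope = params
    return sum((intercept + slope * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def learning_curve(train, cv):
    """Return (m, training error, cross validation error) for m = 1..len(train)."""
    cv_x = [x for x, _ in cv]
    cv_y = [y for _, y in cv]
    rows = []
    for m in range(1, len(train) + 1):
        xs = [x for x, _ in train[:m]]
        ys = [y for _, y in train[:m]]
        params = fit_line(xs, ys)
        rows.append((m, mse(params, xs, ys), mse(params, cv_x, cv_y)))
    return rows


# Toy data, assumed for illustration: y is roughly equal to x.
train = [(0, 0.1), (1, 0.9), (2, 2.2), (3, 2.8), (4, 4.1), (5, 5.0)]
cv = [(0.5, 0.5), (2.5, 2.5), (4.5, 4.5)]

curve = learning_curve(train, cv)
for m, train_err, cv_err in curve:
    print(f"m={m}  train error={train_err:.3f}  cv error={cv_err:.3f}")
```

Plotting the two error columns against m gives the learning curve. Here the cross validation error falls quickly as examples are added, which is the kind of signal that helps decide what to try next.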

Recommended way:

  1. Start by building a very simple algorithm.
  2. Plot a learning curve.
  3. Do error analysis, using a single real-number evaluation metric.

It is strongly recommended to do error analysis on the cross validation set rather than the test set.
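A minimal sketch of error analysis on the cross validation set might look like the following. It computes one number (the misclassification rate) and counts the misclassified examples per category. The spam-style toy data, the `baseline` classifier, and the `categorize` helper are hypothetical and only serve the illustration.

```python
from collections import Counter


def cv_error(predict, cv_set):
    """Single real-number metric: fraction of CV examples predicted wrongly."""
    wrong = sum(1 for x, y in cv_set if predict(x) != y)
    return wrong / len(cv_set)


def misclassified_by_category(predict, cv_set, categorize):
    """Count misclassified CV examples per manually chosen category."""
    return Counter(categorize(x) for x, y in cv_set if predict(x) != y)


# Toy cross validation set (assumed): label 1 = spam, 0 = not spam.
cv_set = [
    ("buy cheap meds now", 1),
    ("cheap watches", 1),
    ("your password expires, click here", 1),
    ("meeting at noon", 0),
    ("cheap flights for our trip", 0),
]


def baseline(text):
    # Quick and dirty first model: flag anything containing "cheap".
    return 1 if "cheap" in text else 0


def categorize(text):
    # Hypothetical categories assigned while inspecting the errors by hand.
    return "phishing" if "password" in text else "other"


error = cv_error(baseline, cv_set)
errors_by_category = misclassified_by_category(baseline, cv_set, categorize)
print(error, dict(errors_by_category))
```

The single number makes it easy to compare two versions of the algorithm, while the per-category counts suggest which kind of error is worth working on first.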

Error Metrics for Skewed Classes

It’s particularly tricky to come up with an appropriate error metric, or evaluation metric, for your learning algorithm when the classes are skewed.

Matrix                     Prediction Value
                           Positive    Negative
Actual Value   Negative    FP          TN
               Positive    TP          FN
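To see why plain accuracy can mislead on skewed classes, here is a small Python sketch that tallies the four cells of the matrix. The toy data (2 positives out of 100 examples) and the classifier that always predicts negative are assumptions made for illustration.

```python
def confusion_counts(actual, predicted):
    """Count TP, FP, TN, FN for binary labels (1 = positive, 0 = negative)."""
    counts = {"TP": 0, "FP": 0, "TN": 0, "FN": 0}
    for a, p in zip(actual, predicted):
        if p == 1 and a == 1:
            counts["TP"] += 1
        elif p == 1 and a == 0:
            counts["FP"] += 1
        elif p == 0 and a == 0:
            counts["TN"] += 1
        else:
            counts["FN"] += 1
    return counts


# Skewed toy data: only 2 of 100 examples are positive.
actual = [1] * 2 + [0] * 98
# A "classifier" that ignores its input and always predicts negative.
predicted = [0] * 100

counts = confusion_counts(actual, predicted)
accuracy = (counts["TP"] + counts["TN"]) / len(actual)
print(counts, accuracy)
```

The always-negative classifier reaches 98% accuracy on this data while finding none of the positive cases (TP = 0, FN = 2), which is why accuracy alone is a poor metric here.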

Trading Off Precision and Recall

\(Precision = TP / (TP + FP)\)
  • If you want to predict positive only when you’re more confident, you end up with a classifier that has higher precision.
\(Recall = TP / (TP + FN)\)
  • If you want to avoid missing too many actual cases, that is, to avoid false negatives, you end up with a classifier that has higher recall.
\(F_1\ Score = 2 \frac{PR}{P + R}\), where P is precision and R is recall
  • When comparing classifiers or thresholds, one option is to choose the one with the higher F1 score.
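The trade-off can be sketched in Python by sweeping a decision threshold over predicted scores and computing the three quantities above. The toy labels, the scores, and the candidate thresholds are assumptions for this example. Returning 0 when a denominator is 0 is also a convention chosen here, not something the notes specify.

```python
def precision_recall_f1(actual, scores, threshold):
    """Predict positive when score >= threshold, then compute P, R and F1."""
    predicted = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 1)
    fp = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 1)
    fn = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 0)
    # Convention assumed here: a metric with a zero denominator counts as 0.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Toy cross validation labels and classifier scores (assumed).
actual = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.6, 0.4, 0.7, 0.3, 0.2, 0.1, 0.05]

thresholds = [0.35, 0.5, 0.8]
results = {t: precision_recall_f1(actual, scores, t) for t in thresholds}
best_threshold = max(thresholds, key=lambda t: results[t][2])

for t in thresholds:
    p, r, f1 = results[t]
    print(f"threshold={t}  precision={p:.2f}  recall={r:.2f}  F1={f1:.2f}")
```

On this data the high threshold (0.8) gives precision 1.0 but recall of only 1/3, while the low threshold (0.35) gives recall 1.0 with precision 0.75. The low threshold also has the highest F1, so it is the one picked.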

Data For Machine Learning

In some cases, I had cautioned against blindly going out and spending lots of time collecting lots of data, because that only sometimes actually helps.

It is often said in machine learning that it’s not who has the best algorithm that wins, it’s who has the most data.

If you have a lot of data and you train a learning algorithm with a lot of parameters, that might be a good way to get a high performance learning algorithm.

The key:

  1. Find features x that carry enough information to confidently predict the value of y.
  2. Actually get a large training set, and train a learning algorithm with a lot of parameters on that training set.

If you can do both, then you will often get a very high performance learning algorithm.
