One of the ways a machine learns is through Supervised Learning: the machine is given input data along with the expected output, and it is expected to generate a rule. Once the rule is generated, the machine uses it to solve similar problems. One of the ways a machine generates the needed rule for a given set of data and outputs is by optimising using Linear Programming. Let us see this mechanism in action.
Consider that we set up an experiment to determine the length of the hypotenuse of right-angled triangles. We could do this by drawing triangles on a sheet of paper and measuring the hypotenuse with a scale. From the measurements obtained in the experiment, we give the following data and output to the machine.
Here, Side A and Side B are the features and the Hypotenuse is the target variable.

| Side A (A) | Side B (B) | Hypotenuse (C) |
|---|---|---|
| 4 | 2 | 4.45 |
| 6 | 6 | 8.50 |
| 3 | 9 | 9.50 |
| 7 | 5 | 8.60 |
| 2 | 12 | 12.20 |
| 3 | 9 | 9.40 |
| 4 | 7 | 8.00 |

Table 1: Results from the Experiment
To analyse the experiment data in Table 1, the Data Scientist creates a visualisation as shown in Figure 1.
Figure 1: Visualisation of Experiment Data
Suppose the Data Scientist has a hunch that there might be a linear relationship between the squares of A, B and C, i.e., between A², B² and C². So, the Data Scientist instructs the machine to create a linear model based on the experiment results in Table 1. The machine therefore tries to create an equation of the form:

C² = β0 + β1·A² + β2·B² … (eq1)

To use eq1, the machine has the values of A and B. So, the machine needs to find the values of β0, β1 and β2, estimate C as the square root of the right-hand side, and judge how good the estimates are by comparing them with the measured values of C. The Data Scientist provides the initial values of β0, β1 and β2 as 5, 5 and 5 respectively. With these, the machine calculates the estimates of C shown in Table 2.
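As a quick check of eq1, consider the first row of Table 1 (A = 4, B = 2) with the initial values. A minimal sketch in Python (the variable names are my own):

```python
import math

# First row of Table 1: A = 4, B = 2, with initial beta0 = beta1 = beta2 = 5.
# eq1 models C^2 = beta0 + beta1*A^2 + beta2*B^2, so the estimate of C is
# the square root of that sum.
c_squared = 5 + 5 * 4**2 + 5 * 2**2   # 5 + 80 + 20 = 105
c_estimate = math.sqrt(c_squared)
print(round(c_estimate, 8))  # 10.24695077, the first estimate in Table 2
```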
Here A and B are the original features, C is the target variable, and A² and B² are the engineered features.

| Side A (A) | Side B (B) | Hypotenuse (C) | A² | B² | First Estimate (β0 = 5, β1 = 5, β2 = 5) |
|---|---|---|---|---|---|
| 4 | 2 | 4.45 | 16 | 4 | 10.24695077 |
| 6 | 6 | 8.50 | 36 | 36 | 19.10497317 |
| 3 | 9 | 9.50 | 9 | 81 | 21.33072901 |
| 7 | 5 | 8.60 | 49 | 25 | 19.36491673 |
| 2 | 12 | 12.20 | 4 | 144 | 27.29468813 |
| 3 | 9 | 9.40 | 9 | 81 | 21.33072901 |
| 4 | 7 | 8.00 | 16 | 49 | 18.16590212 |

Table 2: Initial Estimates made by the machine
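The whole First Estimate column of Table 2 can be reproduced with a short script. This is a sketch (the function and variable names are my own):

```python
import math

# Experiment data from Table 1: (Side A, Side B, measured Hypotenuse C)
data = [(4, 2, 4.45), (6, 6, 8.50), (3, 9, 9.50), (7, 5, 8.60),
        (2, 12, 12.20), (3, 9, 9.40), (4, 7, 8.00)]

def estimate_c(a, b, b0=5, b1=5, b2=5):
    """Estimate C from eq1: C^2 = b0 + b1*A^2 + b2*B^2."""
    return math.sqrt(b0 + b1 * a**2 + b2 * b**2)

# Print each row of Table 2: features, target, engineered features, estimate
for a, b, c in data:
    print(a, b, c, a**2, b**2, round(estimate_c(a, b), 8))
```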
We plot the machine’s estimates to see how it did (Figure 2).
Figure 2: First Estimate made by the machine
The machine realises that the estimates are not good. So, the machine determines the error in each estimate. The errors may be positive or negative, so they could cancel each other out when added, and the net effect of the error would be lost. Realising this, the machine squares the errors so that there are only positive numbers. This is shown in Table 3.
| Side A (A) | Side B (B) | Hypotenuse (C) | A² | B² | First Estimate (β0 = 5, β1 = 5, β2 = 5) | E | E² |
|---|---|---|---|---|---|---|---|
| 4 | 2 | 4.45 | 16 | 4 | 10.24695077 | -5.80 | 33.60 |
| 6 | 6 | 8.50 | 36 | 36 | 19.10497317 | -10.60 | 112.47 |
| 3 | 9 | 9.50 | 9 | 81 | 21.33072901 | -11.83 | 139.97 |
| 7 | 5 | 8.60 | 49 | 25 | 19.36491673 | -10.76 | 115.88 |
| 2 | 12 | 12.20 | 4 | 144 | 27.29468813 | -15.09 | 227.85 |
| 3 | 9 | 9.40 | 9 | 81 | 21.33072901 | -11.93 | 142.34 |
| 4 | 7 | 8.00 | 16 | 49 | 18.16590212 | -10.17 | 103.35 |

Table 3: Calculation of the error in estimate
The machine needs to minimise the error. So, the machine calculates the mean of all the squared errors, i.e., the mean of the E² column. This is called the Mean Squared Error (MSE). Here, MSE = 125.07.
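The error and MSE calculation can be sketched in Python as well; this is a minimal reproduction of Table 3 and the MSE (the names are my own):

```python
import math

# Experiment data from Table 1: (Side A, Side B, measured Hypotenuse C)
data = [(4, 2, 4.45), (6, 6, 8.50), (3, 9, 9.50), (7, 5, 8.60),
        (2, 12, 12.20), (3, 9, 9.40), (4, 7, 8.00)]

def mse(b0, b1, b2):
    """Mean of the squared errors E^2, where E is the measured C minus the
    model's estimate sqrt(b0 + b1*A^2 + b2*B^2)."""
    errors = [c - math.sqrt(b0 + b1 * a**2 + b2 * b**2) for a, b, c in data]
    return sum(e * e for e in errors) / len(errors)

print(round(mse(5, 5, 5), 2))  # 125.07, as in the text
```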
Now, this is an optimisation problem where MSE needs to be minimised by changing the values of β0, β1 and β2. The Data Scientist sets this up as a Linear Programming Problem (LPP) and solves it using Excel, as shown in Figure 3.
Figure 3: The setup of the Linear Programming Problem
Solving this LPP, we get the values of β0, β1 and β2 as shown in Figure 4.
Figure 4: Solution obtained after optimisation
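The article solves the problem in Excel; for readers who prefer code, an equivalent minimisation can be sketched with SciPy's general-purpose optimiser (an assumption: SciPy is installed; the variable names are my own):

```python
import numpy as np
from scipy.optimize import minimize

# Experiment data from Table 1
A = np.array([4, 6, 3, 7, 2, 3, 4], dtype=float)
B = np.array([2, 6, 9, 5, 12, 9, 7], dtype=float)
C = np.array([4.45, 8.50, 9.50, 8.60, 12.20, 9.40, 8.00])

def mse(beta):
    b0, b1, b2 = beta
    s = b0 + b1 * A**2 + b2 * B**2    # eq1: the linear model for C^2
    if np.any(s < 0):                 # keep the square root defined
        return 1e9
    return np.mean((C - np.sqrt(s)) ** 2)

# Start from the initial guess beta0 = beta1 = beta2 = 5 and minimise the MSE
result = minimize(mse, x0=[5.0, 5.0, 5.0], method="Nelder-Mead")
print(result.x)    # optimised beta0, beta1, beta2
print(result.fun)  # minimised MSE, a tiny fraction of the initial 125.07
```

Since the underlying relationship is Pythagoras’ theorem, the optimised values should come out close to β0 ≈ 0, β1 ≈ 1 and β2 ≈ 1.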
Lastly, let us see how well our machine learnt.
Figure 5: Predictions by the machine at the end of Supervised Learning
Conclusion
Machines learn through mathematical models. When a machine learning mathematical model is optimised, it becomes a machine learning algorithm. One of the ways to optimise such a model is by using Linear Programming. To learn Linear Programming through a practical example, read “Linear Programming for Project Management Professionals”.
I request you to kindly leave a reply.