Car mileage prediction with ANFIS
This slide show addresses the use of the ANFIS function in the Fuzzy Logic Toolbox to predict the MPG (miles per gallon) of a given automobile.
Copyright 1994-2002 The MathWorks, Inc. $Revision: 1.9 $
Automobile MPG (miles per gallon) prediction is a typical nonlinear regression problem, in which several attributes of an automobile's profile information are used to predict another continuous attribute, that is, the fuel consumption in MPG. The training data is available from the UCI (Univ. of California at Irvine) Machine Learning Repository (http://www.ics.uci.edu/~mlearn/MLRepository.html). It contains data collected from automobiles of various manufacturers and models, as shown in the next slide.
The table shown above lists several tuples from the MPG data set. The six input attributes are no. of cylinders, displacement, horsepower, weight, acceleration, and model year; the output variable to be predicted is the fuel consumption in MPG. (The automobile's manufacturers and models in the first column of the table are not used for prediction.) The data set is obtained from the original data file 'auto-gas.dat'. We then partition the data set into a training set (odd-indexed tuples) and a checking set (even-indexed tuples), and use the function 'exhsrch' to find the input attributes that have better prediction power for ANFIS modeling.
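% Display the sample table, then load the data and split it.
a = imread('gasdata.jpg', 'jpg');
image(a); colormap(gray); axis image; axis off
[data, input_name] = loadgas;
trn_data = data(1:2:end, :);   % odd-indexed tuples
chk_data = data(2:2:end, :);   % even-indexed tuples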
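The odd/even partition can be illustrated outside MATLAB as well. The following is a minimal Python sketch, not part of the original demo; the function name `split_odd_even` and the placeholder rows are assumptions made for illustration. The total of 392 tuples is inferred from the 196 training and 196 checking pairs that the ANFIS training log reports.

```python
def split_odd_even(rows):
    """Return (training, checking): 1-based odd rows and 1-based even rows."""
    training = rows[0::2]   # 1-based rows 1, 3, 5, ... (MATLAB data(1:2:end, :))
    checking = rows[1::2]   # 1-based rows 2, 4, 6, ... (MATLAB data(2:2:end, :))
    return training, checking

# Placeholder rows standing in for the data tuples; the values are just
# 1-based row numbers, not real measurements.
rows = list(range(1, 393))
trn, chk = split_odd_even(rows)
print(len(trn), len(chk))
```

Note that MATLAB indexing is 1-based while Python slicing is 0-based, so the "odd-indexed" training tuples start at Python index 0.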
To select the best input attribute, 'exhsrch' constructs six ANFIS models, each with a single input attribute. Here is the result of executing exhsrch(1, trn_data, chk_data, input_name). 'Weight' is the most influential input attribute (it has the lowest training error), 'Disp' is the second, and so on. The training and checking errors are comparable in size, which implies that there is no overfitting and that we can select more input variables. Intuitively, we could simply select 'Weight' and 'Disp' directly. However, this will not necessarily lead to a two-input ANFIS model with the minimal training error. To verify this, we can issue the command exhsrch(2, trn_data, chk_data, input_name) to select the best two inputs from all possible combinations.
exhsrch(1, trn_data, chk_data, input_name); win1 = gcf;
Train 6 ANFIS models, each with 1 inputs selected from 6 candidates...

ANFIS model 1: Cylinder --> trn=4.6400, chk=4.7255
ANFIS model 2: Disp --> trn=4.3106, chk=4.4316
ANFIS model 3: Power --> trn=4.5399, chk=4.1713
ANFIS model 4: Weight --> trn=4.2577, chk=4.0863
ANFIS model 5: Acceler --> trn=6.9789, chk=6.9317
ANFIS model 6: Year --> trn=6.2255, chk=6.1693
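An exhaustive search over input subsets has to consider every k-element combination of the six candidates. The Python sketch below is not the implementation of exhsrch; it only enumerates the candidate subsets with the standard-library `itertools.combinations`. The helper name `candidate_subsets` is an assumption. The counts it prints (6, 15, and 20) agree with the model counts that the exhsrch logs in this demo report for one, two, and three inputs.

```python
from itertools import combinations

# Short attribute names as printed in the exhsrch log.
names = ["Cylinder", "Disp", "Power", "Weight", "Acceler", "Year"]

def candidate_subsets(names, k):
    """All k-element input subsets, in the order combinations() yields them."""
    return list(combinations(names, k))

for k in (1, 2, 3):
    print(k, len(candidate_subsets(names, k)))

pairs = candidate_subsets(names, 2)
print(pairs[0], pairs[-1])
```

Because the number of subsets grows combinatorially with the number of candidates and selected inputs, training each candidate for only a short time is a natural way to keep such a search affordable.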
The figure shows the result of selecting two inputs. 'Weight' and 'Year' are selected as the best two input variables, which is quite reasonable. The training and checking errors are beginning to diverge, indicating the onset of overfitting. As a comparison, let us use exhsrch to select three inputs.
% Pick the best two inputs and keep only those columns plus the output column.
input_index = exhsrch(2, trn_data, chk_data, input_name);
new_trn_data = trn_data(:, [input_index, size(trn_data,2)]);
new_chk_data = chk_data(:, [input_index, size(chk_data,2)]);
win2 = gcf;
Train 15 ANFIS models, each with 2 inputs selected from 6 candidates...

ANFIS model 1: Cylinder Disp --> trn=3.9320, chk=4.7920
ANFIS model 2: Cylinder Power --> trn=3.7364, chk=4.8683
ANFIS model 3: Cylinder Weight --> trn=3.8741, chk=4.6763
ANFIS model 4: Cylinder Acceler --> trn=4.3287, chk=5.9625
ANFIS model 5: Cylinder Year --> trn=3.7129, chk=4.5946
ANFIS model 6: Disp Power --> trn=3.8087, chk=3.8594
ANFIS model 7: Disp Weight --> trn=4.0271, chk=4.6350
ANFIS model 8: Disp Acceler --> trn=4.0782, chk=4.4890
ANFIS model 9: Disp Year --> trn=2.9565, chk=3.3905
ANFIS model 10: Power Weight --> trn=3.9310, chk=4.2976
ANFIS model 11: Power Acceler --> trn=4.2740, chk=3.8738
ANFIS model 12: Power Year --> trn=3.3796, chk=3.3505
ANFIS model 13: Weight Acceler --> trn=4.0875, chk=4.0095
ANFIS model 14: Weight Year --> trn=2.7657, chk=2.9953
ANFIS model 15: Acceler Year --> trn=5.6242, chk=5.6481
The figure that pops up shows the result of selecting three inputs, in which 'Weight', 'Year', and 'Acceler' are selected as the best three input variables. However, the minimal training (and checking) errors are not significantly lower than those of the best two-input model, which indicates that the newly added attribute 'Acceler' does not improve the prediction much. For better generalization, we always prefer a model with a simple structure. Therefore we will stick to the two-input ANFIS for further exploration.
exhsrch(3, trn_data, chk_data, input_name); win3 = gcf;
Train 20 ANFIS models, each with 3 inputs selected from 6 candidates...

ANFIS model 1: Cylinder Disp Power --> trn=3.4446, chk=11.5329
ANFIS model 2: Cylinder Disp Weight --> trn=3.6686, chk=4.8922
ANFIS model 3: Cylinder Disp Acceler --> trn=3.6610, chk=5.2384
ANFIS model 4: Cylinder Disp Year --> trn=2.5463, chk=4.9001
ANFIS model 5: Cylinder Power Weight --> trn=3.4797, chk=9.3761
ANFIS model 6: Cylinder Power Acceler --> trn=3.5432, chk=4.4804
ANFIS model 7: Cylinder Power Year --> trn=2.6300, chk=3.6300
ANFIS model 8: Cylinder Weight Acceler --> trn=3.5708, chk=4.8378
ANFIS model 9: Cylinder Weight Year --> trn=2.4951, chk=4.0435
ANFIS model 10: Cylinder Acceler Year --> trn=3.2698, chk=6.2616
ANFIS model 11: Disp Power Weight --> trn=3.5879, chk=7.4948
ANFIS model 12: Disp Power Acceler --> trn=3.5395, chk=3.9953
ANFIS model 13: Disp Power Year --> trn=2.4607, chk=3.3563
ANFIS model 14: Disp Weight Acceler --> trn=3.6075, chk=4.2318
ANFIS model 15: Disp Weight Year --> trn=2.5617, chk=3.7865
ANFIS model 16: Disp Acceler Year --> trn=2.4149, chk=3.2480
ANFIS model 17: Power Weight Acceler --> trn=3.7884, chk=4.0480
ANFIS model 18: Power Weight Year --> trn=2.4371, chk=3.2852
ANFIS model 19: Power Acceler Year --> trn=2.7276, chk=3.2580
ANFIS model 20: Weight Acceler Year --> trn=2.3603, chk=2.9152
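The case for stopping at two inputs can be made concrete with the errors reported in the three exhsrch logs. The Python sketch below is an illustration only; the `relative_drop` helper is an assumed name, and the numbers are copied from the best model in each log (lowest training error). It prints how much the checking error falls when one more input is added, together with the gap between checking and training error.

```python
# Best model per input count, copied from the exhsrch logs in this demo:
# (inputs, training RMSE, checking RMSE)
best = [
    (("Weight",), 4.2577, 4.0863),
    (("Weight", "Year"), 2.7657, 2.9953),
    (("Weight", "Acceler", "Year"), 2.3603, 2.9152),
]

def relative_drop(previous, current):
    """Fractional reduction in error when going from previous to current."""
    return (previous - current) / previous

for prev, cur in zip(best, best[1:]):
    drop = relative_drop(prev[2], cur[2])
    gap = cur[2] - cur[1]  # checking minus training error
    print(f"{len(cur[0])} inputs: checking error drops {drop:.1%}, gap {gap:.4f}")
```

Going from one to two inputs cuts the checking error by roughly a quarter, while the third input reduces it by only a few percent and more than doubles the gap between checking and training error.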
The input-output surface of the best two-input ANFIS model for MPG prediction is shown above. It is a nonlinear and monotonic surface, in which the predicted MPG decreases as 'Weight' increases and increases as 'Year' increases. The training RMSE (root mean squared error) is 2.766; the checking RMSE is 2.995. In comparison, a simple linear regression using all input candidates results in a training RMSE of 3.453 and a checking RMSE of 3.444.
% Close the figures opened by exhsrch.
if ishandle(win1), delete(win1); end
if ishandle(win2), delete(win2); end
if ishandle(win3), delete(win3); end

% Train the two-input ANFIS for one epoch.
in_fis=genfis1(new_trn_data);
mf_n = 2;
mf_type = 'gbellmf';
epoch_n = 1;
ss = 0.01;
ss_dec_rate = 0.5;
ss_inc_rate = 1.5;
in_fismat = genfis1(new_trn_data, mf_n, mf_type);
[trn_out_fismat trn_error step_size chk_out_fismat chk_error] = ...
    anfis(new_trn_data, in_fismat, [epoch_n nan ss ss_dec_rate ss_inc_rate], nan, new_chk_data, 1);

% Label the inputs and the output, then plot the input-output surface.
for i=1:length(input_index),
    chk_out_fismat = setfis(chk_out_fismat, 'input', i, 'name', deblank(input_name(input_index(i), :)));
end
chk_out_fismat = setfis(chk_out_fismat, 'output', 1, 'name', deblank(input_name(size(input_name, 1), :)));
gensurf(chk_out_fismat);
colormap('default');
set(gca, 'box', 'on');
view(-22, 36);

% Linear regression on all six inputs plus a constant term.
% Statements without a trailing semicolon echo their values.
fprintf('\nLinear regression with parameters:\n');
param= size(trn_data,2)
A_trn = [trn_data(:, 1:size(data,2)-1) ones(size(trn_data,1), 1)];
B_trn = trn_data(:, size(data,2));
coef = A_trn\B_trn;
trn_error = norm(A_trn*coef-B_trn)/sqrt(size(trn_data,1));
A_chk = [chk_data(:, 1:size(data,2)-1) ones(size(chk_data,1), 1)];
B_chk = chk_data(:, size(data,2));
chk_error = norm(A_chk*coef-B_chk)/sqrt(size(chk_data,1));
fprintf('\nRMSE for training data: ');
RMSE=trn_error
fprintf('\nRMSE for checking data: ');
RMSE = chk_error
ANFIS info:
Number of nodes: 21
Number of linear parameters: 12
Number of nonlinear parameters: 12
Total number of parameters: 24
Number of training data pairs: 196
Number of checking data pairs: 196
Number of fuzzy rules: 4

Start training ANFIS ...

1 2.7657 2.99534

Designated epoch number reached --> ANFIS training completed at epoch 1.

Linear regression with parameters:
param = 7

RMSE for training data:
RMSE = 3.4527

RMSE for checking data:
RMSE = 3.4444
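The linear-regression baseline fits coefficients on the training set by least squares and then reports the RMSE, computed as the norm of the residual divided by the square root of the number of samples, on both sets. The Python sketch below shows the same idea in a deliberately reduced form: a single predictor plus an intercept instead of the demo's six inputs plus a constant. The data pairs and the helper names `fit_line` and `rmse` are hypothetical and are not taken from the MPG data set.

```python
import math

def fit_line(x, y):
    """Least-squares slope and intercept for y ~ slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def rmse(predicted, target):
    """Root mean squared error, i.e. norm(residual) / sqrt(n)."""
    n = len(target)
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(predicted, target)) / n)

# Hypothetical (weight in 1000 lb, MPG) pairs, NOT taken from the data set.
trn_x, trn_y = [2.0, 2.5, 3.0, 3.5, 4.0], [32.0, 27.0, 23.0, 19.0, 16.0]
chk_x, chk_y = [2.2, 3.2, 4.2], [30.0, 21.0, 15.0]

# Fit on the training pairs only, then score both sets with the same line.
slope, intercept = fit_line(trn_x, trn_y)
trn_rmse = rmse([slope * v + intercept for v in trn_x], trn_y)
chk_rmse = rmse([slope * v + intercept for v in chk_x], chk_y)
print(f"trn RMSE {trn_rmse:.3f}, chk RMSE {chk_rmse:.3f}")
```

As in the demo, the coefficients come from the training set alone, so the checking RMSE measures how well the fitted model generalizes.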
The function exhsrch trains each ANFIS for only a single epoch so that it can find the right inputs quickly. Now that the inputs are fixed, we can spend more time on ANFIS training. The plot above shows the error curves for 100 epochs of ANFIS training. The green curve is the training error; the red one is the checking error. The minimal checking error, indicated by a circle, occurs at about epoch 50 (epoch 53 in the log below). Notice that the checking error curve rises after that, indicating that further training overfits the data and produces worse generalization.
watchon;
epoch_n = 100;   % train for 100 epochs this time
[trn_out_fismat trn_error step_size chk_out_fismat chk_error] = ...
    anfis(new_trn_data, in_fismat, [epoch_n nan ss ss_dec_rate ss_inc_rate], nan, new_chk_data, 1);
[a, b] = min(chk_error);   % a: minimal checking error, b: its epoch
plot(1:epoch_n, trn_error, 'g-', 1:epoch_n, chk_error, 'r-', b, a, 'ko');
axis([-inf inf -inf inf]);
title('Training (green) and checking (red) error curve');
xlabel('Epoch numbers');
ylabel('RMS errors');
watchoff;
ANFIS info:
Number of nodes: 21
Number of linear parameters: 12
Number of nonlinear parameters: 12
Total number of parameters: 24
Number of training data pairs: 196
Number of checking data pairs: 196
Number of fuzzy rules: 4

Start training ANFIS ...

1 2.7657 2.99534
2 2.7654 2.9953
3 2.7651 2.99523
4 2.76479 2.99518
5 2.76449 2.99511
Step size increases to 0.015000 after epoch 5.
6 2.76418 2.99503
7 2.76372 2.99493
8 2.76325 2.9949
9 2.76278 2.99477
Step size increases to 0.022500 after epoch 9.
10 2.7623 2.99469
11 2.76159 2.99447
12 2.76086 2.99438
13 2.76013 2.99426
Step size increases to 0.033750 after epoch 13.
14 2.75939 2.99405
15 2.75827 2.99386
16 2.75714 2.99358
17 2.75599 2.99339
Step size increases to 0.050625 after epoch 17.
18 2.75483 2.99326
19 2.75306 2.99298
20 2.75126 2.99261
21 2.74944 2.99233
Step size increases to 0.075938 after epoch 21.
22 2.7476 2.99207
23 2.7448 2.99169
24 2.74199 2.99135
25 2.73918 2.99102
Step size increases to 0.113906 after epoch 25.
26 2.73639 2.99082
27 2.73231 2.99054
28 2.72842 2.9902
29 2.72482 2.99007
Step size increases to 0.170859 after epoch 29.
30 2.72159 2.9899
31 2.71752 2.99
32 2.71441 2.99015
33 2.71202 2.98944
Step size increases to 0.256289 after epoch 33.
34 2.71005 2.98859
35 2.70749 2.9876
36 2.70515 2.98597
37 2.70277 2.98462
Step size increases to 0.384434 after epoch 37.
38 2.7001 2.98264
39 2.69559 2.98146
40 2.69272 2.98034
41 2.69265 2.98195
Step size increases to 0.576650 after epoch 41.
42 2.68943 2.9807
43 2.69947 2.98322
44 2.68677 2.98027
45 2.69536 2.98285
46 2.68347 2.97939
Step size decreases to 0.288325 after epoch 46.
47 2.69 2.98215
48 2.67722 2.98046
49 2.67455 2.9781
50 2.67212 2.98094
51 2.67049 2.9809
Step size increases to 0.432488 after epoch 51.
52 2.6699 2.98267
53 2.67101 2.97783
54 2.66589 2.98443
55 2.66175 2.98259
56 2.65602 2.99089
57 2.64908 2.9976
Step size increases to 0.648732 after epoch 57.
58 2.64504 2.99444
59 2.69116 2.98964
60 2.65037 2.98146
61 2.68259 2.99837
62 2.64958 2.98423
Step size decreases to 0.324366 after epoch 62.
63 2.68029 3.00159
64 2.62932 3.00066
65 2.62503 2.9973
66 2.6443 3.00182
67 2.62391 2.99736
68 2.64091 3.00233
69 2.62307 2.99737
Step size decreases to 0.162183 after epoch 69.
70 2.63843 3.00288
71 2.61774 3.00274
72 2.62032 2.99779
73 2.6167 3.00287
Step size decreases to 0.081091 after epoch 73.
74 2.61891 2.99806
75 2.61479 3.00121
76 2.61458 3.00276
77 2.61399 3.00091
78 2.61359 3.00289
Step size increases to 0.121637 after epoch 78.
79 2.61321 3.00099
80 2.61459 3.00368
81 2.6126 3.00101
82 2.61349 3.00398
83 2.61202 3.0011
Step size decreases to 0.060819 after epoch 83.
84 2.61258 3.00425
85 2.61029 3.00315
86 2.60979 3.00258
87 2.60961 3.00415
88 2.60923 3.00271
Step size increases to 0.091228 after epoch 88.
89 2.60894 3.00435
90 2.6096 3.002
91 2.60841 3.00494
92 2.60888 3.00249
93 2.6079 3.00538
Step size decreases to 0.045614 after epoch 93.
94 2.60823 3.00291
95 2.60688 3.00465
96 2.60653 3.00562
97 2.60634 3.00468
98 2.60603 3.00603
Step size increases to 0.068421 after epoch 98.
99 2.60583 3.00508
100 2.606 3.00699

Designated epoch number reached --> ANFIS training completed at epoch 100.
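Locating the epoch with the lowest checking error is what the `[a, b] = min(chk_error)` statement does in the MATLAB code. The Python sketch below repeats that step on a short window of the checking errors, copied from the training log above for epochs 46 to 57. The helper name `best_epoch` is an assumption.

```python
def best_epoch(checking_errors):
    """Return (position, error) of the lowest checking error; position is 1-based."""
    index = min(range(len(checking_errors)), key=checking_errors.__getitem__)
    return index + 1, checking_errors[index]

# Checking errors for epochs 46-57, copied from the training log.
first_epoch = 46
chk = [2.97939, 2.98215, 2.98046, 2.9781, 2.98094, 2.9809,
       2.98267, 2.97783, 2.98443, 2.98259, 2.99089, 2.9976]

position, error = best_epoch(chk)
print(first_epoch + position - 1, error)
```

Within this window the minimum falls at epoch 53. The training error keeps decreasing well past that point, which is why the checking error, not the training error, is the better guide for deciding when to stop.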
The snapshot of the two-input ANFIS at the minimal checking error has the above input-output surface. Both the training and checking errors are lower than before, but we can see some spurious effects at the far-end corner of the surface. The elevated corner says that the heavier an automobile is, the more gas-efficient it will be. This is totally counter-intuitive, and it is a direct result of the lack of data in that region.
gensurf(chk_out_fismat); set(gca, 'box', 'on'); view(-22, 36);
This plot shows the data distribution. The lack of training data at the upper right corner causes the spurious ANFIS surface mentioned earlier. Therefore the prediction by ANFIS should always be interpreted with the data distribution in mind.
plot(new_trn_data(:,1), new_trn_data(:, 2), 'bo', ...
     new_chk_data(:,1), new_chk_data(:, 2), 'rx');
axis([-inf inf -inf inf]);
xlabel(deblank(input_name(input_index(1), :)));
ylabel(deblank(input_name(input_index(2), :)));
title('Training (o) and checking (x) data');
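One simple way to keep the data distribution in mind is to check whether a query point has any training data nearby before trusting the prediction. The Python sketch below is one possible heuristic, not something the demo or the toolbox provides. The function `has_nearby_data`, the radius, the axis scales, and the sample points are all hypothetical.

```python
def has_nearby_data(query, points, scale, radius=0.1):
    """True if any point lies within `radius` of `query` after dividing each
    coordinate difference by the matching entry of `scale`."""
    for point in points:
        dist_sq = sum(((q - p) / s) ** 2 for q, p, s in zip(query, point, scale))
        if dist_sq <= radius ** 2:
            return True
    return False

# Hypothetical (weight in lb, model year) training points, NOT the real data.
train = [(2000, 80), (2200, 82), (2500, 78), (3500, 72), (4500, 70), (4200, 73)]
scale = (2500, 12)  # assumed spread of each axis, used only for normalization

print(has_nearby_data((2100, 81), train, scale))  # True: close to (2000, 80)
print(has_nearby_data((4800, 82), train, scale))  # False: heavy and recent
```

A query that is both heavy and recent has no neighbors in this toy set, which mirrors the empty upper right corner of the scatter plot where the ANFIS surface becomes unreliable.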