An Introduction to Statistical Learning

This is a review of "An Introduction to Statistical Learning", a book that is often called ISLR for short.

An Introduction to Statistical Learning
Gareth James, Daniela Witten, Trevor Hastie, Robert Tibshirani
Springer Science & Business Media
2013-06-24
ISBN-10: 1461471389
426 pages
google reader, amazon.com, amazon.de.

I bought the e-book at half price, but a free PDF is available on the official website. The datasets and R code are available there as well.

Why I read this book

I wanted to learn machine learning in order to apply it to a recommender system. The well-known book "The Elements of Statistical Learning" (ESL for short) deals with advanced topics and contains a lot of mathematics, but I wanted to learn practical things rather than theory. On the other hand, a book that only shows how to write code is not for me either. ISLR seemed to strike a balance between the two, so I chose it.

Contents of the book

The book consists of 10 chapters. The first two are introductory. The rest covers a wide range of machine learning topics and algorithms, but it says little about Bayesian theory or time-series prediction.

The book contains no pointers to books or articles on advanced topics. ESL is probably meant to fill that role.

ISLR also includes R code, but it does not explain graphics devices or other useful packages such as caret and ggplot2.

Do not forget the video lectures!

I should say that I did not read the whole book. I do not think that reading it from cover to cover is a good idea, because the authors of the book provide video lectures. (However, the author of the linked blog post about the videos says that you should read the book from cover to cover.) The videos cover most of the topics in the book, so I recommend watching the videos first and then reading the book to fill the gaps in your understanding.

To be honest, the book contains a lot of redundant explanation, which makes it longer than necessary. One could say that the explanations are friendly to beginners, but I found them quite painful to read.

The R sections are separate from the rest of the book

The last section of each chapter deals with R programming. It concentrates on the R commands that are relevant to the chapter, but the reader can safely skip it, because there is no R code outside the R sections.

In my opinion the videos corresponding to the R sections are not helpful, because they show only the output of R commands. It is better to try the R commands yourself to get used to R.

Note that this book does not teach the basics of R. In other words, you are unlikely to be able to write a complicated R script after reading it, because it concentrates only on the commands related to machine learning. (For that purpose you might want to try the "Data Science Specialization" on Coursera.)

Less mathematics. Less statistics.

The main topic of Chapter 3 is linear regression. As I said in the previous entry, one of the biggest advantages of linear regression is that a statistical statement can be read as a statement of linear algebra. In other words, we can convert statistics into (linear) geometry. Even though this is a great help in understanding the theory, including its abstract parts, the authors never try to make use of this advantage.
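To make the geometric point concrete, here is a short sketch of the standard least-squares facts I have in mind. The notation (the hat matrix H, the vector of ones, and the name ESS) is my own choice and is not taken from the book, and the sketch assumes a design matrix of full column rank that includes an intercept column.

```latex
% Assumption: X is an n x (p+1) design matrix of full column rank
% whose first column is the vector of ones, \mathbf{1}.
\begin{align*}
\hat\beta &= \operatorname*{arg\,min}_{\beta} \lVert y - X\beta \rVert^2
  && \text{(least squares)} \\
X^\top (y - X\hat\beta) &= 0
  && \text{(the residual is orthogonal to every column of } X\text{)} \\
\hat y &= X\hat\beta = Hy, \quad H = X (X^\top X)^{-1} X^\top
  && \text{(} H \text{ projects orthogonally onto the column space of } X\text{)} \\
\lVert y - \bar y \mathbf{1} \rVert^2
  &= \lVert \hat y - \bar y \mathbf{1} \rVert^2 + \lVert y - \hat y \rVert^2
  && \text{(Pythagoras: } \mathrm{TSS} = \mathrm{ESS} + \mathrm{RSS}\text{)}
\end{align*}
```

Read this way, R^2 = 1 - RSS/TSS is simply the squared cosine of the angle between the centered response and the centered fitted values.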

The book seems to assume that the reader is familiar with statistical concepts such as confidence intervals and p-values, because it explains them only briefly. But I do not think that the authors make use of the reader's knowledge of statistics.

For example, two formulas (4.15) for estimators appear in Chapter 4. A reader who has studied statistics immediately sees that the formulas come from the so-called "K-sample problem" (K-Stichprobenproblem in German), but ISLR does not mention it. Moreover, the same formulas can be obtained from linear regression, which the reader has just studied, but this is not mentioned either.
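The connection to linear regression can be checked numerically. The sketch below assumes that (4.15) refers to the class means and the pooled variance with divisor n - K for a single predictor, which is how I read it. The function names and the toy data are hypothetical, and the code is Python rather than the R used in the book. It computes the estimators directly, and then again by regressing x on the K class indicators without an intercept.

```python
from collections import defaultdict


def lda_estimates(x, y):
    """Class means and pooled variance (divisor n - K) for one predictor."""
    groups = defaultdict(list)
    for xi, yi in zip(x, y):
        groups[yi].append(xi)
    means = {k: sum(v) / len(v) for k, v in groups.items()}
    rss = sum((xi - means[yi]) ** 2 for xi, yi in zip(x, y))
    return means, rss / (len(x) - len(groups))


def solve(A, b):
    """Solve A z = b by Gauss-Jordan elimination with partial pivoting."""
    m = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for c in range(m):
        p = max(range(c, m), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(m):
            if r != c:
                f = M[r][c] / M[c][c]
                M[r] = [u - f * v for u, v in zip(M[r], M[c])]
    return [M[i][m] / M[i][i] for i in range(m)]


def regression_on_indicators(x, y):
    """Least squares of x on the K class indicators, without intercept."""
    labels = sorted(set(y))
    K = len(labels)
    # n x K design matrix: one 0/1 column per class.
    D = [[1.0 if yi == k else 0.0 for k in labels] for yi in y]
    # Normal equations (D^T D) beta = D^T x.
    DtD = [[sum(row[a] * row[b] for row in D) for b in range(K)]
           for a in range(K)]
    Dtx = [sum(row[a] * xi for row, xi in zip(D, x)) for a in range(K)]
    beta = solve(DtD, Dtx)
    fitted = [sum(d * b for d, b in zip(row, beta)) for row in D]
    rss = sum((xi - fi) ** 2 for xi, fi in zip(x, fitted))
    # RSS / (n - K) is the usual residual variance estimate.
    return dict(zip(labels, beta)), rss / (len(x) - K)


if __name__ == "__main__":
    # Toy data, made up for illustration only.
    x = [1.0, 2.0, 3.0, 6.0, 8.0, 10.0, 11.0]
    y = ["a", "a", "a", "b", "b", "c", "c"]
    print(lda_estimates(x, y))
    print(regression_on_indicators(x, y))
```

The regression coefficients are the class means, and the residual variance estimate is the pooled variance, so both routes give the same numbers.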

Balanced. But experiments are missing.

Although the explanations in the book are often redundant, it covers a lot of topics compactly. It also explains the mathematical aspects of the theory, even though I feel that this is not enough.

But I wonder why the authors do not use R for experiments. I think that R is a very good tool for this: the reader can easily reproduce a result and then try it again with different tuning. Such experiments help the reader understand the material with more confidence. But as I said above, the R sections are independent of the other sections. It is true that the reader should run the experiments herself, but ISLR does not provide enough of an R tutorial for that.
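As an illustration of the kind of experiment I mean, here is a minimal sketch that simulates data, fits K-nearest-neighbors regression for several values of k, and compares training and test error. Everything in it is my own assumption and not taken from the book: the sine model, the noise level, the sample sizes, the values of k, and the function names. It is written in Python only for illustration; the same experiment could be written in R.

```python
import math
import random


def simulate(n, rng, noise=0.3):
    """Draw n points from y = sin(2*pi*x) + Gaussian noise (a made-up model)."""
    xs = [rng.random() for _ in range(n)]
    ys = [math.sin(2 * math.pi * x) + rng.gauss(0.0, noise) for x in xs]
    return xs, ys


def knn_predict(x0, xs, ys, k):
    """Average the responses of the k training points closest to x0."""
    nearest = sorted(range(len(xs)), key=lambda i: abs(xs[i] - x0))[:k]
    return sum(ys[i] for i in nearest) / k


def mse(xs_eval, ys_eval, xs_train, ys_train, k):
    """Mean squared error of the k-NN fit on the evaluation points."""
    errors = [(y - knn_predict(x, xs_train, ys_train, k)) ** 2
              for x, y in zip(xs_eval, ys_eval)]
    return sum(errors) / len(errors)


def run_experiment(ks=(1, 5, 20, 100), n_train=100, n_test=200, seed=0):
    """Return {k: (training MSE, test MSE)} for one simulated data set."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    x_tr, y_tr = simulate(n_train, rng)
    x_te, y_te = simulate(n_test, rng)
    return {k: (mse(x_tr, y_tr, x_tr, y_tr, k),
                mse(x_te, y_te, x_tr, y_tr, k))
            for k in ks}


if __name__ == "__main__":
    for k, (train, test) in run_experiment().items():
        print(f"k={k:3d}  train MSE={train:.3f}  test MSE={test:.3f}")
```

Changing the seed, the noise level, or the list of k values and rerunning is exactly the kind of tinkering that I think helps more than reading another page of explanation.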

Conclusion

This book is very good for understanding the theoretical aspects of machine learning, but you might want to watch the lecture videos first to understand the big picture and the motivation. The book is for studying the details of the theory and the algorithms after watching the videos.

You should not expect to learn R from this book. To be able to write practical R scripts you need to look for other resources (such as the Coursera course mentioned above).

Categories: #data-mining