Classification Models Pros and Cons

Source: https://github.com/ctufts/Cheat_Sheets/wiki/Classification-Model-Pros-and-Cons

Classification Model Pros and Cons (Generalized)

  • Logistic Regression
    • Pros
      • low variance
      • provides probabilities for outcomes
      • works well with diagonal (feature) decision boundaries
      • NOTE: logistic regression can also be used with kernel methods
    • Cons
      • high bias
  • Decision Trees
    • Regular (not bagged or boosted)
      • Pros
        • easy to interpret visually when the trees only contain several levels
        • Can easily handle qualitative (categorical) features
        • Works well with decision boundaries parellel to the feature axis
      • Cons
        • prone to overfitting
        • possible issues with diagonal decision boundaries
    • Bagged Trees : train multiple trees using bootstrapped data to reduce variance and prevent overfitting
      • Pros
        • reduces variance in comparison to regular decision trees
        • Can provide variable importance measures
          • classification: Gini index
          • regression: RSS
        • Can easily handle qualitative (categorical) features
        • Out of bag (OOB) estimates can be used for model validation
      • Cons
        • Not as easy to visually interpret
        • Does not reduce variance if the features are correlated
    • Boosted Trees : Similar to bagging, but learns sequentially and builds off previous trees
      • Pros
        • Somewhat more interpretable than bagged trees/random forest as the user can define the size of each tree resulting in a collection of stumps (1 level) which can be viewed as an additive model
        • Can easily handle qualitative (categorical) features
      • Cons
        • Unlike bagging and random forests, can overfit if number of trees is too large
  • Random Forest
    • Pros
      • Decorrelates trees (relative to bagged trees)
        • important when dealing with mulitple features which may be correlated
      • reduced variance (relative to regular trees)
    • Cons
      • Not as easy to visually interpret
  • SVM
    • Pros
      • Performs similarly to logistic regression when linear separation
      • Performs well with non-linear boundary depending on the kernel used
      • Handle high dimensional data well
    • Cons
      • Susceptible to overfitting/training issues depending on kernel
  • Neural Network (This section needs further information based on different types of NN’s)
  • Naive Bayes
    • Pros
      • Computationally fast
      • Simple to implement
      • Works well with high dimensions
    • Cons
      • Relies on independence assumption and will perform badly if this assumption is not met

Microsoft Research PhD Summer School 2016

I was lucky enough (thanks to Dr. Matteo Venanzi) to be invited to Microsoft’s summer school that takes place in Cambridge, UK, every year (4-8 July 2016). This is a one week school that aims to familiarise PhD students with the work at Microsoft. Moreover, students get the opportunity to present their work to other students, communication experts and Microsoft researchers. The benefit of this is twofold. First, students get feedback on their work and their presentation skills and style. Also, it is an opportunity to meet new people and measure the impact of your work in terms of the interest your poster attracts. At the same time, it is an opportunity to evaluate whether your work has any relevance to anything that Microsoft’s is currently working on. The school focuses on early PhD students but late students, like me, attended as well.

In terms of the everyday schedule, as I already mentioned, we had some coaching on how to communicate our research but we also had some more technical lectures as well. For example, I particularly liked Prof. Bishop’s presentation (“The Road to General Artificial Intelligence”). He focused on the advancements of AI in the recent years and that the potential is still greater.

Overall, I believe this yearly summer school is a great initiative from Microsoft Research and I am glad I was a part of it.

 

Group-pic-2-2 copy.jpg

AAAI (Association for Advancement of Artificial Intelligence) 2016 Conference

Phoenix, Arizona, USA, 2016

The Thirtieth AAAI Conference on Artificial Intelligence (AAAI-16) was held in February 12–17 at the Phoenix Convention Center, Phoenix, Arizona, USA. I was lucky enough to attend.

First of all, it was my first time in the USA and I have to admit that I was impressed. The buildings, the streets, the coffee, the food, the life!

Concerning the conference itself, it attracted a lot of people and celebrity academics working in the area of Artificial Intelligence. The first couple of days I attended interesting tutorials. From the traditional heuristic search to the trendy and hot topic of Deep Learning!! I will just mention that the deep learning tutorial was meant to be for students or people who were not familiar with the concepts. Even though they started from scratch, they managed to discuss about complex concepts by the end. So, the speakers presented what is included in their new book (check the book on Amazon) which is available for pre-orders at the time of writing this post.

On the days of the actual conference I had the opportunity to attend a number of talks about machine learning, security and multi-agent systems. I also attended the educational panel where Russell Stuart and Peter Norvig  announced the release of  the 4th edition of their famous book: Artificial Intelligence: A Modern Approach. The book will include material on Deep Learning and Monte Carlo Tree Search and meta-heuristics.

The evening of the first day of the conference I attended an invited talk from Andreas Krause who is currently leading the Learning & Adaptive Systems Group at the ETH University. He covered a lot of application themes as  the presentation’s title was From Proteins to Robots: Learning to Optimize with Confidence. The focus though was on Bayesian Optimization and how submodularity can be handy in such approaches in terms of providing guarantees and bounds on the solutions provided.

The next day, I attended another important talk. Demis Hassabis was there, the CEO of DeepMind which was acquired by Google for hundreds of millions back in 2014. There, it was announced that DeepMind’s AlphaGo will face the world champion in the game of Go in mid March (starting on March 9th 2016) and that it will be live streamed. For more details click here. Beside advertising and promoting DeepMind’s brain child, Demis went in to some details about the algorithms they used, giving us  intuition on how they approached the problem, why and how their algorithm works. Concretely, he said that contrary to people’s belief, not everything in Deepmind has to do with Deep Learning, but instead they rely on reinforcement learning which is an area inspired by behavioural psychology and our attempt to understand how brain works and ultimately how we learn. Put simply and briefly, reinforcement learning is about rewarding behaviour that is beneficial and penalising behaviour that is detrimental in some context.

My talk was just after lunch on the 3rd day of the conference. I was in the Computational Sustainability session and I only had a spotlight talk, which means I had to talk for 2 minutes in order to advertise my work and invite people to the poster session in the evening where I could present and talk more about my work. The session was interesting overall. The key speaker in my opinion was Pascal Van Henteryck. He and his students had 2 or 3 presentations in that session. It was about evacuation plans in cities that are vulnerable to flooding and natural disasters. Pascal is very well-known in the area and I personally have attended his online optimization course in Coursera.

The evening of that day I presented my poster in the allocated session. A lot of people stopped by and asked about my work and I am glad about it.

All in all, AAAI was a very good experience and I feel lucky to have the opportunity to be there! If you want to know more details about it or talk about specific papers please get in touch with me.

aaai16