This is for all beginners in data science and data analytics. DO NOT get caught by the “Complexity Trap”! It is especially deadly for beginners.
What is the “Complexity Trap”, you might ask? Well, I see A LOT of beginners who, when they start off with machine learning, ONLY use ‘complex’ algorithms, or who believe that complexity breeds accuracy. Let me remind you that this is NOT the case! For instance, many beginners I come across feel that something is wrong when a decision tree does better than a random forest or a support vector machine. NOT true! Depending on the relationship between the signals and the target, any known model can be a good fit, the simple ones included. So try all the known models!
Why is that so? Well, basically, when we are doing machine learning, we are answering two questions:
1) What are the possible signals that lead us to the different targets/labels?
2) What is the type of relationship between the signals and the target? Is it additive (as in logistic regression) or based on splitting rules (as in a decision tree)?
Remember what Judea Pearl says about the training of models: it is a curve-fitting exercise.
We have no idea what the actual type of relationship is until we try it! So try ALL known models!
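To make "try all known models" concrete, here is a minimal sketch of how you might compare a few models side by side. It assumes scikit-learn is installed and uses a synthetic dataset from make_classification purely for illustration. The helper name compare_models, the choice of four models and their settings are my own assumptions for this example, not a fixed recipe.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier


def compare_models(X, y, cv=5):
    """Return the mean cross-validated accuracy of each candidate model.

    compare_models is a hypothetical helper written for this post.
    """
    candidates = {
        # Additive relationship; scaling helps the solver converge.
        "logistic_regression": make_pipeline(
            StandardScaler(), LogisticRegression(max_iter=1000)
        ),
        # Splitting rules, single tree.
        "decision_tree": DecisionTreeClassifier(random_state=0),
        # Many trees averaged together.
        "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
        # Support vector machine with the default RBF kernel.
        "svm": make_pipeline(StandardScaler(), SVC()),
    }
    return {
        name: cross_val_score(model, X, y, cv=cv).mean()
        for name, model in candidates.items()
    }


# Synthetic data, for illustration only. Swap in your own X and y.
X, y = make_classification(
    n_samples=500, n_features=10, n_informative=4, random_state=0
)
scores = compare_models(X, y)

# Print the models from best to worst mean accuracy.
for name, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {score:.3f}")
```

Which model comes out on top depends on the data, and that is exactly the point: the ranking is something you find out by trying, not something you can assume from how complex the model is.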
TL;DR: avoid the “complexity trap”. A neural network is not necessarily a better model than a decision tree. For a bit more detail, you may want to read the post on using complex machine learning models that I wrote a while back. :)
Do keep this in mind at all times! Have fun on your data science learning journey! If this post has been useful, do share it, and if you'd like to link up, you can look for me on LinkedIn.