Deep Neural Networks (DNNs) have radically changed the landscape of state-of-the-art performance in Natural Language Processing (NLP) in recent years. These versatile models are used in many applications, including text classification, language generation, question answering, image captioning, machine translation, named entity recognition, and speech recognition. The state of the art changes quickly, sometimes with large leaps in performance when new architectures are released. In October 2018, Google released "BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding," which achieved new state-of-the-art results on 11 NLP benchmarks upon release. Since then, many more models have appeared, adding new components or tweaking the approach. In this article, we’ll review some of the traditional machine learning methods used in deep learning, along with newer trends such as Transfer Learning and Transformers, to provide a foundation that holds no matter which model is currently leading.