Natural Language Processing (NLP) is an important area of Artificial Intelligence concerned with the processing and understanding (Natural Language Understanding, NLU) of human language. The goal of NLP and NLU is to process and harness information from a large corpus of text with very little manual intervention.
This course will introduce techniques to find similar words using the context of surrounding words, build a language model to predict the next word and generate sentences, encode every word in the vocabulary of the corpus as a vector that captures its context and similar words, and encode a sentence for machine translation and conversation purposes.
The course will help learners gain sufficient knowledge and proficiency in probabilistic, Artificial Neural Network (ANN), and deep learning techniques for NLP.
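As a concrete taste of the "predict the next word" idea mentioned above, the snippet below builds a toy bigram language model from raw counts. It is only an illustrative sketch, not course material: the tiny corpus and the function names are invented for the example.

# Illustrative sketch only: a toy bigram language model that predicts the
# next word from co-occurrence counts (corpus and names are invented here).
from collections import Counter, defaultdict

corpus = "the cat sat on the mat . the dog sat on the rug .".split()

# Count how often each word is followed by each other word (bigram counts).
next_word_counts = defaultdict(Counter)
for current_word, next_word in zip(corpus, corpus[1:]):
    next_word_counts[current_word][next_word] += 1

def predict_next(word):
    """Return the most frequent continuation of `word`, or None if unseen."""
    counts = next_word_counts[word]
    return counts.most_common(1)[0][0] if counts else None

print(predict_next("sat"))  # -> 'on'
print(predict_next("the"))  # -> 'cat' (ties broken by first occurrence)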
Any interested learners
Essential – Algorithms, Python proficiency, elementary probability and statistics, linear algebra, and a basic understanding of machine learning
NOTE: Only an English corpus is considered throughout this course.
ABOUT THE INSTRUCTOR
Prof. Ramaseshan R is currently a visiting faculty member at CMI, where he teaches NLP. He has more than 30 years of experience in research and development, teaching, product development, information technology, innovation, and convergence.
Chennai Mathematical Institute
1. Join the course
Learners may pay the applicable fees and enrol in a course on offer in the portal to get access to all of its contents, including assignments. The validity of enrolment, which covers access to the videos and other learning material and attempting the assignments, will be mentioned on the course page. The learner has to complete the assignments and get the minimum required marks within this period to be eligible for the certification exam.
COURSE ENROLMENT FEE: The Fee for Enrolment is Rs. 3000 + GST
2. Watch Videos+Submit Assignments
After enrolling, learners can watch the lectures and then attempt the assignments given.
3. Get qualified to register for exams
A learner can earn a certificate in this self-paced course only by appearing for the online remote-proctored exam. To register for it, the learner should get the minimum required marks in the assignments, as given below:
CRITERIA TO GET A CERTIFICATE
Assignment score = score more than 50% in at least 9 of the 12 assignments.
Exam score = score at least 50% in the proctored certification exam (out of 100).
Only the e-certificate will be made available. Hard copies will not be dispatched.
4. Register for exams
The certification exam is conducted online with remote proctoring. Once a learner has become eligible to register for the certification exam, they can choose a convenient slot from those available and pay the exam fee. The schedule of available slot dates/timings for these remote-proctored online examinations will be published and made available to the learners.
EXAM FEE: The remote proctoring exam is optional for a fee of Rs.1500 + GST. An additional fee of Rs.1500 will apply for a non-standard time slot.
5. Results and Certification
After the exam, based on the certification criteria of the course, results will be declared and learners will be notified of the same. A link to download the e-certificate will be shared with learners who pass the certification exam.
WEEK 1: Introduction, terminologies, empirical rules
WEEK 2: Word to Vectors
WEEK 3: Probability and Language Model
WEEK 4: Neural Networks for NLP
WEEK 5: Distributed word vectors (word embeddings)
WEEK 6: Recurrent Neural Network, Language Model
WEEK 11: Information Retrieval tasks using Neural Networks- Learn to Rank, Understanding Phrases, analogies
WEEK 12: Spelling correction using traditional and neural networks, end notes (see the sketch below)
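As a small preview of the traditional approach from Week 12, a minimal spelling corrector can simply pick the vocabulary word with the smallest Levenshtein edit distance. The sketch below is illustrative only and assumes NLTK is installed; the toy vocabulary and the function name are made up for the example, not course code.

# Illustrative sketch only: correct a misspelling by choosing the vocabulary
# word with the smallest Levenshtein edit distance (toy vocabulary assumed).
from nltk.metrics.distance import edit_distance

vocabulary = ["language", "natural", "processing", "corpus", "vector"]

def correct(word):
    """Return the vocabulary word closest to `word` by edit distance."""
    return min(vocabulary, key=lambda candidate: edit_distance(word, candidate))

print(correct("langage"))  # -> 'language'
print(correct("vectro"))   # -> 'vector'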
Books and References
Niladri Sekhar Dash and S. Arulmozi, Features of a Corpus. Singapore: Springer Singapore, 2018, pp. 17–34. isbn: 978-981-10-7458-5. doi: 10.1007/978-981-10-7458-5_2. url: https://doi.org/10.1007/978-981-10-7458-5_2.
Ian Goodfellow, Yoshua Bengio, and Aaron Courville, Deep Learning, http://www.deeplearningbook.org. MIT Press, 2016.
Nitin Indurkhya and Fred J Damerau, “Handbook of natural language processing,” Chapman and Hall/CRC, 2010.
Daniel Jurafsky and James H. Martin, “Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition,” 1st ed. Upper Saddle River, NJ, USA: Prentice Hall PTR, 2000. isbn: 0130950696.
C.D. Manning et al., “Foundations of Statistical Natural Language Processing,” MIT Press, 1999. isbn: 9780262133609. url: https://books.google.co.in/books?id=YiFDxbEX3SUC.
Christopher D. Manning, Prabhakar Raghavan, and Hinrich Schütze, “Introduction to Information Retrieval,” Cambridge UP, 2009. Chap. 6, pp. 109–133.
Jacob Perkins, “Python 3 text processing with NLTK 3 cookbook,” Packt Publishing Ltd, 2014.
Noah A. Smith, “Linguistic Structure Prediction. Synthesis Lectures on Human Language Technologies,” Morgan and Claypool, May 2011.
Dzmitry Bahdanau, Kyunghyun Cho, and Yoshua Bengio. “Neural Machine Translation by Jointly Learning to Align and Translate”. In: CoRR abs/1409.0473 (2014). arXiv: 1409.0473.
Yoshua Bengio et al. “A Neural Probabilistic Language Model”. In: Journal of Machine Learning Research 3 (Mar. 2003), pp. 1137–1155. issn: 1532-4435.
Peter F. Brown et al. “Class-based N-gram Models of Natural Language”. In: Comput. Linguist. 18.4 (Dec. 1992), pp. 467–479. issn: 0891-2017.
Peter F. Brown et al. “The Mathematics of Statistical Machine Translation: Parameter Estimation”. In: Comput. Linguist. 19.2 (June 1993), pp. 263–311. issn: 0891-2017.
KyungHyun Cho et al. “On the Properties of Neural Machine Translation: Encoder-Decoder Approaches”. In: CoRR abs/1409.1259 (2014). arXiv: 1409.1259.
Scott Deerwester et al. “Indexing by latent semantic analysis”. In: Journal of the American Society for Information Science 41.6 (1990), pp. 391–407.
Chris Dyer. “Notes on Noise Contrastive Estimation and Negative Sampling”. In: CoRR abs/1410.8251 (2014). arXiv: 1410.8251.
Yoav Goldberg. “A Primer on Neural Network Models for Natural Language Processing”. In: CoRR abs/1510.00726 (2015). arXiv: 1510.00726.
Nils Hadziselimovic et al. “Forgetting Is Regulated via Musashi-Mediated Translational Control of the Arp2/3 Complex.” In: Cell 156.6 (Mar. 2014), pp. 1153–1166. issn: 1097-4172.
Sepp Hochreiter and Jürgen Schmidhuber. “Long Short-Term Memory”. In: Neural Comput. 9.8 (Nov. 1997), pp. 1735–1780. issn: 0899-7667.
Chiori Hori and Takaaki Hori. “End-to-end Conversation Modeling Track in DSTC6”. In: CoRR abs/1706.07440 (2017). arXiv: 1706.07440.
Andrej Karpathy, Justin Johnson, and Fei-Fei Li. “Visualizing and Understanding Recurrent Networks.” In: CoRR abs/1506.02078 (2015).
Minh-Thang Luong, Hieu Pham, and Christopher D. Manning. “Effective Approaches to Attention-based Neural Machine Translation”. In: CoRR abs/1508.04025 (2015). arXiv: 1508.04025.
Tomas Mikolov et al. “Efficient Estimation of Word Representations in Vector Space”. In: CoRR abs/1301.3781 (2013).
Franz Josef Och and Hermann Ney. “The Alignment Template Approach to Statistical Machine Translation”. In: Computational Linguistics 30.4 (Dec. 2004), pp. 417–449. issn: 0891-2017.
F. Pedregosa et al. “Scikit-learn: Machine Learning in Python”. In: Journal of Machine Learning Research 12 (2011), pp. 2825–2830.
Fraser W. Smith and Lars Muckli. “Nonstimulated early visual areas carry information about surrounding context”. In: Proceedings of the National Academy of Sciences 107.46 (2010), pp. 20099–20103.
Kyunghyun Cho et al. “Learning Phrase Representations using RNN Encoder–Decoder for Statistical Machine Translation”. In: Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP). Doha, Qatar: Association for Computational Linguistics, Oct. 2014, pp. 1724–1734.
Rafal Jozefowicz, Wojciech Zaremba, and Ilya Sutskever. “An Empirical Exploration of Recurrent Network Architectures”. In: Proceedings of the 32nd International Conference on International Conference on Machine Learning – Volume 37. ICML’15. Lille, France: JMLR.org, 2015, pp. 2342–2350.
Quoc Le and Tomas Mikolov. “Distributed representations of sentences and documents”. In: International conference on machine learning. 2014, pp. 1188–1196.
Edward Loper and Steven Bird. “NLTK: The Natural Language Toolkit”. In: Proceedings of the ACL-02 Workshop on Effective Tools and Methodologies for Teaching Natural Language Processing and Computational Linguistics – Volume 1. ETMTNLP ’02. Philadelphia, Pennsylvania: Association for Computational Linguistics, 2002, pp. 63–70.
Tomas Mikolov et al. “Distributed Representations of Words and Phrases and their Compositionality”. In: Proceedings of the 26th International Conference on Neural Information Processing Systems – Volume 2. NIPS’13. Lake Tahoe, Nevada: Curran Associates Inc., 2013, pp. 3111–3119.
Andriy Mnih and Geoffrey Hinton. “A Scalable Hierarchical Distributed Language Model”. In: Proceedings of the 21st International Conference on Neural Information Processing Systems. NIPS’08. Vancouver, British Columbia, Canada: Curran Associates Inc., 2008, pp. 1081–1088. isbn: 978-1-60560-949-2.
Frederic Morin and Yoshua Bengio. “Hierarchical probabilistic neural network language model.” In: AISTATS. Vol. 5. Citeseer, 2005, pp. 246–252.
Kishore Papineni et al. “Bleu: a Method for Automatic Evaluation of Machine Translation”. In: Proceedings of 40th Annual Meeting of the Association for Computational Linguistics. Philadelphia, Pennsylvania, USA: Association for Computational Linguistics, July 2002, pp. 311–318.