Practical Natural Language Processing

Northeastern University Khoury College of Computer Sciences

Semester term: Fall 2024
Instructor: Mohammad Selim
Email: m.selim@northeastern.edu
Lectures: Friday 9:30 am - 12:50 pm in room 1505
Office Hours: Wednesday 5 pm - 6 pm (MS Teams) and Friday after lecture

TAs:

Schedule:


Date | Topic | Slides/Reading | Assignment Due
Sep 06 | Introduction, Review Python and Machine Learning | Slide | Assignment 1 available on Canvas
Sep 13 | N-gram Language Models, Naive Bayes | Ch. 3.1-3.4 & Ch. 4.1-4.8 | Assignment 1 due, Assignment 2 out
Sep 20 | Classification continued, feature engineering, Word Representations, Neural Networks | Ch. 6 | Assignment 2 & Project concept proposal due
Sep 27 | Word Representations (cont.) | Slide | Assignment 3 due
Oct 04 | Deep Learning and Language Models (RNN, LSTM) | Slide, LSTM blog-post |
Oct 11 | Deep Learning and Language Models (cont.) (Attention & Transformer) | Slide, Attention is all you need, Alammar’s Transformer blog-post, Alammar’s Attention blog-post | Assignment 4 due
Oct 18 | Pretrained Language Models, Finetuning, Commonly used Python tools, Discuss practice midterm exam | BERT Slide, Hugging Face Library, Example | Project proposal due
Oct 25 | Midterm Exam | | The midterm exam (Take home)
Nov 01 | Coreference Resolution, Unsupervised Learning | Slide | Assignment 5 due
Nov 08 | POS Tagging, Constituency and Dependency Parsing | Slide |
Nov 15 | LLM, RL, Bias and Fairness in NLP | Slide |
Nov 22 | Applications of NLP (Dialogue System) | Slide |
Dec 2 | | | Final Project Poster due
Dec 6 | Final Project Presentation | | Final paper+code due

Learning Objectives:

This course aims to provide a solid understanding of the basics of natural language processing (NLP) and hands-on implementation of basic NLP algorithms. Students will gain familiarity with the challenges of NLP and broaden their understanding of how NLP impacts the world. Additionally, the course offers exposure to current research in the field.

Resources

Grades

Final grades will be assigned based on the overall percentage calculated using the weightings listed above (no curving). There is no absolute direct mapping to letter grades, but the minimum overall percentage required to obtain each letter grade will be no higher than the following: A (94%), A- (90%), B+ (87%), B (83%), B- (80%), C- (65%).

Project selection and guidelines

You are allowed to select any scenario or problem that you want to solve using NLP. The project will be evaluated from an NLP point of view, not as a complete software package (e.g., how is your text processed and engineered, and how is your model developed, trained, and evaluated?). Your NLP components need to be supported by literature. You can adopt any paper published in one of the top NLP venues including

Assignment submission and late days

Assignments will be submitted on Canvas/Gradescope and will be due by 11:59 PM on the specified deadline.

Three penalty-free late days will be granted without any excuse for any given assignment, excluding group projects, presentations, and exams. Late days cannot be divided fractionally and can be used for a maximum of three assignments. The penalty for a late assignment will be 25% per business day.

Respect for diversity

Classrooms full of computing students from diverse backgrounds and perspectives are crucial for us to make progress in our field.

It is my intent that diverse students will be successful in this course, that each student’s learning needs are addressed both in and out of class, and that the diversity that each student brings to this class is viewed as a resource, strength, and benefit. I expect you to feel challenged and sometimes outside of your comfort zone in this course, but it is my intent to present materials and activities that are inclusive and respectful of all persons, no matter their gender, sexual orientation, disability, age, socioeconomic status, ethnicity, race, culture, perspective, and other background characteristics. We should all strive for these principles both inside and outside of the classroom.

The course meetings are on Fridays. If a class meeting conflicts with your religious observances, please let me know in the first two weeks of the class so that we can make other arrangements. Northeastern University respects the religious practices of its students, faculty, and staff and is committed to ensuring that all students are able to observe their religious beliefs without academic penalty.

Class rosters are provided to each instructor with each student’s legal name. I will gladly honor your request to address you by an alternate name and/or pronoun. Please advise me of this early in the semester so that I may make appropriate changes to my records.

Global Learner Support

Northeastern University’s Global Learner Support (GLS) offers “language, cultural, and academic support while promoting the development of intercultural competence and global understanding.” They offer tutoring, workshops, and much more. Visit https://gls.northeastern.edu/ to learn more.

Academic accommodations

If you have a documented need for an academic accommodation, please contact the professor within the first two weeks so we can have a conversation about how best to make appropriate arrangements.

If you require support during the course due to a disability, please ensure that you are already registered with the Disability Resource Center, and contact your course instructors to coordinate any support needed during the course.

Mental health issues are real and can prevent you from doing your best work. Your Khoury advisor is your primary contact for accessing University resources. You can also directly access University Health and Counseling Services. Do not hesitate to make use of them as needed. Please do not wait until it has seriously impacted your work.

Collaboration with humans

Computer science, both academically and professionally, is a collaborative discipline. In any collaboration, however, all parties are expected to make their own contributions and to generously credit the contributions of others. In our class, therefore, collaboration on homework and programming assignments is encouraged, but you as an individual are responsible for understanding all the material in the assignment and doing your own work. Always strive to do your best, give generous credit to others, start early, and seek help early from both your professors and classmates.

The following rules are intended to help you get the most out of your education and to clarify the line between honest and dishonest work. The professor reserves the right to ask you to verbally explain the reasoning behind any answer or code that you turn in and to modify your project grade based on your answers. It is vitally important that you turn in work that is your own. Follow the guidelines for academic honesty.

If you have had a substantive discussion of any homework or programming solution with a classmate, then be sure to cite them in your report. If you are unsure of what constitutes “substantive”, then ask us or err on the side of caution. You will not be penalized for working together. You must not copy answers or code from another student either by hand or electronically. Another way to think about it is that you should be talking English with one another, not Java.

The following rules apply to anything you hand in for a grade.

The university’s academic integrity policy discusses actions regarded as violations and consequences for students: Office of Student Conduct and Conflict Resolution - Academic Integrity Policy

Collaboration with language models

Pair programming with a language model has arrived, and it is available to all for free. Collaborating with a language model is going to become as common as Google searching bugs or discussing algorithms with a classmate. I want to prepare you for the future of this field while also ensuring everyone has a solid grasp of the fundamentals.

This semester, we will allow limited collaboration with a large language model such as ChatGPT (https://chat.openai.com/chat) or GitHub Copilot (https://github.com/features/copilot). For each homework assignment, I will list the things that you can use a model for. You may not use the model for any tasks other than what I list in the assignments. For example, I might say that you can ask the model to write code to get the concatenation of a list of strings. This is intended to save time spent getting the syntax for simple or mundane tasks. It is in your best interest (to learn what you need for interviews) to use language models only for the small tasks that I list, and not to have a model solve the main part of the assignment for you. In general, we don’t want AI to write all our code; we want to use it as a tool.
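To give a sense of scale, the string-concatenation example mentioned above could look like the short Python sketch below. This is only an illustration of a "simple or mundane" task; the function name `concatenate` and the sample tokens are hypothetical and are not taken from any assignment.

```python
# Hypothetical illustration of a small, syntax-level task:
# joining a list of strings into a single string.
def concatenate(parts: list[str], separator: str = "") -> str:
    """Return all strings in `parts` joined by `separator`."""
    return separator.join(parts)


tokens = ["natural", "language", "processing"]
print(concatenate(tokens))       # naturallanguageprocessing
print(concatenate(tokens, " "))  # natural language processing
```

Tasks of roughly this size are the kind the policy has in mind; the permitted uses for each assignment are whatever that assignment lists.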

At the top of each assignment, please credit:

Title IX

Title IX of the Education Amendments of 1972 protects individuals from sex or gender-based discrimination, including discrimination based on gender-identity, in educational programs and activities that receive federal financial assistance.

Northeastern’s Title IX Policy prohibits Prohibited Offenses, which are defined as sexual harassment, sexual assault, relationship or domestic violence, and stalking. The Title IX Policy applies to the entire community, including students of all genders, faculty, and staff.

If you or someone you know has been a survivor of a Prohibited Offense, confidential support and guidance can be found through University Health and Counseling Services staff and the Center for Spirituality, Dialogue, and Service clergy members. By law, those employees are not required to report allegations of sex or gender-based discrimination to the University.

Alleged violations can be reported non-confidentially to the Title IX Coordinator within The Office for University Equity and Compliance at: titleix@northeastern.edu and/or through NUPD (Emergency 617.373.3333; Non-Emergency 617.373.2121). Reporting Prohibited Offenses to NUPD does NOT commit the victim/affected party to future legal action.

Faculty members are considered “responsible employees” at Northeastern University, meaning they are required to report all allegations of sex or gender-based discrimination to the Title IX Coordinator.

In case of an emergency, please call campus police.

Please visit the Office for University Equity and Compliance for a complete list of reporting options and resources both on- and off-campus.