BDA 696 Machine Learning Engineering
Fall 2021
20493
Class Days: Tuesday
Class Times: 7:00pm - 9:45pm
Class Location: SH-221
Mode: lecture
Platform: teaching.mrsharky.com/
Instructor: Dr. Julien Pierret
Phone: None
Email: julien.pierret@gmail.com
Office location: GMCS-314
Office hours: Tuesday 5:00pm - 6:45pm
Effective Fall 2021, students who register for face-to-face classes are expected to attend as indicated in the course schedule. Faculty teaching face-to-face courses will not be required to create a new, alternative on-line class as an accommodation for any student.
Students with medical conditions that would present a COVID-related risk in a face-to-face instructional setting should contact the Student Ability Success Center (https://sdsu.edu/sasc) to begin the process of getting support. Students who do not adhere to the Covid19 Student Policies or the directives of their faculty will be directed to leave the classroom and will be referred to the Center for Student Rights and Responsibilities.
Do not come to campus if you do not feel well. Remain home and monitor your symptoms and seek medical attention as needed.
Materials (including texts, readings, course fees, equipment, and any technology requirements) | Required or optional | Where and how it can be obtained
Computer with a Linux/Bash environment | Required | OSX, Ubuntu, WSL
This course introduces practical techniques for building machine learning models, with a strong emphasis on bringing these models to a production environment. We will first introduce proper ways to work with code. Next, we will go over structuring unstructured data so that it is easier to ingest when building models. We will emphasize feature engineering and its importance in the model building process, and we will explore pipelining these transformations. Finally, we will build prediction models and wrap them into an online application so predictions can be made in real time.
We are each going to build a library that we can use to aid in building machine learning models. Throughout the semester you will be assigned tasks that add new features or abilities to this library. All these “features” will be submitted as “pull requests” (PRs) on GitHub. You will need to select a peer / friend / acquaintance in the class to review your PR and give meaningful and insightful comments. As a reviewer, if you think changes need to be made to the code, you can and should reject the PR and ask for fixes. You will be graded as both a reviewer and a submitter of PRs. The instructor will do a final review of each PR and determine the grade from it. It is important not to fall behind on these projects, as the tasks build on top of one another!
An important thing to note here is that “I” almost never get a PR accepted on its first try. Other eyes will always see something you’ve missed.
For the final project you will pick a dataset of interest (several interesting datasets will also be provided) and predict something of importance from it. You will need to analyze the dataset, do feature engineering, build a predictive model, and wrap the final model into a Flask app. You will need to provide a way for the instructor to run this code from beginning to end. The preferred method is a Docker container, but a Bash script with minimal dependencies is also acceptable.
The instructor should be able to check out your GitHub code and run a single script that downloads the necessary data, goes through the workflow of cleaning / organizing / preparing the data, does all the necessary transformations, builds a predictive model, and outputs that model. You should also provide a way to run this predictive model in a production environment via a REST interface.
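As a rough illustration of the single-script workflow described above, a minimal sketch might look like the following. Everything in it is an assumption made for illustration: the file name `run_all.py`, the function names, the inline toy data standing in for a real download, and the one-feature least-squares fit standing in for a real model. It is not a required layout.

```python
"""Hypothetical run_all.py: one entry point that runs every stage in order."""
import csv
import io
import json
from pathlib import Path

# Assumption: inline toy data stands in for a real download so the sketch
# runs offline. A real project would fetch its dataset here instead.
RAW_CSV = "x,y\n1,2\n2,4\n,5\n4,8\n"


def download() -> str:
    """Extract: return the raw data as text."""
    return RAW_CSV


def clean(raw: str) -> list:
    """Transform: parse the CSV and drop rows with missing values."""
    rows = []
    for record in csv.DictReader(io.StringIO(raw)):
        if record["x"] and record["y"]:
            rows.append((float(record["x"]), float(record["y"])))
    return rows


def train(rows: list) -> dict:
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(rows)
    mean_x = sum(x for x, _ in rows) / n
    mean_y = sum(y for _, y in rows) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in rows)
    sxx = sum((x - mean_x) ** 2 for x, _ in rows)
    slope = sxy / sxx
    return {"slope": slope, "intercept": mean_y - slope * mean_x}


def predict(model: dict, x: float) -> float:
    """Apply the fitted line to a single input value."""
    return model["slope"] * x + model["intercept"]


def main(output_path: str = "model.json") -> dict:
    """Run every stage in order and write the fitted model to disk."""
    model = train(clean(download()))
    Path(output_path).write_text(json.dumps(model))
    return model


if __name__ == "__main__":
    print(main())
```

In a layout like this, the REST interface would be a separate small program that loads the saved model file and calls something like `predict` for each incoming request.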
A final report will need to be written discussing the original dataset and showing how the data was converted from its raw form to a clean and organized dataset (the Extract, Transform, and Load). The report should include an analysis of which features were important and of the features you tried to generate yourself. If a feature you believed had promise failed to live up to that promise, you should write about that as well. Charts and graphs can be very helpful here.
You must also go over all the models you tried to build. I expect you to try more than a few models and compare the accuracy between them and discuss why you picked one over the other. Discuss why you considered or didn’t consider some models. Final model performance metrics should be given and ideas for improving the model should also be provided.
You are all adults, and the final report should be long enough to cover your modeling process in enough depth that someone without access to your code could, with enough effort, recreate your work. A printed copy of this final paper must be handed to the professor in person on or before the last day of class.
Lastly, you will give a presentation to the class about your project, going over specific highlights from the report.
Students are provided with an SDSU Gmail account, and this SDSU email address will be used for all communications. University Senate policy notes that students are responsible for checking their official university email once per day during the academic term. For more information, please see the Student Official Email Address Use Policy.
Need help finding an advisor, tutor, counselor, or require emergency economic assistance? The SDSU Student Success Help Desk is here for you. Student assistants are available via Zoom Monday through Friday, 9:00 AM to 4:30 PM to help you find the office or service that can best assist with your particular questions or concerns.
The University adheres to a strict policy prohibiting cheating and plagiarism. Examples of academic dishonesty include but are not limited to:
Unauthorized recording or dissemination of virtual course instruction or materials by students, especially with the intent to disrupt normal university operations or facilitate academic dishonesty, is a violation of the Student Conduct Code. This includes posting of exam problems or questions to on-line platforms. Violators may be subject to discipline.
The California State University system requires instructors to report all instances of academic misconduct to the Center for Student Rights and Responsibilities. Academic dishonesty will result in disciplinary review by the University and may lead to probation, suspension, or expulsion. Instructors may also, at their discretion, penalize student grades on any assignment or assessment discovered to have been produced in an academically dishonest manner.
SDSU students are expected to abide by the terms of the Student Conduct Code in classrooms and other instructional settings. Violation of these standards will result in referral to appropriate campus authorities. Prohibited conduct includes:
SDSU via the Student Ability Success Center (SASC) provides accommodations for students with documented disabilities or medical conditions covered under the Americans with Disabilities Act (ADA). In keeping with current public health guidance, I cannot provide arrangements to students without an ADA-qualified disability or medical condition.
If you are a student with a disability and are in need of accommodations for this class, please contact the Student Ability Success Center at sascinfo@sdsu.edu (or go to sdsu.edu/sasc) as soon as possible. Please know accommodations are not retroactive, and I cannot provide accommodations based upon disability until I have received an accommodation letter from the Student Ability Success Center. SASC registration and accommodation approvals may take up to 10-14 business days, so please plan accordingly.
The Family Educational Rights and Privacy Act (FERPA) mandates the protection of student information, including contact information, grades, and graded assignments. I will use email to communicate with you, and I will not post grades or leave graded assignments in public places. Students will be notified at the time of an assignment if copies of student work will be retained beyond the end of the semester or used as examples for future students or the wider public. Students maintain intellectual property rights to work products they create as part of this course unless they are formally notified otherwise.
According to the University Policy File, students should notify instructors of planned absences for religious observances by the end of the second week of classes.
A complete list of all academic support services—including the Writing Center and Math Learning Center—is available on the Student Affairs’ Academic Success website. Counseling & Psychological Services (619-594-5220, sdsu.edu/cps) offers a range of psychological services for students. Emergency support is available after hours at the same phone number. The San Diego Access and Crisis Line can also be accessed 24 hours/day (1-888-724-7240).
As an instructor, one of my responsibilities is to help create a safe learning environment on our campus. I am a mandated reporter in my role as an SDSU employee. It is my goal that you feel able to share information related to your life experiences in classroom discussions, in your written work, and in our one-on-one meetings. I will seek to keep the information you share private to the greatest extent possible. However, I am required to share information regarding sexual violence on SDSU’s campus with the Title IX coordinator, Gail Mendez (619-594-6464). She (or her designee) will contact you to let you know about accommodations and support services at SDSU and possibilities for holding accountable the person who harmed you. Know that you will not be forced to share information you do not wish to disclose and your level of involvement will be your choice. If you do not want the Title IX Officer notified, instead of disclosing this information to your instructor, you can speak confidentially with the following people on campus and in the community: the Sexual Violence Victim Advocate (619-594-0210) or Counseling and Psychological Services (619-594-5220, psycserv@sdsu.edu). They can connect you with support services and discuss options for pursuing a University or criminal investigation. For more information regarding your university rights and options as a survivor of sexual misconduct or sexual violence, please visit titleix.sdsu.edu.
If you or a friend are experiencing food or housing insecurity, technology concerns, or any unforeseen financial crisis, it is easy to get help! Visit sdsu.edu/ecrt for more information or to submit a request for assistance.
SDSU’s Economic Crisis Response Team (ECRT) aims to bridge the gap in resources for students experiencing immediate food, housing, or unforeseen financial crises that impact student success. Using a holistic approach to well-being, ECRT supports students through crisis by leveraging a campus-wide collaboration that utilizes on- and off-campus partnerships and provides direct referrals based on each student’s unique circumstances. ECRT empowers students to identify and access long-term, sustainable solutions in an effort to successfully graduate from SDSU. Within 24 to 72 hours of submitting a referral, students are contacted by a member of ECRT and are quickly connected to the appropriate resources and services.
For students who need assistance accessing technology for their classes, visit our ECRT website (sdsu.edu/ecrt) to be connected with the SDSU library's technology checkout program. The technology checkout program is available to both SDSU and Imperial Valley students.
For millennia, the Kumeyaay people have been a part of this land. This land has nourished, healed, protected and embraced them for many generations in a relationship of balance and harmony. As members of the San Diego State University community, we acknowledge this legacy. We promote this balance and harmony. We find inspiration from this land, the land of the Kumeyaay.