Software Engineering for Data Scientists
Asynchronous
Overview
The Software Engineering for Data Scientists course is meant to help data scientists write production-ready code and gain familiarity with the tools used to make models available to their users. The core idea we will be exploring is making code robust and reusable across a team. This course can also serve as an introduction to ideas used in ML Ops and Data Engineering.
Instructional Team
Kevin Nowland
Lead instructor, ML Ops engineer
Office Hours:
Intermittent Thursday Afternoons
Preferred Contact:
Slack
Please reach out on Slack if you have any questions about the content in this course!
Objectives
After completing this course, you will be able to do the following:
- Understand common tools used to deploy models for real-time inference
- Improve your code's robustness through unit testing
- Improve your code's readability through using linters and type checking
- Use basic command line commands
- Implement a simple continuous integration pipeline using GitHub Actions
Project Examples
First Steps/Prerequisites
- Figure out how to access a terminal emulator, e.g., the Terminal program on macOS or Ubuntu
- If using Windows, enable the Windows Subsystem for Linux and access a terminal emulator
- Install pyenv and use it to install Python 3.10.x
Program Content
Textbook/Notes
Intro to the CLI - part 1
Getting ready to code
We’ll be talking about the different shells that allow you to interact with your computer, navigating the filesystem, and basic ways to manipulate files.
Dependency management
Getting ready to code
How to set up a Python environment, with an emphasis on communicating what the environment is across a team.
Unit testing
Code quality
We talk about refactoring code to be DRY (don't repeat yourself), the scope of functions, and unit testing some (but not all!) of them.
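As a rough illustration of the ideas in this video (not taken from the course materials), the sketch below pulls a repeated scaling step into one small function and tests it with plain assert-based test functions. The function name `normalize` and the pytest-style layout are assumptions made for this example; the course may use different names or a different test framework.

```python
def normalize(values):
    """Scale a list of numbers linearly onto the range [0, 1].

    Hypothetical helper for illustration: instead of repeating this
    arithmetic wherever it is needed, it lives in one place (DRY).
    """
    lo, hi = min(values), max(values)
    if lo == hi:
        # All values are equal, so avoid dividing by zero.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def test_normalize_scales_to_unit_interval():
    assert normalize([2, 4, 6]) == [0.0, 0.5, 1.0]


def test_normalize_handles_constant_input():
    assert normalize([3, 3]) == [0.0, 0.0]
```

A test runner such as pytest would collect the `test_` functions automatically; they can also be called directly, since each is an ordinary function.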
Intro to the CLI - part 2
Getting ready to code
We'll talk about configuring your shell, file ownership and permissions, how to talk to other computers, and other useful command line tools.
Code style
Code quality
Taking to heart the truism that each line of code will be read more often than it is written, we explore Python style as commonly used in the greater Python community.
Type hints
Code quality
In this video we’re going to talk about type hints, which are a way to help you and your teammates know what types of objects a function requires and what objects it should return.
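A minimal sketch of what type hints look like, assuming Python 3.9 or later (the course prerequisites mention 3.10.x). The function names `mean` and `describe` are made up for this example and do not come from the course.

```python
def mean(values: list[float]) -> float:
    """Return the arithmetic mean of a non-empty list of numbers."""
    if not values:
        raise ValueError("values must be non-empty")
    return sum(values) / len(values)


def describe(name: str, values: list[float]) -> str:
    """Format a short, human-readable summary line."""
    return f"{name}: mean={mean(values):.2f}"


print(describe("latency", [1.0, 2.0]))
```

The hints are not enforced when the code runs. A separate type checker such as mypy reads them and can flag a call like `mean("abc")` before the code is ever executed.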
Text Editors
Getting ready to code
A brief introduction to the command-line editor vim and a comparison with VS Code, a full IDE.
Packages
Code quality
We present one way to write a pip-installable Python package, refactoring code written in the previous video.
Trunk based development
Code quality
How do we incorporate changes to a repository when multiple team members are working on the repository at once?
Project/Homework Instructions
Schedule
Please check your registration email for the program schedule and Zoom links.
Project/Homework Deadlines