Applied Data Science and Machine Learning in Python

Duration: 6 months

Technologies: Data Science and Machine Learning in Python

This data science bootcamp is a deep dive into the fundamentals of data science and machine learning with Python. Throughout the course, you will gain a comprehensive understanding of the data science process from end to end, including data preparation, data analysis and visualization, and how to apply machine learning algorithms properly to a range of tasks. You’ll also walk away with a portfolio of projects to show prospective employers alongside your data science certification.

Python for Data Science

Learn the Python fundamentals needed for data science.

Manipulating and Understanding Data

Learn how to load, clean, and manipulate data using the Python library Pandas. Additionally, you will learn the strengths and weaknesses of using Python to manipulate data.
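As a rough illustration of the load, clean, and manipulate workflow this unit describes, here is a minimal sketch. The CSV content, the column names, and the helper load_and_clean are hypothetical and are not taken from the course materials.

```python
import io

import pandas as pd

# Hypothetical CSV content standing in for a real file on disk.
RAW_CSV = """city,temperature_c
Oslo,4.5
Lima,
Cairo,30.0
Oslo,4.5
"""


def load_and_clean(csv_text: str) -> pd.DataFrame:
    """Load CSV text, drop duplicate and incomplete rows, add a derived column."""
    df = pd.read_csv(io.StringIO(csv_text))          # load
    df = df.drop_duplicates()                        # clean: remove repeated rows
    df = df.dropna(subset=["temperature_c"])         # clean: remove missing readings
    df = df.assign(temperature_f=df["temperature_c"] * 9 / 5 + 32)  # manipulate
    return df.reset_index(drop=True)


clean = load_and_clean(RAW_CSV)
print(clean)
```

With a real file, pd.read_csv would take the file path instead of an in-memory buffer.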

Foundations of Data Modeling

Build visualizations not only to understand your data but also to communicate results to stakeholders.

Statistical Inference

Learn how to use Python to implement key statistical techniques and understand statistics better by experimenting with Python on real-world datasets. This week concludes with a project to showcase your knowledge.

Intro to Machine Learning

Learn what machine learning is and why you should use the Python library scikit-learn for it. Topics include types of machine learning, how to format your data so that an algorithm can accept it, and how to train an algorithm.
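To give a sense of what formatting data and training an algorithm look like in practice, here is a minimal sketch. The choice of the bundled iris dataset and a k-nearest-neighbors classifier is an assumption made for illustration, not necessarily what the course uses.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# scikit-learn expects a 2-D feature matrix X (rows = samples) and a 1-D target y.
X, y = load_iris(return_X_y=True)

# Hold out a quarter of the rows so the model is scored on data it has not seen.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Illustrative model choice; any scikit-learn estimator follows the same fit/score pattern.
model = KNeighborsClassifier(n_neighbors=5)
model.fit(X_train, y_train)

accuracy = model.score(X_test, y_test)
print(f"Held-out accuracy: {accuracy:.2f}")
```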

Decision Trees & Random Forests

Learn about tree-based machine learning algorithms, how to tune them to maximize their performance, and the strengths and weaknesses of each algorithm. Additional topics include feature selection for machine learning and comparing machine learning algorithms.

Logistic Regression and Regularization

Learn about the logistic regression algorithm and get a visual understanding of how the algorithm works. Additional topics include logistic regression for multiclass classification, L1 and L2 regularization, and hyperparameter tuning for the algorithms learned so far.

Clustering Algorithms

You’ll learn about a host of clustering algorithms, how to tune them, and the strengths and weaknesses of each.

Dimensionality Reduction

Learn what dimensionality reduction is and how to use it to visualize data, speed up machine learning algorithms, and understand your data better. Algorithms covered include Principal Component Analysis (PCA).
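As a small example of PCA used for visualization, the sketch below projects four-dimensional data onto two components. The iris dataset is an assumed stand-in for whatever data the course uses.

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA

# Four numeric features per sample; too many to draw on a single 2-D scatter plot.
X, _ = load_iris(return_X_y=True)

# Project onto the two directions that capture the most variance.
pca = PCA(n_components=2)
X_2d = pca.fit_transform(X)

# Fraction of the original variance that survives the projection.
retained = pca.explained_variance_ratio_.sum()
print(X_2d.shape, round(float(retained), 3))
```

The two columns of X_2d can be passed directly to a scatter plot, and the retained variance indicates how much information the 2-D view keeps.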

Gradient Boosting Machines

Learn what gradient boosting algorithms are, why they are so performant, and how to get started with Kaggle competitions.

Using SQL with Python

Working with databases is an essential part of being a data analyst, data scientist, or data engineer. This unit will cover how SQL and Python work together.
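One way SQL and Python can work together is through the sqlite3 module in the Python standard library. The sketch below is illustrative only: the orders table, its columns, and the helper total_by_customer are hypothetical, and the course may use a different database or driver.

```python
import sqlite3


def total_by_customer(rows):
    """Load (customer, amount) rows into an in-memory table and total them with SQL."""
    conn = sqlite3.connect(":memory:")  # temporary database, nothing written to disk
    try:
        conn.execute("CREATE TABLE orders (customer TEXT, amount REAL)")
        # Parameter placeholders (?) keep the data separate from the SQL text.
        conn.executemany("INSERT INTO orders VALUES (?, ?)", rows)
        cursor = conn.execute(
            "SELECT customer, SUM(amount) FROM orders "
            "GROUP BY customer ORDER BY customer"
        )
        return cursor.fetchall()
    finally:
        conn.close()


totals = total_by_customer([("ana", 10.0), ("ben", 5.0), ("ana", 2.5)])
print(totals)
```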

Intro to Deep Learning

Learn why deep learning has transformed industries, explore various deep learning frameworks, and find out when to use deep learning techniques. Topics include recurrent neural networks (RNNs) and convolutional neural networks (CNNs).

Database Architecture

Become familiar with entity-relationship diagrams (ERDs) and learn the advantages of using a relational database.

Learn intermediate SQL queries to access and aggregate information.

Intro to ETL

Develop an understanding of the process of extracting, transforming, and loading data.

Introduction to Statistics

Learn tools for statistical analysis, including measures of central tendency, variance, standard deviation, and the comparison of means.
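The measures named in this unit can be computed with the statistics module in the Python standard library. The two samples below are made up for illustration, and the Welch t statistic is one assumed way of comparing means; the course may teach a different test.

```python
import math
import statistics

# Two made-up samples of equal size.
group_a = [2, 4, 4, 4, 5, 5, 7, 9]
group_b = [1, 3, 3, 3, 4, 4, 6, 8]

# Measures of central tendency.
mean_a = statistics.mean(group_a)
median_a = statistics.median(group_a)

# Spread, treating group_a as a whole population.
variance_a = statistics.pvariance(group_a)
std_dev_a = statistics.pstdev(group_a)


def welch_t(sample_1, sample_2):
    """Welch t statistic: difference in means scaled by its estimated standard error."""
    standard_error = math.sqrt(
        statistics.variance(sample_1) / len(sample_1)
        + statistics.variance(sample_2) / len(sample_2)
    )
    return (statistics.mean(sample_1) - statistics.mean(sample_2)) / standard_error


t_stat = welch_t(group_a, group_b)
print(mean_a, median_a, variance_a, std_dev_a, round(t_stat, 3))
```

Note that pvariance and pstdev treat the data as a full population, while variance (used inside welch_t) applies the sample correction.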

Model Assumptions

Explore model assumptions and how to test for them. Apply this knowledge to choose the appropriate model for a data set.

Capstone Project