Python 203 - Python for Data Sciences (PDS)

 

Course Overview

In this course you will learn to use Python, one of the most popular programming languages for data sciences, for data analysis and data visualization. Explore Python libraries that make it easier to sort and analyze data sets for emerging trends. Quickly produce Excel-quality visualizations suitable for display in real-time monitoring systems.

The course introduces data science with Python libraries such as pandas and NumPy to identify trends within datasets. Create rich visualizations with Matplotlib, seaborn, and Folium. Use the open-source SciPy toolset for mathematics, science, and engineering applications. Get an introduction to scikit-learn, a machine learning library for working with datasets.

Who should attend

This course was written for professionals interested in Python and data sciences, including engineers, mathematicians, actuaries, network specialists, system administrators, and developers.

Prerequisites

Keyboard proficiency and some previous Python coding experience are the only hard requirements. Students with previous exposure to Python, or experience with any other scripting language, will get the most from the course. For students without that experience, Alta3 Research’s Python Basics course is recommended.

Recommended Prerequisite: Python Basics (5 days)

Outline: Python 203 - Python for Data Sciences (PDS)

Introduction to Python Libraries for Data Sciences

  • Python with Jupyter Notebook overview
    • Live code
    • Equations
    • Data cleaning
    • Transformation
    • Numerical simulation
    • Statistical modeling
    • Data visualization
    • Machine Learning
  • Pandas
    • Filter DataFrames
    • Dictionaries to DataFrames
    • CSV to DataFrames
    • Excel to DataFrames
  • NumPy
    • Work across arrays
  • Requests
    • Pull from RESTful APIs
    • JSON
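
As a rough illustration of the pandas and NumPy topics listed above (dictionaries to DataFrames, filtering, and working across arrays), a minimal sketch might look like the following. The sensor names and temperature values are hypothetical sample data, not taken from the course materials.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data, invented for illustration only.
readings = {
    "sensor": ["a", "b", "c", "d"],
    "temp_c": [21.5, 19.0, 24.2, 22.8],
}

# Dictionary to DataFrame: each key becomes a column.
df = pd.DataFrame(readings)

# Filter the DataFrame with a boolean mask.
warm = df[df["temp_c"] > 22.0]

# Work across a whole array at once: convert Celsius to Fahrenheit.
temps_f = df["temp_c"].to_numpy() * 9 / 5 + 32

print(warm)
print(np.round(temps_f, 1))
```

The same filtering pattern applies to DataFrames loaded from CSV or Excel files, for example with `pd.read_csv`.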

Sort, Analyze, and Visualize Data with Python

  • Matplotlib
    • Line Plots
    • Area Plots
    • Histograms
    • Bar Charts
    • Pie Charts
    • Box Plots
    • Scatter Plots
    • Bubble Plots
    • Waffle Charts
    • Word Clouds
  • Seaborn
    • Visualization techniques
      • Relational
      • Categorical
      • Distributions
      • Regressions
  • Folium
    • Interactive Leaflet maps
    • Choropleth visualizations
    • Rich vector/raster/HTML visual markers
  • Saving visualization output in various formats

Python and Databases

  • Creating a database engine in Python
  • sqlite3
  • Looking at tables in a database
  • Querying relational databases
  • MySQL and Python
  • SQL Queries
    • Filtering with SQL WHERE
    • Ordering with SQL ORDER BY
    • Querying with pandas
    • Table relationships with INNER JOIN
  • MongoDB
    • Understanding NoSQL
    • Python and MongoDB
    • PyMongo
      • Query
      • Find
      • Delete
      • Update
      • Limit
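
The SQL topics above (looking at tables, WHERE, ORDER BY, and INNER JOIN) can be sketched with the standard-library sqlite3 module. This is a minimal example under stated assumptions: the `customers` and `orders` tables and their rows are hypothetical and exist only in an in-memory database.

```python
import sqlite3

# In-memory database, so nothing is written to disk.
conn = sqlite3.connect(":memory:")

# Hypothetical schema and rows, invented for illustration only.
conn.executescript("""
CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE orders (
    id INTEGER PRIMARY KEY,
    customer_id INTEGER REFERENCES customers(id),
    total REAL
);
INSERT INTO customers VALUES (1, 'Ada'), (2, 'Grace');
INSERT INTO orders VALUES (1, 1, 40.0), (2, 2, 15.5), (3, 1, 99.9);
""")

# Look at the tables in the database.
tables = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
]

# Filter with WHERE, sort with ORDER BY, and relate tables with INNER JOIN.
# The ? placeholder keeps the threshold out of the SQL string itself.
rows = conn.execute(
    """
    SELECT c.name, o.total
    FROM orders AS o
    INNER JOIN customers AS c ON c.id = o.customer_id
    WHERE o.total > ?
    ORDER BY o.total DESC
    """,
    (20.0,),
).fetchall()

conn.close()

print(tables)
print(rows)
```

The same SELECT statement could also be passed to `pandas.read_sql_query` along with an open connection to get the result as a DataFrame.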

Introduction to Machine Learning with Python

  • SciPy open ecosystem
    • Numerical integration
    • Interpolation
    • Optimization
    • Linear algebra
    • Statistics
  • Scikit-learn
    • Applications of Machine Learning
    • Training vs Testing sets
    • Supervised vs Unsupervised Learning
    • Python libraries suitable for Machine Learning
    • Loading an example dataset
    • Learning and predicting
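
One possible sketch of the scikit-learn topics above (loading an example dataset, splitting into training and testing sets, then learning and predicting) is shown below. The choice of the iris dataset, a k-nearest-neighbors classifier, and the split parameters are assumptions made for illustration, not necessarily what the course uses.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Load a small example dataset bundled with scikit-learn.
X, y = load_iris(return_X_y=True)

# Hold out a quarter of the samples for testing; the seed makes the split repeatable.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Supervised learning: fit on the training set only.
model = KNeighborsClassifier(n_neighbors=5)
model.fit(X_train, y_train)

# Predict labels for samples the model has not seen.
predictions = model.predict(X_test)
accuracy = model.score(X_test, y_test)

print(f"test accuracy: {accuracy:.2f}")
```

Scoring on held-out data rather than on the training data gives a more honest estimate of how the model behaves on new samples.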

Introduction to Machine Learning with Python (continued)

  • Scikit-learn
    • Model persistence
    • Conventions
    • Refitting and updating parameters
    • Multiclass vs. multilabel fitting
  • Moving output to remote systems
    • Streaming (push) to real-time dashboard APIs
    • Move data with SFTP
    • Email attachments

Prices & Delivery methods

Online Training

Duration
5 days

Price
  • CAD 3,160

Classroom Training

Duration
5 days

Price
  • Canada: CAD 3,160

Schedule

Currently there are no training dates scheduled for this course.