
Python in Data Science: Intermediate Python with NumPy, Pandas, SciKit Learn, SciPy, Spark, Streaming & More


5 Days Classroom Session | 5 Days Live Online
Classroom Registration
Individual:
$2595.00
Group Rate:
$2495.00
(per registrant, 2 or more)
GSA Individual:
$1894.35
Live Online Registration
Live Online:
$2595.00
Private Onsite Package

This course can be tailored to your needs for private, onsite delivery at your location.

Request a Private Onsite Price Quote

Professional Credits

IIBA (CDU)

ASPE is an IIBA Endorsed Education Provider of business analysis training. Select Project Delivery courses offer IIBA continuing development units (CDU) in accordance with IIBA standards.

PMI (PDU)

Select courses offer Leadership (PDU-L), Strategic (PDU-S) and Technical PMI professional development units that vary according to certification. Technical PDUs are available in the following types: ACP, PBA, PfMP, PMP/PgMP, RMP, and SP.

Certification
Overview

This course covers the essentials of using Python as a tool for data scientists to perform exploratory data analysis, complex visualizations, and large-scale distributed processing on “Big Data”. We cover essential mathematical and statistical libraries such as NumPy, Pandas, SciPy, and SciKit-Learn; frameworks such as TensorFlow and Spark; and visualization tools such as matplotlib, PIL, and Seaborn. The course is intermediate level: it assumes that attendees have a solid background in data analytics and data science and basic Python knowledge. Topics are introductory in nature but are covered in depth and geared toward experienced students.

This course is roughly 50% hands-on lab and 50% lecture, combining engaging instructor presentations, demos, and practical group discussions with extensive machine-based student labs and project work. Throughout the course, students will learn to write Python scripts and apply them within a scientific framework, working with the latest technologies listed on the agenda. The course gives students practical grounding in the range of technologies at the leading edge of data science development.

Working in a hands-on learning environment led by our expert practitioner, students will learn:

  • How to work with Python in a Data Science Context
  • How to use NumPy, Pandas, and Matplotlib
  • How to create and process images with PIL
  • How to visualize with Seaborn
  • Key features of SciPy and SciKit-Learn
  • How to interact with Spark using DataFrames
  • How to use Spark SQL, MLlib, and Streaming with Big Data
Upcoming Dates and Locations
All Live Online times are listed in Eastern Time.
Nov 2, 2020 – Nov 6, 2020 | 10:00am – 6:00pm | Live Online
Dec 7, 2020 – Dec 11, 2020 | 10:00am – 6:00pm | Live Online
Course Outline

Session: Python for Data Science

Lesson: Python Review (Optional)

  • Python Language
  • Essential Syntax
  • Lists, Sets, Dictionaries, and Comprehensions
  • Functions
  • Classes, Modules, and imports
  • Exceptions

Lesson: IPython

  • IPython basics
  • Terminal and GUI shells
  • Creating and using notebooks
  • Saving and loading notebooks
  • Ad hoc data visualization
  • Web Notebooks (Jupyter)

Lesson: numpy

  • numpy basics
  • Creating arrays
  • Indexing and slicing
  • Large number sets
  • Transforming data
  • Advanced tricks
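
As a rough illustration of the kind of material this lesson touches on (creating arrays, indexing and slicing, and transforming data), here is a minimal sketch. It is not taken from the course labs; the array contents and variable names are made up for the example.

```python
import numpy as np

# Creating arrays: a 3x4 grid holding the integers 0..11 (illustrative data)
values = np.arange(12).reshape(3, 4)

# Indexing and slicing
second_row = values[1]        # the row at index 1
last_column = values[:, -1]   # the last element of every row

# Transforming data: vectorized arithmetic and a boolean mask
doubled = values * 2
evens = values[values % 2 == 0]

print(second_row)
print(last_column)
print(doubled.sum())
print(evens)
```

The vectorized expressions operate on whole arrays at once, which is typically why NumPy is preferred over plain Python loops for large number sets.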

Lesson: scipy

  • What can scipy do?
  • Most useful functions
  • Curve fitting
  • Modeling
  • Data visualization
  • Statistics

Lesson: A tour of scipy subpackages

  • Clustering
  • Physical and mathematical Constants
  • FFTs
  • Integral and differential solvers
  • Interpolation and smoothing
  • Input and Output
  • Linear Algebra
  • Image Processing
  • Orthogonal Distance Regression
  • Root-finding
  • Signal Processing
  • Sparse Matrices
  • Spatial data and algorithms
  • Statistical distributions and functions
  • C/C++ Integration

Lesson: pandas

  • pandas overview
  • Dataframes
  • Reading and writing data
  • Data alignment and reshaping
  • Fancy indexing and slicing
  • Merging and joining data sets
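
To give a sense of what merging and joining data sets can look like in practice, here is a minimal sketch. It is not drawn from the course materials; the tables, column names, and values are hypothetical.

```python
import pandas as pd

# Two small, made-up tables that share a key column
orders = pd.DataFrame({
    "customer_id": [1, 2, 2, 3],
    "amount": [20.0, 35.5, 12.0, 8.25],
})
customers = pd.DataFrame({
    "customer_id": [1, 2, 4],
    "name": ["Ada", "Grace", "Linus"],
})

# Inner join: keep only orders whose customer_id appears in both tables
merged = orders.merge(customers, on="customer_id", how="inner")

# Aggregate the joined data: total order amount per customer name
totals = merged.groupby("name")["amount"].sum()

print(merged)
print(totals)
```

Changing `how` to `"left"`, `"right"`, or `"outer"` keeps unmatched rows from one or both tables, filling the missing columns with NaN.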

Lesson: matplotlib

  • Creating a basic plot
  • Commonly used plots
  • Ad hoc data visualization
  • Advanced usage
  • Exporting images

Lesson: The Python Imaging Library (PIL)

  • PIL overview
  • Core image library
  • Image processing
  • Displaying images

Lesson: seaborn

  • Seaborn overview
  • Bivariate and univariate plots
  • Visualizing Linear Regressions
  • Visualizing Data Matrices
  • Working with Time Series data

Lesson: SciKit-Learn Machine Learning Essentials

  • SciKit overview
  • SciKit-Learn overview
  • Algorithms Overview
  • Classification, Regression, Clustering, and Dimensionality Reduction
  • SciKit Demo

Lesson: TensorFlow Overview

  • TensorFlow overview
  • Keras
  • Getting Started with TensorFlow

Session: Python on Spark

Lesson: PySpark Overview

  • Python and Spark
  • SciKit-Learn vs. Spark MLlib
  • Python at Scale
  • PySpark Demo

Lesson: RDDs and DataFrames

  • DataFrames and Resilient Distributed Datasets (RDDs)
  • Partitions
  • Adding variables to a DataFrame
  • DataFrame Types
  • DataFrame Operations
  • Dependent vs. Independent variables
  • Map/Reduce with DataFrames

Lesson: Spark SQL

  • Spark SQL Overview
  • Data stores: HDFS, Cassandra, HBase, Hive, and S3
  • Table Definitions
  • Queries

Lesson: Spark MLlib

  • MLlib overview
  • MLlib Algorithms Overview
  • Classification Algorithms
  • Regression Algorithms
  • Decision Trees and forests
  • Recommendation with ALS
  • Clustering Algorithms
  • Machine Learning Pipelines
  • Linear Algebra (SVD, PCA)
  • Statistics in MLlib

Lesson: Spark Streaming

  • Streaming overview
  • Integrating Spark SQL, MLlib, and Streaming
Who should attend
  • Experienced data analysts, developers, engineers, or anyone tasked with using Python for data analytics.
Pre-Requisites
  • Attending students are required to have a background in basic Python development skills.
  • Completion of Applied Python for Data Science, or equivalent skills