Data Science with Python - Simplilearn | IT Training & Certification | Info Trek

Data Science with Python - Simplilearn

  • On Demand
    • HRDF SBL Claimable
    • Certificate of Attendance available
    • 180 days of access from date of purchase
    • Starting from RM 1550.79
    • 44 hours
  • Private Class
    • All of our private classes are customized to your organization's needs.

Course Details

The Data Science with Python course explores the Python libraries and tools that help you tackle each stage of Data Analytics. Python is a general-purpose, multi-paradigm programming language that has gained wide popularity in data science because of its simple syntax and its operability across different ecosystems. This course helps programmers work with data across a range of tasks: data munging, data wrangling, web scraping, web application building, data engineering, and more. Python also makes it easier for programmers to write maintainable, robust code at large scale.

The course starts off with a brief introduction to Data Science, statistical concepts pertaining to Data Analytics, and a few basic concepts of Python programming. It then goes on to cover in-depth content for libraries such as NumPy, Pandas, SciPy, scikit-learn, and Matplotlib. The course also tackles important activities such as web scraping and Python integration with Hadoop MapReduce and Spark.

This course is best suited for:

  • Analytics professionals who want to work with Python or open-source tools
  • Software professionals looking for a career switch in the field of analytics
  • Graduates looking to build a career in Analytics and Data Science
  • Experienced professionals who would like to harness data science in their fields
  • Anyone with a genuine interest in the field of Data Science

After completing this course, you will be able to:

  • Outline what Data Science is and how Python can help implement it
  • Describe each stage of the Data Analytics process
  • Explain basic statistical concepts relevant to Data Analytics
  • Install the required Python environment and other auxiliary tools and libraries
  • Review the important concepts of Python programming used to implement Data Science
  • Demonstrate the use of the major Python libraries such as NumPy, Pandas, SciPy, scikit-learn, and Matplotlib to carry out different aspects of the Data Analytics process
  • Employ different tools and methods to perform web scraping
  • Illustrate Python integration with Hadoop MapReduce and Spark

Modules

  • Data Science
  • Data Scientists
  • Examples of Data Science
  • Python for Data Science
  • Introduction to Data Visualization
  • Processes in Data Science
  • Data Wrangling, Data Exploration, and Model Selection
  • Exploratory Data Analysis or EDA
  • Data Visualization
  • Plotting
  • Hypothesis Building and Testing
  • Introduction to Statistics
  • Statistical and Non-Statistical Analysis
  • Some Common Terms Used in Statistics
  • Data Distribution: Central Tendency, Percentiles, Dispersion
  • Histogram
  • Bell Curve
  • Hypothesis Testing
  • Chi-Square Test
  • Correlation Matrix
  • Inferential Statistics
  • Introduction to Anaconda
  • Installation of Anaconda Python Distribution - For Windows, Mac OS, and Linux
  • Jupyter Notebook Installation
  • Jupyter Notebook Introduction
  • Variable Assignment
  • Basic Data Types: Integer, Float, String, None, and Boolean; Typecasting
  • Creating, accessing, and slicing tuples
  • Creating, accessing, and slicing lists
  • Creating, viewing, accessing, and modifying dicts
  • Creating and using operations on sets
  • Basic Operators: 'in', '+', '*'
  • Functions
  • Control Flow
  • NumPy Overview
  • Properties, Purpose, and Types of ndarray
  • Class and Attributes of ndarray Object
  • Basic Operations: Concept and Examples
  • Accessing Array Elements: Indexing, Slicing, Iteration, Indexing with Boolean Arrays
  • Copy and Views
  • Universal Functions (ufunc)
  • Shape Manipulation
  • Broadcasting
  • Linear Algebra
  • SciPy and its Characteristics
  • SciPy sub-packages
  • SciPy sub-packages – Integration
  • SciPy sub-packages – Optimize
  • Linear Algebra
  • SciPy sub-packages – Statistics
  • SciPy sub-packages – Weave
  • SciPy sub-packages – I/O
  • Introduction to Pandas
  • Data Structures
  • Series
  • DataFrame
  • Missing Values
  • Data Operations
  • Data Standardization
  • Pandas File Read and Write Support
  • SQL Operation
  • Introduction to Machine Learning
  • Machine Learning Approach
  • How Supervised and Unsupervised Learning Models Work
  • Scikit-Learn
  • Supervised Learning Models: Linear Regression
  • Supervised Learning Models: Logistic Regression
  • K Nearest Neighbors (K-NN) Model
  • Unsupervised Learning Models: Clustering
  • Unsupervised Learning Models: Dimensionality Reduction
  • Pipeline
  • Model Persistence
  • Model Evaluation - Metric Functions
  • NLP Overview
  • NLP Approach for Text Data
  • NLP Environment Setup
  • NLP Sentence analysis
  • NLP Applications
  • Major NLP Libraries
  • Scikit-Learn Approach
  • Scikit-Learn Approach: Built-in Modules
  • Scikit-Learn Approach: Feature Extraction
  • Bag of Words
  • Extraction Considerations
  • Scikit-Learn Approach: Model Training
  • Scikit-Learn Grid Search and Multiple Parameters
  • Pipeline
  • Introduction to Data Visualization
  • Python Libraries
  • Plots
  • Matplotlib Features
  • Line Properties Plot with (x, y)
  • Controlling Line Patterns and Colors
  • Set Axis, Labels, and Legend Properties
  • Alpha and Annotation
  • Multiple Plots
  • Subplots
  • Types of Plots and Seaborn
  • Web Scraping
  • Common Data/Page Formats on The Web
  • The Parser
  • Importance of Objects
  • Understanding the Tree
  • Searching the Tree
  • Navigating options
  • Modifying the Tree
    • Parsing Only Part of the Document
    • Printing and Formatting
    • Encoding
  • Need for Integrating Python with Hadoop
  • Big Data Hadoop Architecture
  • MapReduce
  • Cloudera QuickStart VM Set Up
  • Apache Spark
  • Resilient Distributed Datasets (RDD)
  • PySpark
  • Spark Tools
  • PySpark Integration with Jupyter Notebook
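
As a rough taste of the Python basics modules listed above (variable assignment, basic data types and typecasting, tuples, lists, dicts, sets, the 'in', '+', and '*' operators, functions, and control flow), the sketch below uses only the standard library. It is not taken from the course material; all variable names and sample values are made up for illustration.

```python
# Illustrative sketch only: names and sample values are hypothetical,
# not taken from the course material.

# Variable assignment and basic data types
count = 3          # int
ratio = 0.5        # float
label = "data"     # str
missing = None     # NoneType
ready = True       # bool

# Typecasting between basic types
as_int = int("42")
as_text = str(count)

# Tuples and lists: creating, accessing, slicing
point = (10, 20, 30)
first_two = point[:2]
scores = [70, 85, 90, 65]
last_two = scores[-2:]

# Dicts: creating, accessing, modifying
course = {"name": "Data Science with Python", "hours": 44}
course["mode"] = "On Demand"

# Sets and set operations
libs_a = {"NumPy", "Pandas", "SciPy"}
libs_b = {"SciPy", "Matplotlib"}
shared = libs_a & libs_b          # intersection

# Basic operators: 'in', '+', '*'
has_numpy = "NumPy" in libs_a     # membership test
combined = scores + [100]         # list concatenation
repeated = label * 2              # string repetition


# Functions
def mean(values):
    """Return the arithmetic mean of a non-empty sequence."""
    return sum(values) / len(values)


# Control flow: keep only the scores at or above a threshold
passed = []
for s in scores:
    if s >= 70:
        passed.append(s)
```

Running this in a Jupyter Notebook cell and inspecting each variable in turn is one simple way to follow along with these topics.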
