Quantitative Data Management, Analysis and Visualization with Python
Details
Introduction
This comprehensive course is your guide to using Python to analyze big data, create beautiful visualizations, and apply powerful machine learning algorithms. It is designed both for beginners with basic programming experience and for experienced developers looking to make the jump to data science and big data analysis. Python is an adaptable, robust open-source language that is easy to learn and offers powerful libraries for data manipulation and analysis. For many years now, Python has been used in scientific computing and in mathematical domains such as physics, finance, oil and gas, and signal processing. This Big Data Analytics with Python course provides a complete overview of data analysis techniques using Python.
Outline
Course content
Module 1: Basic statistical terms and concepts
· Introduction to statistical concepts
· Descriptive Statistics
· Inferential statistics
Module 2: Research Design
· The role and purpose of research design
· Types of research designs
· The research process
· Which method to choose?
· Exercise: Identify a project of choice and develop a research design
Module 3: Survey Planning, Implementation and Completion
· Types of surveys
· The survey process
· Survey design
· Methods of survey sampling
· Determining the sample size
· Planning a survey
· Conducting the survey
· After the survey
· Exercise: Planning for a survey based on the research design selected
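As an illustration of the "Determining the sample size" topic in this module, here is a minimal sketch based on Cochran's formula for estimating a proportion, with an optional finite population correction. The choice of formula, the function name `cochran_sample_size`, and the default values (z = 1.96 for roughly 95% confidence, p = 0.5, 5% margin of error) are assumptions made for illustration; the course may teach a different method.

```python
import math


def cochran_sample_size(z=1.96, p=0.5, margin=0.05, population=None):
    """Return a suggested sample size (hypothetical helper, not course material).

    z          -- z-score for the desired confidence level (1.96 is about 95%)
    p          -- expected proportion; 0.5 gives the most conservative size
    margin     -- acceptable margin of error, as a fraction
    population -- total population size, if known and finite
    """
    # Cochran's formula for a large (effectively infinite) population.
    n0 = (z ** 2) * p * (1 - p) / margin ** 2
    if population is not None:
        # Finite population correction shrinks the required sample.
        n0 = n0 / (1 + (n0 - 1) / population)
    # Round up: a fraction of a respondent is not possible.
    return math.ceil(n0)


n_large = cochran_sample_size()
n_small = cochran_sample_size(population=1000)
print(n_large, n_small)
```

With the default assumptions, a small known population needs noticeably fewer respondents than the large-population figure, which is why the population size is worth estimating during survey planning.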
MODULE 4: DATA SCIENCE OVERVIEW
- Introduction to Data Science
- Different Sectors Using Data Science
- Purpose and Components of Python
MODULE 5: DATA ANALYTICS OVERVIEW
- Data Analytics Process
- Knowledge Check
- Exploratory Data Analysis (EDA)
- EDA – Quantitative Technique
- EDA – Graphical Technique
- Data Analytics Conclusion or Predictions
- Data Analytics Communication
- Data Types for Plotting
- Data Types and Plotting
MODULE 6: STATISTICAL ANALYSIS AND BUSINESS APPLICATIONS
- Introduction to Statistics
- Statistical and Non-statistical Analysis
- Major Categories of Statistics
- Statistical Analysis Considerations
- Population and Sample
- Statistical Analysis Process
- Data Distribution
- Dispersion
- Histogram
- Correlation and Inferential Statistics
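The dispersion, histogram and correlation topics in this module can be sketched with NumPy as follows. The data (hours studied and exam scores for eight students) is entirely made up for illustration, and the variable names are not taken from the course material.

```python
import numpy as np

# Hypothetical sample: hours studied and exam scores for 8 students.
hours = np.array([1, 2, 3, 4, 5, 6, 7, 8], dtype=float)
scores = np.array([52, 55, 61, 64, 70, 74, 79, 85], dtype=float)

# Central tendency and dispersion.
mean_score = scores.mean()
sample_std = scores.std(ddof=1)           # ddof=1 gives the sample standard deviation
value_range = scores.max() - scores.min()

# Histogram: count how many scores fall into each of 4 equal-width bins.
counts, bin_edges = np.histogram(scores, bins=4)

# Pearson correlation between hours studied and score.
r = np.corrcoef(hours, scores)[0, 1]

print("mean:", mean_score, "std:", round(sample_std, 2), "range:", value_range)
print("histogram counts:", counts)
print("correlation:", round(r, 3))
```

Note that `ddof=1` matters when the data is a sample rather than the whole population, which ties back to the "Population and Sample" topic above.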
MODULE 7: PYTHON ENVIRONMENT SETUP AND ESSENTIALS
- Anaconda
- Installation of Anaconda Python Distribution
- Data Types with Python
- Basic Operators and Functions
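A short sketch of the built-in data types, basic operators and functions named in this module is shown below. All values and the helper name `average` are illustrative assumptions.

```python
# Common built-in data types (values are arbitrary examples).
count = 42                      # int
price = 19.99                   # float
name = "Ada"                    # str
is_active = True                # bool
tags = ["numpy", "pandas"]      # list (mutable sequence)
point = (3, 4)                  # tuple (immutable sequence)
ages = {"Ada": 36}              # dict (key-value mapping)

# Basic arithmetic operators.
quotient = 7 / 2        # true division always returns a float
floor_quotient = 7 // 2 # floor division drops the fractional part
remainder = 7 % 2       # modulo
power = 2 ** 10         # exponentiation


# A simple user-defined function (hypothetical helper).
def average(values):
    """Return the arithmetic mean of a non-empty sequence of numbers."""
    return sum(values) / len(values)


print(type(count).__name__, type(price).__name__, type(tags).__name__)
print(quotient, floor_quotient, remainder, power)
print(average([2, 4, 6]))
```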
MODULE 8: MATHEMATICAL COMPUTING WITH PYTHON (NUMPY)
- Introduction to NumPy
- Activity – Sequence it Right
- Creating and Printing an ndarray
- Class and Attributes of ndarray
- Basic Operations
- Copies and Views
- Mathematical Functions of NumPy
- Evaluate the datasets containing GDPs of different countries
- Evaluate the datasets of Summer Olympics 2012
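The NumPy topics listed in this module (creating an ndarray, its attributes, basic operations, copies versus views, and mathematical functions) can be sketched as follows. The array values are arbitrary and are not the GDP or Olympics datasets mentioned above.

```python
import numpy as np

# Create and print a 2-D ndarray (arbitrary example values).
a = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])
print(a)

# Core attributes of the ndarray class.
print(a.ndim, a.shape, a.size, a.dtype)

# Basic operations are element-wise; reductions can work along an axis.
doubled = a * 2
total = a.sum()
col_means = a.mean(axis=0)

# View: basic slicing shares memory with the original array.
first_row = a[0, :]
first_row[0] = 100.0        # this also changes a[0, 0]

# Copy: an independent array; changes do not affect the original.
independent = a.copy()
independent[1, 1] = -1.0    # a[1, 1] is unchanged

# Mathematical functions apply element-wise.
roots = np.sqrt(np.array([1.0, 4.0, 9.0]))

print(a[0, 0], a[1, 1], roots)
```

The view/copy distinction is a frequent source of bugs: modifying a slice silently modifies the source array unless `.copy()` is called first.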
MODULE 9: SCIENTIFIC COMPUTING WITH PYTHON (SCIPY)
- Introduction to SciPy
- SciPy Sub-package – Integration and Optimisation
- SciPy Sub-packages
- Demo – Calculate Eigenvalues and Eigenvectors
- Use SciPy to solve a linear algebra problem
- Use SciPy to define 20 random variables for random values
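The SciPy topics in this module can be illustrated with a small sketch covering integration, optimisation, a linear system, eigenvalues and random values. The functions, matrix and seed below are arbitrary choices, and reading the last exercise as "draw 20 values from a standard normal random variable" is an assumption about what the exercise intends.

```python
import numpy as np
from scipy import integrate, linalg, optimize, stats

# Integration: the integral of x^2 from 0 to 3 (the exact value is 9).
area, abs_err = integrate.quad(lambda x: x ** 2, 0, 3)

# Optimisation: find the x that minimises (x - 2)^2.
result = optimize.minimize_scalar(lambda x: (x - 2) ** 2)

# Linear algebra: solve A x = b for x.
A = np.array([[3.0, 1.0],
              [1.0, 2.0]])
b = np.array([9.0, 8.0])
x = linalg.solve(A, b)

# Eigenvalues and eigenvectors of A (eigenvectors are the columns).
eigenvalues, eigenvectors = linalg.eig(A)

# Random values: 20 draws from a standard normal random variable.
samples = stats.norm.rvs(size=20, random_state=0)

print("integral:", area)
print("minimiser:", result.x)
print("solution:", x)
print("eigenvalues:", eigenvalues.real)
print("number of samples:", samples.size)
```

`linalg.eig` returns eigenvalues as complex numbers even when they are real, so `.real` is used for display here.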
MODULE 10: DATA MANIPULATION WITH PANDAS
- Introduction to Pandas
- Understanding DataFrame
- View and Select Data Demo
- Missing Values
- Data Operations
- File Read and Write Support
- Pandas SQL Operation
- Analyse the Federal Aviation Administration (FAA) dataset using Pandas
- Analyse the fire department dataset provided in CSV format
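The Pandas topics in this module (DataFrames, selecting data, missing values, data operations, file read and write, and SQL operations) can be sketched end to end as below. The CSV content, column names and table name `fire_calls` are invented for illustration and are not the FAA or fire department datasets used in the course.

```python
import io
import sqlite3

import pandas as pd

# Hypothetical CSV content; one "incidents" value is deliberately missing.
csv_text = """station,year,incidents
North,2020,12
South,2020,
North,2021,15
South,2021,9
"""

# File read: load the CSV text into a DataFrame.
df = pd.read_csv(io.StringIO(csv_text))

# View and select: first rows, then a boolean filter on one column.
print(df.head())
north = df[df["station"] == "North"]

# Missing values: count them, then fill with 0 (an assumption; dropping
# or imputing may be more appropriate for real data).
n_missing = int(df["incidents"].isna().sum())
filled = df.fillna({"incidents": 0})

# Data operations: total incidents per station.
totals = filled.groupby("station")["incidents"].sum()

# File write: export the cleaned data as CSV (to an in-memory buffer here).
buffer = io.StringIO()
filled.to_csv(buffer, index=False)

# SQL operations: store the data in an in-memory SQLite table and query it.
conn = sqlite3.connect(":memory:")
filled.to_sql("fire_calls", conn, index=False)
by_year = pd.read_sql_query(
    "SELECT year, SUM(incidents) AS total FROM fire_calls "
    "GROUP BY year ORDER BY year",
    conn,
)
conn.close()

print(totals)
print(by_year)
```

Using `io.StringIO` and an in-memory SQLite database keeps the sketch self-contained; with real files, a path such as a local CSV file name would be passed to `pd.read_csv` and `to_csv` instead.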