Curriculum


Data Science

Module 1: Excel For Data Science
  • Introduction to Excel
    • Overview of Excel interface and navigation
    • Creating and saving workbooks
    • Entering and editing data
    • Basic formatting techniques
  • Essential Functions and Formulas
    • Understanding basic arithmetic functions (SUM, AVERAGE, etc.)
    • Logical functions (IF, AND, OR)
    • Lookup and reference functions (VLOOKUP, HLOOKUP, INDEX, MATCH)
    • Date and time functions (TODAY, DATE, YEAR, MONTH, DAY)
  • Data Management and Analysis
    • Sorting and filtering data
    • Conditional formatting
    • Data validation and error checking
    • Using tables for structured data management
  • Advanced Formulas and Functions
    • Text functions (LEFT, RIGHT, CONCATENATE)
    • Statistical functions (COUNTIF, SUMIF, AVERAGEIF)
    • Array formulas and handling array operations
    • Mathematical and trigonometric functions
  • Data Visualization
    • Creating and customizing charts (bar, line, pie, scatter)
    • Using PivotTables and PivotCharts for data analysis
    • Advanced charting techniques (combo charts, sparklines)
    • Data analysis with What-If Analysis tools
  • Working with Large Datasets
    • Handling large datasets efficiently
    • Using filters and slicers for data analysis
    • Importing data from external sources (text files, CSV)
  • Advanced Data Analysis
    • Goal Seek and Solver for optimization problems
    • Statistical analysis using Data Analysis ToolPak
    • Advanced data modeling techniques
    • Using scenario manager and forecasting tools
  • Collaboration and Sharing
    • Sharing workbooks and managing shared access
    • Protecting data and workbook structure
    • Reviewing and tracking changes
    • Using Excel in collaborative environments (OneDrive, SharePoint)
  • Excel Tips and Tricks
    • Efficiency tips and keyboard shortcuts
    • Customizing Excel settings and options
    • Troubleshooting common issues
    • Best practices for using Excel in business contexts
Module 2: Tableau for Data Science
  • Introduction to Tableau
    • Overview of Tableau: Purpose, features, and benefits
    • Understanding Tableau workspace: Data pane, marks card, and shelves
    • Connecting to data sources: Excel, CSV, databases, and cloud platforms
    • Basic visualization types: Bar charts, line graphs, scatter plots, and pie charts
    • Applying filters, sorting, and grouping data
  • Intermediate Tableau
    • Advanced chart types: Heat maps, tree maps, box plots, and dual-axis charts
    • Working with calculated fields and parameters
    • Tableau dashboards: Design principles and best practices
    • Interactive features: Tooltips, actions, and highlighters
    • Using sets, groups, and hierarchies for data analysis
  • Advanced Tableau Techniques
    • Advanced calculations: LOD expressions (Level of Detail)
    • Mapping in Tableau: Geo-spatial data visualization and custom territories
    • Integration with R and Python for advanced analytics
    • Implementing table calculations and trend lines
    • Dashboard interactivity: Parameters, actions, and filters
  • Tableau Server and Online
    • Deploying Tableau workbooks on Tableau Server and Tableau Online
    • Managing permissions and access control
    • Scheduling and refreshing extracts
    • Collaboration and sharing: Workbooks, views, and subscriptions
    • Monitoring and performance optimization
  • Advanced Data Visualization Strategies
    • Designing effective dashboards for different audiences
    • Visual analytics: Exploring trends, outliers, and correlations
    • Storytelling with data: Using Tableau stories for impactful presentations
    • Integration with Big Data platforms and real-time data sources
    • Best practices in data visualization and dashboard design
Module 3: SQL for Data Analysis
  • Introduction to SQL
    • Overview of relational databases: Tables, rows, columns, and relationships
    • Introduction to SQL: History, standards, and common database systems
    • Basic SQL commands:
      • SELECT statement: Retrieving data from a single table
      • INSERT statement: Adding new records to a table
      • UPDATE statement: Modifying existing records in a table
      • DELETE statement: Removing records from a table
    • Filtering data: WHERE clause, comparison operators, and logical operators
    • Sorting and limiting results: ORDER BY, LIMIT, OFFSET
  • Advanced SQL Queries
    • Joins and relationships:
      • INNER JOIN, LEFT JOIN, RIGHT JOIN, FULL JOIN
      • Self-joins and cross joins
    • Subqueries and derived tables:
      • Scalar subqueries
      • Correlated subqueries
      • Common table expressions (CTEs)
    • Set operations:
      • UNION, UNION ALL, INTERSECT, EXCEPT
    • Aggregation functions:
      • SUM, AVG, MIN, MAX, COUNT
      • GROUP BY and HAVING clauses
  • Data Manipulation and Transaction Control
    • String and date functions:
      • String manipulation functions (CONCAT, SUBSTRING)
      • Date and time types and functions (DATE, TIME, TIMESTAMP)
    • Updating and deleting data:
      • UPDATE and DELETE statements with conditions
      • Handling NULL values with COALESCE and IS NULL/IS NOT NULL
    • Transactions and concurrency control:
      • ACID properties of transactions
      • COMMIT and ROLLBACK statements
      • Locking mechanisms and isolation levels
  • SQL Optimization and Performance Tuning
    • Indexing and query optimization:
      • Creating and managing indexes
      • Analyzing query execution plans
      • Using hints (INDEX, OPTIMIZE FOR) to improve performance
    • Partitioning strategies:
      • Range, list, and hash partitioning
      • Managing partitioned tables
    • Optimizing joins and subqueries:
      • Rewriting queries for performance
      • Using EXISTS and NOT EXISTS for efficient subquery evaluation
  • Advanced SQL Topics
    • Stored procedures and functions:
      • Creating and managing stored procedures
      • Using parameters and variables in procedures
      • Error handling and transaction control within procedures
    • Triggers and events:
      • Defining triggers for automated actions
      • Trigger types: BEFORE, AFTER, INSTEAD OF
    • Dynamic SQL:
      • Generating and executing SQL statements dynamically
      • Building flexible queries based on runtime conditions
    • Security and permissions:
      • Granting and revoking privileges
      • Managing roles and users
      • Implementing row-level security with views and policies
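To illustrate the join and aggregation topics in this module, here is a minimal sketch using Python's built-in sqlite3 module. The customers and orders tables, their rows, and the top_customers helper are hypothetical and exist only for this example; other database systems may differ in syntax details.

```python
import sqlite3

def top_customers(min_total):
    """Return (name, total) rows for customers whose orders sum to at least min_total."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical sample schema and data, for illustration only.
    conn.executescript("""
        CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
        INSERT INTO customers VALUES (1, 'Asha'), (2, 'Ben'), (3, 'Chen');
        INSERT INTO orders VALUES (1, 1, 50.0), (2, 1, 70.0), (3, 2, 30.0);
    """)
    rows = conn.execute(
        """
        SELECT c.name, SUM(o.amount) AS total
        FROM customers AS c
        INNER JOIN orders AS o ON o.customer_id = c.id  -- customers without orders drop out
        GROUP BY c.name
        HAVING SUM(o.amount) >= ?                       -- filter applied after aggregation
        ORDER BY total DESC
        """,
        (min_total,),
    ).fetchall()
    conn.close()
    return rows

print(top_customers(40))  # [('Asha', 120.0)]
```

WHERE filters rows before grouping, while HAVING filters the aggregated groups, which is why the threshold on the summed amount sits in the HAVING clause.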
Module 4: Power BI
  • Introduction to Power BI
    • Overview of Power BI: Components and capabilities
    • Installing and setting up Power BI Desktop
    • Connecting to various data sources: Excel, databases, web data
    • Understanding Power BI Service and Power BI Mobile
  • Power BI Basics
    • Building your first report: Creating visuals and dashboards
    • Working with Power Query Editor: Data loading, transformation, and cleaning
    • Creating calculated columns and measures using DAX
    • Using Power BI visuals: Bar charts, line charts, maps, and more
  • Advanced Power BI Features
    • Advanced data modeling with relationships: One-to-one, one-to-many, many-to-many
    • Using DAX functions for complex calculations:
      • Aggregation functions: SUM, AVERAGE, COUNT, MIN, MAX
      • Time intelligence functions: TOTALYTD, TOTALQTD, TOTALMTD, DATESYTD, PARALLELPERIOD
      • Filter functions: CALCULATE, ALL, FILTER, RELATEDTABLE
    • Advanced visualization techniques:
      • Custom visuals and marketplace integrations
      • Drill-down and drill-through capabilities
      • Bookmarks, tooltips, and interactive features
  • Power BI Data Analysis and Reporting
    • Advanced data transformation with Power Query Editor:
      • Merging queries, conditional columns, unpivoting data
      • Advanced data cleaning techniques
    • Data analysis with Power BI:
      • Statistical analysis: Regression analysis, clustering, and forecasting
      • Integrating R and Python scripts for advanced analytics
  • Power BI Deployment and Administration
    • Deploying Power BI reports to Power BI Service
    • Managing datasets, workspaces, and permissions
    • Scheduling data refresh and optimizing performance
    • Security considerations in Power BI: Row-level security and encryption
  • Real-World Applications and Best Practices
    • Building end-to-end solutions with Power BI: From data ingestion to visualization
    • Case studies and industry-specific applications
    • Best practices for designing efficient and effective Power BI reports
    • Collaboration and sharing insights with Power BI Service
Module 5: Alteryx for Data Processing
  • Introduction to Alteryx
    • Overview of Alteryx Designer interface
    • Alteryx Designer tools and workflow concepts
    • Connecting to data sources
  • Data Preparation and Blending
    • Input and Output tools
      • Text Input, Excel Input/Output, Database Input/Output
    • Data Cleansing tools
      • Data Cleansing, Select, Filter, Sort
    • Join tools
      • Join, Union, Append, Join Multiple
  • Spatial and Demographic Analysis
    • Spatial tools
      • Trade Area, Distance, Create Points, Spatial Match
    • Demographic tools
      • Demographic Summary, Demographic Imputation, Address/Census Info
  • Parsing and Transforming Data
    • Parsing tools
      • Text to Columns, Data Cleansing, RegEx
    • Transform tools
      • Formula, Filter, Summarize, Multi-Field Formula
  • Advanced Data Blending and Analysis
    • Data Investigation tools
      • Frequency, Histogram, Summarize
    • Data Transformation tools
      • Cross Tab, Transpose, Dynamic Rename, Data Cleansing
  • Reporting and Visualization
    • Reporting tools
      • Table, Layout, Charting
    • Interactive tools
      • Filter, Drop Down, Radio Button, Action
  • Advanced Analytics and Predictive Tools
    • Predictive tools
      • Predictive Grouping, Find Nearest, Predictive Tools
    • Time Series Analysis
      • Time Series Charts, Time Series Formula, Forecast, ARIMA
  • Macros and Batch Processing
    • Macro Basics
      • Macro Input/Output, Interface Tools, Action Tools
    • Batch Processing
      • Batch Macro, Control Parameter
  • Advanced Techniques and Optimization
    • Optimization tools
      • Cache, Sample, Summarize, Unique
  • Integration and Automation
    • API tools
      • Download, Parse, Query
    • Integration tools
      • R/Python, XML/JSON, API Connect
  • Debugging and Error Handling
    • Debugging tools
      • Browse, Comment, Sample, Write Data
    • Error Handling
      • Error Message, Log Message, Test
  • Best Practices and Efficiency Tips
    • Efficiency tools
      • Cache, Sample, Summarize, Unique
    • Best practices
      • Documentation, Annotation, Workflow Overview
Module 6: Python for Data Science
  • Basic Python for Data Science
    • Introduction to Python
      • What is Python?
      • Installing Python and IDEs (Anaconda, Jupyter Notebooks)
      • Python syntax basics: variables, data types, operators
    • Control Flow and Functions
      • Conditional statements: if, elif, else
      • Loops: for loops, while loops
      • Functions: defining functions, arguments, return statements
    • Data Structures
      • Lists, tuples, dictionaries, sets
      • Indexing and slicing
      • List comprehensions
    • NumPy for Numerical Computing
      • Introduction to NumPy arrays
      • Array creation, indexing, slicing
      • Array operations: arithmetic, broadcasting
      • Array methods: reshaping, stacking, splitting
  • Intermediate Python for Data Science
    • Pandas for Data Manipulation
      • Introduction to Pandas DataFrames
      • Data ingestion: reading and writing data
      • Data cleaning and preprocessing
      • Indexing and selecting data
    • Data Visualization with Matplotlib and Seaborn
      • Introduction to Matplotlib: basic plots (line plots, scatter plots, histograms)
      • Customizing plots: labels, titles, colors
      • Introduction to Seaborn: statistical visualization, pair plots
      • Plotting with Pandas and Seaborn
    • Working with APIs and Web Scraping
      • HTTP requests: GET, POST
      • Introduction to JSON and XML
      • Accessing web APIs with Python
      • Web scraping using BeautifulSoup and requests
  • Advanced Python for Data Science
    • Machine Learning with Scikit-Learn
      • Introduction to machine learning concepts
      • Supervised learning: regression, classification
      • Unsupervised learning: clustering, dimensionality reduction
      • Model evaluation and validation
    • Advanced Data Analysis
      • Time series analysis
      • Handling missing data
      • Statistical methods and hypothesis testing
      • Advanced data manipulation with Pandas
    • Integration with Big Data and Cloud Platforms
      • Introduction to Apache Spark with PySpark
      • Connecting Python with cloud platforms (AWS, Google Cloud)
      • Distributed computing and data processing
  • Capstone Project
    • Applying Python and data science skills to a real-world project
    • Data exploration, analysis, and visualization
    • Presenting insights and findings

Machine Learning

Module 1: Fundamentals of Machine Learning
  • Types of ML: Supervised, unsupervised, and reinforcement learning with examples
  • The ML workflow: Problem definition, data collection, preprocessing, modeling, and deployment
  • Training, validation, and test sets: Purpose and best practices for data splitting
  • Overfitting and underfitting: Causes, detection, and mitigation strategies
  • Bias-variance tradeoff: Understanding the fundamental tension in machine learning
  • Feature engineering and selection techniques
  • Introduction to deep learning and neural networks
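The data-splitting topic above can be sketched in a few lines of plain Python. The function name split_dataset, the 70/15/15 proportions, and the fixed seed are assumptions chosen for illustration, not a prescribed standard.

```python
import random

def split_dataset(rows, val_frac=0.15, test_frac=0.15, seed=0):
    """Shuffle rows once, then cut them into train, validation, and test lists."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)  # fixed seed keeps the split reproducible
    n_test = round(len(rows) * test_frac)
    n_val = round(len(rows) * val_frac)
    test = rows[:n_test]
    val = rows[n_test:n_test + n_val]
    train = rows[n_test + n_val:]
    return train, val, test

train, val, test = split_dataset(range(100))
print(len(train), len(val), len(test))  # 70 15 15
```

Each row lands in exactly one of the three sets, so the test set stays unseen during both training and model selection.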
Module 2: Supervised Learning
  • Linear and polynomial regression: Ordinary least squares, regularization (Ridge, Lasso)
  • Logistic regression: Binary and multiclass classification
  • Decision trees and random forests: Entropy, information gain, and ensemble methods
  • Support Vector Machines: Linear and non-linear kernels, margin optimization
  • K-Nearest Neighbors: Distance metrics, choosing k, and the curse of dimensionality
  • Naive Bayes classifiers: Gaussian, Multinomial, and Bernoulli variants
  • Gradient Boosting Machines: XGBoost, LightGBM, and CatBoost
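As a small worked example of ordinary least squares and Ridge regularization from this module, the sketch below fits a line to one feature using the closed-form solution. The fit_line name and the sample data are hypothetical; the ridge penalty is applied to the slope only, with the intercept left unpenalized.

```python
def fit_line(xs, ys, ridge=0.0):
    """Fit y = slope * x + intercept by least squares, with an optional ridge penalty on the slope."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / (sxx + ridge)  # ridge > 0 shrinks the slope toward zero
    intercept = mean_y - slope * mean_x
    return slope, intercept

xs, ys = [1, 2, 3, 4], [3, 5, 7, 9]
print(fit_line(xs, ys))             # (2.0, 1.0)
print(fit_line(xs, ys, ridge=5.0))  # (1.0, 3.5)
```

With ridge=0.0 the fit recovers the exact line y = 2x + 1; raising the penalty trades some fit for a smaller coefficient.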
Module 3: Unsupervised Learning
  • K-means clustering: Algorithm, choosing k, and silhouette analysis
  • Hierarchical clustering: Agglomerative and divisive approaches
  • Principal Component Analysis (PCA): Dimensionality reduction and feature extraction
  • t-SNE: Non-linear dimensionality reduction for data visualization
  • Association rule learning: Apriori algorithm and frequent itemset mining
  • Gaussian Mixture Models and Expectation-Maximization algorithm
  • Anomaly detection techniques: Isolation Forest and One-Class SVM
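The k-means algorithm listed in this module alternates between assigning points to the nearest centroid and recomputing each centroid as the mean of its cluster. A minimal one-dimensional sketch follows; the function name kmeans_1d, the sample points, and the caller-supplied starting centroids are assumptions made to keep the example deterministic.

```python
def kmeans_1d(points, centroids, max_iter=100):
    """Run k-means on 1D points starting from the given initial centroids."""
    centroids = list(centroids)
    for _ in range(max_iter):
        # Assignment step: each point joins the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster (keep it if empty).
        new_centroids = [
            sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
        if new_centroids == centroids:  # converged: no centroid moved
            break
        centroids = new_centroids
    return centroids, clusters

centroids, clusters = kmeans_1d([1.0, 2.0, 3.0, 10.0, 11.0, 12.0], [1.0, 12.0])
print(centroids)  # [2.0, 11.0]
```

In practice the initial centroids are usually chosen randomly or with a seeding heuristic, and the result can depend on that choice.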
Module 4: Neural Networks and Deep Learning
  • Perceptrons and multilayer networks: Activation functions and forward propagation
  • Backpropagation algorithm: Chain rule and gradient descent optimization
  • Convolutional Neural Networks (CNNs): Convolution, pooling, and applications in computer vision
  • Recurrent Neural Networks (RNNs) and LSTMs: Sequential data processing and natural language tasks
  • Transfer learning: Fine-tuning pre-trained models for new tasks
  • Generative Adversarial Networks (GANs): Architecture and applications
  • Attention mechanisms and Transformers: BERT, GPT, and their variants
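Forward propagation and gradient descent can be illustrated with a single sigmoid neuron, the simplest case of the networks covered in this module. The sketch below is an assumption-laden toy: the train_neuron name, the learning rate, the epoch count, and the OR-gate data are all chosen for illustration.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_neuron(samples, lr=0.5, epochs=1000):
    """Train one sigmoid neuron with stochastic gradient descent on cross-entropy loss."""
    w = [0.0, 0.0]
    b = 0.0
    for _ in range(epochs):
        for (x1, x2), y in samples:
            a = sigmoid(w[0] * x1 + w[1] * x2 + b)  # forward propagation
            dz = a - y  # chain rule: gradient of cross-entropy loss w.r.t. the pre-activation
            w[0] -= lr * dz * x1
            w[1] -= lr * dz * x2
            b -= lr * dz
    return w, b

def predict(w, b, x1, x2):
    return 1 if sigmoid(w[0] * x1 + w[1] * x2 + b) >= 0.5 else 0

# Hypothetical training data: the logical OR function.
or_gate = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]
w, b = train_neuron(or_gate)
print([predict(w, b, x1, x2) for (x1, x2), _ in or_gate])  # [0, 1, 1, 1]
```

Multilayer networks apply the same chain-rule step layer by layer, which is what the backpropagation algorithm organizes.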
Module 5: Model Evaluation and Optimization
  • Evaluation metrics for classification: Accuracy, precision, recall, F1-score, ROC-AUC
  • Evaluation metrics for regression: MSE, RMSE, MAE, R-squared
  • Cross-validation techniques: k-fold, stratified k-fold, and leave-one-out
  • Hyperparameter tuning: Grid search, random search, and Bayesian optimization
  • Ensemble methods: Bagging, boosting, and stacking
  • Feature importance and selection: Filter, wrapper, and embedded methods
  • Model interpretation techniques: SHAP values and LIME
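The classification metrics named in this module follow directly from counts of true positives, false positives, and false negatives. The sketch below computes them in plain Python; the classification_metrics name and the sample labels are hypothetical, and undefined ratios are reported as 0.0 by assumption.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall, and F1 for a binary classification task."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0  # of predicted positives, how many are right
    recall = tp / (tp + fn) if tp + fn else 0.0     # of actual positives, how many were found
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}

y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
print(classification_metrics(y_true, y_pred))
```

Here there are 3 true positives, 1 false positive, and 1 false negative, so precision, recall, and F1 all come out to 0.75.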

Artificial Intelligence

Module 1: Introduction to AI
  • History and evolution of AI: From early rule-based systems to modern deep learning
  • Types of AI: Narrow (weak) AI vs general (strong) AI, and superintelligence
  • AI paradigms: Symbolic AI (rule-based systems) vs machine learning (data-driven approaches)
  • Turing test and Chinese room argument: Philosophical debates on machine intelligence
  • Ethics and societal impact of AI: Bias, privacy, job displacement, and existential risks
  • AI in industry: Current applications and future trends
  • Challenges in AI development: Explainability, robustness, and alignment with human values
Module 2: Search and Problem Solving
  • Problem formulation and state space: Defining goals, actions, and transition models
  • Uninformed search strategies: Breadth-first search (BFS), depth-first search (DFS), and uniform cost search
  • Informed search: A* algorithm, admissible heuristics, and optimality
  • Constraint satisfaction problems: Backtracking, forward checking, and arc consistency
  • Game playing and minimax algorithm: Alpha-beta pruning and expectimax for probabilistic scenarios
  • Local search algorithms: Hill climbing, simulated annealing, and genetic algorithms
  • Planning under uncertainty: Markov decision processes and reinforcement learning
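As an illustration of informed search, the sketch below runs A* on a small grid with the Manhattan distance as heuristic, which is admissible for unit-cost moves in four directions. The a_star name, the grid encoding ('#' for walls), and the sample map are assumptions for this example.

```python
import heapq

def a_star(grid, start, goal):
    """Return the length of the shortest 4-connected path from start to goal, or None."""
    def h(cell):
        # Manhattan distance never overestimates the remaining cost on this grid.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    frontier = [(h(start), 0, start)]  # entries are (f = g + h, g, cell)
    best_cost = {start: 0}
    while frontier:
        _, cost, cell = heapq.heappop(frontier)
        if cell == goal:
            return cost
        if cost > best_cost[cell]:
            continue  # stale queue entry; a cheaper route to this cell was already found
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < len(grid) and 0 <= nc < len(grid[0]) and grid[nr][nc] != "#":
                new_cost = cost + 1
                if new_cost < best_cost.get((nr, nc), float("inf")):
                    best_cost[(nr, nc)] = new_cost
                    heapq.heappush(frontier, (new_cost + h((nr, nc)), new_cost, (nr, nc)))
    return None

grid = ["..#.",
        "..#.",
        "...."]
print(a_star(grid, (0, 0), (0, 3)))  # 7
```

Setting the heuristic to zero turns the same loop into uniform cost search, which explores more cells but finds the same optimal cost.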
Module 3: Knowledge Representation and Reasoning
  • Propositional logic: Syntax, semantics, and inference rules
  • First-order logic: Quantifiers, predicates, and unification
  • Inference and resolution: Forward chaining, backward chaining, and resolution refutation
  • Probabilistic reasoning: Bayesian networks, inference in graphical models
  • Fuzzy logic: Membership functions, fuzzy set operations, and fuzzy inference systems
  • Ontologies and semantic web: RDF, OWL, and knowledge graphs
  • Reasoning under uncertainty: Dempster-Shafer theory and possibility theory
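Forward chaining, listed above under inference, repeatedly fires rules whose premises are already known until nothing new can be derived. A minimal sketch over propositional Horn rules follows; the forward_chain name and the weather rules are hypothetical.

```python
def forward_chain(facts, rules):
    """Derive every fact reachable from the initial facts using (premises, conclusion) rules."""
    known = set(facts)
    changed = True
    while changed:  # stop once a full pass adds nothing new
        changed = False
        for premises, conclusion in rules:
            if conclusion not in known and all(p in known for p in premises):
                known.add(conclusion)
                changed = True
    return known

# Hypothetical rule base for illustration.
rules = [
    (("rain",), "wet_ground"),
    (("wet_ground", "cold"), "ice"),
    (("ice",), "slippery"),
]
print(sorted(forward_chain({"rain", "cold"}, rules)))
# ['cold', 'ice', 'rain', 'slippery', 'wet_ground']
```

Backward chaining works in the opposite direction, starting from a query and searching for rules that could establish it.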
Module 4: Natural Language Processing
  • Text preprocessing and tokenization: Stemming, lemmatization, and stop word removal
  • Part-of-speech tagging and named entity recognition: HMMs and CRFs
  • Syntax and parsing: Context-free grammars, dependency parsing, and constituency parsing
  • Sentiment analysis: Lexicon-based and machine learning approaches
  • Machine translation: Statistical and neural machine translation models
  • Text generation: Language models, sequence-to-sequence models, and transformer architectures
  • Question answering and dialogue systems: Information retrieval and conversational AI
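The text preprocessing steps at the start of this module can be sketched with the standard library alone. The stop-word list below is a tiny hypothetical sample, and the regular-expression tokenizer is a simplification; stemming and lemmatization are left out.

```python
import re
from collections import Counter

# Hypothetical, deliberately small stop-word list for illustration.
STOP_WORDS = {"the", "a", "an", "is", "and", "of", "to", "in"}

def preprocess(text):
    """Lowercase the text, split it into word tokens, and drop stop words."""
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]

tokens = preprocess("The cat sat in the hat, and the cat is happy.")
print(tokens)                          # ['cat', 'sat', 'hat', 'cat', 'happy']
print(Counter(tokens).most_common(1))  # [('cat', 2)]
```

Token counts like these are the starting point for lexicon-based sentiment analysis and bag-of-words features.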
Module 5: Computer Vision
  • Image processing fundamentals: Filters, edge detection, and morphological operations
  • Feature detection and matching: SIFT, SURF, and ORB algorithms
  • Object detection and recognition: R-CNN family, YOLO, and SSD
  • Image segmentation: Thresholding, region-based, and semantic segmentation
  • Face recognition: Eigenfaces, local binary patterns, and deep learning approaches
  • 3D computer vision: Stereo vision, structure from motion, and SLAM
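Edge detection, mentioned under image processing fundamentals, can be illustrated with the Sobel operator applied to a grayscale image stored as a list of lists. The sobel_magnitude name and the 4x4 sample image are assumptions; border pixels are simply left at zero in this sketch.

```python
def sobel_magnitude(image):
    """Return the Sobel gradient magnitude for interior pixels of a 2D grayscale image."""
    kx = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]  # responds to horizontal intensity changes
    ky = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]  # responds to vertical intensity changes
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for r in range(1, h - 1):
        for c in range(1, w - 1):
            gx = sum(kx[i][j] * image[r + i - 1][c + j - 1] for i in range(3) for j in range(3))
            gy = sum(ky[i][j] * image[r + i - 1][c + j - 1] for i in range(3) for j in range(3))
            out[r][c] = (gx ** 2 + gy ** 2) ** 0.5
    return out

# Hypothetical image: dark left half, bright right half, so there is one vertical edge.
image = [[0, 0, 10, 10] for _ in range(4)]
edges = sobel_magnitude(image)
print(edges[1][1], edges[1][2])  # 40.0 40.0
```

A uniform image produces zero everywhere, since both kernels sum to zero; strong responses mark places where intensity changes sharply.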