Big Data Analytics (21CSH-471)

The Iterative Nature of Data Science Projects

  1. Introduction to Data Science Projects:
    1. Stages and Lifecycle
    2. Iterative process in Data Science:
      1. Problem Definition
      2. Data collection and exploration
      3. Model development and evaluation
      4. Refinement and deployment
    3. Importance of Iteration:
      1. Continuous improvement and error correction
    4. Tools supporting Iteration:
      1. Notebooks
      2. Version Control
      3. CI/CD

Notebooks in Data Science

  1. Introduction to Data Science Notebooks:
    1. Characteristics:
      1. Interactive
      2. Reproducible
      3. Modular workflow
    2. Key benefits:
      1. Visualization
      2. Documentation
      3. Collaboration
  2. Programming Languages for Data Science:
    1. Python – Libraries like pandas, NumPy and Matplotlib
    2. R – Strengths in statistical analysis and visualization
  3. Mechanisms and Tools in Notebooks:
    1. Code cells
    2. Markdown
    3. Widgets
    4. Extensions
    5. Integration with Git and other data tools

Notebooks and Data Science tools in Big Data

  1. Major Data Science Notebooks:
    1. Jupyter Notebook
    2. Google Colab
    3. Zeppelin
  2. Comparing features:
    1. Offline vs. cloud
    2. Extensions
    3. Performance
  3. Getting started with Jupyter Notebook:
    1. Installation
    2. Environment setup
    3. Basic usage
    4. Working with Python and R in Jupyter
  4. Introduction to Tableau:
    1. Key features and use cases
    2. Data connection
    3. Visualization building
    4. Dashboard creation
  5. Collaboration and Presentation tools for Data Insights

Data Visualization (21CSH-461)

Chapter 1: Programming and Tools for Statistical Data Visualization

  1. Java language for statistical data visualization
  2. Web-based statistical graphics using XML technologies
  3. Google Maps API for geographical data visualization
  4. Google Chart for creating interactive charts and graphs
  5. Tableau for advanced visualizations and heat map generation

Chapter 2: Rank Analysis and Trend Analysis Tools