Big Data Analytics (21CSH-471)

Understanding Big Data and the 5 V’s

  1. Introduction to Big Data – Definition and Characteristics
  2. The 5 V’s of Big Data:
    1. Volume: Data at scale
    2. Velocity: Real-time data processing
    3. Variety: Structured, semi-structured, and unstructured data
    4. Veracity: Uncertainty and trustworthiness in data
    5. Value: Transforming data into insights
  3. Challenges and Opportunities in Big Data
  4. Big Data Use Cases in Real-World Applications

Big Data Architecture

  1. Fundamentals of Big Data Architecture:
    1. Data ingestion
    2. Storage
    3. Processing and visualization layers
  2. Hadoop Ecosystem in Big Data Architecture:
    1. Tools such as HDFS, YARN, Hive, and Sqoop
  3. Streaming Data in Big Data:
    1. Tools such as Apache Kafka and Flink
  4. Real-World Big Data Architecture:
    1. Lambda and Kappa Architectures
    2. Hybrid Architecture for batch and real-time processing

The Hadoop Ecosystem

  1. Introduction to the Hadoop Ecosystem
  2. HDFS (Hadoop Distributed File System):
    1. Architecture and Functionality
  3. MapReduce Programming Model:
    1. Workflow and Applications
  4. YARN (Yet Another Resource Negotiator):
    1. Resource Management
  5. Tools in the Ecosystem:
    1. Pig, HBase, Flume, and Oozie
  6. Data Processing with Hadoop:
    1. ETL, Analytics, and Reporting

Data Visualization (21CSH-461)

Chapter 1: Data Handling and Introduction to Visualization

  1. Data extraction, cleaning, and annotation