This course can be tailored to your needs for private, onsite delivery at your location.
IIBA (CDU)
ASPE is an IIBA Endorsed Education Provider of business analysis training. Select Project Delivery courses offer IIBA continuing development units (CDU) in accordance with IIBA standards.
NASBA (CPE)
NASBA continuing professional education credits (CPE) assist Certified Public Accountants in reaching their continuing education requirements.
PMI (PDU)
Select courses offer Leadership (PDU-L), Strategic (PDU-S) and Technical PMI professional development units that vary according to certification. Technical PDUs are available in the following types: ACP, PBA, PfMP, PMP/PgMP, RMP, and SP.
This course is a survey of big data – the landscape, the technology behind it, business drivers and strategic possibilities. “Big data” is a hot buzzword, but most organizations are struggling to put it to practical use. Without assuming any prior knowledge of Apache Hadoop or big data management, this course teaches professionals in a wide range of roles how to tap and manage the potential benefits of big data, including:
- Discovering customer insights buried in your existing data
- Uncovering product opportunities from data insights
- Pinpointing decision points and criteria
- Scaling your existing workflows and operations
- Learning to ask questions that drive tangible business value from big data tools
Course Outline
- Introduction to Big Data
- Academic
- Early web
- Web scale
- 1994 – 2012
- 2016
- 2020
- Sources (Examples)
- Internet
- Transport systems
- Medical, healthcare
- Insurance
- Military and others
- Hadoop – the free platform for working with big data
- History
- Yahoo
- Platform fragmentation
- What usage looks like in the enterprise
- The concepts
- Load data how you find it
- Process it when you can
- Project it into various schemas on the fly
- Push it back to where you need it
- The basics
- What it’s good for
- What it can’t do / disadvantages
- Most common use cases for big data
- Introduction to HDFS
- Robustness
- Data Replication
- Gotchas
- MapReduce – the core big data function
- Map explained
- Sort and shuffle explained
- Reduce explained
Demonstration: Hadoop, HDFS, and MapReduce - Let’s try it!
- YARN
- How it fits
- How it works
- Resource Manager
- Application Master
- PIG
- What it is
- How it works
- Compatibilities
- Advantages
- Disadvantages
Demonstration: YARN and PIG - Let’s try it!
- Processing Data
- The Piggy Bank
- Loading and Illustrating the data
- Writing a Query
- Storing the Result
- HIVE
- Data warehousing
- What it is, what it’s not
- Language compatibilities
- Advantages
Demonstration: HIVE - Let’s try it!
Example demo walkthrough: Contextual advertising
- OOZIE
- What it is
- Complex workflow environments
- Reducing time-to-market
- Frequency execution
- How it works with other big data tools
Example demo walkthrough: How to run a job
- FLUME – stream, collect, store and analyze high-volume log data
- How it works: Event, source, sink, channel, agent and client
- How it works illustrated
- How it works demonstrated
- SPARK
- Move over, 2012 big data tools: Apache Spark is the new power tool
- The new open source cluster framework
- When Spark performs up to 100 times faster
- Performance comparison of Spark and Hadoop
- What else can it do?
- HBASE
- What it is
- Common use cases
- Using External Tools
Who should attend
This class is for anyone involved in project, product, or IT work who is actively consuming or considering big data services. No specific technical experience or prerequisites are needed.
• Software Engineers and Team Leads
• Project Managers
• Business Analysts
• DBAs and Data Engineering teams
• Business Customers
• System Analysts
Prerequisites
No specific technical experience or prerequisites are needed.