EMC Data Science and Big Data Analytics
- Course Code DSBDA
- Duration 5 days
Additional Payment Options
- GTC 66 inc. VAT
GTC stands for Global Knowledge Training Credit. Please contact Global Knowledge for more details.
Course Delivery
This course is available in the following formats:
- Public Classroom: Traditional Classroom Learning
- Virtual Learning: Learning that is virtual
Request this course in a different delivery format.
Course Overview
Course Schedule
Target Audience
Course Objectives
- Immediately participate and contribute as a Data Science Team Member on big data and other analytics projects by:
  - Deploying the Data Analytics Lifecycle to address big data analytics projects
  - Reframing a business challenge as an analytics challenge
  - Applying appropriate analytic techniques and tools to analyze big data, create statistical models, and identify insights that can lead to actionable results
  - Selecting appropriate data visualizations to clearly communicate analytic insights to business sponsors and analytic audiences
  - Using tools such as R and RStudio, MapReduce/Hadoop, in-database analytics, and window and MADlib functions
- Explain how advanced analytics can be leveraged to create competitive advantage and how the data scientist role and skills differ from those of a traditional business intelligence analyst
Course Content
- Introduction and Course Agenda
- Introduction to Big Data Analytics
  - Big Data Overview
  - State of the Practice in Analytics
  - The Data Scientist
  - Big Data Analytics in Industry Verticals
- Data Analytics Lifecycle
  - Discovery
  - Data Preparation
  - Model Planning
  - Model Building
  - Communicating Results
  - Operationalizing
- Review of Basic Data Analytic Methods Using R
  - Using R to Look at Data – Introduction to R
  - Analyzing and Exploring the Data
  - Statistics for Model Building and Evaluation
- Advanced Analytics – Theory and Methods
  - K-means Clustering
  - Association Rules
  - Linear Regression
  - Logistic Regression
  - Naïve Bayesian Classifier
  - Decision Trees
  - Time Series Analysis
  - Text Analysis
- Advanced Analytics – Technologies and Tools
  - Analytics for Unstructured Data – MapReduce and Hadoop
  - The Hadoop Ecosystem
  - In-database Analytics – SQL Essentials
  - Advanced SQL and MADlib for In-database Analytics
- The Endgame, or Putting it All Together
  - Operationalizing an Analytics Project
  - Creating the Final Deliverables
  - Data Visualization Techniques
  - Final Lab Exercise on Big Data Analytics
Course Prerequisites
To complete this course successfully and gain the maximum benefits from it, a student should have the following knowledge and skill sets:
- A strong quantitative background with a solid understanding of basic statistics, as would be found in a Statistics 101-level course
- Experience with a scripting language, such as Java, Perl, or Python (or R). Many of the lab examples taught in the course use R (with an RStudio GUI), which is an open source statistical tool and programming language
- Experience with SQL (some course examples use
Consider the above a list of specific prerequisite (or refresher) training and reading to be completed before enrolling in or attending this course. Having this background will help ensure a positive experience in the class and enable students to build on their expertise to learn many of the more advanced tools and analytical methods taught in the course.