Taught by a team which includes 2 Stanford-educated ex-Googlers. This team has decades of practical experience in working with Java and with billions of rows of data.

Use Flume and Sqoop to import data to HDFS, HBase and Hive from a variety of sources, including Twitter and MySQL.

Let's parse that.

Import data: Flume and Sqoop play a special role in the Hadoop ecosystem. They transport data from sources like local file systems, HTTP, MySQL and Twitter, which hold or produce data, to data stores like HDFS, HBase and Hive. Both tools come with built-in functionality and shield users from the complexity of transporting data between these systems.

Flume: Flume Agents can transport data produced by a streaming application to data stores like HDFS and HBase.

Sqoop: Use Sqoop to bulk import data from a traditional RDBMS to Hadoop storage architectures like HDFS or Hive.

What's Covered: Practical implementations for a variety of sources and data stores.

Sources: Twitter, MySQL, Spooling Directory, HTTP
Sinks: HDFS, HBase, Hive
Flume features: Flume Agents, Flume Events, Event bucketing, Channel selectors, Interceptors
Sqoop features: Sqoop import from MySQL, Incremental imports using Sqoop Jobs

Flume and Sqoop for Ingesting Big Data
