The open-source project that spawned generations of big-data technologies and underpins Hive, Pig, and the MapReduce programming model, Apache Hadoop remains today's choice for workloads that demand virtually unlimited scalability, high reliability, and support for a wide range of workload types. These characteristics make Hadoop particularly well suited to batch ETL jobs over large data sets, complex processing workflows, and data structures that exceed the in-memory limits of other engines.
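The batch-processing model Hadoop popularized follows a map-shuffle-reduce pattern. A minimal sketch of that pattern, written as plain Python rather than the Hadoop API (all function names here are illustrative), applied to the classic word-count job:

```python
from collections import defaultdict

def map_phase(lines):
    """Map: emit a (word, 1) pair for every word in the input."""
    for line in lines:
        for word in line.split():
            yield word.lower(), 1

def shuffle(pairs):
    """Shuffle: group values by key, as Hadoop does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    """Reduce: aggregate each word's counts into a total."""
    return {word: sum(counts) for word, counts in groups.items()}

lines = ["the quick brown fox", "the lazy dog"]
counts = reduce_phase(shuffle(map_phase(lines)))
print(counts["the"])  # 2
```

In a real Hadoop job the map and reduce phases run as distributed tasks over HDFS blocks and the shuffle moves data across the cluster, but the logical flow is the same.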