Apache Spark

- August 11, 2018

Apache Spark is a fast, in-memory data processing engine with elegant and expressive development APIs that let data workers efficiently execute streaming, machine learning, or SQL workloads requiring fast iterative access to datasets. Originally developed at the University of California, Berkeley's AMPLab, the Spark codebase was later donated to the Apache Software Foundation, which has maintained it since. With Spark running on Apache Hadoop YARN, developers can build applications that exploit Spark's power, derive insights, and enrich their data science workloads against a single, shared dataset in Hadoop.