Apache Beam is an open-source, unified programming model for data processing, first released in 2016. The name "Beam" combines "Batch" and "Stream", reflecting its support for both batch and streaming parallel data-processing pipelines. To execute pipelines, Beam supports numerous distributed processing back-ends, including Apache Flink, Apache Spark, Apache Samza, Hazelcast Jet, and Google Cloud Dataflow. You build a program that defines the data pipeline using one of the open-source Beam SDKs (Software Development Kits), available in three programming languages: Java, Python, and Go.
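
A minimal sketch of what a Beam pipeline looks like with the Python SDK: a small word count that runs locally on the default DirectRunner. The input values here are made up for illustration; the same pipeline code can be submitted to other runners by changing the pipeline options.

```python
import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions

# A tiny word-count pipeline; by default it executes on the local DirectRunner.
with beam.Pipeline(options=PipelineOptions()) as pipeline:
    (
        pipeline
        | "Create words" >> beam.Create(["batch", "stream", "beam", "batch"])
        | "Pair with 1" >> beam.Map(lambda word: (word, 1))
        | "Count per word" >> beam.CombinePerKey(sum)
        | "Print results" >> beam.Map(print)
    )
```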

Apache Beam has certain features that give the user an advantage, the primary one being a unified batch and streaming API with a high level of abstraction and portability across runners. The main drawbacks are reduced transparency and control, limited room for performance tuning compared to lower-level Apache APIs, and a number of open bugs. A sketch of how that runner portability looks in practice follows below.
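
To illustrate the portability claim, here is a hedged sketch of how the same pipeline code can target a different back-end purely through pipeline options. The project, region, and bucket names below are hypothetical placeholders, not values from this article.

```python
from apache_beam.options.pipeline_options import PipelineOptions

# Switching runners is a configuration change, not a code change.
# Replace the placeholder values with your own project and storage settings.
options = PipelineOptions(
    runner="DataflowRunner",             # or "FlinkRunner", "SparkRunner", "DirectRunner"
    project="my-gcp-project",            # hypothetical project id
    region="us-central1",                # hypothetical region
    temp_location="gs://my-bucket/tmp",  # hypothetical staging bucket
)
```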

You can contribute to Apache Beam open-source big data project here: https://github.com/apache/beam
