Process Unbounded and Bounded Data

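  1. Unbounded streams have a start but no defined end. They do not terminate and provide data as it is generated. Unbounded streams must be processed continuously, i.e., events must be handled promptly after they have been ingested, because it is not possible to wait for all input data to arrive.
  2. Bounded streams have a defined start and end. Because the input is finite, all of its data can be ingested before any computation is performed, which is also known as batch processing.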

https://flink.apache.org/img/bounded-unbounded.png
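
The difference between the two kinds of streams can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not Flink code: `sensor_readings`, `bounded_median`, and `running_totals` are hypothetical names invented for this example. The bounded function needs the complete data set before it can answer, while the unbounded one emits an updated result after every event.

```python
import itertools
import statistics
from typing import Iterable, Iterator, List


def bounded_median(events: List[int]) -> int:
    """Batch style: a median needs the complete, finite data set."""
    return statistics.median(events)


def running_totals(events: Iterable[int]) -> Iterator[int]:
    """Streaming style: emit an updated result after every event."""
    total = 0
    for value in events:
        total += value
        yield total


def sensor_readings() -> Iterator[int]:
    """Hypothetical unbounded source: it never terminates on its own."""
    for n in itertools.count(start=1):
        yield n


# Bounded input: ingest everything, then compute once.
batch_result = bounded_median([3, 1, 2])

# Unbounded input: results arrive continuously; we stop observing after five.
first_five = list(itertools.islice(running_totals(sensor_readings()), 5))
```

Note that `running_totals` never sees the end of its input; the caller decides how long to keep consuming results.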

Use Cases

Deploy Applications Anywhere

Apache Flink is a distributed system and requires compute resources in order to execute applications. Flink integrates with all common cluster resource managers, such as Hadoop YARN, Apache Mesos, and Kubernetes, but can also be set up to run as a stand-alone cluster.

Flink is designed to work well with each of the previously listed resource managers. This is achieved by resource-manager-specific deployment modes that allow Flink to interact with each resource manager in its idiomatic way.

When deploying a Flink application, Flink automatically identifies the required resources based on the application’s configured parallelism and requests them from the resource manager. In case of a failure, Flink replaces the failed container by requesting new resources. All communication to submit or control an application happens via REST calls. This eases the integration of Flink into many environments.
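
The request-and-replace cycle described above can be modeled with a toy sketch. Everything here is an assumption made for illustration: `ToyResourceManager`, `containers_needed`, `deploy`, and `handle_failure` are hypothetical names, not Flink or resource-manager APIs, and the sizing rule (one slot per parallel task, a fixed number of slots per container) is a deliberate simplification of how real deployments are sized.

```python
import math
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class ToyResourceManager:
    """Hypothetical stand-in for YARN or Kubernetes; not a real API."""
    slots_per_container: int
    containers: Set[int] = field(default_factory=set)
    _next_id: int = 0

    def request_container(self) -> int:
        """Allocate a new container and return its id."""
        self._next_id += 1
        self.containers.add(self._next_id)
        return self._next_id


def containers_needed(parallelism: int, slots_per_container: int) -> int:
    """Simplified sizing rule: one slot per parallel task instance."""
    return math.ceil(parallelism / slots_per_container)


def deploy(rm: ToyResourceManager, parallelism: int) -> List[int]:
    """Derive the requirement from the parallelism and request containers."""
    needed = containers_needed(parallelism, rm.slots_per_container)
    return [rm.request_container() for _ in range(needed)]


def handle_failure(rm: ToyResourceManager, failed_id: int) -> int:
    """Drop the failed container and request a replacement."""
    rm.containers.discard(failed_id)
    return rm.request_container()


rm = ToyResourceManager(slots_per_container=4)
allocated = deploy(rm, parallelism=10)   # 10 tasks at 4 slots each
replacement = handle_failure(rm, failed_id=2)
```

With a parallelism of 10 and four slots per container, the sketch requests three containers; after container 2 fails, a fourth container is requested to take its place.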

Scalable Applications Execution

Leverage In-Memory Performance

Stateful Flink applications are optimized for local state access. Task state is always maintained in memory or, if the state size exceeds the available memory, in access-efficient on-disk data structures. Hence, tasks perform all computations by accessing local, often in-memory, state, yielding very low processing latencies. Flink guarantees exactly-once state consistency in case of failures by periodically and asynchronously checkpointing the local state to durable storage.

https://flink.apache.org/img/local-state.png
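
To make the idea of local state plus periodic checkpoints concrete, here is a minimal sketch in plain Python. It is not Flink's implementation: `CountingTask` is a hypothetical class, the "durable storage" is just a dictionary, and the snapshot is taken synchronously, whereas Flink checkpoints asynchronously. What it does show is why restoring a snapshot together with the input position yields exactly-once state: updates made after the last checkpoint are discarded and the corresponding events are replayed.

```python
from collections import defaultdict
from typing import List, Optional


class CountingTask:
    """Toy task that counts events per key in local, in-memory state."""

    def __init__(self, checkpoint_interval: int) -> None:
        self.state = defaultdict(int)   # local state, accessed on every event
        self.offset = 0                 # position in the input
        self.checkpoint_interval = checkpoint_interval
        # Stand-in for durable storage (an assumption of this sketch).
        self.durable = {"state": {}, "offset": 0}

    def checkpoint(self) -> None:
        """Snapshot the state and the input position together."""
        self.durable = {"state": dict(self.state), "offset": self.offset}

    def restore(self) -> None:
        """Roll back to the last completed checkpoint."""
        self.state = defaultdict(int, self.durable["state"])
        self.offset = self.durable["offset"]

    def run(self, events: List[str], fail_at: Optional[int] = None) -> None:
        while self.offset < len(events):
            if fail_at is not None and self.offset == fail_at:
                raise RuntimeError("simulated task failure")
            self.state[events[self.offset]] += 1
            self.offset += 1
            if self.offset % self.checkpoint_interval == 0:
                self.checkpoint()


events = ["a", "b", "a", "c", "a", "b"]
task = CountingTask(checkpoint_interval=2)
try:
    task.run(events, fail_at=3)   # fails after an un-checkpointed update
except RuntimeError:
    task.restore()                # back to the snapshot taken at offset 2
task.run(events)                  # replay from the checkpointed offset
final_counts = dict(task.state)
```

Although the third event was processed once before the failure and once again after recovery, it contributes to the final state only once, because the first update was rolled back with the restore.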

Architecture