When adding a new broker to an existing Kafka cluster, what process ensures that the partitions are evenly distributed across all available brokers?
Load Balancing
Data Replication
Rebalancing
Broker Synchronization
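Adding a broker does not redistribute existing partitions by itself; an operator (or a tool such as Cruise Control) must trigger a partition reassignment. Below is a minimal sketch of that step using the Java AdminClient, the programmatic counterpart of kafka-reassign-partitions.sh; the bootstrap address, topic name, and broker IDs are hypothetical:

```java
import java.util.*;
import org.apache.kafka.clients.admin.*;
import org.apache.kafka.common.TopicPartition;

public class ReassignOntoNewBroker {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // hypothetical address
        try (Admin admin = Admin.create(props)) {
            // Move one replica of partition 0 of "orders" onto the newly
            // added broker (id 3), keeping broker 1; IDs are illustrative.
            Map<TopicPartition, Optional<NewPartitionReassignment>> moves = Map.of(
                new TopicPartition("orders", 0),
                Optional.of(new NewPartitionReassignment(List.of(1, 3))));
            admin.alterPartitionReassignments(moves).all().get();
        }
    }
}
```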
In Kafka, which broker configuration setting controls how long committed consumer offsets are retained?
group.max.session.timeout.ms
message.timeout.ms
replica.lag.time.max.ms
offsets.retention.minutes
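This setting governs committed consumer offsets, not the messages themselves. Here is a sketch that reads its current value from a broker via the AdminClient; the bootstrap address and broker id are assumptions:

```java
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.admin.*;
import org.apache.kafka.common.config.ConfigResource;

public class OffsetRetentionCheck {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // hypothetical address
        try (Admin admin = Admin.create(props)) {
            ConfigResource broker = new ConfigResource(ConfigResource.Type.BROKER, "0"); // broker id assumed
            Config config = admin.describeConfigs(List.of(broker)).all().get().get(broker);
            // offsets.retention.minutes: how long committed offsets are kept
            // once a consumer group becomes empty.
            System.out.println(config.get("offsets.retention.minutes").value());
        }
    }
}
```

In recent Kafka versions the default is 10080 minutes (7 days); after a group has been empty that long, its committed offsets are deleted.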
What is the significance of the 'unclean.leader.election.enable' configuration parameter during broker failures?
It defines the time a broker is considered dead before triggering a leader election.
It controls the replication factor for partitions.
It ensures no data loss during leader election but might increase unavailability.
It keeps partitions available by allowing an out-of-sync replica to become leader, but might lead to data loss.
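The parameter trades durability against availability: set to false, only in-sync replicas may be elected leader, so the partition stays offline rather than losing records. A sketch that pins the safe setting at topic creation time; the topic name, partition count, and replication factor are illustrative:

```java
import java.util.List;
import java.util.Map;
import java.util.Properties;
import org.apache.kafka.clients.admin.*;

public class SafeTopic {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // hypothetical address
        try (Admin admin = Admin.create(props)) {
            // Favor durability: only in-sync replicas may become leader.
            NewTopic topic = new NewTopic("payments", 3, (short) 3) // name/counts illustrative
                .configs(Map.of("unclean.leader.election.enable", "false"));
            admin.createTopics(List.of(topic)).all().get();
        }
    }
}
```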
What is the purpose of consumer groups in Kafka?
To guarantee message ordering for all consumers.
To store messages persistently on disk.
To replicate messages across multiple data centers.
To allow multiple consumers to subscribe to the same topic and each process a subset of messages.
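Every consumer instance started with the same group.id joins one group, and Kafka assigns each instance a disjoint subset of the topic's partitions. A minimal sketch of one group member; the address, group id, and topic name are placeholders:

```java
import java.time.Duration;
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.consumer.*;
import org.apache.kafka.common.serialization.StringDeserializer;

public class GroupMember {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // hypothetical address
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "order-processors"); // instances sharing this id split the partitions
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(List.of("orders")); // topic name illustrative
            while (true) {
                for (ConsumerRecord<String, String> rec : consumer.poll(Duration.ofSeconds(1))) {
                    System.out.printf("partition=%d offset=%d value=%s%n",
                        rec.partition(), rec.offset(), rec.value());
                }
            }
        }
    }
}
```

Starting a second copy of this program causes Kafka to rebalance the partitions between the two instances automatically.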
Which tool is used to migrate partition replicas off a broker so that it can be safely removed from a Kafka cluster?
kafka-reassign-partitions.sh
kafka-preferred-replica-election.sh
kafka-configs.sh
kafka-topics.sh
Which Kafka Streams API performs the aggregation step in a windowed aggregation?
transform()
groupBy()
reduce()
filter()
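In the Streams DSL a windowed aggregation chains three calls: group the stream, define the window, then apply an aggregation operator such as reduce(). A sketch that sums values per key in five-minute windows; the topic name, the window size, and the use of TimeWindows.ofSizeWithNoGrace (Kafka 3.x) are choices of this example:

```java
import java.time.Duration;
import java.util.Properties;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.*;
import org.apache.kafka.streams.kstream.*;

public class WindowedSum {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "windowed-sum-demo");  // id illustrative
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // hypothetical address
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.StringSerde.class);
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.LongSerde.class);

        StreamsBuilder builder = new StreamsBuilder();
        builder.<String, Long>stream("clicks") // input topic illustrative
            .groupByKey()
            // Bucket each key's records into 5-minute windows, then aggregate:
            .windowedBy(TimeWindows.ofSizeWithNoGrace(Duration.ofMinutes(5)))
            .reduce(Long::sum)
            .toStream()
            .foreach((win, total) -> System.out.println(
                win.key() + " @ " + win.window().startTime() + " -> " + total));

        new KafkaStreams(builder.build(), props).start();
    }
}
```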
In Kafka Streams, what is the primary difference between stateful and stateless processing?
Stateful processing is used for filtering data, while stateless processing is used for transformations.
Stateless processing is faster than stateful processing because it does not require data storage.
Stateful processing is more scalable than stateless processing.
Stateful processing maintains and consults state built from previous records, while stateless processing considers only the current record.
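The contrast is easiest to see in code: filter() inspects each record in isolation, while count() must keep a running total per key in a local state store. A sketch with both on the same stream; the topic names are placeholders, and for brevity the topology is built but never handed to KafkaStreams:

```java
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.KTable;
import org.apache.kafka.streams.kstream.Produced;

public class StatefulVsStateless {
    public static void main(String[] args) {
        StreamsBuilder builder = new StreamsBuilder();
        KStream<String, String> events = builder.stream("events"); // placeholder topic

        // Stateless: the decision uses only the current record.
        events.filter((key, value) -> value.contains("ERROR"))
              .to("error-events"); // placeholder topic

        // Stateful: the running count per key lives in a local state store
        // (backed by a changelog topic), so it depends on earlier records.
        KTable<String, Long> counts = events.groupByKey().count();
        counts.toStream()
              .to("event-counts", Produced.with(Serdes.String(), Serdes.Long())); // placeholder topic
    }
}
```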
What type of metrics would you monitor to track the rate at which messages are being produced to a Kafka topic?
Broker disk usage metrics
Producer request rate metrics
Replication lag metrics
Consumer lag metrics
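The Java producer exposes these rates through its metrics registry (the same metrics are published via JMX for external monitoring); record-send-rate is the usual per-second production-rate signal. A sketch that sends one record and prints that metric; the address and topic are placeholders:

```java
import java.util.Map;
import java.util.Properties;
import org.apache.kafka.clients.producer.*;
import org.apache.kafka.common.Metric;
import org.apache.kafka.common.MetricName;
import org.apache.kafka.common.serialization.StringSerializer;

public class ProducerRateProbe {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // hypothetical address
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            producer.send(new ProducerRecord<>("orders", "k", "v")); // topic illustrative
            producer.flush();
            // record-send-rate: average number of records sent per second.
            for (Map.Entry<MetricName, ? extends Metric> e : producer.metrics().entrySet()) {
                if (e.getKey().name().equals("record-send-rate")) {
                    System.out.println(e.getKey().group() + " " + e.getValue().metricValue());
                }
            }
        }
    }
}
```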
What is the significance of 'Exactly Once Semantics' in Kafka Streams?
It ensures that records are processed in the exact order they were produced.
It prioritizes speed over accuracy in data processing.
It prevents duplicate processing of records even in the event of failures.
It guarantees that each record is processed at least once.
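In Kafka Streams, exactly-once semantics is enabled with a single configuration entry; the runtime then commits input offsets, state-store changelog writes, and output records in one transaction, so a failed-and-retried task cannot produce duplicates downstream. A minimal sketch; the application id and address are placeholders:

```java
import java.util.Properties;
import org.apache.kafka.streams.StreamsConfig;

public class EosConfig {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "eos-demo");          // placeholder id
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder address
        // exactly_once_v2 (Kafka 2.8+): wrap consumption, state updates,
        // and production in a single atomic transaction per commit.
        props.put(StreamsConfig.PROCESSING_GUARANTEE_CONFIG, StreamsConfig.EXACTLY_ONCE_V2);
        System.out.println(props);
    }
}
```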
How does Kafka Streams achieve fault tolerance?
By relying solely on message acknowledgments from consumers.
By using a single, centralized processing unit.
By replicating stream processing tasks across multiple nodes.
By storing all processed data in a separate, redundant database.
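Task replication in Streams is configured rather than coded: every state store is backed by a changelog topic, and num.standby.replicas keeps warm copies of each task's state on other instances so a failed task can resume elsewhere without replaying the whole changelog. A sketch of the relevant settings; the values are illustrative:

```java
import java.util.Properties;
import org.apache.kafka.streams.StreamsConfig;

public class FaultToleranceConfig {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "ft-demo");           // placeholder id
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder address
        // Keep one standby copy of each task's state store on another
        // instance, enabling fast failover.
        props.put(StreamsConfig.NUM_STANDBY_REPLICAS_CONFIG, 1);
        // Replication factor for the internal changelog/repartition topics
        // that back each state store.
        props.put(StreamsConfig.REPLICATION_FACTOR_CONFIG, 3);
        System.out.println(props);
    }
}
```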