Skip to content
JSBlogs
Go back

Apache Kafka 4.2 — what's new and what it means for Java developers

Apache Kafka 4.2 was released on 17 February 2026. The headline is Share Groups — Kafka’s native queue-style consumption model — reaching production-ready status after several preview releases. Alongside that, Kafka Streams picks up dead letter queue support and server-side rebalance goes GA, the CLI gets a long-overdue consistency pass, and Java 25 is now supported.

This post covers every meaningful change with the context you need to decide what to adopt now and what to keep an eye on.

Table of contents

Open Table of contents

Share Groups (Kafka Queues) — production ready

Share Groups are the biggest conceptual addition to Kafka’s consumption model since consumer groups were introduced. Understanding what problem they solve is the starting point.

The traditional consumer group model

In a standard consumer group, each partition is assigned exclusively to one consumer at a time. If you want to parallelise processing of a single high-volume partition, your only option is to add more partitions. More partitions mean more coordination overhead, cannot be decreased without data reshuffling, and do not help if records within a partition have ordering dependencies.

graph LR
    subgraph cg ["Consumer Group"]
        P0["Partition 0"] --> CA["Consumer A"]
        P1["Partition 1"] --> CB["Consumer B"]
        P2["Partition 2"] --> CC["Consumer C"]
    end

The Share Group model

Share Groups break the exclusive partition ownership contract. Multiple consumers within a share group can all read from the same partition concurrently. Each record is delivered to one consumer for processing, and that consumer acknowledges the outcome explicitly — accepted, released for redelivery, or rejected permanently.

graph LR
    subgraph sg ["Share Group"]
        SP0["Partition 0"] --> SA["Worker A"]
        SP0 --> SB["Worker B"]
        SP0 --> SC["Worker C"]
        SP1["Partition 1"] --> SA
        SP1 --> SB
        SP1 --> SC
    end

This is the semantics of a message queue (RabbitMQ, SQS) built natively on top of Kafka’s log structure. You get fan-out consumption, explicit acknowledgement, and automatic redelivery — without a separate messaging system.

Using a Share Group in Java

Share Groups use ShareConsumer rather than KafkaConsumer. The subscription and poll loop look similar; acknowledgement is the key difference.

Properties props = new Properties();
props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
props.put(ConsumerConfig.GROUP_ID_CONFIG, "order-processors");
props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());

try (ShareConsumer<String, String> consumer = new ShareConsumer<>(props)) {
    consumer.subscribe(List.of("orders"));

    while (true) {
        ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));

        for (ConsumerRecord<String, String> record : records) {
            try {
                processOrder(record.value());
                // Record processed successfully — release offset permanently
                consumer.acknowledge(record, AcknowledgeType.ACCEPT);

            } catch (TransientException e) {
                // Transient failure — release for redelivery to any consumer
                consumer.acknowledge(record, AcknowledgeType.RELEASE);

            } catch (PoisonPillException e) {
                // Unrecoverable — reject permanently (counts against max retries)
                consumer.acknowledge(record, AcknowledgeType.REJECT);
            }
        }

        consumer.commitSync(); // flush acknowledgements to the broker
    }
}

New in 4.2 — the RENEW acknowledgement type (KIP-1222)

Before 4.2, if your processing took longer than the acquisition lock timeout, Kafka would assume the consumer failed and redeliver the record to another consumer — even while the original consumer was still working. The only option was to tune delivery.timeout.ms globally.

AcknowledgeType.RENEW solves this without tuning: while a record is being processed, you can extend its lock before it expires, then acknowledge when done.

for (ConsumerRecord<String, String> record : records) {
    // Kick off long-running work asynchronously
    CompletableFuture<Void> work = processAsync(record.value());

    // Renew the lock before it expires while work is still running
    scheduler.scheduleAtFixedRate(
        () -> consumer.acknowledge(record, AcknowledgeType.RENEW),
        30, 30, TimeUnit.SECONDS
    );

    work.join(); // wait for completion
    consumer.acknowledge(record, AcknowledgeType.ACCEPT);
}

The full lifecycle of a record through a Share Group, including the new RENEW state:

stateDiagram-v2
    direction LR
    [*] --> Available : Record produced to topic
    Available --> Acquired : Consumer polls
    Acquired --> Acquired : RENEW — extend lock timeout
    Acquired --> Available : RELEASE — retry by another consumer
    Acquired --> Accepted : ACCEPT — processing succeeded
    Acquired --> Rejected : REJECT — unrecoverable failure
    Accepted --> [*]
    Rejected --> [*]

New in 4.2 — share partition lag metrics (KIP-1226)

Monitoring consumption lag with Share Groups previously required custom tooling. 4.2 adds dedicated lag metrics exposed through JMX and the metrics endpoint:

kafka.server:type=ShareGroupMetrics,name=SharePartitionLag,group=order-processors,topic=orders,partition=0

These flow into Prometheus/Grafana through the standard Kafka JMX exporter without any additional configuration.

When to use Share Groups vs Consumer Groups

Consumer GroupsShare Groups
Processing modelPartition-per-consumer, orderedAny consumer, concurrent
Ordering guaranteePer-partitionNot guaranteed
AcknowledgementOffset commitPer-record (Accept/Release/Reject)
RedeliveryManual DLQ wiringBuilt-in (RELEASE / REJECT)
Best forEvent sourcing, ordered streamsTask queues, async job dispatch

Tip: Share Groups are a good fit when you are currently running a Kafka consumer alongside a separate RabbitMQ or SQS queue for job dispatch. You can consolidate the two into a single Share Group topic, reducing operational surface area while keeping the queue semantics your application code already expects.

Kafka Streams improvements

Dead letter queue support in exception handlers (KIP-1034)

Before 4.2, when a Kafka Streams topology hit a deserialization error or processing failure on a record, the options were: stop the application, skip the record, or write custom error-routing code in every processor. There was no standard mechanism to route failed records to a dedicated topic.

4.2 adds dead letter queue support to both the deserialization exception handler and the processing exception handler.

flowchart LR
    T[("Input Topic")] --> S["Streams Topology"]
    S -->|success| O[("Output Topic")]
    S -->|DeserializationException| D["Deserialization\nException Handler"]
    S -->|ProcessingException| P["Processing\nException Handler"]
    D -->|CONTINUE| DLQ[("DLQ Topic")]
    P -->|CONTINUE| DLQ
    D -->|FAIL| STOP(["Stop Application"])
    P -->|FAIL| STOP

Handling deserialization failures:

public class OrderDeserializationHandler implements DeserializationExceptionHandler {

    @Override
    public DeserializationHandlerResponse handle(
            ErrorHandlerContext context,
            ConsumerRecord<byte[], byte[]> record,
            Exception exception) {

        // Write the raw bytes and headers to a DLQ topic via a side-channel producer
        dlqProducer.send(new ProducerRecord<>(
            "orders.dlq",
            record.key(),
            buildErrorEnvelope(record.value(), exception, context)
        ));

        return DeserializationHandlerResponse.CONTINUE; // skip and keep the topology running
    }
}
// Wire it in StreamsConfig
props.put(
    StreamsConfig.DEFAULT_DESERIALIZATION_EXCEPTION_HANDLER_CLASS_CONFIG,
    OrderDeserializationHandler.class
);

Handling processing failures:

public class OrderProcessingHandler implements ProcessingExceptionHandler {

    @Override
    public ProcessingHandlerResponse handle(
            ErrorHandlerContext context,
            Record<?, ?> record,
            Exception exception) {

        dlqProducer.send(new ProducerRecord<>(
            "orders.processing.dlq",
            serializeKey(record.key()),
            buildErrorEnvelope(record.value(), exception, context)
        ));

        return ProcessingHandlerResponse.CONTINUE;
    }
}

Important: The DLQ producer in these handlers is managed by your code, not Kafka Streams. Ensure it is thread-safe and that you handle producer errors separately — a DLQ write failure inside an exception handler should not crash the topology.

Server-side rebalance protocol — GA (KIP-1071)

The server-side (cooperative) rebalance protocol for Kafka Streams, which moved task assignment from the consumer side to the broker, is now generally available with a stable feature set. This protocol reduces the rebalance scope: only partitions that need to move are reassigned, rather than triggering a stop-the-world rebalance for all consumers whenever membership changes.

// Enable server-side rebalance (recommended for new deployments)
props.put(StreamsConfig.GROUP_PROTOCOL_CONFIG, "streams");

For existing deployments on the classic rebalance protocol, migration requires a rolling restart — set the config, redeploy one instance at a time.

Anchored wall-clock punctuation (KIP-1146)

PunctuationType.WALL_CLOCK_TIME punctuations in Kafka Streams previously scheduled the next punctuation relative to the last execution. Over time this drifts — a one-minute punctuation scheduled at 12:00:00 would eventually fire at 12:00:07, then 12:00:14.

4.2 adds anchored scheduling: the first punctuation fires at the next aligned wall-clock time for the given interval, and subsequent firings stay anchored to that grid.

@Override
public void init(ProcessorContext<K, V> context) {
    // Before (relative — drifts over time)
    context.schedule(Duration.ofMinutes(1), PunctuationType.WALL_CLOCK_TIME, ts -> flush());

    // After (anchored — always fires at :00, :01, :02...)
    context.schedule(
        Duration.ofMinutes(1),
        PunctuationType.WALL_CLOCK_TIME,
        Duration.ZERO,   // offset from interval boundary
        ts -> flush()
    );
}

This matters for windowing aggregations and periodic flushes where predictable timing affects downstream correctness (e.g. hourly summaries that must align to hour boundaries).

Fluent CloseOptions API (KIP-1153)

Shutting down a Kafka Streams application previously always sent a leave-group request, causing a rebalance before the application finished closing. For rolling restarts where the same instance will rejoin immediately, this rebalance is unnecessary and adds latency.

// Before — always sends leave-group, always triggers rebalance
streams.close();

// After — control leave-group behaviour and shutdown timeout explicitly
streams.close(new KafkaStreams.CloseOptions()
    .leaveGroup(false)          // skip leave-group for rolling restarts
    .timeout(Duration.ofSeconds(30))
);

// Or: send leave-group explicitly for a clean final shutdown
streams.close(new KafkaStreams.CloseOptions()
    .leaveGroup(true)
    .timeout(Duration.ofSeconds(60))
);

Tip: Use leaveGroup(false) in your Kubernetes rolling restart lifecycle hook. The new pod joins before the old one has fully stopped, and the rebalance happens once rather than twice.

Observability and metrics

Metric naming standardisation (KIP-1100)

Kafka’s JMX metric names previously mixed naming conventions across components. 4.2 corrects them to follow the kafka.COMPONENT pattern consistently. If your Prometheus or Datadog dashboards reference metric names by exact string, review them before upgrading.

The Kafka JMX exporter configuration and Confluent Platform dashboards have been updated to match. Custom dashboards built on raw JMX names will need updating.

New idle ratio metrics

Two new metrics provide visibility into whether the controller and MetadataLoader are keeping up with work:

Add these to your cluster health dashboards alongside existing RequestHandlerAvgIdlePercent metrics.

Generic feature level metrics (KIP-1180)

New JMX metrics expose the finalized, minimum supported, and maximum supported feature level for each Kafka production feature. This makes it easier to track upgrade state across a cluster:

kafka.server:type=FeatureMetrics,name=FinalizedFeatureLevel,feature=share-version
kafka.server:type=FeatureMetrics,name=MinSupportedFeatureLevel,feature=share-version
kafka.server:type=FeatureMetrics,name=MaxSupportedFeatureLevel,feature=share-version

Useful for validating that a rolling broker upgrade has fully completed before enabling new features.

Admin API improvements

Rack ID in member descriptions (KIP-1227)

AdminClient.describeConsumerGroups() and AdminClient.describeShareGroups() now include the rack ID of each member in the response. This is useful for diagnosing uneven partition assignment across availability zones.

try (AdminClient admin = AdminClient.create(props)) {
    DescribeConsumerGroupsResult result = admin.describeConsumerGroups(
        List.of("order-processors")
    );

    ConsumerGroupDescription description = result.all().get().get("order-processors");

    for (MemberDescription member : description.members()) {
        System.out.printf("Consumer %s is on rack %s%n",
            member.consumerId(),
            member.rackId().orElse("unknown")  // new in 4.2
        );
    }
}

CLI standardisation (KIP-1147)

Most Kafka CLI tools now accept --bootstrap-server as the standard broker connection argument. Previously, several tools used --broker-list, --zookeeper, or other flags — a frequent source of confusion, especially for developers switching between tools.

# Before — different tools, different flags
kafka-topics.sh --zookeeper localhost:2181 --list                # deprecated
kafka-console-producer.sh --broker-list localhost:9092 --topic orders

# After — consistent across all tools in 4.2
kafka-topics.sh --bootstrap-server localhost:9092 --list
kafka-console-producer.sh --bootstrap-server localhost:9092 --topic orders
kafka-consumer-groups.sh --bootstrap-server localhost:9092 --list
kafka-configs.sh --bootstrap-server localhost:9092 --entity-type brokers --describe

Caution: Zookeeper-based CLI flags continue to be deprecated and will be removed in Kafka 5.0. Any automation scripts still using --zookeeper should be updated now — they will stop working in the next major version.

JsonConverter external schema support (KIP-1054)

Kafka Connect’s JsonConverter traditionally embedded the full schema inside every message:

{
  "schema": {
    "type": "struct",
    "fields": [
      { "type": "string", "field": "id" },
      { "type": "int64", "field": "amount" },
      { "type": "string", "field": "currency" }
    ]
  },
  "payload": { "id": "ord-123", "amount": 4999, "currency": "USD" }
}

For topics with large or complex schemas, the schema portion can dwarf the payload, wasting storage and network bandwidth.

4.2 adds support for referencing an external schema by ID, reducing the message to payload only:

{ "id": "ord-123", "amount": 4999, "currency": "USD" }

Enable it in your connector configuration:

value.converter=org.apache.kafka.connect.json.JsonConverter
value.converter.schemas.enable=false
value.converter.schema.registry.url=http://schema-registry:8081

This is particularly valuable for high-throughput connectors where schema overhead becomes a measurable cost at scale.

Other notable changes

Java 25 support

Kafka 4.2 is tested and supported on Java 25 (the latest LTS). The minimum Java version remains Java 11 for clients and Java 17 for brokers.

Thread-safe RecordHeader (KIP-1205)

RecordHeader was not thread-safe — accessing headers from multiple threads required external synchronisation. In 4.2, reads and writes to RecordHeader are thread-safe without additional locking. Applications that share header objects across threads no longer need to wrap accesses in synchronized blocks.

Dynamic configuration for remote log manager thread pool (KIP-1179)

For clusters using tiered storage, the thread pool size for the remote log manager can now be updated dynamically without a broker restart:

kafka-configs.sh --bootstrap-server localhost:9092 \
  --entity-type brokers --entity-default \
  --alter --add-config remote.log.manager.thread.pool.size=8

ConsumerPerformance regex topic filtering (KIP-1192)

The kafka-consumer-perf-test.sh tool now accepts a --topic-regex flag to subscribe to multiple topics matching a pattern — useful for load testing multiple topic variants at once.

kafka-consumer-perf-test.sh \
  --bootstrap-server localhost:9092 \
  --topic-regex "orders-.*" \
  --messages 1000000

Deprecations — heading to Kafka 5.0

These are deprecated in 4.2 and will be removed in 5.0:

ItemReplacementKIP
ConsumerGroupMetadata constructorsConsumerGroupMetadata.of(...) factory methodKIP-1136
MX4j JMX HTTP adapterJConsole, JMX exporter, or JolokiaKIP-1193
BrokerNotFoundExceptionKafkaExceptionKIP-1195
Zookeeper-based CLI flags (--zookeeper, --broker-list)--bootstrap-serverKIP-1147

If you use ConsumerGroupMetadata directly in your transaction code, update the construction pattern now:

// Deprecated in 4.2 — removed in 5.0
ConsumerGroupMetadata meta = new ConsumerGroupMetadata(groupId);

// Use instead
ConsumerGroupMetadata meta = new ConsumerGroupMetadata(
    groupId, generationId, memberId, groupInstanceId
);
// Or the new factory method (check 4.2 Javadoc for exact signature)

Note: MX4j was an open-source JMX HTTP bridge that many older Kafka monitoring setups relied on. If your cluster monitoring still uses MX4j, plan the migration to the Prometheus JMX exporter (jmx_prometheus_javaagent) before Kafka 5.0 removes the adapter entirely.

Upgrading to Kafka 4.2

Broker upgrade: Rolling upgrade from 4.1 to 4.2 is supported. Upgrade brokers one at a time in KRaft mode — the active controller is automatically transferred during each broker restart.

Feature flags: New features that require metadata version upgrades (Share Groups, server-side Streams rebalance) are gated behind feature flags. Enable them explicitly after all brokers are on 4.2:

# Verify all brokers are on 4.2 before enabling new features
kafka-features.sh --bootstrap-server localhost:9092 describe

# Enable share groups feature level
kafka-features.sh --bootstrap-server localhost:9092 \
  upgrade --feature share-version

Client compatibility: Kafka clients are forward and backward compatible across two minor versions. Clients on 4.0 or 4.1 will continue to work against a 4.2 broker. ShareConsumer requires a 4.2 (or later) client library.

Spring Kafka: Spring for Apache Kafka 4.0 (released November 2025 with Spring Boot 4) supports Kafka 4.x. Upgrade spring-kafka to 4.0.x alongside the broker upgrade.

<!-- Spring Boot 4 manages the version via BOM -->
<dependency>
    <groupId>org.springframework.kafka</groupId>
    <artifactId>spring-kafka</artifactId>
</dependency>

Tip: If you are still on Kafka 3.x, upgrading to 4.2 first requires a stop at 4.0 to complete the ZooKeeper-to-KRaft migration if you have not already done so. Kafka 4.0 dropped ZooKeeper support entirely — there is no ZooKeeper mode in 4.x.

Feature summary

FeatureAreaStatus
Share Groups (Kafka Queues)ConsumptionGA
RENEW acknowledgement typeShare GroupsNew
Share partition lag metricsObservabilityNew
Kafka Streams DLQ supportStreamsNew
Streams server-side rebalanceStreamsGA
Anchored wall-clock punctuationStreamsNew
Fluent CloseOptions APIStreamsNew
Metric naming standardisationObservabilityImproved
Controller / MetadataLoader idle metricsObservabilityNew
Feature level metricsObservabilityNew
Rack ID in group member descriptionsAdmin APINew
CLI --bootstrap-server standardisationCLIImproved
JsonConverter external schemaConnectNew
Java 25 supportPlatformNew
Thread-safe RecordHeaderClientFixed
Dynamic remote log manager configStorageNew

References


Share this post on:

Next Post
What to learn next — your AI engineering learning path after this course