librdkafka - the Apache Kafka C/C++ client library

Overview

Copyright (c) 2012-2020, Magnus Edenhill.

https://github.com/edenhill/librdkafka

librdkafka is a C library implementation of the Apache Kafka protocol, providing Producer, Consumer and Admin clients. It was designed with message delivery reliability and high performance in mind; current figures exceed 1 million msgs/second for the producer and 3 million msgs/second for the consumer.

librdkafka is licensed under the 2-clause BSD license.

KAFKA is a registered trademark of The Apache Software Foundation and has been licensed for use by librdkafka. librdkafka has no affiliation with and is not endorsed by The Apache Software Foundation.

Features

  • Full Exactly-Once-Semantics (EOS) support
  • High-level producer, including Idempotent and Transactional producers
  • High-level balanced KafkaConsumer (requires broker >= 0.9)
  • Simple (legacy) consumer
  • Admin client
  • Compression: snappy, gzip, lz4, zstd
  • SSL support
  • SASL (GSSAPI/Kerberos/SSPI, PLAIN, SCRAM, OAUTHBEARER) support
  • Full list of supported KIPs
  • Broker version support: >=0.8 (see Broker version compatibility)
  • Guaranteed API stability for C & C++ APIs (ABI safety guaranteed for C)
  • Statistics metrics
  • Debian package: librdkafka1 and librdkafka-dev in Debian and Ubuntu
  • RPM package: librdkafka and librdkafka-devel
  • Gentoo package: dev-libs/librdkafka
  • Portable: runs on Linux, MacOS X, Windows, Solaris, FreeBSD, AIX, ...

Documentation

NOTE: The master branch is actively developed; use the latest release for production use.

Installation

Installing prebuilt packages

On Mac OSX, install librdkafka with homebrew:

$ brew install librdkafka

On Debian and Ubuntu, install librdkafka from the Confluent APT repositories, see instructions here and then install librdkafka:

$ apt install librdkafka-dev

On RedHat, CentOS, Fedora, install librdkafka from the Confluent YUM repositories, instructions here and then install librdkafka:

$ yum install librdkafka-devel

On Windows, reference librdkafka.redist NuGet package in your Visual Studio project.

For other platforms, follow the source building instructions below.

Installing librdkafka using vcpkg

You can download and install librdkafka using the vcpkg dependency manager:

# Install vcpkg if not already installed
$ git clone https://github.com/Microsoft/vcpkg.git
$ cd vcpkg
$ ./bootstrap-vcpkg.sh
$ ./vcpkg integrate install

# Install librdkafka
$ vcpkg install librdkafka

The librdkafka package in vcpkg is kept up to date by Microsoft team members and community contributors. If the version is out of date, please create an issue or pull request on the vcpkg repository.

Build from source

Requirements

The GNU toolchain
GNU make
pthreads
zlib-dev (optional, for gzip compression support)
libssl-dev (optional, for SSL and SASL SCRAM support)
libsasl2-dev (optional, for SASL GSSAPI support)
libzstd-dev (optional, for ZStd compression support)

NOTE: Static linking of ZStd (requires zstd >= 1.2.1) in the producer enables encoding the original size in the compression frame header, which will speed up the consumer. Use STATIC_LIB_libzstd=/path/to/libzstd.a ./configure --enable-static to enable static ZStd linking. MacOSX example: STATIC_LIB_libzstd=$(brew ls -v zstd | grep libzstd.a$) ./configure --enable-static

Building

  ./configure
  # Or, to automatically install dependencies using the system's package manager:
  # ./configure --install-deps
  # Or, build dependencies from source:
  # ./configure --install-deps --source-deps-only

  make
  sudo make install

NOTE: See README.win32 for instructions on how to build on Windows with Microsoft Visual Studio.

NOTE: See CMake instructions for experimental CMake build (unsupported).

Usage in code

  1. Refer to the examples directory for code using:
  • Producers: basic producers, idempotent producers, transactional producers.
  • Consumers: basic consumers, reading batches of messages.
  • Performance and latency testing tools.
  2. Refer to the examples GitHub repo for code connecting to a cloud streaming data service based on Apache Kafka.

  3. Link your program with -lrdkafka (C) or -lrdkafka++ (C++).
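
A minimal C producer, as a sketch only (the broker address "localhost:9092" and the topic "test" are placeholders, and error handling is abbreviated):

  #include <stdio.h>
  #include <string.h>
  #include <librdkafka/rdkafka.h>

  int main(void) {
          char errstr[512];
          rd_kafka_conf_t *conf = rd_kafka_conf_new();

          if (rd_kafka_conf_set(conf, "bootstrap.servers", "localhost:9092",
                                errstr, sizeof(errstr)) != RD_KAFKA_CONF_OK) {
                  fprintf(stderr, "%s\n", errstr);
                  return 1;
          }

          /* rd_kafka_new() takes ownership of conf on success. */
          rd_kafka_t *rk = rd_kafka_new(RD_KAFKA_PRODUCER, conf, errstr, sizeof(errstr));
          if (!rk) {
                  fprintf(stderr, "%s\n", errstr);
                  return 1;
          }

          const char *payload = "hello librdkafka";
          rd_kafka_resp_err_t err = rd_kafka_producev(
                  rk,
                  RD_KAFKA_V_TOPIC("test"),
                  RD_KAFKA_V_MSGFLAGS(RD_KAFKA_MSG_F_COPY),
                  RD_KAFKA_V_VALUE((void *)payload, strlen(payload)),
                  RD_KAFKA_V_END);
          if (err)
                  fprintf(stderr, "produce failed: %s\n", rd_kafka_err2str(err));

          rd_kafka_flush(rk, 10 * 1000);   /* wait for outstanding delivery reports */
          rd_kafka_destroy(rk);
          return 0;
  }

Compile and link as described above, e.g.: cc producer_example.c -o producer_example -lrdkafka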

Commercial support

Commercial support is available from Confluent Inc.

Community support

Only the last official release is supported for community members.

File bug reports, feature requests and questions using GitHub Issues

Questions and discussions are also welcome on the Confluent Community slack #clients channel.

Language bindings

See Powered by librdkafka for an incomplete list of librdkafka users.

Comments
  • configuring kerberos in librdkafka using gnu library

    configuring kerberos in librdkafka using gnu library

    Unable to connect to a Kerberos-enabled broker (Kafka consumer) using the librdkafka producer. Built with WITH_SASL=y and libsasl2 installed. Configuration set: sasl.mechanism=GSSAPI, sasl.kerberos.service.name=kafka, sasl.kerberos.kinit.command=/usr/bin/kinit
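
    A minimal sketch, assuming the C API, of setting the GSSAPI properties mentioned above (broker address, principal and keytab path are placeholders; rd_kafka_conf_set() return codes are not checked):

    #include <librdkafka/rdkafka.h>

    /* All broker/principal/keytab values below are placeholders for illustration. */
    static rd_kafka_t *new_gssapi_producer(void) {
            char errstr[512];
            rd_kafka_conf_t *conf = rd_kafka_conf_new();

            rd_kafka_conf_set(conf, "bootstrap.servers", "broker1:9092", errstr, sizeof(errstr));
            rd_kafka_conf_set(conf, "security.protocol", "sasl_plaintext", errstr, sizeof(errstr));
            rd_kafka_conf_set(conf, "sasl.mechanism", "GSSAPI", errstr, sizeof(errstr));
            rd_kafka_conf_set(conf, "sasl.kerberos.service.name", "kafka", errstr, sizeof(errstr));
            rd_kafka_conf_set(conf, "sasl.kerberos.principal", "kafkaclient", errstr, sizeof(errstr));
            rd_kafka_conf_set(conf, "sasl.kerberos.keytab", "/etc/security/client.keytab", errstr, sizeof(errstr));

            /* On success rd_kafka_new() takes ownership of conf. */
            return rd_kafka_new(RD_KAFKA_PRODUCER, conf, errstr, sizeof(errstr));
    }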

    How to reproduce

    Checklist

    Please provide the following information:

    • [ ] librdkafka version (release number or git tag):
    • [ ] Apache Kafka version:
    • [x] librdkafka client configuration:
    • [ ] Operating system:
    • [ ] Using the legacy Consumer
    • [ ] Using the high-level KafkaConsumer
    • [ ] Provide logs (with debug=.. as necessary) from librdkafka
    • [ ] Provide broker log excerpts
    • [ ] Critical issue
    opened by kanchanagarwal3 80
  • Continuing on PR #1882: SSL cert verification and in-memory keys

    Continuing on PR #1882: SSL cert verification and in-memory keys

    This PR is a continuation on @noahdav's work on PR #1882.

    Summary of changes compared to #1882:

    • The retrieve_cb() has been removed in favour of using standard configuration properties. The reasoning behind this is simplification, reduced code size, cross-language maintenance (some high-level languages, such as Go, can't use callbacks in a good way), and memory ownership issues (having the callback return cert memory to librdkafka in an effective and allocator-agnostic way is tricky).
    • Refactored the code and made it more in line with the librdkafka style guide.
    • Values of security-sensitive configuration properties are now cleared from memory when the value is no longer used, or on rd_kafka_t destruction. This is used for ssl.certificate.pem, ssl.key.pem and ssl.key.password. A short configuration sketch follows this list.
    • The ssl_cert_verify_cb was changed to provide more information to the callback, including the full X509_STORE_CTX.
    • Add integration tests.
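
    As a rough illustration of the configuration-property approach described above (the PEM strings come from wherever the application keeps its secrets; this is a sketch, not the PR's code, and return-code checks are omitted):

    #include <librdkafka/rdkafka.h>

    static rd_kafka_conf_t *new_ssl_conf(const char *cert_pem, const char *key_pem,
                                         const char *key_password) {
            char errstr[512];
            rd_kafka_conf_t *conf = rd_kafka_conf_new();

            rd_kafka_conf_set(conf, "security.protocol", "ssl", errstr, sizeof(errstr));
            /* In-memory PEM material; librdkafka clears these values when no longer needed. */
            rd_kafka_conf_set(conf, "ssl.certificate.pem", cert_pem, errstr, sizeof(errstr));
            rd_kafka_conf_set(conf, "ssl.key.pem", key_pem, errstr, sizeof(errstr));
            rd_kafka_conf_set(conf, "ssl.key.password", key_password, errstr, sizeof(errstr));
            return conf;
    }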

    To be done:

    • Rework the Windows SSL cert store example to its own file.
    opened by edenhill 75
  • sasl authentication failure error using librdkafka library

    sasl authentication failure error using librdkafka library

    %3|1484896879.875|FAIL|rdkafka#producer-1| sasl_plaintext://hdfs500.host.mobistar.be:9092/bootstrap: Failed to initialize SASL authentication: SASL handshake failed (start): SASL(-1): generic failure: GSSAPI Error: Unspecified GSS failure. Minor code may provide more information (Matching credential not found)
    %3|1484896879.875|ERROR|rdkafka#producer-1| sasl_plaintext://hdfs500.host.mobistar.be:9092/bootstrap: Failed to initialize SASL authentication: SASL handshake failed (start): SASL(-1): generic failure: GSSAPI Error: Unspecified GSS failure. Minor code may provide more information (Matching credential not found)
    %3|1484896879.875|ERROR|rdkafka#producer-1| 1/1 brokers are down
    %7|1484896880.892|SASL|rdkafka#producer-1| sasl_plaintext://hdfs500.host.mobistar.be:9092/bootstrap: Initializing SASL client: service name kafka, hostname hdfs500.host.mobistar.be, mechanisms GSSAPI
    %7|1484896880.892|SASLREFRESH|rdkafka#producer-1| Refreshing SASL keys with command: /usr/bin/kinit -S "kafka/hdfs500.host.mobistar.be" -k -t " " [email protected]
    kinit: Key table file '

    Checklist

    Please provide the following information:

    • [ ] librdkafka version (release number or git tag):
    • [ ] Apache Kafka version:
    • [ ] librdkafka client configuration:
    • [ ] Operating system:
    • [ ] Using the legacy Consumer
    • [ ] Using the high-level KafkaConsumer
    • [ ] Provide logs (with debug=.. as necessary) from librdkafka
    • [ ] Provide broker log excerpts
    • [ ] Critical issue
    opened by kanchanagarwal3 75
  • rdkafka can't be built with native compilers

    rdkafka can't be built with native compilers

    We need to build opensource products with the native compiler, e.g. SUN's Workshop tools on Solaris machines not gcc/g++. The code uses gcc specific conventions and the mklove tools assume the same. It would be useful to convert this to autoconf and to make the code compiler agnostic.

    Would you be open to take changes for this?

    portability 
    opened by shalstea 74
  • EOS support

    EOS support

    Support for Exactly-Once-Semantics as introduced in Apache Kafka 0.11.0.

    The message format is already supported which means librdkafka may be used with EOS-supporting clients (official Java clients) without a performance penalty.

    To track popularity, please add a +1 reaction to this comment if this is a feature you need

    EOS is made up of three parts (a rough C sketch of the producer-transaction flow follows the list below):

    • Idempotent producer - guarantees messages are produced once and in order (enable.idempotence=true)
    • Producer Transaction support - commit or fail a set of produced messages and consumer offsets across partitions (BeginTransaction(), SendOffsetsToTransaction(), etc)
    • Consumer transaction support - filter out messages for aborted transactions, and wait for transactions to be committed before processing (isolation.level=read_committed)
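
    For orientation, a rough sketch of the producer-transaction flow as exposed by the C API in librdkafka >= v1.4.0 (broker address, transactional.id and topic are placeholders; checks on begin_transaction()/producev() are trimmed):

    #include <stdio.h>
    #include <string.h>
    #include <librdkafka/rdkafka.h>

    int main(void) {
            char errstr[512];
            rd_kafka_conf_t *conf = rd_kafka_conf_new();
            rd_kafka_conf_set(conf, "bootstrap.servers", "localhost:9092", errstr, sizeof(errstr));
            rd_kafka_conf_set(conf, "transactional.id", "txn-example", errstr, sizeof(errstr));

            rd_kafka_t *rk = rd_kafka_new(RD_KAFKA_PRODUCER, conf, errstr, sizeof(errstr));

            /* Acquire the PID / fence any older instance with the same transactional.id. */
            rd_kafka_error_t *error = rd_kafka_init_transactions(rk, 30 * 1000);
            if (!error) {
                    const char *msg = "inside a transaction";
                    rd_kafka_begin_transaction(rk);
                    rd_kafka_producev(rk,
                                      RD_KAFKA_V_TOPIC("test"),
                                      RD_KAFKA_V_MSGFLAGS(RD_KAFKA_MSG_F_COPY),
                                      RD_KAFKA_V_VALUE((void *)msg, strlen(msg)),
                                      RD_KAFKA_V_END);
                    /* Commits everything produced since begin, or abort with
                     * rd_kafka_abort_transaction() to make nothing visible to
                     * read_committed consumers. */
                    error = rd_kafka_commit_transaction(rk, 30 * 1000);
            }
            if (error) {
                    fprintf(stderr, "%s\n", rd_kafka_error_string(error));
                    rd_kafka_error_destroy(error);
            }
            rd_kafka_destroy(rk);
            return 0;
    }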

    We'd be interested to know which functionality you are planning to use, use reactions to vote:

    • 😆 - Idempotent Producer - supported in librdkafka v1.0.0
    • 🎉 - Producer transactions - supported in v1.4.0
    • ❤️ - Consumer transactions - supported in v1.2.0
    enhancement conformance 
    opened by edenhill 70
  • Occasional assert failed on macOS with 0.9.3.x

    Occasional assert failed on macOS with 0.9.3.x

    Description

    Assertion failed: (r == 0), function rwlock_wrlock, file ../deps/librdkafka/src/tinycthread.c, line 1000.
    

    happens occasionally on macOS with librdkafka > 0.9.3.x

    How to reproduce

    running the entire e2e suite of https://github.com/Blizzard/node-rdkafka with its librdkafka submodule pointing at current head of 0.9.3.x https://github.com/edenhill/librdkafka/commit/81a673da0d1ed457d8c19dd5896651ffd178289d

    NOTE - does NOT seem to happen with current HEAD master https://github.com/edenhill/librdkafka/commit/c73d2bbaf6a4c15ea9d196ae1f9d79e448e119e6

    Checklist

    Please provide the following information:

    • [ ] librdkafka version (release number or git tag): 81a673da0d1ed457d8c19dd5896651ffd178289d
    • [ ] Apache Kafka version: 0.10.1.1
    • [ ] librdkafka client configuration: defaults
    • [ ] Operating system: macOS
    • [ ] Using the legacy Consumer
    • [ ] Using the high-level KafkaConsumer
    • [ ] Provide logs (with debug=.. as necessary) from librdkafka
    • [ ] Provide broker log excerpts
    • [ ] Critical issue
    wait-info cant reproduce 
    opened by edoardocomar 66
  • 0.11 consumer memory usage and slow performance

    0.11 consumer memory usage and slow performance

    Hi, I used the high-level consumer API with the rd_kafka_consume_batch_queue() method. I found that memory usage climbs quickly as soon as the consumer is started, before it has consumed any messages, and it can go up to 21 GB. It seems there is a background thread that fetches messages continually and buffers them until EOF. The topic I'm fetching has 4 million messages of about 1 KB each. So how can I limit the memory usage?
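
    The prefetching described above is bounded by the consumer's local fetch-queue settings; a minimal sketch of tightening them from C (the values are illustrative only, chosen much lower than the large defaults; return-code checks omitted):

    #include <librdkafka/rdkafka.h>

    static rd_kafka_conf_t *new_bounded_consumer_conf(void) {
            char errstr[512];
            rd_kafka_conf_t *conf = rd_kafka_conf_new();

            /* queued.min.messages: how many messages librdkafka tries to keep
             * prefetched per partition; queued.max.messages.kbytes: size cap on
             * the prefetch queue. Lowering both bounds the consumer's memory use. */
            rd_kafka_conf_set(conf, "queued.min.messages", "1000", errstr, sizeof(errstr));
            rd_kafka_conf_set(conf, "queued.max.messages.kbytes", "65536", errstr, sizeof(errstr));
            return conf;
    }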

    bug consumer CRITICAL 
    opened by greatt1n 62
  • Uneven partitioning with the 0.9.1 driver

    Uneven partitioning with the 0.9.1 driver

    I'm running into an interesting partitioning problem.

    If I launch, say, 4 consumers for a topic of 144 partitions, I get 3 of them consuming all the partitions while the fourth gets all of its partitions revoked (consumer->unassign()'ed).

    I tried the range and roundrobin partition strategies from the configuration, but nothing gives me an even distribution of partitions across the consumers.

    The bad part is that when I launch a new consumer after having a stable set of consumers, the new consumer takes all the partitions and the rest just sit idle and get unassigned.

    bug consumer 
    opened by emaxerrno 59
  • Solaris x86 build

    Solaris x86 build

    Struggling to build on the above platform. I've tried setting CC to the sun studio compiler. Any ideas?

    ./configure

    sed: illegal option -- E
    grep: illegal option -- E
    Usage: grep -hblcnsviw pattern file . . .
    using cache file config.cache
    checking for C compiler from CC env... ok
    checking for C++ compiler from CXX env... ok
    checking executable ld... ok (cached)
    checking executable nm... ok (cached)
    checking executable objdump... ok (cached)
    checking executable strip... ok (cached)
    checking for pkgconfig (by command)... ok (cached)
    checking for install (by command)... ok (cached)
    checking for PIC (by compile)... ok (cached)
    checking for GNU-compatible linker options... ok (cached)
    checking for __atomic_32 (by compile)... failed
    checking for __atomic_32_lib (by compile)... failed
    checking for __sync_32 (by compile)... ok (cached)
    checking for __atomic_64 (by compile)... failed
    checking for __atomic_64_lib (by compile)... failed
    checking for __sync_64 (by compile)... ok (cached)
    parsing version ''... failed (fail)
    checking for libpthread (by pkg-config)... failed
    checking for libpthread (by compile)... failed (fail)
    checking for zlib (by pkg-config)... failed
    checking for zlib (by compile)... failed (fail)
    checking for librt (by pkg-config)... failed
    checking for librt (by compile)... failed
    checking executable otool... failed
    checking for nm (by env NM)... ok (cached)
    Configure failed

    Accumulated failures:

    parseversion ()
        module: parseversion
        action: fail
        reason: Version string is empty

    libpthread ()
        module: librdkafka
        action: fail
        reason: compile check failed:
            CC: CC
            flags: -lpthread
            /usr/bin/cc -g -O2 -fPIC -Wall -Werror -Wfloat-equal -Wpointer-arith -Wall -Werror -lpthread _mkltmpXXXX.c -o _mkltmpXXXX.c.o -g:
            cc: -W option with unknown program all
            source:

    zlib ()
        module: librdkafka
        action: fail
        reason: compile check failed:
            CC: CC
            flags: -lz
            /usr/bin/cc -g -O2 -fPIC -Wall -Werror -Wfloat-equal -Wpointer-arith -Wall -Werror -lz _mkltmpXXXX.c -o _mkltmpXXXX.c.o -g:
            cc: -W option with unknown program all
            source:

    enhancement 
    opened by winbatch 53
  • Windows Kafka Performance issue

    Windows Kafka Performance issue

    Set up Description...

    The entire setup is on Windows.

    My Kafka Cluster has 6 brokers, 4 on one machine and 2 on other machine. It has two Kafka Topics with partition size 50 each, and replication factor of 3.

    My partition selection logic: for each message I compute (its unique ID % 50) and call the Kafka producer API to route that message to a particular topic partition: RdKafka::ErrorCode resp = m_kafkaProducer->produce(m_kafkaTopic, (unique ID % 50), RdKafka::Producer::RK_MSG_COPY /* Copy payload */, ptr, size, &partationKey, NULL); I am using various processes to produce messages to these 2 topics. On average there are around 48 threads in each process calling

    m_kafkaProducer->produce(m_kafkaTopic, partition, RdKafka::Producer::RK_MSG_COPY /* Copy payload */, ptr, size, &partationKey, NULL);

    My Kafka Producer configuration

    if (m_ex_event_cb == NULL)
    return -1;
    m_KafkaConf->set("event_cb", m_ex_event_cb, errstr);		
    t_KafkaConf->set("produce.offset.report", "true", errstr);
    t_KafkaConf->set("request.required.acks", "1", errstr);
    

    And I have just one KafkaConsumer, which is going to consume all the Kafka messages. Currently I have restricted that to just 8 threads; I mean 8 threads in my Kafka consumer are used to consume messages and then perform the actions. Kafka consumer settings:

    t_KafkaConf->set("auto.offset.reset", "latest", errstr);
    t_KafkaConf->set("enable.auto.commit", "true", errstr);
    t_KafkaConf->set("auto.commit.interval.ms", "1000", errstr);
    m_KafkaConf->set("rebalance_cb", m_ex_rebalance_cb, errstr);	
    m_KafkaConf->set("metadata.broker.list", m_brokersAddress, errstr);
    

    I am using librdkafka 0.9.2 for both the Producer and the Consumer.

    I had previously raised an issue where produce was failing on Windows, and you suggested the fix; now I am no longer seeing produce failures, but there is a strange issue I am observing.

    If I start 2 Kafka producers and then start another Kafka producer, the message consume rate of the initial 2 processes suddenly starts declining. If I immediately kill the recently started Kafka producer, the messages from the other 2 Kafka processes start getting consumed fine again :(

    Each produced message is 96 bytes. What could be the reason, and where should I look?

    And my server.properties is

    broker.id=0
    port:9093
    num.network.threads=3
    num.io.threads=8
    socket.send.buffer.bytes=102400
    socket.receive.buffer.bytes=102400
    socket.request.max.bytes=104857600
    offsets.retention.minutes=360
    num.partitions=1
    num.recovery.threads.per.data.dir=1
    log.retention.minutes=360
    log.segment.bytes=52428800
    log.retention.check.interval.ms=300000
    log.cleaner.enable=true
    log.cleanup.policy=delete
    log.cleaner.min.cleanable.ratio=0.5
    log.cleaner.backoff.ms=15000
    log.segment.delete.delay.ms=6000
    auto.create.topics.enable=false
    

    Abhi

    bug try reproduce windows 
    opened by abhit011 49
  • Significant performance degradation after upgrading librdkafka from 0.11.0 to 1.1.0

    Significant performance degradation after upgrading librdkafka from 0.11.0 to 1.1.0

    Description

    We've upgraded our librdkafka version from 0.11.0 to 1.1.0, and while running performance tests we've noticed a major performance degradation: librdkafka 1.1.0 is about 50% slower than 0.11.0 for our scenario.

    How to reproduce

    We are running a produce session, producing 500,000 messages, and telling librdkafka to transfer them. When using librdkafka 1.1.0 it takes ~25 seconds, using 0.11.0 it takes ~12 seconds. This is after we configured the request.required.acks to be '1' (as we've seen the default was changed, and want to compare the same configuration)

    Checklist

    • [x] librdkafka version (release number or git tag): 1.1.0
    • [x] Apache Kafka version: 2.0 (we've also tried it with 2.3)
    • [ ] librdkafka client configuration: api.version.request=true request.required.acks=1 broker.version.fallback=0.9.0 message.max.bytes=1000000 queue.buffering.max.ms=1000 api.version.request=true request.required.acks=1 broker.version.fallback=0.9.0 message.max.bytes=1000000 queue.buffering.max.ms=1000
    • [x] Operating system: Windows10
    • [ ] Provide logs (with debug=.. as necessary) from librdkafka
    • [ ] Provide broker log excerpts
    • [x] Critical issue
    opened by Eliyahu-Machluf 48
  • Kafka messages dropped even when enable.idempotence is true (kafka broker processes freeze-thaw done)

    Kafka messages dropped even when enable.idempotence is true (kafka broker processes freeze-thaw done)

    Description

    Configuration: enable.idempotence=true, message.timeout.ms=30000. A 3-broker Kafka cluster is used with topic replication factor=3. A continuous (per topic partition) sequence number is added to the message body and as a header (to verify continuity); a rough header-production sketch follows the numbered steps below. [Sample program attached for issue re-creation: KafkaIssue_MsgDrop.zip]

    1. Create RdKafka::Producer and RdKafka::Consumer objects
    2. Get the last written msg (header) sequence in topic partition
    3. Verify Continuity of written messages using the header sequence number
    4. Start writing new messages from above last written sequence number + 1 (writes to multiple partitions in round robin fashion)
    5. If a produce failure or delivery failure occurs, re-start from step-1 AFTER destroying Producer and Consumer objects (waits 15 seconds before going back to step-1).
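
    A rough C sketch of attaching such a sequence number as a message header (topic name is a placeholder; this is illustrative only, not the attached sample program):

    #include <stdio.h>
    #include <string.h>
    #include <librdkafka/rdkafka.h>

    /* Produce one message carrying 'seq' both in the body and as a "seq" header.
     * 'rk' is an existing producer handle. */
    static rd_kafka_resp_err_t produce_with_seq(rd_kafka_t *rk, int32_t partition,
                                                unsigned long long seq) {
            char body[64];
            snprintf(body, sizeof(body), "%llu", seq);

            rd_kafka_headers_t *hdrs = rd_kafka_headers_new(1);
            rd_kafka_header_add(hdrs, "seq", -1, body, -1);

            rd_kafka_resp_err_t err = rd_kafka_producev(
                    rk,
                    RD_KAFKA_V_TOPIC("test-topic"),
                    RD_KAFKA_V_PARTITION(partition),
                    RD_KAFKA_V_MSGFLAGS(RD_KAFKA_MSG_F_COPY),
                    RD_KAFKA_V_VALUE(body, strlen(body)),
                    RD_KAFKA_V_HEADERS(hdrs),        /* librdkafka owns hdrs on success */
                    RD_KAFKA_V_END);
            if (err)
                    rd_kafka_headers_destroy(hdrs);  /* still ours on failure */
            return err;
    }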

    While the above steps are running, the following is done in a loop (bash):

    1. 2 out of 3 Kafka broker processes are frozen for 45 seconds(so that delivery timeout occurs)
    2. Thawed for 20 seconds

    After several loops, there are gaps in written messages. i.e. above step-3(Sequence continuity verification) fails. e.g. for attached logs,

    PRODUCE_SUCCESS 0xe9d550 seq=47177 :partition=0 current sequence is 47177 loop=2 time=20230104_191656 Q_full_cnt=0
    ACKNOWLEDGED 0xe9d550 20230104_191656 seq=47177 :partition=0 current sequence is 47177 loop=2 time=20230104_191656 Q_full_cnt=0
    kafka brokers(2 out of 3) frozen at Wed Jan 4 19:16:56 +0530 2023
    PRODUCE_SUCCESS 0xe9d550 seq=47178 :partition=0 current sequence is 47178 loop=2 time=20230104_191656 Q_full_cnt=0
    PRODUCE_SUCCESS 0xe9d550 seq=47179 :partition=0 current sequence is 47179 loop=2 time=20230104_191656 Q_full_cnt=0
    kafka brokers thawed at Wed Jan 4 19:17:41 +0530 2023

    BUT actually written in kafka,

    LogAppendTime:1672840016297(i.e. 19:16:56) Partition:0 Offset:47176 sequence:47177 partition=0 current sequence is 47177 loop=2 time=20230104_191656 Q_full_cnt=0
    LogAppendTime:1672840046899(i.e. 19:17:26) Partition:0 Offset:47177 sequence:47236 partition=0 current sequence is 47236 loop=2 time=20230104_191656 Q_full_cnt=0

    i.e. there is a gap from 47178 to 47235.

    It seems that for messages produced during the broker freeze, some of the messages are dropped while subsequent messages are written to Kafka.

    How to reproduce

    How to Compile

    In a Linux environment :

    1. Extract the attached KafkaIssue_MsgDrop.zip
    2. Go to its 'build' directory
    3. Set environment variables KAFKA_INC_PATH(point to librdkafka header files containing directory - both C and C++) and KAFKA_LINK_PATH(point to librdkafka shared libs containing directory)
    4. type 'cmake ..'
    5. type 'make'. This will build the binary LibrdKafkaIssueTest.

    How to Run

    1. Create a Kafka topic with partition count=4 and replication factor=3 (a 3-broker cluster was used for my re-creation)
    2. Invoke LibrdKafkaIssueTest as : ./LibrdKafkaIssueTest KAFKA_BROKER_IP:KAFKA_BROKER_1_PORT,KAFKA_BROKER_IP:KAFKA_BROKER_2_PORT,KAFKA_BROKER_IP:KAFKA_BROKER_3_PORT [topic_name] partition-1 partition-2 ...
    3. Use bash script to freeze-thaw 2 out of 3 kafka broker processes : while true; do kill -s stop "$@" ; echo -n "frozen " ; date ;sleep 45; kill -s cont "$@"; echo -n "running " ; date ;sleep 20; done (command line arguments are PIDs of Kafka broker processes)

    LibrdKafkaIssueTest will exit with an error msg (in RED) when the issue is re-created, e.g. Msg Sequence GAP. Rec=47236 Expected=47177 + 1 Partition=0

    Checklist

    • [x] librdkafka version (release number or git tag): 1.9.2 (issue present in 1.6 too)
    • [x] Apache Kafka version: 2.7.0
    • [x] librdkafka client configuration: -- enable.idempotence=true -- message.timeout.ms=30000
    • [x] Operating system: Red Hat Linux 7.9
    • [x] Provide logs : attached -- librdkafka_out.log - Please kindly note that IP/Port names are substituted -- Log created by the (attached) sample app. this contains produced, acknowledged, purged msg details : partition_0_data.zip
    • [ ] Provide broker log excerpts
    • [x] Critical issue : NO
    opened by aKumara123 0
  • (WITHOUT object leak) RdKafka::Consumer destructor hangs forever (when kafka cluster re-starts taking place)

    (WITHOUT object leak) RdKafka::Consumer destructor hangs forever (when kafka cluster re-starts taking place)

    Description

    THERE IS NO OBJECT LEAK - program to re-create the issue attached here. RdKafka::Consumer destructor thread hangs due to broker thread rkb->rkb_refcnt being NOT zero. This happens during kafka cluster re-starts.

    Operations like reading the last msg and query_watermark_offsets() are performed while the Kafka cluster is being killed & re-started.

    How to reproduce

    Sample program attached.

    How to Compile

    In a Linux environment :

    1. Extract the attached sample_prog_to_re_create_issue.tar.gz
    2. Go to its 'build' directory
    3. Set environment variables KAFKA_INC_PATH(point to librdkafka header files containing directory - both C and C++) and KAFKA_LINK_PATH(point to librdkafka shared libs containing directory)
    4. type 'cmake ..'
    5. type 'make'

    This will build the binary LibrdKafkaIssueTest.

    How to Run

    Create 2 Kafka topics with partition count=4 each (a 3-broker cluster with replication factor=2 was used for my re-creation). Keep one topic empty and write a single message to each partition of the other topic. Use a bash command to run a Kafka cluster kill-restart loop: 'while true; do [stop_the_kafka_cluster]; sleep 10; [re-start-kafka-cluster-WITHOUT-cleaning-data]; sleep 180; done'. Invoke LibrdKafkaIssueTest as: 'LibrdKafkaIssueTest 175 [kafka bootstrap servers] [empty topic] [topic with 1 msg per partition] 0'

    Run the tool for about 10 minutes. When the issue is re-created it will show a msg (in RED): 'Thread Index=[index] is stuck in Consumer Destroy for [stuck time]s'.

    Checklist

    • [x] librdkafka version (release number or git tag): 1.9.2 (issue present in 1.6 too)

    • [x] Apache Kafka version: 2.7.0

    • [x] librdkafka client configuration:All default values used

    • [x] Operating system: Red Hat 8.4 x64 bit (Issue re-created on Red Hat Linux 7.9 too)

    • [x] Provide logs : attached -- gdb.txt -- librd kafka log output librdkafka_output.zip - Kindly Please note that IP and Port names are substituted. -- stack stack_3323083.zip -- stderr output : NOTE librdkafka was built with ENABLE_REFCNT_DEBUG stderr_LibrdKafkaIssueTest_1.zip

    • [] Provide broker log excerpts : Log sizes are large, hence NOT attached. Please let me know if this is essential, I will re-create the issue and attach all logs

    • [x] Critical issue : No

    opened by aKumara123 0
  • Delay between message insertion into Kafka and message consumption from KafkaConsumer client

    Delay between message insertion into Kafka and message consumption from KafkaConsumer client

    Description

    We're using a KafkaConsumer object (not Legacy consumer) which is consuming from a single topic. The behaviour I'm about to describe happens regardless of whether the KafkaConsumer is consuming all partitions, a subset of them or a single partition.

    We're registering an average delay of 950ms between the moment a new record is uploaded into the Kafka topic and the moment our consumer actually manages to consume it.

    This is an example of the events we are registering

    [screenshot of the registered trace events]

    Here we have the first highlighted trace indicating that we have received the END_OF_PARTITION error code from Kafka for partition 5. The HighOffset is 3193270. Later on, at the second highlighted trace, we receive the first record added to partition 5. We compute the epoch difference between the RdkafkaMessage timestamp and "now". Considering our KafkaConsumer was ready to consume any other message (even the other partitions had reached END_OF_PARTITION), one would expect the difference to be very small (maybe 200 ms); instead we can see here it gets to 859 ms. Sometimes it even gets as big as 1500 ms.

    My team and I have been trying to understand why we're getting such a considerable delay for over 2 weeks and we don't understand what could be the cause.

    Checklist

    • librdkafka version: 1.8.2
    • Apache Kafka version: 6.2.1
    • librdkafka client configuration:
    KAFKA conf cfg [gzip,snappy,sasl,regex,lz4,sasl_plain,plugins]=[builtin.features]
    KAFKA conf cfg [rdkafka]=[client.id]
    KAFKA conf cfg [librdkafka]=[client.software.name]
    KAFKA conf cfg [10.232.64.2:9092,10.232.64.50:9092,10.232.64.51:9092,10.232.64.52:9092,10.232.64.61:9092]=[metadata.broker.list]
    KAFKA conf cfg [1000000]=[message.max.bytes]
    KAFKA conf cfg [65535]=[message.copy.max.bytes]
    KAFKA conf cfg [100000000]=[receive.message.max.bytes]
    KAFKA conf cfg [1000000]=[max.in.flight.requests.per.connection]
    KAFKA conf cfg [10]=[metadata.request.timeout.ms]
    KAFKA conf cfg [300000]=[topic.metadata.refresh.interval.ms]
    KAFKA conf cfg [900000]=[metadata.max.age.ms]
    KAFKA conf cfg [250]=[topic.metadata.refresh.fast.interval.ms]
    KAFKA conf cfg [10]=[topic.metadata.refresh.fast.cnt]
    KAFKA conf cfg [true]=[topic.metadata.refresh.sparse]
    KAFKA conf cfg [30000]=[topic.metadata.propagation.max.ms]
    KAFKA conf cfg []=[debug]
    KAFKA conf cfg [60000]=[socket.timeout.ms]
    KAFKA conf cfg [1000]=[socket.blocking.max.ms]
    KAFKA conf cfg [0]=[socket.send.buffer.bytes]
    KAFKA conf cfg [0]=[socket.receive.buffer.bytes]
    KAFKA conf cfg [false]=[socket.keepalive.enable]
    KAFKA conf cfg [false]=[socket.nagle.disable]
    KAFKA conf cfg [1]=[socket.max.fails]
    KAFKA conf cfg [1000]=[broker.address.ttl]
    KAFKA conf cfg [any]=[broker.address.family]
    KAFKA conf cfg [0]=[connections.max.idle.ms]
    KAFKA conf cfg [true]=[enable.sparse.connections]
    KAFKA conf cfg [0]=[reconnect.backoff.jitter.ms]
    KAFKA conf cfg [100]=[reconnect.backoff.ms]
    KAFKA conf cfg [10000]=[reconnect.backoff.max.ms]
    KAFKA conf cfg [0]=[statistics.interval.ms]
    KAFKA conf cfg [0]=[enabled_events]
    KAFKA conf cfg [0x413260]=[log_cb]
    KAFKA conf cfg [6]=[log_level]
    KAFKA conf cfg [false]=[log.queue]
    KAFKA conf cfg [true]=[log.thread.name]
    KAFKA conf cfg [true]=[enable.random.seed]
    KAFKA conf cfg [true]=[log.connection.close]
    KAFKA conf cfg [0x428000]=[socket_cb]
    KAFKA conf cfg [0x44a920]=[open_cb]
    KAFKA conf cfg [0x1ca0410]=[default_topic_conf]
    KAFKA conf cfg [0]=[internal.termination.signal]
    KAFKA conf cfg [true]=[api.version.request]
    KAFKA conf cfg [10000]=[api.version.request.timeout.ms]
    KAFKA conf cfg [0]=[api.version.fallback.ms]
    KAFKA conf cfg [0.10.0]=[broker.version.fallback]
    KAFKA conf cfg [plaintext]=[security.protocol]
    KAFKA conf cfg [Root]=[ssl.ca.certificate.stores]
    KAFKA conf cfg [dynamic]=[ssl.engine.id]
    KAFKA conf cfg [true]=[enable.ssl.certificate.verification]
    KAFKA conf cfg [none]=[ssl.endpoint.identification.algorithm]
    KAFKA conf cfg [GSSAPI]=[sasl.mechanisms]
    KAFKA conf cfg [kafka]=[sasl.kerberos.service.name]
    KAFKA conf cfg [kafkaclient]=[sasl.kerberos.principal]
    KAFKA conf cfg [kinit -R -t "%{sasl.kerberos.keytab}" -k %{sasl.kerberos.principal} || kinit -t "%{sasl.kerberos.keytab}" -k %{sasl.kerberos.principal}]=[sasl.kerberos.kinit.cmd]
    KAFKA conf cfg [60000]=[sasl.kerberos.min.time.before.relogin]
    KAFKA conf cfg [false]=[enable.sasl.oauthbearer.unsecure.jwt]
    KAFKA conf cfg [0]=[test.mock.num.brokers]
    KAFKA conf cfg [drv_kafka_1804289383]=[group.id]
    KAFKA conf cfg [range,roundrobin]=[partition.assignment.strategy]
    KAFKA conf cfg [45000]=[session.timeout.ms]
    KAFKA conf cfg [3000]=[heartbeat.interval.ms]
    KAFKA conf cfg [consumer]=[group.protocol.type]
    KAFKA conf cfg [600000]=[coordinator.query.interval.ms]
    KAFKA conf cfg [300000]=[max.poll.interval.ms]
    KAFKA conf cfg [false]=[enable.auto.commit]
    KAFKA conf cfg [5000]=[auto.commit.interval.ms]
    KAFKA conf cfg [true]=[enable.auto.offset.store]
    KAFKA conf cfg [100000]=[queued.min.messages]
    KAFKA conf cfg [1048576]=[queued.max.messages.kbytes]
    KAFKA conf cfg [500]=[fetch.wait.max.ms]
    KAFKA conf cfg [1048576]=[fetch.message.max.bytes]
    KAFKA conf cfg [52428800]=[fetch.max.bytes]
    KAFKA conf cfg [1]=[fetch.min.bytes]
    KAFKA conf cfg [500]=[fetch.error.backoff.ms]
    KAFKA conf cfg [broker]=[offset.store.method]
    KAFKA conf cfg [read_committed]=[isolation.level]
    KAFKA conf cfg [true]=[enable.partition.eof]
    KAFKA conf cfg [false]=[check.crcs]
    KAFKA conf cfg [false]=[allow.auto.create.topics]
    KAFKA conf cfg []=[client.rack]
    KAFKA conf cfg [60000]=[transaction.timeout.ms]
    KAFKA conf cfg [false]=[enable.idempotence]
    KAFKA conf cfg [false]=[enable.gapless.guarantee]
    KAFKA conf cfg [100000]=[queue.buffering.max.messages]
    KAFKA conf cfg [1048576]=[queue.buffering.max.kbytes]
    KAFKA conf cfg [5]=[queue.buffering.max.ms]
    KAFKA conf cfg [2147483647]=[message.send.max.retries]
    KAFKA conf cfg [100]=[retry.backoff.ms]
    KAFKA conf cfg [1]=[queue.buffering.backpressure.threshold]
    KAFKA conf cfg [none]=[compression.codec]
    KAFKA conf cfg [10000]=[batch.num.messages]
    KAFKA conf cfg [1000000]=[batch.size]
    KAFKA conf cfg [false]=[delivery.report.only.error]
    KAFKA conf cfg [10]=[sticky.partitioning.linger.ms]
    KAFKA conf topic cfg [-1]=[request.required.acks]
    KAFKA conf topic cfg [30000]=[request.timeout.ms]
    KAFKA conf topic cfg [300000]=[message.timeout.ms]
    KAFKA conf topic cfg [fifo]=[queuing.strategy]
    KAFKA conf topic cfg [false]=[produce.offset.report]
    KAFKA conf topic cfg [consistent_random]=[partitioner]
    KAFKA conf topic cfg [inherit]=[compression.codec]
    KAFKA conf topic cfg [-1]=[compression.level]
    KAFKA conf topic cfg [true]=[auto.commit.enable]
    KAFKA conf topic cfg [60000]=[auto.commit.interval.ms]
    KAFKA conf topic cfg [largest]=[auto.offset.reset]
    KAFKA conf topic cfg [.]=[offset.store.path]
    KAFKA conf topic cfg [-1]=[offset.store.sync.interval.ms]
    KAFKA conf topic cfg [broker]=[offset.store.method]
    KAFKA conf topic cfg [0]=[consume.callback.max.messages]
    
    
    • Operating system: RedHat 8
    opened by verrocchimarco 0
  • Too many open files

    Too many open files

    Description

    I am working on an RTSP client project. In this project, my client code receives data from multiple RTSP servers. I have a wrapper class; in its constructor I call the rd_kafka_new() function and then manage the rd_kafka instance. For each RTSP server, I create a new instance of my class. I have 85 instances of this class, and when I create another instance (now I have 86 instances), I get a "too many open files" error from the rd_kafka logs. When I check the number of file descriptors of the program, it is about 1050 descriptors.

    My guess is that rd_kafka uses the select() function, and this causes the problem. Am I right?

    Checklist

    Please provide the following information:

    • librdkafka version: v1.9.2
    • Apache Kafka version: unknown
    • librdkafka client configuration: message.timeout.ms=2000
    • Operating system: Ubuntu 20.04
    • Provide logs: Too many open files
    • Provide broker log excerpts: unknown
    opened by MiladR76 0
Releases(v1.9.2)
  • v1.9.2(Aug 1, 2022)

    librdkafka v1.9.2 is a maintenance release:

    • The SASL OAUTHBEARER OIDC POST field was sometimes truncated by one byte (#3192).
    • The bundled version of OpenSSL has been upgraded to version 1.1.1q for non-Windows builds. Windows builds remain on OpenSSL 1.1.1n for the time being.
    • The bundled version of Curl has been upgraded to version 7.84.0.

    Checksums

    Release asset checksums:

    • v1.9.2.zip SHA256 4ecb0a3103022a7cab308e9fecd88237150901fa29980c99344218a84f497b86
    • v1.9.2.tar.gz SHA256 3fba157a9f80a0889c982acdd44608be8a46142270a389008b22d921be1198ad
  • v1.9.1(Jul 6, 2022)

    librdkafka v1.9.1

    librdkafka v1.9.1 is a maintenance release:

    • The librdkafka.redist NuGet package now contains OSX M1/arm64 builds.
    • Self-contained static libraries can now be built on OSX M1 too, thanks to disabling curl's configure runtime check.

    Checksums

    Release asset checksums:

    • v1.9.1.zip SHA256 d3fc2e0bc00c3df2c37c5389c206912842cca3f97dd91a7a97bc0f4fc69f94ce
    • v1.9.1.tar.gz SHA256 3a54cf375218977b7af4716ed9738378e37fe400a6c5ddb9d622354ca31fdc79
  • v1.9.0(Jun 16, 2022)

    librdkafka v1.9.0

    librdkafka v1.9.0 is a feature release:

    • Added KIP-768 OAUTHBEARER OIDC support (by @jliunyu, #3560)
    • Added KIP-140 Admin API ACL support (by @emasab, #2676)

    Upgrade considerations

    • Consumer: rd_kafka_offsets_store() (et.al) will now return an error for any partition that is not currently assigned (through rd_kafka_*assign()). This prevents a race condition where an application would store offsets after the assigned partitions had been revoked (which resets the stored offset), that could cause these old stored offsets to be committed later when the same partitions were assigned to this consumer again - effectively overwriting any committed offsets by any consumers that were assigned the same partitions previously. This would typically result in the offsets rewinding and messages to be reprocessed. As an extra effort to avoid this situation the stored offset is now also reset when partitions are assigned (through rd_kafka_*assign()). Applications that explicitly call ..offset*_store() will now need to handle the case where RD_KAFKA_RESP_ERR__STATE is returned in the per-partition .err field - meaning the partition is no longer assigned to this consumer and the offset could not be stored for commit.
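
    A minimal sketch of handling the per-partition RD_KAFKA_RESP_ERR__STATE result described above (C API; 'rk' and the 'offsets' list are assumed to exist in the application):

    #include <stdio.h>
    #include <librdkafka/rdkafka.h>

    /* 'offsets' holds the offsets the application wants stored for later commit. */
    static void store_offsets(rd_kafka_t *rk, rd_kafka_topic_partition_list_t *offsets) {
            rd_kafka_offsets_store(rk, offsets);

            for (int i = 0; i < offsets->cnt; i++) {
                    const rd_kafka_topic_partition_t *p = &offsets->elems[i];
                    if (p->err == RD_KAFKA_RESP_ERR__STATE)
                            /* Partition no longer assigned to this consumer:
                             * its offset was not stored and will not be committed. */
                            fprintf(stderr, "%s [%d]: not assigned, offset not stored\n",
                                    p->topic, (int)p->partition);
            }
    }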

    Enhancements

    • Improved producer queue scheduling. Fixes the performance regression introduced in v1.7.0 for some produce patterns. (#3538, #2912)
    • Windows: Added native Win32 IO/Queue scheduling. This removes the internal TCP loopback connections that were previously used for timely queue wakeups.
    • Added socket.connection.setup.timeout.ms (default 30s). The maximum time allowed for broker connection setups (TCP connection as well as SSL and SASL handshakes) is now limited to this value. This fixes the issue with stalled broker connections in the case of network or load balancer problems. The Java clients has an exponential backoff to this timeout which is limited by socket.connection.setup.timeout.max.ms - this was not implemented in librdkafka due to differences in connection handling and ERR__ALL_BROKERS_DOWN error reporting. Having a lower initial connection setup timeout and then increase the timeout for the next attempt would yield possibly false-positive ERR__ALL_BROKERS_DOWN too early.
    • SASL OAUTHBEARER refresh callbacks can now be scheduled for execution on librdkafka's background thread. This solves the problem where an application has a custom SASL OAUTHBEARER refresh callback and thus needs to call rd_kafka_poll() (et.al.) at least once to trigger the refresh callback before being able to connect to brokers. With the new rd_kafka_conf_enable_sasl_queue() configuration API and rd_kafka_sasl_background_callbacks_enable() the refresh callbacks can now be triggered automatically on the librdkafka background thread.
    • rd_kafka_queue_get_background() now creates the background thread if not already created.
    • Added rd_kafka_consumer_close_queue() and rd_kafka_consumer_closed(). This allow applications and language bindings to implement asynchronous consumer close.
    • Bundled zlib upgraded to version 1.2.12.
    • Bundled OpenSSL upgraded to 1.1.1n.
    • Added test.mock.broker.rtt to simulate RTT/latency for mock brokers.

    Fixes

    General fixes

    • Fix various 1 second delays due to internal broker threads blocking on IO even though there are events to handle. These delays could be seen randomly in any of the non produce/consume request APIs, such as commit_transaction(), list_groups(), etc.
    • Windows: some applications would crash with an error message like no OPENSSL_Applink() written to the console if ssl.keystore.location was configured. This regression was introduced in v1.8.0 due to use of vcpkgs and how keystore file was read. #3554.
    • Windows 32-bit only: 64-bit atomic reads were in fact not atomic and could in rare circumstances yield incorrect values. One manifestation of this issue was the max.poll.interval.ms consumer timer expiring even though the application was polling according to profile. Fixed by @WhiteWind (#3815).
    • rd_kafka_clusterid() would previously fail with timeout if called on cluster with no visible topics (#3620). The clusterid is now returned as soon as metadata has been retrieved.
    • Fix hang in rd_kafka_list_groups() if there are no available brokers to connect to (#3705).
    • Millisecond timeouts (timeout_ms) in various APIs, such as rd_kafka_poll(), were limited to roughly 36 hours before wrapping. (#3034)
    • If a metadata request triggered by rd_kafka_metadata() or consumer group rebalancing encountered a non-retriable error it would not be propagated to the caller and thus cause a stall or timeout, this has now been fixed. (@aiquestion, #3625)
    • AdminAPI DeleteGroups() and DeleteConsumerGroupOffsets(): if the given coordinator connection was not up by the time these calls were initiated and the first connection attempt failed then no further connection attempts were performed, ultimately leading to the calls timing out. This is now fixed by retrying to connect to the group coordinator until the connection is successful or the call times out. Additionally, the coordinator will now be re-queried once per second until the coordinator comes up or the call times out, to detect a change in coordinators.
    • Mock cluster rd_kafka_mock_broker_set_down() would previously accept and then disconnect new connections, it now refuses new connections.

    Consumer fixes

    • rd_kafka_offsets_store() (et.al) will now return an error for any partition that is not currently assigned (through rd_kafka_*assign()). See Upgrade considerations above for more information.
    • rd_kafka_*assign() will now reset/clear the stored offset. See Upgrade considerations above for more information.
    • seek() followed by pause() would overwrite the seeked offset when later calling resume(). This is now fixed. (#3471). Note: Avoid storing offsets (offsets_store()) after calling seek() as this may later interfere with resuming a paused partition, instead store offsets prior to calling seek.
    • A ERR_MSG_SIZE_TOO_LARGE consumer error would previously be raised if the consumer received a maximum sized FetchResponse only containing (transaction) aborted messages with no control messages. The fetching did not stop, but some applications would terminate upon receiving this error. No error is now raised in this case. (#2993) Thanks to @jacobmikesell for providing an application to reproduce the issue.
    • The consumer no longer backs off the next fetch request (default 500ms) when the parsed fetch response is truncated (which is a valid case). This should speed up the message fetch rate in case of maximum sized fetch responses.
    • Fix consumer crash (assert: rkbuf->rkbuf_rkb) when parsing malformed JoinGroupResponse consumer group metadata state.
    • Fix crash (cant handle op type) when using consume_batch_queue() (et.al) and an OAUTHBEARER refresh callback was set. The callback is now triggered by the consume call. (#3263)
    • Fix partition.assignment.strategy ordering when multiple strategies are configured. If there is more than one eligible strategy, preference is determined by the configured order of strategies. The partitions are assigned to group members according to the strategy order preference now. (#3818)
    • Any form of unassign*() (absolute or incremental) is now allowed during consumer close rebalancing and they're all treated as absolute unassigns. (@kevinconaway)

    Transactional producer fixes

    • Fix message loss in idempotent/transactional producer. A corner case has been identified that may cause idempotent/transactional messages to be lost despite being reported as successfully delivered: During cluster instability a restarting broker may report existing topics as non-existent for some time before it is able to acquire up to date cluster and topic metadata. If an idempotent/transactional producer updates its topic metadata cache from such a broker the producer will consider the topic to be removed from the cluster and thus remove its local partition objects for the given topic. This also removes the internal message sequence number counter for the given partitions. If the producer later receives proper topic metadata for the cluster the previously "removed" topics will be rediscovered and new partition objects will be created in the producer. These new partition objects, with no knowledge of previous incarnations, would start counting partition messages at zero again. If new messages were produced for these partitions by the same producer instance, the same message sequence numbers would be sent to the broker. If the broker still maintains state for the producer's PID and Epoch it could deem that these messages with reused sequence numbers had already been written to the log and treat them as legit duplicates. This would seem to the producer that these new messages were successfully written to the partition log by the broker when they were in fact discarded as duplicates, leading to silent message loss. The fix included in this release is to save the per-partition idempotency state when a partition is removed, and then recover and use that saved state if the partition comes back at a later time.
    • The transactional producer would retry (re)initializing its PID if a PRODUCER_FENCED error was returned from the broker (added in Apache Kafka 2.8), which could cause the producer to seemingly hang. This error code is now correctly handled by raising a fatal error.
    • If the given group coordinator connection was not up by the time send_offsets_to_transaction() was called, and the first connection attempt failed, then no further connection attempts were performed, ultimately leading to send_offsets_to_transaction() timing out, and possibly also the transaction timing out on the transaction coordinator. This is now fixed by retrying to connect to the group coordinator until the connection is successful or the call times out. Additionally, the coordinator will now be re-queried once per second until the coordinator comes up or the call times out, to detect a change in coordinators.

    Producer fixes

    • Improved producer queue wakeup scheduling. This should significantly decrease the number of wakeups and thus syscalls for high message rate producers. (#3538, #2912)
    • The logic for enforcing that message.timeout.ms is greater than an explicitly configured linger.ms was incorrect; instead of erroring out early, the lingering time was automatically adjusted to the message timeout, ignoring the configured linger.ms. This has now been fixed so that an error is returned when instantiating the producer. Thanks to @larry-cdn77 for analysis and test-cases. (#3709)

    Checksums

    Release asset checksums:

    • v1.9.0.zip SHA256 a2d124cfb2937ec5efc8f85123dbcfeba177fb778762da506bfc5a9665ed9e57
    • v1.9.0.tar.gz SHA256 59b6088b69ca6cf278c3f9de5cd6b7f3fd604212cd1c59870bc531c54147e889
  • v1.6.2(Nov 25, 2021)

    librdkafka v1.6.2

    librdkafka v1.6.2 is a maintenance release with the following backported fixes:

    • Upon quick repeated leader changes the transactional producer could receive an OUT_OF_ORDER_SEQUENCE error from the broker, which triggered an Epoch bump on the producer resulting in an InitProducerIdRequest being sent to the transaction coordinator in the middle of a transaction. This request would start a new transaction on the coordinator, but the producer would still think (erroneously) it was in the current transaction. Any messages produced in the current transaction prior to this event would be silently lost when the application committed the transaction, leading to message loss. To avoid message loss a fatal error is now raised. This fix is specific to v1.6.x. librdkafka v1.8.x implements a recoverable error state instead. #3575.
    • The transactional producer could stall during a transaction if the transaction coordinator changed while adding offsets to the transaction (send_offsets_to_transaction()). This stall lasted until the coordinator connection went down, the transaction timed out, transaction was aborted, or messages were produced to a new partition, whichever came first. #3571.
    • librdkafka's internal timers would not start if the timeout was set to 0, which would result in some timeout operations not being enforced correctly, e.g., the transactional producer API timeouts. These timers are now started with a timeout of 1 microsecond.
    • Force address resolution if the broker epoch changes (#3238).

    Checksums

    Release asset checksums:

    • v1.6.2.zip SHA256 1d389a98bda374483a7b08ff5ff39708f5a923e5add88b80b71b078cb2d0c92e
    • v1.6.2.tar.gz SHA256 b9be26c632265a7db2fdd5ab439f2583d14be08ab44dc2e33138323af60c39db
  • v1.8.2(Oct 18, 2021)

    librdkafka v1.8.2

    librdkafka v1.8.2 is a maintenance release.

    Enhancements

    • Added ssl.ca.pem to add CA certificate by PEM string. (#2380)
    • Prebuilt binaries for Mac OSX now contain statically linked OpenSSL v1.1.1l. Previously the OpenSSL version was either v1.1.1 or v1.0.2 depending on build type.

    Fixes

    • The librdkafka.redist 1.8.0 package had two flaws:
      • the linux-arm64 .so build was a linux-x64 build.
      • the included Windows MSVC 140 runtimes for x64 were in fact x86. The release script has been updated to verify the architectures of provided artifacts to avoid this happening in the future.
    • Prebuilt binaries for Mac OSX Sierra (10.12) and older are no longer provided. This affects confluent-kafka-go.
    • Some of the prebuilt binaries for Linux were built on Ubuntu 14.04, these builds are now performed on Ubuntu 16.04 instead. This may affect users on ancient Linux distributions.
    • It was not possible to configure ssl.ca.location on OSX, the property would automatically revert back to probe (default value). This regression was introduced in v1.8.0. (#3566)
    • librdkafka's internal timers would not start if the timeout was set to 0, which would result in some timeout operations not being enforced correctly, e.g., the transactional producer API timeouts. These timers are now started with a timeout of 1 microsecond.

    Transactional producer fixes

    • Upon quick repeated leader changes the transactional producer could receive an OUT_OF_ORDER_SEQUENCE error from the broker, which triggered an Epoch bump on the producer resulting in an InitProducerIdRequest being sent to the transaction coordinator in the middle of a transaction. This request would start a new transaction on the coordinator, but the producer would still think (erroneously) it was in the current transaction. Any messages produced in the current transaction prior to this event would be silently lost when the application committed the transaction, leading to message loss. This has been fixed by setting the Abortable transaction error state in the producer. #3575.
    • The transactional producer could stall during a transaction if the transaction coordinator changed while adding offsets to the transaction (send_offsets_to_transaction()). This stall lasted until the coordinator connection went down, the transaction timed out, transaction was aborted, or messages were produced to a new partition, whichever came first. #3571.

    Checksums

    Release asset checksums:

    • v1.8.2.zip SHA256 8b03d8b650f102f3a6a6cff6eedc29b9e2f68df9ba7e3c0f3fb00838cce794b8
    • v1.8.2.tar.gz SHA256 6a747d293a7a4613bd2897e28e8791476fbe1ae7361f2530a876e0fd483482a6

    Note: there was no v1.8.1 librdkafka release

  • v1.8.0(Sep 16, 2021)

    librdkafka v1.8.0

    librdkafka v1.8.0 is a security release:

    • Upgrade bundled zlib version from 1.2.8 to 1.2.11 in the librdkafka.redist NuGet package. The updated zlib version fixes CVEs: CVE-2016-9840, CVE-2016-9841, CVE-2016-9842, CVE-2016-9843 See https://github.com/edenhill/librdkafka/issues/2934 for more information.
    • librdkafka now uses vcpkg for up-to-date Windows dependencies in the librdkafka.redist NuGet package: OpenSSL 1.1.1l, zlib 1.2.11, zstd 1.5.0.
    • The upstream dependency (OpenSSL, zstd, zlib) source archive checksums are now verified when building with ./configure --install-deps. These builds are used by the librdkafka builds bundled with confluent-kafka-go, confluent-kafka-python and confluent-kafka-dotnet.

    Enhancements

    • Producer flush() now overrides the linger.ms setting for the duration of the flush() call, effectively triggering immediate transmission of queued messages. (#3489)
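
    A short sketch of the producer shutdown pattern this affects ('rk' and the 10-second timeout are assumptions for illustration):

    #include <stdio.h>
    #include <librdkafka/rdkafka.h>

    static void shutdown_producer(rd_kafka_t *rk) {
            /* flush() serves delivery reports and, as of v1.8.0, overrides linger.ms
             * so queued messages are transmitted immediately. */
            if (rd_kafka_flush(rk, 10 * 1000) == RD_KAFKA_RESP_ERR__TIMED_OUT)
                    fprintf(stderr, "%d message(s) not delivered\n", rd_kafka_outq_len(rk));
            rd_kafka_destroy(rk);
    }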

    Fixes

    General fixes

    • Correctly detect presence of zlib via compilation check. (Chris Novakovic)
    • ERR__ALL_BROKERS_DOWN is no longer emitted when the coordinator connection goes down, only when all standard named brokers have been tried. This fixes the issue with ERR__ALL_BROKERS_DOWN being triggered on consumer_close(). It is also now only emitted if the connection was fully up (past handshake), and not just connected.
    • rd_kafka_query_watermark_offsets(), rd_kafka_offsets_for_times(), consumer_lag metric, and auto.offset.reset now honour isolation.level and will return the Last Stable Offset (LSO) when isolation.level is set to read_committed (default), rather than the uncommitted high-watermark when it is set to read_uncommitted. (#3423)
    • SASL GSSAPI is now usable when sasl.kerberos.min.time.before.relogin is set to 0 - which disables ticket refreshes (by @mpekalski, #3431).
    • Rename internal crc32c() symbol to rd_crc32c() to avoid conflict with other static libraries (#3421).
    • txidle and rxidle in the statistics object were emitted as 18446744073709551615 when no idle was known. -1 is now emitted instead. (#3519)

    Consumer fixes

    • Automatically retry offset commits on ERR_REQUEST_TIMED_OUT, ERR_COORDINATOR_NOT_AVAILABLE, and ERR_NOT_COORDINATOR (#3398). Offset commits will be retried twice.
    • Timed auto commits did not work when only using assign() and not subscribe(). This regression was introduced in v1.7.0.
    • If the topics matching the current subscription changed (or the application updated the subscription) while there was an outstanding JoinGroup or SyncGroup request, an additional request would sometimes be sent before handling the response of the first. This in turn led to internal state issues that could cause a crash or malbehaviour. The consumer will now wait for any outstanding JoinGroup or SyncGroup responses before re-joining the group.
    • auto.offset.reset could previously be triggered by temporary errors, such as disconnects and timeouts (after the two retries are exhausted). This is now fixed so that the auto offset reset policy is only triggered for permanent errors.
    • The error that triggers auto.offset.reset is now logged to help the application owner identify the reason of the reset.
    • If a rebalance takes longer than a consumer's session.timeout.ms, the consumer will remain in the group as long as it receives heartbeat responses from the broker.

    Admin fixes

    • DeleteRecords() could crash if one of the underlying requests (for a given partition leader) failed at the transport level (e.g., timeout). (#3476).

    Checksums

    Release asset checksums:

    • v1.8.0.zip SHA256 4b173f759ea5fdbc849fdad00d3a836b973f76cbd3aa8333290f0398fd07a1c4
    • v1.8.0.tar.gz SHA256 93b12f554fa1c8393ce49ab52812a5f63e264d9af6a50fd6e6c318c481838b7f
  • v1.7.0(May 10, 2021)

    librdkafka v1.7.0

    librdkafka v1.7.0 is a feature release:

    • KIP-360 - Improve reliability of transactional producer. Requires Apache Kafka 2.5 or later.
    • OpenSSL Engine support (ssl.engine.location) by @adinigam and @ajbarb.

    Enhancements

    • Added connections.max.idle.ms to automatically close idle broker connections. This feature is disabled by default unless bootstrap.servers contains the string azure in which case the default is set to <4 minutes to improve connection reliability and circumvent limitations with the Azure load balancers (see #3109 for more information).
    • Bumped to OpenSSL 1.1.1k in binary librdkafka artifacts.
    • The binary librdkafka artifacts for Alpine are now using Alpine 3.12.
    • Improved static librdkafka Windows builds using MinGW (@neptoess, #3130).

    Upgrade considerations

    • The C++ oauthbearer_token_refresh_cb() was missing a Handle * argument that has now been added. This is a breaking change but the original function signature is considered a bug. This change only affects C++ OAuth developers.
    • KIP-735 The consumer session.timeout.ms default was changed from 10 to 45 seconds to make consumer groups more robust and less sensitive to temporary network and cluster issues.
    • Statistics: consumer_lag is now using the committed_offset, while the new consumer_lag_stored is using stored_offset (offset to be committed). This is more correct than the previous consumer_lag which was using either committed_offset or app_offset (last message passed to application).

    Fixes

    General fixes

    • Fix accesses to freed metadata cache mutexes on client termination (#3279)
    • There was a race condition on receiving updated metadata where a broker id update (such as bootstrap to proper broker transformation) could finish after the topic metadata cache was updated, leading to existing brokers seemingly being not available. One occurrence of this issue was query_watermark_offsets() that could return ERR__UNKNOWN_PARTITION for existing partitions shortly after the client instance was created.
    • The OpenSSL context is now initialized with TLS_client_method() (on OpenSSL >= 1.1.0) instead of the deprecated and outdated SSLv23_client_method().
    • The initial cluster connection on client instance creation could sometimes be delayed up to 1 second if a group.id or transactional.id was configured (#3305).
    • Speed up triggering of new broker connections in certain cases by exiting the broker thread io/op poll loop when a wakeup op is received.
    • SASL GSSAPI: The Kerberos kinit refresh command was triggered from rd_kafka_new() which made this call blocking if the refresh command was taking long. The refresh is now performed by the background rdkafka main thread.
    • Fix busy-loop (100% CPU on the broker threads) during the handshake phase of an SSL connection.
    • Disconnects during SSL handshake are now propagated as transport errors rather than SSL errors, since these disconnects are at the transport level (e.g., incorrect listener, flaky load balancer, etc) and not due to SSL issues.
    • Increment metadata fast refresh interval backoff exponentially (@ajbarb, #3237).
    • Unthrottled requests are no longer counted in the brokers[].throttle statistics object.
    • Log CONFWARN warning when global topic configuration properties are overwritten by explicitly setting a default_topic_conf.

    Consumer fixes

    • If a rebalance happened during a consume_batch..() call the already accumulated messages for revoked partitions were not purged, which would pass messages to the application for partitions that were no longer owned by the consumer. Fixed by @jliunyu. #3340.
    • Fix balancing and reassignment issues with the cooperative-sticky assignor. #3306.
    • Fix incorrect detection of first rebalance in sticky assignor (@hallfox).
    • Aborted transactions with no messages produced to a partition could cause further successfully committed messages in the same Fetch response to be ignored, resulting in consumer-side message loss. A log message along the lines of Abort txn ctrl msg bad order at offset 7501: expected before or at 7702: messages in aborted transactions may be delivered to the application would be seen. This is a rare occurrence where a transactional producer would register with the partition but not produce any messages before aborting the transaction.
    • The consumer group deemed cached metadata up to date by checking topic.metadata.refresh.interval.ms: if this property was set too low it would cause cached metadata to be unusable and new metadata to be fetched, which could delay the time it took for a rebalance to settle. It now correctly uses metadata.max.age.ms instead.
    • The consumer group timed auto commit would attempt commits during rebalances, which could result in "Illegal generation" errors. This is now fixed: the timed auto committer is only employed in the steady state when no rebalances are taking place. Offsets are still auto committed when partitions are revoked.
    • Retriable FindCoordinatorRequest errors are no longer propagated to the application as they are retried automatically.
    • Fix rare crash (assert rktp_started) on consumer termination (introduced in v1.6.0).
    • Fix unaligned access and possibly corrupted snappy decompression when building with MSVC (@azat)
    • A consumer configured with the cooperative-sticky assignor did not actively Leave the group on unsubscribe(). This delayed the rebalance for the remaining group members by up to session.timeout.ms.
    • The current subscription list was sometimes leaked when unsubscribing.

    Producer fixes

    • The timeout value of flush() was not respected when delivery reports were scheduled as events (such as for confluent-kafka-go) rather than callbacks.
    • There was a race condition in purge() which could cause newly created partition objects, or partitions that were changing leaders, to not have their message queues purged. This could cause abort_transaction() to time out. This issue is now fixed.
    • In certain high-throughput produce rate patterns producing could stall for 1 second, regardless of linger.ms, due to rate-limiting of internal queue wakeups. This is now fixed by not rate-limiting queue wakeups but instead limiting them to one wakeup per queue reader poll. #2912.

    Transactional Producer fixes

    • KIP-360: Fatal Idempotent producer errors are now recoverable by the transactional producer and will raise a txn_requires_abort() error (see the sketch after this list).
    • If the cluster went down between produce() and commit_transaction() and before any partitions had been registered with the coordinator, the messages would time out but the commit would succeed because nothing had been sent to the coordinator. This is now fixed.
    • If the current transaction failed while commit_transaction() was checking the current transaction state, an invalid state transition could occur which in turn would trigger an assertion crash. This issue showed up as "Invalid txn state transition: .." crashes, and is now fixed by properly synchronizing both checking and transition of state.
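
    As a hedged sketch of how an application might react to the KIP-360 behaviour noted above, the following classifies a failed commit into abortable vs. fatal errors (the function name and recovery steps are illustrative only):

      #include <stdio.h>
      #include <librdkafka/rdkafka.h>

      static void commit_or_recover(rd_kafka_t *rk) {
          /* -1 = use the remaining transaction timeout. */
          rd_kafka_error_t *error = rd_kafka_commit_transaction(rk, -1);
          if (!error)
              return;                               /* committed */

          if (rd_kafka_error_txn_requires_abort(error)) {
              rd_kafka_error_destroy(error);
              error = rd_kafka_abort_transaction(rk, -1);
              /* ...re-process and re-produce the input batch... */
          } else if (rd_kafka_error_is_fatal(error)) {
              fprintf(stderr, "Fatal error: %s\n", rd_kafka_error_string(error));
              /* Decommission this producer instance. */
          }
          if (error)
              rd_kafka_error_destroy(error);
      }
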
  • v1.6.1(Feb 24, 2021)

    librdkafka v1.6.1

    librdkafka v1.6.1 is a maintenance release.

    Upgrade considerations

    • Fatal idempotent producer errors are now also fatal to the transactional producer. This is a necessary step to maintain data integrity prior to librdkafka supporting KIP-360. Applications should check any transactional API errors for the is_fatal flag and decommission the transactional producer if the flag is set.
    • The consumer error raised by auto.offset.reset=error now has error-code set to ERR__AUTO_OFFSET_RESET to allow an application to differentiate between auto offset resets and other consumer errors.

    Fixes

    General fixes

    • Admin API and transactional send_offsets_to_transaction() coordinator requests, such as TxnOffsetCommitRequest, could in rare cases be sent multiple times which could cause a crash.
    • ssl.ca.location=probe is now enabled by default on Mac OSX since the librdkafka-bundled OpenSSL might not have the same default CA search paths as the system or brew installed OpenSSL. Probing scans all known locations.

    Transactional Producer fixes

    • Fatal idempotent producer errors are now also fatal to the transactional producer.
    • The transactional producer could crash if the transaction failed while send_offsets_to_transaction() was called.
    • Group coordinator requests for transactional send_offsets_to_transaction() calls would leak memory if the underlying request was attempted to be sent after the transaction had failed.
    • When gradually producing to multiple partitions (resulting in multiple underlying AddPartitionsToTxnRequests) subsequent partitions could get stuck in pending state under certain conditions. These pending partitions would not send queued messages to the broker and eventually trigger message timeouts, failing the current transaction. This is now fixed.
    • Committing an empty transaction (no messages were produced and no offsets were sent) would previously raise a fatal error due to invalid state on the transaction coordinator. We now allow empty/no-op transactions to be committed.

    Consumer fixes

    • The consumer will now retry indefinitely (or until the assignment is changed) to retrieve committed offsets. This fixes the issue where only two retries were attempted when outstanding transactions were blocking OffsetFetch requests with ERR_UNSTABLE_OFFSET_COMMIT. #3265
  • v1.6.0(Jan 26, 2021)

    librdkafka v1.6.0

    librdkafka v1.6.0 is a feature release:

    • KIP-429 Incremental rebalancing with sticky consumer group partition assignor (KIP-54) (by @mhowlett).
    • KIP-480 Sticky producer partitioning (sticky.partitioning.linger.ms) - achieves higher throughput and lower latency through sticky selection of random partition (by @abbycriswell).
    • AdminAPI: Add support for DeleteRecords(), DeleteGroups() and DeleteConsumerGroupOffsets() (by @gridaphobe)
    • KIP-447 Producer scalability for exactly once semantics - allows a single transactional producer to be used for multiple input partitions. Requires Apache Kafka 2.5 or later.
    • Transactional producer fixes and improvements, see Transactional Producer fixes below.
    • The librdkafka.redist NuGet package now supports Linux ARM64/Aarch64.

    Upgrade considerations

    • Sticky producer partitioning (sticky.partitioning.linger.ms) is enabled by default (10 milliseconds), which affects the distribution of randomly partitioned messages: where these messages were previously distributed evenly over the available partitions, they are now produced to a single partition for the duration of the sticky time (10 milliseconds by default) before a new random sticky partition is selected.
    • The new KIP-447 transactional producer scalability guarantees are only supported on Apache Kafka 2.5 or later, on earlier releases you will need to use one producer per input partition for EOS. This limitation is not enforced by the producer or broker.
    • Error handling for the transactional producer has been improved, see the Transactional Producer fixes below for more information.

    Known issues

    • The Transactional Producer's API timeout handling is inconsistent with the underlying protocol requests, it is therefore strongly recommended that applications call rd_kafka_commit_transaction() and rd_kafka_abort_transaction() with the timeout_ms parameter set to -1, which will use the remaining transaction timeout.

    Enhancements

    • KIP-107, KIP-204: AdminAPI: Added DeleteRecords() (by @gridaphobe).
    • KIP-229: AdminAPI: Added DeleteGroups() (by @gridaphobe).
    • KIP-496: AdminAPI: Added DeleteConsumerGroupOffsets().
    • KIP-464: AdminAPI: Added support for broker-side default partition count and replication factor for CreateTopics().
    • Windows: Added ssl.ca.certificate.stores to specify a list of Windows Certificate Stores to read CA certificates from, e.g., CA,Root. Root remains the default store.
    • Use reentrant rand_r() on supporting platforms which decreases lock contention (@azat).
    • Added assignor debug context for troubleshooting consumer partition assignments.
    • Updated to OpenSSL v1.1.1i when building dependencies.
    • Update bundled lz4 (used when ./configure --disable-lz4-ext) to v1.9.3 which has vast performance improvements.
    • Added rd_kafka_conf_get_default_topic_conf() to retrieve the default topic configuration object from a global configuration object.
    • Added conf debugging context to debug - shows set configuration properties on client and topic instantiation. Sensitive properties are redacted.
    • Added rd_kafka_queue_yield() to cancel a blocking queue call.
    • Will now log a warning when multiple ClusterIds are seen, which is an indication that the client might be erroneously configured to connect to multiple clusters which is not supported.
    • Added rd_kafka_seek_partitions() to seek multiple partitions to per-partition specific offsets.
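
    A minimal sketch of the new rd_kafka_seek_partitions() call (topic name and offsets are placeholders; error handling trimmed):

      #include <stdio.h>
      #include <librdkafka/rdkafka.h>

      static void seek_two_partitions(rd_kafka_t *rk) {
          rd_kafka_topic_partition_list_t *parts = rd_kafka_topic_partition_list_new(2);
          rd_kafka_topic_partition_list_add(parts, "my_topic", 0)->offset = 1234;
          rd_kafka_topic_partition_list_add(parts, "my_topic", 1)->offset = 5678;

          rd_kafka_error_t *error = rd_kafka_seek_partitions(rk, parts, 10 * 1000);
          if (error) {
              fprintf(stderr, "seek failed: %s\n", rd_kafka_error_string(error));
              rd_kafka_error_destroy(error);
          }
          /* Per-partition errors, if any, are reported in each element's err field. */
          rd_kafka_topic_partition_list_destroy(parts);
      }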

    Fixes

    General fixes

    • Fix a use-after-free crash when certain coordinator requests were retried.
    • The C++ oauthbearer_set_token() function would call free() on a newly created pointer, possibly leading to crashes or heap corruption (#3194)

    Consumer fixes

    • The consumer assignment and consumer group implementations have been decoupled, simplified and made more strict and robust. This will sort out a number of edge cases for the consumer where the behaviour was previously undefined.
    • Partition fetch state was not set to STOPPED if OffsetCommit failed.
    • The session timeout is now enforced locally also when the coordinator connection is down, which was not previously the case.

    Transactional Producer fixes

    • Transaction commit or abort failures on the broker, such as when the producer was fenced by a newer instance, were not propagated to the application resulting in failed commits seeming successful. This was a critical race condition for applications that had a delay after producing messages (or sendings offsets) before committing or aborting the transaction. This issue has now been fixed and test coverage improved.
    • The transactional producer API would return RD_KAFKA_RESP_ERR__STATE when API calls were attempted after the transaction had failed, we now try to return the error that caused the transaction to fail in the first place, such as RD_KAFKA_RESP_ERR__FENCED when the producer has been fenced, or RD_KAFKA_RESP_ERR__TIMED_OUT when the transaction has timed out.
    • Transactional producer retry count for transactional control protocol requests has been increased from 3 to infinite, retriable errors are now automatically retried by the producer until success or the transaction timeout is exceeded. This fixes the case where rd_kafka_send_offsets_to_transaction() would fail the current transaction into an abortable state when CONCURRENT_TRANSACTIONS was returned by the broker (which is a transient error) and the 3 retries were exhausted.

    Producer fixes

    • Calling rd_kafka_topic_new() with a topic config object with message.timeout.ms set could sometimes adjust the global linger.ms property (if not explicitly configured), which was not desired. This is now fixed and the auto adjustment is only done based on the default_topic_conf at producer creation.
    • rd_kafka_flush() could previously return RD_KAFKA_RESP_ERR__TIMED_OUT just as the timeout was reached if the messages had been flushed but there were now no more messages. This has been fixed.

    Checksums

    Release asset checksums:

    • v1.6.0.zip SHA256 af6f301a1c35abb8ad2bb0bab0e8919957be26c03a9a10f833c8f97d6c405aa8
    • v1.6.0.tar.gz SHA256 3130cbd391ef683dc9acf9f83fe82ff93b8730a1a34d0518e93c250929be9f6b
  • v1.5.3(Dec 9, 2020)

    librdkafka v1.5.3

    librdkafka v1.5.3 is a maintenance release.

    Upgrade considerations

    • CentOS 6 is now EOL and is no longer included in binary librdkafka packages, such as NuGet.

    Fixes

    General fixes

    • Fix a use-after-free crash when certain coordinator requests were retried.

    Consumer fixes

    • Consumer would not filter out messages for aborted transactions if the messages were compressed (#3020).
    • Consumer destroy without prior close() could hang in certain cgrp states (@gridaphobe, #3127).
    • Fix possible null dereference in Message::errstr() (#3140).
    • The roundrobin partition assignment strategy could get stuck in an endless loop or generate uneven assignments in case the group members had asymmetric subscriptions (e.g., c1 subscribes to t1,t2 while c2 subscribes to t2,t3). (#3159)

    Checksums

    Release asset checksums:

    • v1.5.3.zip SHA256 3f24271232a42f2d5ac8aab3ab1a5ddbf305f9a1ae223c840d17c221d12fe4c1
    • v1.5.3.tar.gz SHA256 2105ca01fef5beca10c9f010bc50342b15d5ce6b73b2489b012e6d09a008b7bf
  • v1.5.2(Oct 20, 2020)

    librdkafka v1.5.2

    librdkafka v1.5.2 is a maintenance release.

    Upgrade considerations

    • The default value for the producer configuration property retries has been increased from 2 to infinity, effectively limiting Produce retries to only message.timeout.ms. As the reasons for the automatic internal retries vary (various broker error codes as well as transport layer issues), it doesn't make much sense to limit the number of retries for retriable errors, but instead only limit the retries based on the allowed time to produce a message.
    • The default value for the producer configuration property request.timeout.ms has been increased from 5 to 30 seconds to match the Apache Kafka Java producer default. This change yields increased robustness for broker-side congestion.

    Enhancements

    • The generated CONFIGURATION.md (through rd_kafka_conf_properties_show()) now includes all properties and values, regardless of whether they were included in the build, and setting a disabled property or value through rd_kafka_conf_set() now returns RD_KAFKA_CONF_INVALID and provides a more useful error string saying why the property can't be set (see the sketch after this list).
    • Consumer configs on producers and vice versa will now be logged with warning messages on client instantiation.
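
    A small sketch of the stricter rd_kafka_conf_set() behaviour described above, assuming a build without zstd support:

      #include <stdio.h>
      #include <librdkafka/rdkafka.h>

      static void try_set_zstd(rd_kafka_conf_t *conf) {
          char errstr[256];
          /* If this librdkafka build lacks zstd, the call now returns
           * RD_KAFKA_CONF_INVALID with a descriptive error string instead of
           * silently accepting the value. */
          if (rd_kafka_conf_set(conf, "compression.codec", "zstd",
                                errstr, sizeof(errstr)) != RD_KAFKA_CONF_OK)
              fprintf(stderr, "compression.codec=zstd not usable: %s\n", errstr);
      }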

    Fixes

    Security fixes

    • There was an incorrect call to zlib's inflateGetHeader() with uninitialized memory pointers that could lead to the GZIP header of a fetched message batch being copied to arbitrary memory. This function call has now been completely removed since the result was not used. Reported by Ilja van Sprundel.

    General fixes

    • rd_kafka_topic_opaque() (used by the C++ API) would cause object refcounting issues when used on light-weight (error-only) topic objects such as consumer errors (#2693).
    • Handle name resolution failures when formatting IP addresses in error logs, and increase printed hostname limit to ~256 bytes (was ~60).
    • Broker sockets would be closed twice (thus leading to potential race condition with fd-reuse in other threads) if a custom socket_cb would return error.

    Consumer fixes

    • The roundrobin partition.assignment.strategy could crash (assert) for certain combinations of members and partitions. This is a regression in v1.5.0. (#3024)
    • The C++ KafkaConsumer destructor did not destroy the underlying C rd_kafka_t instance, causing a leak if close() was not used.
    • Expose rich error strings for C++ Consumer Message->errstr().
    • The consumer could get stuck if an outstanding commit failed during rebalancing (#2933).
    • Topic authorization errors during fetching are now reported only once (#3072).

    Producer fixes

    • Topic authorization errors are now properly propagated for produced messages, both through delivery reports and as ERR_TOPIC_AUTHORIZATION_FAILED return value from produce*() (#2215)
    • Treat cluster authentication failures as fatal in the transactional producer (#2994).
    • The transactional producer code did not properly reference-count partition objects which could in very rare circumstances lead to a use-after-free bug if a topic was deleted from the cluster when a transaction was using it.
    • ERR_KAFKA_STORAGE_ERROR is now correctly treated as a retriable produce error (#3026).
    • Messages that timed out locally would not fail the ongoing transaction. If the application did not take action on failed messages in its delivery report callback and went on to commit the transaction, the transaction would be successfully committed, simply omitting the failed messages.
    • EndTxnRequests (sent on commit/abort) are only retried in allowed states (#3041). Previously the transaction could hang on commit_transaction() if an abortable error was hit and the EndTxnRequest was to be retried.

    Note: there was no v1.5.1 librdkafka release

    Checksums

    Release asset checksums:

    • v1.5.2.zip SHA256 de70ebdb74c7ef8c913e9a555e6985bcd4b96eb0c8904572f3c578808e0992e1
    • v1.5.2.tar.gz SHA256 ca3db90d04ef81ca791e55e9eed67e004b547b7adedf11df6c24ac377d4840c6
  • v1.5.0(Jul 20, 2020)

    librdkafka v1.5.0

    The v1.5.0 release brings usability improvements, enhancements and fixes to librdkafka.

    Enhancements

    • Improved broker connection error reporting with more useful information and hints on the cause of the problem.
    • Consumer: Propagate errors when subscribing to unavailable topics (#1540)
    • Producer: Add batch.size producer configuration property (#638)
    • Add topic.metadata.propagation.max.ms to allow newly manually created topics to be propagated throughout the cluster before reporting them as non-existent. This fixes race issues where CreateTopics() is quickly followed by produce().
    • Prefer the least idle connection for periodic metadata refreshes, etc., to allow truly idle connections to time out and to avoid load-balancer-killed idle connection errors (#2845)
    • Added rd_kafka_event_debug_contexts() to get the debug contexts for a debug log line (by @wolfchimneyrock).
    • Added Test scenarios which define the cluster configuration.
    • Added MinGW-w64 builds (@ed-alertedh, #2553)
    • ./configure --enable-XYZ now requires the XYZ check to pass, and --disable-XYZ disables the feature altogether (@benesch)
    • Added rd_kafka_produceva() which takes an array of produce arguments for situations where the existing rd_kafka_producev() va-arg approach can't be used.
    • Added rd_kafka_message_broker_id() to see the broker that a message was produced or fetched from, or an error was associated with.
    • Added RTT/delay simulation to mock brokers.

    Upgrade considerations

    • Subscribing to non-existent and unauthorized topics will now propagate errors RD_KAFKA_RESP_ERR_UNKNOWN_TOPIC_OR_PART and RD_KAFKA_RESP_ERR_TOPIC_AUTHORIZATION_FAILED to the application through the standard consumer error (the err field in the message object).
    • Consumer will no longer trigger auto creation of topics, allow.auto.create.topics=true may be used to re-enable the old deprecated functionality.
    • The default consumer pre-fetch queue threshold queued.max.messages.kbytes has been decreased from 1GB to 64MB to avoid excessive network usage for low and medium throughput consumer applications. High throughput consumer applications may need to manually set this property to a higher value.
    • The default consumer Fetch wait time has been increased from 100ms to 500ms to avoid excessive network usage for low throughput topics.
    • If OpenSSL is linked statically, or ssl.ca.location=probe is configured, librdkafka will probe known CA certificate paths and automatically use the first one found. This should alleviate the need to configure ssl.ca.location when the statically linked OpenSSL's OPENSSLDIR differs from the system's CA certificate path.
    • The heuristics for handling Apache Kafka < 0.10 brokers have been removed to improve connection error handling for modern Kafka versions. Users on brokers 0.9.x or older should already be configuring api.version.request=false and broker.version.fallback=... so there should be no functional change.
    • The default producer batch accumulation time, linger.ms, has been changed from 0.5ms to 5ms to improve batch sizes and throughput while reducing the per-message protocol overhead. Applications that require lower produce latency than 5ms will need to manually set linger.ms to a lower value.
    • librdkafka's build tooling now requires Python 3.x (python3 interpreter).

    Fixes

    General fixes

    • The client could crash in rare circumstances on ApiVersion or SaslHandshake request timeouts (#2326)
    • ./configure --LDFLAGS='a=b, c=d' with arguments containing = are now supported (by @sky92zwq).
    • ./configure arguments now take precedence over cached configure variables from previous invocation.
    • Fix theoretical crash on coord request failure.
    • Unknown partition error could be triggered for existing partitions when additional partitions were added to a topic (@benesch, #2915)
    • Quickly refresh topic metadata for desired but non-existent partitions. This will speed up the initial discovery delay when new partitions are added to an existing topic (#2917).

    Consumer fixes

    • The roundrobin partition assignor could crash if subscriptions were asymmetrical (different sets from different members of the group). Thanks to @ankon and @wilmai for identifying the root cause (#2121).
    • The consumer assignors could ignore some topics if there were more subscribed topics than consumers taking part in the assignment.
    • The consumer would connect to all partition leaders of a topic even for partitions that were not being consumed (#2826).
    • Initial consumer group joins should now be a couple of seconds quicker thanks to expedited query intervals (@benesch).
    • Fix crash and/or inconsistent subscriptions when using multiple consumers (in the same process) with wildcard topics on Windows.
    • Don't propagate temporary offset lookup errors to application.
    • Immediately refresh topic metadata when partitions are reassigned to other brokers, avoiding a fetch stall of up to topic.metadata.refresh.interval.ms. (#2955)
    • Memory for batches containing control messages would not be freed when using the batch consume APIs (@pf-qiu, #2990).

    Producer fixes

    • Proper locking for transaction state in EndTxn handler.

    Checksums

    Release asset checksums:

    • v1.5.0.zip SHA256 76a1e83d643405dd1c0e3e62c7872b74e3a96c52be910233e8ec02d501fa33c8
    • v1.5.0.tar.gz SHA256 f7fee59fdbf1286ec23ef0b35b2dfb41031c8727c90ced6435b8cf576f23a656
  • v1.4.4(Jun 20, 2020)

    librdkafka v1.4.4

    v1.4.4 is a maintenance release with the following fixes and enhancements:

    • Transactional producer could crash on request timeout due to dereferencing NULL pointer of non-existent response object.
    • Mark rd_kafka_send_offsets_to_transaction() CONCURRENT_TRANSACTION (et.al) errors as retriable.
    • Fix crash on transactional coordinator FindCoordinator request failure.
    • Minimize broker re-connect delay when broker's connection is needed to send requests.
    • socket.timeout.ms was ignored when transactional.id was set.
    • Added RTT/delay simulation to mock brokers.

    Note: there was no v1.4.3 librdkafka release

  • v1.4.2(May 6, 2020)

    librdkafka v1.4.2

    v1.4.2 is a maintenance release with the following fixes and enhancements:

    • Fix produce/consume hang after partition goes away and comes back, such as when a topic is deleted and re-created (regression in v1.3.0).
    • Consumer: Reset the stored offset when partitions are un-assign()ed (fixes #2782). This fixes the case where a manual offset-less commit() or the auto-committer would commit a stored offset from a previous assignment before a new message was consumed by the application.
    • Probe known CA cert paths and set default ssl.ca.location accordingly if OpenSSL is statically linked or ssl.ca.location is set to probe.
    • Per-partition OffsetCommit errors were unhandled (fixes #2791)
    • Seed the PRNG (random number generator) by default, allow application to override with enable.random.seed=false (#2795)
    • Fix stack overwrite (of 1 byte) when SaslHandshake MechCnt is zero
    • Align bundled c11 threads (tinycthreads) constants to glibc and musl (#2681)
    • Fix return value of rd_kafka_test_fatal_error() (by @ckb42)
    • Ensure CMake sets disabled defines to zero on Windows (@benesch)
    • librdkafka's build tooling now requires Python 3.x (the python3 interpreter).

    Note: there was no v1.4.1 librdkafka release

    Checksums

    Release asset checksums:

    • v1.4.2.zip SHA256 ac50da08be69365988bad3d0c46cd87eced9381509d80d3d0b4b50b2fe9b9fa9
    • v1.4.2.tar.gz SHA256 3b99a36c082a67ef6295eabd4fb3e32ab0bff7c6b0d397d6352697335f4e57eb
  • v1.4.0(Apr 2, 2020)

    librdkafka v1.4.0

    v1.4.0 is a feature release:

    • KIP-98: Transactional Producer API
    • KIP-345: Static consumer group membership (by @rnpridgeon)
    • KIP-511: Report client software name and version to broker

    Transactional Producer API

    librdkafka now has complete Exactly-Once-Semantics (EOS) functionality, supporting the idempotent producer (since v1.0.0), a transaction-aware consumer (since v1.2.0) and full producer transaction support (in this release). This enables developers to create Exactly-Once applications with Apache Kafka.

    See the Transactions in Apache Kafka page for an introduction and check the librdkafka transactions example for a complete transactional application example.
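
    For orientation, a minimal transactional produce flow might look as follows; this is a sketch only (error handling and delivery report polling omitted), and the broker address, transactional.id and topic name are placeholders:

      #include <librdkafka/rdkafka.h>

      int main(void) {
          char errstr[512];
          rd_kafka_conf_t *conf = rd_kafka_conf_new();
          rd_kafka_conf_set(conf, "bootstrap.servers", "localhost:9092", errstr, sizeof(errstr));
          rd_kafka_conf_set(conf, "transactional.id", "my-txn-app", errstr, sizeof(errstr));

          rd_kafka_t *rk = rd_kafka_new(RD_KAFKA_PRODUCER, conf, errstr, sizeof(errstr));

          rd_kafka_init_transactions(rk, 30 * 1000);     /* acquire PID, fence older instances */
          rd_kafka_begin_transaction(rk);
          rd_kafka_producev(rk,
                            RD_KAFKA_V_TOPIC("transactions-demo"),
                            RD_KAFKA_V_VALUE("hello", 5),
                            RD_KAFKA_V_MSGFLAGS(RD_KAFKA_MSG_F_COPY),
                            RD_KAFKA_V_END);
          rd_kafka_commit_transaction(rk, -1);           /* -1 = remaining txn timeout */
          rd_kafka_destroy(rk);
          return 0;
      }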

    Security fixes

    Two security issues have been identified in the SASL SCRAM protocol handler:

    • The client nonce, which is expected to be a random string, was a static string.
    • If sasl.username and sasl.password contained characters that needed escaping, a buffer overflow and heap corruption would occur. This was protected, but too late, by an assertion.

    Both of these issues are fixed in this release.

    Enhancements

    • Add FNV-1a partitioner (by @Manicben, #2724). The new fnv1a_random partitioner is compatible with Sarama's NewHashPartitioner partition, easing transition from Sarama to librdkafka-based clients such as confluent-kafka-go.
    • Added rd_kafka_error_t / RdKafka::Error complex error type which provides error attributes such as indicating if an error is retriable.
    • The builtin mock broker now supports balanced consumer groups.
    • Support finding headers in nonstandard directories in CMake (@benesch)
    • Improved static library bundles which can now contain most dependencies.
    • Documentation, licenses, etc, is now installed by make install
    • Bump OpenSSL to v1.0.2u (when auto-building dependencies)

    Fixes

    General:

    • Correct statistics names in docs (@TimWSpence, #2754)
    • Wake up broker thread based on next request retry. Prior to this fix the next wakeup could be delayed up to 1 second regardless of next retry.
    • Treat SSL peer resets as usual Disconnects, making log.connection.close work
    • Reset buffer corrid on connection close to honour ApiVers and Sasl request priorities (@xzxxzx401, #2666)
    • Cleanup conf object if failing to create a producer or consumer (@fboranek)
    • Fix build of the rdkafka_example project for Windows when building it with Visual Studio 2017/2019 (by @Eliyahu-Machluf)
    • Minor fix to rdkafka_example usage: add lz4 and zstd compression codec to usage (by @Eliyahu-Machluf)
    • Let broker nodename updates propagate as ERR__TRANSPORT rather than ERR__NODE_UPDATE to avoid an extra error code for the application to handle.
    • Fix erroneous refcount assert in enq_once_del_source (e.g., on admin operation timeout)
    • Producers could get stuck in INIT state after a disconnect until a to-be-retried request timed out or the connection was needed for other purposes (metadata discovery, etc), this is now fixed.

    Producer:

    • flush() now works with RD_KAFKA_EVENT_DR
    • Fix race condition when finding EOS-supporting broker

    Consumer:

    • Consumers could get stuck after rebalance if assignment was empty
    • Enforce session.timeout.ms in the consumer itself (#2631)
    • max.poll.interval.ms is now only enforced when using subscribe()
    • Fix consumer_lag calculation for transactional topics
    • Show fetch/no-fetch reason in topic debugging
    • Properly propagate commit errors per partition
    • Don't send heartbeats after max.poll.interval.ms is exceeded.
    • Honour array size in rd_kafka_event_message_array() to avoid overflow (#2773)

    Checksums

    Release asset checksums:

    • v1.4.0.zip SHA256 eaf954e3b8a2ed98360b2c76f55048ee911964de8aefd8a9e1133418ec9f48dd
    • v1.4.0.tar.gz SHA256 ae27ea3f3d0d32d29004e7f709efbba2666c5383a107cc45b3a1949486b2eb84
  • v1.3.0(Dec 3, 2019)

    librdkafka v1.3.0 release

    This is a feature release adding support for KIP-392 Fetch from follower, allowing a consumer to fetch messages from the closest replica to increase throughput and reduce cost.

    Features

    • KIP-392 - Fetch messages from closest replica / follower (by @mhowlett)
    • Added experimental (subject to change or removal) mock broker to make application and librdkafka development testing easier.

    Fixes

    • Fix consumer_lag in stats when consuming from broker versions <0.11.0.0 (regression in librdkafka v1.2.0).

    Checksums

    Release asset checksums:

    • v1.3.0.zip SHA256 bd3373c462c250ecebea9043fb94597a11bd6e0871d3cde19019433d3f74a99e
    • v1.3.0.tar.gz SHA256 465cab533ebc5b9ca8d97c90ab69e0093460665ebaf38623209cf343653c76d2
  • v1.2.2(Nov 12, 2019)

    librdkafka v1.2.2 release

    v1.2.2 fixes the producer performance regression introduced in v1.2.1 which may affect high-throughput producer applications.

    Fixes

    • Fix producer insert msgq regression in v1.2.1 (#2450).
    • Upgrade builtin lz4 to 1.9.2 (CVE-2019-17543, #2598).
    • Don't trigger error when broker hostname changes (#2591).
    • Less strict message.max.bytes check for individual messages (#993).
    • Don't call timespec_get() on OSX (since it was removed in recent XCode) by @maparent .
    • configure: add --runstatedir for compatibility with autoconf.
    • LZ4 is available from ProduceRequest 0, not 3 (fixes assert in #2480).
    • Address 12 code issues identified by Coverity static code analysis.

    Enhancements

    • Add warnings for inconsistent security configuration.
    • Optimizations to hdr histogram (stats) rollover.
    • Reorganized examples and added a cleaner consumer example, added minimal C++ producer example.
    • Print compression type per message-set when debug=msg

    Checksums

    Release asset checksums:

    • v1.2.2.zip SHA256 7557b37e5133ed4c9b0cbbc3fd721c51be8e934d350d298bd050fcfbc738e551
    • v1.2.2.tar.gz SHA256 c5d6eb6ce080431f2996ee7e8e1f4b8f6c61455a1011b922e325e28e88d01b53
  • v1.2.1(Oct 9, 2019)

    librdkafka v1.2.1 release

    Warning: v1.2.1 has a producer performance regression which may affect high-throughput producer applications. We recommend that such users upgrade to v1.3.0.

    v1.2.1 is a maintenance release:

    • Properly handle new Kafka-framed SASL GSSAPI frame semantics on Windows (#2542). This bug was introduced in v1.2.0 and broke GSSAPI authentication on Windows.
    • Fix msgq (re)insertion code to avoid O(N^2) insert sort operations on retry (#2508) The msgq insert code now properly handles interleaved and overlapping message range inserts, which may occur during Producer retries for high-throughput applications.
    • configure: added --disable-c11threads to avoid using libc-provided C11 threads.
    • configure: added more autoconf compatibility options to ignore

    Checksums

    Release asset checksums:

    • v1.2.1.zip SHA256 8b5e95318b190f40cbcd4a86d6a59dbe57b54a920d8fdf64d9c850bdf05002ca
    • v1.2.1.tar.gz SHA256 f6be27772babfdacbbf2e4c5432ea46c57ef5b7d82e52a81b885e7b804781fd6
  • v1.2.0(Sep 19, 2019)

    librdkafka v1.2.0 release

    WARNING: There is an issue with SASL GSSAPI authentication on Windows with this release. Upgrade directly to v1.2.1 which fixes the issue.

    v1.2.0 is a feature release making the consumer transaction aware.

    • Transaction aware consumer (isolation.level=read_committed) implemented by @mhowlett.
    • Sub-millisecond buffering (linger.ms) on the producer.
    • Improved authentication errors (KIP-152)

    Consumer-side transaction support

    This release adds consumer-side support for transactions. In previous releases, the consumer always delivered all messages to the application, even those in aborted or not yet committed transactions. In this release, the consumer will by default skip messages in aborted transactions. This is controlled through the new isolation.level configuration property, which defaults to read_committed (only read committed messages, filtering out aborted and not-yet-committed transactions); to consume all messages, including those in aborted transactions, set this property to read_uncommitted to get the behaviour of previous releases. For consumers in read_committed mode, the end of a partition is now defined to be the offset of the last message of a successfully committed transaction (referred to as the 'Last Stable Offset'). For non-transactional messages there is no change from previous releases: they will always be read, but a consumer will not advance into a not yet committed transaction on the partition.
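
    In configuration terms this boils down to a single property; a minimal sketch (the value is shown explicitly even though read_committed is the default):

      #include <librdkafka/rdkafka.h>

      static void set_isolation_level(rd_kafka_conf_t *conf) {
          char errstr[256];
          /* Use "read_uncommitted" instead to retain the pre-v1.2.0 behaviour of
           * also delivering messages from aborted or not yet committed transactions. */
          rd_kafka_conf_set(conf, "isolation.level", "read_committed",
                            errstr, sizeof(errstr));
      }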

    Upgrade considerations

    • linger.ms default was changed from 0 to 0.5 ms to promote some level of batching even with default settings.

    New configuration properties

    • Consumer property isolation.level=read_committed ensures the consumer will only read messages from successfully committed producer transactions. Default is read_committed. To get the previous behaviour, set the property to read_uncommitted, which will read all messages produced to a topic, regardless of whether the message was part of an aborted or not yet committed transaction.

    Enhancements

    • Offset commit metadata (arbitrary application-specified data) is now returned by rd_kafka_committed() and rd_kafka_offsets_for_times() (@damour, #2393)
    • C++: Added Conf::c_ptr*() to retrieve the underlying C config object.
    • Added on_thread_start() and on_thread_exit() interceptors.
    • Increase queue.buffering.max.kbytes max to INT_MAX.
    • Optimize varint decoding, increasing consume performance by ~15%.

    Fixes

    General:

    • Rate limit IO-based queue wakeups to linger.ms, this reduces CPU load and lock contention for high throughput producer applications. (#2509)
    • Reduce memory allocations done by rd_kafka_topic_partition_list_new().
    • Fix socket recv error handling on MSVC (by Jinsu Lee).
    • Avoid 1s stalls in some scenarios when broker wakeup-fd is triggered.
    • SSL: Use only hostname (not port) when valid broker hostname (by Hunter Jacksson)
    • SSL: Ignore OpenSSL cert verification results if enable.ssl.certificate.verification=false (@salisbury-espinosa, #2433)
    • rdkafka_example_cpp: fix metadata listing mode (@njzcx)
    • SASL Kerberos/GSSAPI: don't treat kinit ECHILD errors as errors (@hannip, #2421)
    • Fix compare overflows (#2443)
    • configure: Add option to disable automagic dependency on zstd (by Thomas Deutschmann)
    • Documentation updates and fixes by Cedric Cellier and @ngrandem
    • Set thread name on MacOS X (by Nikhil Benesch)
    • C++: Fix memory leak in Headers (by Vladimir Sakharuk)
    • Fix UBSan (undefined behaviour errors) (@PlacidBox, #2417)
    • CONFIGURATION.md: escape || inside markdown table (@mhowlett)
    • Refresh broker list metadata even if no topics to refresh (#2476)

    Consumer:

    • Make rd_kafka_pause|resume_partitions() synchronous, making sure that a subsequent consumer_poll() will not return messages for the paused partitions (#2455).
    • Fix incorrect toppar destroy in OffsetRequest (@binary85, #2379)
    • Fix message version 1 offset calculation (by Martin Ivanov)
    • Defer commit in transport error to avoid consumer_close hang.

    Producer:

    • Messages were not timed out for leader-less partitions (.NET issue #1027).
    • Improve message timeout granularity to millisecond precision (the smallest effective message timeout will still be 1000ms).
    • message.timeout.ms=0 is now accepted even if linger.ms > 0 (by Jeff Snyder)
    • Don't track max.poll.interval.ms unless in Consumer mode; this saves quite a few memory barriers for high-performance Producers.
    • Optimization: avoid atomic fatal error code check when idempotence is disabled.

    Checksums

    Release asset checksums:

    • v1.2.0.zip SHA256 6e57f09c28e9a65abb886b84ff638b2562b8ad71572de15cf58578f3f9bc45ec
    • v1.2.0.tar.gz SHA256 eedde1c96104e4ac2d22a4230e34f35dd60d53976ae2563e3dd7c27190a96859
  • v1.1.0(Jul 5, 2019)

    librdkafka v1.1.0 release

    v1.1.0 is a security-focused feature release:

    • SASL OAUTHBEARER support (by @rondagostino at StateStreet)
    • In-memory SSL certificates (PEM, DER, PKCS#12) support (by @noahdav at Microsoft)
    • Pluggable broker SSL certificate verification callback (by @noahdav at Microsoft)
    • Use Windows Root/CA SSL Certificate Store (by @noahdav at Microsoft)
    • ssl.endpoint.identification.algorithm=https (off by default) to validate the broker hostname matches the certificate. Requires OpenSSL >= 1.0.2.
    • Improved GSSAPI/Kerberos ticket refresh

    Upgrade considerations

    • Windows SSL users will no longer need to specify a CA certificate file/directory (ssl.ca.location), librdkafka will load the CA certs by default from the Windows Root Certificate Store.
    • SSL peer (broker) certificate verification is now enabled by default (disable with enable.ssl.certificate.verification=false)
    • %{broker.name} is no longer supported in sasl.kerberos.kinit.cmd since kinit refresh is no longer executed per broker, but per client instance.

    SSL

    New configuration properties:

    • ssl.key.pem - client's private key as a string in PEM format
    • ssl.certificate.pem - client's public key as a string in PEM format (see the configuration sketch after this list)
    • enable.ssl.certificate.verification - enable(default)/disable OpenSSL's builtin broker certificate verification.
    • ssl.endpoint.identification.algorithm - to verify the broker's hostname with its certificate (disabled by default).
    • Add new rd_kafka_conf_set_ssl_cert() to pass PKCS#12, DER or PEM certs in (binary) memory form to the configuration object.
    • The private key data is now securely cleared from memory after last use.
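
    A configuration sketch for the in-memory PEM properties listed above; the PEM strings are hypothetical placeholders and would normally come from a secrets store:

      #include <librdkafka/rdkafka.h>

      static void configure_inmemory_certs(rd_kafka_conf_t *conf) {
          const char *cert_pem = "-----BEGIN CERTIFICATE-----\n...\n-----END CERTIFICATE-----\n";
          const char *key_pem  = "-----BEGIN PRIVATE KEY-----\n...\n-----END PRIVATE KEY-----\n";
          char errstr[256];

          rd_kafka_conf_set(conf, "security.protocol", "ssl", errstr, sizeof(errstr));
          rd_kafka_conf_set(conf, "ssl.certificate.pem", cert_pem, errstr, sizeof(errstr));
          rd_kafka_conf_set(conf, "ssl.key.pem", key_pem, errstr, sizeof(errstr));
          /* Binary PKCS#12/DER certificates can instead be passed with
           * rd_kafka_conf_set_ssl_cert(). */
      }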

    Enhancements

    • configure: Improve library checking
    • Added rd_kafka_conf() to retrieve the client's configuration object
    • Bump message.timeout.ms max value from 15 minutes to 24 days (@sarkanyi, workaround for #2015)

    Fixes

    • SASL GSSAPI/Kerberos: Don't run kinit refresh for each broker, just per client instance.
    • SASL GSSAPI/Kerberos: Changed sasl.kerberos.kinit.cmd to first attempt ticket refresh, then acquire.
    • SASL: Proper locking on broker name acquisition.
    • Consumer: max.poll.interval.ms now correctly handles blocking poll calls, allowing a longer poll timeout than the max poll interval.
    • configure: Fix libzstd static lib detection
    • rdkafka_performance: Fix for Misleading "All messages delivered!" message (@solar_coder)
    • Windows build and CMake fixes (@myd7349)

    Checksums

    Release asset checksums:

    • v1.1.0.zip SHA256 70279676ed863c984f9e088db124ac84a080e644c38d4d239f9ebd3e3c405e84
    • v1.1.0.tar.gz SHA256 123b47404c16bcde194b4bd1221c21fdce832ad12912bd8074f88f64b2b86f2b
  • v1.0.1(May 28, 2019)

    librdkafka v1.0.1 release

    v1.0.1 is a maintenance release with the following fixes:

    • Fix consumer stall when broker connection goes down (issue #2266 introduced in v1.0.0)
    • Fix AdminAPI memory leak when broker does not support request (@souradeep100, #2314)
    • Update/fix protocol error response codes (@benesch)
    • Treat ECONNRESET as standard Disconnects (#2291)
  • v1.0.0(Mar 25, 2019)

    librdkafka v1.0.0 release

    v1.0.0 is a major feature release:

    • Idempotent producer - guaranteed ordering, exactly-once producing.
    • Sparse/on-demand connections - connections are no longer maintained to all brokers in the cluster.
    • KIP-62 - max.poll.interval.ms for high-level consumers

    This release also changes configuration defaults and deprecates a set of configuration properties, make sure to read the Upgrade considerations section below.

    Upgrade considerations (IMPORTANT)

    librdkafka v1.0.0 is API (C & C++) and ABI (C) compatible with older versions of librdkafka, but there are changes to configuration properties that may require changes to existing applications.

    Configuration default changes

    The following configuration properties have changed default values, which may require application changes:

    • acks (alias request.required.acks) default is now all (wait for ack from all in-sync replica brokers), the previous default was 1 (only wait for ack from partition leader) which could cause data loss if the leader broker goes down.
    • enable.partition.eof is now false by default. Applications that rely on ERR__PARTITION_EOF to be emitted must now explicitly set this property to true. This change was made to simplify the common case consumer application consume loop.
    • broker.version.fallback was changed from 0.9 to 0.10 and broker.version.fallback.ms was changed to 0. Users on Apache Kafka <0.10 must set api.version.request=false and broker.version.fallback=.. to their broker version. For users >=0.10 there is no longer any need to specify any of these properties. See https://github.com/edenhill/librdkafka/wiki/Broker-version-compatibility for more information.

    Deprecated configuration properties

    • topic.metadata.refresh.fast.cnt is no longer used.
    • socket.blocking.max.ms is no longer used.
    • reconnect.backoff.jitter.ms is no longer used, see reconnect.backoff.ms and reconnect.backoff.max.ms.
    • offset.store.method=file is deprecated.
    • offset.store.path is deprecated.
    • offset.store.sync.interval.ms is deprecated.
    • queuing.strategy was an experimental property that is now deprecated.
    • msg_order_cmp was an experimental property that is now deprecated.
    • produce.offset.report is no longer used. Offsets are always reported.
    • auto.commit.enable (topic level) for the simple (legacy) consumer is now deprecated.

    Use of any deprecated configuration property will result in a warning when the client instance is created. The deprecated configuration properties will be removed in a future version of librdkafka. See issue #2020 for more information.

    Configuration checking

    The checks for incompatible configuration have been improved: client instantiation (rd_kafka_new()) will now fail if incompatible configuration is detected.

    max.poll.interval.ms is enforced

    This release adds support for max.poll.interval.ms (KIP-62), which requires the application to call rd_kafka_consumer_poll()/rd_kafka_poll() at least every max.poll.interval.ms. Failure to do so will make the consumer automatically leave the group, causing a group rebalance, and not rejoin the group until the application has called ..poll() again, triggering yet another group rebalance. max.poll.interval.ms is set to 5 minutes by default.
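
    A typical poll loop that satisfies this requirement might look like the following sketch (message processing is a placeholder and must complete well within max.poll.interval.ms):

      #include <stdio.h>
      #include <librdkafka/rdkafka.h>

      static void consume_loop(rd_kafka_t *rk, volatile int *run) {
          while (*run) {
              rd_kafka_message_t *msg = rd_kafka_consumer_poll(rk, 100);
              if (!msg)
                  continue;                 /* poll timeout, loop again */
              if (msg->err)
                  fprintf(stderr, "Consume error: %s\n", rd_kafka_message_errstr(msg));
              else {
                  /* process_message(msg) would go here. */
              }
              rd_kafka_message_destroy(msg);
          }
          rd_kafka_consumer_close(rk);
      }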

    Idempotent Producer

    This release adds support for Idempotent Producer, providing exactly-once producing and guaranteed ordering of messages.

    Enabling idempotence is as simple as setting the enable.idempotence configuration property to true.

    There are no required application changes, but it is recommended to add support for the newly introduced fatal errors that will be triggered when the idempotent producer encounters an unrecoverable error that would break the ordering or duplication guarantees.
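
    Enabling idempotence is therefore a one-property change; a minimal sketch (the broker address is a placeholder, and a real application should also register an error callback to catch the fatal errors mentioned above):

      #include <librdkafka/rdkafka.h>

      static rd_kafka_t *create_idempotent_producer(void) {
          char errstr[512];
          rd_kafka_conf_t *conf = rd_kafka_conf_new();
          rd_kafka_conf_set(conf, "bootstrap.servers", "localhost:9092", errstr, sizeof(errstr));
          rd_kafka_conf_set(conf, "enable.idempotence", "true", errstr, sizeof(errstr));
          return rd_kafka_new(RD_KAFKA_PRODUCER, conf, errstr, sizeof(errstr));
      }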

    See Idempotent Producer in the manual and the Exactly once semantics blog post for more information.

    Sparse connections

    In previous releases librdkafka would maintain open connections to all brokers in the cluster and the bootstrap servers.

    With this release librdkafka now connects to a single bootstrap server to retrieve the full broker list, and then connects to the brokers it needs to communicate with: partition leaders, group coordinators, etc.

    For large scale deployments this greatly reduces the number of connections between clients and brokers, and avoids the repeated idle connection closes for unused connections.

    See Sparse connections in the manual for more information.

    Original issue #825.

    Features

    • Add support for ZSTD compression (KIP-110, @mvavrusa. Caveat: will not currently work with topics configured with compression.type=zstd, instead use compression.type=producer, see #2183)
    • Added max.poll.interval.ms (KIP-62, #1039) to allow long processing times.
    • Message Header support for C++ API (@davidtrihy)

    Enhancements

    • Added rd_kafka_purge() API to purge messages from producer queues (#990)
    • Added fatal errors (see ERR__FATAL and rd_kafka_fatal_error()) to raise unrecoverable errors to the application. Currently only triggered by the Idempotent Producer.
    • Added rd_kafka_message_status() producer API that may be used from the delivery report callback to know if the message was persisted to brokers or not. This is useful for applications that want to perform manual retries of messages, to know if a retry could lead to duplication. See the sketch after this list.
    • Backoff reconnects exponentially (See reconnect.backoff.ms and reconnect.backoff.max.ms).
    • Add broker[..].req["reqType"] per-request-type metrics to statistics.
    • CONFIGURATION.md: Added Importance column.
    • ./configure --install-deps (and also --source-deps-only) will automatically install dependencies through the native package manager and/or from source.
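
    A sketch of a delivery report callback using rd_kafka_message_status() to judge whether an application-level retry could cause duplication (register it with rd_kafka_conf_set_dr_msg_cb() before creating the producer):

      #include <librdkafka/rdkafka.h>

      static void dr_cb(rd_kafka_t *rk, const rd_kafka_message_t *msg, void *opaque) {
          (void)rk; (void)opaque;
          if (!msg->err)
              return;                                     /* delivered, nothing to do */

          switch (rd_kafka_message_status(msg)) {
          case RD_KAFKA_MSG_STATUS_NOT_PERSISTED:
              /* Not written to the log: safe for the application to retry. */
              break;
          case RD_KAFKA_MSG_STATUS_POSSIBLY_PERSISTED:
              /* Unknown outcome: retrying may produce a duplicate. */
              break;
          case RD_KAFKA_MSG_STATUS_PERSISTED:
              /* Already persisted: do not retry. */
              break;
          }
      }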

    Fixes

    General

    • rd_kafka_version() was not thread safe
    • Round up microsecond->millisecond timeouts to 1ms in internal scheduler to avoid CPU-intensive busy-loop.
    • Send connection handshake requests before lower prio requests.
    • Fix timespec conversion to avoid infinite loop (#2108, @boatfish)
    • Fix busy-loop: Don't set POLLOUT (due to requests queued) in CONNECT state (#2118)
    • Broker hostname max size increased from 127 to 255 bytes (#2171, @Vijendra07Kulhade)

    Consumer

    • C++: Fix crash when Consumer ctor fails
    • Make sure LeaveGroup is sent on unsubscribe and consumer close (#2010, #2040)
    • Remember assign()/seek():ed offset when pause()ing (#2105)
    • Fix handling of mixed MsgVersions in same FetchResponse (#2090)

    Producer

    • Added delivery.timeout.ms -> message.timeout.ms alias
    • Prevent int overflow while computing abs_timeout for producer request… (#2050, @KseniyaYakil).
    • Producer: fix re-ordering corner-case on retry.

    Windows

    • win32: cnd_timedwait*() could leave the cond signalled, resulting in high CPU usage.

    Build/installation/tooling

    • Makefile: fix install rule (#2049, @pacovn)
    • Fix counting error in rdkafka_performance #1542 (#2028, @gnanasekarl)
    • OpenSSL 1.1.0 compatibility (#2000, @nouzun, @wiml)
    • Set OpenSSL locking callbacks as required, don't call CRYPTO_cleanup_all_ex_data (#1984, @ameihm0912)
    • Fix 64-bit IBM i build error (#2017, @ThePrez)
    • CMake: Generate pkg-config files (@Oxymoron79, #2075)
    • mklove: suggest brew packages to install on osx
    • rdkafka_performance: Add an option to dump the configuration (@akohn)
    • Check for libcrypto explicitly (OSX Mojave, #2089)
  • v0.11.6(Oct 18, 2018)

    v0.11.6 is a maintenance release.

    Critical fixes

    • The internal timer could wrap in under 8 days on Windows, causing stalls and hangs. Bug introduced in v0.11.5. #1980
    • Purge and retry buffers in outbuf queue on connection fail (#1913). Messages could get stuck in internal queues on retry and broker down.

    Enhancements

    • Enable low latency mode on Windows by using TCP "pipe". Users no longer need to set socket.blocking.max.ms to improve latency. (#1930, @LavaSpider)
    • Added rd_kafka_destroy_flags() to control destroy behaviour. Can be used to force a consumer to terminate without leaving the group or committing final offsets.
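
    A minimal sketch of the new destroy-flags call for a consumer handle (the flag shown is the one introduced in this release):

      #include <librdkafka/rdkafka.h>

      static void fast_shutdown(rd_kafka_t *rk) {
          /* Terminate without leaving the group or committing final offsets. */
          rd_kafka_destroy_flags(rk, RD_KAFKA_DESTROY_F_NO_CONSUMER_CLOSE);
      }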

    Fixes

    • Producer: Serve UA queue when transitioning topic from UNKNOWN. Messages could get stuck in UA partition queue on metadata timeout (#1985).
    • Fix partial read issue on unix platforms without recvmsg()
    • Improve disconnect detection on Windows (#1937)
    • Use atomics for refcounts on all platforms with atomics support (#1873)
    • Message err was not set for on_ack interceptors on broker reply (#1892)
    • Fix consumer_lag to -1 when neither app_offset nor committed_offset is available (#1911)
    • Fix crash: Insert retriable messages on partition queue, not xmit queue (#1965)
    • Fix crash: don't enqueue messages for retry when handle is terminating.
    • Disconnect regardless of socket.max.fails when partial request times out (#1955)
    • Now builds with libressl (#1896, #1901, @secretmike)
    • Call poll when flush() is called with a timeout of 0 (#1950)
    • Proper locking of set_fetch_state() when OffsetFetch response is outdated
    • Destroy unknown and no longer desired partitions (fixes destroy hang)
    • Handle FetchResponse for partitions that were removed during fetch (#1948)
    • rdkafka_performance: Default the message size to the length of the pattern unless given explicitly (#1899, @ankon)
    • Fix timeout reuse in queue serving/polling functions (#1863)
    • Fix rd_atomic*_set() to use __atomic or __sync when available
    • NuGet: change runtime from win7-.. to more generic win-.. (CLIENTS-1188)
    • Fix crash: failed ops enq could be handled by original dest queue callbacks
    • Update STATISTICS.md to match code (@ankon, #1936)
    • rdkafka_performance: Don't sleep while waiting for delivery reports (#1918, @ankon)
    • Remove unnecessary 100ms sleep when broker goes from UP -> DOWN (#1895)
    • rdhdrhistogram: Fix incorrect float -> int cast causing havoc on MIPS (@andoma)
    • "Message size too large": The receive buffer would grow by x2 (up to the limit) each time a EOS control message was seen. (#1472)
    • Added support for system-provided C11 threads, e.g. alpine/musl. This fixes weird behaviour on Alpine (#1998)
    • Fix high CPU usage after disconnect from broker while waiting for SASL response (#2032, @stefanseufert)
    • Fix dont-destroy-from-rdkafka-thread detection logic to avoid assert when using plugins.
    • Fix LTO warnings with gcc 8 (#2038, @Romain-Geissler-1A)
  • v0.11.5(Jul 19, 2018)

    v0.11.5 is a feature release that adds support for the Kafka Admin API (KIP-4).

    Admin API

    This release adds support for the Admin API, enabling applications and users to perform administrative Kafka tasks programmatically:

    • Create topics - specifying partition count, replication factor and topic configuration.
    • Delete topics - delete topics in cluster.
    • Create partitions - extend a topic with additional partitions.
    • Alter configuration - set, modify or delete configuration for any Kafka resource (topic, broker, ..).
    • Describe configuration - view configuration for any Kafka resource.

    The API closely follows the Java Admin API:

    https://github.com/edenhill/librdkafka/blob/master/src/rdkafka.h#L4495
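
    For orientation, a sketch of creating a topic through the new Admin API (topic name, partition count and replication factor are placeholders; most error handling is trimmed):

      #include <stdio.h>
      #include <librdkafka/rdkafka.h>

      static void create_topic_example(rd_kafka_t *rk) {
          char errstr[512];
          rd_kafka_NewTopic_t *topic =
              rd_kafka_NewTopic_new("admin-demo", 3, 1, errstr, sizeof(errstr));
          rd_kafka_queue_t *queue = rd_kafka_queue_new(rk);

          rd_kafka_CreateTopics(rk, &topic, 1, NULL, queue);

          /* Wait for the result event and report per-topic errors, if any. */
          rd_kafka_event_t *ev = rd_kafka_queue_poll(queue, 10 * 1000);
          if (ev && rd_kafka_event_CreateTopics_result(ev)) {
              const rd_kafka_CreateTopics_result_t *res = rd_kafka_event_CreateTopics_result(ev);
              size_t cnt;
              const rd_kafka_topic_result_t **topics =
                  rd_kafka_CreateTopics_result_topics(res, &cnt);
              for (size_t i = 0; i < cnt; i++)
                  if (rd_kafka_topic_result_error(topics[i]))
                      fprintf(stderr, "%s: %s\n",
                              rd_kafka_topic_result_name(topics[i]),
                              rd_kafka_topic_result_error_string(topics[i]));
          }
          if (ev)
              rd_kafka_event_destroy(ev);
          rd_kafka_NewTopic_destroy(topic);
          rd_kafka_queue_destroy(queue);
      }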

    New and updated configuration

    • Added compression.level configuration option, which allows fine-tuning of gzip and LZ4 compression level (@erkoln)
    • Implement ssl.curves.list and ssl.sigalgs.list configuration settings (@jvgutierrez)
    • Changed queue.buffering.backpressure.threshold default (#1848)

    Enhancements

    • Callback based event notifications (@fede1024)
    • Event callbacks may now optionally be triggered from a dedicated librdkafka background thread, see rd_kafka_conf_set_background_event_cb.
    • Log the value that couldn't be found for flag configuration options (@ankon)
    • Add support for rd_kafka_conf_set_events(conf, ..EVENT_ERROR) to allow generic errors to be retrieved as events.
    • Avoid allocating BIOs and copying for base64 processing (@agl)
    • Don't log connection close for idle connections (regardless of log.connection.close)
    • Improve latency by using high-precision QPC clock on Windows
    • Added make uninstall
    • Minor documentation updates from replies to #1794 (@briot)
    • Added rd_kafka_controllerid() to return the current controller.
    • INTRODUCTION.md: add chapter on latency measurement.
    • Add relative hyperlinks to table of contents in INTRODUCTION.md (#1791, @stanislavkozlovski)
    • Improved statistics:
      • Added Hdr Histograms for all windowed stats (rtt, int_latency, throttle) #1798
      • Added top-level totals for broker receive and transmit metrics.
      • Added batchcnt, batchsize histograms to consumer.
      • Added outbuf_latency histograms.
      • STATISTICS.md moved from wiki to source tree.
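
    For the event-based error retrieval mentioned above, a minimal sketch follows. The handle rk is assumed to have been created from a configuration on which rd_kafka_conf_set_events(conf, RD_KAFKA_EVENT_ERROR) was called; the polling loop and timeout are illustrative:

      #include <librdkafka/rdkafka.h>
      #include <stdio.h>

      /* Sketch: drain error events from the main queue.
       * rk is an existing handle configured with
       * rd_kafka_conf_set_events(conf, RD_KAFKA_EVENT_ERROR). */
      static void poll_error_events(rd_kafka_t *rk) {
          rd_kafka_queue_t *mainq = rd_kafka_queue_get_main(rk);
          rd_kafka_event_t *rkev;

          /* Poll until no more events are immediately available */
          while ((rkev = rd_kafka_queue_poll(mainq, 100 /* ms */))) {
              if (rd_kafka_event_type(rkev) == RD_KAFKA_EVENT_ERROR)
                  fprintf(stderr, "Error event: %s\n",
                          rd_kafka_event_error_string(rkev));
              rd_kafka_event_destroy(rkev);
          }
          rd_kafka_queue_destroy(mainq);
      }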

    Fixes

    • Fixed murmur2 partitioner to make it compatible with the Java version (#1816, @lins05)
    • Fix pause/resume: next_offset was not properly initialized
    • Fix a segmentation fault in rdkafka_buf with zero-length string (@sunny1988)
    • Set error string length in rkmessage.len on error (#1851)
    • Don't let metadata ERR_UNKNOWN set topic state to non-existent.
    • The app_offset metric is now reset to INVALID when the fetcher is stopped.
    • consumer_lag is now calculated as consumer_lag = hi_wmark_offset - MAX(app_offset, committed_offset), which makes it correct after a reassignment but before new messages have been consumed (#1878)
    • socket.nagle.disable=true was never applied on non-Windows platforms (#1838)
    • Update interface compile definitions for Windows using CMake (#1800, @raulbocanegra)
    • Fix queue hang when queue_destroy() is called on rdkafka-owned queue (#1792)
    • Fix hang on unclean termination when there are outstanding requests.
    • Metadata: fix crash when topic is in a transitional state
    • Metadata: sort topic partition list
    • rdkafka_example emitted bogus produce errors
    • Increase BROKERS_MAX to 10K and PARTITIONS_MAX to 100K
    • Proper log message on SSL connection close
    • Missing return on error causes use-after-free in SASL code (@sidhpurwala-huzaifa)
    • Fix configure --pkg-config-path=... (#1797, @xbolshe)
    • Fix -fsanitize=undefined warning for overflowed OP switches (#1789)
  • v0.11.4(Mar 28, 2018)

    Maintenance release

    Default changes

    • socket.max.fails changed to 1 to provide the same functionality (fail the request immediately on error) now that retries work properly again.
    • fetch.max.bytes (new config property) is automatically adjusted to be >= message.max.bytes, and receive.message.max.bytes is automatically adjusted to be > fetch.max.bytes. (#1616)

    New features

    • Message Headers support (with help from @johnistan); see the sketch after this list
    • Java-compatible Murmur2 partitioners (#1468, @barrotsteindev)
    • Add PKCS#12 Keystore support - ssl.keystore.location (#1494, @AMHIT)
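
    As a sketch of the new Message Headers support, producing a message with a single header through rd_kafka_producev(); the topic and header names are placeholders and rk is assumed to be an existing producer handle:

      #include <librdkafka/rdkafka.h>

      /* Sketch: produce a message carrying one custom header.
       * "my-topic" and "trace-id" are placeholder names. */
      static rd_kafka_resp_err_t produce_with_header(rd_kafka_t *rk) {
          return rd_kafka_producev(
              rk,
              RD_KAFKA_V_TOPIC("my-topic"),
              RD_KAFKA_V_MSGFLAGS(RD_KAFKA_MSG_F_COPY),
              RD_KAFKA_V_VALUE("hello", 5),
              /* -1 = header value is a NUL-terminated string */
              RD_KAFKA_V_HEADER("trace-id", "abc123", -1),
              RD_KAFKA_V_END);
      }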

    Noteworthy fixes

    • Formalise and fix Producer retries and retry-ordering (#623, #1092, #1432, #1476, #1421)
      • Ordering is now retained despite retries if max.in.flight=1.
      • Behaviour is now documented
    • Fix timeouts for retried requests and properly handle retries for all request types (#1497)
    • Add and use fetch.max.bytes to limit total Fetch response size (KIP-74, #1616). Fixes "Invalid response size" issues.

    Enhancements

    • Added sasl.mechanism and compression.type configuration property aliases for conformance with the Java client.
    • Improved Producer performance
    • C++: add c_ptr() to Handle, Topic, Message classes to expose the underlying librdkafka object
    • Honour per-message partition in produce_batch() if MSG_F_PARTITION is set (@barrotsteindev, closes #1604); see the sketch after this list
    • Added on_request_sent() interceptor
    • Added experimental flexible producer queuing.strategy=fifo|lifo
    • Broker address DNS record round-robin: try to maintain round-robin position across resolve calls.
    • Set system thread name for internal librdkafka threads (@tbsaunde)
    • Added more concise and user-friendly 'consumer' debug context
    • Add partitioner (string) topic configuration property to set the builtin partitioners
    • Generate rdkafka-static.pc (pkg-config) for static linking
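
    A minimal sketch of the per-message partition behaviour in produce_batch() mentioned above; rkt is assumed to be an existing topic handle, and payloads and partition numbers are illustrative:

      #include <librdkafka/rdkafka.h>
      #include <string.h>

      /* Sketch: each message picks its own partition via
       * rkmessages[i].partition when RD_KAFKA_MSG_F_PARTITION is set. */
      static int produce_batch_per_partition(rd_kafka_topic_t *rkt) {
          rd_kafka_message_t msgs[2];
          memset(msgs, 0, sizeof(msgs));

          msgs[0].payload   = "to partition 0";
          msgs[0].len       = strlen("to partition 0");
          msgs[0].partition = 0;

          msgs[1].payload   = "to partition 1";
          msgs[1].len       = strlen("to partition 1");
          msgs[1].partition = 1;

          /* Returns the number of messages successfully enqueued */
          return rd_kafka_produce_batch(rkt, RD_KAFKA_PARTITION_UA,
                                        RD_KAFKA_MSG_F_COPY |
                                        RD_KAFKA_MSG_F_PARTITION,
                                        msgs, 2);
      }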

    Fixes

    • Fix producer memory leak on <0.11 brokers when compressed messageset is below copy threshold (closes #1534)
    • CRC32C - fix unaligned access on ARM (@Soundman32)
    • Fix read after free in buf_write_seek
    • Fix broker wake up (#1667, @gduranceau)
    • Fix consumer hang when rebalancing during commit (closes #1605, @ChenyuanHu)
    • CMake fixes for Windows (@raulbocanegra)
    • LeaveGroup was not sent on close when doing final offset commits
    • Fix for consumer slowdown/stall on compacted topics where actual last offset < MsgSet.LastOffset (KAFKA-5443)
    • Fix global->topic conf fallthru in C++ API
    • Fix infinite loop on LeaveGroup failure
    • Fix possible crash on OffsetFetch retry
    • Incorporate compressed message count when deciding on fetch backoff (#1623)
    • Fix debug-only crash on Solaris (%s NULL) (closes #1423)
    • Drain broker ops queue on termination to avoid hang (closes #1596)
    • cmake: Allow building a static library (#1602, @proller)
    • Don't store invalid offset as next one when pausing (#1453, @mfontanini)
    • use #if instead of #ifdef / defined() for atomics (#1592, @vavrusa)
    • fixed .lib paths in nuget packaging (#1587)
    • Fix strerror_r crash on Alpine (#1580, @skarlsson)
    • Allow arbitrary-length (>255) SASL PLAIN user/pass (#1691, #1692)
    • Trigger ApiVersionRequest on reconnect if broker.version.fallback supports it (closes #1694)
    • Read Fetch MsgAttributes as int8 (discovered by @tvoinarovskyi, closes #1689)
    • Portability: stop using typeof in rdkafka_transport.c (#1708, @tbsaunde)
    • Portability: replace use of #pragma once with header guards (#1688, @tbsaunde)
    • mklove: add LIBS in reverse order to maintain dependency order
    • Fix build when python is not available #1358
  • v0.11.3(Dec 4, 2017)

    Maintenance release

    Default changes

    • Change default queue.buffering.max.kbytes and queued.max.messages.kbytes to 1GB (#1304)
    • win32: Use sasl.kerberos.service.name for broker principal, not sasl.kerberos.principal (#1502)

    Enhancements

    • Default producer message offsets to OFFSET_INVALID rather than 0
    • new nuget package layout + debian9 librdkafka build (#1513, @mhowlett)
    • Allow for calling rd_kafka_queue_io_event_enable() from the C++ world (#1483, @akhi3030)
    • rdkafka_performance: allow testing latency with different size messages (#1482, @tbsaunde)

    Fixes

    • Improved stability on termination (internal queues, ERR__DESTROY event)
    • offsets_for_times() now returns ERR__TIMED_OUT if brokers did not respond in time
    • Let list_groups() return ERR__PARTIAL with a partial group list (#1508)
    • Properly handle infinite (-1) rd_timeout values throughout the code (#1539)
    • Fix offsets_store() return value when at least one partition is valid
    • portability: rdendian: add le64toh() alias for older glibc (#1463)
    • Add MIPS build and fix CRC32 to work on big endian CPUs (@andoma, closes #1498)
    • osx: fix endian checking for software crc32c
    • Fix comparison in rd_list_remove_cmp (closes #1493)
    • stop calling cnd_timedwait() with a timeout of 0 (#1481, @tbsaunde)
    • Fix DNS cache logic broker.address.ttl (#1491, @dacjames)
    • Fix broker thread "hang" in CONNECT state (#1397)
    • Reset rkb_blocking_max_ms on broker DOWN to avoid busy-loop during CONNECT (#1397)
    • Fix memory leak when producev() fails (#1478)
    • Raise cmake minimum version to 3.2 (#1460)
    • Do not assume LZ4 worst (best?) case 255x compression (#1446 by @tudor)
    • Fix ALL_BROKERS_DOWN re-generation (fix by @ciprianpascu, #1101)
    • rdkafka-performance: busy-wait when waiting for short periods of time
  • v0.11.1(Oct 17, 2017)

    Maintenance release

    NOTE: If you are experiencing poor producer performance, try setting the linger.ms configuration property to 100 (ms).
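
    For reference, a minimal sketch of applying that setting on a configuration object before creating the producer; the helper name is illustrative:

      #include <librdkafka/rdkafka.h>
      #include <stdio.h>

      /* Sketch: widen the producer's internal batching window to 100 ms */
      static rd_kafka_conf_t *conf_with_linger(void) {
          char errstr[512];
          rd_kafka_conf_t *conf = rd_kafka_conf_new();

          if (rd_kafka_conf_set(conf, "linger.ms", "100",
                                errstr, sizeof(errstr)) != RD_KAFKA_CONF_OK)
              fprintf(stderr, "%s\n", errstr);
          return conf;
      }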

    Noteworthy critical bug fixes

    • Fix OpenSSL instability on Windows (fix thread id callback) - the bug resulted in SSL connections being torn down for no apparent reason.
    • Fetch response fix: read all MessageSets, not just the first one. (#1384) - Huge performance degradation (compared to v0.9.5) when fetching small batches.

    Enhancements

    • Add api.version.request.timeout.ms (#1277, thanks to @vinipuh5)
    • Point users to documentation when attempting to use Java security properties (#1412)

    Fixes

    • Adjust log level for partial message reads when debug is enabled (#1433)
    • Allow app metadata() requests regardless of outstanding metadata requests (#1430)
    • Proper size calculation from flags2str to include the NUL terminator (#1334)
    • Thread-safe rd_strerror() (#1410)
    • Proper re-sends of ProduceRequests in-transit on connection down (#1390)
    • Producer: invalid offsets reported back when produce.offset.report=false
    • Consumer: null message Key/Value were handled incorrectly (regression in v0.11.0, #1386)
    • Fix metadata querying for topics with LEADER_UNAVAIL set (#1313)
    • Treat request __TIMED_OUT as a retryable error
    • Let timed out ProduceRequests result in MSG_TIMED_OUT error code for messages
    • Fix crash on leader rejoin when outstanding assignor metadata (#1371)
    • sasl_cyrus: Fix dangling stack ptr to sasl_callback_t (#1329, thanks to @gnpdt)
  • v0.11.0(Jul 19, 2017)

    Feature release

    v0.11.0 is a new feature release of librdkafka with support for the new Kafka message format (MsgVersion 2) which makes librdkafka (and any librdkafka-based clients) transparently compatible for use with the EOS (Exactly-Once-Semantics) supporting Java client released with Apache Kafka v0.11.0.

    This release also includes enhancements and fixes as listed below.

    NOTE: While librdkafka implements the new Message version and features, it does not yet implement the EOS (Exactly-Once-Semantics) functionality itself.

    NOTE: The librdkafka C++ API is unfortunately not ABI safe (the API stability is guaranteed though): C++ users will need to recompile their applications when upgrading librdkafka.

    Upgrade notes

    api.version.request: The api.version.request property (see https://github.com/edenhill/librdkafka/wiki/Broker-version-compatibility) default value has changed from false to true, meaning that librdkafka will make use of the latest protocol features of the broker without the need to set the property to true explicitly on the client.

    WARNING: Due to a bug in Apache Kafka 0.9.0.x, the ApiVersionRequest (as sent by the client when connecting to the broker) will be silently ignored by the broker, causing the request to time out after 10 seconds. This causes client-broker connections to stall for 10 seconds during connection setup before librdkafka falls back on the broker.version.fallback protocol features. The workaround is to explicitly configure api.version.request to false on clients communicating with <=0.9.0.x brokers.

    Producer: The default value of queue.buffering.max.ms was changed from 1000ms to 0ms (no delay). This property denotes the internal buffering time (and latency) for messages produced.
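
    To make the upgrade notes concrete, here is a minimal sketch of configuring a client for <=0.9.0.x brokers and of restoring the previous buffering default; the property values are illustrative and error handling is omitted:

      #include <librdkafka/rdkafka.h>

      /* Sketch: settings for clients talking to <=0.9.0.x brokers,
       * per the upgrade notes above. Error handling omitted. */
      static void configure_for_old_brokers(rd_kafka_conf_t *conf) {
          char errstr[512];

          /* Avoid the 10s ApiVersionRequest stall on 0.9.0.x brokers */
          rd_kafka_conf_set(conf, "api.version.request", "false",
                            errstr, sizeof(errstr));
          rd_kafka_conf_set(conf, "broker.version.fallback", "0.9.0",
                            errstr, sizeof(errstr));

          /* Optionally restore the pre-v0.11.0 buffering default of 1000 ms */
          rd_kafka_conf_set(conf, "queue.buffering.max.ms", "1000",
                            errstr, sizeof(errstr));
      }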

    Features

    • Added support for MsgVersion v2 (message format of KIP-98) - message format compatible with EOS clients
    • Added support for client interceptors
    • Added support for dynamically loaded plugins (plugin.library.paths, for use with interceptors)
    • Added SASL SCRAM support (KIP-84)
    • Added builtin SASL PLAIN provider (for Win32, #982)

    Enhancements

    • Deprecate errno usage, use rd_kafka_last_error() instead (see the sketch after this list).
    • Deprecate rd_kafka_wait_destroyed().
    • Implemented per-partition Fetch backoffs; previously all partitions for the given broker were backed off.
    • Added updated Kafka protocol and error enums
    • Added rd_kafka_message_latency()
    • Added rd_kafka_clusterid() and rd_kafka_type()
    • SSL: set default CA verify locations if ssl.ca.location is not specified
    • C++: add yield() method
    • Added support for stats as events (#1171)
    • Build with system liblz4 if available, else fall back on built-in lz4, for improved portability.
    • Use SNI when connecting through SSL (@vincentbernat)
    • Improve broker thread responsiveness, decreasing internal latency
    • Improve OpenSSL config error propagation (#1119)
    • Prioritize all relevant user-facing ops (callbacks) over messages on poll queue (#1088)
    • Added global->topic config fallthru: default topic config properties can now be set effortlessly on global config object.
    • Log offset commit failures when there is no offset_commit_cb (closes #1043)
    • Add CRC checking support to consumer (#1056)
    • C++: Added seek() support to KafkaConsumer
    • Added rd_kafka_conf_dup_filter() to selectively copy a config object.
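
    As a sketch of the rd_kafka_last_error() usage mentioned above; rkt, payload and len are assumed to exist and the helper name is illustrative:

      #include <librdkafka/rdkafka.h>
      #include <stdio.h>

      /* Sketch: consult the thread-local last error instead of errno
       * after a failed produce call. */
      static void produce_or_report(rd_kafka_topic_t *rkt,
                                    void *payload, size_t len) {
          if (rd_kafka_produce(rkt, RD_KAFKA_PARTITION_UA,
                               RD_KAFKA_MSG_F_COPY,
                               payload, len, NULL, 0, NULL) == -1)
              fprintf(stderr, "produce failed: %s\n",
                      rd_kafka_err2str(rd_kafka_last_error()));
      }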

    Fixes

    • Avoid _ALIGN re-definition on BSD (#1225)
    • rdkafka_performance: exit with code 1 if not all messages were delivered
    • Fix endianness issues that were causing snappy to compress incorrectly (#1219, @rthalley)
    • Fix stability on AIX (#1211)
    • Document that rd_kafka_message_errstr() must not be used on producer
    • Add support for re-queuing half-processed ops to honour yield()
    • Handle null Protocol in JoinGroupResponse (#1193)
    • Consumer: Proper relative offset handling (#1192, @rthalley)
    • OSX: silence libsasl deprecated warnings
    • partition count should be per topic in offset request buffer (closes #1199, @matthew-d-jones)
    • fix build on SmartOS (#1186 by @misterdjules)
    • ERR_remove_thread_state OpenSSL version checking
    • Don't emit TIMED_OUT_QUEUE for timed out messages (revert)
    • producev() default partition should be UA, not 0 (#1153)
    • Fix SaslHandshakeRequest timeout to 10s
    • SASL: fix memory leak: received SASL auth frames were not freed
    • Clean up OpenSSL per-thread memory on broker thread exit
    • Properly auto-set metadata.max.age.ms when metadata.refresh.interval.ms is disabled (closes #1149)
    • Fix memory alignment issues (#1150)
    • configure: auto add brew openssl pkg-config path
    • Fix consumer_lag calculation (don't use cached hi_offset)
    • rdkafka_example: fix message_errstr usage, not allowed on producer
    • Avoid use of partially destroyed topic object (#1125)
    • Improve reconnect delay handling (#1089)
    • C++: fix conf->get() allocation (closes #1118)
    • Use app_offset to calculate consumer_lag (closes #1112)
    • Fix retrybuf memory leak on termination when broker is down
    • Fix small memory leak in metadata_leader_query
    • Fix use-after-free when log.queue and debug was used
    • consumer_example: fix crash on -X dump (closes #841)
    • Added rd_kafka_offsets_store() (KafkaConsumer::offsets_store) (closes #826)
    • Optimize broker id lookups (closes #523)
    • Don't log broker failures when an error_cb is registered (closes #1055)
    • Properly log SSL connection close (closes #1081)
    • Win32 SASL GSSAPI: protection level and message size were not sent
    • C++: improved error reporting from Conf::set()
    • Flush partition fetch buffer on seek (from decide())
    • Fix recursive locking on periodic refresh of leader-less partition (closes #1311)
  • v0.9.5(Apr 18, 2017)

    Maintenance release

    Critical fixes

    • q_concat: don't wakeup listeners if srcq is empty (fix by @orthrus in #1121): Fixes idle consumer CPU usage on Windows
    • Prioritize commit_cb over messages on poll queue (closes #1088)
    • Prioritize all relevant user-facing ops (callbacks) (#1088)
    • Fix SaslHandshakeRequest timeout

    Fixes

    • Properly log SSL connection close (closes #1081)
    • Don't log broker failures when an error_cb is registered (closes #1055)
    • Fix use-after-free when log.queue and debug was used
    • Log offset commit failures when there is no offset_commit_cb (closes #1043)
    • Fix retrybuf memory leak on termination when broker is down
    • Fix small memory leak in metadata_leader_query
    • Use app_offset to calculate consumer_lag (closes #1112)
    • Fix consumer_lag calculation (don't use cached hi_offset)
    • C++: fix conf->get() allocation (closes #1118)
    • Avoid use of partially destroyed topic object (#1125, fixed by @benli123)
    • sasl win32: protection level and message size not sent (fixed by @zyzil)
    • Properly auto-set metadata.max.age.ms when metadata.refresh.interval.ms is disabled (closes #1149)
    • Fix memory alignment issues (#1150)
    • consumer_example: fix crash on -X dump (closes #841)
    • configure: auto add brew openssl pkg-config path
    • producev() default partition should be UA, not 0 (#1153)

    Enhancements

    • Added global->topic config fallthru: Topic-level configuration properties can now be set on the global configuration object. The property will be applied on the default_topic_conf object (if no such object exists one is created automatically).
    • Use SNI when connecting through SSL (by @vincentbernat )
    • Windows performance improvements (use atomics instead of locks, avoid locking in some cases)
    • Add lz4/lib sources for in-tree building when external lz4 is not available
    • Add CRC checking support to consumer (#1056)
    • Added op priority to queues (for #1088)
    • Fail known unsupported requests locally (closes #1091)
    • Improve reconnect delay handling (#1089)
    • Improve OpenSSL config error propagation (#1119)
    • C++: improved error reporting from Conf::set()
    • Don't emit TIMED_OUT_QUEUE (revert)
  • v0.9.4(Feb 26, 2017)

    librdkafka v0.9.4 is a small feature release with a horde of enhancements and fixes.

    NOTE: The librdkafka C++ API is unfortunately not ABI safe (the API stability is guaranteed though): C++ users will need to recompile their applications when upgrading librdkafka.

    New features

    • SASL: Windows SASL GSSAPI support using native Windows SSPI (#888)
    • Added rd_kafka_offsets_for_times (KIP-79, #842); see the sketch after this list
    • Added rd_kafka_producev() using va-args to construct message fields, including timestamps (#858, #707, #908, #345)
    • C++: added produce(..., timestamp, ..)
    • Added support for partition-specific queues (@lambdaknight)
      • allows redirecting partitions to separate queues and thus to separate application threads.
    • True low latency (on Linux and OSX) through fd-based queue wakeups
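
    A minimal sketch of the new rd_kafka_offsets_for_times() lookup; the topic name, partition and timestamp are placeholders, and per-partition .err checking is omitted for brevity:

      #include <librdkafka/rdkafka.h>
      #include <stdio.h>

      /* Sketch: find the earliest offset at or after a given timestamp */
      static void offsets_at_time(rd_kafka_t *rk, int64_t timestamp_ms) {
          rd_kafka_topic_partition_list_t *parts =
              rd_kafka_topic_partition_list_new(1);

          /* On input, .offset carries the timestamp (ms) to query for */
          rd_kafka_topic_partition_list_add(parts, "my-topic", 0)->offset =
              timestamp_ms;

          if (!rd_kafka_offsets_for_times(rk, parts, 10 * 1000 /* ms */))
              printf("offset at t=%lld: %lld\n",
                     (long long)timestamp_ms,
                     (long long)parts->elems[0].offset);

          rd_kafka_topic_partition_list_destroy(parts);
      }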

    Critical fixes

    • leader failover error detection and recovery for both producer and consumer
    • cgrp: failover robustness improvements
    • cgrp: certain failed Heartbeats could mask out future heartbeat requests
    • buffer handle callbacks could previously be triggered on wrong thread

    Enhancements

    • Events: expose EVENT_OFFSET_COMMIT
    • Experimental cmake support (thanks to @ruslo)
    • Validate topics in subscribe() (mainly regex validation)
    • Added set_log_queue() - allows application thread log handling (#355)
    • Add support for Metadata v1..2 (part of KIP-4)
    • New metadata request and caching framework:
      • much fewer metadata requests
      • use cached metadata when deemed possible
      • KIP-4 support for lean metadata requests in clusters with many topics
      • improved (faster, more reliable) leader lookups
    • C++: added TopicPartition::create(, ..offset)
    • C++: added const attribute to TopicPartition getters
    • C++: added TopicPartition::destroy(vector) helper
    • C++: added commitSync(.., OffsetCommitCb) variant
    • C++: Enrichment of Conf getter for callbacks. (#883, @vin-d)
    • Expose rd_kafka_topic_partition_list_sort()
    • Added max.in.flight -> max.in.flight.requests.per.connection alias
    • Win32: add lz4 support (using NuGet lz4 package)
    • Add cgrp stats: rebalance_age, rebalance_cnt, assignment_size

    Fixes

    • increase maximum supported topic limit to 1M
    • fix static linking on osx
    • Fix infinite wait in .._commit_queue() if callback on temp queue was used
    • Improved handling of assign() and outstanding rebalance_cb on termination
    • commit_queue: avoid global offset_commit_cb if local specified
    • Improved Heartbeat error handling
    • Improve OffsetCommit failure handling
    • Query topics with missing or down leaders from topics_scan()
    • Handle rebalance from consumer_close (#1002)
    • rd_kafka_producev() would send 0 timestamp by default instead of current time
    • queue.buffering.max.ms and fetch.error.backoff.ms now take precedence over socket.blocking.max.ms (#966)
    • flush() fix: msg_cnt not initialized
    • rdkafka_performance: fix multi -p partitions allocation error and memory leak
    • Refactor query_watermark_offsets to use new leader lookup code
    • Proper error propagation on invalid topic names (#977)
    • callbacks will no longer cut consumer_poll() timeouts short
    • improved queue handling and serving
    • rd_kafka_producev(): fix topic/rkt referencing
    • Fix ApiVersionRequest timeout (10s)
    • rd_kafka_q_concat: fix erroneous unlock() on enq failure
    • Speed up termination by emitting TERMINATE ops
    • cgrp: enforce metadata update check if matched subscribed topics is empty
    • Proper suppression of connect failure logs (#847)
    • Proper ATOMIC_OPn (#708)
    • fix: use of out-of-scope variable (#974, @HeChuanXUPT)
    • C++: Fix undefined references to static const ints (#961, #962, @bbfgelman1)
    • mklove: lib_check now attempts compilation after pkg-config passes (to catch false positives, e.g. SSL on OSX)
    • mklove: fix ./configure --reconfigure
    • stats: fix brokers.toppars key duplication