Source: https://github.com/Parsely/pykafka/issues/334
@emmett9001 wrote:
@microamp Thanks, this is a great idea. There's currently no documentation on this, but to my knowledge the main differences are the specifics of the Python API and PyKafka's implementation of the BalancedConsumer. PyKafka strives to keep the API as pythonic as possible, which means using useful features of the language where appropriate for client code simplicity. This includes things like context managers for object cleanup and futures for asynchronous error handling. PyKafka's balanced consumer implements the Kafka project's notion of the "high level consumer", which uses ZooKeeper to balance consumption of partitions between multiple nodes in a consumer group. From what I understand, kafka-python is waiting until Kafka 0.9, when this functionality will be supported natively by the Kafka server itself, to implement self-balancing consumers.
Also, the last time we did a speed test (which was admittedly a while ago at this point), PyKafka's consumer outperformed kafka-python. I unfortunately no longer have the results from that test, so you may not want to bet too hard on PyKafka being significantly faster or slower - just figured I'd mention it.
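Notes:
1. PyKafka tries to keep its API Pythonic, including context managers for object cleanup and asynchronous error handling.
2. PyKafka balances consumers through ZooKeeper. kafka-python is holding off on balanced consumers until Kafka 0.9, when the Kafka server itself will provide that functionality.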
@emmett9001 wrote:
Some more research - there are differences in the versions of python supported by each library. PyKafka supports 2.7, 3.4, 3.5, and pypy, while kafka-python adds 2.6 and removes 3.5 support. kafka-python also requires a ZooKeeper connection for offset management, which PyKafka does not. kafka-python supports versions of Kafka from 0.8.0 to 0.8.2, where PyKafka only supports 0.8.2.
Notes:
1. Python versions: kafka-python supports 2.6 but not 3.5; PyKafka supports 2.7, 3.4, 3.5, and PyPy.
2. Kafka versions: kafka-python supports 0.8.0 through 0.8.2; PyKafka supports only 0.8.2.
@ottomata wrote:
A difference between kafka-python and pykafka is the producer interface. kafka-python does not require that you know the topic when instantiating the producer. This is convenient if you need to produce to topics dynamically based on input (which I do!) :)
Note:
kafka-python does not need to know the topic when the producer is instantiated. (The implication: PyKafka does?)
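To make the producer difference concrete, here is a minimal sketch rather than a definitive implementation. In PyKafka a producer is obtained from a topic object, so the topic is fixed when the producer is created; in kafka-python the topic is passed with each send. The function names, the broker address, and the topic names are hypothetical. The kafka-python half assumes the `KafkaProducer` class, which comes from kafka-python releases newer than the ones discussed in the thread. Both functions need a reachable broker to do anything useful.

```python
def produce_with_pykafka(hosts, topic_name, payloads):
    """Send payloads to ONE topic; the producer is bound to that topic."""
    from pykafka import KafkaClient  # imported lazily; requires pykafka

    client = KafkaClient(hosts=hosts)
    # Assumption: older PyKafka releases expect the topic name as bytes.
    topic = client.topics[topic_name]
    # The context manager stops the producer when the block exits.
    with topic.get_sync_producer() as producer:
        for payload in payloads:
            producer.produce(payload)


def produce_with_kafka_python(bootstrap_servers, messages):
    """Send (topic, payload) pairs; the topic is chosen per message."""
    from kafka import KafkaProducer  # imported lazily; requires kafka-python

    producer = KafkaProducer(bootstrap_servers=bootstrap_servers)
    try:
        for topic_name, payload in messages:
            producer.send(topic_name, payload)
        producer.flush()  # block until buffered messages are sent
    finally:
        producer.close()


if __name__ == "__main__":
    # Hypothetical broker address and topic names.
    produce_with_pykafka("127.0.0.1:9092", "events", [b"hello"])
    produce_with_kafka_python(
        "127.0.0.1:9092", [("events", b"hello"), ("audit", b"world")]
    )
```

With PyKafka, producing to topics chosen at runtime means creating (and probably caching) one producer per topic, which is the inconvenience @ottomata describes.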
@cscheffler wrote:
@emmett9001 @ottomata Just got pointed at this thread and thought I'd make a late contribution.
We compared pykafka and kafka-python about 2 months ago while trying to decide which one to use. In the end, the deciding factor for us was that balanced consumers were much easier to manage in pykafka.
Also, we discovered later, a pykafka producer doesn't die on Kafka broker restart, while our kafka-python producers did.
Below are performance figures from a 3-node Kafka cluster running in EC2, using a single producer or consumer. The three numbers for each test are the quartiles measured for the test.
pykafka producer: 41400 – 46500 – 50200 messages per second
pykafka consumer: 12100 – 14400 – 23700 messages per second
kafka-python producer: 26500 – 27700 – 29500 messages per second
kafka-python consumer: 35000 – 37300 – 39100 messages per second
So, for clarification, the median performance of a pykafka producer was 46500 messages per second, with a quartile range of 41400 (25th percentile) to 50200 (75th percentile). Hope that makes sense.
Notes:
1. After a broker restart, the kafka-python producer dies, while the PyKafka producer does not.
2. In this benchmark, PyKafka was faster as a producer and kafka-python was faster as a consumer.
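The benchmark figures are reported as three quartiles per test (25th percentile, median, 75th percentile). As a rough sketch of how such a summary could be computed from repeated throughput measurements, the snippet below uses Python's standard `statistics.quantiles`. The function name and the sample values are hypothetical, and the thread does not say which quartile method the original authors used.

```python
import statistics


def throughput_quartiles(samples):
    """Return (q1, median, q3) for a list of messages-per-second readings."""
    # n=4 splits the sorted data into four groups, giving three cut points.
    # method="inclusive" treats the samples as the whole population.
    q1, median, q3 = statistics.quantiles(samples, n=4, method="inclusive")
    return q1, median, q3


if __name__ == "__main__":
    # Hypothetical readings, NOT the measurements from the thread.
    readings = [100, 200, 300, 400, 500]
    q1, median, q3 = throughput_quartiles(readings)
    print(f"{q1:.0f} - {median:.0f} - {q3:.0f} messages per second")
```

Reporting quartiles rather than a single average shows how much the throughput varied between runs, which matters here because the PyKafka consumer's range (12100 to 23700) is much wider than the others.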