我正在尝试使用Beam编程框架(PythonSDK)从Pub/Sub主题流式传输消息,并将它们写到控制台。
这是我的代码(使用apache-bin==2.27.0
):
import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions
TOPIC_PATH = "projects/<project-id>/topics/<topic-id>"
def run(pubsub_topic):
options = PipelineOptions(
streaming=True
)
runner = 'DirectRunner'
print("I reached before pipeline")
with beam.Pipeline(runner, options=options) as pipeline:
(
pipeline
| "Read from Pub/Sub topic" >> beam.io.ReadFromPubSub(topic=pubsub_topic)
| "Writing to console" >> beam.Map(print)
)
print("I reached after pipeline")
result = pipeline.run()
result.wait_until_finish()
run(TOPIC_PATH)
然而,当我执行这个管道时,我得到这个TypeError:
ERROR:apache_beam.runners.direct.executor:Exception at bundle <apache_beam.runners.direct.bundle_factory._Bundle object at 0x1349763c0>, due to an exception.
TypeError: create_subscription() takes from 1 to 2 positional arguments but 3 were given
最后它说:
ERROR:apache_beam.runners.direct.executor:Giving up after 4 attempts.
我不确定,我做错了什么,提前感谢你的帮助。
我不知道错误的确切位置,但是您可以考虑使用以下Beam示例之一作为模型并从那里开始吗?
https://github.com/apache/beam/blob/master/sdks/python/apache_beam/io/gcp/pubsub_it_pipeline.py
https://github.com/apache/beam/blob/release-2.27.0/sdks/python/apache_beam/examples/snippets/snippets.py#L684
当我通过“pip install apache-光束”安装时,我也遇到了同样的问题。当我切换到“pip install apache-光束[gcp]”时,它对我有效,即使是DirectRunner。