Enable Building of gRPC Python with Bazel
gRPC Python currently has a constellation of scripts written to build the project, but they have a lot of limitations in terms of speed and maintainability. Bazel is the open-source variant of Google's internal build system, Blaze, and is an ideal replacement for building such projects in a fast and declarative fashion. But Bazel itself is still in active development, especially in terms of Python support (amongst a few other languages).
The project aimed to fill this gap and build gRPC Python with Bazel.
Although it had been anticipated earlier, the project didn't end up requiring any contributions directly to bazelbuild/bazel. The Bazel rules for Python are currently being separated out into their own repo at bazelbuild/rules_python.
Bazel is still very much in active development for Python though. There are still challenges when it comes to building for Python 2 vs 3. Using pip packages is still experimental. Bazel Python support is currently distributed across these two repositories and is yet to begin migration to one place (which will be bazelbuild/rules_python).
Bazel's roadmap for Python is publicly available here as a Google doc.
Cross-contribution unexpectedly came up when building protobuf sources for Python, which Bazel still does not support natively. An existing repository, pubref/rules_protobuf, maintained independently of the Bazel team, helped solve this problem, but it had one major blocking issue that could not be resolved at the source. A solution proposed by user dududko worked well for Python but was not merged because of failing golang tests. Hence, a fork of this repo was made and is to be used with gRPC until the solution can be merged back at the source.
Building Cython code is still not natively supported by Bazel, but the team at cython/cython have added support for Bazel on their side. It works by including Cython as a third-party Bazel dependency and using custom Bazel rules that build our Cython code with the binary inside that dependency.
pip and PyPI remain the de facto standard for distributing Python packages. Although Bazel is versatile and offers reproducible, incremental builds, for now these benefits are available only to contributors and developers building and testing the gRPC code. There is no way yet to build Python packages for distribution.
Integration with the internal CI was one of the areas that highlighted how simple Bazel can be to use. gRPC was already using a dockerized Bazel setup to build some of its core code (but not as the primary build setup). Adding a new job on the internal CI ended up being as simple as creating a new shell script to install the required dependencies (which were python-dev and Bazel) and a new configuration file which pointed to the subdirectory (src/python) under which to look for targets and run the tests accordingly.
When writing Python packages, imports in nested modules are typically made
relative to the package root. But because of the way Bazel works, these paths
wouldn't make sense from the Workspace root. So, the folks at Bazel have added
a nifty imports parameter to all the Python rules which lets us specify, for
each target, which path to consider as the root. This parameter allows for
relative paths like imports = ["../",].
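To make the effect of a relative imports entry concrete, here is a minimal Python sketch of how such an entry could be resolved against a target's package directory. It is not Bazel's implementation: the import_roots helper is hypothetical, and the package path is only an illustrative choice. It assumes the documented behaviour that each entry is interpreted relative to the package and may not escape the workspace.

```python
import posixpath


def import_roots(package, imports):
    """Resolve each `imports` entry relative to the target's package.

    Hypothetical helper for illustration; it mirrors the documented
    behaviour of the attribute, not Bazel's actual code.
    """
    roots = []
    for entry in imports:
        root = posixpath.normpath(posixpath.join(package, entry))
        # Reject absolute paths and paths that climb above the workspace root.
        if posixpath.isabs(root) or root == ".." or root.startswith("../"):
            raise ValueError("import path escapes the workspace: %r" % entry)
        roots.append(root)
    return roots


# A target in this package with imports = ["../"] would get its parent
# directory treated as an import root.
roots = import_roots("src/python/grpcio_tests/tests/unit", ["../"])
print(roots)
```

With the parent directory as the import root, a module in the package can be imported by a path that starts from that parent rather than from the Workspace root.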
Cython code makes use of Python.h, which pulls in the Python API for C
extension modules to use, but its location depends on the Python version and
operating system the code is building on. To make this easier, the folks at
TensorFlow wrote repository rules for Python autoconfiguration. This has been
adapted with some modifications for use in gRPC Python as well.
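As a rough illustration of why autoconfiguration is needed, the following Python sketch asks the running interpreter where it expects its C headers to live. This is not the repository rule itself, only the kind of query such a rule has to make. The python_include_dir helper is a hypothetical name, and the result differs between Python versions and operating systems.

```python
import os
import sysconfig


def python_include_dir():
    """Return the directory where this interpreter expects Python.h."""
    return sysconfig.get_paths()["include"]


include_dir = python_include_dir()
header = os.path.join(include_dir, "Python.h")
print(include_dir)
# The header only exists when the development headers are installed
# (for example through a python-dev style package).
print("Python.h found:", os.path.exists(header))
```

Running this under different interpreters shows different include directories, which is the variation a hard-coded path in a build file could not handle.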
All the Bazel tests for gRPC Python can be run using a single command:
bazel test --spawn_strategy=standalone --genrule_strategy=standalone //src/python/...
If a specific test is to be run, say LoggingPoolTest (which is present in
src/python/grpcio_tests/tests/unit/framework/foundation/_logging_pool_test.py),
the command to run would be:
bazel test --spawn_strategy=standalone --genrule_strategy=standalone //src/python/grpcio_tests/tests/unit/framework/foundation:logging_pool_test
where logging_pool_test is the name of the Bazel target for this test.
Similarly, to run a particular method, use:
bazel test --spawn_strategy=standalone --genrule_strategy=standalone //src/python/grpcio_tests/tests/unit:_rpc_test --test_arg=RPCTest.testUnrecognizedMethod
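The --test_arg flag hands its value to the test executable as a command-line argument. Assuming the test file ends with a plain unittest.main() call (an assumption, not something verified here), unittest then treats RPCTest.testUnrecognizedMethod as the name of the single test to run. The sketch below imitates that with a hypothetical stand-in class and a throwaway module; none of it is the real gRPC test code.

```python
import types
import unittest


class RPCTest(unittest.TestCase):
    """Hypothetical stand-in for the real test class."""

    def testUnrecognizedMethod(self):
        self.assertEqual(1 + 1, 2)

    def testSomethingElse(self):
        self.assertTrue(True)


# Put the class in a throwaway module so unittest can resolve
# "RPCTest.testUnrecognizedMethod" by name.
fake_module = types.ModuleType("rpc_test_sketch")
fake_module.RPCTest = RPCTest


def run_with_test_args(test_args):
    """Run unittest as if test_args had arrived on the command line."""
    program = unittest.main(
        module=fake_module,
        argv=["_rpc_test"] + list(test_args),
        exit=False,
    )
    return program.result


result = run_with_test_args(["RPCTest.testUnrecognizedMethod"])
print(result.testsRun)
```

Passing no extra arguments makes unittest discover every test in the module, which corresponds to running the Bazel target without --test_arg.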
Use bazel build with the -s flag to see the logs being printed out to standard output while building.
Use bazel test with --test_output=streamed to see the test logs while testing. Something to know while using this flag is that all tests will be run locally, without sharding, one at a time.
_reconnect_test Python unit test with Bazel
testAbortedStreamStream in src/python/grpcio_tests/tests/unit/_metadata_code_details_test.py.
requirements.bazel.txt file in the repository root.