For design of the tests, see https://grpc.io/docs/guides/benchmarking.
This document describes how to run gRPC end-to-end benchmarks using the gRPC OSS benchmarks framework (recommended) or how to run them manually (for experts only).
The scripts in this section generate LoadTest configurations for the GKE-based gRPC OSS benchmarks framework. This framework is stored in a separate repository, grpc/test-infra.
These scripts, together with tools defined in grpc/test-infra, are used in the continuous integration setup defined in grpc_e2e_performance_gke.sh and grpc_e2e_performance_gke_experiment.sh.
The benchmarks framework uses the same test scenarios as the legacy one. The script scenario_config_exporter.py can be used to export these scenarios to files, and also to count and analyze existing scenarios.
The language(s) and category of the scenarios are of particular importance to the tests. Continuous runs will typically run tests in the scalable category.
The following example counts scenarios in the scalable category:
$ ./tools/run_tests/performance/scenario_config_exporter.py --count_scenarios --category=scalable
Scenario count for all languages (category: scalable):
Count  Language         Client   Server   Categories
   77  c++                                scalable
   19  python_asyncio                     scalable
   16  java                               scalable
   12  go                                 scalable
   12  node                      node     scalable
   12  node_purejs               node     scalable
    9  csharp                             scalable
    7  python                             scalable
    5  ruby                               scalable
    4  csharp                    c++      scalable
    4  php7                      c++      scalable
    4  php7_protobuf_c           c++      scalable
    3  python_asyncio            c++      scalable
    2  ruby                      c++      scalable
    2  python                    c++      scalable
    1  csharp           c++               scalable
  189  total scenarios (category: scalable)
Client and server languages are only set for cross-language scenarios, where the client or server language does not match the scenario language.
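The rule above can be illustrated with a small Python sketch. The scenario records and their field names (language, client_language, server_language) are hypothetical stand-ins, not the exporter's actual data model; the sketch only shows that a scenario counts as cross-language when a client or server language is set.

```python
from collections import Counter

# Hypothetical scenario records; the field names are assumptions for illustration.
scenarios = [
    {"language": "go", "client_language": None, "server_language": None},
    {"language": "python", "client_language": None, "server_language": "c++"},
    {"language": "python", "client_language": None, "server_language": "c++"},
    {"language": "csharp", "client_language": "c++", "server_language": None},
]


def is_cross_language(scenario):
    """A scenario is cross-language if a client or server language is set."""
    return bool(scenario["client_language"] or scenario["server_language"])


def count_scenarios(scenarios):
    """Count scenarios per (language, client, server) combination."""
    return Counter(
        (s["language"], s["client_language"] or "", s["server_language"] or "")
        for s in scenarios
    )


counts = count_scenarios(scenarios)
for (language, client, server), count in counts.most_common():
    print(f"{count:5d}  {language:<16} {client:<8} {server:<8}")
```

Empty client and server columns in the printed table correspond to scenarios whose client and server both use the scenario language.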
The benchmarks framework uses LoadTest resources configured by YAML files. Each LoadTest resource specifies a driver, a server, and one or more clients to run the test. Each test runs one scenario. The scenario configuration is embedded in the LoadTest configuration. Example configurations for various languages can be found here:
https://github.com/grpc/test-infra/tree/master/config/samples
The script loadtest_config.py generates LoadTest configurations for tests running a set of scenarios. The configurations are written in multipart YAML format, either to a file or to stdout. Each configuration contains a single embedded scenario.
The LoadTest configurations are generated from a template. Any configuration can be used as a template, as long as it contains the languages required by the set of scenarios we intend to run (for instance, if we are generating configurations to run go scenarios, the template must contain a go client and a go server; if we are generating configurations for cross-language scenarios that need a go client and a C++ server, the template must also contain a C++ server; and the same for all other languages).
The LoadTests specified in the script output all have unique names and can be run by applying the test to a cluster running the LoadTest controller with kubectl apply:
$ kubectl apply -f loadtest_config.yaml
Note: The most common way of running tests generated by this script is to use a test runner. For details, see running tests.
A basic template for generating tests in various languages can be found here: loadtest_template_basic_all_languages.yaml. The following example generates configurations for Go and Java tests using this template, including tests against C++ clients and servers, and running each test twice:
$ ./tools/run_tests/performance/loadtest_config.py -l go -l java \
-t ./tools/run_tests/performance/templates/loadtest_template_basic_all_languages.yaml \
-s client_pool=workers-8core -s driver_pool=drivers \
-s server_pool=workers-8core \
-s big_query_table=e2e_benchmarks.experimental_results \
-s timeout_seconds=3600 --category=scalable \
-d --allow_client_language=c++ --allow_server_language=c++ \
--runs_per_test=2 -o ./loadtest.yaml
The script loadtest_config.py takes the following options:
- -l, --language: Language to benchmark. May be repeated.
- -t, --template: Template file used to generate the configurations.
- -s, --substitution: Substitution keys, in the format key=value. These keys are substituted while processing the template. Environment variables that are set by the load test controller at runtime are ignored by default (DRIVER_PORT, KILL_AFTER, POD_TIMEOUT). The user can override this behavior by specifying these variables as keys.
- -p, --prefix: Prefix for the test name stored in metadata.name. The prefix is also added as the prefix label in metadata.labels. The prefix defaults to the user name if not set.
- -u, --uniquifier_element: Uniquifier element for the test name. Uniquifier elements are recorded in the uniquifier annotation in metadata.annotations.
- -d
- -a, --annotation: Annotation to add to metadata.annotations, in the form key=value. May be repeated.
- -r, --regex: Regular expression used to select scenarios. Defaults to .*, i.e., select all scenarios.
- --category: Category of scenarios to select, or all. Continuous runs typically run tests in the scalable category.
- --allow_client_language: Allows cross-language scenarios with a client in the specified language, such as c++. This flag may be repeated.
- --allow_server_language: Allows cross-language scenarios with a server in the specified language, such as node or c++. This flag may be repeated.
- --instances_per_client: Number of instances of each client, indexed 0, 1, ...
- --runs_per_test: Runs each test n times, where n is the value of the flag. If n > 1, the index of each test run is added as a uniquifier element for that run.
- -o, --output: Output file name. Output is written to sys.stdout if not set.
The script adds labels and annotations to the metadata of each LoadTest configuration:
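The following labels are added to metadata.labels:
- language: The language of the LoadTest scenario.
- prefix: The prefix used in metadata.name.
The following annotations are added to metadata.annotations:
- scenario: The name of the scenario.
- uniquifier: The uniquifier used in the LoadTest name.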
Labels can be used in selectors in resource queries. Adding the prefix, in particular, allows the user (or an automation script) to select the resources started from a given run of the config generator.
Annotations contain additional information that is available to the user (or an automation script) but is not indexed and cannot be used to select objects. Scenario name and uniquifier are added to provide the elements of the LoadTest name uuid in human-readable form. Additional annotations may be added later for automation.
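As a rough illustration of selecting by label, the following Python sketch filters a list of resources by their metadata.labels. The resources, names, and prefixes are made up for the example, and the helper select_by_labels is hypothetical; it only mimics what an equality-based label selector does.

```python
# Hypothetical, trimmed-down LoadTest resources; names and values are made up.
load_tests = [
    {"metadata": {"name": "alice-aaa", "labels": {"language": "go", "prefix": "alice"}}},
    {"metadata": {"name": "alice-bbb", "labels": {"language": "java", "prefix": "alice"}}},
    {"metadata": {"name": "bob-ccc", "labels": {"language": "go", "prefix": "bob"}}},
]


def select_by_labels(resources, **wanted):
    """Return resources whose labels match every requested key=value pair."""
    selected = []
    for resource in resources:
        labels = resource["metadata"].get("labels", {})
        if all(labels.get(key) == value for key, value in wanted.items()):
            selected.append(resource)
    return selected


names = [r["metadata"]["name"] for r in select_by_labels(load_tests, prefix="alice")]
print(names)  # ['alice-aaa', 'alice-bbb']
```

With kubectl, the same kind of selection is expressed with the -l key=value (--selector) flag. Annotations have no equivalent: they can be read from a resource but not used to query for it.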
The LoadTest configuration generator can process multiple languages at a time, assuming that they are supported by the template. The convenience script loadtest_concat_yaml.py is provided to concatenate several YAML files into one, so configurations generated by multiple generator invocations can be concatenated into one and run with a single command. The script can be invoked as follows:
$ loadtest_concat_yaml.py -i infile1.yaml infile2.yaml -o outfile.yaml
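To make the idea of concatenation concrete, here is a minimal Python sketch of joining YAML documents into one multipart stream. It is not the implementation of loadtest_concat_yaml.py; the function concat_yaml and the sample documents are assumptions for illustration, and the sketch treats each input as opaque text rather than parsing it.

```python
def concat_yaml(documents):
    """Join YAML documents into one multipart stream separated by '---'."""
    parts = []
    for doc in documents:
        body = doc.strip("\n")
        # Drop a leading document marker so separators are not doubled.
        if body == "---":
            body = ""
        elif body.startswith("---\n"):
            body = body[len("---\n"):]
        if body:
            parts.append(body)
    if not parts:
        return ""
    return "---\n" + "\n---\n".join(parts) + "\n"


# Hypothetical inputs standing in for the contents of infile1.yaml and infile2.yaml.
combined = concat_yaml(["a: 1\n", "---\nb: 2\n"])
print(combined, end="")
```

Inputs that already contain several documents pass through unchanged, since interior separators are left as they are.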
The script loadtest_examples.sh is provided to generate example load test configurations in all supported languages. This script takes only one argument, which is the output directory where the configurations will be created. The script produces a set of basic configurations, as well as a set of template configurations intended to be used with prebuilt images.
The examples in the repository grpc/test-infra are generated by this script.
The script loadtest_template.py generates a load test configuration template from a set of load test configurations. The source files may be load test configurations or load test configuration templates. The generated template supports all languages supported in any of the input configurations or templates.
The example template in loadtest_template_basic_all_languages.yaml was generated from the example configurations in grpc/test-infra by the following command:
$ ./tools/run_tests/performance/loadtest_template.py \
-i ../test-infra/config/samples/*_example_loadtest.yaml \
--inject_client_pool --inject_server_pool \
--inject_big_query_table --inject_timeout_seconds \
-o ./tools/run_tests/performance/templates/loadtest_template_basic_all_languages.yaml \
--name basic_all_languages
The example template with prebuilt images in loadtest_template_prebuilt_all_languages.yaml was generated by the following command:
$ ./tools/run_tests/performance/loadtest_template.py \
-i ../test-infra/config/samples/templates/*_example_loadtest_with_prebuilt_workers.yaml \
--inject_client_pool --inject_driver_image --inject_driver_pool \
--inject_server_pool --inject_big_query_table --inject_timeout_seconds \
-o ./tools/run_tests/performance/templates/loadtest_template_prebuilt_all_languages.yaml \
--name prebuilt_all_languages
The script loadtest_template.py takes the following options:
- -i, --inputs: Input files (load test configurations or templates).
- -o, --output: Output file name. Output is written to sys.stdout if not set.
- --inject_client_pool: If set, the pool of the clients in spec.clients is set to ${client_pool}, for later substitution.
- --inject_driver_image: If set, the image of the driver in spec.drivers is set to ${driver_image}, for later substitution.
- --inject_driver_pool: If set, the pool of the driver is set to ${driver_pool}, for later substitution.
- --inject_server_pool: If set, the pool of the servers in spec.servers is set to ${server_pool}, for later substitution.
- --inject_big_query_table: If set, the BigQuery table is set to ${big_query_table}.
- --inject_timeout_seconds: If set, spec.timeoutSeconds is set to ${timeout_seconds}.
- --inject_ttl_seconds: If set, spec.ttlSeconds is set to ${ttl_seconds}.
- -n, --name: Name to be set in metadata.name.
- -a, --annotation: Annotation to be set in metadata.annotations, in the form key=value. May be repeated.
The options that inject substitution keys are the most useful for template reuse. When running tests on different node pools, it becomes necessary to set the pool, and usually also to store the data in a different table. When running as part of a larger collection of tests, it may also be necessary to adjust test timeout and time-to-live, to ensure that all tests have time to complete.
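The interplay between injected keys and later substitution can be sketched with Python's string.Template, used here only as a stand-in: the scripts' actual substitution code may differ. The template fragment and its field layout are hypothetical; the values are taken from the loadtest_config.py example earlier in this document.

```python
from string import Template

# Hypothetical template fragment; the field layout is an assumption.
template_text = (
    "clients:\n"
    "- pool: ${client_pool}\n"
    "timeoutSeconds: ${timeout_seconds}\n"
    "command: run --driver_port=${DRIVER_PORT}\n"
)

# Keys as they would be passed with -s key=value.
substitutions = {"client_pool": "workers-8core", "timeout_seconds": "3600"}

# safe_substitute leaves keys without a value (here ${DRIVER_PORT}) untouched,
# so variables meant to be resolved at runtime survive template processing.
rendered = Template(template_text).safe_substitute(substitutions)
print(rendered, end="")
```

Leaving unknown keys in place is what allows one template to be reused across node pools and result tables: only the keys supplied for a given run are filled in.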
The template name is replaced again by loadtest_config.py, and so is set only as a human-readable memo.
Annotations, on the other hand, are passed on to the test configurations, and may be set to values or to substitution keys in themselves, allowing future automation scripts to process the tests generated from these configurations in different ways.
Collections of tests generated by loadtest_config.py are intended to be run with a test runner. The code for the test runner is stored in a separate repository, grpc/test-infra.
The test runner applies the tests to the cluster, and monitors the tests for completion while they are running. The test runner can also be set up to run collections of tests in parallel on separate node pools, and to limit the number of tests running in parallel on each pool.
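The scheduling behavior described above can be sketched in a few lines of Python. This is not the test runner from grpc/test-infra; the function run_collection, the test records, and the pool name workers-32core are hypothetical, and run_one stands in for applying a LoadTest and waiting for it to finish.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def run_collection(tests, run_one, max_parallel_per_pool):
    """Run tests grouped by pool, capping concurrency within each pool."""
    by_pool = defaultdict(list)
    for test in tests:
        by_pool[test["pool"]].append(test)

    # One executor per pool: pools proceed in parallel with each other,
    # while max_workers bounds how many tests run at once on a given pool.
    executors = {
        pool: ThreadPoolExecutor(max_workers=max_parallel_per_pool)
        for pool in by_pool
    }
    try:
        futures = {
            test["name"]: executors[pool].submit(run_one, test)
            for pool, pool_tests in by_pool.items()
            for test in pool_tests
        }
        # result() blocks until the corresponding test has completed.
        return {name: future.result() for name, future in futures.items()}
    finally:
        for executor in executors.values():
            executor.shutdown()


# Hypothetical tests; a real run_one would apply the LoadTest and poll its status.
tests = [
    {"name": "go-test-0", "pool": "workers-8core"},
    {"name": "go-test-1", "pool": "workers-8core"},
    {"name": "java-test-0", "pool": "workers-32core"},
]
results = run_collection(tests, lambda test: "done", max_parallel_per_pool=1)
print(results)
```

Using one bounded executor per pool keeps the per-pool limit independent of how many pools are in use.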
For more information, see the tools README in grpc/test-infra.
For usage examples, see the continuous integration setup defined in grpc_e2e_performance_gke.sh and grpc_e2e_performance_gke_experiment.sh.
This approach is much more involved than using the gRPC OSS benchmarks framework (see above), but can still be useful for hands-on low-level experiments (especially when you know what you are doing).
In general, the benchmark worker and driver build scripts expect linux_performance_worker_init.sh to have been run already.
The run_performance_tests.py top-level runner script can also be used with remote machines, but for tasks such as profiling the server, it might be useful to run workers manually.
You'll need a "driver" and separate "worker" machines. For example, you might use one GCE "driver" machine and 3 other GCE "worker" machines that are in the same zone.
Connect to each worker machine and start up a benchmark worker with a "driver_port".
These are simpler since they all live in the main grpc repo.
$ cd <grpc_repo_root>
$ tools/run_tests/performance/build_performance.sh
$ tools/run_tests/performance/run_worker_<language>.sh
Note that there is one "run_worker" script per language, e.g., run_worker_csharp.sh for c#.
You'll need the grpc-java repo.
$ cd <grpc-java-repo>
$ ./gradlew -PskipCodegen=true -PskipAndroid=true :grpc-benchmarks:installDist
$ benchmarks/build/install/grpc-benchmarks/bin/benchmark_worker --driver_port <driver_port>
You'll need the grpc-go repo.
$ cd <grpc-go-repo>/benchmark/worker && go install
$ # if profiling, it might be helpful to turn off inlining by building with "-gcflags=-l"
$ $GOPATH/bin/worker --driver_port <driver_port>
Connect to the driver machine (if using a remote driver) and from the grpc repo root:
$ tools/run_tests/performance/build_performance.sh
Get the "scenario_json" relevant for the scenario to run. Note that "scenario json" configs are generated from scenario_config.py. The driver takes a list of these configs as a JSON string of the form {"scenarios": <json_list_of_scenarios>} in its --scenarios_json command argument. One quick way to get a valid JSON string to pass to the driver is to run run_performance_tests.py locally and copy the logged scenario JSON command argument.
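For readers who prefer to assemble the argument by hand, the following Python sketch wraps a list of scenarios in a top-level object and quotes it for the shell. The scenario shown is a hypothetical placeholder (real scenarios come from scenario_config.py and carry many more fields), and the top-level key scenarios is an assumption that should be checked against the command logged by run_performance_tests.py.

```python
import json
import shlex

# Hypothetical scenario; real ones are generated from scenario_config.py.
scenario = {"name": "example_scenario", "num_servers": 1}

# Assumption: the driver expects the list under a top-level "scenarios" key.
scenarios_json = json.dumps({"scenarios": [scenario]})

# shlex.quote makes the JSON safe to paste into a shell command line.
print("--scenarios_json=" + shlex.quote(scenarios_json))
```

Quoting matters because the JSON contains double quotes and spaces that the shell would otherwise split or strip.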
From the grpc repo root:
- Set the QPS_WORKERS environment variable to a comma-separated list of worker machines. Note that the driver will start the "benchmark server" on the first entry in the list, and the rest will be told to run as clients against the benchmark server.
Example running and profiling of go benchmark server:
$ export QPS_WORKERS=<host1>:10000,<host2>:10000,<host3>:10000
$ bins/opt/qps_json_driver --scenarios_json='<scenario_json_scenario_config_string>'
While running the benchmark, a profiler can be attached to the server.
Example to count syscalls in grpc-go server during a benchmark:
Connect to server machine and run:
$ netstat -tulpn | grep <driver_port> # to get pid of worker
$ perf stat -p <worker_pid> -e syscalls:sys_enter_write # stop after test complete
Example memory profile of grpc-go server, with go tool pprof:
After a run is done on the server, see its alloc profile with:
$ go tool pprof --text --alloc_space http://localhost:<pprof_port>/debug/pprof/heap
Consuming process: qps_worker
Type: integer (number of seconds)
This can be used to configure the amount of time that benchmark clients wait for channels to the benchmark server to become ready. This is useful in certain benchmark environments in which the server can take a long time to become ready. Note: if setting this to a high value, then the scenario config under test should probably also have a large "warmup_seconds".
Consuming process: qps_json_driver
Type: comma separated list of host:port
Set this to a comma-separated list of QPS worker processes/machines. Each scenario in a scenario config specifies a certain number of servers, num_servers, and the driver will start "benchmark servers" on the first num_servers host:port pairs in the comma-separated list. The rest will be told to run as clients against the benchmark server.
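The assignment of workers to roles can be sketched as follows. This is an illustration in Python rather than the driver's own code; the function split_workers and the host names are hypothetical.

```python
def split_workers(qps_workers, num_servers):
    """Split a QPS_WORKERS value into (servers, clients) host:port lists."""
    workers = [w.strip() for w in qps_workers.split(",") if w.strip()]
    if num_servers > len(workers):
        raise ValueError("not enough workers for the requested servers")
    # The first num_servers entries host benchmark servers; the rest are clients.
    return workers[:num_servers], workers[num_servers:]


# Hypothetical hosts, following the host:port format described above.
servers, clients = split_workers("host1:10000,host2:10000,host3:10000", 1)
print("servers:", servers)
print("clients:", clients)
```

Because roles depend only on position, reordering the list is enough to move the benchmark server to a different machine.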