Presto Testing Tools: Benchto & Tempto (Presto Boston Meetup 10062015)

Tempto is a product test framework that allows developers to write and execute tests for SQL databases running on Hadoop. Individual test requirements, such as data generation, copying or storing generated data on HDFS, and schema creation, are expressed declaratively and fulfilled automatically by the framework. Developers can write tests in Java (using a TestNG-like paradigm and AssertJ-style assertions) or by providing query files with expected results. We will show how we use it for Presto product tests. Benchto is a benchmark framework that provides an easy and manageable way to define, run and analyze macro benchmarks in a clustered environment. Understanding the behavior of distributed systems is hard and requires good visibility into the state of the cluster and the internals of the tested system. This project was developed for repeatable benchmarking of Hadoop SQL engines, most importantly Presto.
Published on: Mar 4, 2016
Published in: Technology      
Source: www.slideshare.net


Transcripts - Presto Testing Tools: Benchto & Tempto (Presto Boston Meetup 10062015)

  • 1. Testing tools
      Wojciech.Biela@teradata.com
      Łukasz.Osipiuk@teradata.com
      Karol.Sobczak@teradata.com
  • 2. Why we need them
      ● Certified distro
      ● Enterprise support
      ● Quarterly releases
      ● Product testing - Tempto
      ● Performance testing - Benchto
  • 3. Tempto
      Product test framework
      github.com/prestodb/tempto
      Łukasz Osipiuk
      lukasz.osipiuk@teradata.com
  • 4. What is Tempto?
      ● End-to-end product testing framework
      ● Targeted to software engineers
      ● For automation
      ● Tests easy to define
      ● Focus on test code
      ● Focus on database systems
      ● So far used for testing
          ○ Presto
          ○ internal projects
  • 5. How is test defined?
      ● Java
      ● SQL convention based
  • 6. Example – Java based test

        public class SimpleQueryTest extends ProductTest {

            // Declares what the test needs: an immutable Hive table for NATION.
            private static class SimpleTestRequirements implements RequirementsProvider {
                public Requirement getRequirements(Configuration config) {
                    return new ImmutableHiveTableRequirement(NATION);
                }
            }

            @Inject
            Configuration configuration;

            @Test(groups = {"smoke", "query"})
            @Requires(SimpleTestRequirements.class)
            public void selectCountFromNation() {
                // count(*) yields exactly one row; the TPC-H nation table holds 25 rows.
                assertThat(query("select count(*) from nation"))
                        .hasRowsCount(1)
                        .hasRows(row(25));
            }
        }
  • 7. Example – Convention based test

        allRows.sql:

            -- database: hive; tables: blah
            SELECT * FROM sample_table

        allRows.result:

            -- delimiter: |; ignoreOrder: false; types: BIGINT,VARCHAR
            1|A|
            2|B|
            3|C|
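To make the result-file convention on slide 7 more concrete, here is a minimal sketch of how the `-- key: value; key: value` header and the delimited rows could be read. It is not Tempto's parser: the class `ResultFileSketch` and both method names are hypothetical, and the sketch assumes the delimiter is never `;` and that every row ends with a trailing delimiter, as in the slide.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

/** Hypothetical helper for illustration only; not part of Tempto. */
class ResultFileSketch {

    /** Parses a "-- key: value; key: value" header line into an ordered map. */
    static Map<String, String> parseHeader(String headerLine) {
        String body = headerLine.trim();
        if (!body.startsWith("--")) {
            throw new IllegalArgumentException("header must start with --");
        }
        Map<String, String> properties = new LinkedHashMap<>();
        for (String entry : body.substring(2).split(";")) {
            int colon = entry.indexOf(':');
            if (colon < 0) {
                continue; // skip empty or malformed fragments
            }
            properties.put(entry.substring(0, colon).trim(),
                    entry.substring(colon + 1).trim());
        }
        return properties;
    }

    /** Splits one data row on the delimiter, keeping the values as strings. */
    static List<String> parseRow(String line, String delimiter) {
        // Limit -1 keeps trailing empty strings, so the empty field produced
        // by the trailing delimiter can be dropped explicitly below.
        String[] parts = line.split(Pattern.quote(delimiter), -1);
        List<String> values = new ArrayList<>(Arrays.asList(parts));
        if (!values.isEmpty() && values.get(values.size() - 1).isEmpty()) {
            values.remove(values.size() - 1);
        }
        return values;
    }

    public static void main(String[] args) {
        Map<String, String> header =
                parseHeader("-- delimiter: |; ignoreOrder: false; types: BIGINT,VARCHAR");
        System.out.println(header);
        System.out.println(parseRow("1|A|", header.get("delimiter")));
    }
}
```

A real comparison would additionally convert each value according to the `types` property and honor `ignoreOrder` when matching rows; both are left out here.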
  • 8. Tempto architecture
      [Architecture diagram. Legend: user provided, library provided. Components: TestNG, TestNG listeners, utils, tests, requirements, requirement fulfillers]
  • 9. Tempto architecture
      ● Works well
      ● Extensible
      ● Well known
      [Architecture diagram repeated: TestNG, TestNG listeners, utils, tests, requirements, requirement fulfillers]
  • 10. Tempto architecture
      ● Tempto specific extension of TestNG execution framework
      ● Requirements management
      ● Tests filtering
      ● Injecting dependencies
      ● Extended logging
      [Architecture diagram repeated: TestNG, TestNG listeners, utils, tests, requirements, requirement fulfillers]
  • 11. Tempto architecture
      ● Test code :)
          ○ Java
          ○ SQL-convention based
      [Architecture diagram repeated: TestNG, TestNG listeners, utils, tests, requirements, requirement fulfillers]
  • 12. Tempto architecture
      ● Declarative requirements
      ● Fulfilled by test framework via pluggable fulfillers
      ● e.g. mutableTable(Tpch.NATION, LOADED, "hive")
      ● Test level and suite level
      ● Cleanup
      [Architecture diagram repeated: TestNG, TestNG listeners, utils, tests, requirements, requirement fulfillers]
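The idea on slide 12, that tests only declare requirements while pluggable fulfillers satisfy them and clean up afterwards, can be sketched as follows. This is an illustration under assumptions, not Tempto's API: `TableRequirement`, `Fulfiller`, `LoggingTableFulfiller` and `runWithRequirements` are hypothetical names, and the toy fulfiller only records actions instead of touching a cluster.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch of declarative requirements; names do not mirror Tempto's API. */
class RequirementsSketch {

    /** A requirement describes what a test needs, not how to provide it. */
    static final class TableRequirement {
        final String table;
        final String database;

        TableRequirement(String table, String database) {
            this.table = table;
            this.database = database;
        }
    }

    /** A pluggable fulfiller knows how to satisfy a requirement and how to undo that. */
    interface Fulfiller {
        void fulfill(TableRequirement requirement, List<String> log);

        void cleanup(TableRequirement requirement, List<String> log);
    }

    /** Toy fulfiller that records its actions in a log. */
    static final class LoggingTableFulfiller implements Fulfiller {
        @Override
        public void fulfill(TableRequirement r, List<String> log) {
            log.add("create " + r.database + "." + r.table);
        }

        @Override
        public void cleanup(TableRequirement r, List<String> log) {
            log.add("drop " + r.database + "." + r.table);
        }
    }

    /** Fulfills every requirement, runs the test body, then cleans up in reverse order. */
    static List<String> runWithRequirements(
            List<TableRequirement> requirements, Fulfiller fulfiller, Runnable testBody) {
        List<String> log = new ArrayList<>();
        List<TableRequirement> fulfilled = new ArrayList<>();
        try {
            for (TableRequirement requirement : requirements) {
                fulfiller.fulfill(requirement, log);
                fulfilled.add(requirement);
            }
            testBody.run();
            log.add("test passed");
        } finally {
            // Cleanup runs even when the test body throws.
            for (int i = fulfilled.size() - 1; i >= 0; i--) {
                fulfiller.cleanup(fulfilled.get(i), log);
            }
        }
        return log;
    }

    public static void main(String[] args) {
        List<TableRequirement> requirements = Arrays.asList(
                new TableRequirement("nation", "hive"),
                new TableRequirement("region", "hive"));
        System.out.println(
                runWithRequirements(requirements, new LoggingTableFulfiller(), () -> { }));
    }
}
```

Keeping the requirement as plain data is what lets a framework fulfill it once per suite or once per test, which is the distinction the slide draws between test level and suite level.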
  • 13. Tempto architecture
      ● extra assertions
      ● various tools
          ○ HDFS client
          ○ SSH client
          ○ JDBC query executor
      [Architecture diagram repeated: TestNG, TestNG listeners, utils, tests, requirements, requirement fulfillers]
  • 14. Executable runner

        java -jar target/presto-product-tests-0.120-SNAPSHOT-executable.jar --help
        usage: Presto product tests
            --config-local <arg>       URI to Test local configuration YAML file.
            --report-dir <arg>         Test reports directory
            --groups <arg>             Test groups to be run
            --excluded-groups <arg>    Test groups to be excluded
            --tests <arg>              Test patterns to be included
         -h,--help                     Shows help message

      ● All dependencies embedded
      ● User provides cluster details through YAML config.
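The `--groups` and `--excluded-groups` options on slide 14 select tests by group. A minimal sketch of such a selection rule is shown below, assuming TestNG-like semantics: exclusion wins, and an empty include list means "run everything not excluded". Whether the runner applies exactly this rule is an assumption; `GroupFilterSketch` and its methods are hypothetical.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical group filter; assumes TestNG-like include/exclude semantics. */
class GroupFilterSketch {

    /** Returns true when a test with the given groups should be executed. */
    static boolean shouldRun(Set<String> testGroups, Set<String> included, Set<String> excluded) {
        // Any overlap with the excluded groups rules the test out.
        if (!Collections.disjoint(testGroups, excluded)) {
            return false;
        }
        // With no include list, every remaining test runs.
        return included.isEmpty() || !Collections.disjoint(testGroups, included);
    }

    /** Convenience factory for a set of group names. */
    static Set<String> groups(String... names) {
        return new HashSet<>(Arrays.asList(names));
    }

    public static void main(String[] args) {
        Set<String> smokeQuery = groups("smoke", "query");
        System.out.println(shouldRun(smokeQuery, groups("smoke"), groups()));
        System.out.println(shouldRun(smokeQuery, groups("smoke"), groups("query")));
        System.out.println(shouldRun(smokeQuery, groups(), groups()));
    }
}
```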
  • 15. Configuration

        hdfs:
          username: hdfs
          webhdfs:
            host: master
            port: 50070

        tests:
          hdfs:
            path: /product-test

        databases:
          default:
            alias: presto
          hive:
            jdbc_driver_class: org.apache.hive.jdbc.HiveDriver
            jdbc_url: jdbc:hive2://master:10000
            jdbc_user: hdfs
            jdbc_password: na
            jdbc_pooling: false
            jdbc_jar: test-framework-hive-jdbc-all.jar
          presto:
            jdbc_driver_class: com.facebook.presto.jdbc.PrestoDriver
            jdbc_url: jdbc:presto://localhost:8080/hive/default
            jdbc_user: hdfs
            jdbc_password: na
            jdbc_pooling: false
  • 16. Benchto
      macro benchmarking framework
      github.com/teradata/benchto (very soon)
      Karol Sobczak
      karol.sobczak@teradata.com
  • 17. Goals
      ● Easy and manageable way to define benchmarks
      ● Run and analyze macro benchmarks in a clustered environment
      ● Repeatable benchmarking of Hadoop SQL engines, most importantly Presto
          ○ also used for Hive, Teradata components
      ● Transparent, trusted framework for benchmarking
  • 18. Benchmarks - model
      [Entity diagram: BenchmarkRun, QueryExecution, Measurement and Aggregated Measurement, connected by relationships labelled with 1 and n multiplicities]
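One plausible reading of the model on slide 18 is sketched below: a benchmark run owns many query executions, each execution carries measurements, and an aggregated measurement summarizes one metric across executions. The exact relationships are not recoverable from the transcript, so the field names, the `aggregate` method, the choice of min/max/mean, and the sample values are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical reading of the slide's entity diagram; not Benchto's actual classes. */
class BenchmarkModelSketch {

    static final class Measurement {
        final String name;
        final String unit;
        final double value;

        Measurement(String name, String unit, double value) {
            this.name = name;
            this.unit = unit;
            this.value = value;
        }
    }

    static final class QueryExecution {
        final int sequenceId;
        final List<Measurement> measurements;

        QueryExecution(int sequenceId, List<Measurement> measurements) {
            this.sequenceId = sequenceId;
            this.measurements = measurements;
        }
    }

    static final class AggregatedMeasurement {
        final String unit;
        final double min;
        final double max;
        final double mean;

        AggregatedMeasurement(String unit, double min, double max, double mean) {
            this.unit = unit;
            this.min = min;
            this.max = max;
            this.mean = mean;
        }
    }

    static final class BenchmarkRun {
        final String name;
        final List<QueryExecution> executions = new ArrayList<>();

        BenchmarkRun(String name) {
            this.name = name;
        }

        /** Summarizes one named measurement across all executions of this run. */
        AggregatedMeasurement aggregate(String measurementName) {
            double min = Double.POSITIVE_INFINITY;
            double max = Double.NEGATIVE_INFINITY;
            double sum = 0;
            int count = 0;
            String unit = null;
            for (QueryExecution execution : executions) {
                for (Measurement m : execution.measurements) {
                    if (m.name.equals(measurementName)) {
                        min = Math.min(min, m.value);
                        max = Math.max(max, m.value);
                        sum += m.value;
                        unit = m.unit;
                        count++;
                    }
                }
            }
            if (count == 0) {
                throw new IllegalArgumentException("no measurement named " + measurementName);
            }
            return new AggregatedMeasurement(unit, min, max, sum / count);
        }
    }

    /** Builds a run with three executions and made-up durations. */
    static BenchmarkRun sampleRun() {
        BenchmarkRun run = new BenchmarkRun("sample-benchmark");
        double[] durations = {100.0, 200.0, 300.0};
        for (int i = 0; i < durations.length; i++) {
            Measurement duration = new Measurement("duration", "MILLISECONDS", durations[i]);
            run.executions.add(new QueryExecution(i, Collections.singletonList(duration)));
        }
        return run;
    }

    public static void main(String[] args) {
        AggregatedMeasurement a = sampleRun().aggregate("duration");
        System.out.println(a.min + " / " + a.mean + " / " + a.max + " " + a.unit);
    }
}
```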
  • 19-22. Benchmarks - execution (the same diagram appears on four consecutive slides)
      [Life-cycle diagram: before-benchmark-macros, prewarm, benchmark (execution-0, execution-1, ..., execution-n), after-benchmark-macros]
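The life cycle shown on the execution slides (before-benchmark macros, prewarm, the measured executions, after-benchmark macros) could be driven by a loop like the one below. This is a simplified sketch, not Benchto's driver: executions run sequentially, prewarm timings are discarded, and the after-benchmark step runs even on failure. All three of those choices, and the names `BenchmarkLifecycleSketch` and `run`, are assumptions.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical, sequential sketch of the benchmark life cycle from the slides. */
class BenchmarkLifecycleSketch {

    /**
     * Runs the query prewarmRuns times without measuring, then runs it
     * `runs` times while timing each execution. Life-cycle events are
     * appended to `events` in the order they occur.
     */
    static List<Long> run(int prewarmRuns, int runs, Runnable query, List<String> events) {
        List<Long> durationsNanos = new ArrayList<>();
        events.add("before-benchmark-macros");
        try {
            for (int i = 0; i < prewarmRuns; i++) {
                events.add("prewarm-" + i);
                query.run(); // warm caches; timing deliberately ignored
            }
            events.add("benchmark");
            for (int i = 0; i < runs; i++) {
                events.add("execution-" + i);
                long start = System.nanoTime();
                query.run();
                durationsNanos.add(System.nanoTime() - start);
            }
        } finally {
            // Assumed here: the after-benchmark step runs even if a query fails.
            events.add("after-benchmark-macros");
        }
        return durationsNanos;
    }

    public static void main(String[] args) {
        List<String> events = new ArrayList<>();
        List<Long> durations = run(1, 2, () -> Math.sqrt(42.0), events);
        System.out.println(events);
        System.out.println(durations.size() + " measured executions");
    }
}
```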
  • 23. Defining benchmarks - structure
      ● Benchmarks are defined by convention through descriptors (YAML format) and query SQL files

        $ tree .
        .
        ├── application-presto-devenv.yaml
        ├── application-td-hdp.yaml
        ├── benchmarks
        │   ├── presto
        │   │   ├── concurrency-insert-multi-table.yaml
        │   │   ├── concurrency.yaml
        │   │   ├── linear-scan.yaml
        │   │   ├── tpch.yaml
        │   │   └── types.yaml
        │   └── querygrid-presto-ansi
        │       └── concurrency.yaml
        └── sql
            ├── presto
            │   ├── dev-zero
            │   │   ├── create-alltypes.sql
            │   │   └── create-lineitem.sql
            │   ├── linear-scan
            │   │   ├── selectivity-0.sql
            │   │   ├── selectivity-100.sql
            ...
  • 24. Defining benchmarks - descriptor
      ● A descriptor is a YAML configuration file with various properties and user-defined variables

        $ cat benchmarks/presto/concurrency.yaml
        datasource: presto
        query-names: presto/linear-scan/selectivity-${selectivity}.sql
        schema: tpch_100gb_orc
        database: hive
        concurrency: ${concurrency_level}
        runs: ${concurrency_level}
        prewarm-runs: 3
        before-benchmark: drop-caches
        variables:
          1:
            selectivity: 10, 100
            concurrency_level: 10
          2:
            selectivity: 10, 100
            concurrency_level: 20
          3:
            selectivity: 10, 100
            concurrency_level: 50
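The `variables` section on slide 24 lists several values per key (`selectivity: 10, 100`). A natural interpretation, assumed here rather than confirmed by the slides, is that each variable set expands into one benchmark per combination of values. The sketch below computes that Cartesian product for a single variable set; `VariableExpansionSketch` and `expand` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical expansion of one descriptor variable set into value combinations. */
class VariableExpansionSketch {

    /** Expands comma-separated values into every combination, one map per combination. */
    static List<Map<String, String>> expand(Map<String, String> variableSet) {
        List<Map<String, String>> combinations = new ArrayList<>();
        combinations.add(new LinkedHashMap<>()); // start from one empty combination
        for (Map.Entry<String, String> variable : variableSet.entrySet()) {
            List<Map<String, String>> next = new ArrayList<>();
            for (Map<String, String> partial : combinations) {
                for (String value : variable.getValue().split(",")) {
                    Map<String, String> extended = new LinkedHashMap<>(partial);
                    extended.put(variable.getKey(), value.trim());
                    next.add(extended);
                }
            }
            combinations = next;
        }
        return combinations;
    }

    public static void main(String[] args) {
        Map<String, String> firstSet = new LinkedHashMap<>();
        firstSet.put("selectivity", "10, 100");
        firstSet.put("concurrency_level", "10");
        for (Map<String, String> combination : expand(firstSet)) {
            System.out.println(combination);
        }
    }
}
```

Under this reading, the three variable sets in the descriptor would yield six benchmark variants in total: two selectivities for each of the three concurrency levels.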
  • 25. Defining benchmarks – SQL file templating
      ● SQL files can use keys defined in YAML configuration file – templates are based on FreeMarker

        $ cat sql/presto/tpch/q14.sql
        SELECT 100.00 * sum(CASE
                            WHEN p.type LIKE 'PROMO%'
                            THEN l.extendedprice * (1 - l.discount)
                            ELSE 0
                            END) / sum(l.extendedprice * (1 - l.discount)) AS promo_revenue
        FROM
          "${database}"."${schema}"."lineitem" AS l,
          "${database}"."${schema}"."part" AS p
        WHERE l.partkey = p.partkey
          AND l.shipdate >= DATE '1995-09-01'
          AND l.shipdate < DATE '1995-09-01' + INTERVAL '1' MONTH
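To illustrate what the `${database}` and `${schema}` keys on slide 25 amount to, the sketch below substitutes `${key}` placeholders using only the Java standard library. It is a stand-in for the simplest FreeMarker interpolation, not FreeMarker itself: directives, expressions and defaults are not handled, and `SqlTemplateSketch.render` is a hypothetical helper.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical ${key} substitution; a simplified stand-in for FreeMarker interpolation. */
class SqlTemplateSketch {

    private static final Pattern PLACEHOLDER =
            Pattern.compile("\\$\\{([A-Za-z_][A-Za-z0-9_]*)\\}");

    /** Replaces every ${key} with its value; fails fast on undefined keys. */
    static String render(String template, Map<String, String> variables) {
        Matcher matcher = PLACEHOLDER.matcher(template);
        StringBuffer rendered = new StringBuffer();
        while (matcher.find()) {
            String key = matcher.group(1);
            String value = variables.get(key);
            if (value == null) {
                throw new IllegalArgumentException("undefined template variable: " + key);
            }
            // quoteReplacement keeps '$' and '\' in values from being treated specially.
            matcher.appendReplacement(rendered, Matcher.quoteReplacement(value));
        }
        matcher.appendTail(rendered);
        return rendered.toString();
    }

    public static void main(String[] args) {
        Map<String, String> variables = new HashMap<>();
        variables.put("database", "hive");
        variables.put("schema", "tpch_100gb_orc");
        System.out.println(
                render("FROM \"${database}\".\"${schema}\".\"lineitem\" AS l", variables));
    }
}
```

Failing on an undefined key is a deliberate choice in this sketch: a silently empty schema name would produce a query that fails later with a less helpful error.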
  • 26. Future work
      ● (Tempto) Support for complex concurrent test execution
      ● (Benchto) Automatic regression detection
      ● (Benchto) Customized dashboards (e.g. overall performance analysis)
      ● (Benchto) Hardware and configuration awareness
      ● (Benchto) More complex benchmarking scenarios
      ● (Benchto) Support for complex concurrency scenarios
      ● (Benchto) Scheduling mechanism
  • 27. Questions?
  • 28. Benchto GUI
      ● Visualization of benchmark results
      ● Linking between tools (Grafana, Presto UI)
      ● Comparison of multiple benchmarks
  • 29. Grafana monitoring
      ● We use a Grafana dashboard with Graphite
      ● Benchmark and execution life-cycle events are shown on dashboards
      ● Provides good visibility into the state of the cluster
