Distributed SQL with Presto + MySQL (Presto+MySQLで分散SQL)

2015-01-14, Dogenzaka LT Festival (道玄坂LT祭り), Middleware/Infrastructure edition, in Japan: "Distributed SQL with Presto + MySQL" by Sadayuki Furuhashi
Published on: Mar 4, 2016
Published in: Presentations & Public Speaking      
Source: www.slideshare.net


Transcript - Distributed SQL with Presto + MySQL (Presto+MySQLで分散SQL)

  • 1. Distributed SQL with Presto + MySQL. Dogenzaka LT Festival (道玄坂LT祭り). Sadayuki Furuhashi, Founder & Software Architect, Treasure Data, Inc.
  • 2. A little about me...
       > Sadayuki Furuhashi
       > github/twitter: @frsyuki
       > Treasure Data, Inc.
       > Founder & Software Architect
       > Open-source hacker
         > MessagePack - Efficient object serializer
         > Fluentd - A unified data collection tool
         > ServerEngine - A Ruby framework to build multiprocess servers
         > Prestogres - PostgreSQL protocol gateway for Presto
         > LS4 - A distributed object storage with cross-region replication
         > kumofs - A distributed, strongly consistent key-value data store
  • 3. Check: www.treasuredata.com Cloud service for the entire data pipeline, including Presto. We’re hiring!
  • 4. What’s Presto? A distributed SQL query engine for interactive data analysis against GBs to PBs of data.
  • 5. [Architecture diagram: Client, Coordinator with Connector Plugin, three Workers, Storage / Metadata, Discovery Service]
  • 6. [Same diagram] 1. Find servers in a cluster
  • 7. [Same diagram] 2. Client sends a query using HTTP
  • 8. [Same diagram] 3. Coordinator builds a query plan. Connector plugin provides metadata (table schema, etc.)
  • 9. [Same diagram] 4. Coordinator sends tasks to workers
  • 10. [Same diagram] 5. Workers read data through connector plugin
  • 11. [Same diagram] 6. Workers run tasks in memory
  • 12. [Same diagram] 7. Client gets the result from a worker
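The seven steps on slides 6 to 12 can be sketched as a toy, single-process model. This is only an illustration under loud assumptions: every class and method name below is hypothetical and not part of Presto's API, the HTTP request of step 2 is reduced to a plain function call, and results are gathered by the coordinator object rather than fetched from a worker as on slide 12.

```python
from dataclasses import dataclass, field


class Connector:
    """Hypothetical stand-in for a connector plugin: metadata plus data splits."""

    def __init__(self, tables):
        self._tables = tables  # table name -> list of row dicts

    def columns(self, table):
        # Metadata the coordinator needs while planning (step 3).
        return sorted(self._tables[table][0].keys())

    def splits(self, table, n):
        # Partition the rows into n chunks, one per worker.
        rows = self._tables[table]
        return [rows[i::n] for i in range(n)]


@dataclass
class Worker:
    name: str

    def run_task(self, split, column, value):
        # Steps 5 and 6: take rows served by the connector, filter them in memory.
        return [row for row in split if row[column] == value]


@dataclass
class Discovery:
    workers: list = field(default_factory=list)

    def register(self, worker):
        self.workers.append(worker)


class Coordinator:
    def __init__(self, discovery, connector):
        self.discovery = discovery
        self.connector = connector

    def execute(self, table, column, value):
        workers = self.discovery.workers                      # step 1: find servers
        if column not in self.connector.columns(table):      # step 3: use metadata
            raise KeyError(f"unknown column: {column}")
        splits = self.connector.splits(table, len(workers))  # step 3: plan splits
        partials = [w.run_task(s, column, value)              # steps 4 to 6
                    for w, s in zip(workers, splits)]
        return [row for part in partials for row in part]    # step 7 (simplified)


def run_demo():
    discovery = Discovery()
    for i in range(3):
        discovery.register(Worker(f"worker-{i}"))
    # Made-up sample data, not taken from the slides.
    orders = [{"orderkey": k, "custkey": k % 4} for k in range(10)]
    coordinator = Coordinator(discovery, Connector({"orders": orders}))
    return coordinator.execute("orders", "custkey", 1)  # step 2: the "query"
```

Calling `run_demo()` returns the orders whose `custkey` is 1, in whatever order the three workers' partial results are concatenated.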
  • 13. [Architecture diagram: Client, Coordinator with Connector Plugin, three Workers, Storage / Metadata, Discovery Service]
  • 14. Hive connector [Diagram: Client, Coordinator with Hive Connector, three Workers, HDFS and Hive Metastore, Discovery Service (find servers in a cluster)]
  • 15. Cassandra connector [Diagram: Client, Coordinator with JDBC Connector, three Workers, Cassandra, Discovery Service (find servers in a cluster)]
  • 16. Multiple connectors in a query [Diagram: Client, Coordinator with Hive Connector, JDBC Connector and other connectors, three Workers, HDFS / Metastore, PostgreSQL, other data sources, Discovery Service (find servers in a cluster)]
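One way to picture "multiple connectors in a query" is that each table name carries a catalog prefix, and the catalog decides which connector serves the table. The sketch below is an assumption-laden illustration: the `resolve` helper, the `CONNECTORS` registry, the connector labels, and the default catalog and schema (`hive`, `default`) are all hypothetical; only the `catalog.schema.table` naming style is taken from the queries later in this deck.

```python
# Hypothetical registry: catalog name -> label of the connector that serves it.
CONNECTORS = {
    "hive": "Hive connector",
    "mysql": "MySQL connector",
}


def resolve(table_name, default_catalog="hive", default_schema="default"):
    """Split 'catalog.schema.table' into its parts, filling in assumed defaults."""
    parts = table_name.split(".")
    if len(parts) == 3:
        return tuple(parts)
    if len(parts) == 2:
        return (default_catalog, parts[0], parts[1])
    if len(parts) == 1:
        return (default_catalog, default_schema, parts[0])
    raise ValueError(f"invalid table name: {table_name!r}")


def connectors_for(table_names):
    """Map every table referenced by a query to the connector of its catalog."""
    return {name: CONNECTORS[resolve(name)[0]] for name in table_names}
```

For example, `connectors_for(["orders", "mysql.presto_test.users"])` maps the unqualified table to the assumed default catalog and the qualified one to the `mysql` catalog, so a single query can touch both.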
  • 17. MapReduce vs. Presto
       MapReduce (map and reduce stages with disk between them):
         Write data to disk
         Wait between stages
       Presto (tasks connected directly):
         All stages are pipelined
         ✓ No wait time
         ✓ No fault-tolerance
         Memory-to-memory data transfer
         ✓ No disk IO
         ✓ Data chunk must fit in memory
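The difference between staged and pipelined execution can be shown with a small Python sketch. It is an analogy only, not how either system is implemented: the function names are made up, a list stands in for the intermediate data that MapReduce would write to disk, and a generator stands in for memory-to-memory transfer between tasks.

```python
def staged(rows):
    """MapReduce-style: the map stage finishes completely before reduce starts."""
    log = []
    mapped = []  # stands in for intermediate output written to disk
    for r in rows:
        log.append(f"map {r}")
        mapped.append(r * 2)
    total = 0
    for m in mapped:  # reduce waits until every map output exists
        log.append(f"reduce {m}")
        total += m
    return total, log


def pipelined(rows):
    """Pipelined style: each mapped value flows straight into the reduce step."""
    log = []

    def map_stage():
        for r in rows:
            log.append(f"map {r}")
            yield r * 2  # handed over in memory, nothing is materialized

    total = 0
    for m in map_stage():
        log.append(f"reduce {m}")
        total += m
    return total, log
```

Both functions return the same total, but the logs differ: `staged` records all `map` entries before any `reduce` entry, while `pipelined` interleaves them. Because the pipelined version keeps no intermediate output, there is nothing to restart from if a step fails, which is one way to read the "No fault-tolerance" point on the slide.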
  • 18. Presto meetup!
  • 19. [Diagram: client queries Presto, which JOINs Hive and MySQL]
       select orderkey, orderdate, custkey, email
       from orders
       join mysql.presto_test.users
       on orders.custkey = users.id
       order by custkey, orderdate;
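What the query on slide 19 computes can be mimicked in a few lines of Python. This is a sketch of the query's semantics using a simple in-memory hash join, one common join strategy, and not a description of Presto's execution. The function name is hypothetical, the sample rows in the usage note are invented, and the code assumes `users.id` is unique.

```python
def join_orders_with_users(orders, users):
    """Inner-join orders to users on custkey = id, ordered by custkey, orderdate."""
    # Build side: index the (smaller) users table by id.
    users_by_id = {u["id"]: u for u in users}
    # Probe side: keep only orders that have a matching user.
    joined = [
        (o["orderkey"], o["orderdate"], o["custkey"], users_by_id[o["custkey"]]["email"])
        for o in orders
        if o["custkey"] in users_by_id
    ]
    # order by custkey, orderdate
    return sorted(joined, key=lambda row: (row[2], row[1]))
```

With made-up rows such as `orders = [{"orderkey": 1, "orderdate": "2015-01-10", "custkey": 2}]` and `users = [{"id": 2, "email": "b@example.com"}]`, the function returns one tuple per matching order; orders whose `custkey` has no user are dropped, as in an inner join.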
  • 20. [Diagram: client queries Presto, which JOINs Hive and MySQL, then INSERT INTO MySQL]
       create table mysql.presto_test.recent_user_info
       as select users.id, users.email, count(1) as count
       from orders
       join mysql.presto_test.users
       on orders.custkey = users.id
       group by 1, 2;
  • 21. [Diagram: Presto JOINs Hive and MySQL] $ psql, Prestogres
  • 22. [Same diagram] $ psql, Prestogres: PostgreSQL protocol gateway for Presto
