How to transfer BigQuery table to Cloud SQL using Cloud Dataflow

Jose Miguel Arrieta, Jan 6

Cloud Dataflow is an excellent solution for moving data around, and several articles have been dedicated to using Dataflow to ELT data into BigQuery.

This post goes in the opposite direction.

I have a table I need in BigQuery and want to move it to MySQL.

This post builds on top of the previous Dataflow post, How to Create a Cloud Dataflow Pipeline Using Java and Apache Maven, and can be seen as an extension of it.

Goal: Transfer some columns from a BigQuery table to a MySQL table.

Disclaimer: I am a newbie with Dataflow, and this series of posts helps me learn and helps others.

0. Prerequisites

A basic template project for Google Cloud Dataflow. See: https://datascience.com.co/how-to-create-a-cloud-dataflow-pipeline-using-java-and-apache-maven-fc53279e9424

Create buckets (see https://cloud.google.com/storage/docs/creating-buckets). These buckets will hold the jar files and, if necessary, temporary files.

A BigQuery table with data. For this example I will use bigquery-public-data.austin_bikeshare.bikeshare_trips, available as a public dataset on Google BigQuery.

Create a Cloud SQL instance, database, and table.

a. For Cloud SQL instance creation, see https://cloud.google.com/sql/docs/mysql/create-instance

b. For database and table creation, I used the following code.

Database creation:

CREATE DATABASE bikeshare;

And for my trips table on MySQL:

CREATE TABLE trips(
    trip_id BIGINT,
    subscriber_type VARCHAR(250),
    start_station_name VARCHAR(250),
    end_station_name VARCHAR(250)
);

Enable the Cloud SQL Admin API.

1. Add source code

After a Google Cloud Dataflow template has been created, as explained in https://datascience.com.co/how-to-create-a-cloud-dataflow-pipeline-using-java-and-apache-maven-fc53279e9424, add the Java class BikeTrip.java in the same folder as StarterPipeline.java. This Java code reads a table from BigQuery, selects some columns, and writes them into the previously created MySQL table.
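The original BikeTrip.java is embedded from an external source and not reproduced here, but the heart of its per-row transform can be sketched outside Beam as a plain function over a row map (BigQuery's TableRow is itself a JSON-style map). The class name and field list below are assumptions based on the bikeshare_trips schema and the trips table created above, not the author's actual code:

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of the per-row selection step: given a BigQuery row
// (modeled here as a plain Map), keep only the columns that the MySQL
// trips table expects and drop everything else.
public class TripRowMapper {

    static final List<String> KEPT_COLUMNS = List.of(
            "trip_id", "subscriber_type", "start_station_name", "end_station_name");

    // Returns a new map containing only the kept columns, in declaration order.
    public static Map<String, Object> selectColumns(Map<String, Object> row) {
        Map<String, Object> out = new LinkedHashMap<>();
        for (String col : KEPT_COLUMNS) {
            out.put(col, row.get(col));
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, Object> row = new LinkedHashMap<>();
        row.put("trip_id", 9900285908L);
        row.put("subscriber_type", "Walk Up");
        row.put("start_station_name", "City Hall");
        row.put("end_station_name", "Convention Center");
        row.put("duration_minutes", 42); // dropped by the selection
        System.out.println(selectColumns(row));
    }
}
```

In the real pipeline, this kind of selection would sit between the BigQuery read and the JDBC write, for example inside a DoFn or in the statement setter that fills the INSERT's parameters.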

Note: Change the following URL accordingly:

"jdbc:mysql://google/<DATABASE_NAME>?cloudSqlInstance=<PROJECT_ID>:<INSTANCE_LOCATION>:<INSTANCE_CONNECTION_NAME>&socketFactory=com.google.cloud.sql.mysql.SocketFactory&user=<MYSQL_USER_NAME>&password=<MYSQL_USER_PASSWORD>&useSSL=false"

Warning: Do not hardcode credentials in code. This example is for learning purposes only.

Source: https://cloud.google.com/sql/docs/mysql/connect-external-app
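To make the placeholder URL easier to read, here is a hypothetical helper (an illustration, not part of the original post) that assembles it from its parts; in real code, the user and password should come from pipeline options or a secret manager rather than being hardcoded:

```java
// Hypothetical helper that assembles the Cloud SQL JDBC URL from its parts,
// mirroring the placeholder URL above. The parameter names are assumptions
// for illustration only.
public class CloudSqlUrl {

    public static String jdbcUrl(String database, String projectId,
                                 String region, String instanceName,
                                 String user, String password) {
        return "jdbc:mysql://google/" + database
                + "?cloudSqlInstance=" + projectId + ":" + region + ":" + instanceName
                + "&socketFactory=com.google.cloud.sql.mysql.SocketFactory"
                + "&user=" + user
                + "&password=" + password
                + "&useSSL=false";
    }

    public static void main(String[] args) {
        // Example values only; never commit real credentials.
        System.out.println(jdbcUrl("bikeshare", "my-project", "us-central1",
                "my-instance", "root", "example"));
    }
}
```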

2. Edit pom.xml

Edit the pom.xml file and add the following dependencies:

<dependency>
    <groupId>org.apache.commons</groupId>
    <artifactId>commons-dbcp2</artifactId>
    <version>2.0</version>
</dependency>
<dependency>
    <groupId>org.apache.beam</groupId>
    <artifactId>beam-sdks-java-io-jdbc</artifactId>
    <version>2.4.0</version>
</dependency>
<dependency>
    <groupId>org.apache.commons</groupId>
    <artifactId>commons-pool2</artifactId>
    <version>2.4.1</version>
</dependency>
<dependency>
    <groupId>mysql</groupId>
    <artifactId>mysql-connector-java</artifactId>
    <version>5.1.6</version>
</dependency>
<dependency>
    <groupId>com.google.cloud.sql</groupId>
    <artifactId>mysql-socket-factory</artifactId>
    <version>1.0.8</version>
</dependency>

4. Run Maven command

Run the following command on the Maven project:

mvn compile exec:java -e -Dexec.mainClass=com.click.example.BikeTrip -Dexec.args="--project=dataflow-test-227715 --stagingLocation=gs://example-dataflow-stage/staging/ --tempLocation=gs://example-dataflow-stage/temp/ --runner=DataflowRunner"

5. See the job running

Go to Dataflow and see whether the job started. Open the job and observe that the workers scaled up to 6 and the job is writing to MySQL at 3,900 elements/second. The job succeeded in 11 minutes and wrote 1,077,929 rows to MySQL.
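As a quick sanity check on those numbers (my own arithmetic, not from the Dataflow UI): 1,077,929 rows in 11 minutes averages out to roughly 1,633 rows per second, so the 3,900 elements/second shown in the job graph is a momentary rate rather than the sustained average:

```java
// Sanity-check the reported job metrics: average write rate over the
// whole run, given total rows and total minutes.
public class ThroughputCheck {

    public static long averageRowsPerSecond(long rows, long minutes) {
        return rows / (minutes * 60);
    }

    public static void main(String[] args) {
        // 1,077,929 rows in 11 minutes.
        System.out.println(averageRowsPerSecond(1_077_929L, 11)); // ~1633
    }
}
```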

6. Check the migrated table on Cloud SQL

Connect to the MySQL instance, count the number of rows, and preview the first 10 rows.

Cheers,
Jose Miguel Arrieta Ramos

