Solved

ETL using Google App Engine?

Posted on 2015-01-27
4
282 Views
Last Modified: 2015-04-14
Our company is seriously considering implementing our data warehouse and business intelligence on Google's Cloud Platform. Datasets would be queried using BigQuery and resulting data would be read through Google Sheets, QlikView, Tableau, etc. The way this works seems rather straightforward, but the more complex part is the ETL process that needs to happen before the data is loaded into BigQuery.
We've been advised that it's possible to use the Google Compute/App Engine to handle our ETL, but there is little information out there in the form of examples or case studies.
I'm wondering if there are experts here who have experience with this technology, and who can share their experiences. I'm looking for an ETL solution that can ideally does not require a lot of manual programming/scripting and should run with minimal user interaction once it's created/set up. Our system data is all kept in MSSQL 2008 R2 databases, which are currently not cloud based. Data must be loaded from multiple databases across multiple systems.
I realize this is a broad question with many possible answers and opinions so I'll divide the points over the most useful posts.
Thanks in advance.
0
Comment
Question by:Koen Van Wielink
  • 3
4 Comments
 
LVL 11

Expert Comment

by:SThaya
ID: 40574610
Hi,

  I got few interesting results while i am searching on net.please find the below.

1. There are few ETL tools available for data load
https://cloud.google.com/bigquery/third-party-tools

2.Direct data insert into Big Query
http://codereview.stackexchange.com/questions/51828/insert-an-sql-server-table-rows-into-a-bigquery-table-in-one-bloc
3.SSIS ETL component :

http://www.rssbus.com/ssis/
http://www.rssbus.com/ssis/bigquery/download.aspx

i hope this will help you to move furthur
0
 
LVL 12

Author Comment

by:Koen Van Wielink
ID: 40574671
Hi SThaya,

Thanks for the reply.

I found the first point as well, but we'd prefer not to have to buy an expensive 3rd party ETL tool considering that, if we do require a separate tool, we have SSIS available.

I'm fairly sure the suggestion made in the forum you're referencing in point 2 is not possible, as you can only insert data through the Google API, and not as a linked server.

The SSIS ETL component you're referring to in point 3 looks interesting, but it bypasses the Google Cloud Storage completely. The main drawback I see here is that BigQuery does not support updates/deletes on existing records. If changes are required, the entire table has to be dropped and re-created with the new data. As such, if we have to reload the tables I'd much rather to this from the cloud storage, rather than having to upload huge amounts of data each time from our server directly into BigQuery. At least with Cloud Storage there should be some possibility to only keep track of the changes in CSV or JSON files.

Does anyone here on EE have first hand experience with BigQuery and, more importantly, an ETL process supporting it?
0
 
LVL 12

Accepted Solution

by:
Koen Van Wielink earned 0 total points
ID: 40714964
Just want to update this before requesting to close the question.
We are currently evaluating a tool called X-Plenty which is a cloud based ETL system which does not require any on-premise installation. We're probably also going to move away from Google BigQuery and use Amazon Redshift instead, although the final decision has yet to be made.
0
 
LVL 12

Author Closing Comment

by:Koen Van Wielink
ID: 40722598
No satisfactory answer provided, and our own research has led us to the solution stated in this answer.
0

Featured Post

Superior storage. Superior surveillance.

WD Purple drives are built for 24/7, always-on, high-definition security systems. With support for up to 8 hard drives and 32 cameras, WD Purple drives are optimized for surveillance.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Suggested Solutions

Title # Comments Views Activity
Can someone plz fix this..getting an error 3 19
Sql query 107 30
SQL Pivot Rows To Columns 10 27
Google Scripts - copying dynamic named range. 4 6
The Delta outage: 650 cancelled flights, more than 1200 delayed flights, thousands of frustrated customers, tens of millions of dollars in damages – plus untold reputational damage to one of the world’s most trusted airlines. All due to a catastroph…
A customer recently asked me about anti-malware and the different deployment options available for his business. Daily news about cyberattacks, zero-day vulnerabilities, and companies that suffered a security breach made him wonder if the endpoint a…
Shows how to create a shortcut to site-search Experts Exchange using Google in the Chrome browser. This eliminates the need to type out site:experts-exchange.com whenever you want to search the site. Launch the Search Engine Menu: In chrome, via you…
Need to grow your business through quality cloud solutions? With everything required to build a cloud platform and solution, you may feel like the distance between you and the cloud is quite long. Help is here. Spend some time learning about the Con…

863 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

28 Experts available now in Live!

Get 1:1 Help Now