srmchem / azure-sql-db-databricks

This project forked from azure-samples/azure-sql-db-databricks

Azure SQL and Databricks samples and best practices for loading data quickly and efficiently

License: MIT License

Jupyter Notebook 100.00%

azure-sql-db-databricks's Introduction

page_type: sample
languages: tsql, sql, scala
products: azure, azure-databricks, azure-blob-storage, azure-key-vault, azure-sql-database
description: Fast Data Loading in Azure SQL DB using Azure Databricks
urlFragment: azure-sql-db-databricks

Fast Data Loading in Azure SQL DB using Azure Databricks

Azure Databricks and Azure SQL Database work very well together. This repo shows how to use the latest connector to load data into Azure SQL as fast as possible, using table partitions, columnstore indexes, and the known best practices.

Samples

All the samples start from a partitioned Parquet file created from data generated with the well-known TPC-H benchmark. Free tools are available on the TPC-H website to generate a dataset of the size you want:

http://www.tpc.org/tpch/

Once the Parquet file is available, the samples guide you through the most common loading scenarios. All samples also show how to load a table correctly when it already has indexes or when you want to use a columnstore index in Azure SQL.
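As a rough illustration of the heap versus columnstore distinction mentioned above, the sketch below builds the option map that could be passed to the Apache Spark connector for SQL Server and Azure SQL. It is not taken from the repo's notebooks: the object and function names are hypothetical, and the specific choices (table lock on for a heap, table lock off and a 1,048,576-row batch for a columnstore, so that each batch can fill a whole rowgroup) are assumptions based on general bulk-load guidance. Tables with existing indexes are not covered here.

```scala
// Minimal sketch, not the repo's code: names below are hypothetical.
object BulkWriteSketch {
  // Data source name of the Apache Spark connector for SQL Server and Azure SQL.
  val ConnectorFormat = "com.microsoft.sqlserver.jdbc.spark"

  // Builds write options for a bulk load into a heap or a columnstore table.
  def writeOptions(
      jdbcUrl: String,
      table: String,
      user: String,
      password: String,
      columnstore: Boolean): Map[String, String] = {
    val base = Map(
      "url" -> jdbcUrl,
      "dbtable" -> table,
      "user" -> user,
      "password" -> password)

    if (columnstore)
      // Assumption: no table lock, so parallel writers can each fill their own rowgroups;
      // 1,048,576 rows is the maximum size of a columnstore rowgroup.
      base ++ Map("tableLock" -> "false", "batchsize" -> "1048576")
    else
      // Assumption: a table lock on a heap lets parallel bulk inserts run together.
      // The batch size here is an arbitrary illustrative value.
      base ++ Map("tableLock" -> "true", "batchsize" -> "100000")
  }
}

// Hypothetical usage inside a Databricks notebook, where `df` is a DataFrame
// read from the Parquet file:
//
//   val opts = BulkWriteSketch.writeOptions(jdbcUrl, "dbo.LINEITEM", user, password, columnstore = true)
//   df.write.format(BulkWriteSketch.ConnectorFormat).mode("append").options(opts).save()
```

Keeping the options in a plain map makes it easy to compare the two configurations side by side and to change a single setting when measuring load times.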

Bonus Samples: Reading data as fast as possible

Although this repo focuses on writing data into Azure SQL as fast as possible, you may also want to do the opposite: read data as fast as possible from Azure SQL into Apache Spark / Azure Databricks. For this reason, the folder notebooks/read-from-azure-sql contains two samples that show how to do exactly that.
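One common way to read from a database in parallel with Spark is to give the JDBC reader one WHERE predicate per partition, so that each Spark task fetches its own slice of the table. The sketch below only generates such predicates. It is an illustration under stated assumptions, not necessarily the approach used in the two samples, and the object name, function name, and the table and column in the usage comment are hypothetical choices.

```scala
// Minimal sketch, not the repo's code: names below are hypothetical.
object ParallelReadSketch {
  // Splits the inclusive range [lower, upper] of a numeric column into at most
  // `parts` contiguous, non-overlapping predicates.
  // Note: rows whose value falls outside [lower, upper] are not read at all.
  def rangePredicates(column: String, lower: Long, upper: Long, parts: Int): Array[String] = {
    require(parts > 0, "parts must be positive")
    require(upper >= lower, "upper must be >= lower")

    val total = upper - lower + 1
    // Ceiling division, so the slices cover the whole range.
    val step = math.max(1L, (total + parts - 1) / parts)

    (lower to upper by step).map { start =>
      val end = math.min(start + step - 1, upper)
      s"$column >= $start AND $column <= $end"
    }.toArray
  }
}

// Hypothetical usage inside a Databricks notebook, where `spark` is the
// notebook's SparkSession and `props` is a java.util.Properties holding the
// credentials:
//
//   val predicates = ParallelReadSketch.rangePredicates("L_ORDERKEY", 1L, 6000000L, 8)
//   val df = spark.read.jdbc(jdbcUrl, "dbo.LINEITEM", predicates, props)
```

Spark's JDBC reader also accepts the `partitionColumn`, `lowerBound`, `upperBound`, and `numPartitions` options, which build the ranges for you. In that mode the bounds only set the stride, so rows outside them are still read.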

Contributing

This project welcomes contributions and suggestions. Most contributions require you to agree to a Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us the rights to use your contribution. For details, visit https://cla.opensource.microsoft.com.

When you submit a pull request, a CLA bot will automatically determine whether you need to provide a CLA and decorate the PR appropriately (e.g., status check, comment). Simply follow the instructions provided by the bot. You will only need to do this once across all repos using our CLA.

This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact [email protected] with any additional questions or comments.
