
sqlworkshops's Introduction

SQL Server and Azure SQL Labs and Workshops

This site is a map of learning content produced and curated by the SQL Server and Azure SQL teams in Microsoft Engineering. These materials are meant to be instructor-led, but you can work through them on a test system on your own if desired. Labs are shorter, and Workshops are more comprehensive. You can view all materials directly in this interface, or you can view the raw GitHub site for this content here.

To download a Workshop or Lab to your local computer, navigate to the Workshop or Lab's GitHub page using the links below. Once there, click the Clone or Download button. More about that process is here.

See the license information at the bottom of this README.md file.

Find a problem? Spot a bug? Post an issue here, include the page URL, and we'll try to fix it.

SQL Server Data Platform

Archived

Azure SQL

Programming

Presentation Materials, Code, and References

Learning how to self-learn

Many of these topics are quite deep, and take time to fully absorb. There are several phases of learning:

  • Awareness (You learn a technology exists and what it is used for)
  • Understanding (You learn the components, processes and steps of a technology)
  • Practice (You can perform the steps with the technology by following a process to complete a task)
  • Mastery (You are able to explain the technology to others)

If you need a general "Data Literacy" course, you can find that here.

These courses are designed for you to repeat many times to move through these phases. Before you embark on any of these, you may want to complete a "Learning how to Learn" course. You can find more information on that here.

Code of Conduct

This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact [email protected] with any additional questions or comments.

License

Microsoft and any contributors grant you a license to the Microsoft documentation and other content in this repository under the Creative Commons Attribution 4.0 International Public License, see the LICENSE file, and grant you a license to any code in the repository under the MIT License, see the LICENSE-CODE file.

Microsoft, Windows, Microsoft Azure and/or other Microsoft products and services referenced in the documentation may be either trademarks or registered trademarks of Microsoft in the United States and/or other countries. The licenses for this project do not grant you rights to use any Microsoft names, logos, or trademarks. Microsoft's general trademark guidelines can be found at http://go.microsoft.com/fwlink/?LinkID=254653.

Privacy information can be found at https://privacy.microsoft.com/en-us/

Microsoft and any contributors reserve all other rights, whether under their respective copyrights, patents, or trademarks, whether by implication, estoppel or otherwise.

Questions

Email questions to: [email protected]

sqlworkshops's People

Contributors

amthomas46, buckwoody, chrisadkin8, jeroenterheerdt, microsoft-github-policy-service[bot], msftgits, pmasl, rabryst, rgward, theconstructor, vin-yu


sqlworkshops's Issues

Size the tlog for sql2019lab/03_Availability/adr/adr_recovery.sql

After running the DELETE operation in STEP 3, the tlog for this db grows to 40 GB. We should add some warning about this space requirement for the VM and also change the CREATE DATABASE script to pre-create the db for this size. In the workshop instructions we mention this now:
"If you have time use the T-SQL notebook adr_recovery.ipynb or T-SQL script adr_recovery.sql to see how ADR affects the recovery process. This bonus activity can take some time as a large number of modifications are needed to see a bigger effect on recovery. The scripts will create a database with a data file of 10Gb and transaction log of 10Gb so there is plenty of space for the activity."
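For concreteness, here is a minimal sketch of what a pre-sized CREATE DATABASE could look like; the database name and file paths are hypothetical placeholders, not the lab's actual script:

/* Hypothetical sketch: pre-size the data file and transaction log so the
   DELETE in STEP 3 does not force runaway log growth */
CREATE DATABASE adr_lab  -- placeholder name
ON PRIMARY
   (NAME = 'adr_lab_data', FILENAME = '/var/opt/mssql/data/adr_lab.mdf', SIZE = 10GB)
LOG ON
   (NAME = 'adr_lab_log', FILENAME = '/var/opt/mssql/data/adr_lab.ldf', SIZE = 10GB);
GO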

Feedback for Module 3 of the SQL Ground to Cloud Workshop

I'll review the notebooks from the activity in a separate GitHub issue if I find anything. All of these comments are about this readme.md: https://github.com/microsoft/sqlworkshops/blob/master/SQLGroundToCloud/sqlgroundtocloud/03-WorkingWithBigDataAndDataScienceBigDataClustersForSQLServer2019.md

This is a repeat paragraph from the previous one

In each module you'll get more references, which you should follow up on to learn more. Also watch for links within the text - click on each one to explore that topic.


I kind of felt this paragraph was out of place in the sequence

Some industry examples of big data processing are in Retail (Demand Prediction, Market-Basket Analysis), Finance (Fraud detection, customer segmentation), Healthcare (Fiscal control analytics, Disease Prevention prediction and classification, Clinical Trials optimization), Public Sector (Revenue prediction, Education effectiveness analysis), Manufacturing (Predictive Maintenance, Anomaly Detection) and Agriculture (Food Safety analysis, Crop forecasting) to name just a few.


A few comments on this intro

In this module you'll cover working with Data Science workloads with a focus on larger sets of data. Starting in SQL Server 2019, big data clusters allows for large-scale, near real-time processing of data over the HDFS file system and other data sources. It also leverages the Apache Spark framework which is integrated into one environment for management, monitoring, and security of your environment. This means that organizations can implement everything from queries to analysis to Machine Learning and Artificial Intelligence within SQL Server, over large-scale, heterogeneous data. SQL Server big data clusters can be implemented fully on-premises, in the cloud using a Kubernetes service such as Azure's AKS, and in a hybrid fashion. This allows for full, partial, and mixed security and control as desired.

  1. I think you should mention the term data virtualization in this and what it means
  2. What is the "hybrid fashion" example, just so I'm clear? HDFS Tiering?

Right after this paragraph I would put in a few hyperlinks on where to read more about containers and k8s

To control containers, a technology called Kubernetes is used for deployment, management and storage of a grouping of containers, called a Cluster.


A SQL Server Big Data Cluster BDC can be deployed to three environments:

should probably be SQL Server Big Data Cluster (BDC)...

and perhaps you should introduce the BDC acronym earlier in the page?


Which components can be interconnected?

These architectures are not mutually exclusive - you can install some components on-premises, and others as a service. Your connections can interconnect across these environments.


When I read this, if I were a first-time reader I might be a bit confused by "in addition to traditional..." since you have not described what the master instance is

In addition to traditional OLTP workloads, the SQL Server Big Data Cluster has three other uses that work with data at scale.


A few comments on this paragraph

Using the Data Virtualization capability of PolyBase in SQL Server Big Data Clusters you create External Tables for data outside the Cluster

  1. I would take a second and describe the basics of PolyBase and put a link to the doc pages (a sketch of the external table DDL follows below)
  2. I think somewhere we should identify that BDC deploys PolyBase for you automatically (which can be a bit of a burden to set up manually outside of BDC)
  3. It is not always outside the cluster, right? What about HDFS in the cluster? We don't really talk about how PolyBase can access HDFS data in the storage pool in an optimized fashion (vs. the classic method of Java MapReduce)
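For reference, a minimal sketch of the external table DDL the quoted paragraph describes; the data source, credential, table, and location names are hypothetical placeholders, not the workshop's actual objects:

/* Hypothetical sketch: a PolyBase external data source and external table
   over a SQL Server instance outside the cluster */
CREATE EXTERNAL DATA SOURCE RemoteSqlServer
WITH (LOCATION = 'sqlserver://remote.contoso.com', PUSHDOWN = ON, CREDENTIAL = RemoteCred);
GO

CREATE EXTERNAL TABLE dbo.RemoteOrders
(
    OrderID INT,
    OrderDate DATETIME2
)
WITH (LOCATION = 'SalesDb.dbo.Orders', DATA_SOURCE = RemoteSqlServer);
GO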

This statement might confuse them if they haven't gone to your full course?

Using the Data Virtualization capability you saw in the 02 - BDC Components Module,...


So this talks about how data is ingested into the data pool. How is it accessed then?

and the PolyBase statements can specify the target of the Data Pool. The SQL Server Instances in the Data Pool store the data in a distributed fashion across multiple databases, called Shards.
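As a hedged sketch of what that access path could look like (the table and column names here are hypothetical; SqlDataPool is the built-in data source a Big Data Cluster exposes in the master instance), ingestion targets the pool through an external table, and that same external table is then queried from the master instance like any other table:

/* Hypothetical sketch: an external table whose target is the Data Pool.
   Inserted rows are distributed (sharded) across the pool's SQL instances. */
CREATE EXTERNAL TABLE dbo.web_clickstreams_pool
(
    wcs_user_sk BIGINT,
    wcs_click_date_sk BIGINT
)
WITH (DATA_SOURCE = SqlDataPool, DISTRIBUTION = ROUND_ROBIN);
GO

INSERT INTO dbo.web_clickstreams_pool
SELECT wcs_user_sk, wcs_click_date_sk
FROM staging.web_clickstreams;  -- placeholder source table
GO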


You said there were 3 primary uses, but it was not clear from this section what the other two were

There are three primary uses for a large cluster of data processing systems for Machine Learning and AI applications.


You made this statement but never really talked about how a user accesses the Data Storage data stores directly

These users can access the Data Pool and Data Storage data stores directly to query and assist with this task.


You mentioned the App Pool in this sentence but never really defined what it is before this

or sent on to the App Pool where the Machine Learning Server is running


I would put some type of link here on Spark and Jupyter notebooks (perhaps our own tutorial in the docs)

The Data Scientist has another option to create and train ML and AI models. The Spark platform within the Storage Pool is accessible through the Knox gateway, using Livy to send Spark Jobs. This gives access to the full Spark platform, using Jupyter Notebooks (included in Azure Data Studio) or any other standard tools that can access Spark through REST calls.


Typos and Wording Suggestions Part II

General comment: Not sure if it is proper to bold the usage of sqlcmd throughout this section. It is currently bolded only for the first usage.

2.0 Connect to SQL Server
Activity: Connect to SQL Server

  • Step #2 - Add a ":" after "sudo)"
  • Step #3 - Add a ":" after "be successful"
  • Step #2 - Rephrase to "Your version information may be different depending on the version of the workshop and SQL Server."

2.1 Restore a Database Backup

  • Step #3 - Add a ":" after "step2copy_backup_into_container.sh"

Activity: Execute SQL Server Queries

  • Step #2 - Add a ":" after "across your screen like this"
  • Step #3 - Add a ":" after "look like the following"
  • Step #4 - Add a ":" after "something similar to this"
  • Step #4 - The last paragraph is not indented as far as the other paragraphs
  • Step #4, last Paragraph - Rephrase to "The third batch provides information ... has consumed"
  • Step #5 - Add a ":" after "a prompt like this"
  • Step #5 - Rephrase "By typing in a query and hitting Enter, then typing in the keyword GO and hit Enter again to execute a query."

Cannot restore bak file

https://github.com/microsoft/sqlworkshops/blob/master/SQLServerMLServices/notebooks/03-Phase%201%20-%20Business%20Understanding.ipynb restores the database. It mentions a file Activity.bak but the link it provides downloads Analysis.bak. This is the code:
/* Restore Database */
USE master;
GO
RESTORE DATABASE Activity
   FROM DISK = 'C:\Program Files\Microsoft SQL Server\MSSQL15.MSSQLSERVER\MSSQL\Backup\Activity.bak';
GO

Fixed code:

/* Restore Database */
USE master;
GO
RESTORE DATABASE Analysis
   FROM DISK = 'C:\Program Files\Microsoft SQL Server\MSSQL15.MSSQLSERVER\MSSQL\Backup\Analysis.bak';
GO

Also, on my test machine with just one disk I cannot restore the Analysis.bak file, since it is looking for D:\temp:

Msg 5133, Level 16, State 1, Line 3
Directory lookup for the file "d:\temp\tpcxbb_1gb.mdf" failed with the operating system error 3(The system cannot find the path specified.).
 Msg 3156, Level 16, State 3, Line 3
File 'tpcxbb_1gb' cannot be restored to 'd:\temp\tpcxbb_1gb.mdf'. Use WITH MOVE to identify a valid location for the file.
 Msg 5133, Level 16, State 1, Line 3
Directory lookup for the file "d:\temp\tpcxbb_1gb.ldf" failed with the operating system error 3(The system cannot find the path specified.).
 Msg 3156, Level 16, State 3, Line 3
File 'tpcxbb_1gb_log' cannot be restored to 'd:\temp\tpcxbb_1gb.ldf'. Use WITH MOVE to identify a valid location for the file.
 Msg 3119, Level 16, State 1, Line 3
Problems were identified while planning for the RESTORE statement. Previous messages provide details.
 Msg 3013, Level 16, State 1, Line 3
RESTORE DATABASE is terminating abnormally.
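Assuming the logical file names shown in the error messages, a corrected restore might look like the following; the target directory is a placeholder for any valid path on the machine:

/* Sketch: relocate the data and log files with WITH MOVE */
RESTORE DATABASE Analysis
   FROM DISK = 'C:\Program Files\Microsoft SQL Server\MSSQL15.MSSQLSERVER\MSSQL\Backup\Analysis.bak'
   WITH MOVE 'tpcxbb_1gb' TO 'C:\SQLData\tpcxbb_1gb.mdf',      -- placeholder path
        MOVE 'tpcxbb_1gb_log' TO 'C:\SQLData\tpcxbb_1gb.ldf';  -- placeholder path
GO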

SQL Server 2019 Lab suggestions

Hi, I was asked to add suggestions from my review here so here they are:

For the Intelligent Performance module it might be worth reminding people not to run all the cells within the notebook at once.

For the Availability module it might be worth having a prepopulated copy of the gocowboys database available as a download to save people processing time.

Also, it is worth mentioning to people that if they are going to do the bonus they might want to keep an eye out for blocking issues when using the ALTER DATABASE statement, or get them to close everything from the previous section first. A quick check is sketched below.
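Something like this standard DMV query could serve as that check (a hedged sketch, not text from the current lab):

/* Sketch: list sessions that are currently blocked and who is blocking them */
SELECT session_id, blocking_session_id, wait_type, wait_time
FROM sys.dm_exec_requests
WHERE blocking_session_id <> 0;
GO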

It might be worth telling people how much disk space they need for the gocowboys database in advance as well, especially if they are doing the bonus section.

For the Data Virtualization section it might be worth getting them to check that PolyBase is definitely enabled before they look to connect to the Azure SQL database, just to be on the safe side.
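A hedged sketch of what that check could look like (not in the current lab text; SERVERPROPERTY and sp_configure are the standard ways to verify):

/* Sketch: verify PolyBase is installed and enabled */
SELECT SERVERPROPERTY('IsPolyBaseInstalled') AS IsPolyBaseInstalled;
GO
EXEC sp_configure @configname = 'polybase enabled';
GO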

I have had issues with the Linux and Containers section. I suspect it may be resource related, but I want to fully test before putting more details on here. In the meantime I might be testing on machines of various specs to see if that affects the running of the docker-compose.

Finally, it might be worth making sure people know that the slides are available, as they can be fairly useful.

Anyway, that's it for now.

Feedback for SQLG2C for section 4.2 in Module 4

Module 4 feedback for 4.2 based on this readme.md https://github.com/microsoft/sqlworkshops/blob/master/SQLGroundToCloud/sqlgroundtocloud/04-SQLServerOnTheMicrosoftAzurePlatform.md

=====

There is not enough time in this workshop to review all of the fundamentals at a deep level here. You'll receive pointers and an overview of Azure SQL platform benefits. Then, basic networking for Azure SQL will be covered. This section will go quickly in the workshop, but it's recommended to spend some time reviewing the many resources provided in this section.

I read this and wonder if someone will already feel a bit put off by it. I would reword to focus on what you plan to cover, but say there is reference information available if you want to go deeper.

=====

Azure SQL fundamentals

After reading this section I struggled a bit to think I had just read "fundamentals". I felt I got a list of, and links to, a series of benefits of using Azure SQL Database, which is good, but I personally felt like perhaps this should be called Azure SQL Benefits.

For this to truly be fundamentals, I'm missing topics like deployment, configuration, loading data, securing, connecting, writing an application, ...

But I know this is not a deep dive. Contact me if you want to discuss more (especially because we then focus a lot on networking).

Also, Tailspin was part of 4.1 and I see it is in 4.3, but it was not mentioned in 4.2?

====

without user involvement

I personally would prefer to say we provide automation for these functions without saying it doesn't involve any user interaction. For example, for performance you still need to get involved, but we assist you.

=====

Geo-replication by creating readable replicas of your database in the same or different data center (region).

this link only goes to auto failover groups for MI

=====

Networking fundamentals

OK, I read through this several times. I came away feeling it was deeper than just fundamentals, to be honest. Anyone who is not a network expert may struggle through this section.

Also, I walked away from this section with the impression that VNets are only for MI, but I think we have this now for single databases as well?

https://azure.microsoft.com/en-us/blog/vnet-service-endpoints-for-azure-sql-database-now-generally-available/

=====

ExpressRoute Provides secure and private connection between Azure and on-prem

VNets already give you secure and private connectivity. Isn't ExpressRoute more about speed?

=====

AttributeError: 'module' object has no attribute 'get_installed_distributions'

On my machine with Python 3.7.3 and pip 19.0.3 the code that returns the Python packages installed does not work.

This is the code that does not work:

/* Get Python Info */
EXECUTE sp_execute_external_script
@language =N'Python',
@script=N'import sys
print(sys.version)';
GO

EXECUTE sp_execute_external_script 
  @language = N'Python', 
  @script = N'import pip
import pandas as pd
installed_packages = pip.get_installed_distributions()
installed_packages_list = sorted(["%s==%s" % (i.key, i.version)
   for i in installed_packages])
df = pd.DataFrame(installed_packages_list)
OutputDataSet = df'
WITH RESULT SETS (( InstalledPackageAndVersion nvarchar (150) ))

It fails with: AttributeError: 'module' object has no attribute 'get_installed_distributions'.

There is an easy fix, which is not to depend on pip directly (since the behavior is different from one pip version to another) - instead rely on pkg_resources:

/* Get Python Info */
EXECUTE sp_execute_external_script
@language =N'Python',
@script=N'import sys
print(sys.version)';
GO

EXECUTE sp_execute_external_script 
  @language = N'Python', 
  @script = N'import pandas as pd
import pkg_resources
installed_packages = [d for d in pkg_resources.working_set]

installed_packages_list = sorted(["%s==%s" % (i.key, i.version)
   for i in installed_packages])
df = pd.DataFrame(installed_packages_list)
OutputDataSet = df'
WITH RESULT SETS (( InstalledPackageAndVersion nvarchar (150) ))

Feedback for SQLG2C Module 4 through section 4.1

Feedback through 4.1 from this Module 4 document

https://github.com/microsoft/sqlworkshops/blob/master/SQLGroundToCloud/sqlgroundtocloud/04-SQLServerOnTheMicrosoftAzurePlatform.md

Do you want to put a link to the accompanying slide deck in here?

======================

  • General comment: is "Azure SQL" a term?

Personally I think it may not make sense to officially call anything just "SQL" since that represents a language and not a product.

I know we use it ourselves, but should we stick to the official service name? I realize Azure SQL represents both SQL Server in an Azure VM and SQL Database.

=============

This sentence says

assess your on-premises estate (via Tailspin Toys)

but I don't know what Tailspin Toys is. Perhaps:

"through an example with the Tailspin Toys company"

==========

You organize "Activities" at the beginning of the section, but not everything is really an "activity", is it? Also, do you intend for them to click these or just understand the outline of the Module?

==========

4.1 Azure SQL

I would line up the title of this section with the outline at the beginning (make sure to do this for all sections).

As a reader I would like to have some context of what I will be reading about before jumping into the Scenario. Can you add a sentence describing how you plan to teach me Introduction to Azure SQL including using a scenario review, learn deployment options, etc?

=========

They are aware that SQL Server 2008 R2 is approaching end of support

They are aware that SQL Server 2008 R2 has reached end of support

==========

Azure SQL VMs,

I would call it what it says on the website, which is SQL Server on Virtual Machines, or at least say that is your short name for it

==========

Not sure if it's too late to change this one up, but for the visual I would sync up the terminology. Does Azure SQL Database consist of Managed Instances and Databases?

=======

but there are gives and gets to consider

but there are tradeoffs for each option you choose

=======

Azure, quickly.

probably don't need comma

=======

Allscripts

It was not clear if they use SQL Server on Virtual Machines and/or MI?

I would bold the option they chose for each example

=======

Because of the large SQL Server surface area, they were able to move about 1.5 terabytes of data smoothly to Azure SQL Managed Instance

I think you are trying to say that MI provides a better experience when you have a large surface area? That could be confusing. I think you mean that they were using a large surface area of SQL Server features, which made MI the best fit?

=======

Feature comparison between SQL database and SQL managed instance

Is this also a comparison with SQL Server?

======

Once you've landed on a deployment option, the next thing to determine is the service tier.

Does service tier options though sometimes dictate my deployment choice?

Also, I got to this point and if I was trying to make my choice I'm not sure I totally understand the benefits of the managed service such as auto HA, performance insights, etc. I think you can glean that from the customer examples but we may want to make this more explicit since this is all about making the choice.

======

Once you've landed on a deployment option, the next thing to determine is the service tier.

SQL Server on Azure Virtual Machines options

There really isn't a "service tier" for SQL VM?

======

For Azure SQL Database, which is the focus of this module, there are several options and tiers available, and the choices will depend on the scenario.

I got down to this point and didn't realize the focus of the module is on Azure SQL Database. I would mention this up front at the beginning of the module

======

For Azure SQL Database, which is the focus of this module, there are several options and tiers available, and the choices will depend on the scenario.

This means this section is more than just service tiers right?

=====

virtual core (vCore)-based (recommended)

why is this recommended? What is the default?

=====

Backup storage (RA-GRS).

What is RA-GRS?

=====

Decision 2: Choose service tier for availability

There are performance ramifications from this choice as well right?

=====

Business applications with high I/O requirements.

what is high I/O? I think the docs call it low latency requirements because you can get local SSD performance?

=====

Summary diagram

The In-Memory option only being supported for Business Critical is kind of a small point that could be big for some customers. I'm assuming this is In-Memory OLTP? Are there others? We should call that out in the tiering comparison.

=====

Feedback for G2C Module 00

Module:
https://github.com/microsoft/sqlworkshops/blob/master/SQLGroundToCloud/sqlgroundtocloud/00-prerequisites.md

Section:
Activity 4: Set up your environment for Modules 4 and 5
Important Note! If you are being provided an account by your instructor (see Option 4 in Activity 1), you do not need to complete this activity.

Before arriving at the workshop, if you intend to complete 04-SQLServerOnTheMicrosoftAzurePlatform and/or 05-MigratingToAzureSQL, you must complete the prerequisites outlined here. Note that all screenshots are provided, so you can "audit" this part of the Workshop if you like.

Change Proposed:
The pre-reqs link from here points to a series of steps, one of which has the name/password pair. Recommend pulling the password and

Typos and Suggested Wording

Overall, the usage of "." after sentences is inconsistent. Section 1.0 had most sentences ending with a period while 0.0 did not.

In the README.MD file

  • The second link under About this Workshop has the word "Architecture" misspelled
  • "This README.MD file explains how the workshop is laid out, what you will learn" might be better phrased "This README.MD file explains how the workshop is structured, what you will learn"

0.0 Prerequisites

  • "It does have an assumption to know basics of using a Linux bash shell but all commands are provided to run the activities including scripts." could be rephrased as "It does assume basic knowledge of using a Linux bash shell, though all commands are provided to run the activities including scripts."
  • In the activities list, why does it have both bullets and numbers? Maybe indent the 1-6 software required steps and use a different bullet.

Activity:Prerequisites

  • The title "Activity: Prerequsites" had "Prerequisites" misspelled.
  • Step #2 - rephrase to Login to your OpenShift cluster, via a web browser, using the URL provided to you for the openshiftConsoleUrl.
  • Step #4 - add a ":" after "following"
  • Step #6 - add a ":" after "following"
  • Step #7 - add a ":" after "following"
  • Step #8 - add a ":" after "like this"

1.0 SQL Server Deployment on OpenShift

  • In the 3rd paragraph, second sentence, capitalize the "Y" in yaml. It almost reads as oc.yaml with the capital Y
  • In the 4th paragraph, place commas around "and others in the workshop" and add a ":" after "command like"

Activity: Deploy SQL Server on OpenShift

  • Step #3 - add a ":" after "with sudo)"
  • Step #4 - add a ":" after "script"
  • Step #4 - add a ":" after "shell prompt"
  • Step #5 - add a ":" after "shell prompt"
  • Step #6 - add a ":" after "shell prompt"
  • Step #7 - add a ":" after "shell prompt"
  • Step #7 - maybe rephrase the last paragraph to: "You have now submitted a deployment. The deployment is a logical collection of objects ..."
  • Step #7 - maybe rephrase the last sentence to: "Proceed to the next step to check whether the deployment was successful."
  • Step #8 - Rephrase to "When the value of AVAILABLE becomes 1, the deployment was successful and your container is running."
  • Step #8 - Rephrase to "To verify if the deployment was successful, the STATUS of the pod should be Running and the LoadBalancer service has a valid IP address for EXTERNAL-IP."
  • Step #9 - Rephrase to: "The SQL Server database engine produces a file called the ERRORLOG when it starts. The ERRORLOG file can be used to gather interesting information about SQL Server and be used for troubleshooting."

Feedback for SQLG2C for section 4.3 of Module 4

Feedback for Section 4.3 of Module 4 at https://github.com/microsoft/sqlworkshops/blob/master/SQLGroundToCloud/sqlgroundtocloud/04-SQLServerOnTheMicrosoftAzurePlatform.md

Module 4 feedback for 4.3

=========

They're feeling more comfortable with the networking and security aspects as well

But we really didn't talk about security except for links right?

========

Convert Convert the source schema to work in the target environment. This is only relevant for heterogeneous migrations.
SQL Server Migration Assistant

You say heterogeneous, but make sure they know this is only for non-SQL Server sources

=========

Online migrations with DMS

Are these the right links?

Online migrations with DMS
Offline migrations with DMS

=========

Sync data

Data sync just for thoroughness with the diagram

========

Remediate applications Iteratively make any necessary changes to your applications.

Any guidance here?

========

I forgot what the diagram looked like after going through the table

========

Above is a survey of some of the resources and tips available for going through the migration journey to Azure (you can find a full list of tools based on phase and target here).

Don't get overwhelmed, the number of resources shows the dedication Microsoft and partners have to helping you move successfully to the cloud. We recommend using the Data migration guide which allows you to pick your source and target, and then guides you step-by-step through your specific migration journey.

Maybe this should go up front, with the table as only a reference list. I got lost going through the table. I honestly was overwhelmed. There were a bunch of tools and resources listed, but it was not clear which ones I should focus on vs. looking through all of them.

=====

Feedback for G2C Module 03

03 - BDC


I thought the naming for BDC went back to SQL Server 2019 Big Data Clusters? If so, it's in the name of the file and the title of the module.


Spelling/Grammar/Links:

In 3.3 Data Marts section, first sentence has normalized spelled as nomralized

In 3.3 Data Science and Apache Spark ... section, second paragraph first sentence has persisting spelled as presisting

In 3.3 Data Science and Apache Spark ... section, the first paragraph second sentence reads "The first is that the users will involved in the creation of the Features used in various ML and AI algorithms, and are often tasked to Label the data." May want to reword, I'm not exactly sure what you're trying to say here, but it reads weird. Maybe "The first is for users involved in the creation of the Features used in various ML and AI algorithms, that are often tasked to label the data." or something?

First sentence of module "in on-premises and in-cloud", remove first "in"?

Delete second paragraph in 03 intro, it's a duplicate.

3.1 sentence paragraph 4 - "Both of these technologies assumes many nodes" assumes to assume

All links work and point correctly.


In 3.1 paragraph 2:

  • "Hadoop" to "Hadoop (HDFS below)"
  • you say "computing nodes that position the workload over distributed data nodes" but the picture doesn't have any compute nodes; it has a "NameNode" and also b1 b2, b2 b3, b1 b3 .. I am sure you explain all this, but from just a reading perspective I think people (myself included) may be confused here.

In 3.1 paragraph 3:
"Spark is a technology that uses various libraries" to "Spark is a technology that [can be used on top of HDFS] that .." something

I like the updates to the BDC diagram overview!

Typo - Activity 3

Under "Activity 3: Prepare Your Workstation" it says "It's best to have at least 4MB of RAM on the management system" I think that should be 4GB

Feedback for G2C Module 06

06 - What to use When


Spelling/Grammar/Links:

Intro section:

  • 04 Proof-Of-Concept (POC): After the optimal solution technologies and processes are selected, a POC is set up wiht a small representeative example of what a solution might look like, as much as possible. If available, a currently-running solution in a prallel example can be used
    05 Implementation: Implementing a phased-in rollout of the completed solution basedd on

  • First sentence, 'in on-premises and in-cloud'? remove first 'in'?

  • "In production, there are normally 6 phases to create a solution = these can be done in-person, or through recorded documents:" replace the = with ; or new sentence?

6.1 section:

  • 4th paragraph requirments to be requirements
  • In activity first paragraph, descritpions to be descriptions
  • In project goals for AdventureWorks, whereever to wherever
  • In project goals for Contoso, avaialable to available
  • In WWI description last paragraph before Project Goals chilli chocolates to chili chocolates
  • For Contoso description, remove the second " The Contoso Corporation is" in the first sentence.

6.3 section:

  • Second paragraph, mutliple to multiple

6.4 section:

  • First paragraph, oftem to often
  • Last step, soluition to solution

References section:

  • Azure Pricing Calulator to calculator

Links:


  • Feedback: For AW project constraints, I would reword the second bullet, "In some cases, no code change is possible", to something more explicit like "In some cases, they are not able or willing to make any code changes". "No code change is possible", to me, is confusing.

  • In 6.2 steps, it reads "Open your ADS document and ...", but I don't think "ADS" has been introduced yet and may be confused with Azure Data Studio or Azure Data Services?

  • I like the decision matrix activity!

  • There are two "workshops" with problems and solutions in https://aka.ms/azuresqlslides that contain "Common Security Scenarios and how to solve them" and "Designing available, scalable, performant apps with SQL DB" (MI and Singleton for both). Somehow I think we should reference these, maybe as an alternative or a deeper architectural session?

two typos in 01_IntelligentPerformance.md for sql2019lab

https://github.com/microsoft/sqlworkshops/blob/rgward/sql2019lab/01_IntelligentPerformance.md

Typo 1: in "1.0 The Solution"

"processing" I think should be capitalized

Typo 2: in "Activity: SQL Server Intelligent Query Processing"

The WideWorldImporters database example was created using SQL Server 2016 which has a default database compatibility level of 130. When a database is restored from a previous version oF SQL Server, the compatibility level of the database is preserved to help reduce the risk of upgrades.

Typo "oF"

@rgward
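For context on the quoted passage, a hedged sketch of how a reader could check and raise the compatibility level to enable the SQL Server 2019 Intelligent Query Processing features (the database name comes from the quote; 150 is the SQL Server 2019 default level):

/* Sketch: check and raise the database compatibility level */
SELECT name, compatibility_level
FROM sys.databases
WHERE name = 'WideWorldImporters';
GO
ALTER DATABASE WideWorldImporters SET COMPATIBILITY_LEVEL = 150;
GO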

External data sources are not supported with type GENERIC.

04_DataVirtualization: encountered an error when creating the external data source on Azure SQL. Not sure if I missed any other steps.

--SQL statement
CREATE EXTERNAL DATA SOURCE AzureSQLDatabase
WITH (
LOCATION = 'sqlserver://dmasql.database.windows.net',
PUSHDOWN = ON,
CREDENTIAL = AzureSQLDatabaseCredentials1
)
GO

-- Error message
Started executing query at Line 17
Msg 46530, Level 16, State 11, Line 1
External data sources are not supported with type GENERIC.
Total execution time: 00:00:00
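One possible cause, offered as an assumption rather than a confirmed diagnosis: the sqlserver:// (generic) connector is a SQL Server 2019 PolyBase feature, so running this CREATE EXTERNAL DATA SOURCE against an Azure SQL Database itself, rather than against a SQL Server 2019 instance, would fail this way. On Azure SQL Database, external data sources use the elastic query syntax instead; the remote database name below is a placeholder:

/* Hypothetical sketch: elastic query external data source on Azure SQL Database */
CREATE EXTERNAL DATA SOURCE AzureSQLDatabase
WITH (
    TYPE = RDBMS,
    LOCATION = 'dmasql.database.windows.net',
    DATABASE_NAME = 'RemoteDb',  -- placeholder
    CREDENTIAL = AzureSQLDatabaseCredentials1
);
GO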

SQL Server 2019 Lab issue

As promised, here's the issue I have been experiencing with the Linux and Containers section within the SQL Server 2019 Lab.

Sometimes I have noticed that when using docker-compose it freezes during the use of xpstar.dll.

In addition, I also once encountered an error saying that the replication agent had encountered an exception. However, after that error I reset Docker for Windows, which allowed me to go further.

Once the Lab has loaded, the replication snapshot does not always work, which I suspect is due more to the laptop I was using at the time, as the job is set to run when the CPU is idle.

It might be worth testing for minimum specs, as the laptop in question had 4 cores. However, it might just be easier to add an extra bit of text saying that if the snapshot has not been created then the job should be run manually instead.

Feedback for G2C Module 02

02 - Modernizing your data estate with SQL Server 2019


Spelling/Links:

All links work and point correctly

In the third paragraph of 2.6: "with SQL Server 2019 using using the Module 2.6 What else, migrations, and next steps section" the word "using" is twice in a row.

In the intro section to the module, there's no line break between "Performance" and "Note". You can add one by placing two spaces after "Performance." You could additionally add a ">" before Note, if you want it to be formatted that way. Same comment for the Note at the end of 2.5.

2.4 is missing a period ending each of the first two paragraphs.

For 2.4 first paragraph, "with client drivers installed": is it with client drivers installed, not without? I should know the answer to this question, but am just confirming since I don't.


Slides review

Looks great! I like the Migrate to Modern SQL Server and Compat Cert slides. I hadn't seen those before.

Feedback for G2C Module 02 Structure Pass

Module:
https://github.com/microsoft/sqlworkshops/blob/master/SQLGroundToCloud/sqlgroundtocloud/02-ModernizingYourDataEstateWithSQLServer2019.md


Section:
https://github.com/microsoft/sqlworkshops/blob/master/sql2019lab/03_Availability.md#activityadr

Change Proposed:
I think the bookmark isn't rendering correctly. When you click it, you have to scroll down a bit to see the actual activity. Not a deal-breaker, but could be confusing.


Section:
as guided by your instructor.
Activity: Using Polybase with Azure SQL Server Database

Change Proposed:
Same comment as above. Bookmarks do not appear to jump to the proper location

Feedback for G2C Module 02

Module:
https://github.com/microsoft/sqlworkshops/blob/master/SQLGroundToCloud/sqlgroundtocloud/02-ModernizingYourDataEstateWithSQLServer2019.md

Section:
SELF-PACED USERS ONLY: If you are using this module self-paced, read the overview of SQL Server 2019 from SQL Server 2019 of the SQL Server 2019 Lab and then proceed to Module 2.1 of this workshop SQL Server 2019 Intelligent Performance. NOTE: The SQL Server 2019 lab has Next Steps. Do not follow those but read the introduction and come back to the next section of this workshop.

Change Proposed:
The line "NOTE: The SQL Server...." should be on it's own line. If you add this as the Markdown:

> **NOTE:** The SQL Server....

It will indent and format nicely.


Section:
Instructor led workshops will review Intelligent Performance in SQL Server 2019 using the Module 2.1 Intelligent Performance Section from the deck SQL Server 2019 Modern Data Platform

Change Proposed:
Replace "Deck" with "PowerPoint Slides"


Section:
Activity: SQL Server Intelligent Performance
Proceed directly to the following Activities...

Change Proposed:

Open the following Activities from Module SQL Server 2019 Security of the SQL Server 2019 Lab in another tab in your browser for a hands-on exercise. When you are done with these activities, come back here for the next section as guided by your instructor.


Section:
2.6 What Else, Migration, and Next Steps

Change Proposed:
2.6 Additional Improvements, Migration, and Next Steps

Suggested reword in 01_IntelligentPerformance.md for sql2019lab

https://github.com/microsoft/sqlworkshops/blob/rgward/sql2019lab/01_IntelligentPerformance.md

In the guide for getting set up with Azure Data Studio notebooks, the instructions say
"Use the File menu to open up the iqp_tablevariabledeferred.ipynb notebook and go through all the steps."

I might instead recommend something like….

"Use the File > Open Folder menu to open the folder sql2019lab. Now, in Azure Data Studio, you can select the piece-of-paper icon on the left-hand taskbar to navigate through the folder. As you go through the labs, you can find the .ipynb files in their respective folders (e.g. the notebook file for 01_IntelligentPerformance is in sql2019lab/01_IntelligentPerformance/iqp/iqp_tablevariabledeferred.ipynb)."

Why?
That way, in the following labs, they don't have to keep opening each file; they can just navigate using the built-in file explorer.

@rgward

sql2019lab module 2.1 missing Linux filepath

The setup for the lab states that you should be able to use a Windows or Linux instance of SQL Server 2019.

In https://github.com/microsoft/sqlworkshops/blob/rgward/sql2019lab/02_Security.md, however, Lab 2.1 Step 2 only provides the filepath for Windows.

Recommended change:
Add a note below that says something like:
Note: If you are using SQL Server on Linux, change the FILEPATH to /var/opt/mssql/data

@rgward , feel free to ping me if I can help on this one.

Feedback for G2C Module 00

Module 00:

"Pre-requisites" is one word
https://www.bing.com/search?q=pre-requisites&form=EDGTCT&qs=PF&cvid=7694992296db416f84affaeac26bfc7c&refig=9e92d7c6405d43d5a5a5ea17f02ca72e&cc=US&setlang=en-US&elv=AY3%21uAY7tbNNZGZ2yiGNjfOPkBut2GbneA0WHmPu2OJYXzHppCuhAGnqwMAYx4AVhjqt7xth5nQ01TxgMeYEl7fOmdrT8RygqDm0DU0o9Xnw&plvar=0

In my opinion, this should be updated everywhere it's present in the workshop. But it's not a big deal.


Activity 1, Option 1:
"VM's in the Portal to ensure that you do no exceed the cost limits on this account." should be "VMs" and "not"


Checked all links and spelling


Feedback for G2C Module 01

Module 01-Introduction

Spelling:

  • In the workshop scenario section:
    04 Proof-Of-Concept (POC): After the optimal solution technologies and processes are selected, a POC is set up wiht a small representeative example of what a solution might look like, as much as possible. If available, a currently-running solution in a prallel example can be used
    05 Implementation: Implementing a phased-in rollout of the completed solution basedd on findings from the previous phases
    06 Handoff: A post-mortem on the project with a discussion of future enhancements
    Throught

  • In the Workshop methodology section
    Title of section "Workshop Methodolgy" spelling error
    In first paragraph "use these modules indpendently in many cases"


Workshop scenario doesn't include Tailspin Toys gaming yet (I know you're working on this, let me know how I can help).


Checked all links and spelling
