aws-samples / amazon-aurora-labs-for-mysql

Workshop and lab content for Amazon Aurora MySQL-compatible databases. This repository contains a series of templates, instructional guides, and sample code to educate users on how to use Amazon Aurora features. The AWS CloudFormation templates create the relevant resources in a user's account, while the Bash and Python scripts support the labs by automating tasks, generating load, or executing changes on resources created using AWS CloudFormation.


amazon-aurora-labs-for-mysql's People

Contributors

adisamant, cayblood, deki, dependabot[bot], edchan-aws, guikcd, jpeddicord, karumajji, ovi-hutu-aws, periyatv-aws, saselvan, vladvataws


amazon-aurora-labs-for-mysql's Issues

Database Activity Streams lab in website has incorrect references to pgSQL + das-script.py

I was trying to run the labs for database activity streams based on instructions from the Aurora Labs for MySQL website, and noticed a few issues in the DAS section.

There are a number of references to PostgreSQL that should be references to MySQL instead. In addition, step 4 references '/home/ec2-user/das-script.py', but the provisioned script is actually /home/ubuntu/das_reader.py. Inspection of that script also shows that the cluster ID and stream ID are now supplied as command-line parameters rather than values edited into the source code.

The associated MD file in this repo does not have the aforementioned issues, however, which leads me to think that the version of the labs deployed on the website is not up to date.

Add simple data interaction

Add a few simple queries to “see” the data prior to running the load test in the “Connecting to the DB Cluster” module – SHOW TABLES, COUNT(1), SELECT * ... LIMIT 10, and so on; a sketch follows below. I feel it'll make the data and the experience a bit more tangible for lab participants.
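A minimal sketch of what those queries could look like, assuming the lab's mylab database (the table name sbtest1 is a hypothetical placeholder; use whatever the load scripts actually create):

-- Switch to the lab database.
USE mylab;

-- List the tables created so far.
SHOW TABLES;

-- Row count for one table ('sbtest1' is a placeholder name).
SELECT COUNT(1) FROM sbtest1;

-- Peek at a few rows to make the data tangible.
SELECT * FROM sbtest1 LIMIT 10;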

Failover lab module

Demonstrate how failover works, how its impact can be minimized, and how to do failover testing using fault injection queries; see the sketch below.
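Aurora MySQL supports fault injection queries for exactly this kind of testing; a sketch of what the module could exercise (the percentage and interval values are illustrative):

-- Crash the writer instance to trigger a failover.
ALTER SYSTEM CRASH INSTANCE;

-- Simulate a failure of all read replicas for 60 seconds.
ALTER SYSTEM SIMULATE 100 PERCENT READ REPLICA FAILURE TO ALL FOR INTERVAL 60 SECOND;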

performance_schema.events_statements_summary_by_digest not getting accurate results

Aurora Version: 5.7.mysql_aurora.2.04.5

Params modified:

  • aurora_lab_mode = 0
  • performance_schema = 1
  • performance_schema_consumer_events_statements_history_long = 1
  • performance_schema_consumer_statements_digest = 1

I'm using a small Golang app where I loop through a number of sample queries 100 times each, repeating the whole cycle 3 times. I'm using the sample Sakila and Employees databases. Although I've mostly been testing with this app, I've also done some smaller runs manually using the mysql command line and got similar results.

The problem I'm seeing is that the values in events_statements_summary_by_digest aren't incrementing for each execution of the queries. Some queries in my app do count accurately while others don't, and I can't find any pattern as to why. I've also checked to make sure the table isn't coming close to being full.
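For reference, a quick way to inspect the digest counts and to check whether digest rows are being dropped (standard Performance Schema queries; the LIKE filter is just an example):

-- Per-digest execution counts for the statements in question.
SELECT DIGEST_TEXT, COUNT_STAR
FROM performance_schema.events_statements_summary_by_digest
WHERE DIGEST_TEXT LIKE '%sakila.city%';

-- A non-zero value here would mean the digest table filled up and statements were lost.
SHOW GLOBAL STATUS LIKE 'Performance_schema_digest_lost';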

Query ran 300 times:
select city as c from sakila.city where country_id = 2;

[Screenshot: events_statements_summary_by_digest showing a count of 1 for this digest]

I tried a new connection with the exact same query 300 more times and got the same result: the count still shows 1.

Query ran 300 times:
select city as c from sakila.city where country_id = 3;

As you can see, I've changed the value of country_id to 3. Even though the query ran 300 times, it only added 1 to the count.
[Screenshot: the digest count shows only 2]

Again I ran the same query over a new connection with country_id = 3, and the count didn't increase; it stayed at 2.

Query ran 300 times:
select city as c from sakila.city where country_id = 4;

Changed country_id to 4. Once again it only added 1 to the count.
[Screenshot: the digest count shows only 3]

Ran again with country_id = 4 with a new connection and count stayed as 3.

Query ran 300 times:
select city as c from sakila.city where country_id = 2;

I changed country_id back to 2, which is the same value as in the first run. As you can see, the count stays at 3.
[Screenshot: the digest count still shows 3]

The queries below run without issues, and the counts are accurate. Both queries are run in the exact same manner, from the same Go app and against the exact same database.
select first_name from sakila.actor where last_name = 1

update sakila.staff set active = 'Yes' where staff_id = 1

Here are a few more that don't count accurately.
select concat(first_name, ' ', last_name) as n from sakila.actor where last_update > 2006-02-16 limit 1

select concat(e.first_name, ' ', e.last_name) as current_employee from employees.departments as d inner join employees.dept_manager as dm on d.dept_no = dm.dept_no inner join employees.employees as man ON dm.emp_no = man.emp_no inner join employees.dept_emp as de ON dm.dept_no = de.dept_no inner join employees.employees as e ON de.emp_no = e.emp_no where d.dept_no = 'd009' and de.to_date > '2021-06-01' and concat(e.first_name, ' ', e.last_name) = 'Aamer Cyre' group by dept_name, current_employee order by dept_name, current_employee limit 1000

select concat(e.first_name, ' ', e.last_name) as current_employee from employees.departments as d inner join employees.dept_manager as dm on d.dept_no = dm.dept_no inner join employees.employees as man ON dm.emp_no = man.emp_no inner join employees.dept_emp as de ON dm.dept_no = de.dept_no inner join employees.employees as e ON de.emp_no = e.emp_no where d.dept_no = 'd009' and de.to_date > '2021-06-01' and concat(e.first_name, ' ', e.last_name) = 'Aamer Cyre' group by dept_name, current_employee order by dept_name, current_employee

Auto-generated passwords break python commands

If you are unfortunate enough that the auto-generated password starts with an equals sign (=), the workshop's Python commands (from R2, Connect, Load Data and Auto Scale) do not work:

python3 reader_loadtest.py -e[readerEndpoint] -u$DBUSER -p"$DBPASS" -dmylab

returns

(1045, "Access denied for user 'masteruser'@'172.31.0.7' (using password: YES)")

while the same mysql command works:
mysql -hauroralab-mysql-node-2.ciwmnxgn1aeo.eu-central-1.rds.amazonaws.com -u$DBUSER -p"$DBPASS" mylab
Welcome to the MySQL monitor. [...] mysql>

because the command-line parser used in the Python script swallows the = at the start of the value of parameter -p.

You have to either generate passwords without the equals sign or put a space in between: -p "$DBPASS". A minimal reproduction of the parser behavior is sketched below.
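Assuming the script parses its options with Python's standard argparse (an assumption; the lab script's actual parser may differ), the behavior reproduces like this:

import argparse

# Minimal reproduction: argparse treats '=' as an option/value separator,
# so a password starting with '=' loses its first character.
parser = argparse.ArgumentParser()
parser.add_argument('-p', dest='password')

print(parser.parse_args(['-p=secret']).password)      # prints 'secret'  (leading '=' swallowed)
print(parser.parse_args(['-p', '=secret']).password)  # prints '=secret' (correct)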

select count(distinct(case ...)) sometimes returns 0

I have an RDS cluster:

mysql> show variables like '%version%';
+-------------------------+------------------------------+
| Variable_name           | Value                        |
+-------------------------+------------------------------+
| aurora_version          | 1.23.0                       |
| innodb_version          | 1.2.10                       |
| protocol_version        | 10                           |
| slave_type_conversions  |                              |
| version                 | 5.6.10-log                   |
| version_comment         | MySQL Community Server (GPL) |
| version_compile_machine | x86_64                       |
| version_compile_os      | Linux                        |
+-------------------------+------------------------------+
8 rows in set (0.00 sec)

The following query against a large table (~500 million rows) returns 0 where it should have returned a value > 0:

select 
  count(game_action_id) ga_count,
  count(distinct(game_action_id)) count_distinct,
  count(case when type='WAGER' then game_action_id end) wager_count,
  count(distinct(case when type='WAGER' then game_action_id end)) wager_count_distinct 
from ga_tst 
where created_timestamp >= '2020-11-01 00:00:00' and created_timestamp <= '2020-11-01 23:59:59';

It works correctly for dates outside November, and doesn't work when the interval is in November. The table is defined as:

CREATE TABLE `game_action` (
  `game_action_id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
  `game_id` int(11) NOT NULL,
  `user_id` int(11) NOT NULL,
  `game_instance_id` bigint(20) unsigned DEFAULT NULL,
  `type` varchar(15) NOT NULL,
  `amount` decimal(18,2) NOT NULL,
  `currency` varchar(15) NOT NULL,
  `created_timestamp` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
  PRIMARY KEY (`game_action_id`,`created_timestamp`),
  KEY `GA_IX01` (`game_id`),
  KEY `GA_IX02` (`user_id`),
  KEY `GA_IX03` (`game_instance_id`)
) ENGINE=InnoDB AUTO_INCREMENT=447579828 DEFAULT CHARSET=utf8
/*!50100 PARTITION BY RANGE ( UNIX_TIMESTAMP(created_timestamp))
(PARTITION pMIN VALUES LESS THAN (1388534400) ENGINE = InnoDB,
 PARTITION p2014_01 VALUES LESS THAN (1391212800) ENGINE = InnoDB,
 PARTITION p2014_02 VALUES LESS THAN (1393632000) ENGINE = InnoDB,
 PARTITION p2014_03 VALUES LESS THAN (1396310400) ENGINE = InnoDB,
 PARTITION p2014_04 VALUES LESS THAN (1398902400) ENGINE = InnoDB,
...
 PARTITION p2020_08 VALUES LESS THAN (1598918400) ENGINE = InnoDB,
 PARTITION p2020_09 VALUES LESS THAN (1601510400) ENGINE = InnoDB,
 PARTITION p2020_10 VALUES LESS THAN (1604188800) ENGINE = InnoDB,
 PARTITION p2020_11 VALUES LESS THAN (1606780800) ENGINE = InnoDB,
 PARTITION p2020_12 VALUES LESS THAN (1609459200) ENGINE = InnoDB,
 PARTITION pMAX VALUES LESS THAN MAXVALUE ENGINE = InnoDB) */;


If I change the query to the following, it works correctly:

select 
  count(ga1.game_action_id) ga_count,
  count(distinct(ga1.game_action_id)) count_distinct,
  count(case when ga1.type='WAGER' then ga1.game_action_id end) wager_count,
  count(distinct(case when ga1.type='WAGER' then ga1.game_action_id end)) wager_count_distinct 
from (
  select * from game_action ga2
  where ga2.created_timestamp >= '2020-11-01 00:00:00' 
    and ga2.created_timestamp <= '2020-11-01 23:59:59'
) ga1;

The problem seems to be somewhat related to the size of the table. I made a copy of the production table game_action with the same partitioning and indexes, and the problem still occurs there. I then copied just November's data to a separate table created with identical indexes and partitions, and the original query works correctly on that copy, which is why I tried the workaround with the subquery. This seems to indicate that the problem is related to a conflict arising from the use of count(distinct(case when ga1.type='WAGER' then ga1.game_action_id end)) together with a WHERE clause.

Avoid typing password, get from Secrets Manager

Under the section “2. Connecting to the DB Cluster”, add the aws secretsmanager get-secret-value --secret-id [secretArn] | jq -r '.SecretString' command and pipe its output into the DBPASS variable directly, to avoid typing the password on the command line; a sketch follows.
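A sketch of what that could look like ([secretArn] stays a placeholder; the second form applies if the secret stores a JSON document with a password field rather than the bare password):

# SecretString holds the bare password:
DBPASS=$(aws secretsmanager get-secret-value --secret-id [secretArn] | jq -r '.SecretString')

# SecretString holds a JSON document with a 'password' field:
DBPASS=$(aws secretsmanager get-secret-value --secret-id [secretArn] | jq -r '.SecretString | fromjson | .password')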

Error when running das_reader.py

Hi, I ran into the error below when working on “Step 4. Read activity from the stream” of “Set up Database Activity Streams (DAS)”. It looks like the script assumes an older version of the AWS Encryption SDK. Thanks!

ubuntu@ip-172-31-0-243:~$ python3 das_reader_new.py -i cluster-M7KHZGHKIMDTO2CJSRRMUJDFVA -s aws-rds-das-cluster-M7KHZGHKIMDTO2CJSRRMUJDFVA
Press Ctrl+C to quit this test...
Filtering is enabled, only user events are displayed, service events are skipped. Disable with parameter '-f0'.
Traceback (most recent call last):
  File "das_reader_new.py", line 153, in <module>
    plaintext = decrypt_decompress(payload_decoded, data_key_decrypt_result['Plaintext']).decode('utf8')
  File "das_reader_new.py", line 104, in decrypt_decompress
    decrypted = decrypt_payload(payload, key)
  File "das_reader_new.py", line 97, in decrypt_payload
    decrypted_plaintext, header = aws_encryption_sdk.decrypt(
AttributeError: module 'aws_encryption_sdk' has no attribute 'decrypt'
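For context, the module-level decrypt() was removed in version 2.0 of the AWS Encryption SDK for Python; decryption now goes through a client object. A sketch of the newer call shape (the key_provider argument stands in for whatever master key provider the lab script builds around the decrypted data key):

import aws_encryption_sdk
from aws_encryption_sdk import CommitmentPolicy

# aws-encryption-sdk >= 2.0: module-level encrypt()/decrypt() no longer exist.
enc_client = aws_encryption_sdk.EncryptionSDKClient(
    commitment_policy=CommitmentPolicy.FORBID_ENCRYPT_ALLOW_DECRYPT
)

def decrypt_payload(payload, key_provider):
    # Same keyword arguments as the old aws_encryption_sdk.decrypt() call.
    decrypted_plaintext, header = enc_client.decrypt(
        source=payload,
        key_provider=key_provider,
    )
    return decrypted_plaintext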

Loop through describe-* command to check status

Under the “Cloning Clusters” module, after executing the CLI command to add an instance to the existing cluster, consider using a loop to check the status:

while true; do
  aws rds describe-db-clusters \
    --db-cluster-identifier labstack-cluster-clone \
    | jq -r '.DBClusters[0].Status, .DBClusters[0].Endpoint'
  sleep 1
done
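A variant that exits on its own once the clone reports available, using the CLI's built-in --query instead of jq (same cluster identifier as above):

while true; do
  status=$(aws rds describe-db-clusters \
    --db-cluster-identifier labstack-cluster-clone \
    --query 'DBClusters[0].Status' --output text)
  echo "$status"
  [ "$status" = "available" ] && break
  sleep 1
done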
