aws-samples / amazon-aurora-labs-for-mysql

Workshop and lab content for Amazon Aurora MySQL-compatible databases. This repository contains a series of templates, instructional guides, and sample code to educate users on how to use Amazon Aurora features. The AWS CloudFormation templates create the relevant resources in a user's account, while the Bash and Python scripts support the labs by automating tasks, generating load, or executing changes on resources created using AWS CloudFormation.


amazon-aurora-labs-for-mysql's People

Contributors

adisamant, cayblood, deki, dependabot[bot], edchan-aws, guikcd, jpeddicord, karumajji, ovi-hutu-aws, periyatv-aws, saselvan, vladvataws


amazon-aurora-labs-for-mysql's Issues

Database Activity Streams lab in website has incorrect references to pgSQL + das-script.py

I was trying to run the labs for database activity streams based on instructions from the Aurora Labs for MySQL website, and noticed a few issues in the DAS section.

There are a number of references to PostgreSQL that should be references to MySQL instead. In addition, step 4 references '/home/ec2-user/das-script.py', but the provisioned script is actually /home/ubuntu/das_reader.py. Inspection of that script also shows that the cluster ID and stream ID are now supplied as command-line parameters rather than values edited into the source code.

The associated MD file in this repo does not have the aforementioned issues, however, which leads me to think that the version of the labs deployed on the website is not up to date.

Add simple data interaction

Add a few simple queries to “see” the data prior to running the load test in the “Connecting to the DB Cluster” module – SHOW TABLES, COUNT(1), SELECT * ... LIMIT 10, and so on; a sketch follows below. I feel it'll make the data and the experience a bit more tangible for lab participants.
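A minimal sketch of what those queries could look like, assuming the lab's mylab database (the table name sbtest1 is a hypothetical placeholder; use whatever the load scripts actually create):

-- Switch to the lab database.
USE mylab;

-- List the tables created so far.
SHOW TABLES;

-- Row count for one table ('sbtest1' is a placeholder name).
SELECT COUNT(1) FROM sbtest1;

-- Peek at a few rows to make the data tangible.
SELECT * FROM sbtest1 LIMIT 10;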

Failover lab module

Demonstrate how failover works, how its impact can be minimized, and how to do failover testing using fault injection queries; see the sketch below.
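Aurora MySQL supports fault injection queries for exactly this kind of testing; a sketch of what the module could exercise (the percentage and interval values are illustrative):

-- Crash the writer instance to trigger a failover.
ALTER SYSTEM CRASH INSTANCE;

-- Simulate a failure of all read replicas for 60 seconds.
ALTER SYSTEM SIMULATE 100 PERCENT READ REPLICA FAILURE TO ALL FOR INTERVAL 60 SECOND;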

performance_schema.events_statements_summary_by_digest not getting accurate results

Aurora Version: 5.7.mysql_aurora.2.04.5

Params modified:

  • aurora_lab_mode = 0
  • performance_schema = 1
  • performance_schema_consumer_events_statements_history_long = 1
  • performance_schema_consumer_statements_digest = 1

I'm using a small Golang app where I loop through a number of sample queries 100 times each, repeating the whole cycle 3 times. I'm using the sample Sakila and Employees databases. Although I've mostly been testing with this app, I've also done some smaller runs manually using the mysql command line and got similar results.

The problem I'm seeing is that the values in events_statements_summary_by_digest aren't incrementing for each execution of the queries. Some queries in my app do count accurately while others don't, and I can't find any pattern as to why. I've also checked to make sure the table isn't coming close to being full.
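For reference, a quick way to inspect the digest counts and to check whether digest rows are being dropped (standard Performance Schema queries; the LIKE filter is just an example):

-- Per-digest execution counts for the statements in question.
SELECT DIGEST_TEXT, COUNT_STAR
FROM performance_schema.events_statements_summary_by_digest
WHERE DIGEST_TEXT LIKE '%sakila.city%';

-- A non-zero value here would mean the digest table filled up and statements were lost.
SHOW GLOBAL STATUS LIKE 'Performance_schema_digest_lost';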

Query ran 300 times:
select city as c from sakila.city where country_id = 2;

[Screenshot: events_statements_summary_by_digest showing a count of 1 for this digest]

I tried a new connection with the exact same query 300 more times and got the same result: the count still shows 1.

Query ran 300 times:
select city as c from sakila.city where country_id = 3;

As you can see, I've changed the value of country_id to 3. Even though the query ran 300 times, it only added 1 to the count.
[Screenshot: the digest count shows only 2]

Again I ran the same query over a new connection with country_id = 3, and the count didn't increase; it stayed at 2.

Query ran 300 times:
select city as c from sakila.city where country_id = 4;

Changed country_id to 4. Once again it only added 1 to the count.
[Screenshot: the digest count shows only 3]

Ran again with country_id = 4 with a new connection and count stayed as 3.

Query ran 300 times:
select city as c from sakila.city where country_id = 2;

I changed country_id back to 2, which is the same value as in the first run. As you can see, the count stays at 3.
[Screenshot: the digest count still shows 3]

The queries below run without issues, and the counts are accurate. Both queries are run in the exact same manner, from the same Go app and against the exact same database.
select first_name from sakila.actor where last_name = 1

update sakila.staff set active = 'Yes' where staff_id = 1

Here are a few more that don't count accurately.
select concat(first_name, ' ', last_name) as n from sakila.actor where last_update > 2006-02-16 limit 1

select concat(e.first_name, ' ', e.last_name) as current_employee from employees.departments as d inner join employees.dept_manager as dm on d.dept_no = dm.dept_no inner join employees.employees as man ON dm.emp_no = man.emp_no inner join employees.dept_emp as de ON dm.dept_no = de.dept_no inner join employees.employees as e ON de.emp_no = e.emp_no where d.dept_no = 'd009' and de.to_date > '2021-06-01' and concat(e.first_name, ' ', e.last_name) = 'Aamer Cyre' group by dept_name, current_employee order by dept_name, current_employee limit 1000

select concat(e.first_name, ' ', e.last_name) as current_employee from employees.departments as d inner join employees.dept_manager as dm on d.dept_no = dm.dept_no inner join employees.employees as man ON dm.emp_no = man.emp_no inner join employees.dept_emp as de ON dm.dept_no = de.dept_no inner join employees.employees as e ON de.emp_no = e.emp_no where d.dept_no = 'd009' and de.to_date > '2021-06-01' and concat(e.first_name, ' ', e.last_name) = 'Aamer Cyre' group by dept_name, current_employee order by dept_name, current_employee

Auto-generated passwords break python commands

If you are unfortunate enough that the auto-generated password starts with an equals sign (=), the workshop's Python commands (from R2, Connect, Load Data and Auto Scale) do not work:

python3 reader_loadtest.py -e[readerEndpoint] -u$DBUSER -p"$DBPASS" -dmylab

returns

(1045, "Access denied for user 'masteruser'@'172.31.0.7' (using password: YES)")

while the same mysql command works:
mysql -hauroralab-mysql-node-2.ciwmnxgn1aeo.eu-central-1.rds.amazonaws.com -u$DBUSER -p"$DBPASS" mylab
Welcome to the MySQL monitor. [...] mysql>

because the command-line parser used in the Python script swallows the = at the start of the value of parameter -p.

You have to either generate passwords without the equals sign or put a space in between: -p "$DBPASS". A minimal reproduction of the parser behavior is sketched below.
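Assuming the script parses its options with Python's standard argparse (an assumption; the lab script's actual parser may differ), the behavior reproduces like this:

import argparse

# Minimal reproduction: argparse treats '=' as an option/value separator,
# so a password starting with '=' loses its first character.
parser = argparse.ArgumentParser()
parser.add_argument('-p', dest='password')

print(parser.parse_args(['-p=secret']).password)      # prints 'secret'  (leading '=' swallowed)
print(parser.parse_args(['-p', '=secret']).password)  # prints '=secret' (correct)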

select count(distinct(case ...)) sometimes returns 0

I have an RDS cluster:

mysql> show variables like '%version%';
+-------------------------+------------------------------+
| Variable_name           | Value                        |
+-------------------------+------------------------------+
| aurora_version          | 1.23.0                       |
| innodb_version          | 1.2.10                       |
| protocol_version        | 10                           |
| slave_type_conversions  |                              |
| version                 | 5.6.10-log                   |
| version_comment         | MySQL Community Server (GPL) |
| version_compile_machine | x86_64                       |
| version_compile_os      | Linux                        |
+-------------------------+------------------------------+
8 rows in set (0.00 sec)

The following query against a large table (~500 million rows) returns 0 where it should have returned a value > 0:

select 
  count(game_action_id) ga_count,
  count(distinct(game_action_id)) count_distinct,
  count(case when type='WAGER' then game_action_id end) wager_count,
  count(distinct(case when type='WAGER' then game_action_id end)) wager_count_distinct 
from ga_tst 
where created_timestamp >= '2020-11-01 00:00:00' and created_timestamp <= '2020-11-01 23:59:59';

It works correctly for dates outside November, and doesn't work when the interval is in November. The table is defined as:

CREATE TABLE `game_action` (
  `game_action_id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
  `game_id` int(11) NOT NULL,
  `user_id` int(11) NOT NULL,
  `game_instance_id` bigint(20) unsigned DEFAULT NULL,
  `type` varchar(15) NOT NULL,
  `amount` decimal(18,2) NOT NULL,
  `currency` varchar(15) NOT NULL,
  `created_timestamp` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
  PRIMARY KEY (`game_action_id`,`created_timestamp`),
  KEY `GA_IX01` (`game_id`),
  KEY `GA_IX02` (`user_id`),
  KEY `GA_IX03` (`game_instance_id`)
) ENGINE=InnoDB AUTO_INCREMENT=447579828 DEFAULT CHARSET=utf8
/*!50100 PARTITION BY RANGE ( UNIX_TIMESTAMP(created_timestamp))
(PARTITION pMIN VALUES LESS THAN (1388534400) ENGINE = InnoDB,
 PARTITION p2014_01 VALUES LESS THAN (1391212800) ENGINE = InnoDB,
 PARTITION p2014_02 VALUES LESS THAN (1393632000) ENGINE = InnoDB,
 PARTITION p2014_03 VALUES LESS THAN (1396310400) ENGINE = InnoDB,
 PARTITION p2014_04 VALUES LESS THAN (1398902400) ENGINE = InnoDB,
...
 PARTITION p2020_08 VALUES LESS THAN (1598918400) ENGINE = InnoDB,
 PARTITION p2020_09 VALUES LESS THAN (1601510400) ENGINE = InnoDB,
 PARTITION p2020_10 VALUES LESS THAN (1604188800) ENGINE = InnoDB,
 PARTITION p2020_11 VALUES LESS THAN (1606780800) ENGINE = InnoDB,
 PARTITION p2020_12 VALUES LESS THAN (1609459200) ENGINE = InnoDB,
 PARTITION pMAX VALUES LESS THAN MAXVALUE ENGINE = InnoDB) */;


If I change the query to the following, it works correctly:

select 
  count(ga1.game_action_id) ga_count,
  count(distinct(ga1.game_action_id)) count_distinct,
  count(case when ga1.type='WAGER' then ga1.game_action_id end) wager_count,
  count(distinct(case when ga1.type='WAGER' then ga1.game_action_id end)) wager_count_distinct 
from (
  select * from game_action ga2
  where ga2.created_timestamp >= '2020-11-01 00:00:00' 
    and ga2.created_timestamp <= '2020-11-01 23:59:59'
) ga1;

The problem seems to be somewhat related to the size of the table. I made a copy of the production table game_action with the same partitioning and indexes, and the problem still occurs there. I then copied just November's data to a separate table created with identical indexes and partitions, and the original query works correctly on that copy, which is why I tried the workaround with the subquery. This seems to indicate that the problem is related to a conflict arising from the use of count(distinct(case when ga1.type='WAGER' then ga1.game_action_id end)) together with a WHERE clause.

Avoid typing password, get from Secrets Manager

Under the section “2. Connecting to the DB Cluster”, add the aws secretsmanager get-secret-value --secret-id [secretArn] | jq -r '.SecretString' command and pipe its output into the DBPASS variable directly, to avoid typing the password on the command line; a sketch follows.
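A sketch of what that could look like ([secretArn] stays a placeholder; the second form applies if the secret stores a JSON document with a password field rather than the bare password):

# SecretString holds the bare password:
DBPASS=$(aws secretsmanager get-secret-value --secret-id [secretArn] | jq -r '.SecretString')

# SecretString holds a JSON document with a 'password' field:
DBPASS=$(aws secretsmanager get-secret-value --secret-id [secretArn] | jq -r '.SecretString | fromjson | .password')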

Error when running das_reader.py

Hi, I ran into the error below when working on “Step 4. Read activity from the stream” of “Set up Database Activity Streams (DAS)”. It looks like the script assumes an older version of the AWS Encryption SDK. Thanks!

ubuntu@ip-172-31-0-243:~$ python3 das_reader_new.py -i cluster-M7KHZGHKIMDTO2CJSRRMUJDFVA -s aws-rds-das-cluster-M7KHZGHKIMDTO2CJSRRMUJDFVA
Press Ctrl+C to quit this test...
Filtering is enabled, only user events are displayed, service events are skipped. Disable with parameter '-f0'.
Traceback (most recent call last):
  File "das_reader_new.py", line 153, in <module>
    plaintext = decrypt_decompress(payload_decoded, data_key_decrypt_result['Plaintext']).decode('utf8')
  File "das_reader_new.py", line 104, in decrypt_decompress
    decrypted = decrypt_payload(payload, key)
  File "das_reader_new.py", line 97, in decrypt_payload
    decrypted_plaintext, header = aws_encryption_sdk.decrypt(
AttributeError: module 'aws_encryption_sdk' has no attribute 'decrypt'
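For context, the module-level decrypt() was removed in version 2.0 of the AWS Encryption SDK for Python; decryption now goes through a client object. A sketch of the newer call shape (the key_provider argument stands in for whatever master key provider the lab script builds around the decrypted data key):

import aws_encryption_sdk
from aws_encryption_sdk import CommitmentPolicy

# aws-encryption-sdk >= 2.0: module-level encrypt()/decrypt() no longer exist.
enc_client = aws_encryption_sdk.EncryptionSDKClient(
    commitment_policy=CommitmentPolicy.FORBID_ENCRYPT_ALLOW_DECRYPT
)

def decrypt_payload(payload, key_provider):
    # Same keyword arguments as the old aws_encryption_sdk.decrypt() call.
    decrypted_plaintext, header = enc_client.decrypt(
        source=payload,
        key_provider=key_provider,
    )
    return decrypted_plaintext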

Loop through describe-* command to check status

Under the “Cloning Clusters” module, after executing the CLI command to add an instance to the existing cluster, consider using a loop to check the status:

while true; do
  aws rds describe-db-clusters \
    --db-cluster-identifier labstack-cluster-clone \
    | jq -r '.DBClusters[0].Status, .DBClusters[0].Endpoint'
  sleep 1
done
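A variant that exits on its own once the clone reports available, using the CLI's built-in --query instead of jq (same cluster identifier as above):

while true; do
  status=$(aws rds describe-db-clusters \
    --db-cluster-identifier labstack-cluster-clone \
    --query 'DBClusters[0].Status' --output text)
  echo "$status"
  [ "$status" = "available" ] && break
  sleep 1
done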
