Planet MySQL

  • Percona Live ONLINE: MySQL on Google Cloud: War and Peace! by Akshay Suryawanshi & Jeremy Cole
    This session at Percona Live ONLINE was presented by Akshay Suryawanshi, Senior Production Engineer at Shopify, and Jeremy Cole, Senior Staff Production Engineer – Datastores at Shopify. Shopify is an online and on-premise commerce platform, founded in 2006. It is used by more than a million merchants, and hundreds of billions of dollars of sales have happened on the platform since its inception. The company is a large user of MySQL, and the Black Friday and Cyber Monday weekends are its peak dates of the year, handling hundreds of billions of queries with MySQL. This year’s presentation was an opportunity to talk about the company’s challenges and progress over the last twelve months. Key Google Cloud concepts from the presentation As part of the presentation, it’s important to understand the naming conventions that exist around Google Cloud: Regions – geographic areas where the cloud operates (a region could comprise a building or adjoining buildings). Zones – subdivisions within a region; typically there are three per region, but this varies a bit by region. GCE – Google Compute Engine, the platform that provides virtual machines to run as servers (most of Shopify’s infrastructure is on GCP and runs in VMs). Virtual machine instance – a GCE virtual machine scheduled in a particular zone. Persistent disk (PD) – a network-attached, log-structured block storage volume. GKE – Google Kubernetes Engine, a managed Kubernetes solution running on top of Google Cloud Platform (GCP). 
Peacetime stories Akshay spoke about persistent disks, which are network-attached, distributed, log-structured block storage: “This is the place where you basically say most of your data is, especially when you’re running MySQL data or any sort of databases.” Aside from their performance (network-attached storage usually carries some degree of latency), they provide incredible features, especially fast snapshotting of volumes. “We have utilized the snapshotting behavior to revamp our Backup and Restore infrastructure and brought down our recovery time to less than one hour for even a multi-terabyte disk. This is so incredibly fast that we actually restore each and every snapshot that we preserve or retain as a backup every single day. It’s happening in both regions where we run most of our MySQL fleet,” detailed Akshay. Configurable VMs Virtual machines (VMs) expose an extensive API which is useful for doing things programmatically: “The API is very helpful. It is well documented, and we are using it in a bunch of places,” continued Akshay. Scaling VMs up and down is a seamless, manageable operation (though most such changes require a restart). Provisioning new VMs in an appropriate region is very easy, according to Akshay: “Again, because of the extensive API, which has provided what was required to build resiliency against its own failures. So we spread our VMs across multiple zones. That helps us tremendously when a particular zone goes down. All of this has allowed us to build self-healing tooling to automatically replace failed VMs easily.” GCP is truly multi-regional Google Cloud’s multi-region availability means failover from one region to another is easy, and Shopify can move all its traffic from one region to another in just a few minutes, multiple times a day. They can also expand to a distant geographic region without a lot of work, yet maintain the same stability. 
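The snapshot-based backup and restore workflow Akshay described maps onto the standard persistent disk tooling. A hedged sketch using the gcloud CLI — the disk, zone, and snapshot names below are illustrative placeholders, not details from the talk:

```shell
# Snapshot a PD volume (disk and zone names are placeholders).
gcloud compute disks snapshot mysql-data-disk \
    --zone=us-east1-b \
    --snapshot-names=mysql-backup-20201231

# Restore: materialize a fresh disk from the snapshot, possibly in another zone.
gcloud compute disks create mysql-data-restored \
    --zone=us-east1-c \
    --source-snapshot=mysql-backup-20201231
```

Because snapshots are incremental and restores are lazy-hydrated, this is what makes restoring every retained backup daily, as described above, practical even for multi-terabyte volumes.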
Akshay noted: “Isolating PII data has been a big win for Shopify in the past year when we launched a certain product where PII data needed to be preserved in a particular region, and GCP provides excellent support for that.” Google Kubernetes Engine Kubernetes is an open-source project for container orchestration, and Google Kubernetes Engine (GKE) is a feature-rich tool for using and running Kubernetes. According to Akshay: “Most of our future work is happening towards containerizing MySQL and running and scheduling those containers inside Kubernetes. The automatic storage and file system expansion are helpful in solving database problems.” Zone-aware cluster node scheduling helps schedule the Kubernetes pods so that they are fault-tolerant towards zone failures. The GCP networking is simple to set up. Inter-regional latencies are pretty low, and Shopify can perform region failovers for databases quickly in the event of a disaster. “We can do a whole-region evac within a few minutes. This is because we can keep our databases in both regions up to date due to these low latencies,” explained Akshay. Virtual private clouds (VPCs) are a great way to segment workloads, and isolating network connectivity at the VPC level has helped achieve this. War: Some of the things that can go wrong Jeremy detailed some of the specific challenges that Shopify had faced over the last year, including stockouts, which occur when a requested resource (such as a VM or a disk) is not available at that time. Jeremy noted: “What that looks like is that you attempt to allocate it using some API, and it just takes a very long time to show up. In one particular instance, in one region, we had consistent PD and VM stockouts regularly occurring for several weeks.” It meant that the company had to adapt for when resources were not available at a moment’s notice, and to consider where time-critical components had to be resourced for availability. 
Trouble in persistent disk land According to Jeremy: “One of the bigger problems that we’ve had in general is persistent disk (PD).” An example was a recent outage caused by a change in the persistent disk backend, which caused a regression “anywhere from minor latency impacts to full stalls for several seconds of the underlying PD volume, which of course, pretends to be a disk. So that means the disk is fully stalled for several seconds.” It took several weeks to diagnose the stalls and properly pin the blame on PD. Jeremy noted, “The fun part of the story is that the mitigation for this particular problem involves attaching a substantial PD volume to every one of our VMs to work around a problem that was happening in PD. In order to do that, since we had so many VMs in aggregate, we had to allocate petabytes of persistent disk, and leave them attached for a few months.” Crucial to solving the problem was working closely with their vendor partner. As Jeremy explained, “Sometimes you have to get pretty creative to make things work right now and get yourself back in action.” Troop replacements Live migration (LM) was referred to in the previous year’s Shopify presentation at Percona Live, and the problem still persists according to Jeremy. “We continuously have machines being live migrated and their VMs being moved around between different physical machines.” The frequency of live migrations, and of the problems they cause, is directly related to the frequency of Linux kernel or Intel CVEs. “We’re still getting hostError instance failures where migrations fail and this kills the host,” explained Jeremy. Some live migrations are still breaking NTP time sync. 
“And we are still periodically getting multiple migrations per VM for the same maintenance – up to 11 within a day or so.” A regional ally surrenders In the last year, there was a regional outage: “Google had made a change to their traffic routing in one region, causing basically an overload of their networking stack. And we went down pretty hard because of that. There was nothing really that we could do about it,” said Jeremy. This was despite being deployed across multiple zones and multiple regions. Jeremy concluded the talk with a simple statement: running MySQL in the cloud is not magic. “There are some unique challenges to Google Cloud, unique challenges to running MySQL in cloud infrastructure and unique challenges with the cloud itself. Sometimes running databases in the cloud can feel like you are constantly at war.” Preparing in advance as much as possible around how you manage your database in the cloud can help, particularly when you run at the kind of scale that Shopify does. However, there will always be unexpected events and incidents. Working with your cloud partner and support providers can help here too. You can watch a video of the recording, which includes a Q&A at the end of the presentation. The post Percona Live ONLINE: MySQL on Google Cloud: War and Peace! by Akshay Suryawanshi & Jeremy Cole appeared first on Percona Community Blog.

  • InnoDB Cluster on OCI using Kubernetes and StatefulSets
    In this demo we are setting up InnoDB Cluster on Kubernetes; we will use a StatefulSet and NFS as storage. This demo was created on Oracle Cloud (OCI), but vanilla Kubernetes and NFS were used, so it should work for any cloud or on-prem. More information on my github page here Setup NFS Server to act as your persistent volume. Setup a NFS Server for your persistent volumes, howto here If you are using a public cloud provider you can most likely use dynamic storage options for handling of PV. In the examples below I have a NFS Server on IP: This NFS exposes folders: /var/nfs/pv0 /var/nfs/pv1 /var/nfs/pv2 Kubernetes configuration You can look at the configuration for kubernetes in the yamls folder. First we create three persistent volumes (pv0-pv2) for our InnoDB Cluster nodes. We specify that each volume can only be accessed by one node (ReadWriteOnce), and that we will use our NFS server for storage. More information on PV here. After we have created the persistent volumes we will create the StatefulSet. A StatefulSet is a way in Kubernetes to manage stateful applications. First we create services for our cluster nodes to expose them on the network. Next we configure our StatefulSet: we want three replicas (three InnoDB Cluster nodes) that we start in parallel. We also use the simplified way of defining a volume claim template (volumeClaimTemplates) that will claim the three previously created volumes. Create PV:
kubectl create -f yamls/02-mysql-pv.yaml
kubectl get pv (short for kubectl get persistentvolumes; the volumes should be in STATUS Available)
Create a unique namespace for the cluster: Namespaces in k8s are like compartments in OCI, so let's create a unique namespace for our cluster. 
kubectl create namespace mysql-cluster
kubectl get namespaces
Set the default namespace for coming commands to mysql-cluster:
kubectl config set-context --current --namespace=mysql-cluster
Start the cluster nodes using the StatefulSet:
kubectl create -f yamls/02-mysql-innodb-cluster-manual.yaml
Look at:
kubectl get pv,pvc
watch kubectl get all -o wide
(or kubectl get all -o wide -n mysql-cluster)
(or kubectl get all -o wide --all-namespaces)
If there are problems, look at the logs (mysqld is started directly and the error log is set to stderr in our docker image):
kubectl logs mysql-innodb-cluster-0
or at previously failed pods by running:
kubectl logs -p mysql-innodb-cluster-1
Look at the configuration for the pod:
kubectl describe pod mysql-innodb-cluster-1
Login to MySQL:
kubectl exec -it mysql-innodb-cluster-0 -- mysql -uroot -p_MySQL2020_
kubectl exec -it mysql-innodb-cluster-0 -- mysqlsh -uroot -p_MySQL2020_ -S/var/run/mysqld/mysqlx.sock
Create InnoDB Cluster
Create the admin user for InnoDB Cluster on all nodes:
kubectl exec -it mysql-innodb-cluster-0 -- mysql -uroot -p_MySQL2020_ -e"SET SQL_LOG_BIN=0; CREATE USER 'idcAdmin'@'%' IDENTIFIED BY 'idcAdmin'; GRANT ALL ON *.* TO 'idcAdmin'@'%' WITH GRANT OPTION";
kubectl exec -it mysql-innodb-cluster-1 -- mysql -uroot -p_MySQL2020_ -e"SET SQL_LOG_BIN=0; CREATE USER 'idcAdmin'@'%' IDENTIFIED BY 'idcAdmin'; GRANT ALL ON *.* TO 'idcAdmin'@'%' WITH GRANT OPTION";
kubectl exec -it mysql-innodb-cluster-2 -- mysql -uroot -p_MySQL2020_ -e"SET SQL_LOG_BIN=0; CREATE USER 'idcAdmin'@'%' IDENTIFIED BY 'idcAdmin'; GRANT ALL ON *.* TO 'idcAdmin'@'%' WITH GRANT OPTION";
Create your cluster using shell from one of the pods (mysql-innodb-cluster-0). Login to shell:
kubectl exec -it mysql-innodb-cluster-0 -- mysqlsh -uidcAdmin -pidcAdmin -S/var/run/mysqld/mysqlx.sock
and then configure the instances: 
dba.configureInstance('idcAdmin@mysql-innodb-cluster-0:3306',{password:'idcAdmin',interactive:false,restart:true});
dba.configureInstance('idcAdmin@mysql-innodb-cluster-1:3306',{password:'idcAdmin',interactive:false,restart:true});
dba.configureInstance('idcAdmin@mysql-innodb-cluster-2:3306',{password:'idcAdmin',interactive:false,restart:true});
You will get an error when running "dba.configureInstance": ERROR: Remote restart of MySQL server failed: MySQL Error 3707 (HY000): Restart server failed (mysqld is not managed by supervisor process). This is due to a limitation on running the "restart" command in MySQL for our docker container; we are working on solving this. Please restart MySQL manually to enable the new settings; the easiest way is to scale down and then up again, like:
kubectl scale statefulset --replicas=0 mysql-innodb-cluster
Look at watch kubectl get all -o wide during the scale up/down.
kubectl scale statefulset --replicas=3 mysql-innodb-cluster
Login to shell again:
kubectl exec -it mysql-innodb-cluster-0 -- mysqlsh -uidcAdmin -pidcAdmin -S/var/run/mysqld/mysqlx.sock
and run:
cluster=dba.createCluster("mycluster",{exitStateAction:'OFFLINE_MODE',autoRejoinTries:'20',consistency:'BEFORE_ON_PRIMARY_FAILOVER'});
cluster.status()
cluster.addInstance('idcAdmin@mysql-innodb-cluster-1:3306',{password:'idcAdmin',recoveryMethod:'clone'});
cluster.addInstance('idcAdmin@mysql-innodb-cluster-2:3306',{password:'idcAdmin',recoveryMethod:'clone'});
cluster.status()
Done, you should now have a running InnoDB Cluster using StatefulSets on Kubernetes. 
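The three identical per-pod CREATE USER commands earlier in the walkthrough can be collapsed into a short loop; the pod names and credentials below are the ones used in this demo:

```shell
# Create the idcAdmin user on all three InnoDB Cluster pods in one pass.
# Pod naming follows the StatefulSet convention: <name>-0, <name>-1, <name>-2.
for i in 0 1 2; do
  kubectl exec -it mysql-innodb-cluster-$i -- mysql -uroot -p_MySQL2020_ \
    -e "SET SQL_LOG_BIN=0;
        CREATE USER 'idcAdmin'@'%' IDENTIFIED BY 'idcAdmin';
        GRANT ALL ON *.* TO 'idcAdmin'@'%' WITH GRANT OPTION;"
done
```

SQL_LOG_BIN=0 keeps the user creation out of the binary log, so each node gets the user locally without replicating the statement.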
Simulate a failure
Look at the cluster status; login to mysql shell:
kubectl exec -it mysql-innodb-cluster-1 -- mysqlsh -uidcAdmin -pidcAdmin -S/var/run/mysqld/mysqlx.sock
And look at cluster status:
cluster=dba.getCluster()
cluster.status()
Also look at the pods:
watch kubectl get all -o wide -n mysql-cluster
Kill the pod that is primary (RW) (most likely mysql-innodb-cluster-0):
kubectl delete pod mysql-innodb-cluster-0
You should now see that the deleted pod is restarted and that the "old" primary (RW) will join after restart as a secondary (RO).
If you want to remove everything:
kubectl delete -f yamls/02-mysql-innodb-cluster-manual.yaml
kubectl delete pvc mysql-persistent-storage-mysql-innodb-cluster-0
kubectl delete pvc mysql-persistent-storage-mysql-innodb-cluster-1
kubectl delete pvc mysql-persistent-storage-mysql-innodb-cluster-2
kubectl delete -f yamls/02-mysql-pv.yaml
Make sure all is deleted:
kubectl get pv,pvc
kubectl get all -o wide
Remember to also empty out the datadir on NFS between tests:
sudo rm -fr /var/nfs/pv[0,1,2]/*
ls /var/nfs/pv[0,1,2]/
Extras More information around InnoDB Cluster here Whenever deploying new stuff look at: watch kubectl get all -o wide Good training on Kubernetes:
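For reference, a minimal sketch of what one of the three NFS-backed persistent volumes in yamls/02-mysql-pv.yaml might look like; the real file is on the author's GitHub, and the NFS server address and 10Gi capacity here are placeholders, not values from the demo:

```shell
# Hypothetical sketch of one PV backed by the NFS export /var/nfs/pv0.
# <NFS_SERVER_IP> and the storage size are placeholders -- adjust to your setup.
cat <<'EOF' | kubectl apply -f -
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv0
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce        # one node at a time, as described in the demo
  persistentVolumeReclaimPolicy: Retain
  nfs:
    server: <NFS_SERVER_IP>
    path: /var/nfs/pv0
EOF
```

The volumeClaimTemplates in the StatefulSet then bind one such PV per replica, which is how each InnoDB Cluster node keeps its own datadir across pod restarts.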

  • MySQL 8.0 InnoDB Cluster with WordPress in OCI – part III
    With this post we are reaching the end of our journey to HA for WordPress & MySQL 8.0 on OCI. If you have not read the two previous articles, this is just the right time: MySQL 8.0 InnoDB ReplicaSet with WordPress in OCI MySQL 8.0 InnoDB ReplicaSet with WordPress in OCI – part II We started this trip using MySQL InnoDB ReplicaSet, where only 2 servers are sufficient but which doesn’t provide automatic fail-over. In this article we will upgrade our InnoDB ReplicaSet to InnoDB Cluster. Therefore we will also need another compute instance for the 3rd recommended node. Architecture Our architecture will be like this: The creation of mymysql03 is something you should master now. If you don’t, please check the first article. Migration to MySQL InnoDB Cluster The first step when we have mymysql03 up and running is to dismantle our nice InnoDB ReplicaSet. It’s always recommended to put the WordPress site in maintenance. We start with the server playing the role of Secondary (currently mymysql01 in my setup): MySQL  mymysql01:33060+   JS  rs.removeInstance('mymysql01') The instance 'mymysql01:3306' was removed from the replicaset. As we will make some modifications on it, we also need to allow writes to it: MySQL  mymysql01:33060+   JS  \sql SET GLOBAL SUPER_READ_ONLY=0; We can now configure this server to be part of a MySQL InnoDB Cluster: MySQL  mymysql01:33060+   JS  dba.configureInstance() Configuring MySQL instance at mymysql01:3306 for use in an InnoDB cluster… This instance reports its own address as mymysql01:3306 Clients and other cluster members will communicate with it through this address by default. If this is not correct, the report_host MySQL system variable should be changed. 
NOTE: Some configuration options need to be fixed: +-----------------+---------------+----------------+----------------------------+ | Variable | Current Value | Required Value | Note | +-----------------+---------------+----------------+----------------------------+ | binlog_checksum | CRC32 | NONE | Update the server variable | +-----------------+---------------+----------------+----------------------------+ Do you want to perform the required configuration changes? [y/n]: y Configuring instance… The instance 'mymysql01:3306' was configured to be used in an InnoDB cluster. For mymysql02 (the Primary node), we cannot remove it from the ReplicaSet, so we just need to remove the metadata: MySQL  mymysql01:33060+   JS  \c clusteradmin@mymysql02 Creating a session to 'clusteradmin@mymysql02' Please provide the password for 'clusteradmin@mymysql02': Save password for 'clusteradmin@mymysql02'? [Y]es/[N]o/Ne[v]er (default No): y Fetching schema names for autocompletion… Press ^C to stop. Closing old connection… Your MySQL connection id is 293574 (X protocol) Server version: 8.0.20 MySQL Community Server - GPL No default schema selected; type \use to set one. MySQL  mymysql02:33060+   JS  dba.dropMetadataSchema() Are you sure you want to remove the Metadata? [y/N]: y Metadata Schema successfully removed. Now we have to configure it, and configure mymysql03 too (for mymysql03, don’t forget to create a user with the same credentials as on the other two nodes; in the previous post we used clusteradmin). When all 3 servers are configured, we also need to allow the communication for Group Replication. This means that on all 3 database servers we will run the following commands:
firewall-cmd --zone=public --permanent --add-port=33061/tcp
firewall-cmd --reload
semanage port -a -t mysqld_port_t -p tcp 33060-33061
For convenience I modified /etc/hosts to list all servers, as I did in the previous articles. So this time I also added mymysql03. 
MySQL InnoDB Cluster Creation We are going to use the latest server that was acting as Primary, and we will create the new MySQL InnoDB Cluster: MySQL  mymysql02:33060+   JS  cluster=dba.createCluster('myWordpressIDC') A new InnoDB cluster will be created on instance 'mymysql02:3306'. Disabling super_read_only mode on instance 'mymysql02:3306'. Validating instance configuration at mymysql02:3306… This instance reports its own address as mymysql02:3306 Instance configuration is suitable. NOTE: Group Replication will communicate with other members using 'mymysql02:33061'. Use the localAddress option to override. Creating InnoDB cluster 'myWordpressIDC' on 'mymysql02:3306'… Adding Seed Instance… Cluster successfully created. Use Cluster.addInstance() to add MySQL instances. At least 3 instances are needed for the cluster to be able to withstand up to one server failure. And now we can add the two other servers: MySQL  mymysql02:33060+   JS  cluster.addInstance('mymysql01') ... MySQL  mymysql02:33060+   JS  cluster.addInstance('mymysql03') MySQL Router When the MySQL InnoDB Cluster is up and running, we need to re-bootstrap the routers on our webservers (as we deleted the metadata): [root@mywordpress01 ~]# mysqlrouter --bootstrap clusteradmin@mymysql02 --user mysqlrouter \ --force --conf-use-gr-notifications [root@mywordpress02 ~]# mysqlrouter --bootstrap clusteradmin@mymysql02 --user mysqlrouter \ --force --conf-use-gr-notifications [root@mywordpress01 ~]# systemctl restart mysqlrouter [root@mywordpress02 ~]# systemctl restart mysqlrouter Conclusion It is very easy to achieve fully automatic High Availability using MySQL InnoDB Cluster, and it’s possible to start with a smaller environment and grow easily when needed. And in OCI, this is also very easy. Of course, it would be nice to see how that works, wouldn’t it? Check this video to see MySQL 8.0 InnoDB Cluster and WordPress in action on OCI:
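Once the cluster is up, its health can also be checked non-interactively from any of the database nodes. A minimal sketch, assuming mysqlsh is installed locally and the clusteradmin password is entered at the prompt:

```shell
# Print the cluster topology and member states without opening an
# interactive MySQL Shell session (password is prompted).
mysqlsh clusteradmin@mymysql02 --js -e "print(dba.getCluster().status())"
```

The status output shows which member currently holds the PRIMARY role, which is handy to confirm that failover worked after a test.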

  • Running MySQL on Kubernetes - deployment using persistent volumes
    In this demo we are setting up one MySQL Server using k8s; we will use a deployment and NFS as storage. This demo was created on Oracle Cloud (OCI), but standard Kubernetes and NFS were used, so the setup should work for any cloud deployment or on-prem. More information on my github page here Persistent volumes Setup a NFS Server for your persistent volumes, howto here If you are using a public cloud provider you can most likely use dynamic storage options for PV. In the example below I have a NFS Server on IP: The NFS exposes folder: /var/nfs/pv099 Kubernetes configuration You can look at the configuration for kubernetes in the yamls folder. First we create a persistent volume and a persistent volume claim. We specify that this volume can only be accessed by one node (ReadWriteOnce), and that we will use our NFS server for storage. More information on PV here. After we have created the persistent volume we will create the MySQL deployment. First we create a service to expose our application on the network. Next we create the MySQL deployment using the resources created earlier. Create a persistent volume (PV):
kubectl create -f yamls/01-mysql-pv.yaml
Start MySQL using a deployment (one MySQL Server using the NFS PV):
kubectl create -f yamls/01-mysql-deployment.yaml
Done! If you want to remove everything:
kubectl delete -f yamls/01-mysql-deployment.yaml
kubectl delete -f yamls/01-mysql-pv.yaml
Make sure everything is deleted:
kubectl get pv,pvc
kubectl get all -o wide
Remember to also empty out the datadir on NFS between tests:
sudo rm -fr /var/nfs/pv099/*
ls /var/nfs/pv099/
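Once both yaml files are applied, a quick way to confirm the deployment is healthy is sketched below; the deployment name "mysql" and the label "app=mysql" are assumptions and should be adjusted to whatever yamls/01-mysql-deployment.yaml actually defines:

```shell
# Wait for the deployment to finish rolling out, then inspect the pod.
# The deployment name and label selector here are guesses, not from the yamls.
kubectl rollout status deployment/mysql
kubectl get pods -l app=mysql -o wide

# Connect to the server inside the pod (root password is prompted).
kubectl exec -it deploy/mysql -- mysql -uroot -p -e "SELECT VERSION();"
```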

  • MySQL 8.0 InnoDB ReplicaSet with WordPress in OCI – part II
    This article is the second part of our journey to WordPress and MySQL 8.0 High Availability on OCI. The first part can be read here. We ended part I with one webserver hosting WordPress. This WordPress was connecting locally to MySQL Router using the HyperDB add-on, which allows splitting reads & writes across MySQL Servers using replication. And finally we had one MySQL InnoDB ReplicaSet of two members serving the database. In this article, we will upgrade our architecture by adding an extra webserver (with MySQL Router) and a load balancer in front of our webservers. The New Architecture New Webserver Installation We add a new compute instance, preferably in another AD than the first one, and we call it myWordpress02. Follow the installation steps from the first article. Again for convenience (this is not mandatory), we will modify /etc/hosts on all servers to have something like this referring to all hosts: mymysql01 mymysql02 mywordpress02 mymysql03 mywordpress01 When done, we should not forget to copy the configuration files and the 2 required files needed for HyperDB: [opc@mywordpress01 ~]$ cd /var/www/html/ [opc@mywordpress01 html]$ scp wp-config.php mywordpress02: [opc@mywordpress01 html]$ scp db-config.php mywordpress02: [opc@mywordpress01 html]$ scp wp-settings.php mywordpress02: [opc@mywordpress01 html]$ scp wp-content/db.php mywordpress02: When this is copied, we need to move those files to the right location on mywordpress02: [opc@mywordpress02 ~]$ sudo su - [root@mywordpress02 ~]# cd /var/www/html/ [root@mywordpress02 html]# mv ~opc/wp-config.php . [root@mywordpress02 html]# mv ~opc/db-config.php . [root@mywordpress02 html]# mv ~opc/wp-settings.php . [root@mywordpress02 html]# mv ~opc/db.php wp-content Now, we need to perform some security changes: [root@mywordpress02 html]# chown apache. 
-R * [root@mywordpress02 html]# chcon --type httpd_sys_rw_content_t wp-content/db.php [root@mywordpress02 html]# chcon --type httpd_sys_rw_content_t db-config.php MySQL Router It’s time to install and bootstrap MySQL Router: [root@mywordpress02 html]# yum install -y [root@mywordpress02 html]# yum install -y mysql-router [root@mywordpress02 html]# mysqlrouter --bootstrap clusteradmin@mymysql01 --user mysqlrouter [root@mywordpress02 html]# systemctl start mysqlrouter [root@mywordpress02 html]# systemctl enable mysqlrouter We can already verify that everything is working as expected by pointing our browser to mywordpress02’s public IP. OCI Load Balancer We will set up the load balancer on top of our webservers, so if one has issues, the service will stay up using only the remaining webserver (for practical reasons I use only 2 webservers, but you could set up a farm of multiple instances). The load balancer is deployed, and we can use its assigned public IP to contact our WordPress site. Our request will end up on one of the Apache servers, and the MySQL queries will be split across both members of our MySQL InnoDB ReplicaSet. Don’t forget that if you are performing tests like I’m doing, you need to change the URL in WordPress Settings to point to the Load Balancer’s IP. ReplicaSet Routers We now have 2 routers registered in our ReplicaSet. Using the AdminAPI, it’s easy to get some info about them; the listRouters() method will list all Routers. Conclusion As you can see, it’s very easy to add more webservers in OCI. It’s also very easy to connect them to an already running MySQL InnoDB ReplicaSet. And finally, the use of the OCI Load Balancer makes all this very easy to use together. We now have full automatic High Availability with load splitting between our WordPress servers. However, in case of an issue on the Primary MySQL instance, a manual step is required to promote the current Secondary to the Primary role. 
Migrating to a fully automated HA architecture is the next and final step of our journey on OCI with WordPress and MySQL 8.0.
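As a quick check of the Router setup described above: MySQL Router's default bootstrap configuration exposes classic-protocol port 6446 for read/write traffic and 6447 for read-only traffic, so connecting through each from a webserver should land on the Primary and a Secondary respectively. A sketch (passwords are prompted):

```shell
# Through the RW port: @@hostname should be the Primary, @@read_only = 0.
mysql -h 127.0.0.1 -P 6446 -u clusteradmin -p -e "SELECT @@hostname, @@read_only;"

# Through the RO port: @@hostname should be a Secondary, @@read_only = 1.
mysql -h 127.0.0.1 -P 6447 -u clusteradmin -p -e "SELECT @@hostname, @@read_only;"
```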