
Monday, July 25, 2022

Zimbra Mysql Crash Recovery

In the event of database corruption it may be necessary to manually perform database recovery. See Bug 15797 for an example of an issue with mysql that will require database recovery. In that example, a warning message like the following appeared in the mysql error log:

InnoDB: Serious error! InnoDB is trying to free page 716
InnoDB: though it is already marked as free in the tablespace!
InnoDB: The tablespace free space info is corrupt.
InnoDB: You may need to dump your InnoDB tables and recreate the whole
InnoDB: database!

Before beginning a full database recovery, check to see if the corruption may be limited to a single mboxgroup or a single user within an mboxgroup. This type of corruption frequently lets the server run normally for extended periods of time, with crashes occurring only when an affected user attempts to access certain mailbox items. If this is the case, it may be possible to dump, drop and recover only the affected entries without disrupting the database as a whole. Please see the instructions in the Mysql Crash Recovery (alternate method) article.

Overview of Recovery Process

  1. Configure mysql to start in recovery mode
  2. Generate SQL dumps of all relevant databases
  3. Remove all existing (and possibly corrupt) databases
  4. Re-create all databases
  5. Repopulate the databases with the data from the SQL dumps
  6. Test databases and start all ZCS services

Details of Recovery Process

1. Configure mysql to start in recovery mode

  1. Edit the file /opt/zimbra/conf/my.cnf and add a line such as innodb_force_recovery = 1 under the [mysqld] section. (It may be necessary to increase the recovery level depending on the extent of the database corruption, as described at the end of the database dump step.)
  2. Save the file and re-start mysqld
mysql.server start
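The my.cnf edit above can be scripted. As a minimal sketch, the sed insertion is demonstrated here on a scratch copy; when doing this for real, point MY_CNF at /opt/zimbra/conf/my.cnf (the backup file name is illustrative):

```shell
# Demonstrate the my.cnf edit on a scratch copy; in practice set
# MY_CNF=/opt/zimbra/conf/my.cnf and keep the backup it creates.
MY_CNF=$(mktemp)
printf '[mysqld]\nport = 7306\n' > "$MY_CNF"        # stand-in for the real my.cnf

cp "$MY_CNF" "$MY_CNF.bak"                          # keep a backup copy first
sed -i '/^\[mysqld\]/a innodb_force_recovery = 1' "$MY_CNF"   # insert directly under [mysqld]

grep -n 'innodb_force_recovery' "$MY_CNF"           # confirm the line landed
```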

2. Generate SQL dumps of all databases

  1. Load some mysql configuration into shell variables (i.e. $mysql_socket and $mysql_root_password; note that you will use these again in step 3)
  2. Make a list of the existing databases
  3. Create a directory to hold the SQL dumps
  4. Generate the SQL dumps from the database list
source ~/bin/zmshutil ; zmsetvars
mysql --batch --skip-column-names -e "show databases" | grep -e mbox -e zimbra > /tmp/mysql.db.list

Note: If you are using ZCS v8.8.x with Chat/Talk enabled, then you should take a dump of the Chat database as well:

mysql --batch --skip-column-names -e "show databases" | grep -e mbox -e zimbra -e chat > /tmp/mysql.db.list
mkdir /tmp/mysql.sql 
for db in `cat /tmp/mysql.db.list`; do
    mysqldump $db -S $mysql_socket -u root --password=$mysql_root_password > /tmp/mysql.sql/$db.sql
    echo "Dumped $db"
    sleep 10
done

Note: If you encounter any mysql errors while dumping the databases, start over: re-edit /opt/zimbra/conf/my.cnf, increment the value of innodb_force_recovery by one, and restart mysqld. It is critical to increase this value incrementally (1, 2, 3, and only if needed 4); values of 4 and above can themselves cause data corruption. Please see MySQL's Forcing InnoDB Recovery guide for more information.

Note: Starting with ZCS 8.7, the path of mysqldump changed from /opt/zimbra/mysql/bin/mysqldump to /opt/zimbra/common/bin/mysqldump. Update the command accordingly if you are doing this on ZCS 8.7.x or later.

Note: An error of "bash: /tmp/mysql.sql/$db.sql: ambiguous redirect" probably indicates you're using an apostrophe or single quote ' rather than a backtick ` -- which is on the same key as the tilde ~ .

Note: Do not reboot the machine, as some Operating Systems will remove all contents in the /tmp directory during the reboot sequence, i.e. your /tmp/mysql.sql will be removed.

HINT: To check whether the dumps worked, run grep -L "Dump completed" /tmp/mysql.sql/*.sql (lists files that did not complete) and grep -l "Dump completed" /tmp/mysql.sql/*.sql (lists files that did).
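A runnable sketch of that check, using two throwaway dump files (the file names and contents are illustrative; mysqldump appends a "Dump completed" marker to a successful dump):

```shell
# Fake dump files: one finished (has the completion marker), one truncated
tmp=$(mktemp -d)
printf 'CREATE TABLE t (id INT);\n-- Dump completed on 2022-07-25\n' > "$tmp/good.sql"
printf 'CREATE TABLE t (id INT);\n' > "$tmp/bad.sql"

grep -L "Dump completed" "$tmp"/*.sql   # files WITHOUT the marker (incomplete dumps)
grep -l "Dump completed" "$tmp"/*.sql   # files WITH the marker (completed dumps)
```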

3. Remove all existing (and possibly corrupt) databases

Note: Take a copy of /opt/zimbra/db/data before dropping the databases, so that you retain a copy of the old database files.

Note that we drop the zimbra database last because the mboxgroup* databases depend on it.

for db in `cat /tmp/mysql.db.list |grep mbox`
do
    mysql -u root --password=$mysql_root_password -e "drop database $db"
    echo -e "Dropped $db"
done
mysql -u root --password=$mysql_root_password -e "drop database zimbra"

Remove existing InnoDB tablespace and log files

rm -rf /opt/zimbra/db/data/ib*

Note: Use the following with caution; it should rarely be needed (the issue originally came about because of rsync problems). If you cannot drop the databases at this point because of connection errors, you can move the data directory aside (mv /opt/zimbra/db/data /opt/zimbra/db/data-old), recreate it (mkdir /opt/zimbra/db/data, with ownership zimbra:zimbra), and remove the innodb_force_recovery line from /opt/zimbra/conf/my.cnf. Then recreate a default mysql database by running /opt/zimbra/libexec/zmmyinit --sql_root_pw $mysql_root_password, and attempt these steps again to confirm you can drop the databases. Also note that you may have to reset the zimbra password manually in mysql, then set it again in Zimbra, following the instructions at http://wiki.zimbra.com/wiki/Resetting_LDAP_%26_MySQL_Passwords

4. Re-create all databases

  1. Run mysql in non-recovery mode
    1. Remove the innodb_force_recovery line from /opt/zimbra/conf/my.cnf
    2. Save the file and restart mysqld
  2. Re-create the databases from the database list
mysql.server restart
for db in `cat /tmp/mysql.db.list`
do
    mysql -e "create database $db character set utf8"
    echo "Created $db"
done

5. Repopulate the databases with the data from the SQL dumps

Import the data from the SQL dumps. Note that we import the zimbra database first because the mboxgroup databases depend on it.

mysql zimbra < /tmp/mysql.sql/zimbra.sql
for sql in /tmp/mysql.sql/mbox*
do
    mysql `basename $sql .sql` < $sql
    echo -e "Updated `basename $sql .sql` \n"
done

Note: If you are using ZCS v8.8.x with Chat/Talk enabled, then you should import the chat database as well:

mysql chat < /tmp/mysql.sql/chat.sql

6. Test databases and start all ZCS services

Note that this is an example query. If you know of any particular databases that were corrupt, you may want to construct other queries to verify normal access to the data.

mysql zimbra -e "select * from mailbox order by id desc limit 1"

Once you are satisfied that the databases are restored intact, start the rest of the zimbra services.

zmcontrol start

Check /opt/zimbra/log/mysql_error.log and /opt/zimbra/log/mailbox.log for database errors.


Reference: https://wiki.zimbra.com/wiki/Mysql_Crash_Recovery

Monday, September 30, 2013

How do I disable IPv6?

Upstream employee Daniel Walsh recommends not disabling the ipv6 module, as that can cause issues with SELinux and other components, but adding the following to /etc/sysctl.conf:
net.ipv6.conf.all.disable_ipv6 = 1
net.ipv6.conf.default.disable_ipv6 = 1
To disable in the running system:
echo 1 > /proc/sys/net/ipv6/conf/all/disable_ipv6
echo 1 > /proc/sys/net/ipv6/conf/default/disable_ipv6
or
sysctl -w net.ipv6.conf.all.disable_ipv6=1
sysctl -w net.ipv6.conf.default.disable_ipv6=1
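To confirm the running state after either method, the flags can be read back from /proc; a small sketch (the paths exist only when the ipv6 module is loaded, so it degrades gracefully):

```shell
# Read back the runtime flags: 1 means IPv6 is disabled, 0 means enabled.
for f in /proc/sys/net/ipv6/conf/all/disable_ipv6 \
         /proc/sys/net/ipv6/conf/default/disable_ipv6; do
    if [ -r "$f" ]; then
        echo "$f = $(cat "$f")"
    else
        echo "$f not present (ipv6 module not loaded)"
    fi
done
```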
If problems with X forwarding are encountered on systems with IPv6 disabled, edit /etc/ssh/sshd_config and make either of the following changes:
(1) Change the line
#AddressFamily any
to
AddressFamily inet
(inet is ipv4 only; inet6 is ipv6 only)
or
(2) Remove the hash mark (#) in front of the line
#ListenAddress 0.0.0.0

Then restart ssh.

About Network Interfaces in EL6

Each physical and virtual network device on an EL6 Linux system has an associated configuration file named ifcfg-interface in the /etc/sysconfig/network-scripts directory, where interface is the name of the interface. For example:
# cd /etc/sysconfig/network-scripts
# ls ifcfg-*
ifcfg-eth0  ifcfg-eth1  ifcfg-lo
In this example, there are two configuration files for Ethernet interfaces, ifcfg-eth0 and ifcfg-eth1, and one for the loopback interface, ifcfg-lo. The system reads the configuration files at boot time to configure the network interfaces.
The following are sample entries from an ifcfg-eth0 file for a network interface that obtains its IP address using the Dynamic Host Configuration Protocol (DHCP):
DEVICE="eth0"
NM_CONTROLLED="yes"
ONBOOT=yes
USERCTL=no
TYPE=Ethernet
BOOTPROTO=dhcp
DEFROUTE=yes
IPV4_FAILURE_FATAL=yes
IPV6INIT=no
NAME="System eth0"
UUID=5fb06bd0-0bb0-7ffb-45f1-d6edd65f3e03
HWADDR=08:00:27:16:C3:33
PEERDNS=yes
PEERROUTES=yes
If the interface is configured with a static IP address, the file contains entries such as the following:
DEVICE="eth0"
NM_CONTROLLED="yes"
ONBOOT=yes
USERCTL=no
TYPE=Ethernet
BOOTPROTO=none
DEFROUTE=yes
IPV4_FAILURE_FATAL=yes
IPV6INIT=no
NAME="System eth0"
UUID=5fb06bd0-0bb0-7ffb-45f1-d6edd65f3e03
HWADDR=08:00:27:16:C3:33
IPADDR=192.168.1.101
NETMASK=255.255.255.0
BROADCAST=192.168.1.255
PEERDNS=yes
PEERROUTES=yes
The following configuration parameters are typically used in interface configuration files:
BOOTPROTO
How the interface obtains its IP address:
bootp
Bootstrap Protocol (BOOTP).
dhcp
Dynamic Host Configuration Protocol (DHCP).
none
Statically configured IP address.
BROADCAST
IPv4 broadcast address.
DEFROUTE
Whether this interface is the default route.
DEVICE
Name of the physical network interface device (or a PPP logical device).
HWADDR
Media access control (MAC) address of an Ethernet device.
IPADDR
IPv4 address of the interface.
IPV4_FAILURE_FATAL
Whether the device is disabled if IPv4 configuration fails.
IPV6_FAILURE_FATAL
Whether the device is disabled if IPv6 configuration fails.
IPV6ADDR
IPv6 address of the interface in CIDR notation. For example: IPV6ADDR="2001:db8:1e11:115b::1/32"
IPV6INIT
Whether to enable IPv6 for the interface.
MASTER
Specifies the name of the master bonded interface, of which this interface is a slave.
NAME
Name of the interface as displayed in the Network Connections GUI.
NETMASK
IPv4 network mask of the interface.
NETWORK
IPv4 address of the network.
NM_CONTROLLED
Whether the network interface device is controlled by the network management daemon, NetworkManager.
ONBOOT
Whether the interface is activated at boot time.
PEERDNS
Whether the /etc/resolv.conf file used for DNS resolution contains information obtained from the DHCP server.
PEERROUTES
Whether the information for the routing table entry that defines the default gateway for the interface is obtained from the DHCP server.
SLAVE
Specifies that this interface is a component of a bonded interface.
TYPE
Interface type.
USERCTL
Whether users other than root can control the state of this interface.
UUID
Universally unique identifier for the network interface device.
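Because ifcfg files are plain KEY=VALUE pairs in shell syntax, a script can source one to inspect these parameters. A sketch using a scratch file standing in for ifcfg-eth0 (the values are examples, not from a real system):

```shell
# Scratch stand-in for /etc/sysconfig/network-scripts/ifcfg-eth0
CFG=$(mktemp)
cat > "$CFG" <<'EOF'
DEVICE="eth0"
BOOTPROTO=none
IPADDR=192.168.1.101
NETMASK=255.255.255.0
EOF

. "$CFG"    # ifcfg files use shell syntax, so they can be sourced directly
echo "$DEVICE uses $BOOTPROTO with address $IPADDR/$NETMASK"
```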

Saturday, September 21, 2013

The Performance Overview tab fails to display into vSphere Client and VCSA 5.1

Problem:
I cannot view the Performance Overview tab when connecting to vCenter Server Appliance using the vSphere Client 5.1 on MS Windows XP / 2003.

Variations of the error message in the Performance Overview tab:
- This program cannot display the webpage (more common)
- Navigation to the webpage was cancelled (less common)


Resolution:
Method 1
Usually the problem is the ciphers attribute in the VCSA Tomcat configuration file, server.xml.

1. Log in to the console of the VCSA as the root user
2. Locate server.xml under /usr/lib/vmware-vpx/tomcat/conf/
3. Back up the file (for your peace of mind)
4. Edit the file: find the "ciphers" attribute in server.xml and replace its value with the following:

ciphers="SSL_RSA_WITH_RC4_128_MD5, SSL_RSA_WITH_RC4_128_SHA, TLS_RSA_WITH_AES_128_CBC_SHA, TLS_DHE_RSA_WITH_AES_128_CBC_SHA, TLS_DHE_DSS_WITH_AES_128_CBC_SHA, SSL_RSA_WITH_3DES_EDE_CBC_SHA, SSL_DHE_RSA_WITH_3DES_EDE_CBC_SHA, SSL_DHE_DSS_WITH_3DES_EDE_CBC_SHA"

5. Restart the vmware-vpxd service:

via console
vcsa-lab:~# service vmware-vpxd restart

Posted by Milen Lyutskanov

Method 2

For Windows 2003, try to download and install this hotfix 

http://hotfixv4.microsoft.com/Windows%20Server%202003/sp3/Fix192447/3790/free/351403_ENU_x64_zip.exe 

on the computer that runs the vSphere Client.

Cause:
Windows XP/2003 does not support high cipher strengths.

Refer to http://support.microsoft.com/kb/948963/en-us

Modified by Kenji


Saturday, August 10, 2013

How to Configure Red Hat Cluster Services Fencing with iLO 3

Information
Environment :

Red Hat Cluster Suite 4+
Red Hat Enterprise Linux 5 Advanced Platform (Clustering)
Red Hat Enterprise Linux Server 6 (with the High Availability Add on)

Description :
Support for the iLO3 fence device has been added with the release of cman 2.0.115-34.el5_5.4 through erratum RHEA-2010-0876 which provides support for iLO3 via fence_ipmilan.
The iLO3 firmware should be a minimum of 1.15 as provided by HP.

Details
Resolution :
On both cluster nodes, install the following OpenIPMI packages used for fencing:
$ yum install OpenIPMI OpenIPMI-tools

Stop and disable the 'acpid' daemon:
$ service acpid stop; chkconfig acpid off

Start ipmi service on all cluster nodes:
$ service ipmi start; chkconfig ipmi on

Test ipmitool interaction with iLO3:
$ ipmitool -H <iLO3 address> -I lanplus -U <username> -P <password> chassis power status

The desired output is:
Chassis Power is on

Edit the /etc/cluster/cluster.conf to add the fence device:

<?xml version="1.0"?>
<cluster alias="rh5nodesThree" config_version="32" name="rh5nodesThree">
  <fence_daemon clean_start="0" post_fail_delay="1" post_join_delay="3"/>
  <clusternodes>
    <clusternode name="rh5node1.examplerh.com" nodeid="1" votes="1">
      <fence>
        <method name="1">
          <device domain="rh5node1" name="ilo3_node1"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="rh5node2.examplerh.com" nodeid="2" votes="1">
      <fence>
        <method name="1">
          <device domain="rh5node2" name="ilo3_node2"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="rh5node3.examplerh.com" nodeid="3" votes="1">
      <fence>
        <method name="1">
          <device domain="rh5node3" name="ilo3_node3"/>
        </method>
      </fence>
    </clusternode>
  </clusternodes>
  <cman expected_votes="3">
    <multicast addr="229.5.1.1"/>
  </cman>
  <fencedevices>
    <fencedevice agent="fence_ipmilan" power_wait="10" ipaddr="XX.XX.XX.XX" lanplus="1" login="username" name="ilo3_node1" passwd="password"/>
    <fencedevice agent="fence_ipmilan" power_wait="10" ipaddr="XX.XX.XX.XX" lanplus="1" login="username" name="ilo3_node2" passwd="password"/>
    <fencedevice agent="fence_ipmilan" power_wait="10" ipaddr="XX.XX.XX.XX" lanplus="1" login="username" name="ilo3_node3" passwd="password"/>
  </fencedevices>
  <rm>
    <failoverdomains/>
    <resources/>
  </rm>
</cluster>
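One easy mistake in this file is a <device> name that does not match any defined <fencedevice>. A grep-based sanity check, sketched here against a scratch fragment (in practice, read /etc/cluster/cluster.conf; the node and device names are the example's):

```shell
# Scratch fragment standing in for /etc/cluster/cluster.conf
CONF=$(mktemp)
cat > "$CONF" <<'EOF'
<device domain="rh5node1" name="ilo3_node1"/>
<device domain="rh5node2" name="ilo3_node2"/>
<fencedevice agent="fence_ipmilan" name="ilo3_node1" ipaddr="XX.XX.XX.XX"/>
<fencedevice agent="fence_ipmilan" name="ilo3_node2" ipaddr="XX.XX.XX.XX"/>
EOF

# Names referenced by <device> entries vs names defined by <fencedevice> entries
referenced=$(grep -o '<device [^>]*' "$CONF" | grep -o 'name="[^"]*"' | sort)
defined=$(grep -o '<fencedevice [^>]*' "$CONF" | grep -o 'name="[^"]*"' | sort)
[ "$referenced" = "$defined" ] && echo "fence device names match"
```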

Test that fencing is successful. From node1 attempt to fence node2 as follows:
$ fence_node node2