Postgresql Warm Standby Testing
[
2009/12/03 11:24 | by askwan ]
2009/12/03 11:24 | by askwan ]
OS: Debian5.0 lenny x86_64
Postgresql:8.3.8
Master :192.168.0.121
Warm Standby:192.168.0.122

1. install postgresql on both master and warm standby server.
2. Config passwordless ssh authentication use user postgres
Please refer to http://www.petefreitag.com/item/532.cfm
3. Install pg_standby on both both master and warm standby server.
#cd /usr/local/src/postgresql-8.3.8/contrib/pg_standby
#make
#make install
4. Config postgresql
In master server :
in warm standby server :
start master server :
5. Dump all production data to master server .
You can check the warm standby server ‘s /wal_archives directory , whether there is archive logs from master server located on .
6. Login to master server as postgres user.
Create a script to do a base backup:
7. Login to warm standby server as postgresql user.
Create a script to do some clean and configuration
Note:PostgreSQL 8.4 provides the recovery_end_command option.so,in recovery.conf add recovery_end_command=’rm –f /tmp/pgsql.trigger ’
pg_standby supports creation of a "warm standby" database server. It is designed to be a production-ready program, as well as a customizable template should you require specific modifications.
More about pg_standby,refer to http://www.postgresql.org/docs/current/static/pgstandby.html
http://www.enterprisedb.com/docs/en/8.4/pg/pgstandby.html
8. Start warm standby server
9. check warm standby postgresql logfile:
Check standby logfile:
10. failover
In warm standby server
touch /tmp/ pgsql.trigger
Postgresql:8.3.8
Master :192.168.0.121
Warm Standby:192.168.0.122
1. install postgresql on both master and warm standby server.
#cd /usr/local/src/postgresql-8.3.8/
#./configure –prefix=/usr/local/pgsql
#make
#make install
#mkdir /usr/local/pgsql/data
#chown postgres /usr/local/pgsql/data
#su - postgres
$/usr/local/pgsql/bin/initdb -D /usr/local/pgsql/data
#./configure –prefix=/usr/local/pgsql
#make
#make install
#mkdir /usr/local/pgsql/data
#chown postgres /usr/local/pgsql/data
#su - postgres
$/usr/local/pgsql/bin/initdb -D /usr/local/pgsql/data
2. Config passwordless ssh authentication use user postgres
Please refer to http://www.petefreitag.com/item/532.cfm
3. Install pg_standby on both both master and warm standby server.
#cd /usr/local/src/postgresql-8.3.8/contrib/pg_standby
#make
#make install
4. Config postgresql
In master server :
Quotation
archive_mode = on
archive_command = 'rsync -arv %p postgres@192.168.0.122:/wal_archives/%f'
archive_timeout = 1200
archive_command = 'rsync -arv %p postgres@192.168.0.122:/wal_archives/%f'
archive_timeout = 1200
in warm standby server :
#mkdir /wal_archives
#chown postgres /wal_archives/
#chown postgres /wal_archives/
start master server :
pg_ctl -D /usr/local/pgsql/data/ start -l /usr/local/pgsql/data/logfile
5. Dump all production data to master server .
You can check the warm standby server ‘s /wal_archives directory , whether there is archive logs from master server located on .
6. Login to master server as postgres user.
Create a script to do a base backup:
Quotation
#!/bin/bash
PGDATA=/usr/local/pgsql/data
psql -c "CHECKPOINT;"
psql -c "SELECT pg_start_backup('BASEBACKUP');"
tar -C /usr/local/pgsql/ -zcf - data |ssh 192.168.0.122 "tar -C /usr/local/pgsql/ -zxf -"
psql -c "select pg_stop_backup();"
echo "BASEBACKUP compete!"
PGDATA=/usr/local/pgsql/data
psql -c "CHECKPOINT;"
psql -c "SELECT pg_start_backup('BASEBACKUP');"
tar -C /usr/local/pgsql/ -zcf - data |ssh 192.168.0.122 "tar -C /usr/local/pgsql/ -zxf -"
psql -c "select pg_stop_backup();"
echo "BASEBACKUP compete!"
7. Login to warm standby server as postgresql user.
Create a script to do some clean and configuration
Quotation
#!/bin/bash
PG_HOME=/usr/local/pgsql
PGDATA=/usr/local/pgsql/data
trigger_file=/tmp/pgsql.trigger
pg_standby=$PG_HOME/bin/pg_standby
PG_ARCHIVES=/wal_archives
rm -f $PGDATA/recovery.*
rm -f $PGDATA/logfile
rm -f $PGDATA/postmaster.pid
rm -f $PGDATA/pg_xlog/0*
rm -f $PGDATA/pg_xlog/archive_status/0*
sed -i '/^archive_/s/^/#/g' $PGDATA/postgresql.conf
echo "restore_command = '$pg_standby -l -d -s 2 -t $trigger_file $PG_ARCHIVES %f %p %r 2>>/tmp/standby.log'" > $PGDATA/recovery.co
nf
chown postgres.postgres $PGDATA/recovery.conf
echo "Init completed! now start warm standby server."
PG_HOME=/usr/local/pgsql
PGDATA=/usr/local/pgsql/data
trigger_file=/tmp/pgsql.trigger
pg_standby=$PG_HOME/bin/pg_standby
PG_ARCHIVES=/wal_archives
rm -f $PGDATA/recovery.*
rm -f $PGDATA/logfile
rm -f $PGDATA/postmaster.pid
rm -f $PGDATA/pg_xlog/0*
rm -f $PGDATA/pg_xlog/archive_status/0*
sed -i '/^archive_/s/^/#/g' $PGDATA/postgresql.conf
echo "restore_command = '$pg_standby -l -d -s 2 -t $trigger_file $PG_ARCHIVES %f %p %r 2>>/tmp/standby.log'" > $PGDATA/recovery.co
nf
chown postgres.postgres $PGDATA/recovery.conf
echo "Init completed! now start warm standby server."
Note:PostgreSQL 8.4 provides the recovery_end_command option.so,in recovery.conf add recovery_end_command=’rm –f /tmp/pgsql.trigger ’
pg_standby supports creation of a "warm standby" database server. It is designed to be a production-ready program, as well as a customizable template should you require specific modifications.
More about pg_standby,refer to http://www.postgresql.org/docs/current/static/pgstandby.html
http://www.enterprisedb.com/docs/en/8.4/pg/pgstandby.html
8. Start warm standby server
pg_ctl -D /usr/local/pgsql/data start -l /usr/local/pgsql/data/logfile
9. check warm standby postgresql logfile:
Quotation
LOG: database system was interrupted; last known up at 2009-12-03 11:44:37 CST
LOG: starting archive recovery
LOG: restore_command = '/usr/local/pgsql/bin/pg_standby -l -d -s 2 -t /tmp/pgsql.trigger /wal_archives %f %p %r 2>>/tmp/standby.log
'
LOG: restored log file "000000010000000000000014.00000020.backup" from archive
LOG: restored log file "000000010000000000000014" from archive
LOG: automatic recovery in progress
LOG: redo starts at 0/14000068
FATAL: the database system is starting up
LOG: restored log file "000000010000000000000015" from archive
LOG: restored log file "000000010000000000000016" from archive
LOG: restored log file "000000010000000000000017" from archive
LOG: restored log file "000000010000000000000018" from archive
LOG: restored log file "000000010000000000000019" from archive
LOG: restored log file "00000001000000000000001A" from archive
LOG: restored log file "00000001000000000000001B" from archive
LOG: restored log file "00000001000000000000001C" from archive
LOG: restored log file "00000001000000000000001D" from archive
LOG: restored log file "00000001000000000000001E" from archive
LOG: restored log file "00000001000000000000001F" from archive
LOG: restored log file "000000010000000000000020" from archive
LOG: restored log file "000000010000000000000021" from archive
……
LOG: starting archive recovery
LOG: restore_command = '/usr/local/pgsql/bin/pg_standby -l -d -s 2 -t /tmp/pgsql.trigger /wal_archives %f %p %r 2>>/tmp/standby.log
'
LOG: restored log file "000000010000000000000014.00000020.backup" from archive
LOG: restored log file "000000010000000000000014" from archive
LOG: automatic recovery in progress
LOG: redo starts at 0/14000068
FATAL: the database system is starting up
LOG: restored log file "000000010000000000000015" from archive
LOG: restored log file "000000010000000000000016" from archive
LOG: restored log file "000000010000000000000017" from archive
LOG: restored log file "000000010000000000000018" from archive
LOG: restored log file "000000010000000000000019" from archive
LOG: restored log file "00000001000000000000001A" from archive
LOG: restored log file "00000001000000000000001B" from archive
LOG: restored log file "00000001000000000000001C" from archive
LOG: restored log file "00000001000000000000001D" from archive
LOG: restored log file "00000001000000000000001E" from archive
LOG: restored log file "00000001000000000000001F" from archive
LOG: restored log file "000000010000000000000020" from archive
LOG: restored log file "000000010000000000000021" from archive
……
Check standby logfile:
Quotation
Trigger file : /tmp/pgsql.trigger
Waiting for WAL file : 00000001.history
WAL file path : /wal_archives/00000001.history
Restoring to... : pg_xlog/RECOVERYHISTORY
Sleep interval : 2 seconds
Max wait interval : 0 forever
Command for restore : ln -s -f "/wal_archives/00000001.history" "pg_xlog/RECOVERYHISTORY"
Keep archive history : 000000000000000000000000 and later
running restore : OK
Trigger file : /tmp/pgsql.trigger
Waiting for WAL file : 000000010000000000000014.00000020.backup
WAL file path : /wal_archives/000000010000000000000014.00000020.backup
Restoring to... : pg_xlog/RECOVERYHISTORY
Sleep interval : 2 seconds
Max wait interval : 0 forever
Command for restore : ln -s -f "/wal_archives/000000010000000000000014.00000020.backup" "pg_xlog/RECOVERYHISTORY"
Keep archive history : 000000000000000000000000 and later
running restore : OK
Trigger file : /tmp/pgsql.trigger
Waiting for WAL file : 000000010000000000000014
WAL file path : /wal_archives/000000010000000000000014
Restoring to... : pg_xlog/RECOVERYXLOG
Sleep interval : 2 seconds
Max wait interval : 0 forever
Command for restore : ln -s -f "/wal_archives/000000010000000000000014" "pg_xlog/RECOVERYXLOG"
Keep archive history : 000000000000000000000000 and later
running restore : OK
Trigger file : /tmp/pgsql.trigger
Waiting for WAL file : 000000010000000000000015
WAL file path : /wal_archives/000000010000000000000015
Restoring to... : pg_xlog/RECOVERYXLOG
Sleep interval : 2 seconds
Max wait interval : 0 forever
Command for restore : ln -s -f "/wal_archives/000000010000000000000015" "pg_xlog/RECOVERYXLOG"
Keep archive history : 000000010000000000000014 and later
WAL file not present yet. Checking for trigger file...
WAL file not present yet. Checking for trigger file...
WAL file not present yet. Checking for trigger file...
Trigger file : /tmp/pgsql.trigger
Waiting for WAL file : 000000010000000000000016
WAL file path : /wal_archives/000000010000000000000016
Restoring to... : pg_xlog/RECOVERYXLOG
Sleep interval : 2 seconds
Restoring to... : pg_xlog/RECOVERYXLOG
Sleep interval : 2 seconds
Max wait interval : 0 forever
Command for restore : ln -s -f "/wal_archives/000000010000000000000016" "pg_xlog/RECOVERYXLOG"
Keep archive history : 000000010000000000000014 and later
running restore : OK
......
WAL file not present yet. Checking for trigger file...
WAL file not present yet. Checking for trigger file...
......
Waiting for WAL file : 00000001.history
WAL file path : /wal_archives/00000001.history
Restoring to... : pg_xlog/RECOVERYHISTORY
Sleep interval : 2 seconds
Max wait interval : 0 forever
Command for restore : ln -s -f "/wal_archives/00000001.history" "pg_xlog/RECOVERYHISTORY"
Keep archive history : 000000000000000000000000 and later
running restore : OK
Trigger file : /tmp/pgsql.trigger
Waiting for WAL file : 000000010000000000000014.00000020.backup
WAL file path : /wal_archives/000000010000000000000014.00000020.backup
Restoring to... : pg_xlog/RECOVERYHISTORY
Sleep interval : 2 seconds
Max wait interval : 0 forever
Command for restore : ln -s -f "/wal_archives/000000010000000000000014.00000020.backup" "pg_xlog/RECOVERYHISTORY"
Keep archive history : 000000000000000000000000 and later
running restore : OK
Trigger file : /tmp/pgsql.trigger
Waiting for WAL file : 000000010000000000000014
WAL file path : /wal_archives/000000010000000000000014
Restoring to... : pg_xlog/RECOVERYXLOG
Sleep interval : 2 seconds
Max wait interval : 0 forever
Command for restore : ln -s -f "/wal_archives/000000010000000000000014" "pg_xlog/RECOVERYXLOG"
Keep archive history : 000000000000000000000000 and later
running restore : OK
Trigger file : /tmp/pgsql.trigger
Waiting for WAL file : 000000010000000000000015
WAL file path : /wal_archives/000000010000000000000015
Restoring to... : pg_xlog/RECOVERYXLOG
Sleep interval : 2 seconds
Max wait interval : 0 forever
Command for restore : ln -s -f "/wal_archives/000000010000000000000015" "pg_xlog/RECOVERYXLOG"
Keep archive history : 000000010000000000000014 and later
WAL file not present yet. Checking for trigger file...
WAL file not present yet. Checking for trigger file...
WAL file not present yet. Checking for trigger file...
Trigger file : /tmp/pgsql.trigger
Waiting for WAL file : 000000010000000000000016
WAL file path : /wal_archives/000000010000000000000016
Restoring to... : pg_xlog/RECOVERYXLOG
Sleep interval : 2 seconds
Restoring to... : pg_xlog/RECOVERYXLOG
Sleep interval : 2 seconds
Max wait interval : 0 forever
Command for restore : ln -s -f "/wal_archives/000000010000000000000016" "pg_xlog/RECOVERYXLOG"
Keep archive history : 000000010000000000000014 and later
running restore : OK
......
WAL file not present yet. Checking for trigger file...
WAL file not present yet. Checking for trigger file...
......
10. failover
In warm standby server
touch /tmp/ pgsql.trigger
use pgFouine analyse postgresql slow queries
[
2009/11/25 09:55 | by askwan ]
2009/11/25 09:55 | by askwan ]
"pgFouine is a PostgreSQL log analyzer used to generate detailed reports from a PostgreSQL log file. pgFouine can help you to determine which queries you should optimize to speed up your PostgreSQL based application."
you can use syslog or stderr
for example :use syslog
1. edit your /etc/syslog.conf
*.info;mail.none;authpriv.none;cron.none;local0.none /var/log/messages
2. edit postgresl.conf
3.download pgFouine
http://pgfouine.projects.postgresql.org/index.html
than restart syslogd and postgresql
4. analyse log
./pgfouine.php -file /var/log/postgresql/pgsql > askwan.com.html

use stderr please refer to http://gkoenig.wordpress.com/2009/02/19/pgfouine-dive-into-postgres-log/
you can use syslog or stderr
for example :use syslog
1. edit your /etc/syslog.conf
Quotation
local0.* -/var/log/pgsql
Quotation
*.info;mail.none;authpriv.none;cron.none;local0.none /var/log/messages
2. edit postgresl.conf
Quotation
log_destination = 'syslog'
silent_mode = on
log_min_duration_statement = 0
#set to 0 log all
#set to -1 ,disable query logging
#set to X(positive integral) ,means to log queries slower than X milliseconds:
log_duration = off
log_statement = 'all'
silent_mode = on
log_min_duration_statement = 0
#set to 0 log all
#set to -1 ,disable query logging
#set to X(positive integral) ,means to log queries slower than X milliseconds:
log_duration = off
log_statement = 'all'
3.download pgFouine
http://pgfouine.projects.postgresql.org/index.html
than restart syslogd and postgresql
4. analyse log
./pgfouine.php -file /var/log/postgresql/pgsql > askwan.com.html
use stderr please refer to http://gkoenig.wordpress.com/2009/02/19/pgfouine-dive-into-postgres-log/
Mysql replication Error "Slave_SQL_Running:No"
[
2009/09/15 13:20 | by askwan ]
2009/09/15 13:20 | by askwan ]
mysql> show slave status\G
......
Slave_IO_Running: Yes
Slave_SQL_Running: No
......
首先定位同步不成功的原因
查数据库日志
若是从主机重启,事物回滚,则
其他网络故障,则
go to master :
go to slave ,manual replication
then done !
mysql> show slave status\G
......
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
......
another example:
first stop slave replication:
change master host info:
last step ,start replication:
check all right or not :
.......
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
.......
......
Slave_IO_Running: Yes
Slave_SQL_Running: No
......
首先定位同步不成功的原因
查数据库日志
若是从主机重启,事物回滚,则
mysql> slave stop;
mysql> set GLOBAL SQL_SLAVE_SKIP_COUNTER=1;
mysql> slave start;
mysql> set GLOBAL SQL_SLAVE_SKIP_COUNTER=1;
mysql> slave start;
其他网络故障,则
go to master :
mysql> show master status;
+------------------+-----------+--------------+------------------+
| File | Position | Binlog_Do_DB | Binlog_Ignore_DB |
+------------------+-----------+--------------+------------------+
| mysql-bin.000004 | 244274056 | cada_w | |
+------------------+-----------+--------------+------------------+
1 row in set (0.00 sec)
go to slave ,manual replication
mysql>salve stop
mysql> change master to
> master_host='192.168.0.50',
> master_user='askwan',
> master_password='askwan.com',
> master_port=3306,
> master_log_file=localhost-bin.000004',
> master_log_pos=244274056 ;
1 row in set (0.00 sec)
mysql> slave start;
1 row in set (0.00 sec)
mysql> change master to
> master_host='192.168.0.50',
> master_user='askwan',
> master_password='askwan.com',
> master_port=3306,
> master_log_file=localhost-bin.000004',
> master_log_pos=244274056 ;
1 row in set (0.00 sec)
mysql> slave start;
1 row in set (0.00 sec)
then done !
mysql> show slave status\G
......
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
......
another example:
mysql> show slave status\G
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 192.168.100.88
Master_User: slave
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: mysql-bin.000097
Read_Master_Log_Pos: 84519125
Relay_Log_File: HK1-relay-bin.001040
Relay_Log_Pos: 8880267
Relay_Master_Log_File: mysql-bin.000097
Slave_IO_Running: Yes
Slave_SQL_Running: No
Replicate_Do_DB: askwan.com
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 0
Last_Error: Could not parse relay log event entry. The possible reasons are: the master's binary log is corrupted (you can check this by running 'mysqlbinlog' on the binary log), the slave's relay log is corrupted (you can check this by running 'mysqlbinlog' on the relay log), a network problem, or a bug in the master's or slave's MySQL code. If you want to check the master's binary log or slave's relay log, you will be able to know their names by issuing 'SHOW SLAVE STATUS' on this slave.
Skip_Counter: 0
Exec_Master_Log_Pos: 67663733
Relay_Log_Space: 30399576
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: NULL
1 row in set (0.00 sec)
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 192.168.100.88
Master_User: slave
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: mysql-bin.000097
Read_Master_Log_Pos: 84519125
Relay_Log_File: HK1-relay-bin.001040
Relay_Log_Pos: 8880267
Relay_Master_Log_File: mysql-bin.000097
Slave_IO_Running: Yes
Slave_SQL_Running: No
Replicate_Do_DB: askwan.com
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 0
Last_Error: Could not parse relay log event entry. The possible reasons are: the master's binary log is corrupted (you can check this by running 'mysqlbinlog' on the binary log), the slave's relay log is corrupted (you can check this by running 'mysqlbinlog' on the relay log), a network problem, or a bug in the master's or slave's MySQL code. If you want to check the master's binary log or slave's relay log, you will be able to know their names by issuing 'SHOW SLAVE STATUS' on this slave.
Skip_Counter: 0
Exec_Master_Log_Pos: 67663733
Relay_Log_Space: 30399576
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: NULL
1 row in set (0.00 sec)
first stop slave replication:
mysql> stop slave;
Query OK, 0 rows affected (0.04 sec)
Query OK, 0 rows affected (0.04 sec)
change master host info:
mysql> change master to master_host='192.168.100.88',master_user='slave',master_password='askwan.com',master_port=3306,master_log_file='mysql-bin.000097',master_log_pos=67663733;
last step ,start replication:
mysql> start slave;
Query OK, 0 rows affected (0.00 sec)
Query OK, 0 rows affected (0.00 sec)
check all right or not :
mysql> show slave status\G
.......
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
.......




