Annual Thanks Giving Service Karunya

Cinthiya(my sister) called me and said that the Annual Thanks Giving Service at Karunya is being hosted live on the web. Watching it gave me a nostalgic feeling of the good old days in Karunya where i was part of the Mega Play Team. I managed to record part of the stream though the video quality is not great. The below video is an extract containing the final choreography in which my sister was part of.

mysql and disk space anomoloy

Every hour, we saw the diskspace on our root partition go up from 45 %  to 90 % and come back to 45 %. Initially thought mysql was processing some temp tables and in the next hour we kept watching the temp location. In terms of files / diskspace usage using “du / ls” nothing changed. But lsof provided an interesting output.

Output 1 :

mysqld    31719   mysql    6u   REG                8,1          0   149240 /var/tmp/ibOLLuRa (deleted)
mysqld    31719   mysql    7u   REG                8,1        102   149243 /var/tmp/ibNuSvI9 (deleted)
mysqld    31719   mysql    8u   REG                8,1          0   149245 /var/tmp/ibqq2wz8 (deleted)
mysqld    31719   mysql    9u   REG                8,1          0   149255 /var/tmp/ibVN66Rh (deleted)
mysqld    31719   mysql   13u   REG                8,1          0   149256 /var/tmp/ibslSJJA (deleted)
mysqld    31719   mysql   15u  unix 0×000001085e445680            19044143 /tmp/mysql.sock
mysqld    31719   mysql   18u   REG                8,1 1128267776   146656 /var/tmp/MLHbDVOq (deleted)

Output 2 :

mysqld    31719   mysql    6u   REG                8,1          0   149240 /var/tmp/ibOLLuRa (deleted)
mysqld    31719   mysql    7u   REG                8,1        102   149243 /var/tmp/ibNuSvI9 (deleted)
mysqld    31719   mysql    8u   REG                8,1          0   149245 /var/tmp/ibqq2wz8 (deleted)
mysqld    31719   mysql    9u   REG                8,1          0   149255 /var/tmp/ibVN66Rh (deleted)
mysqld    31719   mysql   13u   REG                8,1          0   149256 /var/tmp/ibslSJJA (deleted)
mysqld    31719   mysql   15u  unix 0×000001085e445680            19044143 /tmp/mysql.sock
mysqld    31719   mysql   18u   REG                8,1 1396703232   146656 /var/tmp/MLHbDVOq (deleted)

The size occupied by a file marked deleted kept increasing :( Strange but it seems to be happening on every dataload. The size change in this file corroborates perfectly with the space usage reported by “df” but not with “du”.

We are setting up the tmp location to a filesystem with more space for now before we investigate further.

A Beautiful Song !

moving from mysql 5.0 to 5.1

Problem Statement : 386 GB of MySQL 5.0 Compressed MyISAM tables (Merge engine used)  need to be migrated to MySQL 5.1 Compressed/Partitioned tables. Also we had to add a few new columns and change the datatype of few existing columns.

Initial plan : Setup the MySQL 5.1 database with the schema with partitioning to which the data needs to be migrated and then run scripts to copy the .MYD ,.MYI and .frm files to the 5.1 database. Disable the keys using “Alter table <> disable keys“. Then migrate the data using the insert into newtable select * from old table. Then to compress the tables run myisampack and myisamchk.

As per plan we started the migration process. The plan worked perfectly though a bit on the slower side for most of the tables. For few of the tables, the import process kept dragging on and on. It was well past a week since the import was started and the system was writing at a speed of 1 – 1.5 Gig per hour on to disk. There was heavy contention on the key buffer and the behavior was seen on tables which had keys greater than the available buffer and those with where clause in the “insert into <> select from” statments. Then we started getting constrained on time, was looking for avenues to optimize this whole process.

After quite a number of tests and trials , we found the following 2 methods to be faster.

Method 1:

1. Dump the data using select into outfile in the primary key order in small chunks if the table is huge.

2. Alter table <> disable keys.

3. Load the data using load data infile statment.

4. Run myisampack and myisamchk to compress the data and rebuild the indexes respectively.

Method 1, was faster than the original plan.

Method 2 :

Method 2 would be a slight variation of method 1. Instead of “Alter table <> disable keys“, do the following :

For each partition on the  table run myisamchk  –keys-used=0 -rq <tablename>. This would disable all keys on the table. Now proceed from step 3.

With the second approach we, got a throughput of 1GB data written to disk in 1 minute approx. Will follow up this post with more detailed commands and steps. Whew ! What a week it has been so far !

Monkeys in Bannerghetta zoo

Victor’s B’Day

Me @ ma desk

Data inconsistency issues – binlog to rescue

One of our peer group developers contacted me today stating that they were inserting a value of 10 in one of the columns in their tables. But after the insert, when they query the table they were getting the value as 0.

Logged into the box to check what was happening. Luckily binlog was turned on that mysql instance. Ran the mysqlbinlog utility on the current binlog and greped for the particular primary key value.

It listed an insert statement and an update statement. There was the culprit, the value of 10 was getting inserted but soon the application was following it up with an update to value 0. Yeah it is always not the database ;)

Relay binlog corrupt

The slave failed with the error that the relay binlog is corrupt. It had copied close to 12 binlogs from the master and they were yet to be applied. Unfortunately those binlogs have been purged on the master. Now to sync up cleanly we might have to refresh data from the master which can be costly since it was a 290 GB database. We had the option of shutting down the server. We thought we can try our luck with a crazy hack. We shutdown the server. Tried reading the binlog using mysqlbinlog utility from the corrupt position. It failed as expected. Then we tried reading from the next immediate position and it went through fine. Now we had a proof that our hack might work. We opened the relay-log.info and incremented the second row by a value of one. Then we started the server. Boom, the slave started running and we were saved from a great pain of resyncing the slave.

PS : We might have missed one transaction in this hack, but that was ok for our use case.

Script to obtain the count of archive logs applied during recovery

Pass the start date + time and end date + time to the following script. It looks at the alert log file and obtains the no of archive logs that have been applied during media recovery in that time interval.

#!/usr/bin/perl -w

BEGIN { push @INC,"/home/user/TimeDate-1.16/lib/" }
use Date::Parse;
$start=str2time(@ARGV[0]);
chomp(@ARGV[1]);
$end=str2time(@ARGV[1]);

$log='path to the log file';
open(ALERTLOGFILE,$log) or die "Not able to open ".$log;
$progress=-1;
$count=0;
while(<ALERTLOGFILE>){

if(/^[A-Z][a-z]{2}\s[A-Z][a-z]{2}/){
  $time=str2time($_);

  if(($time >= $start) && ($time <= $end)){
   $progress=1;
   }else{
     $progress=-1;
   }
}
if(/^Media Recovery Log/){
  if($progress == 1){
    $count+=1;
  }
}
}
print $count ."\n";