Issue
The client has a disk migration task on their cloud that has been running for a long time, and it is unclear whether the task is still making progress or has hung.
Environment
OnApp 5.x / 6.x
Resolution
Start by finding the dd process for the virtual disk on the backup server by running the following command:
ps aux | grep <vdisk identifier>
This will output something like the following:
[root@storage ~]# ps aux | grep nvh46v5q5hs6xg
root 5615 3.9 0.0 105176 672 pts/0 Ds+ 20:24 3:28 dd if=/dev/onapp-x7xebtgwqro2jk/nvh46v5q5hs6xg of=/dev/onapp-tcnq7wp8zbtl2c/nvh46v5q5hs6xg bs=4096 conv=notrunc
root 22345 0.0 0.0 103256 868 pts/2 S+ 21:51 0:00 grep nvh46v5q5hs6xg
[root@storage ~]#
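Note that the grep command itself also appears in the listing (PID 22345 above); the dd line is the one we are interested in. If you would rather exclude the grep line, one common shell trick is to wrap the first character of the pattern in brackets, for example:
ps aux | grep "[n]vh46v5q5hs6xg"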
Next, use the PID from that output (5615 in this example) to list the file descriptors the dd process has open:
ll /proc/5615/fd
For example:
[root@storage ~]# ll /proc/5615/fd
total 0
lr-x------ 1 root root 64 Jun 25 21:54 0 -> /dev/dm-11
l-wx------ 1 root root 64 Jun 25 21:54 1 -> /dev/dm-14
lrwx------ 1 root root 64 Jun 25 21:51 2 -> /dev/pts/0
[root@storage ~]#
Notice that each line starts with a file descriptor number. In this case 0 -> /dev/dm-11 is opened read-only (the source logical volume dd is reading from) and 1 -> /dev/dm-14 is opened write-only (the destination logical volume it is writing to); these correspond to the if= and of= devices in the dd command above.
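If you want to confirm which logical volume a given dm-* device maps to, you can check the symlinks under /dev/mapper (a quick sanity check using the dm-11 device from the example above; on most systems the entries there are symlinks named after the volume group and logical volume):
ls -l /dev/mapper | grep dm-11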
Now take the file descriptor number from this output and the PID from before, and check the descriptor's current position:
cat /proc/5615/fdinfo/0
For example:
[root@storage ~]# cat /proc/5615/fdinfo/0
pos: 117058506752
flags: 0100000
[root@storage ~]# cat /proc/5615/fdinfo/0
pos: 117299363840
flags: 0100000
[root@storage ~]#
Notice that the command was run twice, with a short pause in between, and that the pos value increased between the first and second run. pos is the current byte offset within the device, so an increasing value shows that data is in fact still being copied and the migration task has not zombied.
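If you also want a rough idea of how far along the copy is, compare pos with the total size of the source device. For example (assuming the same dm-11 source device as above; blockdev prints the size in bytes):
blockdev --getsize64 /dev/dm-11
Dividing the pos value by that size gives the approximate fraction of the disk that has been copied so far. To keep an eye on progress without re-running the command manually, something like the following also works:
watch -n 10 cat /proc/5615/fdinfo/0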