Overview
After several days of using my dm-cache setup (see: Adding an SSD cache to an existing encrypted filesystem with dm-cache) to cache my encrypted /home filesystem, I noticed that very little was actually being cached. Based on my reading of the block statistics for the three block devices involved, the cache was saving me a large number of writes to disk. Still, I hadn't noticed any perceptible performance improvement. Caching just /home may not be enough to change my perception of my workstation as being held back by a slow drive. The real problem, however, is that I have no baseline to compare against. It's time to remove the cache and set up some stats collection.
Block device review
/dev/mapper/benwayvg-ehome luks block device on an LVM partition
/dev/mapper/cehome cache device from dmsetup
/dev/mapper/ehome opened luks device (an ext4 filesystem)
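If you want to double-check how these devices are stacked before touching anything, the device dependency tree can be printed with: dmsetup ls --tree (lsblk shows much the same picture).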
Save the Stats
The cumulative block stats for the three devices are still illustrative. This is after 4 days of uptime, taken from /sys/block/dm-{17,18,19}/stat:
device rd i/o rmerges rsectors rticks wr i/o wmerges wsectors wticks inflight io_ticks time_in_queue
/dev/mapper/benwayvg-ehome 120223 0 4702124 651290 127889 0 20907992 3195852 0 644128 3847142
/dev/mapper/cehome 133223 0 4518728 595193 578275 0 32206792 5623591 0 896061 6218948
/dev/mapper/ehome 133137 0 4517804 604400 550712 0 32206792 12038764 0 911568 12643178
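For reference, those counters come straight from the kernel's per-device stat files. A quick way to dump them with a device label in front (assuming the same dm-17/18/19 minor numbers as on my machine; adjust for yours) is something like:
for d in dm-17 dm-18 dm-19; do printf '%s ' "$d"; cat /sys/block/$d/stat; done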
Some general observations can be made here. I didn't notice much of a performance change after adding the cache device, but it clearly saved a large percentage of writes to disk. The LUKS device saw 550712 writes, and LUKS apparently adds a little overhead, since the cache device underneath it saw 578275. The cache, however, only issued 127889 writes to the actual disk, roughly a 78% reduction!
There wasn't much saving in disk read I/O, however, though perhaps some tuning would have changed that. I plan to revisit it all after establishing some baseline stats over the next week.
The output of dmsetup status cehome also revealed only 1471 blocks in the cache. Blocks in this case are 256 KiB each, which is only about 368 MiB. I guess my expectation was that more would be cached. The documentation for dm-cache (device-mapper/cache.txt in your kernel documentation directory) explains all the fields in the dmsetup status output, and for illustration I'll show the full output:
0 209715200 cache 376/524288 19279 877645 583982 265753 0 662 1471 1438 0 2 migration_threshold 204800 4 random_threshold 4 sequential_threshold 2048
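Reading that against the documentation, and with the caveat that the exact field order depends on your kernel version, so treat this as a rough annotation of my output rather than a definitive decoding:

376/524288                                       metadata blocks used/total
19279 877645                                     read hits / read misses
583982 265753                                    write hits / write misses
0 662                                            demotions / promotions
1471                                             blocks resident in the cache
1438                                             dirty blocks
0                                                feature arguments
2 migration_threshold 204800                     core arguments
4 random_threshold 4 sequential_threshold 2048   policy arguments (random/sequential thresholds)

The low read hit count (19279 hits against 877645 misses) lines up with the lack of read savings noted above.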
Perhaps dm-cache is simply doing its job and not caching blocks that would be evicted too quickly. Or maybe my /home partition is a good candidate for caching but just doesn't require very much cache to satisfy its needs.
Prepare for removal
First I have to unmount /home, and to do that I have to kill all user processes and any other processes that might have files open on /home. Be sure to save your work and log in as root if you're attempting something similar.
pkill -u coxa ; umount /home
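If umount refuses because the filesystem is still busy, fuser can show what's holding it open (lsof +D /home works too): fuser -vm /home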
Now we can close the encrypted LUKS device: cryptsetup luksClose ehome
Removing the Cache
Basically I just need to remove the cache, and to do that safely I first need to switch to the cleaner policy, which writes all dirty cache blocks back to the origin device.
[root@benway ~]# dmsetup table cehome
0 209715200 cache 8:17 8:18 253:17 512 1 writeback default 0
[root@benway ~]# dmsetup status cehome
0 209715200 cache 376/524288 19279 877645 583982 265753 0 662 1471 1438 0 2 migration_threshold 204800 4 random_threshold 4 sequential_threshold 2048
[root@benway ~]# dmsetup suspend cehome
[root@benway ~]# dmsetup reload cehome --table '0 209715200 cache 8:17 8:18 253:17 512 1 writeback cleaner 0'
[root@benway ~]# dmsetup resume cehome
[root@benway ~]# dmsetup status cehome
0 209715200 cache 376/524288 19279 877657 583982 265753 0 0 1471 1009 0 2 migration_threshold 204800 0
[root@benway ~]# dmsetup status cehome
0 209715200 cache 376/524288 19279 877657 583982 265753 0 0 1471 530 0 2 migration_threshold 204800 0
[root@benway ~]# dmsetup status cehome
0 209715200 cache 376/524288 19279 877657 583982 265753 0 0 1471 361 0 2 migration_threshold 204800 0
[root@benway ~]# dmsetup status cehome
0 209715200 cache 376/524288 19279 877657 583982 265753 0 0 1471 0 0 2 migration_threshold 204800 0
[root@benway ~]# dmsetup status cehome
0 209715200 cache 376/524288 19279 877657 583982 265753 0 0 1471 0 0 2 migration_threshold 204800 0
[root@benway ~]# dmsetup wait cehome
[root@benway ~]# dmsetup remove cehome
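If I'm reading the dm-cache table format correctly, that table line breaks down as:

0 209715200        start sector and length of the target
cache              target type
8:17 8:18 253:17   metadata device, cache (SSD) device, origin device
512                cache block size in sectors (256 KiB)
1 writeback        one feature argument: writeback
default 0          policy name and zero policy arguments

The reload keeps every field identical and only swaps the policy from default to cleaner.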
You can see the dirty-block column dropping to zero after resuming the device with the cleaner policy in place.
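Rather than re-running dmsetup status by hand, something like watch works just as well for keeping an eye on the dirty count until it hits zero: watch -n 5 'dmsetup status cehome'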
Tidying Up
I needed to modify /etc/crypttab so that it references the logical volume we are no longer caching, rather than the cached block device removed above.
#ehome /dev/mapper/cehome
ehome /dev/benwayvg/ehome
I was using a systemd service to create the dm-cache device, so we need to make sure it no longer starts: rm -f /usr/lib/systemd/system/local-fs.target.wants/dmsetup-dm-cache.service /etc/systemd/system/dmsetup-dm-cache.service
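After removing unit files it's probably worth having systemd re-read its configuration, even though it isn't strictly required here: systemctl daemon-reload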
Next steps
The next step is to start gathering some detailed statistics about the disk I/O on my system. For that I am going to try Graphite/Carbon or collectd. My next post will include some setup notes from that process.