A (possibly overengineered) system for data redundancy


So I have a lot of machines which I need to keep backed up. I have more than a few laptops, and I'm slowly growing my collection of servers. Data redundancy is, in general, a good thing.

Hardware

The backup hosting box is a Cloud Engines Pogoplug Pro. This is a really funky little box -- it was originally intended as a sort of personal private cloud thing, similar to Western Digital's MyCloud series of network attached storage devices. You create an account, your box gently talks to Cloud Engines' servers, and you can access your files from anywhere in the world where you can hit Cloud Engines' public network. As it turns out, you can also solder on a serial port, flash a custom U-Boot ROM onto the NAND, and run Debian on it (more on that later).

The hardware is... interesting. This is a strange old ARMv6 SoC made by a long-defunct manufacturer called Oxford Semiconductor (the chip family is known as OXNAS). The (dual-core) OX820 chipset has some silicon-level design faults which mostly manifest themselves as slightly horrifying concurrency issues, especially when DMA is involved. The production run of this silicon was accordingly not very long, and later devices in the Pogoplug series (the Pogoplugs 3 and 4) used Marvell Kirkwood chipsets instead.

The Pogoplug Pro does have some nice things though -- it has a SATA port on the mainboard and a PCIe slot which by default comes with a Ralink WiFi card. The SATA port is by default inaccessible, but can be exposed to the outside world via a small hardware mod. Power can be provided to a disk on the SATA port by tapping the power from the USB bus, as there are some extra USB ports on the PCB which don't have anything actually mounted on them. The WiFi card was a deal-maker for me, as it's something the later Pogoplugs lack -- it wasn't the easiest decision though, as most of the Pogoplug 3s I've seen on eBay come in the most adorable pink and white colour scheme.

The Pogoplugs as a collective have four USB 2 ports and a gigabit ethernet port, though on my specimen the CPU isn't fast enough to run the ethernet at line rate. It's still faster than a Raspberry Pi 3 though, as the ethernet is on a different bus from the USB ports. I'm not sure how it compares to the 3+, as I've never had my hands on one of them.

For data storage, I picked up two 2 TB USB disk drives. One's a Seagate, and has all sorts of fancy SATA features for disk introspection and power management, and the other is a Western Digital, which has nearly no extra SATA features but a much higher optimal I/O block size. Bearing in mind both of these disks will be on USB 2, they aren't going to be fast -- but they don't need to be fast, they need to work.

Software

As I implied, I run Debian on the Pogoplug. Go and read this forum -- it is my authoritative source of information for all things Pogoplug. For robustness and power-saving purposes, I've torn out all of the default userland and replaced it with a custom-built solution using the software stack written by Laurent Bercot (aka skarnet). skarnet writes good software -- I've been running a configuration similar to the Pogoplug's on two laptops for well over a year now, and it's been an incredibly robust and low-maintenance solution.

The two hard disks are meshed into a RAID 1 array using the standard Linux md driver, and I formatted the array with xfs. According to some folks on IRC, xfs handles large volumes better than ext4, and the xfs driver certainly seems to be pretty smart about running on top of RAID arrays, at least according to the Arch Linux wiki.
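
For the curious, here's a rough sketch of what that array setup might look like when driven from a script. This isn't my actual setup script: the device names (/dev/sda1, /dev/sdb1, /dev/md0) and the helper names are made up for illustration, and it only prints the commands unless you explicitly turn the dry run off.

```python
import shlex
import subprocess


def raid1_xfs_commands(members, array="/dev/md0"):
    """Return the argv lists that mirror `members` and put xfs on the result."""
    if len(members) < 2:
        raise ValueError("RAID 1 needs at least two member devices")
    create = [
        "mdadm", "--create", array,
        "--level=1",                        # RAID 1, i.e. mirroring
        f"--raid-devices={len(members)}",
        *members,
    ]
    mkfs = ["mkfs.xfs", array]              # format the md device, not the members
    return [create, mkfs]


def run(commands, dry_run=True):
    """Print each command; only execute it when dry_run is False."""
    for argv in commands:
        print(shlex.join(argv))
        if not dry_run:
            subprocess.run(argv, check=True)


if __name__ == "__main__":
    # Hypothetical partition names -- check lsblk before doing this for real,
    # since creating the array destroys whatever is on the members.
    run(raid1_xfs_commands(["/dev/sda1", "/dev/sdb1"]))
```

Keeping the command construction separate from the execution means you can eyeball exactly what would be run before letting it anywhere near real disks.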

As the Pogoplug travels with me a lot of the time, I have it set up to automagically SSH to one of my servers as an access point from the public internet, and set up a reverse tunnel to a special sshd on the Pogoplug's localhost. This sshd is configured to only allow logins for a special sftponly user, who is chrooted into a directory on the RAID array and only permitted SFTP access. It goes without saying, but all the authentication is done with SSH keys here.
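
To make the moving parts a bit more concrete, here's a minimal sketch of the two halves: the ssh invocation that opens the reverse tunnel, and a config for the SFTP-only sshd. Everything specific in it is an assumption for the sake of the example, not my real configuration: the port numbers (2222 on the server, 2200 on the Pogoplug), the key path, the chroot path and the relay hostname are all placeholders.

```python
def reverse_tunnel_argv(relay, remote_port=2222, local_port=2200,
                        key="/root/.ssh/tunnel_key"):
    """argv for an ssh client that forwards relay:remote_port back to our local sshd."""
    return [
        "ssh", "-N",                           # no remote command, just forwarding
        "-i", key,
        "-o", "ExitOnForwardFailure=yes",      # exit if the port can't be bound
        "-o", "ServerAliveInterval=30",        # notice dead links so a supervisor can restart us
        "-R", f"{remote_port}:localhost:{local_port}",
        relay,
    ]


def sftp_only_sshd_config(user="sftponly", chroot="/srv/backup", port=2200):
    """Render an sshd_config for a localhost-only, key-only, SFTP-only daemon."""
    lines = [
        f"Port {port}",
        "ListenAddress 127.0.0.1",             # reachable only via the tunnel
        "PasswordAuthentication no",           # SSH keys only
        "Subsystem sftp internal-sftp",
        f"AllowUsers {user}",
        f"Match User {user}",
        # sshd insists that the chroot directory is owned by root and not
        # writable by anyone else, so backups go into a subdirectory of it.
        f"    ChrootDirectory {chroot}",
        "    ForceCommand internal-sftp",
        "    AllowTcpForwarding no",
        "    X11Forwarding no",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(" ".join(reverse_tunnel_argv("tunnel@relay.example.org")))
    print(sftp_only_sshd_config())
```

One thing to watch: by default the forwarded port only listens on the server's loopback interface, so other machines can only reach it if the server's sshd has GatewayPorts enabled (or if they hop through the server first).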

The rest is handled by duplicity and some scripting for automation. Setting up a system to be backed up is then pretty simple: install duplicity and the scripting, create an SSH key and transfer the key to the Pogoplug. Run the script as root, and you get encrypted, incremental backups appearing on the Pogoplug.
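
As an illustration, the core of such a script might boil down to something like the following. This is a sketch rather than my actual scripting: the relay hostname, port, key path, GnuPG key ID and the 30-day full-backup interval are all invented for the example, and it assumes encryption to a GnuPG public key (duplicity can also do symmetric encryption with a passphrase) and the classic `duplicity SOURCE TARGET` invocation form, so check the man page for your installed version.

```python
def duplicity_argv(source, host_label, key_id,
                   relay="relay.example.org", port=2222, user="sftponly",
                   ssh_key="/root/.ssh/backup_key", full_every="30D"):
    """Build a duplicity command line that backs `source` up over SFTP."""
    # A single slash after the port makes the path relative to the login
    # directory, which for a chrooted user is the chroot itself.
    target = f"sftp://{user}@{relay}:{port}/{host_label}"
    return [
        "duplicity",
        "--encrypt-key", key_id,               # encrypt to this GnuPG public key
        "--full-if-older-than", full_every,    # otherwise the run is incremental
        # Passed as one argument, since the value itself starts with a dash.
        f"--ssh-options=-oIdentityFile={ssh_key}",
        source,
        target,
    ]


if __name__ == "__main__":
    import shlex
    # Hypothetical key ID and host label; print the command instead of running it.
    print(shlex.join(duplicity_argv("/home", "laptop1", "0xDEADBEEF")))
```

With no explicit action, duplicity does a full backup the first time it sees a target and incrementals after that, which is what makes the "run it from cron as root and forget about it" approach work.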


