Wessel Dankers
2012-08-06 14:46:12 UTC
Hi,
(This thread was originally in -users.)
> My novice understanding is that the next release is a "whopper" which
> fundamentally changes many things about how BackupPC works...so it's
> taking a long time.
I'm sorry to have to cast doubt on this. I have heard the above many times
on this list, but BackupPC development is opaque and centralised in the
main developer. A lot of us write patches, and the Debian packagers seem to
be helping keep BackupPC up to date. But getting those patches upstream
seems impossible, and I'm beginning to wonder if v4 is coming at all.
I'm currently experimenting with some ideas of my own.
My employer uses BackupPC to back up ~150 Unix servers nightly. To work
around some performance problems I had to alter the source code a little.
While doing this I realized there might be some nice gains to be made by
doing certain things differently.
My ideas overlap to a limited extent with the ones[0] that Craig posted
to this list. For instance, no more hardlinks, and garbage collection is
done using flat-file databases. Some things are quite different. I'll try
to explain my ideas here.
Observations
============
1) I/O: writes are fast, reads are slow. Random reads are very slow.
Writing can be done asynchronously; even random writes can be
elevator-sorted onto the platters fairly efficiently. Reading, on the
other hand, blocks the process doing it. Running many processes in
parallel can alleviate the random-read bottleneck a little, but it's
still best to do sequential reads whenever possible.
2) CPU and RAM are cheap.
BackupPC parallelizes well and for larger installations 12-core or even
24-core systems can be had for reasonable prices. Memory is dirt cheap.
File data
=========
All files are to be split into chunks of (currently) 2 megabytes. These
chunks are addressed by their sha512 hashes (that is, the filename for each
chunk is simply the base64-encoded hash). Any compression is applied after
splitting. This provides dedup even for large files that change or
otherwise differ only slightly.
Using sha512 eliminates collisions for all practical purposes and provides
safe dedup without expensive whole-file comparisons, which obviates the
need for a lot of disk reading.
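To make the chunking scheme concrete, here is a minimal Perl sketch. The
flat pool directory, the base64url encoding and the use of Compress::Zlib
are assumptions of mine for the example; it illustrates the idea, it is
not the actual implementation.

  #!/usr/bin/perl
  # Sketch: split a file into 2 MiB chunks, name each chunk after the
  # base64url-encoded sha512 of its (uncompressed) contents, and compress
  # after splitting. Pool layout and encoding are assumptions, not final.
  use strict;
  use warnings;
  use Digest::SHA qw(sha512);
  use MIME::Base64 qw(encode_base64url);
  use Compress::Zlib;                      # provides compress()

  use constant CHUNK_SIZE => 2 * 1024 * 1024;

  my ($file, $pooldir) = @ARGV;
  open my $fh, '<:raw', $file or die "$file: $!";

  my @hashes;                              # ordered hash list for the metadata db
  my $chunk;
  while (read($fh, $chunk, CHUNK_SIZE)) {
      my $name = encode_base64url(sha512($chunk));   # chunk address
      push @hashes, $name;
      my $path = "$pooldir/$name";
      next if -e $path;                    # dedup: identical chunk already pooled
      open my $out, '>:raw', $path or die "$path: $!";
      print {$out} compress($chunk);       # compression applied after splitting
      close $out or die "$path: $!";
  }
  close $fh;
  print "$_\n" for @hashes;                # the hash list that describes this file

A real version would fan the pool out into subdirectories and handle I/O
errors more carefully, but the dedup and hashing behaviour is the same.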
Pool metadata
=============
Each backup consists of a number of databases, one for each share. Each
database contains a complete list of files and for each file the file
attributes and a list of sha512 hashes that together describe the contents.
These databases are of the "write once, read many times" variety, similar
to djb's cdb databases or ISO 9660 filesystems. This part is already
implemented:
https://git.fruit.je/hardhat
https://git.fruit.je/hardhat-perl
Some informal estimates based on real-world data indicate that each file
takes up about 100 bytes of metadata on average in this format.
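To illustrate what "attributes plus a list of sha512 hashes" amounts to
per file, here is a rough sketch of a record. The field layout and pack
format are made up for the example and are not the actual hardhat format.

  # Sketch only: a per-file value of fixed-width attributes followed by the
  # ordered raw sha512 digests (64 bytes each) of the file's chunks, stored
  # under the file's path within the share. Not the real hardhat layout.
  use strict;
  use warnings;

  sub make_record {
      my ($attrs, @chunk_digests) = @_;    # raw 64-byte sha512 values, in order
      my $value = pack 'NNNNQ>',           # mode, uid, gid, mtime, size
          @{$attrs}{qw(mode uid gid mtime size)};
      return $value . join '', @chunk_digests;
  }

For a small single-chunk file that is 24 bytes of attributes plus one
64-byte digest, which is at least consistent with the ~100-byte average
mentioned above.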
Garbage collection
==================
The list of used hashes is collected from each backup by scanning the
databases for each share. These per-share lists are merged into a list for
the host they belong to, and the per-host lists are merged again into a
global list.
The actual garbage collection simply enumerates all files in the chunk
pool and checks against the global list whether each chunk is still
needed.
This part is already implemented:
https://git.fruit.je/hashlookup-perl
These hash lists need to be sorted and merged so that lookups will be
efficient. That part is already implemented as well, but doesn't yet have a
git repository of its own.
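As an illustration of the sweep itself (not of the hashlookup-perl API),
here is a sketch that assumes the per-host lists have already been merged
into one sorted global file with one base64 hash per line:

  # Sketch: mark-and-sweep over a flat chunk pool. The sorted global list is
  # loaded into memory here for brevity; the real design would keep it on
  # disk and binary-search or stream it instead.
  use strict;
  use warnings;

  my ($global_list, $pooldir) = @ARGV;

  open my $fh, '<', $global_list or die "$global_list: $!";
  chomp(my @wanted = <$fh>);               # sorted, base64-encoded chunk hashes
  close $fh;

  sub still_needed {                       # binary search in the sorted list
      my ($hash) = @_;
      my ($lo, $hi) = (0, $#wanted);
      while ($lo <= $hi) {
          my $mid = int(($lo + $hi) / 2);
          my $cmp = $hash cmp $wanted[$mid];
          return 1 if $cmp == 0;
          if ($cmp < 0) { $hi = $mid - 1 } else { $lo = $mid + 1 }
      }
      return 0;
  }

  opendir my $dh, $pooldir or die "$pooldir: $!";
  for my $chunk (readdir $dh) {
      next if $chunk =~ /^\./;             # skip . and ..
      unlink "$pooldir/$chunk" unless still_needed($chunk);
  }
  closedir $dh;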
Incremental backups
===================
Incremental backups use a previous backup as a reference. Any file that
has the same timestamp and size as in the reference is not transferred
from the client; instead its metadata is copied from the previous backup.
That means that incremental backups are complete representations of the
state of the client. The reference backup is not needed to browse or
restore files.
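The per-file decision is essentially this (field names assumed for the
sake of the example):

  # Sketch of the incremental decision: reuse metadata (attributes plus the
  # chunk-hash list) from the reference backup when timestamp and size match,
  # otherwise fetch the file from the client. Field names are assumptions.
  sub plan_file {
      my ($client_entry, $reference_entry) = @_;
      if (defined $reference_entry
          && $client_entry->{mtime} == $reference_entry->{mtime}
          && $client_entry->{size}  == $reference_entry->{size}) {
          return { action => 'copy_metadata', from => $reference_entry };
      }
      return { action => 'transfer' };     # new or changed: get it from the client
  }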
Integration
===========
The above ideas and code need to be integrated into BackupPC. I've created
a git repository for that:
https://git.fruit.je/backuppc
but the code in that repository is pretty much guaranteed to not work at
all, for now.
My next target is to create BackupPC::Backup::Reader and
BackupPC::Backup::Writer classes similar to the poolreader and poolwriter
classes. The reader class might even get a variant that can read v3
backups; useful for migration scenarios.
===
So, where to go from here? I'd love to hear from Craig what he thinks of
all this. As far as I'm aware he has not started work on 4.0 yet, so I'll
just take the liberty of continuing to tinker with the above ideas in the
meantime. :)
And of course, comments/feedback are more than welcome.
Kind regards,
--
Wessel Dankers <***@fruit.je>
[0]
http://sourceforge.net/mailarchive/message.php?msg_id=27140174
http://sourceforge.net/mailarchive/message.php?msg_id=27140175
http://sourceforge.net/mailarchive/message.php?msg_id=27140176