Hasleo Software Forums

Merge strategy needs some love

Currently the merge strategy seems to be: create a new file with all the data of the merged files combined, then delete the obsolete files.

This doesn't scale.

Example: a full backup uses 51% of the available backup space. Merging any incremental backup into it is not possible, because the method above would need that 51% twice during the merge operation.
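
To put numbers on this, here is a tiny arithmetic sketch. The 51% comes from the example above. The increment size (3%) and the assumption that the merged file ends up about as large as the full backup are made-up values for illustration only.

```python
def copy_merge_peak(full, increment, merged):
    """Peak usage of a copy-style merge: both input files and the new
    output file exist at the same time until the inputs are deleted."""
    return full + increment + merged

# Sizes in percent of the backup space. 51 is from the example above;
# the increment size (3) and the merged size (51) are assumed values.
peak = copy_merge_peak(full=51, increment=3, merged=51)
fits = peak <= 100
print(peak, fits)  # 105 False
```

Any full backup above roughly half of the backup space runs into this, no matter how small the increment is.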

A solution to this dilemma is non-trivial (and I don't know anything about the backup file format).

One strategy I could imagine: "melt" the older backup into the newer file. Say a full backup and an increment are to be merged:
  • Mark the full backup file as "incomplete, use increment file for restore if selected"
  • Read a chunk from the end of the full backup file
  • Append all the data from that chunk which isn't obsolete as a new chunk to the increment file
  • Truncate the full backup file to its new size, freeing exactly the copied chunk
  • Repeat from the second step until the full backup file is empty

In case of a crash during this process, the increment is always a valid restore option. The process could be resumed easily.
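
The steps above could look roughly like this in Python. This is only a sketch under assumptions: I don't know the real backup format, so `melt`, the `keep` callback (a stand-in for the "is this data obsolete?" check) and the file names are hypothetical. The "mark as incomplete" step is left out, and a real format would also need an index, because chunks arrive in the new file in reverse order.

```python
import os
import tempfile

def melt(old_path, new_path, chunk_size, keep=lambda offset, data: True):
    """Move old_path into new_path chunk by chunk, starting at the tail.

    `keep` is a hypothetical stand-in for the obsolete-data check.
    Returns the peak combined size of both files during the operation.
    """
    remaining = os.path.getsize(old_path)
    peak = remaining + os.path.getsize(new_path)
    with open(old_path, "r+b") as old, open(new_path, "ab") as new:
        while remaining > 0:
            start = max(0, remaining - chunk_size)
            old.seek(start)
            data = old.read(remaining - start)
            if keep(start, data):
                new.write(data)
                new.flush()
                os.fsync(new.fileno())  # copy must be on disk before freeing it
            peak = max(peak, remaining + os.path.getsize(new_path))
            old.truncate(start)         # frees exactly the chunk just copied
            old.flush()
            os.fsync(old.fileno())
            remaining = start           # a restart would resume from this size
    return peak

# Small demo with 10 bytes of "full backup" and a 4-byte chunk size.
with tempfile.TemporaryDirectory() as d:
    full, inc = os.path.join(d, "full.bak"), os.path.join(d, "inc.bak")
    with open(full, "wb") as f:
        f.write(b"0123456789")
    with open(inc, "wb") as f:
        f.write(b"INC")
    peak = melt(full, inc, chunk_size=4)
    full_size = os.path.getsize(full)
    with open(inc, "rb") as f:
        merged = f.read()
print(full_size, merged, peak)  # 0 b'INC6789234501' 17
```

In the demo both files start at 13 bytes combined and the peak is 17, so the overhead is one chunk (4 bytes) and never a second copy of the full backup.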

This would only add the size of one "chunk" to the backup space during the operation. Maybe 100 MiB would be a good chunk size. Compression should probably operate on chunks, not on whole backup files.
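
To illustrate what I mean by compressing per chunk: if every chunk is compressed on its own, it can be moved or dropped without touching the rest of the file. The framing below (a 4-byte length prefix followed by zlib data) is purely an assumption for the sketch, not the actual file format.

```python
import struct
import zlib

def pack_chunk(data):
    """Compress one chunk on its own and prefix it with its compressed length."""
    body = zlib.compress(data)
    return struct.pack("<I", len(body)) + body

def unpack_chunks(blob):
    """Yield the original data of each chunk stored back to back in blob."""
    pos = 0
    while pos < len(blob):
        (size,) = struct.unpack_from("<I", blob, pos)
        pos += 4
        yield zlib.decompress(blob[pos:pos + size])
        pos += size

# Two chunks packed independently; either could be copied without the other.
blob = pack_chunk(b"a" * 1000) + pack_chunk(b"b" * 1000)
chunks = list(unpack_chunks(blob))
```

With whole-file compression, moving the tail of a file would mean decompressing everything before it first.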

Thank you for your efforts to create a new backup suite and your generous free launch!

Edit: a similar strategy, but much faster, for file systems supporting hole-punching:

Phase 1:
  • Mark the increment file as "incomplete, use full backup file for restore if selected"
  • Read a chunk from the end of the increment file
  • Append all the data from that chunk to the full backup file
  • Truncate the increment file, freeing exactly the copied chunk

Phase 2:
  • Read a chunk from the full backup file
  • If it contains obsolete data, copy and append to the end of the full backup file, discarding the obsolete data, and punch a hole at the old location

Unlike the previous strategy, this one mostly just reads the full backup file and leaves most of its data untouched. Instead of hole-punching, backups could generally be split into chunk files, which would support more file systems.
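
Here is a rough sketch of Phase 2 for the chunk-file variant, since that one needs no special file system support. Deleting an old chunk file plays the role of punching a hole. Everything about the layout is assumed: the `.chunk` naming, the fixed-size blocks, and the `is_obsolete` callback are hypothetical.

```python
import os
import tempfile

def compact_chunk_files(directory, block_size, is_obsolete):
    """Rewrite only the chunk files that contain obsolete blocks."""
    names = sorted(n for n in os.listdir(directory) if n.endswith(".chunk"))
    next_id = max((int(n.split(".")[0]) for n in names), default=-1) + 1
    rewritten = []
    for name in names:
        path = os.path.join(directory, name)
        with open(path, "rb") as f:
            data = f.read()
        blocks = [data[i:i + block_size] for i in range(0, len(data), block_size)]
        live = [b for i, b in enumerate(blocks) if not is_obsolete(name, i)]
        if len(live) == len(blocks):
            continue  # nothing obsolete: this chunk file is only read
        if live:
            new_path = os.path.join(directory, f"{next_id:06d}.chunk")
            with open(new_path, "wb") as f:
                f.write(b"".join(live))
                f.flush()
                os.fsync(f.fileno())  # new chunk is on disk before the old one goes
            next_id += 1
        os.remove(path)  # frees the space, like punching a hole would
        rewritten.append(name)
    return rewritten

# Demo: block 1 of the first chunk file is obsolete, the second file is clean.
with tempfile.TemporaryDirectory() as d:
    for name, data in [("000000.chunk", b"AAAABBBBCCCC"), ("000001.chunk", b"DDDDEEEE")]:
        with open(os.path.join(d, name), "wb") as f:
            f.write(data)
    obsolete = {("000000.chunk", 1)}
    rewritten = compact_chunk_files(d, 4, lambda n, i: (n, i) in obsolete)
    remaining = {}
    for n in sorted(os.listdir(d)):
        with open(os.path.join(d, n), "rb") as f:
            remaining[n] = f.read()
```

Only the chunk file with obsolete data is rewritten, so the extra space needed at any time is at most one chunk file.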

Your suggestion is good, but it may be very difficult to implement. Our development team will explore its feasibility, and we will improve the merge in a future release if it is technically feasible. Please note that it will not appear in the upcoming releases.