Backups

From Rixort Wiki

Latest revision as of 10:07, 24 September 2024

Topics for consideration

  • How can I start backups automatically when I log in? They should not start until a network connection is available and, possibly, my keyring has been unlocked.
  • How can I start a backup run periodically? Is this necessary given that I usually log in at least once every 24 hours?
  • How can I put an icon in the system menu that shows backup status/progress? Something similar to Nextcloud would be useful, with a tick for complete, circular arrows for in progress, and a cross for failed.

Existing software

  • tar
  • Obnam
  • restic
  • Borg
  • rdiff-backup
  • deja-dup
  • rclone (for backing up from/to cloud services such as Google Docs)

Writing new software

Powerful Command-Line Applications in Go (book) may be useful

MVP:

  • Include file (one per line)
  • Exclude file (one per line)
  • Copy files from X to Y
  • Database (SQLite? DuckDB?) to hold metadata (map hashes to file paths)
  • File size and SHA-3 for deduplication
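
The MVP items above could fit together roughly as follows. This is a minimal sketch under stated assumptions, not a settled design: it uses Python and SQLite for brevity, treats the include/exclude lines as shell-style glob patterns, and stores each unique file once under a hypothetical objects/ directory named by its SHA3-256 hash. The function names (read_patterns, sha3_of, backup) and the files table are made up for illustration.

```python
import hashlib
import shutil
import sqlite3
from fnmatch import fnmatch
from pathlib import Path


def read_patterns(path):
    """Read one pattern per line, skipping blank lines and # comments."""
    lines = Path(path).read_text().splitlines()
    return [s.strip() for s in lines if s.strip() and not s.strip().startswith("#")]


def sha3_of(path, chunk_size=1 << 20):
    """Hash a file with SHA3-256 in chunks so large files are not loaded whole."""
    digest = hashlib.sha3_256()
    with open(path, "rb") as f:
        while chunk := f.read(chunk_size):
            digest.update(chunk)
    return digest.hexdigest()


def backup(source, target, includes, excludes, db_path):
    """Copy matching files from source to target, storing each content once.

    Returns the number of new objects written to the target.
    """
    source, target = Path(source), Path(target)
    objects = target / "objects"  # hypothetical layout: one file per hash
    objects.mkdir(parents=True, exist_ok=True)
    db = sqlite3.connect(db_path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS files ("
        "path TEXT PRIMARY KEY, size INTEGER NOT NULL, sha3 TEXT NOT NULL)"
    )
    copied = 0
    for path in sorted(source.rglob("*")):
        if not path.is_file():
            continue
        rel = path.relative_to(source).as_posix()
        if not any(fnmatch(rel, p) for p in includes):
            continue
        if any(fnmatch(rel, p) for p in excludes):
            continue
        digest = sha3_of(path)
        stored = objects / digest
        if not stored.exists():  # identical content is stored only once
            shutil.copy2(path, stored)
            copied += 1
        # Metadata maps each source path to its size and content hash.
        db.execute(
            "INSERT OR REPLACE INTO files (path, size, sha3) VALUES (?, ?, ?)",
            (rel, path.stat().st_size, digest),
        )
    db.commit()
    db.close()
    return copied
```

A run might look like backup("/home/me", "/mnt/backup", read_patterns("include.txt"), read_patterns("exclude.txt"), "/mnt/backup/meta.db"), where all of those paths are placeholders. Restore would then be a lookup of path to hash in the files table followed by a copy out of objects/.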

Thoughts:

  • Can the hash checking of files be done in parallel?
  • Should we compare file size first, and only calculate hashes if the sizes are the same?
  • What metadata should we store about each file?
  • How to restore files?
  • How to prune files?
  • Deduplication is at the file level for now; could it also be done on chunks of files at a later date?
  • Can we compress the files? Can this be done in parallel?
  • What is the fastest collision-resistant algorithm for file comparison?
  • How do we stop two backup processes running simultaneously?
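
Three of the questions above (size-first comparison, parallel hashing, and preventing simultaneous runs) can be explored with a short sketch. It is one possible approach under assumptions, not an answer to the open questions: it uses Python threads, which can overlap work here because file reads and hashlib updates on larger buffers release the interpreter lock, and it uses fcntl.flock, which is only available on Unix-like systems. The names find_duplicates and single_instance are hypothetical.

```python
import fcntl
import hashlib
import os
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor
from contextlib import contextmanager


def sha3_of(path, chunk_size=1 << 20):
    """Hash a file with SHA3-256 in chunks."""
    digest = hashlib.sha3_256()
    with open(path, "rb") as f:
        while chunk := f.read(chunk_size):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(paths, workers=4):
    """Group paths by content, hashing only files that share a size."""
    by_size = defaultdict(list)
    for p in paths:
        by_size[os.path.getsize(p)].append(p)
    # A file with a unique size cannot have a duplicate, so its hash is skipped.
    candidates = [p for group in by_size.values() if len(group) > 1 for p in group]
    # Hash the remaining candidates concurrently; map() preserves input order.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        digests = list(pool.map(sha3_of, candidates))
    by_hash = defaultdict(list)
    for p, d in zip(candidates, digests):
        by_hash[d].append(p)
    return [sorted(group) for group in by_hash.values() if len(group) > 1]


@contextmanager
def single_instance(lock_path):
    """Hold an exclusive, non-blocking lock for the duration of a backup run."""
    with open(lock_path, "w") as lock_file:
        try:
            fcntl.flock(lock_file, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except BlockingIOError:
            raise RuntimeError("another backup appears to be running") from None
        yield
        # Leaving the with block closes the file, which releases the lock.
```

Wrapping a run in "with single_instance(...)" makes a second process fail immediately instead of waiting, and the kernel drops the lock if the first process crashes, so no stale lock file needs cleaning up. Note that the size-first shortcut only helps when looking for duplicates; if every stored file needs a hash in the metadata database anyway, all files must still be hashed once.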

Security:

  • Assumption is that you trust the backup target
  • If you want encryption at rest, use LUKS etc. to encrypt the underlying device
  • If you want encryption in transit, use SSH or TLS

GitHub