
I'm working on a utility to archive and organize old data that I want to keep forever, but I don't want cluttering up my local hard drives and Dropbox account. The initial goals of the project are:

* Design for cold data only. Stuff that is done changing and won't be accessed regularly, if at all: completed projects, annual financial records, RAW image files organized by month, etc.

* Store items in flat collections, not in folder hierarchies. Store directories as compressed archives. I'm not a librarian and I find folder hierarchies difficult to maintain.

* Store everything such that the data will still be easily accessible even if the index is lost or the software stops working. Use well-known formats and human-readable file names.

* Automatically store data in multiple locations, including S3-compatible services, Amazon Glacier, file servers, or local disks. Allow each collection of items to have its own mix of storage locations.

* Organize within collections using tags and metadata.

* Provide a simple checkout system to download items when needed.

I have the core features working and I am now building the desktop application, which I intend to be cross-platform. However, I've never written a desktop application before, let alone a cross-platform one, so development has slowed while I learn and experiment.
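The multi-location goal above could be sketched as a small backend abstraction, where each collection carries its own list of storage targets and adding an item fans out to all of them. This is a minimal illustrative sketch, not the actual project's code: the names StorageBackend, LocalBackend, and Collection are hypothetical, and a real S3 or Glacier backend would implement the same interface with API calls instead of file copies.

```python
# Hypothetical sketch: each collection has its own mix of storage
# backends, and storing an item writes it to every backend.
from abc import ABC, abstractmethod
from pathlib import Path
import shutil


class StorageBackend(ABC):
    @abstractmethod
    def store(self, item: Path) -> None: ...


class LocalBackend(StorageBackend):
    """Copies items into a local directory. An S3 or Glacier
    backend would implement store() with upload calls instead."""

    def __init__(self, root: Path):
        self.root = root
        root.mkdir(parents=True, exist_ok=True)

    def store(self, item: Path) -> None:
        shutil.copy2(item, self.root / item.name)


class Collection:
    """A flat collection: a name plus its own list of backends."""

    def __init__(self, name: str, backends: list[StorageBackend]):
        self.name = name
        self.backends = backends

    def add(self, item: Path) -> None:
        # Fan out to every configured storage location.
        for backend in self.backends:
            backend.store(item)
```

Because every item lands in each location under its human-readable file name, the data stays accessible even if the index or the software is lost, per the goals above.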



I wrote a similar app for archivists to push materials into preservation repositories. I used Electron, since it has good cross-platform support. The source is at https://github.com/aptrust/dart with documentation at https://aptrust.github.io/dart-docs/users/getting_started/

The underlying JavaScript code in that app started getting messy because I was working on several other projects simultaneously, but you might find it useful to play with as you consider your desktop app.

The archival community uses a simple text-based packaging format from the Library of Congress called BagIt, which allows you to include metadata and checksums with your archived materials so you can ensure their integrity and make sense of them when you get them back.
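To make the BagIt idea concrete, here is a minimal stdlib-only sketch of creating a bag: a data/ payload directory, a bagit.txt declaration, and a SHA-256 payload manifest, following the structure from RFC 8493. The make_bag function name is illustrative; in practice you'd likely use the Library of Congress's bagit Python package, which provides a function of the same name.

```python
# Minimal BagIt bag sketch (structure per RFC 8493): copy the payload
# into data/, write the bag declaration, and record SHA-256 checksums.
import hashlib
from pathlib import Path


def make_bag(source: Path, bag_dir: Path) -> None:
    data = bag_dir / "data"
    data.mkdir(parents=True)
    manifest_lines = []
    for f in sorted(source.rglob("*")):
        if f.is_file():
            rel = f.relative_to(source)
            dest = data / rel
            dest.parent.mkdir(parents=True, exist_ok=True)
            dest.write_bytes(f.read_bytes())
            digest = hashlib.sha256(dest.read_bytes()).hexdigest()
            manifest_lines.append(f"{digest}  data/{rel.as_posix()}")
    # Bag declaration: version and tag-file encoding.
    (bag_dir / "bagit.txt").write_text(
        "BagIt-Version: 1.0\nTag-File-Character-Encoding: UTF-8\n")
    # Payload manifest: one "checksum  path" line per payload file.
    (bag_dir / "manifest-sha256.txt").write_text(
        "\n".join(manifest_lines) + "\n")
```

Verifying integrity later is just recomputing each file's checksum and comparing it against the manifest, which works even if the original software is long gone.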

Anyway, you're working on an interesting problem. I'd be interested to see how it goes.


Thanks! Your project looks like great inspiration and BagIt might be very useful too.


For anyone interested in these features today, check out Git Annex. It fits all these requirements except tagging. You add files just as with git, then run git annex copy file --to some-remote. It's intended for large files, but you can zip directories too if you like. I personally prefer directory organization, but that's optional.


FWIW, I've often thought of building a cold storage cloud for this type of stuff: basically the same functionality everyone has (API, web GUI, etc.), except files need to be requested and may take some time to become hot/available to the user. It's really just because I think it's silly that the only reason I pay $100+/year to my provider is that I have some archived videos/photos that put me over their free limit. I never touch those files but don't want to get rid of them either. (I realize I could store them myself, but then I'm the one responsible if they get lost.)


Have you got a site/email list/github/twitter I can follow for a release announcement?


Not yet, but you can email me at the address in my profile and I'll let you know when something is available.


What's a flat collection hierarchy?


I just mean that a collection has no subfolders or other structure. It is simply a list of items like an S3 bucket.


How do you find an item then? I've read numerous research studies showing that people still prefer navigation over search. Ofer Bergman has done a lot of work in this area.


The thought is that collections should be homogeneous so that for most use cases,

* The number of items would be so small that search would not be necessary, e.g. a collection of personal projects

* The items would fall naturally into a timeline, so you can find things trivially by scrolling, e.g. RAW photos grouped together by month

* The items would be easily identified by name, e.g. MP3 files grouped by album (why am I still holding onto these?)

The intention is not to upload thousands of individual files in a jumble, but a much smaller number of archives. For example, if you are archiving the previous semester's homework assignments, instead of uploading a bunch of random documents, each item would be an archive of the assignments from a particular class. You could tag each item with 'Fall 2020' if you want to improve the organization. I'm intending to make that an easy process, where you point the program at a directory and it packages, tags and uploads each subfolder.
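The point-at-a-directory workflow could be sketched roughly like this: each subfolder becomes one zip archive named after the folder, with a tag recorded alongside it. Everything here is illustrative, not the actual tool: the function name package_subfolders is hypothetical, the sidecar .tags.txt file stands in for whatever index the real program keeps, and the upload step is omitted.

```python
# Hypothetical sketch: point at a parent directory and package each
# subfolder into its own archive, tagging each resulting item.
import shutil
from pathlib import Path


def package_subfolders(parent: Path, out_dir: Path, tag: str) -> list[Path]:
    out_dir.mkdir(parents=True, exist_ok=True)
    archives = []
    for sub in sorted(p for p in parent.iterdir() if p.is_dir()):
        # e.g. subfolder "CS101" -> out_dir/CS101.zip
        archive = shutil.make_archive(str(out_dir / sub.name), "zip", sub)
        # Record the tag next to the archive (stand-in for a real index).
        (out_dir / f"{sub.name}.tags.txt").write_text(tag + "\n")
        archives.append(Path(archive))
    return archives
```

Running it over a semester's directory would yield one human-readably named zip per class, each tagged 'Fall 2020', ready to hand off to the storage backends.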



