Posts

Dvcs-autosync: An open source dropbox clone... well.. almost

I found the Dvcs-autosync project on the vcs-home mailing list, which btw is a great list for folks who are doing stuff like maintaining their home directory in a vcs.
In short, the use cases:
  • you have one or more trees you want to maintain under a VCS (because you want vcs advantages like history and whatnot) but it's not worth spending time committing manually
  • you want to backup some files transparently
  • you want to share / work with others easily
  • you like the idea of dropbox, but not the closed-source-ness or the vendor dependence
  • you work mainly with relatively small files (or.. read on)
Thoughts:
  • very simple to get started
  • simpler (and imho saner) code and implementation in comparison to sparkleshare
  • bound to the limitations of the dvcs. In case of git: no support for ownership, xattrs, and less suited for bigger files (although git-annex might help?)
  • You cannot store a git repo inside a git repo, I think (i.e. I don't think you can keep a -potentially dirty- git clone in dvcs-autosync)
  • For a real, filesystem-level dropbox-alike, Coda might be a better option, though I'm not sure how useful that project is right now
If it sounds like something you need, try it out. And contributions welcome.
Next up on the todo list: a more clever heuristic for event coalescing, and setting up a bugtracker.
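
To give an idea of what event coalescing means here: many file events in quick succession should end up as one commit, not dozens. Below is a minimal sketch of one possible heuristic (wait for a quiet period, then flush). It is not the actual dvcs-autosync code; the `Coalescer` class, its method names and the 2 second default are all made up for illustration.

```python
import time


class Coalescer:
    """Collect file events and release them as one batch after a quiet period.

    Hypothetical helper for illustration, not part of dvcs-autosync.
    """

    def __init__(self, quiet_seconds=2.0, clock=time.monotonic):
        self.quiet_seconds = quiet_seconds
        self.clock = clock          # injectable so the logic can be tested
        self.pending = set()        # paths changed since the last flush
        self.last_event = None      # timestamp of the most recent event

    def add(self, path):
        """Record a change; every new event restarts the quiet period."""
        self.pending.add(path)
        self.last_event = self.clock()

    def flush_if_quiet(self):
        """Return the sorted batch once no event arrived for quiet_seconds.

        Returns an empty list while events are still coming in.
        """
        if not self.pending:
            return []
        if self.clock() - self.last_event < self.quiet_seconds:
            return []
        batch = sorted(self.pending)
        self.pending.clear()
        return batch
```

A watcher loop would call `add()` for every filesystem event and poll `flush_if_quiet()` now and then, committing whatever batch comes back.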

Comments

Nice project! I am currently writing a FUSE driver based on git-annex. It is not working perfectly yet, but I think with a little effort it should become usable very soon.

https://github.com/chmduquesne/sharebox
Cristophe-Marie: that looks like an interesting project as well, but it's hard for an outsider to understand the ideas behind it and how it's implemented. You should document that in the README. Clear information is key to getting more users/contributors ;)
Sounds great, thanks! I'll try it out as soon as I have time.
Absolutely, sharebox is not finished.

What is implemented right now:
- The default state for files is to be committed to git-annex: their content is not versioned, only the links to their content are.
- They are presented by the file system as regular files, resolved by reading the links. If they are not present on the system, they are seen as empty files.
- When a file not present on the system is accessed (with the open() system call) 'git annex get' is used to get the content of the file (If it is not possible to get the file, there is an access error).
- Copy on write is used: When you open a file, you open the read-only file linked by git-annex (even if you tried to open it with write access). If you do not modify the file, nothing happens, but if you do, the file is 'git annex unlock'-ed (it then occupies twice the necessary space) and the result is committed to git when the file is closed.
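
The behaviour described in the list above can be summarised as a small sketch of which commands get run at which point. This is not sharebox's real code, and the function names are hypothetical. 'git annex get' and 'git annex unlock' come from the description; the 'git annex add' plus 'git commit' step on close is an assumption about how the result could end up committed.

```python
def commands_on_open(path, content_present):
    """Commands to run when a file is opened with open()."""
    if content_present:
        return []  # the link already resolves to local content
    # Content lives elsewhere: fetch it first (an access error if this fails).
    return [["git", "annex", "get", path]]


def commands_on_first_write(path):
    """Copy on write: only unlock once the file is actually modified."""
    return [["git", "annex", "unlock", path]]


def commands_on_close(path, modified):
    """Commands to run when the file is closed."""
    if not modified:
        return []  # read-only access leaves the repository untouched
    # Assumption: the modified file is re-added to the annex and committed.
    return [
        ["git", "annex", "add", path],
        ["git", "commit", "-m", "sharebox: update " + path],
    ]
```

Each returned list could be handed to something like `subprocess.run()` inside the repository.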

What is implemented, but does not work very well:
- There is a mount option for specifying a synchronisation interval. The idea of using xmpp seems rather good, maybe I should dig into that.

What is not implemented:
- Documentation: as you said, I lack a good one for getting contributors.
- Conflicts are not handled yet. What I plan is to pop up a merge tool to choose between the remote and the local version.
- The number of copies kept for the same file is not settable yet, but it should become a mount option.
- A good test suite: I still experiment by hand, but this approach is beginning to show its limits. I should begin to write a test suite to automatically detect what is wrong...
Cristophe-Marie: so you've basically started working in a direction I'm also interested in: leveraging git-annex to efficiently work with bigger files in git, and since git-annex is not completely transparent, using fuse for it.

Another interesting point is what happens when all clients/clones/whatever remove their copy of a file: is there some way to make sure there is always a copy (for archival/backup/history purposes)? Maybe by having an extra remote which always keeps the file?
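
To make the "extra remote which always keeps the file" idea a bit more concrete, here is a minimal sketch of the check a client could do before removing its copy. It is purely illustrative: the `can_drop` function, the remote name "archive" and the `min_copies` parameter are made up, and this is not how git-annex or sharebox implement it.

```python
def can_drop(holders, dropping, archive="archive", min_copies=1):
    """Decide whether `dropping` may remove its copy of a file.

    holders:  names of the clones/remotes that currently have the content
    dropping: the clone that wants to remove its copy
    archive:  hypothetical remote that must always keep a copy
    """
    if dropping == archive:
        return False  # the archive remote never gives up its copy
    remaining = set(holders) - {dropping}
    # Only allow the drop if the archive still has the file afterwards
    # and enough copies remain overall.
    return archive in remaining and len(remaining) >= min_copies
```

With this rule a laptop can free up space as long as the archive remote has the file, but the last copy can never disappear.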

Anyway this is not the right place to discuss your software, I suggest you also announce it on vcs-home or something
I did not know about vcs-home. I'll subscribe and submit the project next week, so that ideas like yours can be discussed.
The arch linux "community contributions" subforum is also a great place; many fine hackers over there.
I found this as I was looking for a way to share a (versionable) private filetree with collaborators, without wanting to upload to "a server" that could get hacked, i.e. have everything on our laptops.

So being an XMPP fan I searched on Git + XMPP, thinking maybe XMPP gives a way of naming/addressing repos in a logical fashion, without needing a WWW server.

However, looking at the docs, I don't get the impression you're using XMPP in quite that way: "$ cd ~ && git clone <server>:autosync.git autosync && cd autosync"

... is <server> there an XMPP address?
Hi, no, <server> is an ssh-style hostname there, anything git can clone from, I think. Sorry for replying 2 years late. :)
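
For what it's worth, a rough sketch of how such an address can be recognised: git treats `host:path` as the scp-like ssh syntax when there is no slash before the first colon. The function below is a simplified illustration with a made-up name; it ignores corner cases such as local paths that happen to contain a colon, and the example hostname is hypothetical.

```python
def is_scp_style(address):
    """True if `address` looks like git's scp-like `[user@]host:path` syntax."""
    if "://" in address:
        return False  # a real URL such as ssh://host/repo.git
    head, colon, _path = address.partition(":")
    # No colon at all means a plain local path; a slash before the first
    # colon means git reads the whole thing as a local path as well.
    return bool(colon) and bool(head) and "/" not in head
```

So something like `myserver:autosync.git` is read as host `myserver` reached over ssh, with `autosync.git` as the path on that host.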

