alternatively: branches and commits are cheap… make a new branch, and commit often. then go back and clean things up before submitting a patch or merging to some ‘main’ branch…
Branching and committing often is tedious, which this tool automates.
I kinda disagree. It’s tedious in the same way that organizing your life in general is tedious… you’re moving the cost up front instead of having to deal with the cost of disorganization later
Any advice that reduces to “just use willpower/discipline” is essentially non-advice for many people.
The tool in the article also needs some commands issued beforehand for it to work, is that willpower/discipline too? I guess it’s useless for most people then.
Once it’s set up, it’s set it and forget it. Very different.
Okay, but I’m talking about the GP’s concrete advice. There are a lot of small things that you can do that aren’t “willpower” related to improve your organization, including getting and using a planner, putting things in your calendar. If you have a lot of paper, getting a file drawer or switching to a Remarkable and being sure to put documents in the right place. Don’t put all your files on your desktop, use an organizational structure. Close a tab when you have over 5 open or use tab groups. All of these things have very much improved my organization and improved my rather severe ADHD. It’s not about willpower, it’s finding concrete little things that are easy to do and improve the structure of your life.
How is using a calendar or a planner not something that requires will power? Do you have any idea how many times and how many different strategies for managing notes/appointments/tasks I tried?
Should have finished reading before commenting. Literally everything you mention requires will power to work. Yes, once habits are formed, it gets easy, but it’s never cost free.
This is the definition of something that requires willpower.
I definitely agree that small steps make things that require discipline manageable! I have ADHD too, and my planner, calendar, and todo tracker are invaluable.
But the practices of GTD and Building A Second Brain both support the idea of one big inbox that you sort through later. In this case, dura is the inbox.
Another tool of ADHD management is a notebook that serves as your short-term memory. Dura is the notebook.
yes! i have ADHD too, which is a big reason why dura seemed so reasonable to sink a few days into building
I’m not sure the comparison is apt. Organizing one’s life requires a completely different set of skills than switching to a terminal and typing some commands to select all updated files and commit them with(out) a meaningful message. Also, I’m failing to see how having a tool commit for me, versus manually committing changes often, moves any cost. In the end, I need to clean up a sequence of intermediate commits into a good, submittable change request (here, I’m referring to the original poster’s suggestion to commit often).
Here is how I do it:
Commit everything on a branch dev/andy. Whenever I get something working, i.e. make a test pass, I commit. Then git diff becomes a useful aid – what’s changed since something last worked? – and I can experiment without worrying.
Then git rebase -i master when I have a logical piece of work that should land on the main branch.
Then ./local.sh git-merge-to-master, which does this:
git-merge-to-master() {
  local branch=$(git rev-parse --abbrev-ref HEAD)  # find current branch
  git checkout master
  git merge $branch     # merge my work into master
  git push
  git checkout $branch  # so I continue working on the branch
}
Is there an advantage to using this dura tool? git can be tedious but it’s also very easily automated with shell.
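For instance, even the dura-style auto-snapshotting is only a few lines of shell. An untested sketch (the wip-autosave branch name and the 5-second interval are arbitrary choices of mine, not anything dura actually does):
autocommit() {
  while true; do
    # git stash create builds a commit of the dirty tracked state without
    # touching HEAD, the index, or the worktree; it prints nothing when clean
    snap=$(git stash create)
    [ -n "$snap" ] && git update-ref refs/heads/wip-autosave "$snap"
    sleep 5
  done
}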
dura is invisible (as long as you don’t look at the massive number of branches it creates). It makes the commits without touching any existing files (the only exception is that it updates the dura-* reference). I’m unsure if your solution would cause friction with existing tools, but I like my backup tools to be reliable. Hard to say if dura is better than your solution, but it seems to me like it is.
The reason I made it is because I forget to commit. Or rather, I don’t want to mess up my pretty lineage of commits, so I avoid committing (yes, I know it’s irrational, but it’s what I do). I wanted something 100% automated so that I don’t have to contend with my own psychology.
I see this potentially being installed by IT departments on fresh laptop images. It needs some work to get it to that point, but from a company’s perspective it’s a no-brainer. Save lost time, save money.
git branch foo-changes
… then just:
git add . && git commit -m "boop $(date +'%Y%m%d-%H%M')"
https://xkcd.com/1205/
I definitely understand the use case for this. I’ve certainly had moments when I wanted to go back to the state code was in a couple of hours ago. I achieve this across my whole home directory with zfs snapshots.
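Roughly, that looks like this (a sketch only; rpool/home is a made-up dataset name, and the snapshot command is usually driven by cron or a tool like zfs-auto-snapshot rather than by hand):
zfs snapshot rpool/home@$(date +%Y-%m-%d_%H%M)   # take a point-in-time snapshot
zfs list -t snapshot rpool/home                  # see which snapshots exist
ls /home/.zfs/snapshot/                          # browse old file versions read-only
zfs rollback rpool/home@2024-01-01_1200          # or roll the whole dataset back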
Which is probably the perfect solution but also very intrusive towards your setup.
Please tell me where I can find this ZFS on windows ;)
Windows now has an NFS client out of the box, so it’s actually pretty easy to run a FreeBSD VM under Hyper-V and mount folders via NFSv4 from Windows. Performance isn’t great though (at least for me, the Windows NFS client seems to have very high latency the first time you access a file).
There was a port of OpenZFS as a Windows installable filesystem (IFS). It doesn’t seem to have had much development for a few months though. Note that, unlike Linux, there is no legal impediment to using a CDDL filesystem on Windows.
Oh that’s nice. Maybe we can FUSE-mount SFTP one day in Windows.
As if that stopped anyone ;)
You don’t.
Without a proper clean up mechanism built-in to the daemon, this can get out of hand really quickly.
Agreed. What heuristic would you use to identify dura branches that can be cleaned up?
I’d be inclined to do the same sort of thing that you do with filesystem snapshots: decrease granularity as they age. I might want to keep changes for every 5 seconds for a few hours, then every minute for a few days, then every hour for a few weeks, and so on. Git lets you splice parents, so you can delete intermediate git commits and fix up the tree so that the surviving commits still work.
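One way to do the parent splicing with plain git plumbing, as a hedged sketch (A and D are placeholder commit ids, dura-xxxx a placeholder branch name; the idea is to keep A and D and let the snapshots in between become unreachable):
new=$(git commit-tree "D^{tree}" -p A -m "snapshot (thinned)")   # reuse D’s tree, but with A as parent
git update-ref refs/heads/dura-xxxx "$new" D                     # repoint the branch (old value D as a safety check)
# once nothing else references the dropped commits, git gc can reclaim them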
Last time I used git-imerge on a very large tree, I discovered that git really doesn’t scale well to large numbers of commits without regular git gc invocations (imerge creates NxM commits, where N and M are the number of commits in each of the two branches that you’re merging, and operations ended up taking 3-4 seconds on my laptop, then closer to 200ms after a git gc), so it’s probably important to do something like this.
Personally, I’d love to have my editor use git for undo history. I love the fact that vim has persistent undo (I set insane limits, so I can use undo to go back to a version many hours of editing ago if I realise I’d deleted something ages ago by mistake), but it grates slightly that I have two independent mechanisms for doing this.
Does each dura branch record which branch you were working off of at that time?
If so, you could give a list of them next to what HEAD was at the time, with ones that have been merged into the current branch highlighted (a la git branch -d).
At the very least, I assume it has to record what HEAD was, because that’d be the parent of the commit on the dura branch.
There might not even be a branch if you did a bisect or had an orphaned checkout for some other reason.
I think I can convince libgit2 to tell me all branches on a certain commit. Last time I checked it wasn’t entirely obvious though.
I definitely prefer looking at branches instead of objects. The branches are the main pain point, and git gc will clean up orphaned objects anyway.
You don’t need a full object prune per se; just packing all the objects into a packfile and having them delta-compressed already goes a long way. Even better if you give them a proper commit-graph with Bloom filter backing.
See MSFT’s Scalar or my poor man’s rewrite of it here: https://github.com/sluongng/git-care/blob/master/git-care.sh . (Most of these have also been ported to git-maintenance).
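If it helps, these are roughly the stock git commands behind those ideas (a sketch; exact flags are worth double-checking against your git version):
git repack -a -d                                    # pack loose objects and delta-compress them
git commit-graph write --reachable --changed-paths  # commit-graph with changed-path Bloom filters
git maintenance start                               # or let git schedule these tasks itself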
Another technique: while working on branch feature-A, you can store all the snapshot commits in a special ref at refs/snapshot/feature-A (as opposed to refs/heads/feature-A). After that, you can implement logic that checks the ages of refs in refs/snapshot/* and cleans them up accordingly.
No clue why my comment got triple posted. 🤔
I’d try something like adding a cleanup of “older than X days”; if you want to be defensive, set that to 90 by default. Other people might want to reduce that, depending on how much binary stuff they’re abusing git for.
Then, if people are still crying, you could back up the whole git folder to a different place beforehand, which is also only removed every X days, so if the cleanup goes wrong they can still revert. (Credits to bfg repo cleaner.)
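An untested sketch of that cutoff, assuming the snapshots live on dura-* branches:
cutoff=$(date -d '90 days ago' +%s)   # GNU date; use `date -v-90d +%s` on BSD/macOS
git for-each-ref --format='%(refname:short) %(committerdate:unix)' 'refs/heads/dura-*' |
while read -r branch ts; do
  [ "$ts" -lt "$cutoff" ] && git branch -D "$branch"
done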
This is pretty cool. I wonder how much bigger it makes your .git folder. I didn’t see in the readme a way to clean up old branches; hopefully the solution isn’t just “git branch -D dura-X dura-Y” x 100 lol.
I created issue #9 to discuss this. It’s definitely a problem, but I’m not sure how big. I’m not even sure how to reason about how big of a problem it is.
The README suggests this just polls the filesystem for changes. I’m sure that this isn’t too big a deal, but I’d imagine inotify/fseventsd/whatever Windows has would be much more efficient.
Then again, it sounds like this should compile and run on nearly any platform. So far as I’m aware, there’s no portable abstraction for inotify and fseventsd.
Yeah, this polls every 5 seconds. I’d love this to be more efficient, but I’d like to keep it as portable as possible. I want to add some metrics so that I can get a feel for performance degradation as I watch more and more repositories. FYI there’s some good discussion happening on this GitHub issue.
notify supports inotify on Linux, FSEvents on macOS, ReadDirectoryChangesW on Windows, kqueue on the BSDs, and polling on everything else.
I’d be happy to PR a change implementing this, if you’d like.
A PR for this would be VERY welcome. I haven’t had the time to familiarize myself with notify.
v4 is pretty stable, v5-preview doesn’t have the debouncer back (until I find the time to do that properly), though there are a lot of people using it and it has nearly complete BSD support.
(I am the main? maintainer of notify)
Nice project!
On cursory glance, I think for the Emacs folks, magit-wip-mode can do a lot of this for you.
Okay, I like the sound of the rough idea, but I’m skeptical about how clean and problem-free this will be in practice if it is using the very same git files per repo (./.git/**/*, etc.). It claims to do its work in a separate dura branch, while supposedly keeping all your git “state” the same, and untouched (working dir, dirty files, index, staged, etc.). But can it really safely do that, in every possible dirty state of every possible dura user, with every possible repo of those users? If you use git long enough, and you’re an experienced enough dev, you can get your local repo state into some pretty hairy conditions, thanks to things like git conflicts, in-progress rebasing, git bisect, failed git stash applys, and so on. Does dura really “know” how to deal with all the myriad ways a local checkout can get into a bad state, and can it do all its work without damaging the repo state in some way, or shifting the rug under the user (in a way the user can’t detect)?
This is a really cool idea. Some thoughts for you to ignore or otherwise as you see fit :-)
You can avoid having to ‘cd’ into the dir by just taking the dirname as an optional argument to dura watch (default to current dir).
It might be useful (to avoid the need to copy the hash) to embed a timestamp in the dura branch name, rather than a hash (there is some complexity here, since if the timezone you use has DST, timestamps aren’t unique. You might also want to worry about clocks not being monotonic. That said, I think both problems are solvable with heuristics).
Regarding “What heuristic would you use to identify dura branches that can be cleaned up?”, it seems to me that this is a “backup strategy” problem. I’d suggest something like “all changes for last N days (N=7?)”, “at most one commit per hour for last M days (M=30?)”, “at most one commit per day for older” (you can go further than this, e.g. you might only want one commit per month for stuff older than 3 years….). You could even make these tunable in your config file.
Having some standardish ‘-h’, ‘--help’, or ‘help’ subcommands would help usability.
I had thought about this, but using the base commit hash takes care of some edge cases like switching branches. I want to create a sub-command that views changes in chronological order, but it seems a bit gnarly to get right (I like tig style, but most people use either git log or a GUI).
For me personally this is not needed, but from the other comments it seems there is demand. I often work on multiple features until they are ready to be merged. For this reason I naturally create branches and commit often anyway. I commit into my branch whenever I finish one unit of work - sometimes even when the code does not compile. Before the merge I rebase it all.
I like my approach much more, because it is very clear to which logical change a commit belongs. Is it to speed up X? Then the branch is named speedup_x. Is it to add customer feature Y? Then the branch might be called customer_feature_y.
I would also worry about whether this daemon can indeed run fully in the background without ever breaking anything. I have seen too many unexpected edge cases cause crashes in my (admittedly short) developer career.
Does it have any git hooks in place to prevent someone accidentally pushing dura branches to the remote? There are many, MANY git workflows out there, and it seems likely that someone has a “git push --all” that’s going to lead to an absolute mess without a little safety check.
No. You should create an issue for it so we can generate ideas on how to do this!
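In the meantime, a hedged sketch of a local safety check for this (the dura-* naming is assumed from the thread above; this is just a per-repo pre-push hook, not anything dura ships):
#!/bin/sh
# .git/hooks/pre-push — git feeds "<local ref> <local sha> <remote ref> <remote sha>" lines on stdin
while read -r local_ref local_sha remote_ref remote_sha; do
  case "$local_ref" in
    refs/heads/dura-*)
      echo "refusing to push dura backup branch: $local_ref" >&2
      exit 1
      ;;
  esac
done
exit 0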