Git / DVCS workflows

This forums is for OS project announcements including project openings, new releases, update notices, test requests, and job openings (both paying and volunteer).
Korona
Member
Member
Posts: 1000
Joined: Thu May 17, 2007 1:27 pm
Contact:

Re: Git / DVCS workflows

Post by Korona »

Solar wrote:I am targeting the 90% of cases which, to my experience and that of a sizeable number of colleagues I have spoken with so far, are not considering the kind of detailed history grooming you are describing, open source or not.

To me (and many others), keeping commits small and singular in purpose is something done before you commit, because that is the way to work effectively. For me (and many others), the ability to rewrite published history (by purpose, or worse, by accident) is something that would be a show-stopper if there weren't outside forces at work that make using a different VCS a non-option.
I am a bit confused about this statement. My experience is quite different, both at work and in open source projects. It is not about "history grooming" but rather about making changes after code review. The usual workflow is:
1) prepare a series of commits
2) push them to a private feature branch
3) post a merge request and get a code review (by a coworker or open source maintainer)
4) fix issues raised in the review, proceed to step 2)
5) merge the private feature branch into trunk

I've used this practice both with Git and with SVN (and CVS :D), so I am quite confused about your perceived difference between git and SVN here (well, git makes it arguably more convenient to use such a patch-based workflow but the workflow certainly predates Git).

If we are talking about rewriting the history of trunk (or public feature branches), we can agree that this is not a good idea. But I don't think that that's a controversial statement; people do not rewrite the Git history of public branches. If you want to make sure that no modifications are done to public branches at all, doesn't it make more sense to distinguish between public and private branches and prevent rewrites on private branches?
managarm: Microkernel-based OS capable of running a Wayland desktop (Discord: https://discord.gg/7WB6Ur3). My OS-dev projects: [mlibc: Portable C library for managarm, qword, Linux, Sigma, ...] [LAI: AML interpreter] [xbstrap: Build system for OS distributions].
User avatar
Solar
Member
Member
Posts: 7615
Joined: Thu Nov 16, 2006 12:01 pm
Location: Germany
Contact:

Re: Git / DVCS workflows

Post by Solar »

Maybe we have a disconnect in vocabulary here.

What, in your workflow, makes something -- like a branch -- "private"?

I'd like to draw the line at remote / local. If it's in the remote (main) repository, it's publically visible, and any modification there is "rewriting history".

Have I overlooked some kind of privacy feature?

I consider the ability to move back on the timeline, and being sure I see the repository exactly as it was at that time, to be an essential feature of any VCS.
Every good solution is obvious once you've found it.
vvaltchev
Member
Member
Posts: 274
Joined: Fri May 11, 2018 6:51 am

Re: Git / DVCS workflows

Post by vvaltchev »

Solar wrote:Maybe we have a disconnect in vocabulary here.

What, in your workflow, makes something -- like a branch -- "private"?

I'd like to draw the line at remote / local. If it's in the remote (main) repository, it's publically visible, and any modification there is "rewriting history".

Have I overlooked some kind of privacy feature?
Let me try to explain. Let's say you own a project called PROJ somewhere, using GIT. I cannot or I'm not supposed to directly push any commits there directly. So, I could just clone it on my machine and work there on a local branch. But, what if I my machine dies or what if I would like to access my work from multiple machines? Well, I'll make a fork of your whole PROJ in my (remote) account. Then, I'd push my work on the FEATURE_XYZ branch on the remote machine. My fork can be private or public, that doesn't matter: what matter is that nobody knows about my fork and nobody cares. Everybody continues to clone your PUBLIC/OFFICIAL repository when preparing for changes.

So, I continue to work on my series of commits, synching everything up with the remote machine until, after passing a review, my patch series is ready. At that point, I can:

- Create a pull request, asking you to pull the commit in my private branch into the official repo of PROJ. In case my changes don't merge directly, you might ask me to rebase my feature branch to the latest commit in master and then update the pull request.

OR

- Given the permission to push directly into your PUBLIC/OFFICIAL repo, I checkout the latest version locally and merge my feature branch on the master branch of my local copy of PROJ. During that, I might have to resolve conflicts. After testing that everything works, I push my new master into the official repo.

So, even if I had the permission to push commits to the PUBLIC/OFFICIAL repo, I don't do it until my code is ready and has been reviewed. I certainly won't create a branch on my own in the official repo. The whole point of GIT is that everybody works on his/her own fork of the official repo and then pushes (or makes a pull request) to the official repo once the series is ready.


In conclusion it's not considered "rewriting history" when it's about a private fork, because nobody cares about that history other than the developer himself. Typically people cannot even see that private/temporary repo, because it's in the home directory of the developer and he/she syncs with that via SCP (through git). Everybody cares about the history of the official repo and in *no case* the history there should be re-written.
Tilck, a Tiny Linux-Compatible Kernel: https://github.com/vvaltchev/tilck
Korona
Member
Member
Posts: 1000
Joined: Thu May 17, 2007 1:27 pm
Contact:

Re: Git / DVCS workflows

Post by Korona »

Solar wrote:Maybe we have a disconnect in vocabulary here.

What, in your workflow, makes something -- like a branch -- "private"?

I'd like to draw the line at remote / local. If it's in the remote (main) repository, it's publically visible, and any modification there is "rewriting history".
The fact that nobody else pulls from the branch and the branch is not deployed to production.

I agree about the fact that branches in the main repository should not be tampered with. The branches that I am talking about should not be part of the main repository. But they need to be public (at least to other developers) such that code review can be done. In an open source project, these branches are typically on a fork of the repository under your user account, while in a closed source project, they can be part of some "staging" repository.
managarm: Microkernel-based OS capable of running a Wayland desktop (Discord: https://discord.gg/7WB6Ur3). My OS-dev projects: [mlibc: Portable C library for managarm, qword, Linux, Sigma, ...] [LAI: AML interpreter] [xbstrap: Build system for OS distributions].
User avatar
Solar
Member
Member
Posts: 7615
Joined: Thu Nov 16, 2006 12:01 pm
Location: Germany
Contact:

Re: Git / DVCS workflows

Post by Solar »

I don't quite get the need for the extra branch for a code review in the first place (and have done plenty of code reviews without those, right there on the feature branch), but ok. Thanks for the explanation.
Every good solution is obvious once you've found it.
Korona
Member
Member
Posts: 1000
Joined: Thu May 17, 2007 1:27 pm
Contact:

Re: Git / DVCS workflows

Post by Korona »

The branch is not technically needed but it reduces the amount of churn on the feature branch (and later on trunk). That's useful when bisecting and/or logging at the history. It also ensures that no commits that do not build (or pass tests) get accidentally pushed to a feature branch.
managarm: Microkernel-based OS capable of running a Wayland desktop (Discord: https://discord.gg/7WB6Ur3). My OS-dev projects: [mlibc: Portable C library for managarm, qword, Linux, Sigma, ...] [LAI: AML interpreter] [xbstrap: Build system for OS distributions].
Ethin
Member
Member
Posts: 625
Joined: Sun Jun 23, 2019 5:36 pm
Location: North Dakota, United States

Re: Git / DVCS workflows

Post by Ethin »

I've been working on improving my open-source workflows and skillset. Currently my kernel repo has a pretty broken history (not broken in the sense of its all over the place, but broken in the sense of "I didn't use branches and pushed large code changes"). So I'm trying to improve that -- I'm just not used to the "branch when making changes, push small changes" model of doing things, unfortunately (most of my work was on private projects, so we didn't use branches because those who I worked with and myself were always in communication).
I am definitely trying to get better at it though, but I really don't know if I can actually fix my commit history now. Maybe I should just leave it as-is and say "well, that was when I wasn't very good with git". Lol
Korona
Member
Member
Posts: 1000
Joined: Thu May 17, 2007 1:27 pm
Contact:

Re: Git / DVCS workflows

Post by Korona »

There is nothing wrong with pushing large commit series without using a branch (as long as everything is still tested properly etc.). Or are you taking about large commits? These are indeed an anti-pattern.
managarm: Microkernel-based OS capable of running a Wayland desktop (Discord: https://discord.gg/7WB6Ur3). My OS-dev projects: [mlibc: Portable C library for managarm, qword, Linux, Sigma, ...] [LAI: AML interpreter] [xbstrap: Build system for OS distributions].
Ethin
Member
Member
Posts: 625
Joined: Sun Jun 23, 2019 5:36 pm
Location: North Dakota, United States

Re: Git / DVCS workflows

Post by Ethin »

Korona wrote:There is nothing wrong with pushing large commit series without using a branch (as long as everything is still tested properly etc.). Or are you taking about large commits? These are indeed an anti-pattern.
Talking about the latter. E.g.: commit e7994e6e4b877a4eecfbd2b7f220fb75d24c5fc3 contained 4 changes, and commit 0d9340a7feb63432fbf36be05653d89408d1cb1d contained at least 10 changes. Didn't help that I switched VCSs to give Fossil a go either, so commit 6caf7bea92471a4f147eb14107e2438bb05700dc was all about switching back because there wasn't anything like GitHub for Fossil, and though I liked the idea I just felt like the infrastructure wasn't there, and I don't have the money to pay for my own CI integrations or anything like that when I start using those.
User avatar
Solar
Member
Member
Posts: 7615
Joined: Thu Nov 16, 2006 12:01 pm
Location: Germany
Contact:

Re: Git / DVCS workflows

Post by Solar »

Korona wrote:The branch is not technically needed but it reduces the amount of churn on the feature branch (and later on trunk). That's useful when bisecting and/or logging at the history. It also ensures that no commits that do not build (or pass tests) get accidentally pushed to a feature branch.
This is the kind of "because we can" that always irritated me about git / DVCS, and the usage patterns that evolved around it. No version that is in the public repository should "not build" or "not pass tests". That's elementary checking that should be done by the developer locally before pushing, not delegated to some later VCS "magic". Would you really want to do a code review on a version that doesn't even build?

You are accepting here that either a) that there are commits in the repo that fail basic build / test, or b) that you have to rewrite public repo history. I consider either a "no go". Of course there are the accidents and exceptions, but for general development?
Every good solution is obvious once you've found it.
Korona
Member
Member
Posts: 1000
Joined: Thu May 17, 2007 1:27 pm
Contact:

Re: Git / DVCS workflows

Post by Korona »

I think here we have the same understanding as before: I am not talking about production repos but about staging repos. The rewriting is done on the staging repo, before stuff is pushed into the production/public repo.
managarm: Microkernel-based OS capable of running a Wayland desktop (Discord: https://discord.gg/7WB6Ur3). My OS-dev projects: [mlibc: Portable C library for managarm, qword, Linux, Sigma, ...] [LAI: AML interpreter] [xbstrap: Build system for OS distributions].
Ethin
Member
Member
Posts: 625
Joined: Sun Jun 23, 2019 5:36 pm
Location: North Dakota, United States

Re: Git / DVCS workflows

Post by Ethin »

I'm working on preparing my kernel for open-source contributions and I explicitly wrote in CONTRIBUTING.md that all code that's committed must pass syntactical and linter checks and it must build. It doesn't have to work how you wanted but it *must* build. I certainly am not going to make a habit of fixing other peoples code when they could've ran `cargo make -t build_all`, which runs `cargo check` and clippy for rust lints and then builds the kernel from start to finish. I don't have any testsuites... I need to work on those but I don't have a good testing architecture. Like every test would need to be its own kernel and I'd like something that autogenerates all that stuff for me. Maybe I might have to make a tool of my own to do that until I get kernel self-tests...
Post Reply