
"Ideal" design and workflow for VCS

Posted: Sun Apr 07, 2019 4:01 am
by glauxosdever
Hi,


Since git won't (probably) run on my OS and I don't want to be dependent on another OS for hosting and building my OS eventually, I decided to design and implement a Version Control System that will run on Linux and other UNIX-like OSes for now, but will be easy to reimplement for my own OS. I also know there is no "ideal" design or workflow for anything, but maybe it can be better in some respects than in existing VCSs. So I'd like to get some input from you on this matter.

The list below is not exhaustive; I haven't thought about everything yet and I may be missing some stuff.
  • Centralised vs Distributed: I think most of us agree a distributed VCS is better, because it allows us to commit locally and makes it easier to make forks. But I still included this question for the sake of completeness.
  • Branching: I'm thinking of having a separate source directory for each branch, whose contents don't get altered when switching to another branch. Pretty sure most of us have lost uncommitted changes this way at least once. Also, I don't think there is the need for the VCS to remember which branch you are working on, just specify the branch in each command.
  • Committing: For commits, I'm thinking of storing the file contents separately from the filename; in the case of renaming files, it would de-duplicate some data. The same goes for commit contents, i.e. they would be stored separately from the commit metadata (author name/email/time, committer name/email/time, etc). Workflow-wise, I don't think there is much to be done differently though.
  • Merging: I don't have any actual ideas here yet, but what are your opinions? Algorithms to use? Squashing or no squashing? Additional merge commit or just "literal" merging? Change committer name/email/time when merging? (Changing the commit time would imply that merged commits get appended after the existing commits on that branch instead of getting somewhere in between, depending on the original commit time.)
  • Additional features
    • Per branch integrated issue tracking: Having separate issues per branch is one idea I've been pondering around. The only VCS that I found that does this is fossil, and it takes it too far by also having wikis and full sites per branch.
    • Other ideas you would like in a VCS?
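The "Committing" idea above (contents stored apart from filenames, commit contents apart from commit metadata) can be sketched as a small content-addressed store. This is only a minimal Python sketch under assumptions: `ObjectStore`, `store_tree` and `store_commit` are hypothetical names, SHA-256 is an arbitrary choice of hash, and everything is kept in memory rather than on disk.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class ObjectStore:
    """Hypothetical in-memory store: every object is keyed by the hash of its bytes."""
    objects: dict = field(default_factory=dict)

    def put(self, data: bytes) -> str:
        key = hashlib.sha256(data).hexdigest()
        self.objects[key] = data  # identical content ends up under the same key
        return key

    def get(self, key: str) -> bytes:
        return self.objects[key]


def store_tree(store: ObjectStore, files: dict) -> str:
    """Store each file's contents as a blob, then a tree mapping names to blob keys."""
    entries = sorted((name, store.put(data)) for name, data in files.items())
    listing = "".join(f"{key} {name}\n" for name, key in entries)
    return store.put(listing.encode())


def store_commit(store: ObjectStore, tree_key: str, author: str, message: str) -> str:
    """Commit metadata is its own object and only points at the tree."""
    header = f"tree {tree_key}\nauthor {author}\n\n{message}\n"
    return store.put(header.encode())


# Usage: renaming a file adds a new tree and a new commit, but no new blob.
store = ObjectStore()
source = b"int main(void) { return 0; }\n"
tree1 = store_tree(store, {"kernel.c": source})
commit1 = store_commit(store, tree1, "someone <someone@example.org>", "Initial import")
tree2 = store_tree(store, {"main.c": source})
commit2 = store_commit(store, tree2, "someone <someone@example.org>", "Rename kernel.c")
```

After the rename the store holds five objects (one blob, two trees, two commits), so the file contents are not duplicated.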
Thank you in advance! :-)


Regards,
glauxosdever

Re: "Ideal" design and workflow for VCS

Posted: Sun Apr 07, 2019 5:22 am
by Solar
Nonono.... please no...

Why do you think that Git won't run on your OS, or Subversion? (*) If all else fails, scale back to CVS, which has very few dependencies. But even RCS would be more efficient than starting a VCS side-project. Definitely more efficient than starting one with its own workflow, interface, and set of quirks.

Focus on adding features to your OS so that the more powerful software can be easily ported. Don't fragment your efforts by trying to implement ANOTHER powerful (and complex) piece of software BEFORE your OS gets somewhere.

(*): I actually do not consider DVCS to be "better" unless you are working "on the road" on a regular basis or have a large team with frequently overlapping change sets. DVCS add lots of complexity, and Git in particular has several issues regarding its interface that make me prefer Subversion for small-to-mid sized projects.

Re: "Ideal" design and workflow for VCS

Posted: Sun Apr 07, 2019 11:09 am
by nullplan
I second everything Solar said, especially that first line. You are currently trying to develop an OS, which is a big task in itself; you probably don't want to be sidetracked by another really complicated task.

Though, reinventing the wheel is kind of what this hobby is all about. So don't let us two naysayers dissuade you should you really want to set out on this quest.

Re: "Ideal" design and workflow for VCS

Posted: Sun Apr 07, 2019 12:02 pm
by glauxosdever
Hi,

Solar wrote:Nonono.... please no...

Why do you think that Git won't run on your OS, or Subversion? (*) If all else fails, scale back to CVS, which has very few dependencies. But even RCS would be more efficient than starting a VCS side-project. Definitely more efficient than starting one with its own workflow, interface, and set of quirks.
My OS will not be UNIX-like (at least not with a compatibility layer, but let's not depend on it anyway). There are already a lot of UNIX-like systems, so I think different designs should be tried out. As for being efficient myself, honestly, I'm not and I've decided not to be.
Focus on adding features to your OS so that the more powerful software can be easily ported. Don't fragment your efforts by trying to implement ANOTHER powerful (and complex) piece of software BEFORE your OS gets somewhere.
I don't care about porting software, at least not until I've tried to write some on my own that fits the OS design.
nullplan wrote:I second everything Solar said, especially that first line. You are currently trying to develop an OS, which is a big task in itself; you probably don't want to be sidetracked by another really complicated task.
It's possible that I'm underestimating the complexity of a VCS. But I can initially allow it some 3 months, at about 5 to 10 hours per week.


Regards,
glauxosdever

Re: "Ideal" design and workflow for VCS

Posted: Sun Apr 07, 2019 3:12 pm
by Solar
Either port an existing system, or at least reimplement it. Don't reinvent one.

You see, there exists documentation and online help for Git and Subversion. There wouldn't be any for whatever you cook up. Even if it were good, it would still mean that anybody who wants to so much as check out your OS sources would have to go through your VCS... which, if your OS is going to be so radically different that porting isn't straightforward, means a chicken-and-egg problem, as YourVCS clients will probably not be available for operating systems other than YourOS.

Both Subversion and Git are available for e.g. Windows as well. Without having checked, I can say with some confidence that you will find some platform-dependent code in there that's #ifdef'ed. Add YourOS as another option, and do the necessary adjustments. (With a quick first glance, I'd say you're probably better off going with Subversion, as that's plain C, while git seems to be implemented in a hodgepodge of different languages, making a port more difficult.)

Or simply eschew self-hosting capabilities for now. That will allow you to do things very specific to YourOS, while still enjoying a fully grown toolchain on the host OS.

Re: "Ideal" design and workflow for VCS

Posted: Mon Apr 08, 2019 12:37 am
by Korona
I agree with Solar and nullplan on everything they said. Regarding your technical points, I would heavily object to your branches-are-subdirs approach. That just leads to heavy duplication of files without any real benefit. What do you mean by "everybody lost uncommitted changes"? If you really want this, you can already have such a layout with "git worktree".

Honestly, the only thing I would change about Git is the command naming and default parameters. git pull --rebase should be the default. merge should never do a fast-forward, there should be an explicit command to do that. I feel that git reset and git reset --hard could have better names (maybe point-at instead of reset and --checkout instead of --hard). Same for git revert. The interfaces to some commands such as git stash could be much better.
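To make the fast-forward point concrete, here is a minimal Python sketch of the decision a merge has to make. It is a toy model under assumptions, not Git's implementation: `Commit`, `is_ancestor` and `merge` are hypothetical names, and a history is just a dict from commit id to commit. With `allow_fast_forward=False` (the default here) the behaviour resembles `git merge --no-ff`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Commit:
    """Hypothetical commit node: an id plus the ids of its parents."""
    id: str
    parents: tuple = ()


def is_ancestor(graph: dict, ancestor: str, descendant: str) -> bool:
    """Return True if `ancestor` is reachable from `descendant` via parent links."""
    stack, seen = [descendant], set()
    while stack:
        current = stack.pop()
        if current == ancestor:
            return True
        if current in seen:
            continue
        seen.add(current)
        stack.extend(graph[current].parents)
    return False


def merge(graph: dict, ours: str, theirs: str, allow_fast_forward: bool = False) -> str:
    """Merge branch tip `theirs` into `ours` and return the new tip id."""
    if is_ancestor(graph, theirs, ours):
        return ours  # nothing to do: theirs is already contained in ours
    if allow_fast_forward and is_ancestor(graph, ours, theirs):
        return theirs  # fast-forward: only the branch pointer moves
    merged = Commit(id=f"merge({ours},{theirs})", parents=(ours, theirs))
    graph[merged.id] = merged  # explicit merge commit with two parents
    return merged.id


# Usage: main is at A, the feature branch is at C (A <- B <- C).
history = {
    "A": Commit("A"),
    "B": Commit("B", ("A",)),
    "C": Commit("C", ("B",)),
}
explicit_tip = merge(history, "A", "C")                       # new merge commit
forwarded_tip = merge(history, "A", "C", allow_fast_forward=True)  # just "C"
```

With the explicit merge commit, the history records that a branch existed and when it was integrated; with the fast-forward, that information is lost, which is presumably why an explicit opt-in is being asked for.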

Re: "Ideal" design and workflow for VCS

Posted: Mon Apr 08, 2019 1:40 am
by glauxosdever
Hi,

Solar wrote:You see, there exists documentation and online help for Git and Subversion. There wouldn't be any for whatever you cook up. Even if it were good, it would still mean that anybody who wants to so much as check out your OS sources would have to go through your VCS... which, if your OS is going to be so radically different that porting isn't straightforward, means a chicken-and-egg problem, as YourVCS clients will probably not be available for operating systems other than YourOS.
I can write the documentation, sure. Online help would be a problem though, and this is probably the first point in this thread that we agree on. As for people checking out my OS sources, I'll have a client and a server written in C targeting UNIX-like OSes; in fact the initial ones are going to be such. Besides, I'd need them myself while developing the initial custom programming language compiler (also written in C targeting UNIX-like OSes) and also while developing the OS, until I develop a "native" compiler and the OS actually becomes self-hosting.
Or simply eschew self-hosting capabilities for now. That will allow you to do things very specific to YourOS, while still enjoying a fully grown toolchain on the host OS.
But being self-hosting is part of the fun! Sure it's impractical because I'd need to develop a toolchain that will run on both Linux and my OS (two different versions), but it makes the OS more impressive when accomplished.
Korona wrote:Regarding your technical points, I would heavily object to your branches-are-subdirs approach. That just leads to heavy duplication of files without any real benefit. What do you mean by "everybody lost uncommitted changes"? If you really want this, you can already have such a layout with "git worktree".
I think this is a valid objection and, now that I think of it, I tend to agree. Indeed, when having 100 branches, it's too much. Concerning losing uncommitted changes, I think I had it happen to me in another case, but not when checking out another branch (sorry, I misremembered it #-o).
Honestly, the only thing I would change about Git is the command naming and default parameters. git pull --rebase should be the default. merge should never do a fast-forward, there should be an explicit command to do that. I feel that git reset and git reset --hard could have better names (maybe point-at instead of reset and --checkout instead of --hard). Same for git revert. The interfaces to some commands such as git stash could be much better.
I agree; some of git's defaults are bad. Command naming is a minor usability issue too. As for git stash, I don't remember ever using it (that's probably why I lost those changes); I think task switching in git should be easier anyway.


Regards,
glauxosdever

Re: "Ideal" design and workflow for VCS

Posted: Mon Apr 08, 2019 3:15 am
by iansjack
I think we can learn a lot from Babbage, who constantly thought of better ways to do things and more ambitious projects, with the result that none got completed.

Writing an OS, particularly one that doesn't follow the usual conventions, is a hard enough task. Also writing your own toolchain, and then your own custom language, and then your own version control system, and all the other user-space tools seems a little overambitious. I would tend to stick to one task at a time.

Re: "Ideal" design and workflow for VCS

Posted: Mon Apr 08, 2019 4:18 am
by glauxosdever
Hi,

iansjack wrote:I think we can learn a lot from Babbage, who constantly thought of better ways to do things and more ambitious projects, with the result that none got completed.

Writing an OS, particularly one that doesn't follow the usual conventions, is a hard enough task. Also writing your own toolchain, and then your own custom language, and then your own version control system, and all the other user-space tools seems a little overambitious. I would tend to stick to one task at a time.
At this point, I was thinking of sticking with the VCS for some months initially, without concerning myself with the other tools.

But, perhaps, should I limit the scope of my project? I realise it's getting quite similar in ambition to Brendan's and, from what I know, nothing by Brendan has been released during the last 10-15 years (and we all know how Brendan would go out of his way to justify his decisions). Furthermore, I don't have too much time either. Maybe I should do a U-turn after all and do everything I've argued against for the last 2.5 years? Maybe a UNIX-like OS, with some software written from scratch instead of being ported over? I don't know. Let's continue this discussion on another thread.


Regards,
glauxosdever

Re: "Ideal" design and workflow for VCS

Posted: Mon Apr 08, 2019 7:10 am
by bzt
I agree with the others too. There's no point in reinventing the wheel.
If I were you I would take Solar's advice and port an existing VCS. Most of them mostly use file-related functions, so it shouldn't be hard to replace POSIX file operations with your OS' syscalls (unless you have a very exotic system without a file abstraction).
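The porting argument rests on the VCS core touching the OS only through a handful of file operations. A rough Python sketch of that layering follows; it illustrates the idea only and is not how Git or Subversion are structured. `FileBackend`, `PosixBackend`, `MemoryBackend` and `save_object` are all hypothetical names, and `MemoryBackend` merely stands in for a backend that would wrap a non-POSIX OS's own syscalls.

```python
import os
from typing import Protocol


class FileBackend(Protocol):
    """Hypothetical minimal set of file operations the VCS core relies on."""

    def read(self, path: str) -> bytes: ...
    def write(self, path: str, data: bytes) -> None: ...
    def exists(self, path: str) -> bool: ...


class PosixBackend:
    """Backend for UNIX-like hosts, built on the Python standard library."""

    def __init__(self, root: str) -> None:
        self.root = root

    def _full(self, path: str) -> str:
        return os.path.join(self.root, path)

    def read(self, path: str) -> bytes:
        with open(self._full(path), "rb") as handle:
            return handle.read()

    def write(self, path: str, data: bytes) -> None:
        full = self._full(path)
        os.makedirs(os.path.dirname(full), exist_ok=True)
        with open(full, "wb") as handle:
            handle.write(data)

    def exists(self, path: str) -> bool:
        return os.path.exists(self._full(path))


class MemoryBackend:
    """Stand-in for another target; a real port would call that OS's syscalls."""

    def __init__(self) -> None:
        self.files = {}

    def read(self, path: str) -> bytes:
        return self.files[path]

    def write(self, path: str, data: bytes) -> None:
        self.files[path] = data

    def exists(self, path: str) -> bool:
        return path in self.files


def save_object(backend: FileBackend, key: str, data: bytes) -> str:
    """Core logic: written once, unaware of which backend it runs on."""
    path = f"objects/{key}"
    if not backend.exists(path):
        backend.write(path, data)
    return path
```

Only the backend class changes per platform; in a C code base the same split would typically be a small header of functions with one implementation file per OS, or the #ifdef blocks Solar mentioned.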
Also note that you don't have to port everything at once. For example you could port git's local repo first, and add remote repo support later.
A final thought, you can switch VCS any time in your project. So you can start with git on your host OS for example, and later migrate your codebase to another VCS when that's ready in your OS.

Cheers,
bzt

Re: "Ideal" design and workflow for VCS

Posted: Mon Apr 08, 2019 9:08 am
by glauxosdever
Hi,


Switching VCSs at some point in the future is something I had also considered some time ago. I decided against it back then because I wouldn't like to have the earlier history inaccessible. But I guess I could convert it from git to mine when the need arises (although issues and maybe other stuff would be missing, as git doesn't concern itself with them).

So, I'm skipping the VCS for now. I might even skip the toolchain too, if I decide to use POSIX after all.


Regards,
glauxosdever

Re: "Ideal" design and workflow for VCS

Posted: Tue Jun 11, 2019 12:33 pm
by eekee
This crossed my mind a while ago. I didn't think out the implications very much because I had (and have) a lot of other things to think about. I don't really want to put the effort into making it. In any case, I personally would be better off with a versioning file system, because I forget to commit at the most critical points. The interface can be far simpler, too. I know commit messages are a good thing, but I'm hoping the copious notes I make will make up for that. (This reminds me, I need to set up scheduled Git commits on my own files. It'll do until I have a versioning filesystem again.)
Solar wrote:git seems to be implemented in a hodgepodge of different languages, making a port more difficult.
Yep. Plan 9 users have found it easier to port Python (multiple times!) to support Mercurial than to port Git. I'm told there's work going on to make Git easier to port. I gather it's only the "porcelain" that's hard to port, but the core is supposedly insanely hard to use. Maybe you could port just the core and write a different interface for it, one which suits your OS? I have no idea how much work that would be. Plan 9 (as 9front) has hgfs which fits into Plan 9's way of working. It's an actual independent implementation of hg, but it's very incomplete and has been for a long time.

Mercurial had one tree per branch for a long time. Didn't use it like that myself because I find DVCS in general far too complicated and end up avoiding it altogether. I imagine one tree per branch makes it a bit simpler to think about, but my imagination is very good. ;) It would enable using common Unix tools rather than the ones built into the DVCS.
iansjack wrote:I think we can learn a lot from Babbage, who constantly thought of better ways to do things and more ambitious projects, with the result that none got completed.
Absolutely! I'm the same, it's a hard personality trait to control.

Re: "Ideal" design and workflow for VCS

Posted: Tue Jun 11, 2019 12:55 pm
by Korona
If you forget to commit things, the best thing to fix is your mindset about VCS and not your tooling (and maybe the process of your development -- for anything except for small personal projects this should either be caught by CI or by code review). After large feature additions, I tend to spend a non-negligible amount of time on breaking changes into proper patches so that it becomes possible for others to understand the change sets. Do not do `git add -u && git commit` after each feature. Instead, inspect what you did, aggregate hunks into commits and rebase/fixup them as necessary. Only do `git push` when you're satisfied with the history that you're generating.

Re: "Ideal" design and workflow for VCS

Posted: Tue Jun 11, 2019 12:58 pm
by eekee
Oh yes, that's ideal, but my ability to focus comes and goes. Sometimes I simply have to walk away before anything can be committed, and then it can be two weeks before I rediscover my changes. It's a bit rare for it to be that bad, but it does happen.

Re: "Ideal" design and workflow for VCS

Posted: Tue Jun 11, 2019 3:02 pm
by Solar
Happens for me as well. The trick is to discipline yourself and...

a) Work on one issue at a time only. Don't let the scope creep, and start by thinking about "what will this look like as a change set, and what would be the comment?".

b) As soon as you've resolved that issue, commit. Do not continue to add other tidbits to the same change set. You've reached a point of stability; snapshot it, like you would in a game.

As long as you follow those two rules, it doesn't matter if "work in progress" sits in your working directory for a while. (At least, not until someone else touches the same pieces of code and you have to start merging changes.)

Really, once you start employing a VCS, it's not an afterthought of the development process; the commit / change set is the basis for it all. I don't follow this religiously myself, but I do feel the pain whenever I deviate from it, because things get messy (and wasteful) really quickly every time.

And I've worked with versioning filesystems... in my opinion, they make it worse, not better, exactly because they take away the thinking in a) and somewhat muddy the waters around b).