Skip to content
Permalink

Comparing changes

Choose two branches to see what’s changed or to start a new pull request. If you need to, you can also or learn more about diff comparisons.

Open a pull request

Create a new pull request by comparing changes across two branches. If you need to, you can also . Learn more about diff comparisons here.
base repository: git/git
base: 173fc54b005c92dc0da0fe5e71034128eddbacc8
Choose a base ref
...
head repository: git/git
compare: bcec6780b2ec77ea5f846d5448771f97110041e1
Choose a head ref
  • 7 commits
  • 15 files changed
  • 1 contributor

Commits on Nov 17, 2022

  1. refs: fix memory leak when parsing hideRefs config

    When parsing the hideRefs configuration, we first duplicate the config
    value so that we can modify it. We then subsequently append it to the
    `hide_refs` string list, which is initialized with `strdup_strings`
    enabled. As a consequence we again reallocate the string, but never
    free the first duplicate and thus have a memory leak.
    
    While we never clean up the static `hide_refs` variable anyway, this is
    no excuse to make the leak worse by leaking every value twice. We are
    also about to change the way this variable will be handled so that we do
    indeed start to clean it up. So let's fix the memory leak by using the
    `string_list_append_nodup()` so that we pass ownership of the allocated
    string to `hide_refs`.
    
    Signed-off-by: Patrick Steinhardt <ps@pks.im>
    Signed-off-by: Taylor Blau <me@ttaylorr.com>
    pks-t authored and ttaylorr committed Nov 17, 2022
    Copy the full SHA
    5eeb9aa View commit details
    Browse the repository at this point in the history
  2. refs: get rid of global list of hidden refs

    We're about to add a new argument to git-rev-list(1) that allows it to
    add all references that are visible when taking `transfer.hideRefs` et
    al into account. This will require us to potentially parse multiple sets
    of hidden refs, which is not easily possible right now as there is only
    a single, global instance of the list of parsed hidden refs.
    
    Refactor `parse_hide_refs_config()` and `ref_is_hidden()` so that both
    take the list of hidden references as input and adjust callers to keep a
    local list, instead. This allows us to easily use multiple hidden-ref
    lists. Furthermore, it allows us to properly free this list before we
    exit.
    
    Signed-off-by: Patrick Steinhardt <ps@pks.im>
    Signed-off-by: Taylor Blau <me@ttaylorr.com>
    pks-t authored and ttaylorr committed Nov 17, 2022
    Copy the full SHA
    9b67eb6 View commit details
    Browse the repository at this point in the history
  3. revision: move together exclusion-related functions

    Move together the definitions of functions that handle exclusions of
    refs so that related functionality sits in a single place, only.
    
    Signed-off-by: Patrick Steinhardt <ps@pks.im>
    Signed-off-by: Taylor Blau <me@ttaylorr.com>
    pks-t authored and ttaylorr committed Nov 17, 2022
    Copy the full SHA
    05b9425 View commit details
    Browse the repository at this point in the history
  4. revision: introduce struct to handle exclusions

    The functions that handle exclusion of refs work on a single string
    list. We're about to add a second mechanism for excluding refs though,
    and it makes sense to reuse much of the same architecture for both kinds
    of exclusion.
    
    Introduce a new `struct ref_exclusions` that encapsulates all the logic
    related to excluding refs and move the `struct string_list` that holds
    all wildmatch patterns of excluded refs into it. Rename functions that
    operate on this struct to match its name.
    
    Signed-off-by: Patrick Steinhardt <ps@pks.im>
    Signed-off-by: Taylor Blau <me@ttaylorr.com>
    pks-t authored and ttaylorr committed Nov 17, 2022
    Copy the full SHA
    1e9f273 View commit details
    Browse the repository at this point in the history
  5. revision: add new parameter to exclude hidden refs

    Users can optionally hide refs from remote users in git-upload-pack(1),
    git-receive-pack(1) and others via the `transfer.hideRefs`, but there is
    not an easy way to obtain the list of all visible or hidden refs right
    now. We'll require just that though for a performance improvement in our
    connectivity check.
    
    Add a new option `--exclude-hidden=` that excludes any hidden refs from
    the next pseudo-ref like `--all` or `--branches`.
    
    Signed-off-by: Patrick Steinhardt <ps@pks.im>
    Signed-off-by: Taylor Blau <me@ttaylorr.com>
    pks-t authored and ttaylorr committed Nov 17, 2022
    Copy the full SHA
    8c1bc2a View commit details
    Browse the repository at this point in the history
  6. rev-parse: add --exclude-hidden= option

    Add a new `--exclude-hidden=` option that is similar to the one we just
    added to git-rev-list(1). Given a section name `uploadpack` or `receive`
    as argument, it causes us to exclude all references that would be hidden
    by the respective `$section.hideRefs` configuration.
    
    Signed-off-by: Patrick Steinhardt <ps@pks.im>
    Signed-off-by: Taylor Blau <me@ttaylorr.com>
    pks-t authored and ttaylorr committed Nov 17, 2022
    Copy the full SHA
    5ff36c9 View commit details
    Browse the repository at this point in the history
  7. receive-pack: only use visible refs for connectivity check

    When serving a push, git-receive-pack(1) needs to verify that the
    packfile sent by the client contains all objects that are required by
    the updated references. This connectivity check works by marking all
    preexisting references as uninteresting and using the new reference tips
    as starting point for a graph walk.
    
    Marking all preexisting references as uninteresting can be a problem
    when it comes to performance. Git forges tend to do internal bookkeeping
    to keep alive sets of objects for internal use or make them easy to find
    via certain references. These references are typically hidden away from
    the user so that they are neither advertised nor writeable. At GitLab,
    we have one particular repository that contains a total of 7 million
    references, of which 6.8 million are indeed internal references. With
    the current connectivity check we are forced to load all these
    references in order to mark them as uninteresting, and this alone takes
    around 15 seconds to compute.
    
    We can optimize this by only taking into account the set of visible refs
    when marking objects as uninteresting. This means that we may now walk
    more objects until we hit any object that is marked as uninteresting.
    But it is rather unlikely that clients send objects that make large
    parts of objects reachable that have previously only ever been hidden,
    whereas the common case is to push incremental changes that build on top
    of the visible object graph.
    
    This provides a huge boost to performance in the mentioned repository,
    where the vast majority of its refs hidden. Pushing a new commit into
    this repo with `transfer.hideRefs` set up to hide 6.8 million of 7 refs
    as it is configured in Gitaly leads to a 4.5-fold speedup:
    
        Benchmark 1: main
          Time (mean ± σ):     30.977 s ±  0.157 s    [User: 30.226 s, System: 1.083 s]
          Range (min … max):   30.796 s … 31.071 s    3 runs
    
        Benchmark 2: pks-connectivity-check-hide-refs
          Time (mean ± σ):      6.799 s ±  0.063 s    [User: 6.803 s, System: 0.354 s]
          Range (min … max):    6.729 s …  6.850 s    3 runs
    
        Summary
          'pks-connectivity-check-hide-refs' ran
            4.56 ± 0.05 times faster than 'main'
    
    As we mostly go through the same codepaths even in the case where there
    are no hidden refs at all compared to the code before there is no change
    in performance when no refs are hidden:
    
        Benchmark 1: main
          Time (mean ± σ):     48.188 s ±  0.432 s    [User: 49.326 s, System: 5.009 s]
          Range (min … max):   47.706 s … 48.539 s    3 runs
    
        Benchmark 2: pks-connectivity-check-hide-refs
          Time (mean ± σ):     48.027 s ±  0.500 s    [User: 48.934 s, System: 5.025 s]
          Range (min … max):   47.504 s … 48.500 s    3 runs
    
        Summary
          'pks-connectivity-check-hide-refs' ran
            1.00 ± 0.01 times faster than 'main'
    
    Signed-off-by: Patrick Steinhardt <ps@pks.im>
    Signed-off-by: Taylor Blau <me@ttaylorr.com>
    pks-t authored and ttaylorr committed Nov 17, 2022
    Copy the full SHA
    bcec678 View commit details
    Browse the repository at this point in the history