Skip to content
Permalink

Comparing changes

Choose two branches to see what’s changed or to start a new pull request. If you need to, you can also or learn more about diff comparisons.

Open a pull request

Create a new pull request by comparing changes across two branches. If you need to, you can also . Learn more about diff comparisons here.
base repository: git/git
base: f85cd430b12b0d3e4f1a30ef3239a1b73d5f6331
Choose a base ref
...
head repository: git/git
compare: 029a632c35861b3395c71e767d80bbf463dc1ae1
Choose a head ref
  • 10 commits
  • 13 files changed
  • 1 contributor

Commits on Apr 18, 2023

  1. pack-write.c: plug a leak in stage_tmp_packfiles()

    The function `stage_tmp_packfiles()` generates a filename to use for
    staging the contents of what will become the pack's ".mtimes" file.
    
    The name is generated in `write_mtimes_file()` and the result is
    returned back to `stage_tmp_packfiles()` which uses it to rename the
    temporary file into place via `rename_tmp_packfiles()`.
    
    `write_mtimes_file()` returns a `const char *`, indicating that callers
    are not expected to free its result (similar to, e.g., `oid_to_hex()`).
    But callers are expected to free its result, so this return type is
    incorrect.
    
    Change the function's signature to return a non-const `char *`, and free
    it at the end of `stage_tmp_packfiles()`.
    
    Signed-off-by: Taylor Blau <me@ttaylorr.com>
    Signed-off-by: Junio C Hamano <gitster@pobox.com>
    ttaylorr authored and gitster committed Apr 18, 2023
    Copy the full SHA
    c412583 View commit details
    Browse the repository at this point in the history
  2. builtin/repack.c: fix incorrect reference to '-C'

    When cruft packs were originally being developed, `-C` was designated as
    the short-form for `--cruft` (as in `git repack -C`).
    
    This was dropped due to confusion with Git's top-level `-C` option
    before submitting to the list. But the reference to it in
    `--cruft-expiration`'s help text was never updated. Fix that dangling
    reference in this patch.
    
    Signed-off-by: Taylor Blau <me@ttaylorr.com>
    Signed-off-by: Junio C Hamano <gitster@pobox.com>
    ttaylorr authored and gitster committed Apr 18, 2023
    Copy the full SHA
    c512f31 View commit details
    Browse the repository at this point in the history
  3. builtin/gc.c: ignore cruft packs with --keep-largest-pack

    When cruft packs were implemented, we never adjusted the code for `git
    gc`'s `--keep-largest-pack` and `gc.bigPackThreshold` to ignore cruft
    packs. This option and configuration option share a common
    implementation, but including cruft packs is wrong in both cases:
    
      - Running `git gc --keep-largest-pack` in a repository where the
        largest pack is the cruft pack itself will make it impossible for
        `git gc` to prune objects, since the cruft pack itself is kept.
    
      - The same is true for `gc.bigPackThreshold`, if the size of the cruft
        pack exceeds the limit set by the caller.
    
    In the future, it is possible that `gc.bigPackThreshold` could be used
    to write a separate cruft pack containing any new unreachable objects
    that entered the repository since the last time a cruft pack was
    written.
    
    There are some complexities to doing so, mainly around handling
    pruning objects that are in an existing cruft pack that is above the
    threshold (which would either need to be rewritten, or else delay
    pruning). Rewriting a substantially similar cruft pack isn't ideal, but
    it is significantly better than the status-quo.
    
    If users have large cruft packs that they don't want to rewrite, they
    can mark them as `*.keep` packs. But in general, if a repository has a
    cruft pack that is so large it is slowing down GC's, it should probably
    be pruned anyway.
    
    In the meantime, ignore cruft packs in the common implementation for
    both of these options, and add a pair of tests to prevent any future
    regressions here.
    
    Signed-off-by: Taylor Blau <me@ttaylorr.com>
    Signed-off-by: Junio C Hamano <gitster@pobox.com>
    ttaylorr authored and gitster committed Apr 18, 2023
    Copy the full SHA
    05b9013 View commit details
    Browse the repository at this point in the history
  4. t/t5304-prune.sh: prepare for gc --cruft by default

    Many of the tests in t5304 run `git gc`, and rely on its behavior that
    unreachable-but-recent objects are written out loose. This is sensible,
    since t5304 deals specifically with this kind of pruning.
    
    If left unattended, however, this test would break when the default
    behavior of a bare "git gc" is adjusted to generate a cruft pack by
    default.
    
    Ensure that these tests continue to work as-is (and continue to provide
    coverage of loose object pruning) by passing `--no-cruft` explicitly.
    
    Signed-off-by: Taylor Blau <me@ttaylorr.com>
    Signed-off-by: Junio C Hamano <gitster@pobox.com>
    ttaylorr authored and gitster committed Apr 18, 2023
    Copy the full SHA
    b934207 View commit details
    Browse the repository at this point in the history
  5. t/t6501-freshen-objects.sh: prepare for gc --cruft by default

    In a similar spirit as previous commits, prepare for `gc --cruft`
    becoming the default by ensuring that the tests in t6501 explicitly
    cover the case of freshening loose objects not using cruft packs.
    
    We could run this test twice, once with `--cruft` and once with
    `--no-cruft`, but doing so is unnecessary, since we already test object
    rescuing, freshening, and dealing with corrupt parts of the unreachable
    object graph extensively via t5329.
    
    Signed-off-by: Taylor Blau <me@ttaylorr.com>
    Signed-off-by: Junio C Hamano <gitster@pobox.com>
    ttaylorr authored and gitster committed Apr 18, 2023
    Copy the full SHA
    b31d45b View commit details
    Browse the repository at this point in the history
  6. t/t6500-gc.sh: refactor cruft pack tests

    In 12253ab (gc: add tests for --cruft and friends, 2022-10-26), we
    added a handful of tests to t6500 to ensure that `git gc` respected the
    value of `--cruft` and `gc.cruftPacks`.
    
    Then, in c695592 (config: let feature.experimental imply
    gc.cruftPacks=true, 2022-10-26), another set of similar tests was added
    to ensure that `feature.experimental` correctly implied enabling cruft
    pack generation (or not).
    
    These tests are similar and could be consolidated. Do so in this patch
    to prepare for expanding the set of command-line invocations that enable
    or disable writing cruft packs. This makes it possible to easily test
    more combinations of arguments without being overly repetitive.
    
    Signed-off-by: Taylor Blau <me@ttaylorr.com>
    Signed-off-by: Junio C Hamano <gitster@pobox.com>
    ttaylorr authored and gitster committed Apr 18, 2023
    Copy the full SHA
    50685e0 View commit details
    Browse the repository at this point in the history
  7. t/t6500-gc.sh: add additional test cases

    In the last commit, we refactored some of the tests in t6500 to make
    clearer when cruft packs will and won't be generated by `git gc`.
    
    Add the remaining cases not covered by the previous patch into this one,
    which enumerates all possible combinations of arguments that will
    produce (or not produce) a cruft pack.
    
    This prepares us for a future commit which will change the default value
    of `gc.cruftPacks` by ensuring that we understand which invocations do
    and do not change as a result.
    
    Signed-off-by: Taylor Blau <me@ttaylorr.com>
    Signed-off-by: Junio C Hamano <gitster@pobox.com>
    ttaylorr authored and gitster committed Apr 18, 2023
    Copy the full SHA
    b9061bc View commit details
    Browse the repository at this point in the history
  8. t/t9300-fast-import.sh: prepare for gc --cruft by default

    In a similar fashion as previous commits, adjust the fast-import tests
    to prepare for "git gc" generating a cruft pack by default.
    
    This adjustment is slightly different, however. Instead of relying on us
    writing out the objects loose, and then calling `git prune` to remove
    them, t9300 needs to be prepared to drop objects that would be moved
    into cruft packs.
    
    To do this, we can combine the `git gc` invocation with `git prune` into
    one `git gc --prune`, which handles pruning both loose objects, and
    objects that would otherwise be written to a cruft pack.
    
    Likely this pattern of "git gc && git prune" started all the way back in
    03db452 (Support gitlinks in fast-import., 2008-07-19), which
    happened after deprecating `git gc --prune` in 9e7d501 (builtin-gc.c:
    deprecate --prune, it now really has no effect, 2008-05-09).
    
    After `--prune` was un-deprecated in 58e9d9d (gc: make --prune useful
    again by accepting an optional parameter, 2009-02-14), this script got a
    handful of new "git gc && git prune" instances via via 4cedb78
    (fast-import: add input format tests, 2011-08-11). These could have been
    `git gc --prune`, but weren't (likely taking after 03db452).
    
    Signed-off-by: Taylor Blau <me@ttaylorr.com>
    Signed-off-by: Junio C Hamano <gitster@pobox.com>
    ttaylorr authored and gitster committed Apr 18, 2023
    Copy the full SHA
    c58100a View commit details
    Browse the repository at this point in the history
  9. builtin/gc.c: make gc.cruftPacks enabled by default

    Back in 5b92477 (builtin/gc.c: conditionally avoid pruning objects
    via loose, 2022-05-20), `git gc` learned the `--cruft` option and
    `gc.cruftPacks` configuration to opt-in to writing cruft packs when
    collecting or pruning unreachable objects.
    
    Cruft packs were introduced with the merge in a50036d (Merge branch
    'tb/cruft-packs', 2022-06-03). They address the problem of "loose object
    explosions", where Git will write out many individual loose objects when
    there is a large number of unreachable objects that have not yet aged
    past `--prune=<date>`.
    
    Instead of keeping track of those unreachable yet recent objects via
    their loose object file's mtime, cruft packs collect all unreachable
    objects into a single pack with a corresponding `*.mtimes` file that
    acts as a table to store the mtimes of all unreachable objects. This
    prevents the need to store unreachable objects as loose as they age out
    of the repository, and avoids the problem of loose object explosions.
    
    Beyond avoiding loose object explosions, cruft packs also act as a more
    efficient mechanism to store unreachable objects as they age out of a
    repository. This is because pairs of similar unreachable objects serve
    as delta bases for one another.
    
    In 5b92477, the feature was introduced as experimental. Since then,
    GitHub has been running these patches in every repository generating
    hundreds of millions of cruft packs along the way. The feature is
    battle-tested, and avoids many pathological cases such as above. Users
    who either run `git gc` manually, or via `git maintenance` can benefit
    from having cruft packs.
    
    As such, enable cruft pack generation to take place by default (by
    making `gc.cruftPacks` have the default of "true" rather than "false).
    
    Signed-off-by: Taylor Blau <me@ttaylorr.com>
    Signed-off-by: Junio C Hamano <gitster@pobox.com>
    ttaylorr authored and gitster committed Apr 18, 2023
    Copy the full SHA
    e3e24de View commit details
    Browse the repository at this point in the history
  10. repository.h: drop unused gc_cruft_packs

    As of the previous commit, all callers that need to read the value of
    `gc.cruftPacks` do so outside without using the `repo_settings` struct,
    making its `gc_cruft_packs` unused. Drop it accordingly.
    
    Signed-off-by: Taylor Blau <me@ttaylorr.com>
    Signed-off-by: Junio C Hamano <gitster@pobox.com>
    ttaylorr authored and gitster committed Apr 18, 2023
    Copy the full SHA
    029a632 View commit details
    Browse the repository at this point in the history