Perform shallow clones on the TypeScript submodule. #9
Tested this, and it still seems to give me a full clone.
Although this PR itself is appropriate, it does not work as expected because of the Git issues described below. @DanielRosenwasser

Current Issues
We have two ways to clone submodules, but both suffer from performance issues:
Even with the --shallow-submodules option, Git fetches histories of unused branches. For large repositories like TypeScript, which contain hundreds of branches, this leads to downloading a significant amount of unnecessary data, greatly increasing clone times.
Related issue: git/git#1740
Even if

Command logs
As shown in the logs, more than 300 branch objects are requested, which results in a large download size.
> GIT_TRACE=1 GIT_TRACE_PACKET=1 GIT_TRACE_PERFORMANCE=1 GIT_CURL_VERBOSE=1 git clone --branch shallowClone --recurse-submodules --shallow-submodules git@github.com:microsoft/typescript-go.git
// Submodule processing begins
Submodule '_submodules/TypeScript' (https://github.com/microsoft/TypeScript.git) registered for path '_submodules/TypeScript'
// Executes git clone --no-single-branch, requesting objects for all branches from the server
18:38:15.581439 run-command.c:759 trace: start_command: /opt/homebrew/opt/git/libexec/git-core/git clone --no-checkout --progress --depth=1 --separate-git-dir /Users/noyan/tmp/go-recur-clone/.git/modules/_submodules/TypeScript --no-single-branch -- https://github.com/microsoft/TypeScript.git /Users/noyan/tmp/go-recur-clone/_submodules/TypeScript
// Requests objects for 300 branches from the server
17:51:55.092224 pkt-line.c:85 packet: clone> command=fetch
17:51:55.092238 pkt-line.c:85 packet: clone> agent=git/2.48.1
17:51:55.092243 pkt-line.c:85 packet: clone> object-format=sha1
17:51:55.092246 pkt-line.c:85 packet: clone> 0001
17:51:55.092249 pkt-line.c:85 packet: clone> thin-pack
17:51:55.092252 pkt-line.c:85 packet: clone> ofs-delta
17:51:55.092256 pkt-line.c:85 packet: clone> deepen 1
17:51:55.092261 pkt-line.c:85 packet: clone> want 0aac72020ee8414218273f654eb7ce1dc2dd0d6b
17:51:55.092265 pkt-line.c:85 packet: clone> want 1ae7dbbcf091c9e52296bccfbb376f6fc397bd85
17:51:55.092268 pkt-line.c:85 packet: clone> want 5aa2eb744a3cffe570e54a4d382d67013284742b
17:51:55.092271 pkt-line.c:85 packet: clone> want 78ee4cacafc20491fca5557da7908580df18db0e
...
remote: Enumerating objects: 84605, done.
remote: Counting objects: 100% (34907/34907), done.
remote: Compressing objects: 100% (17581/17581), done.
remote: Total 25934 (delta 15901), reused 16546 (delta 7746), pack-reused 0 (from 0)
18:41:57.210076 git.c:476 trace: built-in: git fetch --depth=1
Receiving objects: 100% (25934/25934), 34.26 MiB | 5.91 MiB/s, done.
Resolving deltas: 100% (15901/15901), completed with 4849 local objects.
remote: Enumerating objects: 569, done.
remote: Counting objects: 100% (303/303), done.
remote: Compressing objects: 100% (51/51), done.
remote: Total 59 (delta 53), reused 10 (delta 6), pack-reused 0 (from 0)
Unpacking objects: 100% (59/59), 49.10 KiB | 88.00 KiB/s, done.
remote: Enumerating objects: 569, done.
remote: Counting objects: 100% (303/303), done.
remote: Compressing objects: 100% (51/51), done.
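To check how many objects a clone or fetch actually asks for, the `want` lines in a saved packet trace can be counted. The following is a minimal sketch, assuming the trace was captured with `GIT_TRACE_PACKET=1` and its stderr output redirected to a file; the helper name `count_wants` and the file name `trace.log` are hypothetical and not part of Git.

```shell
# count_wants FILE: print the number of "want <oid>" packets the client sent,
# as recorded in a GIT_TRACE_PACKET=1 trace (hypothetical helper, not a Git command).
count_wants() {
    # Client-side packet lines look like:
    #   "<time> pkt-line.c:85 packet:        clone> want <oid>"
    # Both "clone>" and "fetch>" prefixes are matched.
    grep -cE 'packet:[[:space:]]+(clone|fetch)> want [0-9a-f]+' "$1"
}
```

For example, after re-running the clone command above with `2> trace.log` appended, `count_wants trace.log` prints the number of requested objects, which makes it easy to compare one approach against another.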
Even with
> git clone --branch shallowClone git@github.com:microsoft/typescript-go.git; cd typescript-go
> git submodule update --init
// Retrieves HEAD and its history
// i.e., all commits before HEAD in the main branch
18:38:00.970221 git.c:476 trace: built-in: git fetch origin 52c59dbcbee274e523ef39e6c8be1bd5e110c2f1
19:28:36.112895 pkt-line.c:85 packet: fetch> command=fetch
19:28:36.112905 pkt-line.c:85 packet: fetch> agent=git/2.48.1
19:28:36.112908 pkt-line.c:85 packet: fetch> object-format=sha1
19:28:36.112910 pkt-line.c:85 packet: fetch> 0001
19:28:36.112912 pkt-line.c:85 packet: fetch> thin-pack
19:28:36.112914 pkt-line.c:85 packet: fetch> ofs-delta
19:28:36.112917 pkt-line.c:85 packet: fetch> shallow 0aac72020ee8414218273f654eb7ce1dc2dd0d6b
19:28:36.112921 pkt-line.c:85 packet: fetch> want 52c59dbcbee274e523ef39e6c8be1bd5e110c2f1
19:28:36.112926 pkt-line.c:85 packet: fetch> have 0aac72020ee8414218273f654eb7ce1dc2dd0d6b
remote: Enumerating objects: 622228, done.
remote: Counting objects: 100% (622214/622214), done.
remote: Compressing objects: 100% (165732/165732), done.
Receiving objects: 100% (598243/598243), 1.91 GiB | 10.97 MiB/s, done.
Resolving deltas: 100% (443668/443668), completed with 19471 local objects.
Workaround
Perform a normal git clone, then initialize the submodule with --depth 1:
> git clone git@github.com:microsoft/typescript-go.git; cd typescript-go
> git submodule update --init --depth 1
Use the same approach when updating submodules:
> git submodule update --depth 1
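The workaround can also be wrapped in a small shell function so the two steps are always run together. This is only a sketch under stated assumptions: the function name `clone_shallow_submodules` and its two arguments (repository URL and target directory) are hypothetical, and it uses nothing beyond `git clone`, `git -C`, and `git submodule update --init --depth 1`.

```shell
# clone_shallow_submodules URL DIR
# Hypothetical helper: clone the superproject normally, then fetch its
# submodules with history truncated to a single commit.
clone_shallow_submodules() {
    # Step 1: plain clone of the superproject; submodules are not fetched yet.
    git clone "$1" "$2" || return 1
    # Step 2: initialize and check out submodules at depth 1.
    git -C "$2" submodule update --init --depth 1
}
```

It could be called as `clone_shallow_submodules git@github.com:microsoft/typescript-go.git typescript-go`. Later submodule updates would still need `git submodule update --depth 1` run by hand, as described above.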
This should make clones of the TypeScript repo way quicker, since we really don't need full history available.