Add support for .pptx diff #1581

Closed
1 task done
mhujer opened this issue Mar 23, 2018 · 2 comments

Comments

@mhujer

mhujer commented Mar 23, 2018

  • I was not able to find an open or closed issue matching what I'm seeing

Setup

  • Which version of Git for Windows are you using? Is it 32-bit or 64-bit?
$ git --version --build-options

git version 2.16.2.windows.1
cpu: x86_64
built from commit: e1848984d1004040ec5199e749b5f282ddf4bb09
sizeof-long: 4
  • Which version of Windows are you running? Vista, 7, 8, 10? Is it 32-bit or 64-bit?
    W10 Pro 1709, 64bit
$ cmd.exe /c ver

Microsoft Windows [Version 10.0.16299.309]
  • What options did you set as part of the installation? Or did you choose the
    defaults?
Editor Option: VIM
Path Option: Cmd
SSH Option: OpenSSH
CURL Option: OpenSSL
CRLF Option: LFOnly
Bash Terminal Option: MinTTY
Performance Tweaks FSCache: Enabled
Use Credential Manager: Enabled
Enable Symlinks: Disabled

Details

Currently, when you try to diff pptx files (Powerpoint presentation), it only shows that the binary files differ:

Binary files a/slides.pptx and b/slides.pptx differ

Diffing already works for .doc, .docx, .pdf and several other binary files. It would be nice to have it working also for presentations.
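For context, Git produces readable diffs of binary formats through a textconv diff driver: a pattern in gitattributes maps files to a driver, and the driver names a command that is given the file path and prints a text rendering on stdout. A minimal per-repository sketch, assuming a converter command called pptx2txt is on the PATH (hypothetical, nothing of that name is shipped at the time of this issue) and using the driver name pptx purely for illustration:

```shell
# Sketch: register a textconv diff driver for .pptx in the current repository.
# "pptx2txt" is a hypothetical converter; pass another command name as $1
# to use whatever tool prints the presentation's text to stdout.
setup_pptx_diff() {
    # Map *.pptx to a diff driver named "pptx" (name chosen for illustration).
    printf '%s\n' '*.pptx diff=pptx' >> .gitattributes
    # Tell Git which command turns the binary file into text before diffing.
    git config diff.pptx.textconv "${1:-pptx2txt}"
}
```

After running setup_pptx_diff inside a repository, git diff on a changed .pptx file would show the converter's text output instead of the "Binary files differ" line, provided the converter actually exists.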

I found these related issues:

I also discovered some tools that may be able to do the pptx conversion:

I checked that the docx2txt.pl included in the distribution (in Git\usr\bin\docx2txt.pl) is more advanced (configurable unzipping tool etc.), so it would make more sense to adapt it to work with pptx.

What do you think?

If you think that pursuing PPTX diffing is a good idea, I'd love to work on it (or at least try it).

@dscho
Member

dscho commented Mar 23, 2018

Welcome!

If you think that pursuing PPTX diffing is a good idea, I'd love to work on it (or at least try it).

I am all for supporting more formats, in particular when it comes at little cost.

I checked that the docx2txt.pl included in the distribution (in Git\usr\bin\docx2txt.pl) is more advanced (configurable unzipping tool etc.), so it would make more sense to adapt it to work with pptx.

This comes from http://docx2txt.sourceforge.net/, and maybe there was already work/thought in that direction by that project?

https://github.com/welcheb/pptx2txt.sh (MIT licensed)

That one looks simple enough, I think it would make for a great way to support .pptx format with minimal effort (and minimal weight).

https://sourceforge.net/p/pptx2txt/code/HEAD/tree/ (GPL3 licensed)

That one is actually an adaptation of the aforementioned docx2txt, according to the code comment, and it provides a little more detail when rendering as .txt.

Maybe you can compare the output on one of your .pptx files, and then continue from there?
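To give a sense of how little a basic conversion needs: a .pptx file is a ZIP archive whose slides are stored as ppt/slides/slideN.xml, with the visible text in <a:t> elements. The following is a rough sketch of that idea, not the code of either tool linked above. It assumes unzip, grep and sed are available, and the function names are made up for illustration.

```shell
# Print the text runs (<a:t> elements) found in slide XML read from stdin,
# one run per line. XML entities such as &amp; are left undecoded.
extract_slide_text() {
    grep -o '<a:t>[^<]*</a:t>' | sed -e 's/<[^>]*>//g'
}

# Dump the text of every slide in the .pptx file given as $1.
# Slides are printed in archive order, which need not match slide order.
pptx_to_text() {
    unzip -p "$1" 'ppt/slides/slide*.xml' | extract_slide_text
}
```

Running pptx_to_text slides.pptx on one of your own files, next to the output of the two tools above, should show quickly how much detail (slide numbers, notes, bullets) each approach keeps.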

As to the way to include this: you will first want to create a Pacman package. For this, install the Git for Windows SDK via git clone --depth=1 https://github.com/git-for-windows/git-sdk-64 and then start git-bash.exe at its top level.

Then, initialize the MSYS2-packages worktree via sdk init MSYS2-packages. This is the repository where MSYS2/Git for Windows keep the package definitions of all MSYS packages (i.e. software that depends directly or indirectly on the MSYS2 runtime that provides the POSIX emulation layer, as is the case for us because we want to use Bash or Perl).

Then, copy the docx2txt/ directory to pptx2txt/ and adjust the PKGBUILD file as necessary.

To test that this worked, run makepkg -s in the pptx2txt/ directory. This will create the package (a file called pptx2txt-<version>-x86_64.pkg.tar.xz). You can now install this via pacman -U pptx2txt-*.pkg.tar.xz. Once that is done, you should be able to run pptx2txt and test that to your heart's content.

When everything works, commit and push (fork https://github.com/git-for-windows/MSYS2-packages unless you have done that already), then open a Pull Request.
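Put together, the packaging steps so far could be sketched as two small shell functions to run inside the SDK's Git Bash. The function names are made up, and the /usr/src/MSYS2-packages location is an assumption (by analogy with /usr/src/build-extra); adjust it if sdk init puts the worktree elsewhere.

```shell
# Run these inside the SDK's Git Bash. Nothing is executed until a function
# is called by hand.

# Step 1: create the new package directory from the docx2txt one.
prepare_pptx2txt_package() {
    sdk init MSYS2-packages &&
    # Assumption: the worktree ends up next to build-extra under /usr/src/.
    cd /usr/src/MSYS2-packages &&
    cp -R docx2txt pptx2txt
    # Now edit pptx2txt/PKGBUILD by hand (name, version, sources, checksums).
}

# Step 2: build the package and install it locally for testing.
build_pptx2txt_package() {
    cd /usr/src/MSYS2-packages/pptx2txt &&
    makepkg -s &&
    pacman -U pptx2txt-*.pkg.tar.xz
}
```

The split into two functions is deliberate: the PKGBUILD has to be adjusted manually between copying the directory and running makepkg.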

The next step is to build an installer: sdk build installer (run this once to verify that everything works as expected).

Then, adjust /usr/src/build-extra/make-file-list.sh by looking for docx2txt and adding the corresponding pptx2txt entry. This file is what generates the list of files to include in Git for Windows' installer/portable Git/MinGit. (And build-extra is where Git for Windows keeps all kinds of things necessary to build/maintain Git for Windows.)

Now, you will also want to adjust the Git attributes and the astextplain helper by editing the files gitattributes and astextplain in /usr/src/build-extra/git-extra/. To make use of these, you will also have to run updpkgsums in that directory because those files' checksums will be verified when building the git-extra package (which contains extra files and a post-install script performing certain adjustments necessary for Git for Windows).
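The helper edit might look roughly like the following. This is a hypothetical sketch of an extension-based dispatch, not the actual contents of astextplain; the function name and the converter command names are placeholders chosen for illustration.

```shell
# Hypothetical sketch: pick a converter command based on the file extension.
# The real astextplain script may be structured differently.
converter_for() {
    case "$1" in
    *.docx | *.DOCX) echo docx2txt ;;
    *.pptx | *.PPTX) echo pptx2txt ;;   # the new entry
    *) echo cat ;;                      # fall back to the raw contents
    esac
}
```

For example, converter_for slides.pptx prints pptx2txt. The gitattributes file would presumably need a matching line for *.pptx that mirrors whatever the existing .docx entry looks like.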

You will also want to build and install that git-extra package, but this one is a MINGW package (because it is supposedly not depending on the MSYS2 runtime, even if that is no longer true because it contains shell scripts), so it has to be built using makepkg-mingw -s (which will also build the 32-bit package, but you do not need that).

Now you can test and verify that the astextplain helper works as expected, and that Git shows nice diffs in a repository containing .pptx files.

Once that file is edited appropriately, build another installer: sdk build installer. You can now test this installer by running it, and verifying that Git Bash (or Git CMD) now also handles those diffs correctly.
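The build-extra half of the work, after the manual edits to make-file-list.sh, gitattributes and astextplain, could be sketched as one function for the SDK's Git Bash. The function name is made up, and the package file name pattern passed to pacman is an assumption; check what makepkg-mingw actually produces.

```shell
# Run inside the SDK's Git Bash after editing the files by hand.
# Nothing is executed until the function is called.
rebuild_git_extra_and_installer() {
    cd /usr/src/build-extra/git-extra &&
    # Refresh the checksums of the edited gitattributes and astextplain files.
    updpkgsums &&
    # git-extra is a MINGW package, hence makepkg-mingw rather than makepkg.
    makepkg-mingw -s &&
    # Install the freshly built package; the file name pattern is an assumption.
    pacman -U mingw-w64-x86_64-git-extra-*.pkg.tar.xz &&
    sdk build installer
}
```

It is worth checking a .pptx diff in a test repository after the pacman step and before the installer build, since the installer takes much longer to produce.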

When that test is successful, please also commit, push and open a Pull Request (this time at https://github.com/git-for-windows/build-extra).

Thank you for having this splendid idea, and for offering to work on it!

@mhujer
Author

mhujer commented Nov 24, 2019

@dscho sorry for not pursuing this. I created my own Markdown based slide generator instead.

@dscho dscho closed this as completed Nov 24, 2019