Diff for *.doc files did not work as expected #355
Comments
This behaviour (and similar ones for PDF, RTF, and DOCX files) seems to be caused by the commits that edited /etc/gitattributes. This file and its supporting scripts apparently did not make it into the 2.x versions. Could you test this after the following steps:
Download the old script 'astextplain' to your Git bin directory. Add these lines to your gitconfig
I can't test this currently since I don't have Git for Windows 2.5 installed.
Thanks for clarifying!
It was not meant as a final solution, but as a check of whether it still works. Documents are basically binary files, so conversion to text is not exactly an out-of-the-box feature. Git's main purpose is source code versioning, which means it's optimized for plain text. Since both scripts (astextplain and docx2txt) plus antiword add up to roughly 250 kB here, I don't think @dscho would have much of a problem if we added them back in. I'll take a look at making an installer and comparing the sizes when I find time for that. I'll notify you about any pull requests.
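To illustrate why a document needs a conversion step before it can be diffed as text, here is a minimal Python sketch of the idea for .docx files. It is not the docx2txt script discussed in this thread; `docx_to_text` is a hypothetical name. It assumes only that a .docx file is a zip archive whose main text lives in `word/document.xml`.

```python
import io
import zipfile
from xml.etree import ElementTree

# WordprocessingML namespace used by elements in word/document.xml.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def docx_to_text(data: bytes) -> str:
    """Return the text of a .docx file, one paragraph per line.

    Hypothetical helper for illustration; real converters handle
    tables, footnotes, and other parts that this sketch ignores.
    """
    # A .docx file is a zip archive; the body text is in word/document.xml.
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        xml = archive.read("word/document.xml")
    root = ElementTree.fromstring(xml)
    lines = []
    # Each <w:p> is a paragraph; its visible text sits in <w:t> elements.
    for para in root.iter(f"{{{W_NS}}}p"):
        runs = [node.text or "" for node in para.iter(f"{{{W_NS}}}t")]
        lines.append("".join(runs))
    return "\n".join(lines)
```

A diff tool can then compare the returned strings of two revisions line by line, which is the kind of output a text conversion step is meant to provide.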
You will find it in the poppler package. You can install it via Just for fun, I contributed a package definition for Xpdf which also provides a |
Next update: Git 2.x searches /$(prefix)/etc/gitattributes instead of /etc/gitattributes. I guess Git 1.x was just built without the prefix. I apparently also forgot to include the antiword mapping files. I'm getting closer.
I'm still having slight differences in the doc conversion, but I got it running. I've also gotten docx2txt running as intended. I should probably use the unzip package instead of adding unzip to the git-extra package. I guess I should also introduce a separate package for antiword and its 30-ish mapping files. This is my current Git 2.5.1 result for doc to text conversion:
This is the intended result that Git 1.9.4 produces:
I would assume it's a conversion issue between CRLF and LF, but I don't know why it would be converted in Git 1.9.4 and not in Git 2.5.1. Any ideas, @dscho? Maybe we could fix this by running the doc file through TODO:
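One way to check the CRLF/LF hypothesis is to compare the two conversion results after normalizing their line endings. The Python sketch below assumes the two outputs have been captured as strings. Both function names are hypothetical, and the sketch does not claim that line endings are the actual cause.

```python
def normalize_newlines(text: str) -> str:
    """Convert CRLF and lone CR line endings to LF."""
    # Replace CRLF first so that its CR is not converted a second time.
    return text.replace("\r\n", "\n").replace("\r", "\n")


def same_modulo_line_endings(a: str, b: str) -> bool:
    """Return True if a and b differ at most in their line endings."""
    return normalize_newlines(a) == normalize_newlines(b)
```

If `same_modulo_line_endings` returns True for the Git 1.9.4 and Git 2.5.1 outputs, the difference is limited to line endings. If it returns False, something else in the conversion differs as well.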
I would add it here: https://github.com/git-for-windows/build-extra/blob/b1ada75d730cd6b5e74ac202841b46e28a5399f0/make-file-list.sh#L89 (and here: https://github.com/git-for-windows/build-extra/blob/b1ada75d730cd6b5e74ac202841b46e28a5399f0/make-file-list.sh#L73). In any case, would you have some code to show? If you do, please open a Pull Request (with the prefix "DO NOT MERGE YET:").
Converting Word files to text before diffing them allows an easier comparison between changed files. This reintroduces some functionality of Git for Windows 1.7.x+. It was requested by a user to reintroduce this feature in git-for-windows/git#355 . Signed-off-by: Matthias Aßhauer <[email protected]>
For those who follow this thread, but have not had a look at git-for-windows/build-extra#75: I've opened a pull request for the first version of these changes, but they aren't ready to be merged yet. @dscho and I have created packages for |
Converting Word files to text before diffing them allows an easier comparison between changed files. This reintroduces some functionality of Git for Windows 1.x. This fixes git-for-windows/git#355 . Signed-off-by: Matthias Aßhauer <[email protected]>
Converting PDF and Word files to text before diffing them allows an easier comparison between changed files. This reintroduces some functionality of Git for Windows 1.x. This fixes git-for-windows/git#355 . Signed-off-by: Matthias Aßhauer <[email protected]>
Converting PDF and Word files to text before diffing them allows an easier comparison between changed files. This reintroduces some functionality of Git for Windows 1.x. Including only unzip.exe instead of the entire unzip package makes the installer increase by only 61 kiB instead of 84 kiB, hence we opted for the former. pdftotext exists in the xpdf package (adds 2860 kiB) and the poppler package (adds 13250 kiB); we opted to include the xpdf pdftotext.exe and its dependency libstdc++-6.dll, which add 550 kiB to the installer, instead of the poppler pdftotext.exe and its 23 additional DLLs, which would increase the size of the installer by 2032 kiB. In total this commit increases the size of the installer by 2220 kiB. This fixes git-for-windows/git#355 . Signed-off-by: Matthias Aßhauer <[email protected]>
Git for Windows 1.9.5 has a nice feature: it compares the text content of *.doc files when a diff is calculated.
Git 2.5.x treats *.doc files like ordinary binary files.