Rar 5 format #310

Merged
merged 1 commit into from
Dec 4, 2017

sorset
Contributor

@sorset sorset commented Oct 13, 2017

Fix rar 5 format comment

@adamhathcock
Owner

Do you plan on doing more?

@sorset
Contributor Author

sorset commented Oct 13, 2017

I have added an IsRar5 function, but RAR5 support itself is still not implemented.
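
For illustration, a signature check of this kind could look like the sketch below. It is written in Python rather than C#, the name `is_rar5` is hypothetical and only mirrors the IsRar5 function mentioned above, and it is not the code from this pull request. The signature bytes are the ones listed in the public RAR technote.

```python
import io

# Signature bytes as listed in the public RAR technote.
RAR4_SIGNATURE = b"Rar!\x1a\x07\x00"
RAR5_SIGNATURE = b"Rar!\x1a\x07\x01\x00"


def is_rar5(stream):
    """Return True if the stream starts with the RAR 5.0 signature.

    Hypothetical helper: it only inspects the next 8 bytes and does not
    search for a signature located after a self-extracting (SFX) stub.
    """
    start = stream.tell()
    head = stream.read(len(RAR5_SIGNATURE))
    stream.seek(start)  # leave the stream position unchanged
    return head == RAR5_SIGNATURE


if __name__ == "__main__":
    print(is_rar5(io.BytesIO(RAR5_SIGNATURE + b"payload")))
```

A real reader would likely also need to handle archives where the signature does not sit at offset 0.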

@sorset
Contributor Author

sorset commented Oct 13, 2017

But I really don't have any free time to implement RAR5 compatibility. Can we take it from the 7zip or 7zip.sharp projects?

@adamhathcock adamhathcock merged commit a4ebd5f into adamhathcock:master Dec 4, 2017
@coderb
Contributor

coderb commented Dec 15, 2017

Hi Adam,

I may have some bandwidth to help work on Rar5 support. Do you have a sense for how much work is involved or how to proceed? I see that rarlabs has open sourced the rar5 unrar source code in C++. Would the plan be to port this code?

I also noticed there is a rar5 branch in SharpCompress but I'm not sure what the status of that branch is.

-Brien

@adamhathcock
Owner

The RAR5 branch is basically nothing.

Really the work should just be implementing reading the headers: https://www.rarlab.com/technote.htm So the work isn't so bad.

I think the decompression is all the same but that needs to be tested.
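
As a rough idea of what the header reading involves: the technote linked above encodes most RAR5 header fields as variable-length integers ("vint"), where each byte carries 7 data bits (least significant group first) and a set high bit means another byte follows. The Python sketch below decodes one such value. The function name `read_vint` and its return shape are assumptions made for illustration, not SharpCompress code.

```python
def read_vint(data, offset=0):
    """Decode one RAR5 variable-length integer from a bytes object.

    Returns a (value, next_offset) tuple. Each byte contributes its low
    7 bits; the high bit is a continuation flag.
    """
    value = 0
    shift = 0
    while True:
        if offset >= len(data):
            raise ValueError("truncated vint")
        byte = data[offset]
        offset += 1
        value |= (byte & 0x7F) << shift
        if not byte & 0x80:  # high bit clear: this was the last byte
            return value, offset
        shift += 7
        if shift >= 70:  # a 64-bit value needs at most 10 bytes
            raise ValueError("vint too long")


if __name__ == "__main__":
    # 0x80 0x01 -> low group 0, next group 1 -> 1 << 7
    print(read_vint(b"\x80\x01"))
```

A header parser would call this repeatedly, feeding `next_offset` back in to walk through the header size, type, and flags fields.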

@coderb
Contributor

coderb commented Dec 15, 2017

So just parse the headers and use the existing RAR4 decompression code? Sounds like work on the order of days to weeks... or am I being optimistic?

EDIT -----
additionally, looks like there is BLAKE2 based checksumming which may be new but shouldn't be too hard to support.

@adamhathcock
Owner

I believe that's the work. It shouldn't be too bad. I think you're not far off on the estimated time :)

@coderb
Contributor

coderb commented Dec 15, 2017

Ok... so just so I have this straight, RAR5 is primarily a change to the archive format and the decompression algorithm has not changed??? So any compression gains are via algo params and/or archive format efficiency?

@adamhathcock
Owner

That's my understanding. I think he just didn't like the header format and it was a little bit of a mess.

Honestly though, I'm scared I'm wrong about the algorithm not changing but it needs testing.

@coderb
Contributor

coderb commented Dec 15, 2017

I'm going to try to get in touch with the winrar developers to get some confirmation (emailed [email protected]).

For reference here's a list of updates from wikipedia:

  • Maximum compression dictionary size increased to 1 GiB (default for WinRAR 5.x is 32 MiB and 4 MiB for WinRAR 4.x).
  • Maximum path length for files in RAR and ZIP archives is increased up to 2048 characters.
  • Support for Unicode file names stored in UTF-8 format.
  • Faster compression and decompression.
  • Multicore decompression support.
  • Greatly improves recovery.
  • Optional AES encryption increased from 128-bit to 256-bit.
  • Optional 256-bit BLAKE2 file hash instead of a default 32-bit CRC32 file checksum.
  • Optional duplicate file detection.
  • Optional NTFS hard and symbolic links.
  • Optional Quick Open Record. Rar4 archives had to be parsed before opening as file names were spread throughout the archive, slowing operation particularly with slower devices such as optical drives, and reducing the integrity of damaged archives. Rar5 can optionally create a "quick open record", a special archive block at the end of the file that contains the names of files included, allowing archives to be opened faster.
  • Removes specialized compression algorithms for Itanium executables, text, raw audio (WAV), and raw image (BMP) files; consequently some files of these types compress better in the older RAR (4) format with these options enabled than in RAR5.
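
To make the checksum item in the list above concrete, here is a small Python sketch contrasting a 32-bit CRC32 with a 256-bit BLAKE2 digest. One important caveat: RAR5 uses the BLAKE2sp variant, which Python's standard library does not provide, so `hashlib.blake2s` below only illustrates the digest size and would not produce values matching a RAR5 archive. The helper names are hypothetical.

```python
import hashlib
import zlib


def crc32_checksum(data):
    """32-bit CRC32 of data, returned as an unsigned integer."""
    return zlib.crc32(data) & 0xFFFFFFFF


def blake2s_digest(data):
    """256-bit BLAKE2s digest of data.

    Assumption for illustration only: this is BLAKE2s, NOT the
    BLAKE2sp variant that RAR5 stores in its file headers.
    """
    return hashlib.blake2s(data, digest_size=32).digest()


if __name__ == "__main__":
    payload = b"123456789"
    print(hex(crc32_checksum(payload)))      # 4 bytes of checksum
    print(len(blake2s_digest(payload)) * 8)  # digest width in bits
```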

@coderb
Contributor

coderb commented Dec 16, 2017

hey adam-
sorry to bug you but i'm trying to get setup using your solution file and i haven't done .netstandard stuff much so i have a stupid question.

i'm running devstudio 2017 (latest patch) with resharper 2017.2.2. i had to edit the sharpcompress csproj to remove the net35 target since it's not installed on my machine.

the solution compiles fine but for some reason resharper sees a bunch of unresolved calls from the test project to sharpcompress, eg entry.WriteToDirectory() shows up as red even though i can Ctrl-B and follow the reference fine. i deleted my resharper cache and restarted to no avail.

when i try to "Run" any [Fact] from the test project it fails with the dump below.

any tips to getting this setup? thanks!

2017.12.16 12:30:43.800 ERROR Run: d05a8ab4-8391-41f9-8756-b3d810ff28f7 - Faulted
2017.12.16 12:30:43.801 ERROR System.AggregateException: One or more errors occurred. ---> System.AggregateException: One or more errors occurred. ---> JetBrains.ReSharper.UnitTestFramework.DotNetCore.Exceptions.ProcessExitedUnexpectedlyException: dotnet exited unexpectedly with the code (0)
Output stream: Microsoft (R) Test Execution Command Line Tool Version 15.3.0-preview-20170628-02
Copyright (c) Microsoft Corporation. All rights reserved.
Error stream:
--- End of inner exception stack trace ---
--- End of inner exception stack trace ---
at JetBrains.ReSharper.UnitTestFramework.Launch.Stages.RunTestsStage.<>c__DisplayClassc.b__8()
---> (Inner Exception #0) System.AggregateException: One or more errors occurred. ---> JetBrains.ReSharper.UnitTestFramework.DotNetCore.Exceptions.ProcessExitedUnexpectedlyException: dotnet exited unexpectedly with the code (0)
Output stream: Microsoft (R) Test Execution Command Line Tool Version 15.3.0-preview-20170628-02
Copyright (c) Microsoft Corporation. All rights reserved.
Error stream:
--- End of inner exception stack trace ---
---> (Inner Exception #0) JetBrains.ReSharper.UnitTestFramework.DotNetCore.Exceptions.ProcessExitedUnexpectedlyException: dotnet exited unexpectedly with the code (0)
Output stream: Microsoft (R) Test Execution Command Line Tool Version 15.3.0-preview-20170628-02
Copyright (c) Microsoft Corporation. All rights reserved.
Error stream: <---
<---

@coderb
Contributor

coderb commented Dec 16, 2017

disregard.... figured it out (cache clearing and extension disabling fixed it)

@coderb
Contributor

coderb commented Dec 17, 2017

marathon coding session and most of the header logic is done.
and.......
yep, there's new decompression code needed!

i assume you are ok with the winrar source license terms (particularly clause #2) as it seems as though the unpack*.cs source code was already ported from it. i'll have to port unpack50.cpp over for RAR5 support.

UnRAR - free utility for RAR archives
License for use and distribution of FREE portable version

  The source code of UnRAR utility is freeware. This means:
  1. All copyrights to RAR and the utility UnRAR are exclusively
    owned by the author - Alexander Roshal.

  2. UnRAR source code may be used in any software to handle
    RAR archives without limitations free of charge, but cannot be
    used to develop RAR (WinRAR) compatible archiver and to
    re-create RAR compression algorithm, which is proprietary.
    Distribution of modified UnRAR source code in separate form
    or as a part of other software is permitted, provided that
    full text of this paragraph, starting from "UnRAR source code"
    words, is included in license, or in documentation if license
    is not available, and in source code comments of resulting package.

  3. The UnRAR utility may be freely distributed. It is allowed
    to distribute UnRAR inside of other software packages.

  4. THE RAR ARCHIVER AND THE UnRAR UTILITY ARE DISTRIBUTED "AS IS".
    NO WARRANTY OF ANY KIND IS EXPRESSED OR IMPLIED. YOU USE AT
    YOUR OWN RISK. THE AUTHOR WILL NOT BE LIABLE FOR DATA LOSS,
    DAMAGES, LOSS OF PROFITS OR ANY OTHER KIND OF LOSS WHILE USING
    OR MISUSING THIS SOFTWARE.

  5. Installing and using the UnRAR utility signifies acceptance of
    these terms and conditions of the license.

  6. If you don't agree with terms of the license you must remove
    UnRAR files from your storage devices and cease to use the
    utility.

    Thank you for your interest in RAR and UnRAR.

                                      Alexander L. Roshal
    

@adamhathcock
Owner

Impressive!

I converted the unrar code from the java version. But yes, no problem with keeping everything free and open source :)

@coderb
Contributor

coderb commented Dec 17, 2017

hey adam

so.... it's going to be a monster of a patch when it's done.

i've done an initial port and no debugging of the rarlabs unpack5 code but unsurprisingly it doesn't quite work.

my approach was just to pull in the minimum amount of code and try to integrate it with the existing stuff that came via your port of junrar.

i think it may not be worth the debugging effort to fix the code though because the unpack15/unpack2/unpack5 code is pretty tightly integrated and the end result would be pretty untrustworthy in my eyes. some examples of issues are liberties taken when junrar ported to java (lack of unsigned types/casting); the current unrar codebase uses size_t in places and supports buffers up to 4GB. it would also be unproductive to try to identify and reapply all code changes over the past ~10 years to the unrar codebase.
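
to illustrate the unsigned-type issue mentioned above: a language without unsigned 32-bit integers treats a value with the top bit set as negative, so a plain right shift drags the sign bit along. the python sketch below emulates both interpretations with masking. the helper names are hypothetical and this is not code from junrar or unrar.

```python
MASK32 = 0xFFFFFFFF


def to_uint32(x):
    """Keep only the low 32 bits, as a C++ `unsigned int` would."""
    return x & MASK32


def to_int32(x):
    """Interpret the low 32 bits as a signed two's-complement value,
    the way a Java `int` would."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x


if __name__ == "__main__":
    bits = 0xF0000000
    # Arithmetic shift on the signed view copies the sign bit down.
    print(to_int32(bits) >> 28)
    # Shift on the unsigned view yields the top four bits as expected.
    print(to_uint32(bits) >> 28)
```

a port that mixes up these two views can decode bit fields incorrectly only for inputs where the top bit happens to be set, which makes such bugs hard to spot in testing.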

so, that's just a really long explanation leading up to my thought that i'd be better off just re-porting the entire unpack codebase afresh directly from the unrar cpp codebase. i'd use the current code as a guide so i'm not that intimidated by the idea.

so.... will you have any issues accepting a megapatch that is basically a rewrite of the unrar stuff?

or would you like me to approach it differently somehow?

@coderb
Contributor

coderb commented Dec 17, 2017

I just pushed the code to my repository if you want to take a peek.

@adamhathcock
Owner

adamhathcock commented Dec 18, 2017

> so, that's just a really long explanation leading up to my thought that i'd be better off just re-porting the entire unpack codebase afresh directly from the unrar cpp codebase. i'd use the current code as a guide so i'm not that intimidated by the idea.
>
> so.... will you have any issues accepting a megapatch that is basically a rewrite of the unrar stuff?

Doing a fresh port would be great. I never attempted to do that myself because of time and lack of experience reading C++

Once things are looking closer to being complete, I can review or help. My personal life is a mess at the moment otherwise I'd volunteer to do more.

@adamhathcock
Owner

@coderb you still working on this? I'm interested in pushing this along if possible.

@coderb
Contributor

coderb commented Jan 10, 2018

hey adam,
i got pulled away onto other things but i'll take a look today.
the current state of my branch is that i've done a fresh port of the c++ unrar src code. basic rar5 decompression (format and algo) is working.

i need to fill in the following:

  • recode (or reuse) some helper classes for the Unpack30() algo (PPM,RARVM)
  • code/test some gaps and differences due to RAR5 file format records (encryption,blake checksum,...)

the branch implements the new code side by side with the old code, so if desired it could be pulled into the main branch at any time. it is enabled / disabled via the call to new Unpack() in RarArchive and RarReader.
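
as a rough picture of that side-by-side arrangement, here is a python sketch of a single construction point that picks one implementation or the other. every name in it (`LegacyUnpack`, `Rar2017Unpack`, `create_unpack`, the `use_new_port` flag) is hypothetical; the actual branch does this in C# at the `new Unpack()` call sites.

```python
class LegacyUnpack:
    """Stand-in for the existing decompressor (hypothetical)."""
    name = "legacy"


class Rar2017Unpack:
    """Stand-in for the freshly ported decompressor (hypothetical)."""
    name = "v2017"


def create_unpack(use_new_port):
    """Single place where callers obtain a decompressor.

    Flipping `use_new_port` switches implementations without touching
    the rest of the archive or reader code.
    """
    return Rar2017Unpack() if use_new_port else LegacyUnpack()


if __name__ == "__main__":
    print(create_unpack(True).name)
    print(create_unpack(False).name)
```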

-brien

@adamhathcock
Owner

Yeah I ran into the Unpack30 build issues. I'll have to stare harder at this at another time.

Thanks for the work. It will be nice to have a fresh port. Hopefully, it performs well as a fresh port.

@coderb
Contributor

coderb commented Jan 10, 2018

you should be able to get it to build if you define RarV2017_RAR5ONLY which excludes the older algos from the new implementation.

@adamhathcock
Owner

@coderb I'm being lazy and integrating the legacy with your new port to cover the V3 case.

There are some missing test files you had. Other than that and fixing up encryption, it's working well.

@coderb
Contributor

coderb commented Apr 26, 2018

that's cool. i didn't really mean to leave it like that but just got too busy with the real world. if you tell me which files are missing i can provide them. -b

@adamhathcock
Owner

@coderb thanks! the files are:

  • Rar5.Encrypted.rar
  • Rar5.EncryptedParts.part01.rar to Rar5.EncryptedParts.part06.rar
  • Rar5.encrypted_filesOnly.rar

@coderb
Contributor

coderb commented Apr 26, 2018

hmm, it's been a while...
