Compress
1. About data compression

Data compression makes files smaller while keeping the content usable. We can distinguish approx. 4 types of data compression:
See appropriate online resources for further general information about data compression.

2. Potential pitfalls

Before trying to compress anything, you should be aware of the following things:
3. Algorithms

3.1 Simple algorithms

RLE

RLE (“Run Length Encoding”) is the simplest compression algorithm, based on “holes” (large blocks of the same byte, like 0,0,0,0,0,… or 255,255,255,… but also 119,119,119,…) in the input file. It is defined as an optional feature of the BMP image format, but almost never used there (most BMPs are uncompressed). Most archivers do not use it, probably for practical reasons (difficult to include in one pass ?). However, performing RLE before the more sophisticated (LZ- & Huffman-based) algorithms can result in an improvement. On ordinary files the benefit is not spectacular, but in special cases, like huge (>10 MiB) “empty” (one huge “hole”, see above) or almost “empty” files, Deflate and similar methods perform very badly: they produce a file that decompresses correctly to the original, but is also well compressible again, breaking the “law” that data can never (almost :-D ) be compressed further by applying the same algorithm again. Preprocessing such files using RLE could improve the compression (smaller output file & also faster).

LZ77

LZ77 (“Lempel-Ziv 77”) was developed in 1977. It searches for repeated occurrences of the same string and replaces each such string by a code pointing back in the data stream / file to a location where the same string has previously occurred, together with the length of the matching string. Mostly used together with Huffman in the Deflate algorithm.

LZ78, LZW84

Developed in 1978 and 1984, based on LZ77, trying to improve it. Both used to be patented, but now (since 2006) all possible patents are expired. LZ78 (“Lempel-Ziv 78”) was never popular, unlike LZW84 (“Lempel-Ziv-Welch 84”), used in the “.Z” compressor, GIF images, PKARC, old versions of PKZIP and PDF documents. Adobe implemented LZW84 (among other patented algorithms) in its PDF document format, and did so intentionally: they had a license to use this algorithm, and wanted to keep their “exclusive rights” on PDF this way.
Big problems came up after it was found out (in 1995 and 1999, after many years of using LZW84 and assuming it to be “free and safe”) that it was patented. Now LZW84 is highly obsolete: the “.Z” compressor/format was replaced by GZIP (Deflate algorithm), later by BZIP2 or LZMA / 7-ZIP; LZW84 in ZIPs (0.xx and 1.xx versions) was also replaced by Deflate, and later by the 7-ZIP archiver as well; GIF images were replaced by PNG, again using the Deflate algorithm; finally, Deflate was also introduced in PDF documents (in PDF 1.2) and is preferred there, though LZW84 remains an option and part of the PDF file format.

LZX

Derivative of the LZ77 algorithm, invented in 1995 by Jonathan Forbes and Tomi Poutanen for an archiver of the same name for the Amiga computer. Originally shareware, it was abandoned in 1997 and turned into freeware, but the source code was never released. In 1996 Forbes went to work for Microsoft and “brought” the algorithm (with tiny modifications) to several of their file formats, including CAB (“Cabinet”) installation packages, CHM (“Compiled HTML”) documentation files, and WIM installation packages (used in Vista). The format of those files as well as the algorithm are sufficiently well known and publicly documented, and several independent tools (including 7-ZIP) can extract them, but nothing supports the original Amiga LZX archives.
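Returning to the RLE paragraph at the start of this section: the following is a minimal sketch of the idea, not the RLE variant of BMP or of any particular archiver. The (run length, byte value) pair format and the function names are assumptions made for illustration, and the “empty” file is only 2 MiB here (instead of >10 MiB) to keep the pure-Python loop quick. It also tries out the observation that Deflate output of an “empty” file is compressible again.

```python
import zlib


def rle_encode(data: bytes) -> bytes:
    """Encode data as (run_length, byte_value) pairs; run_length is 1..255."""
    out = bytearray()
    i = 0
    while i < len(data):
        run = 1
        # Extend the run while the next byte repeats and the counter still fits in one byte.
        while i + run < len(data) and data[i + run] == data[i] and run < 255:
            run += 1
        out += bytes((run, data[i]))
        i += run
    return bytes(out)


def rle_decode(encoded: bytes) -> bytes:
    """Reverse rle_encode: expand every (run_length, byte_value) pair."""
    out = bytearray()
    for k in range(0, len(encoded), 2):
        out += bytes([encoded[k + 1]]) * encoded[k]
    return bytes(out)


# A 2 MiB "hole": nothing but zero bytes (size chosen for illustration only).
empty = bytes(2 * 1024 * 1024)

once = zlib.compress(empty, 9)                          # Deflate (zlib wrapper), maximum effort
twice = zlib.compress(once, 9)                          # Deflate applied to its own output
rle_then_deflate = zlib.compress(rle_encode(empty), 9)  # RLE as a preprocessing step

print("original        :", len(empty))
print("Deflate once    :", len(once))
print("Deflate twice   :", len(twice))
print("RLE then Deflate:", len(rle_then_deflate))
```

On such input the second Deflate pass still shrinks the data, and the RLE-preprocessed variant ends up smaller than a single Deflate pass. On ordinary files this naive pair format would usually make things worse, since every non-repeated byte costs two bytes.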
Huffman

The Huffman algorithm is based on the fact that different values occur with significantly different frequencies inside a file. In a text file, for example, the lowercase letters “a”, “e”, “s” occur much more frequently than, for example, “Q” or “X”, and more frequently still than non-alphanumeric values. The algorithm assigns to every input byte value (8 bits in size) a variable-length (approx. 2…15 bits) output code, while keeping the process reversible. The point of doing this is that the output file becomes smaller than the input one: it contains many short (2…6 bits) codes and very few long (>8 bits) ones.

Arithmetic, Range

Two almost identical algorithms, but with one critical difference: “Arithmetic” is patented (by IBM ?), while the “Range” algorithm is considered unpatented. Both are improvements on Huffman: they compress better but slower than it, and both give almost the same results in speed and output file size.

3.2 Combined algorithms

These algorithms consist of several of the previously named simple algorithms, partially including some additional processing (which provides no compression when used in isolation).

Deflate

Invention & General

Very popular algorithm doing LZ77 and Huffman in one pass. Used in “standard” ZIP (PKZIP 2.xx) archives, GZIP compressed files, PNG images, some (older) Windows installers and PDF documents (since PDF 1.2). Good algorithm descriptions with sample sources are available (RFC 1951 from 1996-May). Quite “cheap”, usable on CPUs down to Intel 8086 with a few MHz and 512 KiB RAM.

Late life

Deflate is the most popular compression algorithm of all time. Although its compression is substantially inferior to newer algorithms like LZMA, it is still popular after more than 20 years of life. In some fields of usage it has been mostly replaced (Windows installers), in some fields partially (Linux source packages), while in some fields it is still dominant (lossless image compression - PNG, most trouble-free and compatible archives - ZIP).
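The Huffman idea described earlier in this section (frequent values get short codes, rare values get long ones) can be sketched as follows. This is a textbook construction that computes only the code lengths; it is not the length-limited variant that Deflate uses, and the function name and the sample frequencies are assumptions made for illustration.

```python
import heapq
from collections import Counter


def huffman_code_lengths(data: bytes) -> dict:
    """Return {byte_value: code_length_in_bits} for a Huffman code built from data."""
    freq = Counter(data)
    if not freq:
        return {}
    if len(freq) == 1:
        # A single distinct symbol still needs one bit per occurrence.
        return {next(iter(freq)): 1}
    # Heap entries: (weight, unique tie breaker, {symbol: depth in the subtree}).
    heap = [(weight, sym, {sym: 0}) for sym, weight in freq.items()]
    heapq.heapify(heap)
    tie = 256  # byte values are 0..255, so merged nodes get larger tie breakers
    while len(heap) > 1:
        w1, _, left = heapq.heappop(heap)    # the two rarest subtrees ...
        w2, _, right = heapq.heappop(heap)
        merged = {s: d + 1 for s, d in {**left, **right}.items()}  # ... move one level down
        heapq.heappush(heap, (w1 + w2, tie, merged))
        tie += 1
    return heap[0][2]


# Hypothetical sample: 100 bytes in which "e" is very frequent and "X" is rare.
sample = b"e" * 45 + b"t" * 16 + b"a" * 13 + b"s" * 12 + b"Q" * 9 + b"X" * 5
lengths = huffman_code_lengths(sample)
counts = Counter(sample)
total_bits = sum(counts[s] * lengths[s] for s in counts)

for sym in sorted(lengths, key=lengths.get):
    print(chr(sym), lengths[sym], "bits")
print("fixed 8-bit size:", 8 * len(sample), "bits; Huffman size:", total_bits, "bits")
```

A real compressor additionally has to store the code table (or the code lengths) in the output so that the decoder can rebuild the same codes; the sketch ignores that overhead.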
Several attitudes towards its inferiority can be observed:
See below about implementations.

BWT/BZIP/BZIP2

BWT (“Burrows-Wheeler-Transform”) itself does not compress; it only “mixes” data in a “magic” way to make it more compressible by LZ- and Huffman-style algorithms. It was developed by Mike Burrows and David Wheeler. It performs best on huge source code packages; the drawback is that it is more vulnerable to “special” (highly repetitive) input data than other algorithms. BZIP and BZIP2 are compressors/algorithms written and maintained by Julian R. Seward. In 1995 BZIP was released, a compressor based on BWT, LZ77 & Arithmetic compression algorithms. J.R. Seward soon found out that the Arithmetic algorithm is patented and had to drop it and go back to classical Huffman :-( , worsening the compression by approx. 1%, and named the result BZIP2. It is released under a liberal open source license and is unpatented. BZIP2 is positioned somewhere between Deflate and LZMA in compression factors achieved, speed and memory requirements. A few MiB RAM is required for compression (depends on settings), less for decompression. It could work down to 80286 with XMS or 8086 with EMS. There are also some experimental stronger compressors based on BWT, but BZIP2 is the only implementation with practical use.

LZMA

Invention & Usage

LZMA (“Lempel-Ziv-Markov-Chain-Algorithm”) is an improvement of Deflate, developed by Igor Pavlov and used in eir 7-ZIP product. Instead of “old” Huffman, the “Range” algorithm is used, and instead of LZ77 with a 32 KiB sliding window, sophisticated match finders supporting dictionaries of many MiB in size are used; a “Markov-Chain” algorithm is also involved. Later the algorithm “leaked” into other products, most notably the UPX executable packer and the NSIS and INNOSETUP installers (both Win32 only, but at least extractable in DOS also).

Description, LZMA2

Unfortunately, no good algorithm description is available so far.
For years the only sources of information were the source code of the 7-ZIP program, the LZMA “toy” compressor (intended to turn into a serious GZIP/BZIP2 replacement, but maybe this will never happen because of competing attempts named XZ and LZIP), and the LZMA SDK source, all written in “C++”, with all the speed optimizations and using multithreading (optional only ?). Besides this there was a “simplified but compatible” ANSI-C source for decompression only. Things changed a bit with 7-ZIP version 4.58 beta: Igor rewrote the LZMA compression and decompression code from C++ into “plain” C (reason: performance; the rest of the 7-ZIP application remains in C++), so now plain C code is the “reference” implementation, but there is still no text in a “human” language. The LZMA algorithm is excellent for highly compressible data; OTOH it doesn’t perform well on incompressible or badly compressible data. In practice, LZMA will expand such data more frequently and by more than, for example, Deflate. The LZMA2 algorithm (available in 7-ZIP versions 9.xx, the earliest stable one being 9.20 from 2010-Nov) supports uncompressed blocks, addressing this problem of the existing LZMA; it is of course incompatible with LZMA and not extractable with older versions of 7-ZIP. For years the license of the LZMA SDK was GNU LGPL with a few very minor exceptions; with version 4.61 beta of 7-ZIP and LZMA SDK, it was changed to Public Domain.

Cost (technical)

The algorithm is something of a memory hog - approx. 64 MiB RAM is the minimum for reasonable compression, and the CPU should be at least an 80486 with 50 MHz - not suitable for very old PCs. The decompression is much “cheaper” and can be performed with a few MiB RAM (depends on the dictionary size set while compressing) on an 80386 or even below with XMS or EMS, if someone ported the code to such systems. Even “worse”, UPX reportedly can decompress LZMA even on an 8086, very slowly, when using a very small dictionary.

PPMD

Algorithm by Dmitry Shkarin. Project page: compression.ru/ds .
Implemented in 7-ZIP, where it can optionally be used instead of LZMA.

Misc

Misc …

3.3 Multimedia algorithms

Multimedia algorithms … See appropriate online resources for further information about multimedia compression algorithms.

DCT/MDCT/IDCT

Wavelet WVT/IWVT

4. Encryption

Many archivers offer encryption of data in addition to compression. Both symmetric and asymmetric encryption exist; “classical” archiving with a password uses the symmetric kind. Many older archivers (ARC, PKZIP 2.xx) use poor algorithms and have critical weaknesses; newer products (7-ZIP) are theoretically very secure, moving the risk to other factors, like the “human” factor and usage of risky OSes (like Windows). If poor encryption is sufficient (hiding viruses from antivirus programs for example :-D or texts from text search), the PKZIP 2.xx algorithm is preferable. For secure encryption, 7-ZIP is the right product, using the 7-ZIP archive format and a good password, and doing so on DOS. See appropriate online resources for further information about encryption.

5. Processing steps

The “steps” are rather theoretical - in most archivers they are all performed automatically in one pass; on Linux, piping is sometimes used to “chain” them.

5.1 Non-Solid archiving
Examples: ARC, ZIP.
5.2 Solid archiving
Solid archiving can be:
The big file resulting from composing the files can be saved and accessed by the user. Example: TAR followed by GZIP or BZIP2. The TAR file is accessible. or
The big “file” is not accessible, all steps occur in memory in one pass. Examples: RAR, ACE, 7-ZIP with solid option on.

6. Archivers

6.1 ARC

The original: ARC = ARChive. Released 1986, maintained by two (!) companies: PKWARE as “PKARC” and SEA as “SEA-ARC”. Provided as shareware exclusively for the only acceptable OS that everybody had at that time: MS-DOS. Copyright conflicts between PKWARE and SEA resulted in the death of the ARC format and of the SEA company in 1990, while PKWARE introduced the ZIP file format and the PKZIP+PKUNZIP programs, and quickly became very popular with those. Some more info on Wikipedia: en.wikipedia.org/wiki/Phil_Katz en.wikipedia.org/wiki/ARC_(file_format) ARC related post by Rugxulo: mail-archive.com/freedos-user…10457.html

6.2 ZIP

Format creation by PKWARE, PKZIP for DOS

Introduced by PKWARE as a replacement for ARC. After some experiments with the ZIP file format and hacking on the compression algorithm in versions 0.xx and 1.xx, PKZIP 2.04 for MS-DOS was released in 1993, as shareware again. It supports file sizes up to 4 GiB, the Deflate compression algorithm, CRC32 checksums for integrity verification, and a sort of “encryption”, which however is rather poor, see also the “Encryption” section. It also tries to achieve maximum speed by using, if available, EMS, XMS, DPMI32, and 32-bit 80386 or 80486 instructions. On the other hand, it should work on an 8086 too.

Info-ZIP

A free and open source implementation of the Deflate algorithm and ZIP file format, available under a BSD license. The (useless) “encryption” was originally not included (available only as a separate patch for the source code) because of US “cryptography export” restrictions; later the restrictions were relaxed, allowing it to be included.

WinZIP

PKWARE intended to maintain the ZIP standard exclusively; however, they made the file format and the algorithms open (the speed-optimized code was always closed source).
This made ZIP a quasi standard of archiving and allowed the development of ZIP-compatible archivers by other people and companies, but it also allowed some people to create the “WinZIP” product, which, having the magic word “Win” in the name, “hijacked” the standard, became the most popular archiver and turned PKZIP into a rather marginal and historical thing. WinZIP started with 16-bit code on Win 3.xx and changed to 32-bit with Win95, but for many years it required PKZIP & PKUNZIP, as it did not contain any compression code at all :-D . Finally, very late (version approx. 7 ???), Deflate code (picked from the Info-ZIP project) was added, removing the PKZIP & PKUNZIP requirement. For some time other DOS packers and unpackers (ARC, LHA) were “supported” (allowing WinZIP to “support” those formats); finally they got dropped too. In the meantime various compression algorithms were added (picked from open source libraries with sufficiently liberal licenses). Interesting: old WinZIP self-extractors are dual-mode executables, working on DOS and Windows (file structure: MZ … NE … PK !!!). PKWARE then also changed to “Windows” (native Win32 console and GUI binaries), but late - “too late” as many people say.

Competitors beating ZIP format, “extended” ZIP, zipx

Besides WinZIP, other “Win”-based archivers were created, especially WinRAR, offering new archive formats with better compression, solid archiving, stronger encryption and redundancy/recovery.
With a big delay, the PKWARE & WinZIP maintainers tried to react to the competition and introduced, partially independently, “extensions” to the ZIP standard: Deflate64 for (marginally) stronger compression, later BZIP2 and finally the PPMD and LZMA algorithms, ZIP64 for files > 4 GiB, additional encryption algorithms (RC2, RC4, DES, 3DES, finally Rijndael (AES), in 2 different incompatible implementations :-D ), and special handling of some multimedia files (the WAVPACK algorithm for WAV files, and an algorithm for lossless recompression of lossy JPG pictures). As a result, they generously messed up the ZIP standard. A “ZIP archiver” supporting all these formats and extensions is bloated, complicated and very difficult to make and keep bug-free. Even later (meaning: “too late”) the zipx file format was defined - it is just a “ZIP” with any of the aforementioned and already previously implemented extensions (except Deflate64 or ZIP64 ???). Those extensions have been mostly (not fully) implemented in the 7-ZIP archiver, and they are also slowly leaking into Info-ZIP and other archivers. From “ http://www.winzip.com/comp_info.htm ” : “ The PPMd compression format was introduced in WinZip 10.0 Beta, released in August 2005. The WavPack compression format was introduced in WinZip 11.0 Beta, released in October 2006. The compressed Jpeg format was introduced in WinZip 12.0, released in September 2008. In WinZip 12.1, released in May of 2009, the Zipx file was introduced. The Zipx file is a Zip file that uses any of the aforementioned compression methods or the LZMA or bzip2 compression methods as documented in the Zip file appnote.txt specification. ”

PKZIP and DOS

The latest DOS version of PKZIP is 2.5, released in 1999. It is optimized for running in faked “DOS” boxes (Win98, with LFN), supports newer CPUs (Pentium, should run faster on them), and, as an undocumented feature, it can extract files compressed with Deflate64.
Unfortunately it seems to have problems with XMS/DPMI/CPUID handling - it can misdetect a Pentium as an 80486 and, even worse, crash in some situations with XMS and DPMI present. This problem should be fixed in HIMEMX 3.32, so avoid older versions, most notably the “official” FreeDOS 1.0 HIMEM 2.26, at least with PKWARE. It cannot compress with Deflate64 and also does not improve the compression compared to version 2.04; the new “-exx” switch has no effect on most files. Another known problem: PKUNZIP for DOS (all versions) and 2.50 for Win32 may falsely refuse to extract ZIPs created on Linux.
Other ZIP archivers, KZIP

The archivers supporting the ZIP format (the standard one) vary in compression performance and achieved size reduction. PKZIP is fast and has good compression, but still leaves room for improvement (see above “Late life” of Deflate). Using 7-ZIP one can increase the compression effort (still referring to standard ZIP) and achieve better compression while keeping compatibility. There is also a product named KZIP (right: there is no “P”) written by Ken Silverman, closed source freeware. It offers probably the best PKZIP 2.xx compatible compression, at the cost of speed. It runs in DOS using the HX-DOS Extender. One more interesting product is TUNZ, an UNZIPper written in ASM (no source release yet), only 2.5 KiB in size (DOS .COM executable), 8086 compatible. It does however have some limitations regarding the number of files in the archive and subdirectories, and it can only extract all files of the archive together, with no “selective” extract.

Other usage of the ZIP format

The ZIP format is being used for “other” file types too, most notably JAR Java packages, Open Office ODT documents, and DOCX documents of MS Office / Word 2007 and newer, see DocumentFormatsViewers. So any archiver supporting ZIP can extract those files; this doesn’t mean that the result will be human-readable text, but usually you can at least extract images this way.

6.3 RAR

Developed by Eugene Roshal (RAR = Roshal’s ARchiver) and maintained by em up to now. Started approx. 1995 as “RAR” for DOS and soon changed to “Windows”, named “WinRAR” from then on. It has always been an innovative product: it introduced better compression than ZIP (and is still improving) at acceptable speed, one-stage solid archiving, redundancy data for recovery and strong (closed source :-( ) encryption. The algorithm is and always was proprietary and closed source, and the RAR and WinRAR products are shareware, but a freeware UNRAR program is available for different platforms, including DOS.
The UNRAR code is also open source, with the restriction that you may not use it to reconstruct the RAR algorithm. A minimal commandline RAR for DOS is available, also as shareware, at the same cost as WinRAR with its expensive GUI. With free 7-ZIP available and working in DOS as well, RAR has become quite obsolete.
6.4 ACE

Developed by Marcel Lemke (ACE = Advanced Compression Engine ???) in approx. 1996. Used to be an innovative product providing very good compression at acceptable speed and some other benefits. Versions 1.x provided a free & open source UNACE, versions 2.x no longer do (inexplicable license change). This license issue, together with the arrival of 7-ZIP, made ACE’s popularity sink and the product and file format obsolete.
6.5 7-ZIP

Creation

Developed by Igor Pavlov in the late 1990s and based on eir LZMA compression algorithm. After the year 2000, the product became stable and usable, and its popularity has been slowly increasing ever since. It supports its own 7-ZIP archive format as well as some other popular formats: the “standalone” console version supports 7-ZIP, ZIP (with some of the new and obsolete extensions, like Deflate64), GZIP, BZIP2, TAR, Z (very obsolete, LZW84 algorithm, extract only). The DLL-based console version and the Win32 GUI one also support some additional archive formats, like RAR (extract only), CAB and WIM (extract only, since 4.57), FAT and NTFS (since 9.20, hard disk filesystems) and ISO (CD filesystem, partially, extract only). The Win32 GUI version provides a simple 2-panel file manager (WinZIP freaks do not like it :-D ).

Later history

The latest 3.xx version was 3.13 from 2003-12-11, then the 4.xx line began; the latest 4.xx version is 4.65 from 2009-Feb-03. Because of the year 2009, Igor decided to bump the major version number to 9, and the only stable version is 9.20, released 2010-Nov-18. In 2015 the major version number was bumped to 15; the only stable versions are 15.12 from 2015-11-19 and 15.14. Meanwhile version 16 is out.

7-ZIP archive file format

The 7-ZIP archive format provides excellent compression using the LZMA algorithm (also LZMA2 since versions 9.xx), alternatively also PPMD, BZIP2 or Deflate, one-stage solid archiving (risky, not everybody likes it), strong encryption (Rijndael algorithm, 256-bit key, large number of SHA-256 hashes - 512 Ki of them as of version 4.58, a number supposed to grow in the future while keeping compatibility with older versions of 7-ZIP), and support for unreasonably huge file sizes (many TiB).

7-ZIP and DOS

Unfortunately, Igor Pavlov never provided a DOS version, only a Win32 GUI and a Win32 console one. Also, the so-called “standalone” console version uses multithreading and cannot be compiled for DOS in a trivial way.
But an external developer, Japheth, created the “HX-DOS extender” product, which allows many Win32 console apps to be used in DOS, even those with multithreading. 7-ZIP was one of eir privileged apps and ey made it work excellently in DOS. Another developer, Blair, also got 7-ZIP working in DOS in a different way - ey took the “p7zip” product, the POSIX (Linux and similar systems) version of 7-ZIP, performed some minor fixes in the source and recompiled it with DJGPP and its “pthreads” emulation library. The result works; there are only minor problems, like a lazy progress indicator and bloated executable size. In the past ey ported versions 4.32, 4.33beta and 4.42 (4.42 is included in the FreeDOS 1.0 distribution; all those versions are no longer (separately) available ?), later 4.55 and 4.57 became available. Other people who have compiled and released some [p]7-ZIP ports for DOS are Mik & Rugxulo, see links below. Actually, since 7-ZIP / p7-ZIP v. 4.37, DJGPP is one of the “official” platforms supported by the “p7zip” project. Still, those various ports exhibit various problems, like incompatibility with HDPMI32 (disable DPMI 1.0 ???), not working on FreeDOS (???), and creation of ZIPs tagged as “created on Linux” (PKUNZIP refuses to extract them, other UNZIP tools are fine, other archive formats don’t show this problem). The DLL-based “full” commandline version, supporting those additional formats, also works in DOS using HX-DOS (seems to work, not tested too much). So far there is no benefit from 7-ZIP’s huge file size support in DOS. Maybe one day file sizes up to 256 GiB (is it poor ? :-D ) will be possible in DOS, and only on FAT32+ partitions, after FAT32+ support has been added to the DOS kernel (Udo Kuhnt’s EDR-DOS has had it implemented since the 2006 August WIP, FreeDOS not yet) and to the HX-DOS extender (not yet done). For now the limit is 2 GiB; usage of 2 GiB…4 GiB files in DOS is sort of possible on FAT32 but problematic.
Because of its support for formats other than 7-ZIP, the 7-ZIP archiver almost obsoletes the ZIP (PKZIP & PKUNZIP, Info-ZIP), GZIP, BZIP2 & TAR archivers, if one accepts the need for HX-DOS and the CPU requirements (down to approx. 80486 - 4.58 is verified to work on an 80486 SX without FPU, no tests on 80386).
6.6 TAR, Z, GZIP, BZIP-2

These archivers originate from Linux and are still very popular there. Some people speak of a “Linuxed archive” when seeing such a file; however, they are in (almost) no way specific to Linux and are well usable on other systems, DOS included. TAR does not compress files, it only composes them together; the resulting file is supposed to be compressed using Z (very obsolete, using the LZW84 algorithm), GZIP (GZ, Deflate algorithm), BZIP2 (even newer) or LZMA (newest, but the format for LZMA files (unlike the LZMA compression algorithm and the 7-ZIP file format) is not yet finalized). This is 2-stage solid archiving. There are also archivers performing TAR and the compression in one pass (“piping”) without storing the huge TAR file on a disk. It is also possible to compress the TAR with other archivers, like ZIP (benefit: weak “encryption” possible, while GZIP has none) or 7-ZIP (benefit: better compression, strong encryption possible). TAR, GZIP and BZIP2 all have 32-bit DOS ports compiled with DJGPP, TAR and GZIP also some 16-bit ports or clones. The current version of BZIP2 is 1.0.6 from 2010-Sep-20; however, there have not been any spectacular changes (compression improvements) for years, since approx. 1.0.2. It requires an 80386 CPU and some MiB of RAM; at least theoretically it could also run on an 80286 or 8086 with XMS or EMS if someone ported the code to such systems. TAR and GZIP are “cheap” enough to run even on an 8086 with 512 KiB RAM. The best way to handle these archives on new PCs (80486 and above) is the 7-ZIP archiver, which supports them all, and ZIP and 7-ZIP in addition.
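The difference between non-solid archiving (ZIP) and the 2-stage or one-pass solid archiving described above (TAR followed by GZIP) can be tried out with Python’s standard library. This is only a sketch under stated assumptions: the 50 in-memory files with identical, poorly compressible content and their names are hypothetical test data chosen to make the effect visible, and everything stays in memory instead of going to disk.

```python
import gzip
import io
import random
import tarfile
import zipfile

# Hypothetical input: 50 small files with identical, poorly compressible content.
rng = random.Random(0)
payload = bytes(rng.getrandbits(8) for _ in range(2000))
files = {f"file{i:02d}.bin": payload for i in range(50)}


def write_tar(fileobj, mode):
    """Compose all files into one TAR stream written to fileobj."""
    with tarfile.open(fileobj=fileobj, mode=mode) as tf:
        for name, data in files.items():
            info = tarfile.TarInfo(name)
            info.size = len(data)
            tf.addfile(info, io.BytesIO(data))


# Non-solid: ZIP compresses every member on its own.
zip_buf = io.BytesIO()
with zipfile.ZipFile(zip_buf, "w", zipfile.ZIP_DEFLATED) as zf:
    for name, data in files.items():
        zf.writestr(name, data)
zip_bytes = zip_buf.getvalue()

# 2-stage solid: first the accessible (uncompressed) TAR, then GZIP over the whole thing.
tar_buf = io.BytesIO()
write_tar(tar_buf, "w")
tar_bytes = tar_buf.getvalue()
two_stage = gzip.compress(tar_bytes)

# One pass: the TAR stream is fed straight into GZIP, no intermediate TAR is kept.
one_pass_buf = io.BytesIO()
write_tar(one_pass_buf, "w:gz")
one_pass = one_pass_buf.getvalue()

print("ZIP (non-solid)     :", len(zip_bytes))
print("TAR (uncompressed)  :", len(tar_bytes))
print("TAR+GZIP (2 stages) :", len(two_stage))
print("TAR+GZIP (one pass) :", len(one_pass))
```

With this input the solid variants come out far smaller than the ZIP, because Deflate can refer back to the previous file’s content, while ZIP starts from scratch for every member. With files that have little in common the difference shrinks accordingly.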
6.7 Misc archivers

There is a huge number of other archivers available, never or no longer having great popularity; a few examples:
6.8 Deflate optimization tools

Stay compatible and become smaller at same time

See above “Late life” of Deflate.

KZIP and PNGOUT

(see above about KZIP, GraphMediaTech about PNGOUT)

DeflOPT

Freeware, closed source. It takes already compressed data as input (ZIP, GZIP, PNG). It’s a bit of a mystery what it does and how, but it does work and is lossless.

Efficient Compression Tool

Zopfli

Compression library; some binaries creating or optimizing ZIP, GZIP, PNG are available or can be compiled.

7. I got a file of “.XXX” type - what now ?

If someone sends you a file or you find a file compressed with an obsolete or unpopular/unknown archiver:
8. Executable compression

8.1 About

Executable compression is a controversial “technology”. It can make executables (possibly also DLLs, and even “drivers”) look smaller (possibly massively smaller), but there are disadvantages as well. Pro’s:
Con’s:
8.2 UPX

A famous product, providing the possibility of decompression to the original (“equivalent”, but still not necessarily byte-identical) file as an important official feature. Supports 16-bit DOS .COM and .EXE, .SYS drivers, .SYS/.EXE “combos”, 32-bit WATCOM/LE, DJGPP/COFF, Win32/PE and many other non-DOS formats. The UPX license prohibits (or tries to) any manipulation/cracking of the decompression stub, like hiding the usage of UPX or preventing easy decompression. Unfortunately, in PE files, UPX stores some info in the PE header outside of “official” fields, and applying PESTUB (see HX-DOS) to it has (accidentally) exactly this effect - it prevents easy decompression. The product license is “semi-GPL” - it uses a proprietary compression algorithm, NRV. Alternatively, one can compile UPX oneself (on Linux only (?), very difficult), however only the weaker algorithm UCL is available then. Since version 3 (tested in 2.9xx versions), LZMA is also available - for big files LZMA is the best, for smaller ones (below approx. 100 KiB) NRV is better. LZMA is also available for real mode and 8086 (don’t forget to specify the "--8086" switch), but decompression is very slow, so it’s doubtful whether this can be considered an achievement at all.
8.3 APACK

APACK is a 16-bit real-mode DOS executable (.EXE and .COM) compressor by Ibsen Software / Jorgen Ibsen. The latest version is aPACK v0.99b from 2000-09-24. It is closed source, free for personal use (only). Some FreeDOS utils are compressed with it; there have however been heated discussions about whether it is legal / GPL-compliant or not to “link” APACK’s closed source approx. 160 bytes (!!) “stub” with GPL’ed FreeDOS code, without final clarification :-D There is no official unpacker.

8.4 ASPACK

This product brings nothing good (commercial, Win32 only, no official unpacker); it is sort of “relevant” because of the trouble it causes when running apps compressed with it using the HX-DOS Extender.

8.5 PEtite

Another Win32 PE packer (silly note: “petite” is the French word for “small”). Executables packed with it (example: “PHATCODE.EXE”) don’t work with HX-DOS, the reason is unknown.

8.6 Unpackers, IUP

APACK unpacker

For .COM only, FASM source included. board.flatassembler.net/topic.php?t=7278

ASPACK-Die

Unpacker for ASPACK. Win32 GUI, doesn’t work on DOS for now.

IUP

Intelligent UnPacker is a generic unpacker for DOS .COM and .EXE, using the Debug/SingleStep CPU mode to track the unpack process, which then allows saving the unpacked file. Supports PKLITE, APACK and many other DOS real mode executable packers. Unfortunately it doesn’t work in FreeDOS (reason: wrong usage of the INT $21 / AH=$5A “create temp file” function, together with lack of correctness of all (!) DOS kernels, “fixed” in later FreeDOS kernels by adjusting them to follow the other ones where IUP “happens to work” despite the bug); no problem in EDR-DOS. No project page, download from here:

9. Multimedia compression

Info on Multimedia compression is available at GraphMediaTech.

10. Faked compression

The topic of data compression fascinates many people. Among the many more or less serious releases participating in the “compression race”, there have been a few that have to be called “faked compression”, made either for fun or (commercially) for fraud.
Those “products” pretend to achieve better compression than they actually provide or than is even possible. The 2 ways to accomplish this are:
It’s easy to check whether a compression product is “honest” or not. Lossy compression can be detected using some hash (MD5 for example, which must be the same for the original and the decompressed file); hiding data can be detected by transferring the compressed file to another computer and decompressing it there. If it decompresses happily on the same computer, but fails on the other one (reports an error, or the output is lossy or garbage), then this is strong evidence of hiding data. One example of a faked compressor of the category “fun” is BARF, and one of the category “fraud” is “Infima Archiver”.

11. See also