Archive abysses: ZIP malware tricks unpacked & explained
There is more to ZIP as an attack vector than some people think – and format-specific peculiarities always provide nasty surprises.
(Image: created with Bing Image Creator by ovw)
As a component of malware and phishing campaigns, ZIP archives are classics in the most negative sense. Hackers are constantly pulling new tricks out of their hats to either bypass protection functions or even launch attacks directly. In the last few months, researchers have repeatedly discovered malware in ZIP files that specifically evaded anti-virus software.
We therefore explain the criminals' typical tricks, drawing on current phishing campaigns with appended archives and on newly emerged variants of tried-and-tested anti-analysis techniques. And we show that a five-year-old archive bomb still provides explosive material for attacks today.
When is a ZIP a ZIP?
If you want to understand the attackers' tactics, you can't avoid a little theory about the ZIP format. Developed and published in 1989 by programmer Phil Katz, it provides flexible containers for files of any type and can also map the associated directory structure. Optional, lossless compression helps to save storage space and transfer data faster. In addition to these obvious advantages, the public domain status of the archive format is also likely to have contributed to its rapid spread and establishment.
Today, ZIP is taken for granted, and it is often an integral part of software projects and applications without us even being aware of it. E-books with an .epub extension, modern office files (e.g. .docx, .xlsx) as well as Java archives and Android packages (.jar, .apk) are technically ZIP archives. After renaming them to .zip, their content can be viewed with any archiving program or even with the operating system's built-in tools. Only additional mandatory components and structural requirements from outside the ZIP specification turn the respective file into a Word document or an Android app.
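As a rough illustration of this point, the following Python sketch uses the standard-library zipfile module to check whether a blob of bytes is a ZIP container and to list its members. The helper name container_members and the two-entry stand-in "document" are assumptions made for this example. The stand-in only borrows typical member names and is not a valid Word file.

```python
import io
import zipfile


def container_members(data: bytes) -> list[str]:
    """Return the member names if the bytes form a ZIP archive, else an empty list."""
    stream = io.BytesIO(data)
    if not zipfile.is_zipfile(stream):
        return []
    with zipfile.ZipFile(stream) as zf:
        return zf.namelist()


# Build a stand-in for an office document in memory (hypothetical, not a real .docx).
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("[Content_Types].xml", "<Types/>")
    zf.writestr("word/document.xml", "<document/>")
fake_docx = buf.getvalue()

print(container_members(fake_docx))
print(container_members(b"plain text, not an archive " * 4))
```

The same helper could be pointed at the bytes of a real .epub, .jar or .apk file. The file extension plays no role in the check.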
For a ZIP to be a ZIP according to specification, however, only two elements are required at the end of the archive: a so-called Central Directory as a "table of contents" and an End of Central Directory entry (EOCD) immediately following it.
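To make the minimal structure tangible, here is a small Python sketch of the edge case of an archive without any members. Its central directory is empty, so all that remains is the 22-byte EOCD record. The field layout follows the ZIP specification. The variable names are chosen for this example, and acceptance by Python's zipfile module does not guarantee that every other unpacker behaves the same way.

```python
import io
import struct
import zipfile

# EOCD layout: signature, disk number, disk with central directory,
# entries on this disk, total entries, central directory size,
# central directory offset, comment length.
empty_archive = struct.pack("<4s4H2LH", b"PK\x05\x06", 0, 0, 0, 0, 0, 0, 0)

print(len(empty_archive))                 # size of the EOCD record in bytes
print(zipfile.is_zipfile(io.BytesIO(empty_archive)))

with zipfile.ZipFile(io.BytesIO(empty_archive)) as zf:
    members = zf.namelist()               # no members in an empty archive
print(members)
```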
(Image: Wikimedia Commons / Niklaus Aeschbacher / CC BY-SA 3.0)
Traditionally, parsers read a ZIP file from back to front. They first locate the EOCD by its fixed byte signature, which begins with "PK" (for Phil Katz), and extract from it, among other things, the size of the central directory and its position in the file (offset). The central directory in turn holds one entry for each element in the archive, containing metadata and the offset of that element's local file header, which also begins with "PK". Each element's data is preceded by such a header with the information needed to unpack it correctly.
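The back-to-front procedure can be sketched in a few lines of Python. This is a deliberately naive illustration, not how any particular unpacker is implemented. The function name list_entries is an assumption for this example. The sketch ignores ZIP64 and multi-disk archives, and it simply takes the last occurrence of the EOCD signature, which a crafted archive comment could mislead.

```python
import io
import struct
import zipfile

EOCD_SIG = b"PK\x05\x06"  # End of Central Directory record
CDFH_SIG = b"PK\x01\x02"  # central directory file header
LFH_SIG = b"PK\x03\x04"   # local file header


def list_entries(data: bytes) -> list[tuple[str, int]]:
    """Walk a ZIP back to front; return (name, local header offset) pairs."""
    # Step 1: find the EOCD by searching for its signature from the end.
    eocd = data.rfind(EOCD_SIG)
    if eocd < 0:
        raise ValueError("no EOCD signature found")
    # Step 2: read entry count, size and offset of the central directory.
    (_disk, _cd_disk, _n_disk, n_total,
     _cd_size, cd_offset, _comment_len) = struct.unpack_from("<4H2LH", data, eocd + 4)

    # Step 3: walk the central directory, one 46-byte fixed header per entry,
    # followed by variable-length name, extra field and comment.
    entries = []
    pos = cd_offset
    for _ in range(n_total):
        if data[pos:pos + 4] != CDFH_SIG:
            raise ValueError("bad central directory signature")
        name_len, extra_len, comment_len = struct.unpack_from("<3H", data, pos + 28)
        (lfh_offset,) = struct.unpack_from("<L", data, pos + 42)
        name = data[pos + 46:pos + 46 + name_len].decode("utf-8", "replace")
        entries.append((name, lfh_offset))
        pos += 46 + name_len + extra_len + comment_len
    return entries


# Demo archive built in memory with the standard library.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("hello.txt", "hi")
    zf.writestr("docs/readme.txt", "read me")
archive = buf.getvalue()

entries = list_entries(archive)
for name, offset in entries:
    # Each offset should point at a local file header starting with "PK".
    print(name, offset, archive[offset:offset + 4] == LFH_SIG)
```

Because the sketch trusts the EOCD and the central directory alone, it never looks at bytes those structures do not reference.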
If you want to delve deeper into the ZIP structure, we recommend the detailed "ZIP Archive Walkthrough" by security researcher Corkami.