author Johan Lundberg <lundberg@sunet.se> 2019-01-21 14:51:34 +0100
committer Ivan Kanakarakis <ivan.kanak@gmail.com> 2019-01-25 15:47:00 +0200
commit 56f75da775b01aac7eec18cad3ddd47976ab8312
tree 729401c3080af73de0f2d21bbd273ed57de9f545
parent fbff99e4d3cbd1b53150019f41d88654058bb751
Convert sign_statement result to native string

Using lxml.etree.tostring without encoding in python3 results in an
unparsable xml document. To fix this, we always set the encoding to UTF-8
and omit the xml declaration. We then convert the result to the native
string type before returning it.

---

Our preferred encoding (in general) is `utf-8`. `lxml` defaults to `ASCII`,
or expects us to provide an encoding.

Provided an encoding, `lxml` serializes the tree-representation of the xml
document by encoding it with that encoding. If it is directed to include an
xml declaration, it embeds that encoding in the xml declaration as the
`encoding` property (i.e., `<?xml version='1.0' encoding='iso-8859-7'?>`).

`lxml` allows for some _special_ values as an encoding.

- In python2 those are: `"unicode"` and `unicode`.
- In python3 those are: `"unicode"` and `str`.

By specifying those values, the result will be _decoded_ from bytes to
unicode ("unicode" is not an actual encoding; the actual encoding will be
utf-8). The encoding is already the _type_ of the result. This is why you
are not allowed to have an xml declaration for those cases. The result is
not bytes that have to be read by some encoding rules, but decoded data
whose type dictates how it is managed.

With the latest changes, what we do is:

1. we always encode the result as UTF-8
2. we do not include an xml declaration (because of _(3)_)
3. we convert to the native string type (that is `bytes`/`str` for Python2,
   and `str` for Python3, the equivalent of `unicode` in Python2)

The consumer of the result should expect to treat the result as
utf8-encoded bytes in Python2, and as a utf8-decoded string in Python3.

Signed-off-by: Ivan Kanakarakis <ivan.kanak@gmail.com>
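
The conversion in step 3 can be sketched without `lxml` or `six`. This is a minimal, Python 3 only illustration, not the code from the patch: `to_native_str` is a hypothetical helper name, and `sample_bytes` is a hand-written stand-in for what a serializer might return when asked for UTF-8 output with no xml declaration.

```python
def to_native_str(serialized, encoding="utf-8"):
    """Return `serialized` as `str`, decoding it first if it is bytes.

    Hypothetical helper for illustration; on Python 3 the native string
    type is `str`, so only non-`str` input needs decoding.
    """
    if not isinstance(serialized, str):
        serialized = serialized.decode(encoding)
    return serialized


# Assumed stand-in for serializer output: UTF-8 bytes, no xml declaration.
sample_bytes = "<Issuer>Ελληνικά</Issuer>".encode("utf-8")

# Bytes are decoded to text; the non-ASCII characters survive the round trip.
sample_text = to_native_str(sample_bytes)

# A value that is already `str` is returned unchanged.
same_text = to_native_str(sample_text)
```

Because the decoded value carries no `encoding` property in an xml declaration, nothing in the text can disagree with how the consumer later re-encodes it.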
-rw-r--r-- src/saml2/sigver.py 5
1 file changed, 4 insertions, 1 deletion
diff --git a/src/saml2/sigver.py b/src/saml2/sigver.py
index 6e9ebf9b..0541535a 100644
--- a/src/saml2/sigver.py
+++ b/src/saml2/sigver.py
@@ -957,7 +957,10 @@ class CryptoBackendXMLSecurity(CryptoBackend):
         xml = xmlsec.parse_xml(statement)
         signed = xmlsec.sign(xml, key_file)
-        return lxml.etree.tostring(signed, xml_declaration=True)
+        signed_str = lxml.etree.tostring(signed, xml_declaration=False, encoding="UTF-8")
+        if not isinstance(signed_str, six.string_types):
+            signed_str = signed_str.decode("utf-8")
+        return signed_str
     def validate_signature(self, signedtext, cert_file, cert_type, node_name, node_id, id_attr):
         """