Uri.ToString() can decode URL-encoded characters in the URL, which is especially troublesome in the query part.
See the article from Anders Abel that discusses the differences between framework 4.0 and framework 4.5 https://coding.abel.nu/2014/10/beware-of-uri-tostring/
As shown in this article, more changes have been introduced in the newer frameworks.
I have done some testing by setting different targetFramework in the httpRuntime element of web.config.
The tests create a URI with a single query parameter, iterate over all 256 single-byte values (%00-%FF, so not just ASCII), and call ToString() on each.
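    for (var i = 0; i < 16 * 16; i++)
    {
        // Percent-encode the byte value, e.g. %3D
        var enc = $"%{i:X2}";
        var uri = new Uri($"https://some.site/?key={enc}");
        // Print the result to see whether the encoding was kept
        Console.WriteLine($"{enc} -> {uri.ToString()}");
    }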
The tests show that some encodings are decoded while others are kept. The following table shows the differences between the frameworks: x marks the characters that are decoded, and a blank cell means the encoding is kept even after ToString().
targetFramework | decoded | Control 00-1F | space 20 | ! 21 | " 22 | # 23 | $ 24 | % 25 | & 26 | '()*+, 27-2C | - 2D | . 2E | / 2F | 0-9 30-39 | : 3A | ; 3B | < 3C | = 3D | > 3E | ? 3F | @ 40 | A-Z 41-5A | [ 5B | \ 5C | ] 5D | ^_`a-z{\|}~ 5E-7E | DEL 7F | extended 80-FF |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
4.0 | 126 | x | x | x | x |  | x |  | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x |  |  |
4.5 - 4.7.1 | 83 |  | x | x | x |  |  |  |  | x | x | x | x | x | x |  | x |  | x |  |  | x | x |  | x | x |  |  |
4.7.2 - 4.8 | 75 |  | x |  | x |  |  |  |  |  | x | x | x | x |  |  | x |  | x |  |  | x |  |  |  | x |  |  |
Summary
ToString always decodes
%20, %22, %2D, %2E, %2F, %30-%39, %3C, %3E, %41-%5A, %5E-%7E
(space) " - . / 0-9 < > A-Z ^ _ ` a-z { | } ~
I'm not sure that I would like the %20 (whitespace) to be decoded, but that's perhaps a personal preference. It does improve readability, but for programmatic use I'm not a fan.
4.5-4.7.1 also decodes
%21, %27-%2C, %3A, %5B, %5D
!'()*+,:[]
4.0 also decodes
%00-%1F, %24, %26, %3B, %3D, %3F, %40, %5C
Control characters $&;=?@\
4.0 is problematic: an encoded & or = (%26, %3D) inside a value will be decoded, so a query string embedded in a value can no longer be told apart from the outer query string.
Example
https://some.site/?key=key%3Dvalue
will be decoded to
https://some.site/?key=key=value
which will mess up the whole meaning
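To make the consequence concrete, here is a minimal sketch that parses a query string before and after such decoding. It assumes `System.Web.HttpUtility.ParseQueryString` is available (a reference to System.Web on .NET Framework). The class name `EmbeddedQueryDemo`, the helper `CountParameters`, and the sample value `a%26b%3Dc` are made up for illustration. The decoded string is written by hand to mimic what 4.0 would return, since the actual ToString() output depends on the target framework.

```csharp
using System;
using System.Web; // HttpUtility.ParseQueryString

public static class EmbeddedQueryDemo
{
    // Hypothetical helper: number of distinct parameter names in a query
    // string (passed without the leading '?').
    public static int CountParameters(string query)
    {
        return HttpUtility.ParseQueryString(query).Count;
    }

    public static void Main()
    {
        // Encoding kept: a single parameter whose value is "a&b=c".
        var kept = "key=a%26b%3Dc";
        Console.WriteLine($"{kept}: {CountParameters(kept)} parameter(s)");

        // Hand-written stand-in for the string after %26 and %3D have been
        // decoded: the value is cut short and a second parameter appears.
        var decoded = "key=a&b=c";
        Console.WriteLine($"{decoded}: {CountParameters(decoded)} parameter(s)");
    }
}
```

With the encoding kept, the parser sees one parameter; after decoding it sees two, `key` and `b`, and the original value is lost.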
Workaround
If you want to avoid any decoding of URLs, you can use the following extension method. It returns AbsoluteUri for an absolute URI and OriginalString for a relative one.
    // Must be declared inside a static class to work as an extension method
    public static string ToUrl(this Uri uri)
    {
        if (uri == null) return null;
        if (uri.IsAbsoluteUri) return uri.AbsoluteUri;
        return uri.OriginalString;
    }
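As a usage sketch, the extension method can be placed in a static class and called on both absolute and relative URIs. The class names `UriExtensions` and `ToUrlDemo` and the relative path `/path` are assumptions for illustration only.

```csharp
using System;

public static class UriExtensions
{
    // Same method as above, wrapped in a static class so it compiles
    // as an extension method.
    public static string ToUrl(this Uri uri)
    {
        if (uri == null) return null;
        if (uri.IsAbsoluteUri) return uri.AbsoluteUri;
        return uri.OriginalString;
    }
}

public static class ToUrlDemo
{
    public static void Main()
    {
        var absolute = new Uri("https://some.site/?key=key%3Dvalue");
        var relative = new Uri("/path?key=key%3Dvalue", UriKind.Relative);

        // AbsoluteUri keeps the %3D in the query.
        Console.WriteLine(absolute.ToUrl()); // prints https://some.site/?key=key%3Dvalue

        // OriginalString returns exactly what was passed to the constructor.
        Console.WriteLine(relative.ToUrl()); // prints /path?key=key%3Dvalue
    }
}
```

The relative branch is needed because AbsoluteUri throws an InvalidOperationException when called on a relative URI.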
Conclusion
- Don't use target framework 4.0; switch to at least 4.5.
- If you want to avoid decoding, don't use Uri.ToString(). It will always decode some encoded characters (even if newer frameworks are better at leaving alone the characters that might get you in trouble).