Is URLEncoder.encode(string, "UTF-8") a poor validation?

Question:

In a portion of my J2EE/Java code, I URL-encode the output of getRequestURI() to sanitize it and prevent XSS attacks, but Fortify SCA considers that poor validation.

Why?

Answer 1:

The key point is that you need to convert HTML special characters to HTML entities. This is also called "HTML escaping" or "XML escaping". Basically, the characters <, >, ", & and ' need to be replaced by &lt;, &gt;, &quot;, &amp; and &#39;, respectively.

URL encoding does not do that. URL encoding converts URL special characters to percent-encoded values. This is not HTML escaping.
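To make the difference concrete, here is a minimal sketch that runs a small piece of markup through `URLEncoder`. The class name `UrlEncodingDemo` and its helper methods are made up for illustration, and the `Charset` overloads used here assume Java 10 or later (on older versions, the `String` overload from the question behaves the same way but throws a checked exception).

```java
import java.net.URLDecoder;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of any library.
final class UrlEncodingDemo {

    // Percent-encodes a value for use in a URL query string (Java 10+ overload).
    static String urlEncode(String value) {
        return URLEncoder.encode(value, StandardCharsets.UTF_8);
    }

    // Reverses urlEncode(); anything that URL-decodes the value gets the raw text back.
    static String urlDecode(String value) {
        return URLDecoder.decode(value, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String input = "<b>Tom & Jerry</b>";

        String encoded = urlEncode(input);
        // prints: %3Cb%3ETom+%26+Jerry%3C%2Fb%3E
        System.out.println(encoded);

        // HTML escaping would instead yield: &lt;b&gt;Tom &amp; Jerry&lt;/b&gt;
        // The percent-encoded form is meant for URLs, so a reader of the page sees
        // mangled text, and one URL-decode step restores the original markup.
        System.out.println(urlDecode(encoded));
    }
}
```

The output is safe to embed in a URL, but it is not the text you want to show in a page, which is why a scanner does not accept it as HTML output validation.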

In web applications, HTML escaping is normally done on the view side, exactly where you're redisplaying user-controlled input. In a Java EE web application, how you do that depends on the view technology you're using.

<ol><li>

If the webapp is using modern Facelets view technology, then you don't need to escape it yourself. Facelets will already implicitly do that.

</li> <li>

If the webapp is using legacy JSP view technology, then you need to ensure that you're using the JSTL <c:out> tag or the fn:escapeXml() function to redisplay user-controlled input.

<pre class="lang-xml prettyprint-override"><c:out value="${bean.foo}" />
<input type="text" name="foo" value="${fn:escapeXml(param.foo)}" /></pre>

</li> <li>

If the webapp is very old or badly designed and uses servlets or <em>scriptlets</em> to print HTML, then you have a bigger problem. There are no built-in tags or functions, let alone Java methods, that can escape HTML entities. You should either write an escape() method yourself or use the Apache Commons Lang <a href="http://commons.apache.org/lang/api-2.5/org/apache/commons/lang/StringEscapeUtils.html#escapeHtml%28java.lang.String%29" rel="nofollow">StringEscapeUtils#escapeHtml()</a> for this. Then you need to ensure that you're using it everywhere you print user-controlled input.

<pre class="lang-java prettyprint-override">out.print("<p>" + StringEscapeUtils.escapeHtml(request.getParameter("foo")) + "</p>");</pre>

Much better would be to redesign that legacy webapp to use JSP with JSTL.

</li> </ol>
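For the servlet or scriptlet case in the last item, a hand-written escape method could look roughly like the sketch below. The class name `HtmlEscaper` and the method name `escapeHtml` are placeholders chosen for this example. It covers only the five characters listed at the top of this answer, so treat it as an illustration and not as a replacement for a maintained library.

```java
// Hypothetical helper, written for illustration only.
final class HtmlEscaper {

    // Replaces <, >, ", & and ' with their HTML entities; returns "" for null.
    static String escapeHtml(String input) {
        if (input == null) {
            return "";
        }
        StringBuilder out = new StringBuilder(input.length() + 16);
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            switch (c) {
                case '<':  out.append("&lt;");   break;
                case '>':  out.append("&gt;");   break;
                case '"':  out.append("&quot;"); break;
                case '&':  out.append("&amp;");  break;
                case '\'': out.append("&#39;");  break;
                default:   out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // prints: <p>&lt;script&gt;alert(&#39;x&#39;)&lt;/script&gt;</p>
        System.out.println("<p>" + escapeHtml("<script>alert('x')</script>") + "</p>");
    }
}
```

In a servlet you would call it the same way as the StringEscapeUtils example, wrapping every user-controlled value at the point where it is printed.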

Answer 2:

Depending on the implementation, URL encoding leaves certain significant characters unchanged, including the single quote (') and parentheses, so certain payloads pass through intact.

For example,

onload'alert(String.fromCharCode(120))'

will be treated by some browsers as a valid attribute that can result in code execution when injected inside a tag.

The best way to avoid XSS is to treat all untrusted input as plain text, and then, when composing your output, encode that plain text for the context it is written into.

If you want to filter inputs as an additional layer of security, make sure your filter treats all quotes (including the back-tick) and parentheses as possible code, and disallow them unless they make sense for that input.
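Such a filter could be sketched as follows. The class name `InputFilter`, the method `isAllowed`, and the exact character set are assumptions made for this example. A real filter would be tuned per field, and it only supplements output encoding; it does not replace it.

```java
// Hypothetical input filter, shown only to illustrate the idea.
final class InputFilter {

    // Characters treated as possible code: single quote, double quote,
    // back-tick and both parentheses.
    private static final String SUSPICIOUS = "'\"`()";

    // Returns true only if the input is non-null and contains none of the
    // suspicious characters.
    static boolean isAllowed(String input) {
        if (input == null) {
            return false;
        }
        for (int i = 0; i < input.length(); i++) {
            if (SUSPICIOUS.indexOf(input.charAt(i)) >= 0) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // prints: false
        System.out.println(isAllowed("onload'alert(String.fromCharCode(120))'"));
        // prints: true
        System.out.println(isAllowed("plain search term"));
    }
}
```

Fields that legitimately contain these characters, such as names with apostrophes, need their own rule, which is the "unless they make sense for that input" part of the advice.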
