Why must I specify charset attributes for <script> tags?

Question:

I have a bit of an odd situation:

<ol><li>Main HTML page is served in UTF-16 character set (due to some requirements out-of-scope for this question)</li> <li>HTML page uses <script> tags to load external scripts (i.e. they have src attributes)</li> <li>Those external scripts are in US-ASCII/UTF-8</li> <li>The web server is serving the scripts with the content-type "application/javascript" with no character set hints</li> <li>The scripts have no byte-order-mark (BOM)</li> </ol>

When loading the page described above, both Firefox and Chrome (current versions) throw errors saying that the first character of each script file is invalid.

Looking at the "Network" tabs of the respective dev-tools views shows the files are just fine (they render in the previewer just fine).

My conclusion was that the browsers are becoming confused as to what the encoding should be for "the whole page" or some similar foolishness.

So I tried adding a charset="UTF-8" attribute to the <script> tags and that seems to solve the problem.

But I <em>really</em> shouldn't have to do that, should I?

First of all, the server is telling the client what the document's type is. It's application/javascript and doesn't specify a character set. (Indeed, the <a href="https://tools.ietf.org/html/rfc2045" rel="nofollow">RFC</a> says that charset is only applicable to text/* MIME-types). Okay, I can understand why there might be some ambiguity, there.

But the document-type is javascript, and there are some obvious rules for how to handle a javascript file whose actual charset you don't know. For example, if it's got a BOM, then use it. If there isn't any BOM, it should be really easy to tell UTF-16 from UTF-8. (Note that there doesn't seem to be any problem on these same pages with loading CSS files, which are also in the same situation as the scripts.)

Lastly, the enclosing page shouldn't have to know what the encoding of its dependencies is. In fact, it might be <em>impossible</em> for it to know, and explicitly specifying the charset then tightly couples the page to its dependencies and vice versa.

Is there a way to get the browser to correctly detect the character set of these dependencies without specifying the charset in the page itself?

Answer1:

Without a BOM in the file, or an explicit charset in the <script> or Content-Type for the file, the encoding of the file is ambiguous. The browser <em>might</em> assume UTF-8 (and should, per <a href="https://tools.ietf.org/html/rfc4329" rel="nofollow">RFC 4329</a>), but if the script contains any non-ASCII characters that are not actually encoded in UTF-8, the file won't process properly.

However, HTML 5 Section 4.11 dictates that a <script>'s fallback encoding is the document's encoding if the <script> does not have a charset attribute. The fallback takes effect if there is no BOM or charset to specify the file's actual encoding.
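
To make the fallback concrete, here is a minimal sketch (runnable in Node.js) of the decision order described above, and of what happens to BOM-less ASCII bytes when the document's UTF-16 encoding wins. The helper names `pickScriptEncoding` and `decodeScript` are hypothetical, and this is a simplified model for illustration, not the browser's actual algorithm.

```javascript
// Hypothetical helper: choose the encoding for an external script in the
// order described above: BOM first, then an explicitly declared charset,
// then the enclosing document's encoding as the fallback.
function pickScriptEncoding(bytes, declaredCharset, documentEncoding) {
  if (bytes[0] === 0xef && bytes[1] === 0xbb && bytes[2] === 0xbf) return "utf-8";
  if (bytes[0] === 0xff && bytes[1] === 0xfe) return "utf-16le";
  if (bytes[0] === 0xfe && bytes[1] === 0xff) return "utf-16be";
  return (declaredCharset || documentEncoding).toLowerCase();
}

// Hypothetical helper: decode the script bytes with the chosen encoding.
function decodeScript(bytes, declaredCharset, documentEncoding) {
  const encoding = pickScriptEncoding(bytes, declaredCharset, documentEncoding);
  return new TextDecoder(encoding).decode(bytes);
}

// An ASCII/UTF-8 script with no BOM, as in the question.
const source = new TextEncoder().encode("var answer = 42;");

// No declared charset: the UTF-16 document encoding is used, so every two
// ASCII bytes are fused into one unrelated UTF-16 code unit.
const misread = decodeScript(source, null, "utf-16le");

// Declared charset (attribute or Content-Type): the bytes decode as intended.
const correct = decodeScript(source, "utf-8", "utf-16le");

console.log(JSON.stringify(misread)); // not valid JavaScript source
console.log(correct);
```

In the misread case the first two bytes, `v` (0x76) and `a` (0x61), become the single code unit U+6176, which would explain why the parser complains about the very first character of the file.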

So, either make sure your HTML and JS files are always using the same encoding, or else you have to be explicit about the JS file's charset, one way or the other.
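
If the goal is to keep the charset out of the page itself, the two remaining options named above (a charset in the Content-Type, or a BOM in the file) can both be applied on the server side. The following is a rough Node.js sketch under that assumption; `scriptResponseHeaders` and `withUtf8Bom` are hypothetical helper names, and how you hook them into your actual web server depends on your setup.

```javascript
// The three bytes of the UTF-8 byte-order mark.
const UTF8_BOM = Buffer.from([0xef, 0xbb, 0xbf]);

// Option 1 (hypothetical helper): declare the charset in the response
// header that accompanies the script.
function scriptResponseHeaders() {
  return { "Content-Type": "application/javascript; charset=utf-8" };
}

// Option 2 (hypothetical helper): prepend a UTF-8 BOM to the file's bytes,
// unless the file already starts with one.
function withUtf8Bom(fileBytes) {
  const hasBom =
    fileBytes.length >= 3 && fileBytes.subarray(0, 3).equals(UTF8_BOM);
  return hasBom ? fileBytes : Buffer.concat([UTF8_BOM, fileBytes]);
}

const original = Buffer.from("var answer = 42;", "utf8");
const marked = withUtf8Bom(original);

console.log(scriptResponseHeaders()["Content-Type"]);
console.log(marked.length - original.length); // number of bytes added
```

Either approach lets the script describe its own encoding, so the enclosing page does not need to know it.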
