Location hack: Difference between revisions

Revision as of 10:39, 6 December 2007

The location hack is an ugly but useful way to interact with the content scope of the page being user scripted. It does this by indirectly evaling strings within that scope.

Background

For security reasons, Greasemonkey uses XPCNativeWrappers and sandbox to isolate it from the web page. Under this system, the user script can access and manipulate the page using event listeners, the DOM API, and GM_* functions.

Sometimes the sandbox is too limiting, in which case the user script can access other parts of the page using unsafeWindow. As the name unsafeWindow implies, this can often be unsafe, and expose security holes.

In December 2005, Jesse Ruderman came up with the location hack, to be an alternative to unsafeWindow in many cases.

Basic usage: page functions

Suppose the page contains a function called pageFunc, or window.pageFunc. The user script knows this function as unsafeWindow.pageFunc.

The user script could simply call unsafeWindow.pageFunc(), but this can leak the sandbox. Instead, the user script can take advantage of javascript: URLs, which always run in the content scope. Just entering this URL into the browser's location bar does not leak a GreaseMonkey sandbox:

javascript:void(pageFunc())

Similarly, a user script can set location.href to this URL to safely call the function:

location.href = "javascript:void(pageFunc())";

Invoke onclick behavior

Sometimes a userscript wants to simulate the behavior of clicking a link that has an onclick handler. For example, on a YouTube video page (like this) in the video description there is the link more with an onclick handler that as of this writing can be found with the XPath

"//div[@id='videoDetailsDiv']//a[text() = 'more']/@onclick"

and it contains the string:

addClass(_gel('videoDetailsDiv'), 'expanded'); return false;

If the variable onclick is bound to the XPath result, then this handler can be invoked through location thusly:

location.href = "javascript:void((function(){" + onclick.nodeValue + "})());";

Note that the onclick has to be wrapped in a function so that its return has a scope and the whole needs to be wrapped in void.

This example is rather robust because it will still work if YouTube redefines the contents of its onclicks.
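
To see why the wrapping described above is needed, here is a minimal sketch. The helper names wrapHandlerSource and toHandlerUrl are hypothetical and exist only for illustration. A bare return outside a function is a syntax error, and a javascript: URL whose expression yields anything other than undefined makes the browser replace the page with that value, which is why the call is wrapped in void.

```javascript
// Hypothetical helpers for illustration; not part of Greasemonkey.

// Wrap handler source in an immediately invoked function so that a
// "return" inside it is legal, and in void so the result is undefined.
function wrapHandlerSource(handlerSource) {
  return "void((function(){" + handlerSource + "})());";
}

// Prefix the wrapped source so it can be assigned to location.href.
function toHandlerUrl(handlerSource) {
  return "javascript:" + wrapHandlerSource(handlerSource);
}

// In a user script (onclick is the XPath result from above):
//   location.href = toHandlerUrl(onclick.nodeValue);
```

Unlike the inline version, this sketch keeps the string building in one place; it does no escaping, so handler text containing a percent sign would still need encoding.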

Really simulating a click

If a link has one or more anonymous event listeners, or it sits inside an element that has an explicit 'onclick' handler or an anonymous event listener, then just evaluating the link's 'onclick' handler will not simulate a click. One really has to send the link a fake event. Assume the variable link is bound to some link. Then the following code will send it a "click" event.

var evt = document.createEvent("HTMLEvents");
evt.initEvent("click", true, true);
link.dispatchEvent(evt);

Note: The click event is really a concatenation of the mousedown and mouseup events. However I don't know if sending a link a fake "mousedown" followed by a fake "mouseup" will cause it to recognize it as a "click".

Trigger javascript: links

This hack can also be used to trigger javascript: links – simply do

location.href = someLink.href;

Modifying the page

The location hack can do anything a page script or bookmarklet can do, so it can modify content variables and such as easily as it can access them. For example:

location.href = "javascript:void(window.someVariable = 'someValue')";

Executing large blocks of code

Executing more than one statement can become unreadable very easily. Luckily, Javascript can convert functions to strings, so you can use:

location.href = "javascript:(" + function() {
  // do something
} + ")()";

Even though the function is defined in the sandbox, it is not a closure of the sandbox scope. It is converted to a string and then back to a function in page scope. It cannot access anything in the sandbox scope, which is a limitation, but is also essential to making this technique secure.
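
Because only the function's source text crosses over, sandbox values have to be passed explicitly. The sketch below serialises them as arguments; it assumes they are JSON-safe. The helper name buildInjectedCall and the variable sandboxValue are hypothetical.

```javascript
// Hypothetical helper for illustration; not part of Greasemonkey.
// Turns a function and JSON-safe arguments into source text that calls
// the function with those arguments once it is re-parsed in page scope.
function buildInjectedCall(fn, args) {
  var argList = (args || []).map(function (arg) {
    return JSON.stringify(arg);
  }).join(", ");
  return "(" + fn + ")(" + argList + ")";
}

var sandboxValue = "from the sandbox";

// The function must not refer to sandboxValue directly: after the round
// trip through a string it no longer sees sandbox variables.
var source = buildInjectedCall(function (value) {
  return "page received: " + value;
}, [sandboxValue]);

// In a user script:
//   location.href = "javascript:void(" + source + ")";
```

Only plain data survives this way; functions, DOM nodes, and GM_* references cannot be passed.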

Percent encoding issue

Sometimes percent-encoding the percent symbol is required. For example,

location.href = ("javascript:(" + function() {
  var n = 44;
  if(!(n%22)) alert('n is a multiple of 22');
} + ")()");

The above code will cause an error because the browser percent-decodes the URL before running it, so %22 is read as a double quotation mark. The workaround is:

location.href = "javascript:(" + encodeURI(
 function() {
  var n = 44;
  if(!(n%22)) alert('n is a multiple of 22');
 }) + ")()";
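
The effect can be checked outside the browser. The sketch below assumes the browser's percent-decoding of a javascript: URL behaves like decodeURIComponent, which is a simplification; toJavascriptUrl and pageCode are hypothetical names.

```javascript
// Hypothetical helper (not a Greasemonkey API): build a javascript: URL
// from a function, escaping its source so "%" survives URL decoding.
function toJavascriptUrl(fn) {
  return "javascript:(" + encodeURI(String(fn)) + ")()";
}

var pageCode = function () {
  var n = 44;
  return !(n%22);
};

// Unescaped: percent-decoding turns "%22" into a double quote,
// which corrupts the script.
var broken = decodeURIComponent("javascript:(" + pageCode + ")()");

// Escaped: "%" is sent as "%25", so decoding restores the original source.
var safeUrl = toJavascriptUrl(pageCode);
var restored = decodeURIComponent(safeUrl);
```

Here broken contains n") where the source had n%22, while restored is identical to the unescaped URL text.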

Returning values

Functions called through the location hack cannot return data directly to the user script scope. To communicate between location hack code and regular user script code, data must be placed where the user script can see it, for example, by writing it into the DOM, or by triggering an event. A simple example:

var oldBodyTitle = document.body.title;
location.href = "javascript:void(document.body.title = pageFunc())";
var fauxReturnValue = document.body.title;
document.body.title = oldBodyTitle;

Function to get values of global variables

The following function can be used to access the values of global variables using var value = getGlobalValue('globalVariable');. Please note that the returned value is converted to a string.

// function to get values of global variables using the "location hack"
window.GM_getGlobalElement = null;
window.getGlobalValue = function(name) {
  if (GM_getGlobalElement == null) {
    GM_getGlobalElement = document.createElement("textarea");
    GM_getGlobalElement.id = "GM_getGlobalElement";
    GM_getGlobalElement.style.visibility = "hidden";
    GM_getGlobalElement.style.display = "none";
    document.body.appendChild(GM_getGlobalElement);
  }
  location.href = "javascript:void(typeof("+ name + ")!=\"undefined\"?document.getElementById(\"GM_getGlobalElement\").value=" + name + ":null)";
  return(GM_getGlobalElement.value);
}
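
Since name is pasted straight into the URL, any text passed in runs as page code. A minimal sketch of a stricter URL builder follows, assuming only simple identifiers need to be read; buildGlobalReadUrl and SIMPLE_IDENTIFIER are hypothetical names, not part of Greasemonkey.

```javascript
// Hypothetical helper for illustration; not part of Greasemonkey.
// Accepts only a single plain identifier (ASCII letters, digits, _ and $).
var SIMPLE_IDENTIFIER = /^[A-Za-z_$][\w$]*$/;

// Builds the same URL as getGlobalValue above, but refuses names that
// are not plain identifiers, so arbitrary code cannot be injected.
function buildGlobalReadUrl(name, elementId) {
  if (!SIMPLE_IDENTIFIER.test(name)) {
    throw new Error("Not a plain global name: " + name);
  }
  return "javascript:void(typeof(" + name + ")!=\"undefined\"?" +
    "document.getElementById(" + JSON.stringify(elementId) + ").value=" +
    name + ":null)";
}

// In a user script:
//   location.href = buildGlobalReadUrl("someVariable", "GM_getGlobalElement");
```

Dotted paths such as a.b are rejected on purpose: typeof(a.b) throws if a itself is not defined.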

GM_eval vs. eval(s, unsafeWindow) vs. LocationHack

JavaScript in Mozilla/Gecko/Firefox is currently implemented by the SpiderMonkey engine written in C. It implements an eval function that takes an optional second argument to give the scope or context for the eval.

eval(string[, object])

It seems the two-argument eval can do all or more than the location-hack.

The main question is how secure it is in terms of giving a web page control over a browser. (Note that there are problems with this issue but particulars are purposefully rarely mentioned). I can't see how a JS statement like the following could leak any GM access to a web page.

eval(s, unsafeWindow);

But then, the following might be unsafe, particularly if r is later stored somewhere or more particularly if r gets some methods invoked off of it:

var r = eval(s, unsafeWindow);

The obvious solution to make r safe from potential terrors is to wrap it in a XPCNativeWrapper just like GreaseMonkey does.

r = new XPCNativeWrapper(eval(s, unsafeWindow));

Objections to: eval(s, unsafeWindow)

It is not as obfuscated as sending the string 'javascript:...' to location.href and is thus not as cool.
2-argument eval is not standard JavaScript
- While the true GreaseMonkey is based on Mozilla code, there are several other implementations for different browsers with different script engines, and those might not implement 2-argument eval.
- Mozilla JavaScript may migrate to standards such as ECMAScript for XML (E4X) and/or move to the open source Tamarin engine, which Adobe wants. These new versions of JavaScript might not implement 2-argument eval (though I doubt it).
Script writers who directly use eval may break security unless they take the extra step to wrap the result.

Advantages of: eval(s, unsafeWindow)

It is less verbose than the location hack. You don't have to wrap strings in things like

javascript:void(...)

It permits returning a value in a way that the target page cannot detect.
You don't have to worry whether you need to encode special characters like with the location-hack.

GM_eval

To resolve this distinction I propose that the GM API include the function GM_eval. In Mozilla GreaseMonkey it is just defined as something like

function GM_eval (string) {return new XPCNativeWrapper(eval(string, unsafeWindow));}

For other browsers/script-engines that don't have 2-argument eval one might want to make it a 2-argument function:

GM_eval(string, [boolean]);

that uses the location-hack and wraps and encodes the string appropriately. If the second argument is true then that means the function should use some detectable means to record the result inside the target document and return it or just possibly complain with a security error.

Advantages:

Using GM_eval semantically indicates that you are safely evaluating something on the target page as opposed to going through some hack back door.
You don't need to extra wrap or encode things like for the location-hack.
GM_eval would really simplify the documentation of GreaseMonkey. For example this page would disappear. There would be one page on GM_eval that explains what it means and how it can be implemented through eval or the location-hack, and most of the other stuff on this page could go into "code snippets". [Though I have noticed some other references to location-hack that should be cleaned up]

Disadvantages

None seen so far other than increasing the API count by one.

@@ Line 1: / Line 1: @@
-The '''location hack''' is an ugly but useful way to interact with the content scope of the page being [[user script]]ed.
+The '''location hack''' is an ugly but useful way to interact with the content scope of the page being [[user script]]ed. It does this by indirectly [http://developer.mozilla.org/en/docs/Core_JavaScript_1.5_Reference:Global_Functions:eval evaling] strings within that scope.
 == Background ==
@@ Line 13: / Line 13: @@
 Suppose the page contains a function called <code>pageFunc</code>, or <code>window.pageFunc</code>. The user script knows this function as <code>unsafeWindow.pageFunc</code>.
-The user script could simply call <code>unsafeWindow.pageFunc()</code>, but this can leak the sandbox. Instead, the user script can take advantage of javascript: URLs, which always run in the content scope. Just entering this URL into the browser's location bar does not leak a Greasemonkey sandbox:
+The user script could simply call <code>unsafeWindow.pageFunc()</code>, but this can leak the sandbox. Instead, the user script can take advantage of javascript: URLs, which always run in the content scope. Just entering this URL into the browser's location bar does not leak a GreaseMonkey sandbox:
   javascript:void(pageFunc())
@@ Line 20: / Line 20: @@
   location.href = "javascript:void(pageFunc())";
 == Invoke onclick behavior ==
@@ Line 115: / Line 114: @@
     return(GM_getGlobalElement.value);
   }
+== GM_eval vs. eval(s, unsafeWindow) vs. LocationHack ==
+JavaScript in Mozilla/Gecko/Firefox is currently implemented by the [http://developer.mozilla.org/en/docs/SpiderMonkey SpiderMonkey engine written in C]. It implements an [http://developer.mozilla.org/en/docs/Core_JavaScript_1.5_Reference:Global_Functions:eval eval] function that takes an optional second argument to give the scope or context for the <code>eval</code>.
+ eval(string[, object])
+It seems the two-argument <code>eval</code> can do all or more than the ''location-hack''.
+The main question is how secure it is in terms of giving a web page control over a browser. (Note that there are problems with this issue but particulars are purposefully rarely mentioned).
+I can't see how a JS statement like the following could leak any GM access to a web page.
+ eval(s, unsafeWindow);
+But then, the following might be unsafe, particularly if <code>r</code> is later stored somewhere or more particularly if <code>r</code> gets some methods invoked off of it:
+ var r = eval(s, unsafeWindow);
+The obvious solution to make <code>r</code> safe from potential terrors is to wrap it in a [http://developer.mozilla.org/en/docs/XPCNativeWrapper XPCNativeWrapper] just like GreaseMonkey does.
+ r = new XPCNativeWrapper(eval(s, unsafeWindow));
+=== Objections to: eval(s, unsafeWindow) ===
+* It is not as obfuscated as sending the string <code>'javascript:...'</code> to <code>location.href</code> and is thus not as '''cool'''.
+* 2-argument <code>eval</code> is not standard JavaScript
+** While the '''true GreaseMonkey''' is based on Mozilla code, there are several other implementations for different browsers with different script engines, and those might not implement 2-argument <code>eval</code>.
+** Mozilla JavaScript may migrate to standards such as [http://developer.mozilla.org/en/docs/E4X ECMAScript for XML (E4X)] and/or move to the open source [http://www.mozilla.org/projects/tamarin/ Tamarin engine], which Adobe wants. These new versions of JavaScript might not implement 2-argument eval (though I doubt it).
+* Script writers who directly use <code>eval</code> may break security unless they take the extra step to wrap the result.
+=== Advantages of: eval(s, unsafeWindow) ===
+* It is less verbose than the location hack. You don't have to wrap strings in things like
+ javascript:void(...)
+* It permits returning a value in a way that the target page cannot detect.
+* You don't have to worry whether you need to  [http://wiki.greasespot.net/index.php?title=Location_hack&action=submit#Percent_encoding_issue encode special characters] like with the ''location-hack''.
+=== GM_eval ===
+To resolve this distinction I propose that the GM API include the function GM_eval. In Mozilla GreaseMonkey it is just defined as something like
+ function GM_eval (string) {return new XPCNativeWrapper(eval(string, unsafeWindow));}
+For other browsers/script-engines that don't have 2-argument eval one might want to make it a 2-argument function:
+ GM_eval(string, [boolean]);
+that uses the ''location-hack'' and wraps and encodes the string appropriately. If the second argument is ''true'' then that means the function should use some detectable means to record the result inside the target document and return it or just possibly complain with a security error.
+==== Advantages: ====
+* Using <code>GM_eval</code> semantically indicates that you are safely evaluating something on the target page  as opposed to going through some hack back door.
+* You don't need to extra wrap or encode things like for the ''location-hack''.
+* <code>GM_eval</code> would really simplify the documentation of GreaseMonkey. For example this page would disappear. There would be one page on <code>GM_eval</code> that explains what it means and how it can be implemented through <code>eval</code> or the ''location-hack'', and most of the other stuff  on this page could go into "code snippets". [Though I have noticed some other references to ''location-hack'' that should be cleaned up]
+==== Disadvantages ====
+None seen so far other than increasing the API count by one.