Talk:GM.xmlHttpRequest

From GreaseSpot Wiki
== Altering User Agent Script: Read Only if you care about private browsing ==
I think most people want straightforward instructions on how to alter the user agent, the "referrer", and every other piece of unneeded bullcrap their browser sends to websites, but it's not explained here and really should be.
 
[[User:Usarajent|Usarajent]] 18:10, 13 February 2011 (UTC)
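For reference, header overrides go in the <code>headers</code> field of the details object passed to GM_xmlhttpRequest. A minimal sketch (the URL and header values here are purely illustrative, and whether a particular header can actually be replaced depends on the browser and extension version):
<pre>
// Illustrative sketch: override the headers GM_xmlhttpRequest sends.
GM_xmlhttpRequest({
    method: "GET",
    url: "http://www.example.com/",           // illustrative URL
    headers: {
        "User-Agent": "Mozilla/5.0",          // spoof or anonymize the user agent
        "Referer": "http://www.example.com/"  // control the referrer the site sees
    },
    onload: function (response) {
        GM_log(response.responseText.length + " bytes received");
    }
});
</pre>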
== Core Example Proposal from Aavindraa ==
Reference: [[GM_xmlhttpRequest#Core |Core Example]]<br />
Aavindraa has proposed a change in a comment field to modify the existing reference from:
<code>"User-Agent": "Mozilla/5.0", // Recommend using navigator.userAgent when possible</code>
to
<code>"User-Agent": "Mozilla/5.0", // If not specified, navigator.userAgent is automatically used.</code>

Discuss pros and cons here:

Obliterating the meaning is a concern... I do agree that some sort of merge should be incorporated with this as the normal default behaviour: if this atom is omitted, the browser typically chooses it for the ScriptWright. I do want to make sure that the meaning behind it currently isn't removed. The original contributor didn't clearly note why one would want to do this in the first place (e.g. override the default user agent for either browser anonymity or spoofing another browser type). I would prefer to make this ultra clear along with Aavindraa's change. Comments and suggestions on merging and meaning are appreciated. [[User:Marti|Marti]] 00:20, 15 September 2009 (EDT)
I am very unsure where the "recommendation" currently there came from.  I think the "If not specified..." statement is significantly more clear, and actually represents reality. [[User:Arantius|Arantius]] 21:54, 15 September 2009 (EDT)
: Sounds good, just needed a 2nd opinion before changing its meaning again... It originally said <code>"User-Agent":"monkeyagent"</code> from LouCypher.  As people usually follow the examples given, most people were spoofing the user agent with a GM-specific agent, and sites were beginning to block requests when the agent was set to that... would rather not have a repeat of that. [[User:Marti|Marti]] 23:02, 15 September 2009 (EDT)
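A minimal sketch of the behaviour under discussion (values are illustrative): leave the header out and the browser sends its own <code>navigator.userAgent</code>; include it only when you actually want to override that default.
<pre>
GM_xmlhttpRequest({
    method: "GET",
    url: "http://www.example.com/",  // illustrative URL
    headers: {
        // No "User-Agent" entry here: navigator.userAgent is sent automatically.
        // "User-Agent": "Mozilla/5.0"  // uncomment only to override the default
    },
    onload: function (response) { /* ... */ }
});
</pre>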
== RE: Get Request ==
I'm curious about how Greasemonkey has an XMLHttpRequest object that isn't subject to the same-origin policy. I was looking on MDN but haven't found any documentation on it in particular. I did see some things about sockets and transports... Anyway, what I did was go ahead and make an extension tonight that fires up PHP. My big idea was that worming PHP into an extension would mean I could pipe data through it, because I've got a bunch of scripts for processing web pages and such. Having it in the browser means I don't have to tinker around with curl and accessing sites. Well, in theory anyway... I've only got one-way communication from chrome into PHP. I'm thinking I can whip out some socket scripts for PHP, because I can both listen for and respond to traffic that way. I think it will be good: arbitrary events on the page can trigger my extension, as well as any other "traps" I set which would fire off a cross-domain XMLHttpRequest using Greasemonkey. I suppose I could set it up so that PHP grabbed my cookies straight out of the database and converted them to files cURL could use... Then I suppose it would be like having three different kinds of request: the locked-down one in the page, the non-restricted one in Greasemonkey, and PHP's grab-anything-and-run-it-through-any-transformer-it-supports. It would be fun to get it to interpret scripts on the page... hm... I'd have to disable a lot of... Well. Not now. That would take a long time.
Anyway, what I meant to say was that in the example of the GET request there is a note about only creating a DOM object from XML. I know, it's a real pain to do anything with HTML response text, and I always hear that the two options we have are either to tear out our hair trying to process it with regex, or to cram it into an iframe and deal with the complexity of reflow, script bugs, and a million other things... :D
Look! I haven't seen this anywhere because I made it up the other day. I'm new here, so if I shouldn't be posting this sort of stuff in the wiki for everyone then just show me where to go. :D
The script takes the string of response text and loads it into a separate document. You can use document.evaluate and manipulate this hidden DOM however you want. You just have to call evaluate on that document: say, for instance, you stored the document returned by loadString in a variable called otherDoc; you'd run otherDoc.evaluate(xpathExpression, contextNode, null, XPathResult.ORDERED_NODE_SNAPSHOT_TYPE, null); because it's a different document. You also have to supply otherDoc as the context node if you're not using an element reference from within the hidden DOM. XD Since the hidden DOM is created by document.implementation... well, I don't really know why it hasn't given me trouble about just selecting nodes from the hidden DOM and appending them to nodes on the displayed DOM. I thought I'd get snagged and have to use importNode and adoptNode, but so far I haven't had to. I know you can do the same trick with a document fragment instead of my massive parser script, but the document fragment method will ruin the head section and has a few other side effects. This trick here is the best solution I've been able to find, other than fetching the page with PHP and manipulating it there...
The require script is the content scope runner found right here in the wiki. Check the link if you want. :D
<pre>
// ==UserScript==
// @name          Atropa Toolbox - HTML Parser
// @namespace      http://matthewkastor@atropaIncIntl
// @description    Carry out DOM operations without loading content to the active document.
// @include        *
// @require        http://userscripts.org/scripts/source/68059.user.js
// ==/UserScript==
(function () {
"use strict";
function HTMLParser() {
    // Build a fresh, detached HTML document that is never attached to the page.
    this.newDocument = function () {
        var dt = document.implementation.createDocumentType(
            "html",
            "-//W3C//DTD HTML 4.01 Transitional//EN",
            "http://www.w3.org/TR/html4/loose.dtd");
        this.doc = document.implementation.createDocument('', '', dt);
    };
    // Parse an HTML string into the detached document and return that document.
    this.loadString = function (htmlstring) {
        if (!htmlstring) {
            return false;
        }
        this.newDocument();
        this.doc.appendChild(this.doc.createElement('html'));
        this.doc.documentElement.innerHTML = htmlstring;
        return this.doc;
    };
    // newDocument() returns nothing, so the constructor still yields the new instance.
    this.newDocument();
}
//window.atropa.HTMLParser = HTMLParser;  // uncomment this line and find the parser
                                          // in the same place all the time:
                                          // new atropa.HTMLParser();
}());
</pre>
You use it something like this:
<pre>
var htmlParse = new HTMLParser();
// xmlhttpResponseText is the responseText string from a GM_xmlhttpRequest onload callback.
var otherDoc = htmlParse.loadString(xmlhttpResponseText);
var ps = otherDoc.evaluate(".//div/div//p[@class='someclass']", otherDoc, null, XPathResult.ORDERED_NODE_SNAPSHOT_TYPE, null);
</pre>
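As an aside (not part of the original script): assuming your Firefox build's DOMParser accepts the "text/html" type, much the same hidden document can be produced directly inside the GM_xmlhttpRequest onload callback, for example:
<pre>
GM_xmlhttpRequest({
    method: "GET",
    url: "http://www.example.com/",  // illustrative URL
    onload: function (response) {
        // Parse the HTML response into a detached document, then query it with XPath.
        var otherDoc = new DOMParser().parseFromString(response.responseText, "text/html");
        var ps = otherDoc.evaluate(".//div/div//p[@class='someclass']", otherDoc, null,
                                   XPathResult.ORDERED_NODE_SNAPSHOT_TYPE, null);
        GM_log(ps.snapshotLength + " matching paragraphs found");
    }
});
</pre>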
Anyway guys, I got to go! It's late and I'm sleepy. Send me a note if you want, or if there's a better place for me to write this sort of stuff. :D
Matthew Kastor
matthewkastor@gmail.com
== Making requests from 'native' events ==
I was working on my Greasemonkey script for YouTube and have been struggling with an issue all day. I am not sure whether it's a bug or whether it has to do with security (I guess the latter).
The problem is that YouTube adds a lot of elements dynamically through JavaScript. This causes Greasemonkey to start earlier than I want (I need some elements that are added dynamically). So I added an event listener ('DOMNodeInserted') to the browser's window object to wait until the required elements are loaded. Everything went well until I wanted to execute 'GM_xmlhttpRequest'. By debugging I found out that the request was made, but none of the callbacks ('onload', 'onerror', 'ontimeout') were ever executed.
Here is a little demo I wrote to demonstrate the bug [http://ddaanv.nl/TestMeuk/xmlhttpRequestTest.user.js]
and a video that you could use to test the script. [https://www.youtube.com/watch?v=dQw4w9WgXcQ]
Will this be fixed, or is this intended? (If it stays the way it is, I would at least place a warning in the documentation, because I have been struggling with this for hours.)
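One workaround sketch, untested against this exact report, so treat it as an assumption rather than a fix: defer the privileged call with setTimeout so that GM_xmlhttpRequest does not run directly inside the event handler's call stack.
<pre>
// Assumption: deferring the call out of the DOMNodeInserted handler's stack
// may let the onload/onerror/ontimeout callbacks fire normally.
window.addEventListener('DOMNodeInserted', function (event) {
    setTimeout(function () {
        GM_xmlhttpRequest({
            method: "GET",
            url: "http://www.example.com/",  // illustrative URL
            onload: function (response) {
                GM_log("loaded " + response.responseText.length + " bytes");
            }
        });
    }, 0);
}, false);
</pre>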
