htmldocument

How to supply cookie to Java HTMLDocument?

I'm trying to read a web site as an HTMLDocument, and the site requires either a cookie from a previous logon or a response to a popup dialog. I think supplying the necessary cookie is the easiest option, but I haven't found a way to do that. The code to open and read the document is: URL url = new URL(suppliedURL); ...
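One possible approach, sketched under assumptions because the rest of the asker's code is not shown: open a URLConnection from the URL, set a "Cookie" request header before reading, and feed the response into an HTMLEditorKit. The class name CookieHtmlLoader, the helper cookieHeader, the cookie names in the usage note, and the UTF-8 charset are all hypothetical choices for illustration, not part of the original question.

```java
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.net.URL;
import java.net.URLConnection;
import java.nio.charset.StandardCharsets;
import javax.swing.text.BadLocationException;
import javax.swing.text.html.HTMLDocument;
import javax.swing.text.html.HTMLEditorKit;

public class CookieHtmlLoader {

    /** Joins name/value pairs into a Cookie header value such as "a=1; b=2". */
    static String cookieHeader(String... nameValuePairs) {
        if (nameValuePairs.length % 2 != 0) {
            throw new IllegalArgumentException("expected name/value pairs");
        }
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < nameValuePairs.length; i += 2) {
            if (sb.length() > 0) {
                sb.append("; ");
            }
            sb.append(nameValuePairs[i]).append('=').append(nameValuePairs[i + 1]);
        }
        return sb.toString();
    }

    /** Fetches suppliedURL with the given Cookie header and parses it into an HTMLDocument. */
    static HTMLDocument load(String suppliedURL, String cookies)
            throws IOException, BadLocationException {
        URL url = new URL(suppliedURL);
        URLConnection conn = url.openConnection();
        // Request headers must be set before the connection is actually opened.
        conn.setRequestProperty("Cookie", cookies);

        HTMLEditorKit kit = new HTMLEditorKit();
        HTMLDocument doc = (HTMLDocument) kit.createDefaultDocument();
        // Assumption: the charset is fixed to UTF-8 here, so skip in-page charset directives.
        doc.putProperty("IgnoreCharsetDirective", Boolean.TRUE);
        try (Reader reader =
                new InputStreamReader(conn.getInputStream(), StandardCharsets.UTF_8)) {
            kit.read(reader, doc, 0);
        }
        return doc;
    }
}
```

A call might look like `CookieHtmlLoader.load(suppliedURL, CookieHtmlLoader.cookieHeader("JSESSIONID", "abc123"))`, where the cookie name and value would have to come from the earlier logon.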

Caching and re-using tree of HtmlElement objects

I am using the WebBrowser control in my project to display complex HTML documents that are generated/manipulated at runtime. I have noticed that constructing the DOM programmatically from C# by creating HtmlElement objects is about 3x slower than generating an HTML string and passing it to the WebBrowser, which in turn parses it to ge...

Removing HtmlElement objects programmatically using C#

In a WebBrowser control, how do I remove HtmlElement objects? There are no methods in the HtmlElement class to accomplish this. As a workaround, I can create a "dummy" HtmlElement (without inserting it into the HtmlDocument), into which I then insert (via AppendChild) the HtmlElement objects to be removed. This feels like a hack. Is ...

Building HtmlElement object trees

I'm using the MSIE WebBrowser control in a C# desktop application and am looking for a way to build and maintain trees of HtmlElement objects outside of this control. I am trying to quickly switch between multiple complex pages without incurring the overhead of re-parsing the HTML each time (and I don't want to maintain multiple control...

How can I change the color of a particular element of an HTMLDocument in a JEditorPane?

I basically want to implement changing the color of the links when I hover over them. The HyperlinkEvent that is triggered when I mouse over the link hands me the HTML element, but it won't let me set any style attributes on it, and I can't figure out how to get the elements that do have settable attributes. ...

C#: HtmlDocument object has no constructor?

What's up with that? It seems the only way to get a working HtmlDocument object is copying the Document property of an mshtml/webbrowser control. But spawning that is sloooooooooooow. I'd like to avoid writing my own HTML parser and HtmlAgilityPack is copyleft. Are there other sources of getting an instantiated HtmlDocument that I can ...

HTMLDocument: does Swing "optimize out" span elements?

I'm messing about with HTMLDocument in a JTextPane in Swing. If I have this situation: <html>... <p id='paragraph1'><span>something</span></p> <span id='span1'><span>something else</span></span> ...</html> (the extra <span> tags are to prevent Swing from complaining that I can't change the innerHTML of a leaf) or this situ...

mshtml.HTMLDocumentClass in C#

In C#, I managed to get the entire HTMLDocumentClass from an InternetExplorer object (navigating to a certain URL). However, in Visual Studio 2008's debug mode, the content of this HTMLDocumentClass for this particular URL is MASSIVE, including attributes like activeElement, alinkColor, all, applets, charset, childNodes, etc., etc., etc. ...

Q about HTMLDocument, HTMLEditorKit, and blank spaces

When I run the following code: import java.io.IOException; import java.io.Reader; import java.io.StringReader; import javax.swing.text.BadLocationException; import javax.swing.text.EditorKit; import javax.swing.text.Element; import javax.swing.text.html.HTMLDocument; import javax.swing.text.html.HTMLEditorKit; . . . St...

How do I copy a DOM subtree between two WebBrowser HtmlDocuments?

A WinForm application. I want to scrape a part of an HTML web page and save it into a local html file. I have one local file, "empty.htm" (containing just "I'm empty" in the body), one remote web page, and two WebBrowser controls. WebBrowser1 navigates to the remote page, WebBrowser2 to the local file. Both display their content appro...

Regex challenge - find "foobar" in HTML document

I have a fairly long and complex HTML document, and I need to find all occurrences of a given string, e.g. "foobar", unless it's between <a> and </a> anchor tags. The trouble is: it could be inside some text between the anchor tags, e.g. <a>this is a foobar test</a> and even in this case, I should not find the match. How can I do th...
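A minimal sketch of one common regex technique for this kind of requirement: match whole anchor elements first so they are consumed and skipped, and capture the search string only in the other branch of the alternation. The class and method names are hypothetical. It assumes anchors are not nested and are always closed; regular expressions are fragile on real-world HTML, so a parser may be the safer choice for complex documents.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class OutsideAnchorFinder {

    /** Returns the start offsets of needle in html, ignoring text inside <a>...</a>. */
    static List<Integer> findOutsideAnchors(String html, String needle) {
        // Branch 1 swallows an entire anchor element; branch 2 captures the needle.
        Pattern p = Pattern.compile(
                "(?s)<[aA]\\b[^>]*>.*?</[aA]\\s*>|(" + Pattern.quote(needle) + ")");
        Matcher m = p.matcher(html);
        List<Integer> offsets = new ArrayList<>();
        while (m.find()) {
            // Group 1 is only set when the match came from outside an anchor.
            if (m.group(1) != null) {
                offsets.add(m.start(1));
            }
        }
        return offsets;
    }

    public static void main(String[] args) {
        String html = "foobar <a>this is a foobar test</a> and foobar";
        System.out.println(findOutsideAnchors(html, "foobar"));
    }
}
```

The ordering of the alternation matters: because the anchor branch is tried first at each position, any occurrence inside an anchor is consumed as part of the anchor match and never reaches the capturing branch.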

Extract a web document using C#

Hi, I am trying to get data from a web page using C#. So far this is my code: WebBrowser wb = new WebBrowser(); wb.Url = new Uri("http://www.microsoft.com"); HtmlDocument doc = wb.Document; MessageBox.Show(doc.ToString()); Unfortunately wb remains null and the Url property never gets set. Can anyone help me please? Thanks ...

In VB how can I use a website url to create an HtmlDocument object that contains all the html from that webpage?

I was trying to use HtmlDocument and a given URL to pull in the HTML contents of a website. However, there is no constructor for HtmlDocument and its Url property is read-only. Is there any way to create an object that contains the entire DOM for a given URL? Thanks, Matt ...

ASP.Net Page abstraction

We have a Windows application that shows a web form in a web browser. To get data from this web form, we use a hidden text box and read its text through the HtmlDocument object of the web browser control. I want to make an abstraction of this web form that has this text box element so that other forms can use this abstraction. I made a we...

Changing content of HTMLDocument displayed in a JTextPane

I'm displaying some tables as HTML code (rendered by a Freemarker template) with a JTextPane. I also have some HTML links in this HTML output which can be used to interact with the values of the table (for example, "Delete a row"). Currently I always recreate the whole HTML output on each change and replace the whole d...

How to add <link> or <meta> tags to <head> with HtmlAgilityPack?

The link to download documentation from http://htmlagilitypack.codeplex.com is returning an error and I can't figure this out by trying the code. I'm trying to insert various tags into the <head> section of an HtmlDocument that I've loaded from an HTML string. The original issue I'm having is described here. Can somebody give me an idea ...

Why does System.Windows.Forms.HtmlDocument require full trust?

The HtmlDocument class has the following attribute: [PermissionSet(SecurityAction.LinkDemand, Name="FullTrust")] public sealed class HtmlDocument Why? Can I override this somehow? Or would I need to reflect the source and recompile? ...

How to create XPCNativeWrapper for HTMLDocument

Hi, I am having trouble making my Firefox extension use XPath on a page loaded via AJAX inside my extension window. The page at https://developer.mozilla.org/en/DOM/document.evaluate says that evaluate can be used with HTML or XML documents to evaluate XPath expressions. So I created an HTMLDocument object, assigned the AJAX response ...

How can I invoke this onkeydown event with a WebBrowserControl in C#?

<input onkeydown="if(isNumber(event)) { this.value = isNumber(event); ajax_submit(this.form); bump_recruiter(); el('altsubmit').setAttribute('disabled', 'disabled'); return false; }" class="captcha" type="text" id="number" name="number" value=""> That is the html. I tried this within the webBrowser.Navigate method: javascript: ajax_su...

How to convert HtmlDocument.DomDocument to a string?

How to convert HtmlDocument.DomDocument to a string? ...