Hello,
I'm new to Python. I want to extract some text from the CNN website.
I want to use the Python win32com module.
EDIT: on [why win32com]
Because of the JavaScript on the website, I thought of using win32com. I have looked for other solutions, but none met my requirements. In fact, I wanted to use mechanize or a similar solut...
I want a Python script that prints a list of all functions defined in a C/C++ file.
e.g. abc.c defines two functions as:
void func1() { }
int func2(int i) { printf("%d", i); return 1; }
I just want to search the file (abc.c) and print all the functions defined in it (function names only). In the example above, I would like to print func1,...
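A minimal sketch of one way to do this with a regular expression, assuming ordinary definitions of the form "return-type name(params) {" that begin at the start of a line. The names FUNC_DEF, CONTROL_KEYWORDS, and function_names are made up for illustration. This is a heuristic, not a C parser: macros, function pointers, K&R-style definitions, and code inside comments or string literals are not handled.

```python
import re
from typing import List

# Heuristic pattern: return type, function name, parameter list, opening brace.
FUNC_DEF = re.compile(
    r"""
    ^[ \t]*                      # optional indentation at the start of a line
    (?:[A-Za-z_]\w*[\s\*]+)+     # return type and qualifiers, e.g. static int *
    ([A-Za-z_]\w*)               # function name (captured)
    \s*\([^;{}()]*\)             # parameter list without nested parentheses
    \s*\{                        # opening brace of the body
    """,
    re.MULTILINE | re.VERBOSE,
)

# Keywords that can look like a function name to the pattern above.
CONTROL_KEYWORDS = {"if", "for", "while", "switch", "return", "sizeof"}


def function_names(source: str) -> List[str]:
    """Return the names of function definitions found in C source text."""
    names = [match.group(1) for match in FUNC_DEF.finditer(source)]
    return [name for name in names if name not in CONTROL_KEYWORDS]


if __name__ == "__main__":
    example = 'void func1() { }\nint func2(int i) { printf("%d", i); return 1; }\n'
    print(function_names(example))  # prints ['func1', 'func2']
```

For real-world code, a proper parser (for example one built on a C grammar) would be more reliable than a regex.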
How to eliminate left recursion for the following grammar?
E := EE+|EE-|id
Using the common procedure:
A := Aa|b
translates to:
A := b|A'
A' := ϵ| Aa
Applying this to the original grammar we get:
A = E, a = (E+|E-) and b = id
Therefore:
E := id|E'
E' := ϵ|E(E+|E-)
But this grammar seems incorrect because
ϵE+ -> ϵ id +
w...
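For reference, the textbook rule for immediate left recursion rewrites A := Aa|b as A := bA' and A' := aA'|ϵ, which is not the same as the rule quoted above. A minimal sketch of that textbook rule follows; the function name and the list-of-symbols representation are assumptions made for illustration, and only immediate (not indirect) left recursion is handled.

```python
EPSILON = "ϵ"


def eliminate_immediate_left_recursion(nonterminal, productions):
    """Rewrite A := A a1 | ... | b1 | ... as A := b A' and A' := a A' | ϵ.

    Each production is a list of symbols, e.g. ["E", "E", "+"].
    """
    # Tails of the left-recursive alternatives (the "a" parts).
    recursive = [p[1:] for p in productions if p and p[0] == nonterminal]
    # Alternatives that do not start with the nonterminal (the "b" parts).
    others = [p for p in productions if not p or p[0] != nonterminal]
    if not recursive:
        return {nonterminal: productions}
    fresh = nonterminal + "'"
    return {
        nonterminal: [beta + [fresh] for beta in others],
        fresh: [alpha + [fresh] for alpha in recursive] + [[EPSILON]],
    }


if __name__ == "__main__":
    grammar = eliminate_immediate_left_recursion(
        "E", [["E", "E", "+"], ["E", "E", "-"], ["id"]]
    )
    for head, bodies in grammar.items():
        print(head, ":=", " | ".join(" ".join(body) for body in bodies))
```

With a = (E+|E-) and b = id this gives E := id E' and E' := E + E' | E - E' | ϵ, so every alternative of E now starts with the terminal id.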
Hi,
For a Jericho Element, I am trying to find out how to loop over all child nodes, whether an element or plain text.
Now there is Element.getNodeIterator(), but this iterates over ALL descendants within the Element, not just the direct children.
I need the equivalent of Element.getChildSegments(). Any ideas?
Thanks
...
Hello.
I want to extract some text from a certain website.
Here is the address of the page I want to scrape:
http://news.search.naver.com/search.naver?sm=tab%5Fhty&where=news&query=times&x=0&y=0
On this page, I want to extract the subject and content fields separately.
For example, if you open tha...
Alrighty, by LL(k) languages, I mean programming languages whose parsers can be described by grammars which are LL(k).
These are my guesses:
Pascal
Lisp
XML and friends
...
I am attempting to parse Yahoo's weather XML feed via this script. The parsing itself works: I am just struggling with getting the days to correspond with today, tomorrow and the day after.
The final HTML output, which can be seen at http://www.wdmadvertising.com.au/preview/cfs/index.shtml, looks like this:
todayMon______________19
...
I am querying a web service that was built by another developer. It returns a result set in a JSON-like format. I get three column values (I already know what the ordinal position of each column means):
[["Boston","142","JJK"],["Miami","111","QLA"],["Sacramento","042","PPT"]]
In reality, this result set can be thousands of records ...
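A minimal sketch of reading such a result set, assuming the payload is valid JSON (the sample above is). The field names city, code, and tag are hypothetical, since the post does not say what each column means.

```python
import json
from typing import List, NamedTuple


class Record(NamedTuple):
    # Hypothetical column names; replace with the real meaning of each position.
    city: str
    code: str
    tag: str


def parse_rows(payload: str) -> List[Record]:
    """Turn a JSON array of 3-element arrays into named records."""
    return [Record(*row) for row in json.loads(payload)]


if __name__ == "__main__":
    payload = '[["Boston","142","JJK"],["Miami","111","QLA"],["Sacramento","042","PPT"]]'
    for record in parse_rows(payload):
        print(record.city, record.code, record.tag)
```

The values stay strings, so a leading zero such as "042" is preserved.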
The ISO8601 format for date/time representations supports many variations of format to express the same information.
I know that not all languages have libraries that support the range of the standard - for example, I've had problems parsing the different possible formats of the timezone using Java's SimpleDateFormat API.
Given the cho...
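To illustrate the point about variations, here is a small sketch in Python (3.7 or later) that accepts two layouts of the same date/time and three spellings of the timezone. The function name and the sample timestamps are made up, and only these two layouts are covered, not the whole standard.

```python
from datetime import datetime

# Extended and basic layouts; %z accepts "Z", "+0200" and "+02:00" on Python 3.7+.
FORMATS = ("%Y-%m-%dT%H:%M:%S%z", "%Y%m%dT%H%M%S%z")


def parse_iso8601(text: str) -> datetime:
    """Parse a timestamp in one of the layouts listed in FORMATS."""
    for fmt in FORMATS:
        try:
            return datetime.strptime(text, fmt)
        except ValueError:
            continue
    raise ValueError("unsupported ISO 8601 variant: %r" % text)


if __name__ == "__main__":
    a = parse_iso8601("2009-06-15T13:45:30Z")
    b = parse_iso8601("2009-06-15T15:45:30+02:00")
    c = parse_iso8601("20090615T134530+0000")
    print(a == b == c)  # prints True: all three denote the same instant
```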
Hi,
I have a create action that handles XML requests. Rather than using the built in params hash, I use Nokogiri to validate the XML against an XML schema. If this validation passes, the raw XML is stored for later processing.
As far as I understand, the XML is parsed twice: first Rails creates the params hash, then the Nokogiri pa...
Is there a simple way, using C#, to open an arbitrary URL, read in the text, and reduce it down to what would be displayed in a web page? I suppose I could get the <body> content and iterate char by char over it, ripping out anything between < and > (inclusive). I looked briefly at HTML Agility Pack, and tha...
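The idea of dropping the markup and keeping only the displayed text can be sketched as follows. This is in Python rather than C#, purely as an illustration of the approach; the class and function names are made up, and the page is assumed to be already downloaded into a string. Using an HTML parser instead of scanning for < and > by hand also lets the script and style contents be skipped.

```python
from html.parser import HTMLParser


class VisibleText(HTMLParser):
    """Collect text nodes, ignoring the contents of <script> and <style>."""

    SKIPPED_TAGS = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED_TAGS and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep non-empty text that is not inside a skipped element.
        if not self._skip_depth and data.strip():
            self._chunks.append(data.strip())

    def text(self):
        return " ".join(self._chunks)


def visible_text(html: str) -> str:
    parser = VisibleText()
    parser.feed(html)
    parser.close()
    return parser.text()


if __name__ == "__main__":
    page = ("<html><head><style>p{}</style></head><body><p>Hello <b>world</b></p>"
            "<script>var x = '<b>';</script></body></html>")
    print(visible_text(page))  # prints Hello world
```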
Hello, I am using simple_html_dom to find every link in the HTML document that has the class "new". Ordinarily I would use:
$html->find('a[class=new]');
This would match links such as:
<a class="new" ... blah blah ... />
However, the problem this time is that the HTML document contains links with classes such as
<a class="to...
Suppose I want a function that takes a number and returns it as a string, exactly as it was given. The following doesn't work:
SetAttributes[foo, HoldAllComplete];
foo[x_] := ToString[Unevaluated@x]
The output for foo[.2] and foo[.20] is identical.
The reason I want to do this is that I want a function that can understand dates with ...
Hi,
How can I extract the method name and namespace from this XML using LINQ to XML?
<?xml version="1.0" encoding="utf-8"?><soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:wsa="http://schemas.xmlsoap.org/ws/2004/08/add...
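As an illustration of the idea only (in Python rather than LINQ to XML): assuming the method element is the first child of soap:Body, its local name and namespace can be read from that element. The function name and the sample message, including GetQuote and http://example.com/stock, are hypothetical, because the body of the message above is cut off.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"


def soap_method(xml_text: str):
    """Return (method name, namespace) of the first element inside soap:Body."""
    root = ET.fromstring(xml_text)
    body = root.find("{%s}Body" % SOAP_NS)
    if body is None or len(body) == 0:
        raise ValueError("no method element found in soap:Body")
    tag = body[0].tag  # ElementTree spells qualified names as {namespace}local
    if tag.startswith("{"):
        namespace, name = tag[1:].split("}", 1)
    else:
        namespace, name = "", tag
    return name, namespace


if __name__ == "__main__":
    # Hypothetical message; the real one is truncated in the post.
    sample = (
        '<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">'
        '<soap:Body><GetQuote xmlns="http://example.com/stock">'
        "<Symbol>X</Symbol></GetQuote></soap:Body></soap:Envelope>"
    )
    print(soap_method(sample))
```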
Hi SO,
I need to parse aspx, ascx, and master files in an MVC project into an object model so that people can change particular parts and save the file back: a content-management type of thing.
Is there anything in the framework to help me?
What I have tried:
XDocument.Load: cannot load the directives and inline code blocks
Ge...
Hey!
I'm using NSXMLParser to fetch a string from XML. I've created a class to store the data with synchronized variables.
To get the text inside an element I use the foundCharacters method, and to store the strings I use an NSMutableString *.
When I find the string and print it, everything is correct, but when I'm done the two differ...
I have an unfinished binary file that has some info that I can recover using regex. The contents are:
G $12.Angry.Men.1957.720p.HDTV.x264-HDLH Lhttp://site.com/forum/f89/12-angry-men-1957-720p-hdtv-x264-hdl-538403/ L I Š M ,ABBA.The.Movie.1977.720p.BluRay.DTS.x264-iONN Phttp://site.com/forum/f89/abba-movie-1977-...
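A rough sketch of pulling the URLs out of such a file with a bytes regex, assuming each URL is followed by whitespace or a non-printable byte. The names URL and find_urls are made up; if a printable length or marker byte directly follows a URL, it would be swallowed into the match and the pattern would need tightening.

```python
import re

# A URL is taken to be "http://" or "https://" followed by printable,
# non-space ASCII bytes (0x21 to 0x7e).
URL = re.compile(rb"https?://[\x21-\x7e]+")


def find_urls(blob: bytes):
    """Return every URL-looking run of bytes in blob as a str."""
    return [match.decode("ascii") for match in URL.findall(blob)]


if __name__ == "__main__":
    blob = (b"G $12.Angry.Men.1957.720p.HDTV.x264-HDLH Lhttp://site.com/forum/f89/"
            b"12-angry-men-1957-720p-hdtv-x264-hdl-538403/\x00L I \x8a M ,ABBA")
    print(find_urls(blob))
```

Reading the file with open(path, "rb").read() gives the bytes object to pass in.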
What I'm trying to do is take a .txt or HTML file, search through it, grab a piece of text from the file, place it into a string, and finally add it to a textView.
Each piece of text will be delimited like this:
001:001 Text1
001:002 Text2
001:003 Text3
002:001 Text1a
002:002 Text1b...
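A small sketch of the lookup step, written in Python just to show the idea: index every "NNN:NNN text" line by its two numbers, then fetch the piece you need as a string. The names ENTRY and index_entries are made up, and the sketch assumes one entry per line with three-digit numbers as in the sample.

```python
import re

# Two three-digit numbers separated by a colon, whitespace, then the text.
ENTRY = re.compile(r"^(\d{3}):(\d{3})\s+(.*)$")


def index_entries(text: str):
    """Map (first number, second number) to the text that follows them."""
    entries = {}
    for line in text.splitlines():
        match = ENTRY.match(line.strip())
        if match:
            key = (int(match.group(1)), int(match.group(2)))
            entries[key] = match.group(3)
    return entries


if __name__ == "__main__":
    sample = "001:001 Text1\n001:002 Text2\n001:003 Text3\n002:001 Text1a\n"
    entries = index_entries(sample)
    print(entries[(1, 2)])  # prints Text2
```

The string returned by the lookup is what would then be handed to the text view.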
Hey guys. I'm wondering if there are any existing libraries in or accessible from Objective-C that would allow me to scrape pages formatted like this one. Specifically, all of the dates and all of the text next to each date. If not, what would be the best way to go about doing this? Regular expressions? I heard that NSString might alread...
I want to go through the children of an element and filter only the ones that are text or span, something like:
element.children.select {|child|
child.class == String || child.element_type == 'span'
}
but I can't find a way to test which type a certain element is. How do I test that? I'd like to know that regardless if there's a bet...