data-dump

To read SO's data dump effectively

I currently use Vim to read SO's data dump. However, my MacBook slows down when I scroll down just a few rows. This suggests to me that there must be more efficient ways to read the data. I know little MySQL. The files are in .xml format, which is rather hard to read at the moment. It may be more efficient to convert the xml -...
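One way to avoid loading a whole dump file into an editor is to stream it. The following is a minimal Python sketch, not a definitive solution. It assumes each dump file holds a single root element containing self-closing `<row ... />` elements whose data lives in attributes; the function name `iter_rows` and the sample attributes `Id` and `Title` are hypothetical and used only for illustration.

```python
import io
import xml.etree.ElementTree as ET


def iter_rows(source, tag="row"):
    """Yield the attributes of each <row> element as a dict, one at a time."""
    context = ET.iterparse(source, events=("start", "end"))
    _event, root = next(context)  # the first event is the start of the root element
    for event, elem in context:
        if event == "end" and elem.tag == tag:
            yield dict(elem.attrib)  # copy the attributes before discarding the element
            root.clear()  # drop already-processed children so memory use stays flat


# Tiny in-memory stand-in for a dump file (assumed layout, made-up values).
sample = io.BytesIO(
    b'<posts><row Id="1" Title="First" /><row Id="2" Title="Second" /></posts>'
)
rows = list(iter_rows(sample))
print(rows)
```

For a real file, a path or an open binary file object can be passed instead of the `io.BytesIO` sample, and the rows can be consumed lazily in a `for` loop rather than collected into a list.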

How to convert XML file to a Database?

I recently downloaded the SO Data Dump and was wondering how I could convert it from XML to a DB that I could use in my .NET applications. ...
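As a rough illustration of the XML-to-database idea, here is a small Python sketch that streams rows into SQLite. It rests on several assumptions: that the dump stores records as `<row ... />` elements with attributes, that SQLite is an acceptable target (the question does not name a database), and that the columns of interest are known in advance. The names `load_rows`, `posts`, `Id`, and `Title` are hypothetical.

```python
import io
import sqlite3
import xml.etree.ElementTree as ET


def load_rows(source, conn, table="posts", columns=("Id", "Title")):
    """Insert the chosen attributes of every <row> element into a SQLite table."""
    cols = ", ".join(columns)
    marks = ", ".join("?" for _ in columns)
    # Table and column names come from trusted code here, not from user input.
    conn.execute(f"CREATE TABLE IF NOT EXISTS {table} ({cols})")
    count = 0
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == "row":
            # Attributes missing from a row are stored as NULL.
            values = [elem.attrib.get(c) for c in columns]
            conn.execute(f"INSERT INTO {table} ({cols}) VALUES ({marks})", values)
            count += 1
            elem.clear()  # discard the row's attributes once they are stored
    conn.commit()
    return count


# In-memory database and a made-up two-row sample in the assumed layout.
conn = sqlite3.connect(":memory:")
sample = io.BytesIO(
    b'<posts><row Id="1" Title="First" /><row Id="2" Title="Second" /></posts>'
)
inserted = load_rows(sample, conn)
titles = [r[0] for r in conn.execute("SELECT Title FROM posts ORDER BY Id")]
print(inserted, titles)
```

The resulting SQLite file could then be read from other environments through their own SQLite drivers; a different database engine would need its own connection library and column types.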

How to open problematic data dump and convert it to mysql (or some other practical format)?

I'm trying to work with a pretty interesting data set. Unfortunately, I have problems with opening it and converting it into any useful format. It's a collection of archived txt files. When I decompress them and try to open a txt file, I get 'it's binary file, saving it may result in corrupt file' and it's unreadable - there are just 'weird cha...

What's the easiest way to convert an SO data dump from HTML back to Markdown?

I've just got my hands on a Stack Overflow data dump, and I'm disappointed to see that the Body field of the posts is in HTML rather than Markdown. I suspect there's Markdown in the original database because that's what I see if I try to edit an answer. I want to recover Markdown from a large set of answers. I will be processing hundre...

SQL Server Table Dump - Not All Columns

I'm trying to do an SQL Server dump of a couple of tables in a database (Microsoft SQL Server). We don't have write access to the DB, so we can't do what I was originally thinking (create a temp db, copy the tables, minus the columns we don't want, into the temp db, and then dump that db). I really can't figure out a way to do this. A csv expo...

With Haskell, how do I process large volumes of XML?

I've been exploring the Stack Overflow data dumps and thus far taking advantage of the friendly XML and “parsing” with regular expressions. My attempts with various Haskell XML libraries to find the first post in document-order by a particular user all ran into nasty thrashing. TagSoup import Control.Monad import Text.HTML.TagSoup use...

Propel-load-data is causing an error

I am trying to load fixtures but my project is erroring at the CLI and starting the indexer process. I have tried: rebuilding the schema and model; emptying the database and starting again; clearing the cache; validating the YML file and trying much simpler data-dumps. My platform is Symfony 1.0 on Windows. Some also seems to have had t...

How to export primary keys on data-dump?

Hello, when I export my database with doctrine:data-dump, I encounter two problems: (1) the primary keys are not exported; (2) instead of the foreign key columns' correct names, it uses the name of the foreign table. For example, here are my tables: # schema.yml Planet: connection: doctrine tableName: planet columns: planet_id: ty...

Is there anywhere to get a free stock market data feed/dump?

The Stock Market world seems to be almost as fragmented as our nation's Real Estate Fiefdoms. At any rate, are there any providers of free data dumps, APIs, or feeds related to stocks or commodities? Delay time isn't important. ...