Hi,
I have a relatively strange question.
I have a file that is 6 gigabytes long. What I need to do is scan the entire file, line by line, and find all rows whose id number matches the id of any other row in the file. Essentially, it's like analyzing a web log file where there are many session ids that are organized by the time of each...
I'm using the following basic function, which I copied from the net, to read a text file:
public void read()
{
    File file = new File("/Users/MAK/Desktop/data.txt");
    System.out.println("Start");
    try
    {
        //
        // Create a new Scanner object which will read the data from the
        // file passed in. To check i...
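To sketch the idea behind the id matching itself (not the poster's Java, just an illustrative Python outline, assuming the id is the first whitespace-separated field of each line): count the ids in one pass, then emit matching rows in a second pass, so memory only ever holds the id counts, never the whole 6 GB file.

from collections import Counter

# Pass 1: count how many rows carry each id (id assumed to be the first
# whitespace-separated field -- adjust the split to the real row format).
counts = Counter()
with open("data.txt") as f:
    for line in f:
        fields = line.split(None, 1)
        if fields:
            counts[fields[0]] += 1

# Pass 2: keep only rows whose id also occurs on some other row.
with open("data.txt") as f, open("matched.txt", "w") as out:
    for line in f:
        fields = line.split(None, 1)
        if fields and counts[fields[0]] > 1:
            out.write(line)

If even the id counts are too many to keep in memory, the same two-pass idea still works after an external sort on the id column.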
I want to read a large XML file (100+ MB). Due to its size, I do not want to load it into memory using XElement. I am using LINQ to XML queries to parse and read it.
What's the best way to do this? Any example of combining XPath or XmlReader with LINQ to XML/XElement?
Please help. Thanks.
...
Hi, I'm wondering if anyone knows of any hosting providers (UK preferably) that deal mostly with accepting large file uploads. Most hosts only let you push something like 1.5 MB (that's taking into account the connection and the max execution time). What I am looking for is a host specifically for storing files on.
I was going to create a...
I need to programmatically download a large file before processing it. What's the best way to do that? As the file is large, I want to specify a time to wait so that I can forcefully exit.
I know of WebClient.DownloadFile(), but there does not seem to be a way to specify an amount of time to wait so as to forcefully exit.
try
{
WebClie...
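Not a C# answer, but to illustrate the "overall time budget" idea the question is after, here is a minimal Python sketch (it assumes the third-party requests package; the URL, destination and deadline are made-up parameters) that streams the download in chunks and aborts once a wall-clock deadline has passed:

import time
import requests  # third-party HTTP client, assumed available

def download_with_deadline(url, dest, deadline_seconds=300, chunk_size=64 * 1024):
    # Stream the body in chunks and check the elapsed time after each chunk,
    # so a slow or stalled transfer can be abandoned instead of hanging forever.
    start = time.monotonic()
    with requests.get(url, stream=True, timeout=30) as resp:
        resp.raise_for_status()
        with open(dest, "wb") as out:
            for chunk in resp.iter_content(chunk_size):
                if time.monotonic() - start > deadline_seconds:
                    raise TimeoutError("download exceeded the overall deadline")
                out.write(chunk)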
I'm reading a 6-million-entry .csv file with Python, and I want to be able to search through this file for a particular entry.
Are there any tricks to search the entire file? Should you read the whole thing into a dictionary, or should you perform a search every time? I tried loading it into a dictionary, but that took ages, so I'm current...
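One common middle ground between "load everything into a dictionary" and "rescan the file on every query" is to index only byte offsets. A rough Python sketch, assuming the lookup key is the first column and is never quoted (entries.csv and the column position are placeholders):

import csv

# Pass 1: map each id to the byte offset of its row; the index holds only a
# small integer per key, not the row contents.
offsets = {}
with open("entries.csv", "rb") as f:
    while True:
        pos = f.tell()
        line = f.readline()
        if not line:
            break
        offsets[line.split(b",", 1)[0].decode()] = pos

def lookup(entry_id):
    # Seek straight to the recorded offset and parse just that one row.
    pos = offsets.get(entry_id)
    if pos is None:
        return None
    with open("entries.csv", "rb") as f:
        f.seek(pos)
        return next(csv.reader([f.readline().decode()]))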
I'm working with PHP and need to parse a number of fairly large XML files (50-75 MB uncompressed). The issue, however, is that these XML files are stored remotely and will need to be downloaded before I can parse them.
Having thought about the issue, I think using a system() call in PHP to initiate a cURL transfer is probably th...
Hello
What XML parser do you recommend for the following purpose:
The XML file (formatted, containing whitespace) is around 800 MB. It mostly contains three types of tags (let's call them n, w and r).
They have an attribute called id which I'd have to search for, as fast as possible.
Removing attributes I don't need could save around ...
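Whatever parser ends up being chosen, the key point is streaming rather than building a full tree. As a language-neutral illustration (shown in Python only because it is compact; the tag names n, w, r come from the question, the file path is a placeholder), a pull-style scan that discards each element after checking its id looks roughly like this:

import xml.etree.ElementTree as ET

def find_by_id(path, wanted_id, tags=("n", "w", "r")):
    # Stream the document; memory stays bounded because every element is
    # cleared as soon as it has been inspected.
    for event, elem in ET.iterparse(path, events=("end",)):
        if elem.tag in tags and elem.get("id") == wanted_id:
            return elem
        elem.clear()
    return None

If the same file is queried repeatedly, it is usually worth doing one streaming pass that records id -> byte offset, so later lookups can seek instead of reparsing 800 MB each time.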
I am trying to read a very large file in AS3 and am having problems with the runtime just crashing on me. I'm currently using a FileStream to open the file asynchronously. This does not work (it crashes without an Exception) for files bigger than about 300 MB.
_fileStream = new FileStream();
_fileStream.addEventListener(IOErrorEvent.IO_ERR...
Say I have a binary file of 12 GB and I want to slice 8 GB out of the middle of it. I know the position indices I want to cut between.
How do I do this? Obviously 12 GB won't fit into memory; that's fine, but 8 GB won't either... which I thought was also fine, but it appears binary data doesn't like being handled in chunks! I was appending ...
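The chunked copy itself is straightforward as long as every chunk is read and written in binary mode and nothing is accumulated in memory. A minimal Python sketch (file names and the chunk size are arbitrary):

def slice_file(src, dst, start, end, chunk_size=16 * 1024 * 1024):
    # Copy bytes [start, end) from src into dst one chunk at a time, so
    # neither the 12 GB input nor the 8 GB slice ever has to fit in memory.
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        fin.seek(start)
        remaining = end - start
        while remaining > 0:
            chunk = fin.read(min(chunk_size, remaining))
            if not chunk:
                break
            fout.write(chunk)
            remaining -= len(chunk)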
It's no secret that application logs can go well beyond the limits of naive log viewers, and the desired viewer functionality (say, filtering the log based on a condition, or highlighting particular message types, or splitting it into sublogs based on a field value, or merging several logs based on a time axis, or bookmarking, etc.) is be...
I have some JSON files of about 500 MB.
If I use the "trivial" json.load to load their content all at once, it will consume a lot of memory.
Is there a way to read the file partially? If it were a line-delimited text file, I would be able to iterate over the lines. I am looking for an analogue to that.
Any suggestions?
Thanks
...
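One commonly suggested analogue is an incremental (SAX-style) JSON parser. A minimal sketch using the third-party ijson package, assuming the file is one large top-level JSON array (big.json and process are placeholders):

import ijson  # third-party streaming JSON parser (pip install ijson)

# Iterate over the elements of the top-level array one at a time, much like
# looping over the lines of a text file; only one record is in memory at once.
with open("big.json", "rb") as f:
    for record in ijson.items(f, "item"):
        process(record)  # placeholder for whatever is done with each record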
Hi.
I've got a "slightly" large SQL script saved as a text file. It comes in at 8.92 GB, so it's a bit of a beast.
I've got to do some search-and-replaces in this file (specifically, change all NOT NULL to NULL, so all fields are nullable) and then execute the darned thing. Does anyone have any suggestions for a text editor that would be ...
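An alternative to finding an editor that can open 8.92 GB is to do the replacement as a streaming rewrite, which never holds more than one line in memory. A small Python sketch (file names are placeholders; it assumes the phrase NOT NULL never appears inside string literals that should be kept):

# Stream the script line by line and write a rewritten copy; memory use is
# constant regardless of the file size.
with open("script.sql", encoding="utf-8") as src, \
        open("script_nullable.sql", "w", encoding="utf-8") as dst:
    for line in src:
        dst.write(line.replace("NOT NULL", "NULL"))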
I have a web application that accepts file uploads of up to 4 MB. The server-side script is PHP and the web server is NGINX. Many users have requested that this limit be increased drastically to allow uploads of video, etc.
However, there seems to be no easy solution for this problem with PHP. First, on the client side I am looking for something tha...
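On the server side, the limit usually has to be raised in both NGINX and PHP at once; something along these lines (the 512 MB figure is only an example value):

# nginx.conf: request body size limit (default is 1 MB)
client_max_body_size 512m;

; php.ini: PHP's own upload and post limits, plus extra time for slow uploads
upload_max_filesize = 512M
post_max_size = 512M
max_execution_time = 300
max_input_time = 300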
Hello,
I have an application which retrieves many large log files from systems on a LAN.
Currently I put all the log files into PostgreSQL; the table has a column of type TEXT, and I don't plan to do any searches on this text column, because I use another external process which retrieves all the files nightly and scans them for sensitive patterns.
So the column value c...
I'm trying to create a very large image (86400 x 43200) using several tiles that make up a portion of this final image with ImageMagick (using the .NET bindings).
The problem seems to be when I attempt to create my output image with the given size; ImageMagick just hangs on the Resize() call. When I say 'hangs' I mean the program become...
I'd like to be able to do random access into a gzipped file.
I can afford to do some preprocessing on it (say, build some kind of index), provided that the result of the preprocessing is much smaller than the file itself.
Any advice?
My thoughts were:
Hack on an existing gzip implementation and serialize its decompressor state every,...
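A simpler alternative to serializing decompressor state, if re-compressing the data once is acceptable: split it into fixed-size, independently compressed gzip members and keep a small offset index, so a read only decompresses the one block it needs. A rough Python sketch (block size, file names and the index format are arbitrary choices):

import bisect
import gzip
import json

BLOCK = 4 * 1024 * 1024  # uncompressed bytes per independent gzip member

def build_blocks(src, dst, index_path):
    # Record (uncompressed_offset, compressed_offset) for every block; the
    # index is tiny compared to the data itself.
    index = []
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        uncompressed = 0
        while True:
            chunk = fin.read(BLOCK)
            if not chunk:
                break
            index.append((uncompressed, fout.tell()))
            fout.write(gzip.compress(chunk))
            uncompressed += len(chunk)
    with open(index_path, "w") as f:
        json.dump(index, f)

def read_at(dst, index, offset, size):
    # Locate the block containing `offset`, decompress only that member, and
    # slice out the requested bytes (ranges spanning blocks are not handled here).
    i = bisect.bisect_right([u for u, _ in index], offset) - 1
    uncompressed_start, compressed_start = index[i]
    with open(dst, "rb") as f:
        f.seek(compressed_start)
        block = gzip.GzipFile(fileobj=f).read(BLOCK)
    return block[offset - uncompressed_start : offset - uncompressed_start + size]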
I have a set of key/value pairs (all text) that is too large to load into memory at once. I would like to interact with this data via a Python dictionary-like interface.
Does such a module already exist?
Reading values by key should be efficient, and values should be compressed on disk to save space.
Edit:
Ideally cross platform, but only using Linux ...
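The standard library already gets close: dbm (or shelve on top of it) gives a persistent string-keyed mapping, and wrapping it with zlib covers the on-disk compression. A minimal sketch (the class and file names are made up; dbm falls back to a pure-Python backend on platforms without gdbm/ndbm):

import dbm
import zlib

class CompressedStore:
    """Dict-like persistent store: text keys and values, values zlib-compressed on disk."""

    def __init__(self, path):
        self._db = dbm.open(path, "c")  # create the database file if needed

    def __setitem__(self, key, value):
        self._db[key.encode()] = zlib.compress(value.encode())

    def __getitem__(self, key):
        return zlib.decompress(self._db[key.encode()]).decode()

    def __contains__(self, key):
        return key.encode() in self._db

    def close(self):
        self._db.close()

Usage would then be along the lines of store = CompressedStore("kv.db"); store["some key"] = "some value"; print(store["some key"]).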
I am using XMLReader to read a large XML file with about 1 million elements on the level I am reading from. However, I've calculated that it will take over 10 seconds when I jump to, for instance, element 500,000 using XMLReader::next ([ string $localname ] ) or XMLReader::read ( void ).
This is not very usable. Is there a faster way to...
I've tried these approaches so far:
1) Make a hash with the source IP/port and destination IP/port as the key. Each position in the hash is a list of packets. The hash is then saved to a file, with each flow separated by some special characters/line. Problem: not enough memory for large traces.
2) Make a hash with the same key as above, but ...
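A third option, different from both attempts above: partition the trace into a fixed number of spill files by hashing the flow key, so only one bucket at a time (not the whole trace) has to fit in memory when the flows are finally grouped. A rough Python sketch (the record format and bucket count are placeholders):

import hashlib

NUM_BUCKETS = 64

def bucket_for(flow_key):
    # Stable hash of a "srcip:sport-dstip:dport" style key -> bucket number.
    return hashlib.md5(flow_key.encode()).digest()[0] % NUM_BUCKETS

def spill(packets):
    # packets: iterable of (flow_key, raw_record) pairs from the trace reader.
    files = [open(f"bucket_{i:02d}.txt", "w") for i in range(NUM_BUCKETS)]
    try:
        for flow_key, record in packets:
            files[bucket_for(flow_key)].write(f"{flow_key}\t{record}\n")
    finally:
        for f in files:
            f.close()
    # Each bucket file is now small enough to group by flow with an in-memory hash.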