Is there an easy way, preferably with a scripting language or a small tool that could be called through a batch file, to operate on a text file, mark an offset, and put everything after the offset into a new file?

I have a text file that is appended to nightly. I would like to mark the end of the file so that, after new data is added, only the data between that offset and the new end of the file is processed. I cannot do this with strings or delimiters alone, as it is blob data.

edit: The text file is created by running an MS Access macro from a scheduled task, which exports the data as a CSV file. In considering Patrick's suggestion, I would like to know whether it is possible to add a variable part such as the date to the filename, so that the file is always different. The file will then be scp'd to a Linux server, where it will be loaded into a MySQL database.

+2  A: 

It is simple with Python:

import sys

def divide_file(fname, mark):
    """Print every line from the first line containing mark onward."""
    mark_found = False
    f = open(fname, 'r')
    for line in f:
        if mark in line:
            mark_found = True
        if mark_found:
            print(line.rstrip())
    f.close()

divide_file(sys.argv[1], sys.argv[2])

Usage & output example:

c:\tmp>divide_file.py divide_file.py close
        f.close()

divide_file(sys.argv[1], sys.argv[2])
Michał Niklas
Thank you for your answer, but Python is not an option in this case.
Joshxtothe4
+1  A: 

You could use tail, bash and other utilities from Unix-like systems. A minimal MSYS install provides them on Windows, and documentation and examples for these utilities are easy to find. Bash is also far more capable than Windows batch files. The script would look something like this:

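#!/bin/bash

# Size in bytes before the new data is appended.
PREV_SIZE=$(wc -c < text_file)
# Placeholder for whatever appends the new data.
write_something_to_file text_file
# Size in bytes after the append.
CURR_SIZE=$(wc -c < text_file)
# Number of newly appended bytes.
NUM=$((CURR_SIZE - PREV_SIZE))
# Copy only those bytes from the end of the file into a new file.
tail -c "$NUM" text_file > new_text_file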
Eduard - Gabriel Munteanu
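The size-difference idea above can also be sketched with the offset remembered between nightly runs. This is only an illustration under assumptions: the function name `extract_new_bytes` and the idea of keeping the offset in a small side file are hypothetical, not something from the question or the answers.

```python
def extract_new_bytes(data_path, offset_path, out_path):
    """Copy bytes added to data_path since the previous call into out_path."""
    # Read the offset remembered from the previous run (0 on the first run).
    try:
        with open(offset_path) as f:
            offset = int(f.read().strip() or 0)
    except FileNotFoundError:
        offset = 0

    # Binary mode leaves blob data and line endings untouched.
    with open(data_path, "rb") as src:
        src.seek(0, 2)              # jump to the end to learn the current size
        if src.tell() < offset:
            # The file is shorter than before, so it was probably replaced.
            offset = 0
        src.seek(offset)
        new_data = src.read()

    with open(out_path, "wb") as dst:
        dst.write(new_data)

    # Remember where this run ended, ready for the next run.
    with open(offset_path, "w") as f:
        f.write(str(offset + len(new_data)))
    return len(new_data)
```

Called once per night, for example `extract_new_bytes("data.csv", "data.offset", "new_data.csv")` with made-up file names, it writes only the bytes that appeared since the previous call.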
A: 

Assuming you're currently exporting the data from the Access database with a script already:

@echo OFF

:: Force a new line and add a marker; assuming your file is data.txt.
@echo. >> data.txt
@echo **MARKER** >> data.txt

:: Run your export here: these lines just simulate the export.
@echo Test Line 1 >> data.txt
@echo Test Line 2 >> data.txt

:: Find line number of last marker (/L and /C: make findstr match the marker text literally):
for /f "usebackq delims=:" %%I in (`findstr /N /L /C:"**MARKER**" data.txt`) do (
    set LAST_MARKER=%%I
)

:: Get all the lines after the last marker
for /f "skip=%LAST_MARKER% tokens=*" %%L in (data.txt) do (
    @echo %%L >> new_data.txt
)

The output in new_data.txt will be:

Test Line 1
Test Line 2

Patrick Cuff
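For comparison, the "everything after the last marker" step can be sketched in Python. This is a rough equivalent rather than a translation of the batch file: the function name `lines_after_last_marker` is made up for illustration, and unlike `for /f` it keeps blank lines.

```python
def lines_after_last_marker(path, marker="**MARKER**"):
    """Return the lines that follow the last line containing marker."""
    with open(path, encoding="utf-8", errors="replace") as f:
        lines = f.read().splitlines()

    last = -1  # stays -1 when there is no marker, so every line is returned
    for index, line in enumerate(lines):
        if marker in line:
            last = index
    return lines[last + 1:]
```

The returned list can then be written to a new file, one line at a time, in the same way the batch loop appends to new_data.txt.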
I am using a macro within Access that is run from the command line. Would this be easy to do with a macro?
Joshxtothe4
Your question asked "Is there an easy way, preferably with a scripting language or a small tool that could be called through a batch file"...
Patrick Cuff
I have to ask; why not just put the data that would come after the marker into its own file?
Patrick Cuff
That does seem obvious. I suppose it is because everything must run automatically each night. I can set that up in Linux, but can Access have a wildcard or variable such as the date in the filename?
Joshxtothe4
Not from a macro, but it can from VBA. If you already have a .BAT file that calls the Access macro from the command line, then my solution should work for you.
Patrick Cuff
You can also rename the file created by your .BAT file to have a timestamp: `for %I in (test.txt) do ren %I "%~nI_%DATE:~10,4%-%DATE:~4,2%-%DATE:~7,2%_%TIME:~0,2%-%TIME:~3,2%%~xI"`
Patrick Cuff
Is that just a batch file for loop? I cannot get it to alter the filename at all.
Joshxtothe4
If you're using this in a batch file, use %%I instead of %I.
Patrick Cuff
What about `do ren`? Should %%I be used there as well?
Joshxtothe4
Yes, use %%I everywhere %I is used. You use a single % on the command line, but double %%s within a batch script.
Patrick Cuff
It should be: `for %%I in (test.txt) do ren %%I "%%~nI_%DATE:~10,4%-%DATE:~4,2%-%DATE:~7,2%_%TIME:~0,2%-%TIME:~3,2%%%~xI"`
Patrick Cuff
This produces test_-1.-00_14-53.txt on a computer whose date is set to 19 November 2008. What does 1-00 correspond to?
Joshxtothe4
That should be the date, as yyyy-mm-dd. If your local date format is different you'll have to change the substrings in the %DATE% parsing. From a command line, type `echo %DATE%`; for me it returns "Mon 02/02/2009", so the year starts at position 10 for a length of 4 (positions are 0-based).
Patrick Cuff
The month starts in position 4 for a length of 2, the day starts in position 7 for a length of 2. Adjust these start positions to match your regional settings.
Patrick Cuff
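The substring positions described above can be checked with a small Python sketch that applies the same start/length pairs to a date string. The function name `stamped_name` is made up for illustration, and the second date format shown in the usage note is only a guess at what the asker's regional settings produce.

```python
import os

def stamped_name(filename, date_text, time_text):
    """Build name_yyyy-mm-dd_hh-mm.ext using the offsets from the batch one-liner."""
    stem, ext = os.path.splitext(filename)
    year = date_text[10:14]    # %DATE:~10,4%
    month = date_text[4:6]     # %DATE:~4,2%
    day = date_text[7:9]       # %DATE:~7,2%
    hour = time_text[0:2]      # %TIME:~0,2%
    minute = time_text[3:5]    # %TIME:~3,2%
    return f"{stem}_{year}-{month}-{day}_{hour}-{minute}{ext}"
```

With "Mon 02/02/2009" and a time of "14:53:07.12", `stamped_name("test.txt", ...)` gives test_2009-02-02_14-53.txt. With a date string like "19.11.2008" (assumed here), the same offsets give test_-1.-00_14-53.txt: the year slice is empty because the string is only 10 characters long, and the other two slices land on "1." and "00".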
Thank you! Works great.
Joshxtothe4
Is there a way to store the result in a variable as well, so I can specify that it is the file to be copied?
Joshxtothe4
Sure: `for %%I in (test.txt) do set MyNewFilename="%%~nI_%DATE:~10,4%-%DATE:~4,2%-%DATE:~7,2%_%TIME:~0,2%-%TIME:~3,2%%%~xI"`
Patrick Cuff
Then you can rename like: `ren OldFile.txt %MyNewFilename%`
Patrick Cuff
Actually, each new file created will still contain all the data from the old file, so I am planning to use a combination of markers and dates. How would you modify the above to place everything after the marker into a new file with the name of the date? It would have to be automatic, perhaps rename
Joshxtothe4
:: Create new file name
for %%I in (data.txt) do set NewFile="%%~nI_%DATE:~10,4%-%DATE:~4,2%-%DATE:~7,2%_%TIME:~0,2%-%TIME:~3,2%%%~xI"

:: Get all the lines after the last marker
for /f "skip=%LAST_MARKER% tokens=*" %%L in (data.txt) do (
    @echo %%L >> %NewFile%
)
Patrick Cuff
Ahh, the problem is that Access does not append data to the file; it recreates the file. Keeping this in mind, is there still a solution?
Joshxtothe4