Windows, Progress 11.
We have a variety of files we send to other groups in the company. These are tab delimited and some of them can be very large, containing 80 million lines or more.
Every so often, they'll report a problem with the data in, let's say, line number 20,000,000. These files are too large for any text editor I have, so I'm trying to write something that will extract a set of lines from the file. Using a .Net StreamReader, I can go through the file and write out only the lines of interest. This is not awful (it reads about 4 million lines per minute), but that's still a bit slow.
In C#, I can write a bit of code like
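// Needs: using System.Collections.Generic; using System.IO; using System.Linq;
public IEnumerable<string> ExtractLines(string f, int StartLine, int EndLine)
{
    // Skip the lines before StartLine, then take StartLine..EndLine inclusive.
    return File.ReadLines(f).Skip(StartLine - 1).Take(EndLine - StartLine + 1);
}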
This is extremely fast, but I can't figure out how to translate it to Progress. The object returned by the File:ReadLines method in Progress does not have a Skip or Take method; the object in .Net does. It's the same object, System.Collections.Generic.IEnumerable<T>. A bit of research tells me this has something to do with Lync, but I've been unable to get it to work in Progress.
If anyone can help translate this, or if you have other suggestions for pulling a small set of lines from a very large file, it would be greatly appreciated.
Thanks,
Tom
Hi,
It doesn't look to me like a Progress task at all. Why not use sed instead?
For lines 100 to 200
sed -n 100,200p filename > newfile
That will do it, and pretty quickly.
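One possible refinement for very large files: adding a q command at the last wanted line makes sed stop reading there instead of scanning the rest of the file. Below is a minimal sketch, assuming a POSIX-style shell with sed and seq available (for example via a GNU tools port on Windows). The file names sample.txt and extract.txt and the START/END values are placeholders, not anything from the original files.

```shell
# Build a small stand-in file so the sketch runs on its own
# (sample.txt is a hypothetical name; use the real export file instead).
seq 1 1000 > sample.txt

# Placeholder line range of interest.
START=100
END=200

# -n suppresses default output; "START,ENDp" prints the range;
# "ENDq" quits at line END so later lines are never read.
sed -n "${START},${END}p;${END}q" sample.txt > extract.txt

# Show how many lines were extracted.
wc -l < extract.txt
```

For a range near the start of an 80-million-line file this should save most of the scan; for a range near the end, sed still has to read everything before it.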
Regards,
I meant Linq, not Lync
GNU sed for Windows will help, since sed isn't part of Windows by default.
gnuwin32.sourceforge.net/.../sed.htm
Or you could just make your C# program available to call from Progress. If your file were fixed length you could use SEEK, but other than that... Progress isn't the place to be scanning through large files.
Maybe one of the class gurus can chime in on your original question though.
Yes, a Windows version of sed is certainly one way to deal with this. But I would really like to understand why I can't use those methods in Progress.
Yes, the compiled C# program works from Progress but I'd rather avoid that if possible.
Understood completely... just offering some workarounds until the Windows OO experts show up :-)
Skip() is a generic method and we do not support calling generic methods in the ABL.