Re: reading/parsing large files

To: REALbasic NUG <realbasic-nug@lists.realsoftware.com>
Subject: Re: reading/parsing large files
From: Joe Strout <joe@inspiringapps.com>
Date: Mon, 29 Sep 2008 21:12:26 -0600

On Sep 29, 2008, at 8:32 PM, Carlo wrote:

> The code is basically a loop: skipping certain files (audio-visual and
> non-visible files), I process all the others basically in this way:
>
> //open the file as binary, false
> source = defineEncoding(b.read(b.length),nil)
> //parse it with Joe's TextUtilities
> mnumber = countB(Source, wordToBeSearched)
> //as soon as the word is found exit countB
> //add a row to a listbox (file name, location etc.)

Well, hang on. Do you actually do anything with the count? If not, you shouldn't be using CountB; instead just use InStrB, to see whether the word is there or not.

(Also skip the DefineEncoding, as Tim pointed out.)
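
A minimal sketch of that existence test, assuming f is a FolderItem for the file being checked and wordToBeSearched is the search string from the quoted code. The function name FileContainsWord is made up for illustration:

```realbasic
Function FileContainsWord(f As FolderItem, wordToBeSearched As String) As Boolean
  // Hypothetical helper, not code from the original thread.
  // Open read-only; OpenAsBinaryFile returns Nil if the file cannot be opened.
  Dim b As BinaryStream = f.OpenAsBinaryFile(False)
  If b = Nil Then Return False

  // Read the raw bytes. No DefineEncoding: InStrB compares bytes, not characters.
  Dim source As String = b.Read(b.Length)
  b.Close

  // InStrB returns the 1-based byte position of the first match, or 0 if none,
  // so it can stop at the first hit instead of counting every occurrence.
  Return InStrB(source, wordToBeSearched) > 0
End Function
```

Because the comparison is byte-wise, it is case-sensitive, and the search word has to be in the same encoding as the file's bytes for a match to be found.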

> Since many files are larger than 4 MB, I tried splitting them into chunks
> of 2 MB each, but the results did not vary considerably.

You might go even smaller. The key in an app like this is to keep the disk buffer constantly full, so you never have to wait on the disk. This is often best accomplished by reading smaller chunks, so you can start working on each one immediately while the various levels of the disk interface read ahead in anticipation of your needing more data.
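
Here is a rough sketch of what a small-chunk search could look like. The function name and the 64 KB chunk size are assumptions for illustration, not measured values, so the size would need tuning against real files. The last few bytes of each chunk are carried over so that a word straddling a chunk boundary is not missed:

```realbasic
Function FileContainsWordChunked(f As FolderItem, wordToBeSearched As String) As Boolean
  // Hypothetical helper; kChunkSize is an untested guess, tune it by timing.
  Const kChunkSize = 65536

  If wordToBeSearched = "" Then Return False

  // Open read-only; OpenAsBinaryFile returns Nil if the file cannot be opened.
  Dim b As BinaryStream = f.OpenAsBinaryFile(False)
  If b = Nil Then Return False

  // A match can start at most LenB(word) - 1 bytes before a chunk boundary.
  Dim overlap As Integer = LenB(wordToBeSearched) - 1
  Dim tail As String   // trailing bytes of the previous chunk
  Dim chunk As String
  Dim found As Boolean

  While Not b.EOF
    // Prepend the previous tail so a word split across two reads is still seen.
    chunk = tail + b.Read(kChunkSize)
    If InStrB(chunk, wordToBeSearched) > 0 Then
      found = True
      Exit  // stop reading as soon as the word turns up
    End If
    tail = RightB(chunk, overlap)
  Wend

  b.Close
  Return found
End Function
```

A side benefit of this shape is that a file whose first chunk already contains the word is never read to the end.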

> Therefore I was thinking of using MemoryBlocks, but alas, I don't know how to deal with them. If using MemoryBlocks helped reduce both CPU/memory
> usage and search time, could some good soul tell me how to do it?

They don't. InStrB will almost certainly be faster than anything you can do with MemoryBlocks.

Best,
- Joe

--
Joe Strout
Inspiring Applications, Inc.
http://www.InspiringApps.com