RE: reading/parsing large files

To: "REALbasic NUG" <realbasic-nug@lists.realsoftware.com>
Subject: RE: reading/parsing large files
From: "Tim Hare" <tim@telios.com>
Date: Mon, 29 Sep 2008 19:56:35 -0700
In-reply-to: <49182.202.56.7.164.1222741959.squirrel@mail.btcl.net.bd>
Reply-to: REALbasic NUG <realbasic-nug@lists.realsoftware.com>
Sender: realbasic-nug-bounces@lists.realsoftware.com
> The code is basically a loop: skipping certain files (audio-visual and
> non-visible files), I process all the others basically in this way:
>
> //open the file as binary, false
> source = defineEncoding(b.read(b.length),nil)
> //parse it with Joe's TextUtilities
> mnumber = countB(Source, wordToBeSearched)
> //as soon as the word is found exit countB
> //add a row to a listbox (file name, location etc.)
>

Instead of using DefineEncoding, you should specify the encoding in the Read
call:

source = b.read(b.length, nil)

Otherwise, you produce two copies of every string just by reading it: the one
Read returns and the one DefineEncoding returns.

I'm not sure if that will make much difference, but it seems like you'd be
thrashing memory by copying 4M strings over again.
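
For what it's worth, here is a slightly fuller sketch of the same idea. It
assumes f is a FolderItem for the file being searched (a hypothetical name, not
from your code) and that the text is UTF-8, which is only a guess on my part;
pass nil as above if you don't know the encoding.

```
// Sketch only: f is a hypothetical FolderItem; UTF-8 is an assumed encoding.
Dim b As BinaryStream = f.OpenAsBinaryFile(False)
If b <> Nil Then
  // Read tags the returned string with the given encoding,
  // so no separate DefineEncoding call is needed.
  Dim source As String = b.Read(b.Length, Encodings.UTF8)
  b.Close
  // ... search source here, e.g. with countB as in your loop ...
End If
```

Closing the stream as soon as the read is done also keeps file handles from
piling up while the loop works through a large folder.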

Tim
