PDA

View Full Version : Filtering arrayfunction dynamically


sbeausol
02-12-2009, 08:31 PM
Hi-

I am trying to speed up my code by utilizing the filter function for arraycollections. My arraycollection looks like this but is much bigger:

dataList= new ArrayCollection([
{mass:200.2557, intensity:"57"},
{mass:578.6585, intensity:"99},
{mass:670.4367, intensity:"76"},
{mass:689.4545, intensity:"12"},
.....;
I would like to filter by mass, but every filter example I see has the filtered values either hard coded or are passed in by user fields. My question is can i simply pass in what I want to filter by into my filterFunction, or do I have to use public variables which seem messy to me? This would make the filter function filter the arraycollection by what I pass into it at a given point of code...

thanks for any input

wvxvw
02-12-2009, 08:37 PM
Just filter the collection's source and call refresh(), this would be the fastest way to do it. If you can tell what your filter should do, then there would be probably and an example for it.

sbeausol
02-12-2009, 08:54 PM
I would like to filter the collections source multiple times by different values in a loop changing min and max.

here is my filter function:

private function filterByMass(value:Object):Object {
//get clicked values
if (Number(value.mz) >= minMassFilter && Number(value.mz) <= maxMassFilter && Number(value.int != 0)) {
return true;
}
return false;
}

Min and Max will constantly be changing, and idealy I would think I would want to pass min and max in everytime I want to filter?

wvxvw
02-12-2009, 09:55 PM
a. I wouldn't call properties the same way as the buitlin classes.
b. I'd rather pass min and max values than 1 untyped object - this is the way to prevent errors.
c. I wouldn't use Array.filter() because for loop is approx 2 time faster.
var collectionSource:Array = [
{ mass:200.2557, intensity:"57" },
{ mass:578.6585, intensity:"99" },
{ mass:670.4367, intensity:"76" },
{ mass:689.4545, intensity:"12" }];

function filterArray(min:Number, max:Number):Array
{
var lnt:int = collectionSource.length;
var i:int;
var temp:Array = [];
for (i; i < lnt; i += 1)
{
if (collectionSource[i].mass < min) continue;
if (collectionSource[i].mass > max) continue;
temp.push(collectionSource[i]);
}
return temp;
}
var result:Array = filterArray(201, 671);
for each (var o:Object in result) trace(o.mass);
// 578.6585
// 670.4367

sbeausol
02-12-2009, 09:59 PM
ok, for some reason i was under the impression that the filter/sort functions on arraycollections would be more optimized than simply looping through an array. A 2fold speed increase is nice, I'll try that....

thanks!

wvxvw
02-12-2009, 10:04 PM
Yeah, that's actually sad... I've expected that too, and I actually liked the API, but, a simple loop proved to be more efficient :)

Sly_cardinal
02-14-2009, 08:23 AM
Quick note: This isn't a "you're wrong" post. This is a "let's understand when this concept affects us" post.


Yeah, that's actually sad... I've expected that too, and I actually liked the API, but, a simple loop proved to be more efficient :)

I think that needs to be a qualified statement. With a simple test (code included below) I got these results instantiating a number of objects and filtering them.

1,000 objects:
Instantiation: ~1 ms
Array.filter: ~1 ms
iterative function: ~1 ms


10,000 objects:
Instantiation: 5 ms
Array.filter: 3-4 ms
iterative function: 2-3 ms


100,000 objects:
Instantiation: 103 ms
Array.filter: 35 ms
iterative function: 19 ms


1,000,000 objects:
Instantiation: 1235 ms
Array.filter: 343 ms
iterative function: 194 ms


2,000,000 objects:
Instantiation: avg. 3070 ms (min: 2236 ms, max: 4738 ms)
Array.filter: 697 ms
iterative function: 390 ms

It's only when we get to 100,000 objects that we see any interesting difference between the two methods and even then it's only when we get to a million objects (i.e. a *lot* of objects) that there is any significant and, perhaps, noticable difference.

So, for large datasets using the iterative is about 1.8 times faster. I would conclude that if you are not dealing with a large number of items (i.e. a million objects) and if you like the Array.filter API then there is no reason not to use it - there is little real-world difference until you hit the millions-of-objects level.

Test code. Just drop it into a frame script in Flash:

var iterationCount:int;
iterationCount = 1000;

//iterationCount = 10000; // ten thousand.
//iterationCount = 100000; // one hundred thousand.
//iterationCount = 1000000; // one million.
//iterationCount = 2000000; // two million.

var maxNum:int = 100;
var objArray:Array = [];

var min:Number = 50;
var max:Number = 55;

function createObjects():void
{
var startTime:int = getTimer();
var endTime:int;

var newObject:Object;
for (var i:int = 0; i < iterationCount; i++)
{
newObject = {x:i, y:Math.random() * maxNum};
objArray.push(newObject);
}

endTime = getTimer();
trace("createObjects(): Total time=" + (endTime - startTime) + " ms");
}

function doFilterArray():void
{
var startTime:int = getTimer();
var endTime:int;

var newArray:Array = objArray.filter(filterFunction);

endTime = getTimer();
trace("doFilterArray(): Total time=" + (endTime - startTime) + " ms");
}

function doIterativeFilter():void
{
var startTime:int = getTimer();
var endTime:int;

var newArray:Array = iterativeFilter(objArray, min, max);

endTime = getTimer();
trace("doIterativeFilter(): Total time=" + (endTime - startTime) + " ms");
}

var filterFunction:Function = function(curObj:Object, index:int, array:Array):Boolean
{
return curObj.y > min && curObj.y < max;
}

function iterativeFilter(array:Array, min:Number, max:Number):Array
{
var newArray:Array = [];
var len:int = array.length;

var curObj:Object;
for (var i:int = 0; i < len; i++)
{
curObj = array[i];
if (curObj.y < min ) continue;
if (curObj.y > max) continue;
newArray.push(curObj);
}

return newArray;
}

createObjects();

doFilterArray();
doIterativeFilter();

wvxvw
02-14-2009, 10:01 AM
I'd say it doesn't have to be millions... 3D engine / particles etc will usually require you to iterate through lots of objects in a very limited time, and that's where you'd think about 2 ms as of a huge gain :)

sbeausol
02-14-2009, 03:16 PM
ok guys, i've taken some more time to benchmark things, and I'm actually finding a strong speed improvement with the array filter (50%) over the iterative filter, however my function requires 2 things,
1) filter by a number range,
2) sort that resulting array taking the one that has the highest intensity

Here is my function filtering data:

var collectionSource:Array = [
{ mass:200.2557, int:"57" },
{ mass:578.6585, int:"99" },
{ mass:670.4367, int:"76" },
{ mass:689.4545, int:"12" }];


Now my filter implementation (note I have to copy my spectrum because I'm also charting it and i don't want the chart to be filtered etc). What I'm doing is filtering by a min and max then returning the largest intensity from the matches which i do by sorting numerically descending:

private function filterByMz(value:Object):Object {
//get clicked values
if (Number(value.mz) >= minMassFilter && Number(value.mz) <= maxMassFilter && Number(value.int != 0)) {
return true;
}
return false;
}
private function getMatch():Number {
var startTime:int = getTimer();
var endTime:int;
var sort:Sort = new Sort();
sort.fields = [new SortField("int",false,true,true)];
var spectrumCopy:ArrayCollection = new ArrayCollection();
spectrumCopy.source = spectrum.toArray();
spectrumCopy.filterFunction = filterByMz;
spectrumCopy.refresh();
spectrumCopy.sort = sort;
spectrumCopy.refresh();
if (spectrumCopy.length > 0) {
//put intensity check here
endTime = getTimer();
trace("getMatch(): Total time=" + (endTime - startTime) + " ms");
return Number(spectrumCopy.getItemAt(0).int);
}
else {
endTime = getTimer();
trace("getMatch(): Total time=" + (endTime - startTime) + " ms");
return 0;
}
Here is my iteration function with the sort:
private function getMatch2(min:Number,max:Number):Number {
var startTime:int = getTimer();
var endTime:int;
var temp:Array = [];
var i:int;
var retInt:Number = 0;

for (i = 0; i < spectrum.length; i++) {
if (spectrum[i].int == 0) continue;
if (spectrum[i].mz < min) continue;
if (spectrum[i].mz > max) continue;

temp.push(spectrum[i].int);

}

//now sort and return largest int
if (temp.length > 0) {
temp.sort(Array.NUMERIC);
endTime = getTimer();
trace("getMatch2(): Total time=" + (endTime - startTime) + " ms");
return temp.shift();
}
else {
endTime = getTimer();
trace("getMatch2(): Total time=" + (endTime - startTime) + " ms");
return 0;
}

}

Here is my iteration function without the sort and its still just as slow:
private function getMatch2(min:Number,max:Number):Number {
var startTime:int = getTimer();
var endTime:int;
var temp:Array = [];
var i:int;
var retInt:Number = 0;

for (i = 0; i < spectrum.length; i++) {
if (spectrum[i].int == 0) continue;
if (spectrum[i].mz < min) continue;
if (spectrum[i].mz > max) continue;

if (retInt > spectrum[i].int) continue;
retInt = spectrum[i].int;

}
endTime = getTimer();
trace("getMatch2(): Total time=" + (endTime - startTime) + " ms");
return retInt;
}

wvxvw
02-14-2009, 03:42 PM
Are you sure you're getting the same results after both filters?
if (Number(value.mz) >= minMassFilter &&
Number(value.mz) <= maxMassFilter &&
Number(value.int != 0)) {
return true;
} looks just weird, why convert boolean to number and then back to boolean?

And in the iterating function you have assignment and getter call inside the for clause, sure it'll work slowly... =/
And it's not actually a fair comparison because these 2 functions definitely are doing different things... and it really depends very much on how often spectrum[i].int == 0, because if it happens less than 50% of all cases, then the iteration function as you have it now will be less efficient, the same is true for spectrum[i].mz < min...

sbeausol
02-14-2009, 03:49 PM
Ok tweaked the iterative filter some more, now its faster by 50%. here is the final filter:

private function getMatch2(min:Number,max:Number):Number {
var startTime:int = getTimer();
var endTime:int;
var temp:Array = [];
var i:int;
var retInt:Number = 0;
var len:int = spectrumCopy.length;
for (i = 0; i < len; i++) {
//if (spectrumCopy[i].int == 0) continue;
if (spectrumCopy[i].mz < min) continue;
if (spectrumCopy[i].mz > max) continue;
//if (spectrum[i].int != 0 && (spectrum[i].mz >= min && spectrum[i].mz <= max) ) {
if (retInt > spectrum[i].int) continue;
retInt = spectrum[i].int;
//temp.push(spectrum[i].int);
//}
}
endTime = getTimer();
trace("getMatch2(): Total time=" + (endTime - startTime) + " ms");
return retInt;
}