[Schevo-commit] r2371 - branches/gtk-support-3b/Schevo/schevo
Matthew Scott
2006-09-06 13:51:49 UTC
Author: pobrien
Date: Wed Sep 6 07:38:24 2006
New Revision: 2371
branches/gtk-support-3b/Schevo/schevo/extent.py
Added a count() method, similar to find() but faster when all you need is the number of matching entities.
Can you start a new ticket and branch for this, rather than changing
more in gtk-support-3b?

It should be pretty easy to use "python setup.py develop" for Schevo in
one branch, and the same for SchevoGtk in another.

In particular, I think that if something like this is added, we should
1) add tests for it and 2) give similar functionality to query results
and links/m-namespace, where a result object would have a __len__ that
either returns the length of the query result or raises an error if the
length cannot be determined.
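
A minimal sketch of that __len__ behavior, for illustration only. The
QueryResults class and its known_length parameter are hypothetical names,
not Schevo's actual Results classes:

```python
class QueryResults(object):
    """Hypothetical result wrapper; not part of Schevo."""

    def __init__(self, iterable, known_length=None):
        self._iterable = iterable
        # None means the length cannot be determined without iterating.
        self._known_length = known_length

    def __iter__(self):
        return iter(self._iterable)

    def __len__(self):
        if self._known_length is None:
            raise TypeError(
                'length of this query result cannot be determined')
        return self._known_length
```

TypeError is chosen here because it is what len() raises for objects
without a length; a different exception may suit Schevo better.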

I'd also like to consider making the return value of Extent.find a
list subclass that does the following:

* Cache OIDs of results, to determine length and prepare for dereferencing

* Provide a __len__ method that returns that length.

* Override __getitem__ to dereference-and-cache on demand.

* Override __contains__ to at least dereference all items in the list
before calling the superclass __contains__.

* Override __iter__ to dereference on demand. This wouldn't cache
anything, so that converting to another type such as list or set would
have low impact.

* Disable __setitem__. If you want a mutable list from results r, use
"r = list(db.Foo.find(...))"

I started something similar in intent, but not nearly as comprehensive,
in the Results classes in schevo.query. Perhaps the ideas could be merged.
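
The list subclass outlined above might look roughly like this. It is a
sketch under assumptions, not Schevo code: LazyResults, oids, and deref
are hypothetical names, with deref standing in for whatever turns an OID
into an entity. __contains__ is simplified to compare against entities
dereferenced on the fly instead of calling the superclass, and other
mutators such as append are left as-is for brevity.

```python
class LazyResults(list):
    """Read-only list of OIDs that dereferences entities on demand."""

    def __init__(self, oids, deref):
        list.__init__(self, oids)  # The list itself holds the cached OIDs.
        self._deref = deref        # Hypothetical callable: OID -> entity.
        self._cache = {}           # Position -> dereferenced entity.

    # __len__ is inherited from list, so len() never dereferences.

    def __getitem__(self, index):
        if isinstance(index, slice):
            return [self[i] for i in range(*index.indices(len(self)))]
        if index < 0:
            index += len(self)
        if index not in self._cache:
            oid = list.__getitem__(self, index)  # IndexError if out of range.
            self._cache[index] = self._deref(oid)
        return self._cache[index]

    def __iter__(self):
        # Dereference on demand without caching, so that list(results)
        # or set(results) does not grow the cache.
        for oid in list.__iter__(self):
            yield self._deref(oid)

    def __contains__(self, entity):
        return any(candidate == entity for candidate in self)

    def __setitem__(self, index, value):
        raise TypeError(
            'results are read-only; use list(results) for a mutable copy')
```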

The reason to go through all the trouble is so that you can have either
of the following ways to get results, both supporting the use of len()
but allowing behind-the-scenes optimization of such an operation.

r1 = db.Foo.find(name='bar')

q = db.Foo.q.by_example(name='bar')
r2 = q()

assert len(r1) == len(r2)

Another area where len() could potentially be used instead of count()
would be in the 'count' method of an entity's .sys namespace. Instead
of using that, we could have links('Foo', 'bar') return a list subclass
that did the following:

* Override __len__ to return the length of the list if it has already
been gathered; otherwise determine and cache just the count.

* Override __getitem__ and __iter__ to populate the list if not already
done.

* Disable __setitem__. Again, a mutable list can be built as described
above.

This would allow the efficient use of the len() API using the new .m
namespace:

r = db.Foo[1].m.bars()
size = len(r) # Does not actually get links, just the count.
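
The links() list subclass described above could be sketched along these
lines. LazyLinks, count_func, and fetch_func are hypothetical names; the
two callables stand in for whatever the database would use to count and
to gather the links.

```python
class LazyLinks(list):
    """List that can report its length before gathering its items."""

    def __init__(self, count_func, fetch_func):
        list.__init__(self)
        self._count_func = count_func  # Hypothetical: returns the count.
        self._fetch_func = fetch_func  # Hypothetical: returns the links.
        self._count = None
        self._populated = False

    def _populate(self):
        if not self._populated:
            list.extend(self, self._fetch_func())
            self._populated = True

    def __len__(self):
        if self._populated:
            return list.__len__(self)
        if self._count is None:
            # Determine and cache only the count, not the links.
            self._count = self._count_func()
        return self._count

    def __getitem__(self, index):
        self._populate()
        return list.__getitem__(self, index)

    def __iter__(self):
        self._populate()
        return list.__iter__(self)

    def __setitem__(self, index, value):
        raise TypeError(
            'links are read-only; use list(links) for a mutable copy')
```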

More optimization can come later, but I'd like the interface to at least
be consistent across the entire API if quickly finding lengths of result
sets is important.

I think adopting an existing Python API, and optimizing behind the
scenes as necessary, would be more beneficial in the long run than
expanding the count() API, which has a different method signature in
each place that it is defined.
--
Matthew Scott
***@springfieldtech.com
Patrick K. O'Brien
2006-09-06 17:54:11 UTC
Post by Matthew Scott
I'd also like to consider making the return value of Extent.find to be a
<details snipped>
Post by Matthew Scott
More optimization can come later, but I'd like the interface to at least
be consistent across the entire API if quickly finding lengths of result
sets is important.
I think adopting an existing Python API, but optimizing behind the
scenes as necessary, would be more beneficial in the long-run than to
expand the count() API, which has differing method signatures in each
place that it is defined.
I like this idea. Let's do it (in a branch, of course <grin>).
--
Patrick K. O'Brien
Orbtech http://www.orbtech.com
Schevo http://www.schevo.org
Louie http://www.pylouie.org