Recently I've been busy with Debtags code.
Actually, not really Debtags code but Tagcoll code. Tagcoll is a collection of generic templates implementing everything needed to work with tagged stuff. Debtags is just an instantiation of Tagcoll for Debian packages, but Tagcoll could be used for pretty much anything: the wonders of C++.
So, I've been busy with Tagcoll code. More specifically, I've been working on a new high-performance disk index. I've also implemented some basic benchmarks.
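To give an idea of how per-operation timings in milliseconds can be collected, here is a minimal sketch of a timing helper. It is not the actual Tagcoll benchmark code: the name `timeMs` and the `repeat` parameter are made up for illustration, and it assumes a C++11 compiler with `std::chrono`.

```cpp
#include <chrono>

// Hypothetical helper, not part of Tagcoll: runs `op` `repeat` times
// and returns the elapsed wall-clock time in whole milliseconds.
template <typename Op>
long long timeMs(Op op, int repeat = 1) {
    using Clock = std::chrono::steady_clock;  // monotonic, unaffected by clock adjustments
    const auto start = Clock::now();
    for (int i = 0; i < repeat; ++i)
        op();
    const auto stop = Clock::now();
    return std::chrono::duration_cast<std::chrono::milliseconds>(stop - start).count();
}
```

Each benchmarked operation would be wrapped in a lambda and passed to the helper, with `repeat` set high enough that the total rises above the clock's resolution.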
Here is a table with the benchmark results:
Operation | TDBReadonlyDiskIndex | BasicStringDiskIndex | TDBIndexer |
---|---|---|---|
instantiating | 30ms | 0ms | 310ms |
hasTag | 200ms | 0ms | 0ms |
getTags[item] | 180ms | 90ms | 40ms |
getTags[items] | 10ms | 0ms | 0ms |
getItems[tag] | 330ms | 80ms | 30ms |
getItems[tags] | 1140ms | 20ms | 170ms |
getTaggedItems | 50ms | 10ms | 20ms |
getAllTags | 270ms | 0ms | 0ms |
getCardinality | 190ms | 0ms | 40ms |
getCompanionTags | 1410ms | 60ms | 280ms |
output | 18910ms | 7740ms | 50ms |
outputHavingTags | 1330ms | 90ms | 200ms |
TDBReadonlyDiskIndex is the disk index currently used in Debtags, BasicStringDiskIndex is the new disk index, and TDBIndexer is an in-memory-only index implemented with two binary trees.
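For the curious, the "two binary trees" idea can be sketched roughly as follows: one tree maps each item to its tags, the other maps each tag to its items, so lookups in either direction are a single tree search. This is only an illustration and not the real TDBIndexer: the class name `TwoTreeIndex`, the `add` method and all the signatures are assumptions (only the operation names `hasTag`, `getTags` and `getItems` come from the table), and it relies on `std::map`, which is typically implemented as a balanced binary tree.

```cpp
#include <map>
#include <set>
#include <string>

// Hypothetical sketch of an in-memory tag index built on two trees.
// Not Tagcoll's actual TDBIndexer; names and signatures are assumptions.
class TwoTreeIndex {
public:
    using Set = std::set<std::string>;

    // Record that `item` carries `tag`, updating both trees.
    void add(const std::string& item, const std::string& tag) {
        tagsByItem[item].insert(tag);
        itemsByTag[tag].insert(item);
    }

    // True if at least one item carries `tag`.
    bool hasTag(const std::string& tag) const {
        return itemsByTag.count(tag) != 0;
    }

    // All tags attached to `item` (empty set if the item is unknown).
    Set getTags(const std::string& item) const {
        return lookup(tagsByItem, item);
    }

    // All items carrying `tag` (empty set if the tag is unknown).
    Set getItems(const std::string& tag) const {
        return lookup(itemsByTag, tag);
    }

private:
    static Set lookup(const std::map<std::string, Set>& tree,
                      const std::string& key) {
        auto it = tree.find(key);
        return it == tree.end() ? Set() : it->second;
    }

    std::map<std::string, Set> tagsByItem;  // item -> its tags
    std::map<std::string, Set> itemsByTag;  // tag  -> its items
};
```

The price of keeping both directions is that every association is stored twice, which is the usual memory-for-speed trade of an in-memory index.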
I'd say, definitely not too bad :)