As this project gets closer to usability, performance on large datasets will start to matter. I suspect that attrs is not a bottleneck, but we do instantiate a large number of attrs classes. We should evaluate whether switching to msgspec.Struct offers a nontrivial performance improvement.
I would say the two most relevant measures are:
- CLI responsiveness. If importing is slow, the CLI will feel sluggish, even if it is only a one-time cost of a couple hundred ms.
- Total runtime on a large dataset. A 5-10% speedup is probably worth the switch. A speedup under 5% is worth it only if the code also becomes simpler to read.
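
As a starting point for the runtime measure, here is a rough micro-benchmark sketch of per-instance construction cost. The `Point*` classes and their fields are hypothetical stand-ins, not classes from this project, and the stdlib dataclass is included only as a baseline. attrs and msgspec are benchmarked only if they are installed. Construction cost in isolation may not reflect end-to-end runtime, so this should be read alongside a run on a real large dataset.

```python
import timeit
from dataclasses import dataclass


@dataclass
class PointDataclass:
    # Stdlib baseline; the class and its fields are hypothetical stand-ins.
    x: int
    y: int
    label: str


# Maps a display name to the class under test.
candidates = {"dataclass": PointDataclass}

try:
    import attrs

    @attrs.define
    class PointAttrs:
        x: int
        y: int
        label: str

    candidates["attrs"] = PointAttrs
except ImportError:
    pass  # attrs not installed; skip it

try:
    import msgspec

    class PointStruct(msgspec.Struct):
        x: int
        y: int
        label: str

    candidates["msgspec"] = PointStruct
except ImportError:
    pass  # msgspec not installed; skip it


def bench_instantiation(cls, number=100_000, repeat=5):
    """Return the best observed construction time per instance, in nanoseconds."""
    timer = timeit.Timer(lambda: cls(1, 2, "a"))
    best = min(timer.repeat(repeat=repeat, number=number))
    return best / number * 1e9


def run(number=100_000):
    """Benchmark every available candidate and return {name: ns per instance}."""
    return {name: bench_instantiation(cls, number) for name, cls in candidates.items()}


if __name__ == "__main__":
    for name, ns in run().items():
        print(f"{name:>10}: {ns:8.1f} ns per instance")
```

For the CLI responsiveness measure, `python -X importtime -c "import attrs"` (and the same for `msgspec`) prints a per-module import time breakdown to stderr, which gives a first estimate of the one-time import cost.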