Skip to content

Region GC#105

Open
kulisak12 wants to merge 16 commits into
fxpl:regions-mainfrom
kulisak12:regions-gc
Open

Region GC#105
kulisak12 wants to merge 16 commits into
fxpl:regions-mainfrom
kulisak12:regions-gc

Conversation

@kulisak12

Copy link
Copy Markdown

Please check the last commit in particular, I still don't understand the write barrier enough 😅

Comment thread Python/gc.c
@xFrednet

Copy link
Copy Markdown
Collaborator

For now, I only checked the last commit (based) on the description. I plan to look at the rest later.

The fact that this PR is this small is super impressive and a testament for how well you understood the whole thing and wired it up! I'm looking forward to reviewing it :D

@xFrednet xFrednet left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Excellent, the code is super readable and concise. I'm pretty confident that I can modify it or that we can give this to someone else to test different scheduling mechanics. Very well done, thank you for all the work you put into this!

I have some comments, but mainly some clarifying questions. I didn't find any logic bugs :D

Comment thread Objects/cownobject.c
Comment thread Objects/cownobject.c
Comment on lines +488 to +489
PyThreadState *tstate = PyThreadState_Get();
if (!_PyGC_CanRunRegionGC(tstate)) {

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The docs of PyThreadState_Get(); state that it can return NULL. But this code doesn't null check. Does it expect _PyGC_CanRunRegionGC to do the NULL check? If so, can you add a comment about this?

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where did you find that?
These docs say the opposite.

Comment thread Objects/cownobject.c

// Lock without blocking, we only want the cown if it is released.
int res = cown_lock(self, NO_BLOCKING_TIMEOUT, this_ip, true);
if (res != COWN_ACQUIRE_SUCCESS) {

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this check for COWN_ACQUIRE_ERROR and in that case surface the error to the caller? This is being called from CownObject_release which can still throw an exception.

If you want to explicitly ignore errors, can you add a comment about why this is safe?

@kulisak12 kulisak12 Jun 27, 2026

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The idea is: if res == COWN_ACQUIRE_SUCCESS, then collect the region. Otherwise, just don't collect it and ignore the error. I added a comment and also a PyErr_Clear() which should have been there.

Comment thread Python/gc.c
Comment thread Python/gc.c
Comment on lines +1979 to +1985
/* Separate child regions from contained objects.
* Finalizers need them to be in the GC list, at the start of the list.
*/
PyGC_Head contained;
gc_list_init(&contained);
gc_region_list_split(&data->gc_list, &contained);

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With all of these additional splits and joins I see why you asked if we can split the contained object and child-region lists. For now, I want to keep this structure since it works, but this should definitely be on the "benchmark this" list for when the implementation is complete-ish.

The implementation thus far is super readable, so this change should be easy to in cooperate into your implementation.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the other hand, the GC iterates over all children anyway when traversing the tree. So this just means we iterate twice instead of once. That might be a good tradeoff for 16 bytes saved in the region data.

Comment thread Python/gc.c
*/
gcstate->region_budget += 3;
if (gcstate->region_budget > gcstate->young.threshold) {
gcstate->region_budget = gcstate->young.threshold;

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume this is one of the places we can modify to adjust the scheduling?

For a steady state heap, the amount of work to do is three times

Where does the "three times" come from, is this a different comment in this file or some benchmarking?

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is taken from the comment in assess_work_to_do. I came to the conclusion that it is correct.

Comment thread Python/gc.c
Comment on lines +2839 to +2840
// TODO(regions): If callback moves op into a region,
// this function would start iterating objects in the region.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume this TODO is if op is local?

We can later mark the object as unmovable, when we support this on a per-instance level :)

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is related to PyUnstable_GC_VisitObjects, which is an unstable API function that calls a given callback for every object tracked by the local GC.
So yes, op is local. I guess you could mark it unmovable, do the callback, and then restore movability?

Comment thread Lib/test/test_regions/test_gc.py
Comment thread Lib/test/test_regions/test_gc.py
Comment thread Lib/test/test_regions/test_gc.py
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants