newsletters:2024-09
Differences
This shows you the differences between two versions of the page.
newsletters:2024-09 [2024/10/01 03:37] – [What we'll be up to in October] osnr | newsletters:2024-09 [2024/10/01 03:47] (current) – [New parallel evaluator] osnr | ||
---|---|---|---|
Line 97: | Line 97: | ||
=== Sysmon and thread pool === | === Sysmon and thread pool === | ||
- | Started working on a sysmon thread that manages the size of the thread pool (too many threads looking for work? kill some; too few threads compared to the number of CPUs? spawn some) | + | Started working on a sysmon thread that wakes up every few milliseconds and manages the size of the thread pool (too many threads looking for work? kill some; too few threads compared to the number of CPUs? spawn some). |
- | There are subtleties here where you want to avoid churning the thread pool and constantly killing and re-spawning stuff. Haven' | + | There are subtleties here where you want to avoid churning the thread pool and constantly killing and re-spawning stuff. Haven' |
- | + | ||
- | Added workqueue display on threads | + | |
=== Sustain / time-to-live field on statements or Holds === | === Sustain / time-to-live field on statements or Holds === | ||
Line 109: | Line 107: | ||
sysmon (which previously just managed the size of the thread pool) is extended (we may rename it to custodian) to also reap sustained statements when they hit their deadline. It feels nice to do it here rather than complicating the priority queue further, since sysmon is already waking up every couple of milliseconds anyway (and it doesn' | sysmon (which previously just managed the size of the thread pool) is extended (we may rename it to custodian) to also reap sustained statements when they hit their deadline. It feels nice to do it here rather than complicating the priority queue further, since sysmon is already waking up every couple of milliseconds anyway (and it doesn' | ||
- | We had a discussion about this proposal in Discord: | + | We had a good discussion about this proposal in Discord: |
- | > Omar: when you make a When (or a statement) in parallel Folk, we might have an option for " | + | {{:newsletters: |
- | > | + | |
- | > just thinking about anti-blink measures. it wouldn' | + | <details> |
- | > | + | <summary>More discussion |
- | > | + | |
- | >> Is there a particular context where this came up? | + | {{: |
- | > it's pretty general -- you'll see page outlines blink out in the new evaluator, because they get retracted because the old camera frame is retracted & that is a race against the new camera frame getting processed (and if the retract happens before the page gets re-added due to the new frame, it blinks out) | + | |
- | > | + | {{: |
- | > the specific one that just came up is that we were exec-ing some keymap command a lot because the keyboard page kept blinking in and out (downstream | + | </details> |
- | > | + | |
- | > we also do this de facto in single-threaded Folk right now by ordering retractions after new assertions in the priority queue | + | |
- | > | + | |
- | >> Why does the old frame get retracted? | + | |
- | > retracting old statements is part of the core loop of the system -- you can kind of think of the camera subsystem as a while loop that just goes like this | + | |
- | > < | + | |
- | while true { | + | |
- | set frame [camera readFrame] ;# blocking, takes 16ms | + | |
- | | + | |
- | | + | |
- | set prevFrame $frame | + | |
- | }</code> | + | |
- | > and then when the Assert happens, the new set of active pages is calculated, and the new outlines of all pages are calculated, and then the shader display list is calculated, etc, and that's all in place | + | |
- | > | + | |
- | > and then the Retract happens and rips out the old set of active pages (but any pages that are still on the table are now supported by the new Assert so they don't blink out) | + | |
- | > | + | |
- | > (and then only once the single-threaded evaluator has converged and all tasks are complete, after all of this, the final display list is rendered on the GPU, so you don't see the transient retracted old statements) | + | |
- | >> And in the parallel case Retract is potentially happening before the set of active pages are reactively set as a result of the Assert? | + | |
- | > Yes, exactly -- it's a race since those operations are parallel-scheduled in the new evaluator | + | |
- | >> So the TTL could live in the camera frame claim or maybe the tag detector to make detected programs stickier. I think I got it. There is something that feels a little wrong about putting a time on it instead of declaring some sort of dependency, but I get it. | + | |
- | > Yeah | + | |
- | > I'm hoping that we would only need to use them internally in a couple specific places | + | |
- | > and the end user programming experience would broadly be the same | + | |
- | > | + | |
- | > hmm yeah i think the problem is that you don't really want to introduce a hard dependency because then programs can arbitrarily block the control loop if they' | + | |
- | > | + | |
- | > it feels like at some level, you do actually want to say, hey, if you can't resolve within a few milliseconds, | + | |
- | > | + | |
- | > anyway this (blinking) remains an open problem so i'm gonna try it when i get the chance and see how it feels | + | |
- | > | + | |
- | >> This is a really good point. We would want a timeout on a lock anyway. | + | |
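To make the sustain / time-to-live idea a bit more concrete, here is a minimal sketch in plain Tcl of what a sysmon tick that reaps expired statements could look like. None of these names (`sustained`, `sustain`, `reapExpired`) are real Folk APIs; they are made up for illustration, and the sketch only models the deadline bookkeeping, not the actual evaluator or its statement database.

```tcl
# Hypothetical sketch: these names are NOT part of Folk.
# Maps a statement (as a string) to its absolute deadline in milliseconds.
set sustained [dict create]

# Record that a statement should be kept alive until nowMs + ttlMs.
proc sustain {statement ttlMs nowMs} {
    global sustained
    dict set sustained $statement [expr {$nowMs + $ttlMs}]
}

# One sysmon tick: remove every statement whose deadline has passed
# and return the list of statements that were reaped.
proc reapExpired {nowMs} {
    global sustained
    set reaped {}
    # dict for iterates over the dict value as it was when the loop started,
    # so unsetting keys in the variable during the loop is safe.
    dict for {statement deadline} $sustained {
        if {$deadline <= $nowMs} {
            lappend reaped $statement
            dict unset sustained $statement
        }
    }
    return $reaped
}

# Example tick sequence. Timestamps are passed in explicitly so the sketch
# is deterministic; a real sysmon would read the clock itself.
sustain "page 12 outline" 50 1000
puts [reapExpired 1040]   ;# deadline is 1050, so nothing is reaped yet
puts [reapExpired 1060]   ;# past the deadline, so the statement is reaped
```

The point of the design is the one made in the discussion: the old statement lingers for a bounded time instead of being tied to a hard dependency, so a slow downstream program can't block the control loop, and the reaping piggybacks on a thread that is already waking up every few milliseconds.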
=== Other new evaluator stuff === | === Other new evaluator stuff === | ||
+ | |||
+ | Added a workqueue display to the /threads Web page, which made it clear that many of the random issues with folk2 came from work-stealing breaking down: work items would get permanently stuck on a thread that was busy running a permanent task (instead of getting stolen and executed elsewhere). | ||
Fixed some keyboard issues where the keyboard process was erroring or exec-ing stuff all the time (because of lack of persistence / because of blinking), and where the grabber was broken because it wasn't inter-thread-safe. | Fixed some keyboard issues where the keyboard process was erroring or exec-ing stuff all the time (because of lack of persistence / because of blinking), and where the grabber was broken because it wasn't inter-thread-safe. | ||
+ | |||
+ | Scheming about this memory management idea again, since it could simplify a lot of the implementation: | ||
+ | |||
+ | {{: | ||
+ | |||
+ | |||
+ | < | ||
+ | < | ||
+ | |||
+ | {{: | ||
+ | </ | ||
==== Friends and outreach ==== | ==== Friends and outreach ==== |
newsletters/2024-09.1727753850.txt.gz · Last modified: 2024/10/01 03:37 by osnr