Folk Computer

What we've been up to

Daniel Pipkin and Mason Jones paired in person in Utah on dual-tag-derived quads that work in 3D:
- Omar: sort of a modernized version of the dual-tags that Naveen Michaud-Agrawal had a couple of years ago in 2D
Daniel fixed our ESC/POS receipt printer support for folk2: now it uses fn and Subscribe:/Notify: instead of globals
Esben Sørig contributed some camera/setup fixes:
- Extend camera load timeout to 2 seconds and add retry mechanism, which seems to fix the Logitech C270 camera
  - Also don't die on non-fatal JPEG decompress errors
- Make aspect ratio of camera preview on /calibrate adapt to actual camera aspect ratio
Andrés and Brian worked on the shape library, motivated by ideas for drawing shapes to highlight objects on the table
Discussions from the Folk Discord:
- Preview of the simplified shape code working in folk2 to draw a square on a quad
- Hamish Todd delivered a version of his presentation on his setup and perspective experiments in Folk in the Discord
- TJoh, Paul, and Mason discussed implementing physical “knobs” using tag angles; confirmed that regions are deprecated in favor of quads in folk2, and discussed the math for extracting angles from quads
- Richard asked about persistent memory for programs not on the table; Mason pointed to the Hold! -save functionality in folk2
- Esben Sørig kicked off a thread on 4K camera recommendations, with Audrey providing notes on ELP camera compatibility
- doodr shared progress on a custom “dotframe” encoding/decoding system built with OpenCV
- Omar suggested implementing automatic ephemeral temp folders for programs to more cleanly handle downloads and Unix dependencies
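
The knob idea from the Discord discussion above comes down to reading a rotation off a quad's corners. Here is a minimal Python sketch of that math, not Folk's actual quad API. It assumes a quad is four 2D (x, y) corners ordered top-left, top-right, bottom-right, bottom-left, and it ignores perspective distortion. The names `quad_angle_degrees` and `knob_value` are hypothetical.

```python
import math

def quad_angle_degrees(quad):
    """Return the in-plane rotation of a quad in degrees, in [0, 360).

    Assumption: `quad` is four (x, y) corners ordered top-left, top-right,
    bottom-right, bottom-left. In image coordinates (y pointing down) a
    growing angle means clockwise rotation on screen.
    """
    (tlx, tly), (trx, try_), (brx, bry), (blx, bly) = quad
    # Sum the top edge and bottom edge vectors; atan2 of the sum gives
    # their mean direction, which is less sensitive to one noisy corner.
    dx = (trx - tlx) + (brx - blx)
    dy = (try_ - tly) + (bry - bly)
    return math.degrees(math.atan2(dy, dx)) % 360.0

def knob_value(quad, lo=0.0, hi=1.0):
    """Map the quad's rotation linearly onto the range [lo, hi)."""
    return lo + (hi - lo) * quad_angle_degrees(quad) / 360.0
```

An unrotated tag reads as 0 degrees and a quarter turn maps to a quarter of the knob's range. A real knob would probably also want smoothing and a dead zone around the 0/360 wraparound.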

OCR experiments

Omar: I've been wanting to do non-AprilTag inputs for a while, especially handwriting input (see proposal from last year, for example).

I started looking at OCR/handwriting recognition more seriously a couple of months ago, and I concluded that TrOCR was probably the right building block (AI model) for this. It runs pretty fast (0.1s to 1s) on most systems, on both GPU and CPU; I was able to run it locally; and it's trained for handwriting rather than print. I did some prototyping in Python for that.

Now that we have folk2, it's a lot easier to run synchronous tasks like handwriting recognition without blocking the rest of Folk, so I want to finally get stuff like this deeply integrated (and have the integration be pretty simple and idiomatic) and prototype how to interact with it.

So at the end of January, I finally started hacking on a TrOCR integration that runs inside Folk, so we can actually write on the table and get live text output.

Here's a draft trocr.folk that exports a callable function (more verbose / AI-y than it probably should be, but it's fully self-contained, which is great).

Easy invocation on a static on-disk image (to use live, put it in an infinite loop and query for camera slice):

When the image library is /imageLib/ &\
     the TrOCR function is /TrOCR/ {
  fn TrOCR
  set im [$imageLib loadJpeg "a01-122-02.jpg"]
  puts [TrOCR $im]
}

It runs really fast on folk-hex when CUDA is on (maybe 10x faster, multiple frames per second). That liveness is fun to play with.

Excited to show more demos around this next month.

A lot of this is built on uvx, which I just learned about this month. uvx gives us the ability to run Python with arbitrary dependencies in 'immediate mode', which aligns a lot better with Folk (wanting self-contained declarative Folk programs, wanting to be able to change them to do anything at any time, etc.) than a traditional project/env-oriented Python workflow would. You just run uvx with all the dependencies as arguments and can then eval any Python you want; it automatically caches the dependencies, so it doesn't have to reinstall from scratch every time.
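
To make the 'immediate mode' idea concrete, here is a small Python sketch that assembles such a uvx invocation as an argument list. It is not taken from trocr.folk. It assumes uv's `--with <package>` option and the `uvx ... python -c <code>` form, which is my understanding of the uv CLI and worth checking against your installed version. The helper name `build_uvx_command` is hypothetical.

```python
def build_uvx_command(dependencies, python_code):
    """Build an argv list that evals `python_code` under uvx.

    Each entry in `dependencies` becomes a `--with <package>` pair, so the
    whole environment is described by the command line itself rather than
    by a project directory or a virtualenv.
    """
    argv = ["uvx"]
    for dep in dependencies:
        argv += ["--with", dep]
    # Assumption: `uvx python -c ...` runs an interpreter that can see
    # the packages requested above.
    argv += ["python", "-c", python_code]
    return argv
```

For example, `build_uvx_command(["transformers", "torch", "pillow"], "import transformers; print(transformers.__version__)")` gives a list you could hand to `subprocess.run`. The first run downloads the packages and later runs reuse uv's cache.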

Some issues to resolve:

TrOCR really can't cope with any padding around the handwriting; it needs a tight box around the text, so we need a separate text detection model to maximize usability (otherwise you have to write your word to exactly fill the camera slice, which is sort of annoying, especially if the slice is reasonably large)
I think we'll need to push harder on finding and keeping a really good calibration if we want to be able to project onto/alongside handwriting
I also want to figure out a camera slice change detection system so we don't have to run the OCR model continuously
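
For the change detection item above, one simple starting point is to compare each new camera slice against the last slice the OCR model actually ran on, and only rerun when the mean pixel difference crosses a threshold. This Python sketch is an assumption about how it could work, not existing Folk code. It treats a slice as a flat sequence of 0-255 grayscale values, and the `SliceChangeDetector` name and default threshold are hypothetical.

```python
def mean_abs_diff(prev, curr):
    """Mean absolute per-pixel difference between two equal-size frames."""
    if len(prev) != len(curr):
        raise ValueError("frames must have the same size")
    if not curr:
        return 0.0
    return sum(abs(a - b) for a, b in zip(prev, curr)) / len(curr)

class SliceChangeDetector:
    """Decide whether a camera slice changed enough to rerun OCR."""

    def __init__(self, threshold=4.0):
        self.threshold = threshold  # in gray levels; would need tuning
        self._last = None           # last frame that OCR was run on

    def should_rerun(self, frame):
        frame = bytes(frame)
        changed = (
            self._last is None
            or len(self._last) != len(frame)
            or mean_abs_diff(self._last, frame) > self.threshold
        )
        if changed:
            # Store the reference only when we rerun, so slow drift
            # accumulates against it instead of being absorbed.
            self._last = frame
        return changed
```

Comparing against the last processed frame rather than the previous frame means that slowly written strokes still trigger a rerun eventually. Camera noise and projector flicker would probably call for blurring or downsampling first.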

Text detection with CRAFT

I started experimenting with the CRAFT text detector, which seems to be what people generally pair with TrOCR.

I tuned it to use the GPU and to be more aggressive in linking regions, so we capture the whole “wow” in one box here:

Here's the draft code that does both CRAFT and TrOCR, which I haven't tested on a table (I'm now working on a more general Python FFI so that the OCR pipeline isn't one big monolith, and so we can try other models easily).
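
The hand-off between detection and recognition can be sketched independently of either model. This Python sketch is not the draft code and not how CRAFT links regions internally. It assumes the detector returns axis-aligned boxes as (x0, y0, x1, y1) pixel tuples, merges them into one box, and computes a tight crop rectangle with a small margin to pass to the recognizer. The function names are hypothetical.

```python
def merge_boxes(boxes):
    """Return the union of axis-aligned boxes, each (x0, y0, x1, y1)."""
    if not boxes:
        raise ValueError("need at least one box")
    return (
        min(b[0] for b in boxes),
        min(b[1] for b in boxes),
        max(b[2] for b in boxes),
        max(b[3] for b in boxes),
    )

def padded_crop_rect(box, image_width, image_height, margin=4):
    """Grow `box` by `margin` pixels per side, clamped to the image.

    The margin is kept small because the recognizer is assumed to want a
    tight box around the handwriting.
    """
    x0, y0, x1, y1 = box
    return (
        max(0, x0 - margin),
        max(0, y0 - margin),
        min(image_width, x1 + margin),
        min(image_height, y1 + margin),
    )
```

Merging every box into one only makes sense for a slice that holds a single word or line; multiple lines would need grouping by row before cropping.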

Stuff from China

Omar: I was in China for a few weeks (for some workshops and conferences in Shanghai and Shenzhen). I prioritized getting stuff I couldn't find in the US (often these are not even on Aliexpress, only on Taobao or in person). Got a few interesting Folk-related gadgets:

"4K" "X10 MAX" projector (actually 1080p)
- I like the automatic focus mechanism. The projector is pretty bright
- Got this for 888RMB ($126) on Taobao, which is a good price for a good 1080p projector, but it's not 4K
- It's really hard to find a genuine 4K projector. A lot of projectors advertise 4K, but it's just 4K input and they downscale on output to 1080p (pretty useless for us, since we want the actual resolution for text editing). Salespeople and product pages often don't know this distinction and will mislead you
Hachi K1 tabletop projector/touchscreen
- I got this for 1400RMB ($200) at a random store in HQB – way cheaper than anything comparable I've seen online (I mostly see the M1 online; not sure what the difference is) and, I think, way cheaper than its original retail price
  - I heard about the Hachi devices years ago; $600-1000 wasn't worth it to me to import, but $200 easily is
- I think the K1 is the last model they put out and is China-only? It seems mostly similar to the older models I've seen online
- It's basically an Android tablet. Trying to root it and see what the sensor inputs are (cameras, laser line), maybe run Folk on it. It's a really nice form factor – could also just copy it and make our own device
Deli barcode scanners: the AA601 tabletop & especially the ES228WB handheld
- I just liked the industrial design of these:
- The handheld ES228WB especially is very inspiring for a Folk gadget form factor and looks so much more modern than most barcode scanners I've seen
- (this makes sense, since a barcode scanner is the equivalent of a cash register there – they're very cheap, $10-20, very common, very consumer-facing)
Guangzhou X-Dream X1200W portable A4/letter printer
- I got this on Taobao for 800RMB ($115) – way cheaper than the Canon or Epson portable printers we've used in the past
- This is a totally viable printer for Folk if only we could get good drivers for it (I can only drive it from their iOS app right now, even their PC Linux drivers don't work for me)

In general, my hope was that we could find competitively-priced devices in these categories (that maybe are only available in China), but I was only kind of successful. Let us know if you know of anything:

4K projector ($800 or cheaper, and/or less than 5 pounds) like the AAXA 4K1
Pico projector ($400 or cheaper) like the Ultimems AnyBeam
Portable printer ($300 or cheaper) like the Canon TR150 or Epson EC-C110

Briefly showed off the new pink gadget1 I made last month:

Also started looking into PCB design for a side project. As a result, I'm thinking very seriously about making a smaller gadget that fits in a barcode scanner-like chassis (something like the Deli one above). Would need a custom PCB and heat dissipation strategy, but we could just use the bare Ultimems projector board (no heat sink), and we could use just one 18650 battery. That would go a long way in reducing weight and volume.

Outreach

Open house

We had a small open house on Thursday, January 29:
- Open house visitors going through the new Folk introductory booklet:

What we'll be up to in February

Our next Folk open house is in the evening on Thursday, February 26th, in East Williamsburg, Brooklyn.
Andrés: Finishing shape library
Andrés: Working on restoring and making demos with the dot detector capability
Omar: Make and post some CRAFT+TrOCR demos; publish Python FFI; maybe start playing with segmentation
Omar: Fix some bugs with folk2 (texture blinking, leaks)
Omar: Maybe try and start on stereo calibration and/or calibration refinement
Omar: New gadget design?

Links we've enjoyed

Omar

Andrés

A large collection of PDFs from Xerox PARC
Diagrams Matter by Stan Allen
Moondream, a tiny vision + language model for running on things like Raspberry Pis

January 2026 newsletter