Added UTF-8 multi-byte support to a text editor today. I'd always been scared to get into it; from a distance it looked messy and confusing. But the design makes it pretty accessible, even for a system as small as uxn.

The rule is pretty simple:

- starting bytes of a multi-byte character are 11xx xxxx
- continuation bytes are 10xx xxxx

The entire implementation to handle multi-byte characters is a mere 30ish bytes long.
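
To make the rule concrete, here's a rough Python sketch of how cursor movement can use those two bit patterns. This is not the uxn code from the editor; the function names (is_continuation, is_lead, next_char, prev_char) are made up for illustration, and it assumes the buffer holds valid UTF-8.

```python
def is_continuation(byte: int) -> bool:
    # Continuation bytes look like 10xx xxxx: top two bits are 1 then 0.
    return (byte & 0xC0) == 0x80


def is_lead(byte: int) -> bool:
    # Starting bytes of a multi-byte character look like 11xx xxxx.
    return (byte & 0xC0) == 0xC0


def next_char(buf: bytes, i: int) -> int:
    # Move the cursor one character right: step one byte,
    # then skip any continuation bytes that follow.
    i += 1
    while i < len(buf) and is_continuation(buf[i]):
        i += 1
    return i


def prev_char(buf: bytes, i: int) -> int:
    # Move the cursor one character left: step back one byte,
    # then keep going while we are sitting on a continuation byte.
    i -= 1
    while i > 0 and is_continuation(buf[i]):
        i -= 1
    return max(i, 0)
```

The cursor never needs to decode the code point; it only needs to avoid stopping on a 10xx xxxx byte, which is why the whole thing stays so small.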

wiki.xxiivv.com/site/utf8.html

example implementation: git.sr.ht/~rabbits/left/tree/m
