kbin account: e0qdk@kbin.social

This is my Lemmy alt. I’m about 50/50 between kbin and reddthat these days, but my kbin account is more established. If you’re looking for my older posts, check there.

Interests: programming, video games, anime, music composition

  • 1 Post
  • 8 Comments
Joined 7 months ago
Cake day: November 27th, 2023

  • I was curious, so I did some searches on this topic for you and found these pages:

    The second link in particular notes:

    The reason that things are much easier with all ASCII data is that practically every Unicode encoding in existence maps bytes 0x00…0x7f to the corresponding code points, so byte strings and Unicode strings that contain the same all-ASCII data are basically equivalent, even semantically. What usually trips people up with non-ASCII data is that the semantic meaning of bytes in the range 0x80…0xff changes from one encoding to another.

    But, thinking like a systems programmer again, for many purposes the semantic meaning of bytes 0x80…0xff doesn’t matter. All that matters is that those bytes are preserved unchanged by whatever operations are done. Typical operations like tokenizing strings, looking for markers indicating particular types of data, etc. only need to care about the meaning of bytes in the range 0x00…0x7f; bytes in the range 0x80…0xff are just along for the ride.

    So the trick for beating Python 3 strings into submission is to put in encoding and decoding calls where you need to, choosing a single-byte encoding that doesn’t mutate 0x80…0xff. There are many of these; most of the Latin-{1…6} sequence (aka ISO-8859-1…10) has this property. What you do not want to do is pick utf-8 or any of the multibyte Asian encodings. Latin-1 will do fine; in fact it has an advantage over the others in memory consumption, which we’ll describe below.

    Whether depending on this is actually correct or not is beyond me, but it seems like people have actually been using that pass-through behavior in practice and put it into things like Python2 -> 3 migration guides.

    The first link suggests that the seemingly undefined ranges are valid as C0 and C1 control codes which may be why it doesn’t throw errors.
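
    For what it's worth, here is a minimal sketch of the pass-through behavior described above, using only Python 3's built-in `latin-1` codec. The `record` bytes and the `key=value;key=value` layout are made up for illustration; they are not from either linked page.

```python
import unicodedata

# Every possible byte value, including 0x80-0xff.
raw = bytes(range(256))

# Latin-1 maps byte N to code point U+00NN, so decoding never raises.
text = raw.decode("latin-1")
round_trip = text.encode("latin-1")

# 0x80-0x9f decode to the C1 control code points rather than being rejected.
c1_category = unicodedata.category(b"\x85".decode("latin-1"))

# The same bytes are not valid UTF-8.
try:
    raw.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False

# Hypothetical record: ASCII markers around opaque high bytes.
record = b"name=\xff\xfe\x80;size=42"
fields = dict(f.split("=", 1) for f in record.decode("latin-1").split(";"))

# Only the ASCII markers were interpreted; the high bytes come back unchanged.
name_bytes = fields["name"].encode("latin-1")
```

    The point is just that `name_bytes` ends up identical to the original high bytes and `round_trip == raw` holds, which is the "along for the ride" property the quote relies on. It says nothing about whether those bytes mean anything as Latin-1 text.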


  • I think the term would be “necrobump”

    That’s from old school forums where posting to a thread bumped it back to the top of the feed and thus thrust old info prominently into everyone’s view again. You won’t get that same bump effect with most sorts on Lemmy. (“New comments” sort might work like that though? I’m not sure exactly how that’s handled.)

    otherwise everyone has moved on

    It’s pretty rare to get much of a response even after just 24 hours or so – not just in terms of comments, but even for upvotes. I think after that point, posts are usually so far down people’s feeds that almost no one sees them any more. That probably also discourages most people from replying since basically no one will see the reply. (Maybe the poster of the thread or comment you’re replying to will see it, but probably almost no one else will if it’s more than a day or so old.)

    Some people do dig through community archives and/or user profiles – particularly after a new thread is posted – and they’ll occasionally upvote old posts, but they very rarely comment.



  • I quit YouTube along with reddit last summer. I don’t use alternate interfaces. I haven’t found a replacement for most of the niche content I liked to watch there – and yes, that sucks.

    I’ve mostly been watching offline content (like DVDs and things I downloaded years ago) when I want video entertainment, and doing other stuff with my free time.

    You might think that’d mean more time playing games given my interests, but I’ve found I’m a lot less enthusiastic about playing through a game if I can’t watch an LP or two of it afterwards. So, I’m actually playing (and also buying) fewer games than I used to, too.