It’s not always easy to distinguish between existentialism and a bad mood.

  • 6 Posts
  • 74 Comments
Joined 1 year ago
Cake day: July 2nd, 2023





  • On each step, one part of the model applies reinforcement learning, with the other one (the model outputting stuff) “rewarded” or “punished” based on the perceived correctness of their progress (the steps in its “reasoning”), and altering its strategies when punished. This is different to how other Large Language Models work in the sense that the model is generating outputs then looking back at them, then ignoring or approving “good” steps to get to an answer, rather than just generating one and saying “here ya go.”

    Every time I’ve read how chain-of-thought works in o1 it’s been completely different, and I’m still not sure I understand what’s supposed to be going on. Apparently you get a strike notice if you try too hard to find out how the chain-of-thought process goes, so one might be tempted to assume it’s something that’s readily replicable by the competition (and they need to prevent that as long as they can) instead of any sort of notably important breakthrough.

    From the detailed o1 system card pdf linked in the article:

    According to these evaluations, o1-preview hallucinates less frequently than GPT-4o, and o1-mini hallucinates less frequently than GPT-4o-mini. However, we have received anecdotal feedback that o1-preview and o1-mini tend to hallucinate more than GPT-4o and GPT-4o-mini. More work is needed to understand hallucinations holistically, particularly in domains not covered by our evaluations (e.g., chemistry). Additionally, red teamers have noted that o1-preview is more convincing in certain domains than GPT-4o given that it generates more detailed answers. This potentially increases the risk of people trusting and relying more on hallucinated generation.

    Ballsy to just admit your hallucination benchmarks might be worthless.

    The newsletter also mentions that the price for output tokens has quadrupled compared to the previous newest model, but the awesome part is, remember all that behind-the-scenes self-prompting that’s going on while it arrives at an answer? Even though you’re not allowed to see them, according to Ed Zitron you sure as hell are paying for them (i.e. they spend output tokens) which is hilarious if true.
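
If the hidden reasoning tokens really are billed as output tokens, as the comment above reports, the arithmetic is easy to sketch. Everything here is an assumption for illustration: the function name, the token counts, and the per-million price are placeholders, not published figures.

```javascript
// Sketch only: all numbers below are hypothetical placeholders.
// Assumes hidden reasoning tokens are billed at the same rate as visible output.
function estimateOutputCost({ visibleTokens, hiddenReasoningTokens, pricePerMillionOutput }) {
    const billedTokens = visibleTokens + hiddenReasoningTokens;
    return {
        billedTokens,
        // Cost in the same currency unit as pricePerMillionOutput.
        cost: (billedTokens / 1_000_000) * pricePerMillionOutput,
        // Fraction of the bill spent on tokens the user never sees.
        hiddenShare: hiddenReasoningTokens / billedTokens,
    };
}

const example = estimateOutputCost({
    visibleTokens: 500,          // hypothetical length of the visible answer
    hiddenReasoningTokens: 4500, // hypothetical length of the hidden reasoning
    pricePerMillionOutput: 60,   // hypothetical price per million output tokens
});
console.log(example);
```

With these made-up numbers, most of the bill goes to text you are not allowed to read, which is the point the comment is making.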














  • Here’s a quick and dirty vanilla JS script that highlights all posts in a thread according to how recent they are, the brighter the newer, and also separately highlights new posts, to make long-running threads easier to follow. I’m posting it in the stubsack because it’s the thread I had in mind when writing it.

    Pasting it in the browser’s console and pressing enter should be enough for the page you have open, not that I’ve cross tested it any… Worst case scenario it does nothing or it colors the posts wrong and you just reload the page, I swear it won’t steal your crypto, or mine any new.

    In Firefox you can find the console by pressing F12 and selecting the console tab.

    edit: Also if you prepend javascript: to the code and store it as a bookmark you can just invoke it by calling the bookmark, like a macro, see https://awful.systems/comment/4173451

    Note: longer threads don’t load all comments at once, so you’ll have to rerun the script if you scroll down far enough.

    edit: fixed for Edge, because why wouldn’t it show dates differently there.

    edit: updated it to check if there’s a (xx New) notice in the post count in the OP and use the number to highlight the latest xx posts, i.e. all posts made since the last time you were here. Change the value of variable newPostColor if you don’t like the lovely shade of lavender I picked. Depending on if edited posts are counted as new or not the count might be off, and like, what if there’s a new post that’s also been edited? Solving that seems to mean moving away from the warmth and comfort of the quick and dirty territory, and also is there a public philthy repository somewhere?

    edit: here’s how it looks in the SAP thread:

    edit: NEW: added some legibility changes and also consecutive executions now toggle old post highlights.

    Code now in spoiler:

    spoiler
    (() => {
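        function getHighlightedColor(min, max, value) {
            // Map the timestamp linearly onto 0..1 and use it as the alpha channel.
            // If every post shares one timestamp, avoid dividing by zero.
            const percentage = max === min ? 1 : (value - min) / (max - min);
            return `rgba(0,0,255,${percentage})`;
        }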
    
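        const newPostCount = (() => {
            // Guard against the post-count element or the number being missing.
            const text = document.querySelector("span.fst-italic")?.textContent ?? "";
            const match = text.match(/\d+/);
            return text.includes("New") && match ? parseInt(match[0], 10) : 0;
        })();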
        const newPostColor = "#783AFF";
    
        const timestampNodes = [...document.querySelectorAll("span.moment-time")]
            .map(x => {
                return {
                    Node: x,
                    PostBox: x.closest('.ms-2'),
                    Date: Date.parse(
                        x.dataset.tippyContent
                            .split('\n').at(-1)
                            .replace(/Modified |at /g, "")
                            .replace(/(?<=\d+)(st|nd|rd|th)/g, ""))
                };
            })
            .filter(x => x.PostBox != null)
            .sort((x1, x2) => x2.Date - x1.Date);
    
        const minDate = timestampNodes.at(-1).Date;
        const maxDate = timestampNodes.at(0).Date;
        const hl = (dt) => getHighlightedColor(minDate, maxDate, dt);
    
        timestampNodes
            .forEach((x, i) => {
                if (i < newPostCount) {
                    x.PostBox.style.backgroundColor = newPostColor;
                    x.PostBox.querySelector('.person-listing').style.textShadow = '1px 1px 0.75px #FFFFFF';
                    x.PostBox.querySelector('.comment-content').style.paddingLeft = ".5em";
                }
                else if (x.PostBox.style.backgroundColor == "") {
                    x.PostBox.style.backgroundColor = hl(x.Date);
                    x.PostBox.querySelector('.comment-content').style.paddingLeft = ".5em";
                } else {
                    x.PostBox.style.backgroundColor = "";
                    x.PostBox.querySelector('.comment-content').style.paddingLeft = "";
                }
            });
    })()
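
The script's core logic can be pulled apart from the DOM, which makes it easier to poke at outside a browser. This is a rough sketch, not the author's code: the function names (`cleanTooltipDate`, `recencyAlpha`, `planHighlights`) and the sample tooltip text are assumptions, and the real tooltip format on the site may differ.

```javascript
// Sketch only: names and sample data are hypothetical.

// Take the last line of a tooltip and strip the bits that get in the way of
// date parsing: a "Modified " prefix, the word "at", and ordinal suffixes.
function cleanTooltipDate(tooltip) {
    return tooltip
        .split("\n").at(-1)
        .replace(/Modified |at /g, "")
        .replace(/(?<=\d)(st|nd|rd|th)/g, "");
}

// Map a timestamp linearly onto 0..1: oldest post gets 0, newest gets 1.
function recencyAlpha(min, max, value) {
    return max === min ? 1 : (value - min) / (max - min);
}

// Decide a background color per post without touching the page.
// posts: array of { id, date }, with date in milliseconds.
function planHighlights(posts, newPostCount, newPostColor = "#783AFF") {
    const sorted = [...posts].sort((a, b) => b.date - a.date); // newest first
    const min = sorted.at(-1).date;
    const max = sorted.at(0).date;
    return sorted.map((post, i) => ({
        id: post.id,
        color: i < newPostCount
            ? newPostColor
            : `rgba(0,0,255,${recencyAlpha(min, max, post.date)})`,
    }));
}

const plan = planHighlights(
    [{ id: "a", date: 1000 }, { id: "b", date: 3000 }, { id: "c", date: 2000 }],
    1, // pretend the "(xx New)" notice said 1
);
console.log(plan);
```

The newest `newPostCount` posts get the flat highlight color and everything else gets a blue whose opacity tracks recency, the same split the original makes inside its `forEach`.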
    

  • The whole point of using these things (besides helping summon the Acausal Robot God) is for non-technical people to get immediate results without doing any of the hard stuff, such as, I don’t know, personally maintaining and optimizing an LLM server on their Linux gaming(!) rig. And that’s before you realize how slow inference gets as the context window fills up or how complicated summarizing stuff gets past a threshold of length, and so on and so forth.