AI Work Assistants Need a Lot of Handholding

Getting full value out of AI workplace assistants is turning out to require a heavy lift from enterprises. ‘It has been more work than anticipated,’ says one CIO.

AKA we are in the process of realizing that we are paying for the privilege of being the first to test an incomplete product.

Mandell said if she asks a question related to 2024 data, the AI tool might deliver an answer based on 2023 data. At Cargill, an AI tool failed to correctly answer a straightforward question about who is on the company’s executive team, the agricultural giant said. At Eli Lilly, a tool gave incorrect answers to questions about expense policies, said Diogo Rau, the pharmaceutical firm’s chief information and digital officer.

I mean, imagine all the non-obvious stuff it must be getting wrong at the same time.

He said the company is regularly updating and refining its data to ensure accurate results from AI tools accessing it. That process includes the organization’s data engineers validating and cleaning up incoming data, and curating it into a “golden record,” with no contradictory or duplicate information.

Please stop feeding the thing too much information, you’re making it confused.

Some of the challenges with Copilot are related to the complicated art of prompting, Spataro said. Users might not understand how much context they actually need to give Copilot to get the right answer, he said, but he added that Copilot itself could also get better at asking for more context when it needs it.

Yeah, exactly like all the tech demos showed – wait a minute!

[Google Cloud Chief Evangelist Richard Seroter said] “If you don’t have your data house in order, AI is going to be less valuable than it would be if it was,” he said. “You can’t just buy six units of AI and then magically change your business.”

Never mind that that’s exactly how we’ve been marketing it.

Oh well, I guess you’ll just have to wait for ChatGPT-6.66, which will surely fix everything while being voiced by Charlize Theron’s non-union equivalent.

  • zbyte64@awful.systems
    5 months ago

    Was wondering if they’re using RAG, and they are, but in the worst possible way:

    Complicating matters is the fact that Copilot doesn’t always know where to go to find an answer to a particular question, Spataro said. When asked a question about revenue, Copilot won’t necessarily know to go straight to the enterprise financial system of record rather than picking up any revenue-related numbers that appear in emails or documents, he said.

    Thing might be rendered useful if you could constrain it to search a particular source or site. And even better, instead of hallucinating it could just give you a link and a citation. We could call it a search engine.

    • 200fifty@awful.systems
      5 months ago

      If you think of LLMs as being akin to lossy text compression of a set of text, where the compression artifacts happen to also result in grammatical-looking sentences, the question you eventually end up asking is “why is the compression lossy? What if we had the same thing but it returned text from its database without chewing it up first?” and then you realize that you’ve come full circle and reinvented search engines.