I agree that many/most AI models "hallucinate," and thus offer "wrong," misleading, or "shaded" answers and advice. The reasons for this are:
1. The training data (and data sets) that AIs utilize contain many differing opinions and factual errors.
2. Data set creators are often biased in that they are SEEKING a solution, and thus the data are weighted differently.
3. Given the increasing "pollution" throughout the internet (often influenced by "bad actors"), coupled with the escalating exploitation of mis- and disinformation (e.g., for political, business, or other advantage), it's no surprise that AI agents are "error prone."
4. Humans make errors all the time for many of the prior three reasons, plus their added stupidity.
YES! Concur.
From the debate among ANN experts, I gather that the sources of hallucinations lie not just in the training data but also in the model's architecture and training objectives (in addition to the prompting style).
Fully agree on humans performing even more poorly. I mean: I'm sure that, like all geniuses, even Turing did make up some shit now and then. Dirac did it often. I know that Terence Tao and Judit Polgár do... And normals like me do it all the time 😜
Yes, again! We agree....
So, I would suggest that people who use AI must understand and recognize that AI systems and models are assistants that make mistakes, and NOT infallible oracles of truth.
If you are looking for an infallible oracle of truth, you should not be using AI or any other unverified information source on the internet. If you can work with AI as an assistant and are willing to do critical thinking and quality checking with it, you will find it to be a very productive assistant. That is the approach I use and always recommend to friends! Hope that helps folks.
And to add to the point about news and social media content: the large news organizations must develop this kind of verifiable STAMP on content, as they have a lot to lose if readers start to suspect the source. For now, I continue to trust Reuters, NYT, WSJ, etc., but who knows how long their brands will hold up against AI content attacks.
You are totally right. The “intelligence” is little more than comparing possible answers. The source for all this “wisdom” is just finding patterns in the past history of what we have done and recorded. Is that history all correct? While useful, it is of course neither perfect nor even close.
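(That "comparing possible answers" is fairly literal, by the way: under the hood a language model scores every candidate next token and builds its reply from the highest-scoring ones. A minimal sketch of that scoring step, assuming the Hugging Face transformers and torch packages are installed, and using GPT-2 only as a small, freely available stand-in:)

```python
# Sketch of "comparing possible answers": the model assigns a probability to
# every candidate next token; the reply is assembled from the top-scoring ones.
# Assumes the transformers and torch packages; GPT-2 is used only because it
# is small and freely downloadable.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

prompt = "The capital of Australia is"
input_ids = tokenizer(prompt, return_tensors="pt").input_ids

with torch.no_grad():
    logits = model(input_ids).logits          # a score for every vocabulary token

next_token_probs = torch.softmax(logits[0, -1], dim=-1)
top = torch.topk(next_token_probs, k=5)       # the five most likely "answers"

for prob, token_id in zip(top.values, top.indices):
    print(f"{tokenizer.decode(int(token_id))!r}: p={prob.item():.3f}")
```

No understanding, just a ranked list of continuations mined from whatever the training history contained.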
The larger problem is the growing lack of fresh data to mine, and an even bigger problem is that AI output is becoming part of the current source of ‘new’ findings, leading to incestuous results, if you know what I mean.
Incidentally, your words can be used verbatim about humans. 😉
Of course! Absolutely.
Wrong. Humans go beyond what has gone before: they use emotions and evaluate several future alternatives that they have the smarts to create, based on a lifetime of unique experiences. I mean, humans are incredible. How about general relativity?
If, when you say "humans," you mean a Platonic ideal (general relativity, Lohengrin, Mona Lisa, Oscar Wilde, etc.), then you are right. But you are wrong when "humans" means ordinary people like me and >95% of the other instances.
And if we're talking about the 95%, it's not only humans who "go beyond what has gone before [...]" etc. The creativity of an LLM's response is no different from that. In fact, it comes out differently in each seemingly identical context, which is [bizarre and intriguing, but] not inferior to human performance.
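To make that concrete: with sampling turned on, the very same prompt produces a different completion on nearly every run. A toy sketch, again assuming the transformers and torch packages, with GPT-2 standing in for whatever model the reader has at hand:

```python
# Toy illustration of "comes out differently in each seemingly identical
# context": with sampling enabled, the same prompt yields a different
# completion on (almost) every run.
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

input_ids = tokenizer("A truly original idea:", return_tensors="pt").input_ids

for run in range(3):
    out = model.generate(
        input_ids,
        do_sample=True,          # sample instead of always picking the top token
        temperature=1.0,
        max_new_tokens=20,
        pad_token_id=tokenizer.eos_token_id,
    )
    print(f"run {run}:", tokenizer.decode(out[0], skip_special_tokens=True))
```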
For any passerby save those who personally know us, this very discussion of ours could be occurring between LLMs. (Only, my ANN avatar would write better English than I).
You're both right! (Bill and Paolo)
When it comes to using these AI tools professionally as part of an enterprise, most large companies have gotten smart about potential content accuracy issues, and I am aware there are internal "Terms of Use and Conduct Guidelines" for employees. Such guidelines and conduct practices around AI should be a must for any professional organization. And even beyond analytical and written content, they should start a movement to stamp images and pictures with a statement that they have been verified by the company posting the content. The Chinese have gotten very good at misusing images to change public perception. These enterprise guardrails for any type of AI use must be mandatory.
Having said that, Tom, the NY Times article is less than perfect.
For example, it "hallucinates" on that 79% 😉. As far as I can tell, the 79 percent refers to the accuracy of a detection method in identifying hallucinations, not the frequency with which a new AI system hallucinates. In fact, most of the "percent" figures in the article seem somewhat casual and possibly heterogeneous.