Originally published on Tumblr.

The Mirage of AI: Blandishment and Hallucination in Autoregressive Models
TL;DR: Autoregressive models often falter on long sequences, where blandishment and hallucination arise from sampling failures. Perplexity and cross-entropy measure how well a model predicts text, not whether that text is true, so they cannot be relied on as accuracy metrics.
AI models can lie. Not intentionally, of course, but through a process of blandishment and hallucination that emerges from autoregressive sampling failures. These failures, particularly in long sequences, reveal the limitations of current AI systems and the metrics we use to evaluate them.
Autoregressive models, like those used in many AI applications, predict the next token in a sequence based on the tokens that came before it. Each step is a chance for error, especially under temperature-scaled softmax sampling, where the model’s raw scores are divided by a temperature before being turned into probabilities: a higher temperature flattens the distribution and makes unlikely tokens more likely to be picked. Because every sampled token becomes part of the context for the next prediction, small errors compound as the sequence grows, leading to significant deviations from factual accuracy. This is where blandishment (overly flattering or misleading output) and hallucination (entirely fabricated content) come into play.
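To make the temperature and compounding-error points concrete, here is a minimal Python sketch. Everything in it is illustrative: the function names, the four-token `toy_logits`, and the 99% per-token accuracy are assumptions of mine, and the compounding calculation treats per-token errors as independent, which real models do not guarantee.

```python
import math
import random


def softmax_with_temperature(logits, temperature=1.0):
    """Turn raw scores into probabilities after dividing them by the temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample_token(logits, temperature=1.0):
    """Draw one token index from the temperature-scaled distribution."""
    probs = softmax_with_temperature(logits, temperature)
    return random.choices(range(len(probs)), weights=probs, k=1)[0]


def prob_error_free(per_token_accuracy, length):
    """Chance that all `length` tokens are right, ASSUMING independent errors."""
    return per_token_accuracy ** length


# Hypothetical logits for a 4-token vocabulary (not taken from any real model).
toy_logits = [2.0, 1.0, 0.5, -1.0]

for t in (0.5, 1.0, 2.0):
    probs = softmax_with_temperature(toy_logits, t)
    print(f"temperature={t}: {[round(p, 3) for p in probs]}")

# Simplified compounding: 99% per-token accuracy over 500 tokens.
print(prob_error_free(0.99, 500))
```

Raising the temperature moves probability away from the top-scoring token and toward the tail, which is where the odd choices come from. Under the independence assumption, a model that is right 99% of the time per token gets through 500 tokens without a single slip well under 1% of the time.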
- Perplexity and Cross-Entropy: These are standard metrics for evaluating language models, but they say nothing about factual accuracy. Cross-entropy is the average negative log-probability the model assigns to the tokens that actually occur in held-out text, and perplexity is simply its exponential. Both reward a model for predicting likely text, not true text, so a model can score well while producing plausible yet incorrect information.
- Sampling Techniques: Beam search, nucleus sampling, and top-k sampling each have their own failure modes. Beam search, which is a deterministic search rather than true sampling, tends toward repetitive and uncreative outputs. Nucleus (top-p) and top-k sampling restrict the random draw to the most probable tokens, but the randomness that remains can still exacerbate hallucination. Each method struggles to balance creativity with accuracy.
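- Information Theory and Log-Likelihood: Maximizing log-likelihood is a common training objective, yet it doesn’t ensure semantic coherence or truthfulness. In information-theoretic terms, the objective only pushes the model’s distribution toward the distribution of its training text. A model can match that distribution very well and still produce semantically incoherent or false outputs, because likelihood cannot tell a true sentence from a fluent false one.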
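- Attention Entropy: This metric can help detect when models are ‘guessing.’ High entropy in the attention weights means attention is spread thinly across many positions rather than focused on a few, which can signal uncertainty and often correlates with less reliable outputs. It is a heuristic rather than a proof of error, but monitoring attention entropy could provide a warning system for potential inaccuracies.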
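The quantities in the list above are easier to reason about with numbers in hand. The following is a rough Python sketch, not any library’s implementation: the function names are my own, and the token probabilities and attention weights are made-up values chosen for illustration.

```python
import math


def cross_entropy(token_probs):
    """Average negative log-probability (nats) given to the observed tokens."""
    return -sum(math.log(p) for p in token_probs) / len(token_probs)


def perplexity(token_probs):
    """Perplexity is the exponential of the average per-token cross-entropy."""
    return math.exp(cross_entropy(token_probs))


def top_k_filter(probs, k):
    """Keep the k most probable tokens, zero the rest, then renormalize."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    keep = set(order[:k])
    total = sum(probs[i] for i in keep)
    return [probs[i] / total if i in keep else 0.0 for i in range(len(probs))]


def nucleus_filter(probs, top_p):
    """Keep the smallest top-ranked set whose cumulative mass reaches top_p."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    keep, cumulative = set(), 0.0
    for i in order:
        keep.add(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break
    total = sum(probs[i] for i in keep)
    return [probs[i] / total if i in keep else 0.0 for i in range(len(probs))]


def entropy(weights):
    """Shannon entropy in nats; zero weights contribute nothing."""
    return -sum(w * math.log(w) for w in weights if w > 0)


# Hypothetical probabilities a model gave to each token of a fluent sentence.
# Nothing here records whether the sentence is true.
fluent_sentence = [0.9, 0.8, 0.85, 0.9]
print("perplexity:", perplexity(fluent_sentence))

# Hypothetical next-token distribution over a 4-token vocabulary.
next_token = [0.5, 0.3, 0.15, 0.05]
print("top-k (k=2):", top_k_filter(next_token, 2))
print("nucleus (p=0.9):", nucleus_filter(next_token, 0.9))

# Hypothetical attention rows: one focused, one spread evenly.
focused = [0.97, 0.01, 0.01, 0.01]
diffuse = [0.25, 0.25, 0.25, 0.25]
print("attention entropy:", entropy(focused), entropy(diffuse))
```

The perplexity figure depends only on how confident the model was in each token, so a confidently stated falsehood scores just as well as a confidently stated fact. The evenly spread attention row reaches the maximum possible entropy for four positions, which is the kind of reading an entropy-based warning signal would flag. Any real threshold would need to be calibrated per model and per layer.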
In the wake of recent AI funding bubbles and overpromised capabilities, it’s crucial to scrutinize these models more closely. As we continue to integrate AI into critical areas, from healthcare to finance, ensuring the semantic coherence and truthfulness of AI outputs is paramount. How can we refine our models and metrics to better align with these goals?