Google is also playing it safe in terms of content. Users will not be able to ask for sexually explicit, illegal, or harmful material (as judged by Google) or personal information. In my demo, Bard would not give me tips on how to make a Molotov cocktail. That’s standard for this generation of chatbot. But it would also not provide any medical information, such as how to spot signs of cancer. “Bard is not a doctor. It’s not going to give medical advice,” says Krawczyk.
Perhaps the biggest difference between Bard and ChatGPT is that Bard produces three versions of every response, which Google calls “drafts.” Users can click between them and pick the response they prefer, or mix and match. The aim is to remind people that Bard cannot generate perfect answers. “There’s the sense of authoritativeness when you only see one example,” says Krawczyk. “And we know there are limitations around factuality.”
In my demo, Krawczyk asked Bard to write an invitation to his child’s birthday party. Bard did this, filling in the street address for Gym World in San Rafael, California. “It’s a place I drive by a ton but I honestly can’t tell you the name of the street,” he said. “So that’s where Google Search comes in.” Krawczyk clicked “Google It” to make sure the address was correct. (It was.)
Krawczyk says that Google does not want to replace Search for now. “We spent decades perfecting that experience,” he says. But this may be more a sign of Bard’s current limitations than a long-term strategy. In its announcement, Google states: “We’ll also be thoughtfully integrating LLMs into Search in a deeper way—more to come.”
That may come sooner rather than later, as Google finds itself in an arms race with OpenAI, Microsoft, and other competitors. “They are going to keep rushing into this, regardless of the readiness of the tech,” says Chirag Shah, who studies search technologies at the University of Washington. “As we see ChatGPT getting integrated into Bing and other Microsoft products, Google is definitely compelled to do the same.”
A year ago, Shah coauthored a paper with Emily Bender, a linguist at the University of Washington who studies large language models, in which they called out the problems with using large language models as search engines. At the time, the idea still seemed hypothetical. Shah says he was worried that they might have been overreaching.
But this experimental technology has been integrated into consumer-facing products with unprecedented speed. “We didn’t anticipate these things happening so quickly,” he says. “But they have no choice. They have to defend their territory.”