Google Archives - Press Gazette
https://pressgazette.co.uk/subject/google/

Google Discover has become Reach’s ‘biggest referrer of traffic’
https://pressgazette.co.uk/platforms/reach-google-discover-news/ | Thu, 21 Nov 2024 09:10:29 +0000

But not all publisher content performs equally well on Google’s ‘almost like Facebook’ referral feed.

[Image: Three screenshots of Reach plc stories on Google Discover, laid over one another, covering stories including Elon Musk unveiling SpaceX plans, an arrest in Manchester, the launch of a ballistic missile in Ukraine and a celebrity-related tragedy.]

Google’s smartphone-based content recommendation feed, Google Discover, has become the single largest traffic referral source for publishing giant Reach plc, its audience director (distribution and customer marketing) has said.

Martin Little told Press Gazette the rise in Discover traffic had compensated “and then some” for a decline in referrals from Google search.

Reach’s overall Google traffic has grown in the second half of this year, Little said, but there had been “a significant shift” in the contributions from different Google platforms.

What Little called “branded search” — traffic from people actively searching for Reach titles like the Daily Mirror or Cornwall Live — has remained “solid”, he said. But referrals from topic-based searches have been falling, contributing to an overall fall in the number of visitors referred via search. Visits from Google News have remained largely stable.

But Discover is “making up for that and then some on top”, he said, and “has become our biggest referrer of traffic”.

“Overall, almost 50% of our titles are on year-on-year growth now,” Little said, “and that is partly because of the shifts in Google.”

Google Discover promotes ‘soft lens’ content — but isn’t so good for news

Google Discover is embedded in the browsing experience for most people who use Google Chrome on a smartphone.

Little said Discover “is almost like Facebook was… it’s algorithmically served, it’s based on what it thinks you’re going to like. It’s more of an escapism-type outlet”.

The increase in Google traffic at Reach has in part been driven by greater visibility into how Google Discover works. Previously, Little said, it had been “even more of a black box than Search”.

It is now much easier to monitor both your own and your competitors’ performance on Discover, according to Little. Two years ago it would not have been possible for a publisher to even split out their Discover and Search traffic, but now “you can get a far better lens on what’s working and what’s not”.
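One way publishers can now split out this traffic is Google’s Search Console Search Analytics API, which accepts a `type` field that separates Discover, Google News and classic web search. A minimal sketch of how a publisher might build those queries and compute Discover’s share of Google-referred clicks follows; the authenticated API call itself is omitted, and the numbers are illustrative, not Reach’s:

```python
# Sketch: splitting Discover vs Search referrals via the Search Console
# Search Analytics API. The request body's "type" field selects the
# traffic surface being reported on.

def build_query(start_date: str, end_date: str, search_type: str) -> dict:
    """Request body for searchanalytics.query; 'type' may be
    'web', 'news', 'discover' or 'googleNews'."""
    return {
        "startDate": start_date,
        "endDate": end_date,
        "type": search_type,
        "dimensions": ["date"],
    }

def discover_share(discover_clicks: int, web_clicks: int, news_clicks: int = 0) -> float:
    """Discover's share of total Google-referred clicks."""
    total = discover_clicks + web_clicks + news_clicks
    return discover_clicks / total if total else 0.0

# Illustrative numbers only:
q = build_query("2024-10-01", "2024-10-31", "discover")
share = discover_share(discover_clicks=550, web_clicks=400, news_clicks=50)
print(q["type"], round(share, 2))  # discover 0.55
```

Running one query per `type` value and comparing the totals gives the kind of Discover-versus-Search lens Little describes, for a publisher’s own site at least; competitor monitoring relies on third-party tools.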

Little said 44% of Reach content gets picked up in Google Discover, but the platform is “very selective as to what it takes in and what it doesn’t… Google clearly wants Discover to be a safe environment, with brand-safe content within it”.

He described the type of content that does well on Discover as “soft-lens”: first-person pieces do well, as does lifestyle content and articles about niche interests and sports other than football.

“What is interesting as well is, for us as a commercial publisher, we don’t get a lot of news content into Discover,” Little said. “And by news, what I mean is traditional local news, or harder news.”

Reach’s news content is still holding its own in search, Little said, but there is “always a lag for content getting to Discover”. Although stories sometimes arrive on Discover within a day, “it tends to be 24 to 48 hours before content actually gets in”, which necessarily poses a problem for news content.

“You don’t get court content in there, there’s no crime getting in there, our council content doesn’t get in there as well. Stuff from our Local Democracy Reporters doesn’t really get into Discover.

“We need that content — it’s the staple of our regional brands, and it’s frustrating that we see the BBC’s version of that story get in every single time, but we never see any commercial publishers really getting that sort of content in. So it feels like Google is, on Discover, using the BBC to serve that out, but not actually being very pluralistic in its approach.”

Google Discover selects ‘curiosity gap’ headlines to show readers

Little said Discover has a notable preference for “curiosity gap” headlines. For every story a Reach journalist publishes, they have to write four headlines: one for Facebook, one for Search, one for the home page and one for newsletters.

“Each of those are written in different ways,” Little said. “Search will have some keyword focus within it and the homepage one will be quite brand-safe. Newsletter [headlines] tend to have a little more of a sell… to encourage them to click through… and then social has got to be quite brand safe as well.”

You can’t feed Discover a specific headline: instead the platform chooses “the one that it wants the most”, and Little said most often Discover looks “for the most alluring headline” or “the greatest emotive element”.

Little defined a curiosity gap as “telling the story as it is but withholding the need-to-know piece of information from that headline so that people still feel the need to go and find out more”. (Advocates of the curiosity gap argue it differs from a “clickbait” headline strategy because curiosity gap articles actually deliver the information trailed in the title.)

He gave as examples the following real Reach headlines: “ITV Loose Women’s Janet Street-Porter speaks out from hospital after major surgery”, “BBC Death in Paradise star quits after five years, but fans will be happy with exit” and “NHS symptoms of silent killer, which hits one in 20 but takes years to diagnose”.

In the latter case, Little said: “It’s a very straight, factual headline, but it makes you think: ‘Well, what are those symptoms?’… And as long as you provide them on the other side, Discover rewards you quite heavily.”

Reach uses newsletters and Whatsapp Communities to build topic-based audiences out of Google Discover traffic

Little said Reach did not want the proportion of articles getting into Google Discover to get too large.

“You won’t want to be too reliant on it,” he said, emphasising that “our portfolio is more diverse, in terms of the ways that we generate traffic, than it’s ever been”.

Tech platforms make for notoriously unreliable long-term traffic sources, and Little said Reach has been trying to turn its fly-by Discover visitors into loyal readers.

“A nice example on the Daily Express would be that the Daily Express does really well with Formula One content on Discover.

“We’ve built up a newsletter audience of about 32,000 people, and a Whatsapp community of over 3,000 people, by targeting the people coming in from Discover on F1.

“The Express doesn’t get the same cut-through on Formula One on any other platform [as] it does on Discover, so we know that the generation of that newsletter audience is connected directly to people coming through…

“We’re laser-focused on thinking about: ‘how can we make sure that that traffic is something that we own over a long period of time?’

[Read more: Whatsapp for publishers: How Reach is driving millions of page views via messaging app]

Little suggested Discover has been one way Google responded to the trend of news avoidance.

“Ultimately, for us as a publisher, in some ways Google Discover is a good thing, because it causes us to really think about — if that’s where the audience interest is, that’s what they’re engaging with, how do we start to diversify our content mix to get a broader range of topics across everything we do?

“And I think that’s actually a good thing because it makes us more diverse, it makes us open to new audiences that previously we wouldn’t have been, it makes us think about content in a different way.

“But all of our principles still stand — the content needs to be high quality, it needs to be well-researched, it needs to be written really well and have good imagery.”

News Corp adds Google-powered AI summaries to Factiva search results
https://pressgazette.co.uk/publishers/news-corp-adds-google-powered-ai-summaries-to-factiva-search-results/ | Wed, 13 Nov 2024 08:32:34 +0000

Factiva general manager Traci Mabrey says only consenting publishers will appear in AI results.

[Image: Traci Mabrey, general manager of Factiva.]

News Corp-owned business intelligence search engine Factiva is adding generative AI summaries to its search results.

The news database, which is part of Dow Jones, has deployed Google’s Gemini technology as part of News Corp’s ongoing business partnership with the tech giant. News Corp signed a three-year partnership with Google in 2021 which included payments for content via Google News Showcase and technology sharing.

Factiva approached every one of its almost 4,000 sources for new generative AI permissions and received the go-ahead from a “significant subset” of them, according to general manager Traci Mabrey.

Having generative AI summaries in the search results has added multiple new “royalty moments” to Factiva, she said, including when a publisher’s content is referenced within the summary.

Factiva is used by institutions and professionals in fields like business, finance, academia and government who buy subscriptions to access its searchable archive of news, analysis and research.

Factiva licenses content from publishers, including both paywalled and free websites, meaning they get royalties proportionate to how much their work is surfaced by users.

With the launch of the generative AI capability, responses to searches in Factiva will contain a multi-sentence summary answering the question that was asked, above citations for three direct sources with links back to their articles, three additional recommended searches, and then a more classic search results section with all the articles that were responsive to the query.
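As a rough illustration, the response shape described above can be modelled like this; the field and class names are hypothetical, not Factiva’s actual schema:

```python
from dataclasses import dataclass

@dataclass
class Citation:
    source: str  # publication name
    url: str     # link back to the original article

@dataclass
class FactivaAIResponse:
    # Illustrative model of the described response, not a real API schema.
    summary: str                     # multi-sentence generative AI summary
    citations: list[Citation]        # three direct sources with links
    recommended_searches: list[str]  # three follow-up queries
    results: list[str]               # classic search results section

resp = FactivaAIResponse(
    summary="Investment in autonomous vehicles rose sharply this year...",
    citations=[Citation("Example Wire", "https://example.com/av-investment")],
    recommended_searches=["autonomous vehicle funding 2024"],
    results=["Article A", "Article B"],
)
print(len(resp.results))  # 2
```

Each citation surfacing in a summary would correspond to one of the “royalty moments” Mabrey describes.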

[Image: Example generative AI summary response in Factiva to a search about “autonomous vehicle investment”.]

Factiva acted as ‘publisher first’ in outreach for new AI deals

Mabrey told Press Gazette Factiva has been guided by three main principles on its AI products: “being a publisher first, an arbiter for publishers and an innovator”.

Factiva has reached out and spoken to “every publication and every source” on its books asking for generative AI rights, meaning every publisher referenced in its search summaries has consented to be there.

This contrasts with the approach taken by Google itself, which summarises publishers’ work in its own AI summaries without giving them a “genuine” choice (if they want to stay in its main search results).

Consenting publishers include the Associated Press, Swiss news agency AWP Finanznachrichten AG and The Washington Post, as well as Factiva’s fellow News Corp properties in the UK and Australia.

Mabrey said publishers have been “very grateful and very pleased with the fact that we have come with a very transparent and very trustful approach, and I think that has been one of the rationales for us being able to get some very strong partners from around the world to come on board because we are being very candid about what we want to do with the publications, what we are going to be offering in royalty moments”.

She added that she did not believe hallucinations or inaccuracies would be a problem as they have been in some generative AI products because only trusted sources are being used.

“So we are not bringing in just unvalidated information and what we think that is able to do is that that’s able to really ensure that our cutting edge technology is built on the foundation of the trusted journalism, data and analysis that’s needed at the core to begin the process.

“What we want to ensure is that by identifying those sources and building a highly relevant and contextual semantic search, that is going to be able to preclude what we hope is going to be a significant amount, if not all, of the hallucinations, because it is a very clean dataset that is going to be delivering the generative AI results, and what we want to be able to do is continue that steady stream of high quality views and information as we go forward.”

Other AI-generated ways publishers can earn royalties at Dow Jones include through the Factiva Feed for GenAI and Dow Jones Newswires GenAI Feed, which provide enterprises and financial firms with “licensed, copyright-compliant full-text content to power custom generative AI applications” like chatbots, as well as the Risk Center product for financial crime and risk management screening, and the automated due diligence report product Dow Jones Integrity Check.

Google gave Factiva ‘privacy and security’

Factiva’s new generative AI addition has been engineered with Google’s Gemini models on Google Cloud as an extension of the existing partnership with the tech giant.

The Google Cloud partnership initially rolled out semantic search on Factiva, meaning AI interpreted the meaning and intent of a search rather than simply matching keywords.

Of the latest addition, Mabrey said: “Why we were very pleased to partner with Google on with this is that they offered us the privacy and security that was paramount for us, given the fact that we have our publishing IP within this Google Cloud instance, we also have the publishing IP for the external publishers and media and market data entities that are part of the ecosystem.

“So having the privacy and having the security was critically important, but then also being able to have a semantic search paradigm that was built solely on relevancy and contextual search criteria was also paramount, and they were able to truly deliver that for us.”

Google Cloud’s North America president Michael Clark said in a statement: “Dow Jones’s use of Google’s Gemini models on Google Cloud to power gen AI search summaries marks a significant step towards leveraging the technology to enhance information access and analysis.

“By combining our models with the extensive Factiva content set, we’re showcasing how GenAI can be a powerful tool for growth for the media industry.”

Dow Jones owner News Corp has separately done a deal with Google’s biggest rival in the AI space, OpenAI, for the use of the publisher’s current and archived content in products like ChatGPT.

Mabrey said: “It’s a very different agreement than what we have with our IPs [information providers] for our generative AI search in Factiva, but it really does apply very similar guiding principles… We apply those same guiding principles to our own IP and to the partners that we have around the globe.”

Open web group says Google Sandbox ‘governance framework’ lets it ‘mark its own homework’
https://pressgazette.co.uk/news/google-privacy-sandbox-governance-framework-cma/ | Wed, 13 Nov 2024 08:28:21 +0000

Google has proposed implementing consultation periods and appeals processes around its Privacy Sandbox tech.


A proposed “governance framework” for Google’s Privacy Sandbox technology lets the tech giant “mark its own homework” around competition issues.

This is the view of the Movement for an Open Web on the latest news around Google’s proposed advertising technology for its dominant Chrome browser.

The framework, which has been articulated in response to Competition and Markets Authority concerns that Privacy Sandbox could help Google “self-preference” in the digital ad market, proposes remedies such as annual reports to the CMA and the introduction of an appeals process.

In a report published this week, the CMA indicated that the framework “could resolve a range of outstanding issues” around competition and privacy in Google’s Privacy Sandbox, a suite of technologies designed to place advertisements online without the need for third-party cookies.

However, the CMA added the framework would only resolve those issues provided it “is implemented effectively. Until Google has done so, these issues will remain unresolved”.

Sandbox technology will be offered as an alternative to third-party cookies for personalising online advertising, in the process making publisher ad inventory much more valuable.

What’s in Google’s proposed governance framework for Privacy Sandbox?

Google has told the CMA its Privacy Sandbox governance framework could include:

  • The introduction of a formal consultation period for major changes to any Privacy Sandbox tools with a feedback period currently pegged at three weeks
  • The introduction of an externally-managed appeals process that would, for example, allow publishers to appeal Google decisions around the Sandbox
  • Annual reports to the CMA and the Information Commissioner’s Office.

The co-founder of campaign group Movement for an Open Web, James Rosewell, said the framework “does little to calm real-world concerns”.

One of those concerns is that Google, which boasts large amounts of first-party data from logged-in users of products like YouTube and Gmail, will benefit disproportionately from the introduction of Sandbox technology.

Google already takes the lion’s share of the UK advertising market, banking around £14bn last year from search alone.

In its last report on the issue, in April 2024, the CMA said it was considering “whether additional restrictions may be needed to resolve this concern”. There does not appear to have been any motion on this front since then: in its latest report, the regulator wrote that it is “continuing to discuss this issue with Google”.

Rosewell said that this issue “has been flagged from day one and it’s still outstanding… We need to chalk up this anti-competitive effort as a failure”.

Google also appears not to have offered the CMA any guarantees that publishers won’t be penalised in Google search if they decline to use Privacy Sandbox.

The CMA has said previously that Google has confirmed its search engine “will not use a site’s decision to opt-out of the Topics API as a ranking signal”, but the Topics API (application programming interface) is only one of the numerous APIs that make up the Privacy Sandbox.

In this week’s report, the CMA said only that “we are continuing to discuss this issue with Google”. (Experts Press Gazette spoke with in April suggested the tech giant’s hesitancy to make further promises may arise from the fact some of the Sandbox APIs pertain to user protection, for example around spam and fraud, which could be legitimate reasons to down-rank a site.)

[Read more: Fears Google could down-rank publishers who decline to use Privacy Sandbox]

Rosewell said that the framework “is being pushed as a solution to many ills but it’s clear that it’s a weak veneer of self regulation that does little to calm real-world concerns.

“The proposed structure is little more than Google marking its own homework as it oversees its monopolistic dominance of the market. Without true independence and broad scope Google’s so-called Governance Framework will just be more window dressing.”

Responding to the criticism, a spokesperson for Google said: “We believe our approach supports healthy competition across the industry while improving user privacy. This approach, which lets people make an informed choice that applies across their web browsing, is still being discussed with regulators and we will share more details at the appropriate time.”

The CMA’s report comes after Google announced in July that it would not scrap third-party cookies on its Chrome browser, a decision that itself came four years after the company first promised to deprecate the tech.

Cookies help marketers target their ads to specific audiences and improve news publisher ad revenue by giving them, and their advertisers, a more detailed picture of who reads their content. Google’s Privacy Sandbox is intended to let advertisers target readers with relevant ads while protecting their privacy by sorting them into “cohorts” with similar interests and recent web activity.

Chrome users will now be able to choose to continue using third-party cookies if they prefer, but the CMA said its concerns around the technology remain relevant: “For the proportion of traffic where third-party cookies are unavailable, the Privacy Sandbox tools will remain important for the ad tech ecosystem to target and measure advertising”.

How Newsweek became world’s fastest-growing English language news website
https://pressgazette.co.uk/north-america/newsweek-traffic-growth/ | Thu, 07 Nov 2024 07:29:02 +0000

Newsweek's SVP of audience development says Google Discover is a major part of its web traffic.

[Image: Old newspaper stand with the Newsweek sign in front of a house wall in Dublin, Ireland, 2023. Picture: Shutterstock]

A traffic growth strategy aimed at maximising referrals from Google’s Search, Discover and News platforms has helped Newsweek become the fastest-growing English-language news site in the world.

The brand now boasts some 200 people in its news operation, its SVP of audience development Josh Awtry told Press Gazette, following growth in its politics, popular science and lifestyle teams.

Newsweek has charted as the fastest-growing English-language news site in the world on Press Gazette’s top 50 ranking almost every month this year. According to Similarweb its total web visits stood at 109.1 million in September, up 108% year-on-year, making it the 20th most-visited English language news site in the world and the 14th-most visited in the US, ahead of the Associated Press and CBS News.

“As revenue goes up, we’re reinvesting in our journalists, our journalism,” Awtry told Press Gazette last week. “We’ve tripled the size of our politics team over the past seven months.”

He added that the brand, which is co-owned by chief executive Dev Pragad and former chief executive Johnathan Davis, was “trying to hire in a really sustainable way”.

What’s behind Newsweek traffic growth?

Awtry told Press Gazette “the bulk” of Newsweek’s visits come from Google Search, Google Discover and Google News.

“We do listen very carefully to Google signals,” he said. “We don’t want to game the system. But if we’re seeing that there is increased and stable traction in a coverage area, that, to me, doesn’t mean gaming a system, that means readers are interested, they’re hungry for content in that area. So we will look at hiring staff looking for angles.”

Google Discover, he said, has “been a mainstay of where our audience comes from” during the year and a half he has worked at Newsweek. Discover is a personalised feed that is automatically presented to Google app users when they open an empty tab on their smartphone, and appears to have been growing as a source of referral traffic for publishers this year.

A NewzDash analysis published in August found Google Discover was driving 55.6% of Google traffic at the publishers assessed earlier this year, an increase of 14 percentage points on 2023.

“Our news content still does a solid job on Google Discover, but increasingly, a lot of our lifestyle content does well,” Awtry said. “If a person does something [and] it goes viral on social, we’ll reach out and talk to that person, and interview them, do that story — those sorts of stories, those real people and real voices do a good job for us on Google Discover…

“The signal there is don’t just pull more easy viral stories, but go try and get the story behind why it was viral.”

As well as Google Discover, Awtry said Newsweek had “really beefed up” its news SEO team in recent months, and that “we look at the signals from Google News much more carefully than we ever used to…

“I won’t say they have a hard and fast goal on it, but we definitely keep track of how many times Newsweek appears in the top Google News carousel, and we’re trying to grow that.”

But he emphasised that Newsweek’s page views have been growing faster than its unique user count, and page views per visitor have “basically doubled” from around 2 or 2.5 in two years.

“That’s a positive signal — we’re growing our direct readership,” he said. “The people are coming to us not via Google but via our own surfaces, whether that’s, bless their hearts, those people who bookmark newsweek.com — that’s great — but increasingly newsletters, push alerts.”

What does Newsweek’s content look like?

On Sunday 3 November, two days before the US presidential election, Newsweek published 140 articles, of which 62 (or 44%) covered sporting news or promoted partner sports betting services. Another 44 stories (31%) related to politics. The two types of content collectively accounted for three-quarters of the articles published on the site.

The remaining stories spanned health, science, weather and hard news, as well as softer lifestyle stories and rewrites of social media content.

The latter content resembles the fluffy viral lifestyle stories that were common online during the heyday of Facebook traffic referrals, running under headlines like “Rescued Cows Say Last Goodbye to Sick Friend in Heartbreaking Clip” and “Student Asks for Extra Mayo on Fries, Gets More Than She Bargained for”. As Awtry described, however, almost all those stories on Newsweek involve an interview with the originator of the viral content, whereas other publishers publish quick rewrites with no further work.

Other recurring themes in Newsweek’s content have included dogs, product recalls and the state of Texas. The site publishes daily articles aimed at helping people solve New York Times puzzles Wordle and Connections, and has published a few stories that centre on AI chatbot responses, for example “What a Second Donald Trump Term Would Look Like, According to ChatGPT, Grok” and “What ‘Yellowstone’ Fans Should Watch Next, According to ChatGPT”.

‘Our future is in using the traffic coming from Google as the start of an interaction, not the end result’

Asked whether Newsweek was concerned that it is overexposed to Google — which has been an unpredictable source of traffic for publishers in recent years — Awtry said: “Every publisher has to be concerned about that… anytime we’re fully beholden to someone else to bring us our new readers, that’s a risk.”

He added: “We really appreciate Google, we have a good relationship with Google, but we can't count on the fact that they will always be there sending readers to us.”

Citing Google’s previous disruptive algorithm changes and its ongoing deployment of AI Overviews, which are expected to hit publisher traffic referrals, Awtry said Newsweek tries to view the wave of visits from Google as “a gift”, but not an end in itself.

The audience team he oversees has grown “significantly”, he said, recently adding “a deeper tech stack to better understand, segment and address readers” by offering them “custom experiences” which will, hopefully, keep them coming back in future.

“It's no different from what we do with any newspaper or any news site and their ad tech, serving the right ads to the right people,” he explained.

“If you've come in and it's your first time ever, let's not hit you with a pop-up to subscribe to Newsweek. Let's not even bother you about a newsletter. Let's just let you read the story you wanted to read.

“Second time back? Yeah, maybe we'll ask you if you're interested in the newsletter. If it's the third politics story you've read, on that visit, even if it's your first visit, yeah, we're gonna ask you about a politics newsletter.

“We're trying to get much more sophisticated in how we handle those lifecycle journeys. Sometimes the best way is to do nothing, but sometimes the best thing is to ask someone to just like us on social — or if they came from Reddit, acknowledge they came from Reddit.

“The tech stack for us is just coming into play over the past month, so they're new tools. But that, for us, is our future — is in using the traffic coming from Google as the start of an interaction, and not the end result.”

[Read more: 'Speed' and 'fairness' are key tactics for fast-growing Newsweek on US election night]

Newsweek looks to Reddit to acquire new readers away from Google

Awtry said Newsweek aims to differentiate itself from the competition in part through active engagement with its audience, which has manifested in a particular focus on its comment section and on Reddit. The site gets “thousands of comments a day, sometimes more than 10,000”, Awtry said, and “We just hired a new community manager whose job is not just to take down bad comments, but to really highlight the best of them, to feature them, to foster civil debate.”

Awtry credited the focus on Reddit in part to “wildly fluctuating” traffic from search engines.

“We're looking to diversify where our new readers come from. The bulk of them still come from Google related services — we can't always count on that. And so we started using Reddit a lot more.”

Reddit appealed, he said, because it was somewhere users interact with one another, and Newsweek uses it “heavily — not just posting all of our stories in the news subreddit or anything like that, but really trying to read the room of niche subreddits.

“If we do a story on Gen Z not being able to afford homes, we'll go to the Gen Z subreddit, we'll make sure we're okay to post there, we’ll put it there.”

He said Reddit would not work for publishers hoping to use it “as a page view driver”.

“We're using it as a reputational signal, as a brand signal, as a way of taking the pulse of how we're seen.”

[Read more: Platform profile — Why publishers must tread carefully to succeed on Reddit]

Publishers hooked on Google Discover traffic risk race to the bottom
https://pressgazette.co.uk/publishers/digital-journalism/google-discover-publishers/ | Thu, 07 Nov 2024 07:20:41 +0000

Many publishers are increasingly reliant on Google Discover for their daily traffic numbers, but some are playing a risky game.

Discover is a feed of recommended articles shown to Android phone users when they open their browser, and to iPhone users in the Google app.

The articles recommended by Discover align with what Google knows about each user’s interests, favourite websites, and recent searches. This results in a highly personalised feed full of content the user is likely to click on and read.

Google says that any article has a chance to appear in Discover, as long as it aligns with its content and quality policies. These are vaguely worded, but almost identical to Google’s content policies for news.

For many news publishers around the world Google Discover is now their primary source of traffic. Research from NewzDash, a news performance monitoring tool, shows that on average Discover accounts for 55% of publishers’ total Google traffic. This is up from 41% in a previous study.

With Google as the dominant source of traffic to most publishers, this means Discover is the single largest channel sending visitors to publishing sites.

Clickbait headlines about personal finance seem to work on Discover

Yet it’s a risky strategy to rely on Discover traffic. Several Googlers are on record saying sites shouldn’t rely on Discover. In my own experience working with publishers around the world, Discover is a highly volatile channel that can send massively different traffic numbers from one day to the next.

Additionally, Discover is especially susceptible to the whims of Google’s algorithm updates. Sites can see their entire Discover traffic evaporate overnight, without any explanation or underlying cause, just because Google has decided to slightly tweak its algorithmic levers.

Yet the temptation to maximise Discover traffic is too great for many publishers.

Once a publisher finds a specific topic that consistently generates significant visitor numbers from Discover, it’s seductive to go all-in on that topic and generate daily articles even if there’s very little of substance to say.

This is evident in the daily output of many British publishing sites. Topics that are popular on Discover – especially so-called YMYL content (Your Money or Your Life: content focused on personal finance and health) – are covered on a daily basis.

These articles often feature headlines that seem designed not to rank in Google’s regular search ecosystem, where straightforward factual headlines tend to win, but to elicit clicks with emotive phrasing bordering on outright clickbait.

And these tactics work. For a while at least.

Creating content for Google Discover could be race to the bottom

Creating content optimised only for Discover is a race to the bottom: articles whose attractive headlines promise much yet which contain barely any meaningful information, churned out by the hundreds every day by many large UK news websites.

We’ve been through this hamster wheel before, and it didn’t end well. Not so long ago, many publishers chased after Google search clicks with similar articles around celebrities’ net worth, bizarre ‘scandals’ that were often entirely invented, filler content around the latest Google doodle, entire pieces extracting the deepest insights from a D-list celebrity’s whimsical social media comment, and more of such churnalism intended purely to drive clicks from Google’s search results.

All this backfired spectacularly when Google started rolling out algorithm updates designed to deprive such content of visibility in their search results. Publishers experienced first-hand how fickle Google’s whims can be, when content that was previously almost pinned to the top of Google’s news carousels suddenly became toxic and actively harmful for a website’s long-term viability.

I fear we may soon go through exactly the same cycle with Discover. In fact, I’ve already seen publishers suffer greatly at the hands of Google’s ‘Helpful Content Updates’, which are now part of its core algorithms. More sites seem destined to follow.

Chasing after Discover clicks is likely to be another short-lived tactic for publishers desperate to claim as many visits as possible in the short term, regardless of the potential long-term effects their site may suffer.

Some publishers with multiple news websites under their wing may see this as an acceptable risk. If one site gets hit by an algorithm update, they’ve learned exactly where to draw the line and can amend their tactics on another site.

But if your portfolio of news sites is small, taking such risks is highly irresponsible. The effects of an algorithm update can be devastating and extremely hard to recover from.

Sustainable tactics for Google Discover require genuine effort

A more viable approach to Discover, which is also recommended by Google, is to see visits from this channel as ‘bonus traffic’. When advising clients, I often recommend putting Discover visits into a separate bucket entirely, separating it from your site’s core traffic channels.

There are ways to optimise for Discover in a sustainable fashion. While there will always be a degree of volatility in the channel’s daily numbers, the right approach can ensure your continued presence in people’s feeds. Not every article will be a champion, but you can increase the probability of driving meaningful visits from Discover without risking an algorithmic backlash at a future date.

Reliable Discover optimisation tactics centre around content focusing on people’s needs and interests, while providing real value and insights.

The concept of ‘information gain’ is critical to success in Discover as well as in Google’s broader news rankings. Information gain is about adding knowledge and insight to a news topic. If you’re merely reporting what others are also saying, then your content has low information gain. If, however, you can bring something new to the table – a different perspective, fresh information not previously covered, or an expert opinion – then you are contributing to the topic’s overall knowledge and have created information gain.

Combine this with headlines that find a balance between factual and emotive, good images to attract attention without resorting to fakery, and an excellent user experience that encourages visitors to return to your site, and you have an excellent recipe for sustained success in Discover.

Winning in Google’s rapidly changing search ecosystem has never been more challenging. Publishers should be wary of chasing cheap clicks, be that on Discover or in search, as that approach has repeatedly been shown to be a dead end.

It takes genuine effort to build a news brand that your users will want to engage with time and again. This is true on every channel where your audience reads your content, and Discover is no exception.

News organisations are forced to accept Google AI crawlers, says FT policy chief
https://pressgazette.co.uk/media_law/google-ai-scraping-crawlers-financial-times-news-publishers/
Wed, 06 Nov 2024 07:16:49 +0000

News organisations don’t have a “genuine choice” about whether to block Google from scraping their content for its AI services, one publisher has warned.

Matt Rogerson, director of global public policy and platform strategy at the FT and former Guardian Media Group director of public policy, argued that Google’s “social contract” with publishers – through which it provided value to the industry by sending traffic to their sites – has been broken.

This is because Google now publishes summaries of those publishers’ articles in AI Overviews at the top of many search results – and also sells data generated via its search crawler to third-party large language models (LLMs).

[Read more: Google AI Overviews breaks search giant’s grand bargain with publishers]

In September last year Google introduced Google-Extended, a control which allows website owners to block its AI chatbot Gemini (formerly Bard) and its AI development platform Vertex from scraping their content.

However Google-Extended does not stop sites from being accessed and used in Google’s AI Overviews summaries, meaning that to avoid this publishers would have to opt out of being scraped by Googlebot, which indexes for search.
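In practice, these opt-outs are expressed as crawler rules in a site's robots.txt file. A minimal sketch using Google's documented crawler tokens (the commented-out Googlebot rule illustrates the drastic step a publisher would have to take to also stay out of AI Overviews):

```txt
# Block Google's AI training crawler token (covers Gemini and Vertex)
# while remaining indexed in Google Search
User-agent: Google-Extended
Disallow: /

# Blocking Googlebot as well would keep content out of Search results
# and AI Overviews entirely, at the cost of all search visibility
# User-agent: Googlebot
# Disallow: /
```

This asymmetry is the "unenviable choice" Rogerson describes: the first rule stops AI training scraping, but only the second, self-destructive one removes content from Google's search-derived AI features.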

Rogerson said the presence of Googlebot on “almost the vast majority of websites on the open web enables Google unparalleled access to IP published online” and added that this IP is “now being used to enable Google’s LLMs, and those of third party companies such as Meta, to respond accurately to user queries in real-time”.

In a letter to Baroness Stowell, chair of the House of Lords Communications and Digital Committee, Rogerson said: “This leaves website owners with an unenviable choice.

“To opt-out of the Google Search crawler entirely, and become invisible to the 90%+ of the UK population that currently uses Google Search, or allow scraping to continue in ways that both extract value without compensation, and undermine nascent commercial licensing markets for the use of high quality IP to build and enable the AI models of the future.”

Rogerson’s letter was triggered by Media Minister Stephanie Peacock inaccurately stating in a Future of News inquiry hearing earlier in October that the FT “has an agreement with Google“.

Peacock said licensing approaches like these are “obviously welcome” but there is “no consistency to it. It is quite piecemeal, and there is definitely a question around making it more consistent.”

The FT has not done any deal with Google for the use of its content in LLMs and other AI products, although it has previously been a partner of the tech giant in other projects like the Google News Showcase aggregation service.

The FT has separately signed a licensing agreement with OpenAI and a “number of other agreements for innovative AI related partnerships” including a private beta test with Prorata.ai, a start-up developing technology for generative AI platforms to share revenue with publishers each time their content is used to generate an answer.

Rogerson said of the FT’s OpenAI and Prorata deals: “Both of these agreements begin to align the incentives of AI platforms and publishers in the interests of quality journalism, the reader and respect for IP.

“We strongly believe that sharing revenues between technology companies that use IP and the publishers that create it – can help develop a healthier and fairer information ecosystem that encourages accurate and authoritative journalism and rightly rewards those who produce it.”

He added that this goal of aligning incentives is “being undermined by the scraping practices of incumbent technology companies” including Google.

He said the scraping of publisher IP by Google, which uses it in its own LLMs and sells it to companies like Meta, is still agreed to because sites want to appear in the tech giant’s dominant search engine.

But he said this “means that those companies extract commercial value from the source material, without a user ever engaging with the source of that information.

“From Wikipedia to the Watford Observer, websites rely on engagement with users: engagement that is generated by the content invested in and generated by those sites. Without such engagement the ability to generate any of those revenue streams disappears. This was the social contract of the open web, that value would be shared between search and social gateways and the investors in intellectual property.”

A Google spokesperson said in response: “Every day, Google sends billions of clicks to sites across the web, and we intend for this long-established value exchange with publishers to continue.

“With AI Overviews, people find Search more helpful and they’re coming back to search more, creating new opportunities for content to be discovered. People are using AI Overviews to discover more of the web, and we’re continuing to improve the experience to make that even easier.

“We also provide web publishers with a range of controls to indicate how much of their content is eligible to display in Search.” These controls extend to AI Overviews.

Google claims, although it has not yet shared data to support this, that clicks from AI Overviews are higher quality because people are likely to spend more time on the site.

It says Googlebot is used in AI Overviews because AI has long been built into search and is integral to how it functions.

And it tells publishers that don’t want their content to appear in AI Overviews to use the nosnippet meta tag and the data-nosnippet attribute to limit the visibility of specific pages or parts of a page – similar to how they could previously control whether they appeared as featured snippets at the top of results.
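These are standard, documented snippet controls rather than AI-specific ones. A minimal sketch of how a publisher might apply them:

```html
<!-- Page-level: prevent any text snippet (including in AI Overviews)
     from being shown for this page in Google results -->
<meta name="robots" content="nosnippet">

<!-- Element-level: exclude only the marked section from snippets,
     while the rest of the page remains eligible -->
<div data-nosnippet>
  Subscriber-only analysis that should not be excerpted by Google.
</div>
```

The page itself stays indexed and can still rank; only the excerpting of its text is restricted, which is why publishers regard this as a blunt instrument compared with a dedicated AI Overviews opt-out.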

US election: Speed and fairness are key tactics for fast-growing Newsweek
https://pressgazette.co.uk/north-america/newsweek-us-election/
Tue, 05 Nov 2024 10:27:52 +0000

Fast-growing media brand Newsweek has said it wants to use speed and fairness to help it stand out in the race for readers during the US election.

Newsweek SVP of audience development Josh Awtry told Press Gazette the newsbrand — the fastest-growing of the world’s top 50 English-language news sites — has to find a way to compete with news outlets “who have got infinitely more resources, fleets of data engineers… we’re up against people who have holograms”.

The magazine publisher’s plan, he said, was to “jump right to the key questions on people’s minds” beyond the individual race results and to be quicker than the competition.

He said the title had historically hired journalists with an emphasis on speed, has a roster of overnight journalists in the UK, and has “prewrites” ready to go for a range of stories. (The brand has also reportedly been hiring AI-assisted live news reporters.)

The site typically publishes more than 300 stories per day with a strong emphasis on politics and current affairs. According to Press Gazette’s rankings Newsweek is the fastest-growing top-50 English language news website in the world.

Newsweek has tried to position itself above the partisan fray: the website features a “Daily Debate” panel high up that hopes to present two sides of an issue (or, as the election approached, pitches from both Donald Trump and Kamala Harris supporters).

It also rolled out a “Fairness Meter” a year ago that allows readers to vote on whether a given article is fair or whether it leans right or left.

A screenshot of the “Fairness Meter” that appears beneath many articles on Newsweek.com: a fan-shaped dial on which readers rate a story from “Unfair Left Leaning” through “Fair” to “Unfair Right Leaning”, with the most popular rating shown underneath.

Awtry said the introduction of the Fairness Meter came out of a desire for the audience “to keep us honest”.

Commenting that “it’s tough to out-journalism The New York Times”, he added: “What we’re hoping to do is differentiate from a lot of the other media in being more interactive and listening a lot to our community.”

Four-tenths of Newsweek’s audience “self-identifies as politically middle of the road,” Awtry said, “and if you break it apart to those who lean left and those who lean right, it’s pretty close to even…

“We don’t want to lose that ability to speak to America. If you look at our state-by-state penetration, it mirrors the population trends. Rural, urban, suburban — pick your metric, and we look like America.”

The Fairness Meter has been applied to more than 25,000 stories since the feature launched, Awtry said. It is not placed on all articles, although most harder news stories receive it. In 90% of cases where it has been used the story has been labelled “fair” by a majority of voters, he said, and of the two million votes cast in total, 70% have been for the fair option. Approximately 14% of all those votes indicated a story “tilts right” and 16% that it “tilts left”.

Users do not have to register to cast a vote on the Fairness Meter, and Awtry said the plan next year is “to keep the core functionality wide open and free for everyone, but then to use it as a registration driver for that smaller percent of readers who do want some cooler things”, for example to let users track how they have voted over time.

Newsweek.com is free to read, although it offers a paid digital subscription that removes ads and provides access to exclusive podcasts and newsletters.

“We don’t want to create any friction in the ability to register that vote,” Awtry said. “We don’t want you to have to sign up or create an account. We want that to be something that we use to take the pulse. But we’re looking for features that power users might want.”

Who’s suing AI and who’s signing: Publisher deals vs lawsuits with generative AI companies
https://pressgazette.co.uk/platforms/news-publisher-ai-deals-lawsuits-openai-google/
Wed, 30 Oct 2024 16:42:24 +0000

Hearst in the US is latest to sign content deal with OpenAI.

News publishers are increasingly deciding to sign deals with AI companies over the use of their content despite early doubts and a high-profile legal case from The New York Times.

The deals commonly include the use of news publishers’ content as reference points for user queries in tools like ChatGPT (with citation back to their websites currently promised) as well as giving them the use of the AI tech to build their own products.

This page will be updated when new deals are struck or legal actions are launched relating to news publishers and AI companies (latest: Meta strikes an AI deal with Reuters while News Corp subsidiaries sue Perplexity).

OpenAI is reportedly offering news organisations between $1m and $5m per year to license their copyrighted content to train its models – although News Corp’s deal is reportedly worth more than $250m over five years.

Meanwhile Apple has reportedly been exploring AI deals with the likes of Conde Nast, NBC News and People and Daily Beast owner IAC to license their content archives, but nothing has yet been made public.

Plenty of other news organisations are understood to be in negotiations with OpenAI, while some, including the publisher of Mail Online, have suggested they are seriously considering their legal options.

But not all publishers want deals: Reach chief executive Jim Mullen told investors on 5 March that the UK’s largest commercial publisher is not in any “active discussions” with AI companies and suggested other publishers should hold off on deals to allow the industry to come at the issue with a position of solidarity.

He said: “We would prefer that we don’t get into a situation where we did with the referrers ten years ago and gave them access and we became hooked on this referral traffic and we would like it to be more structured. We produce content, which is really valuable, and we would like to license or agree how they use our base intelligence to actually inform the AI and the open markets. The challenge we have as an industry is that we need to be unified.

“I used to be the chairman of the NMA and if we stay together and work with it, then that’s a really strong position that we have, particularly with the Government to help us get to there. So I’m using this as a bit of a campaign, [it] only takes one publisher to break away and start doing deals and then it sort of disintegrates.”

Press Gazette analysis in February found that more than four in ten of the 100 biggest English-language news websites have decided not to block AI bots from the likes of OpenAI and Google.

If you feel there is something missing that should be included, or you want to alert us to a new development, please contact charlotte.tobitt@pressgazette.co.uk.

Suing

News Corp (versus Perplexity)

The News Corp subsidiaries that publish the Wall Street Journal and New York Post have filed a copyright and trademark infringement lawsuit against AI upstart Perplexity, which they accuse of “massive freeriding”.

The publisher is seeking massive damages and the removal of its content from Perplexity’s web index and wants its case heard at a jury trial.

News Corp has separately signed a deal with OpenAI (see below for more information). It is the first to sue Perplexity though other publishers including The New York Times have sent the AI company cease and desist letters.

Read the full story here.

Mumsnet

UK parenting forum and publisher Mumsnet has launched legal action via an initial letter against OpenAI over the scraping of its site and its more than six billion words – “presumably” for the training of large language model ChatGPT.

Mumsnet founder Justine Roberts told users: “Such scraping without permission is an explicit breach of our terms of use, which clearly state that no part of the site may be distributed, scraped or copied for any purpose without our express approval. So we approached Open AI and suggested they might like to licence our content.”

In particular, she said, Mumsnet’s content would be valuable because it could help to counter the misogyny “baked in” to many AI models.

But, she continued: “Their response was that they were more interested in datasets that are not easily accessible online.”

Roberts said what OpenAI is doing differs from Google’s scraping of the web for search purposes because there is a “clear value exchange in allowing Google to access that data, namely the resulting search traffic… The LLMs are building models like ChatGPT to provide the answers to any and all prospective questions that will mean we’ll no longer need to go elsewhere for solutions. And they’re building those models with scraped content from the websites they are poised to replace.”

Roberts continued: “At Mumsnet we’re in a stronger position than most because much of our traffic comes to us direct and though it’s a piece of cake for an LLM to spit out a Mumsnet-style answer to a parenting question I doubt they’ll ever be as funny about parking wars or as honest about relationships and they’ll certainly never provide the emotional support that sees around a thousand women a year helped to leave abusive partners by other Mumsnet users.

“But if these trillion-dollar giants are simply allowed to pillage content from online publishers – and get away with it – they will destroy many of them.”

Roberts acknowledged it is “not an easy task” to go up against a big tech company like OpenAI but said “this is too important an issue to simply roll over”.

Responses from users on the forum contained a lot of “well done” and “good luck”.

The Center for Investigative Reporting

Non-profit news organisation The Center for Investigative Reporting, which produces Mother Jones (after a merger this year) and Reveal, is suing OpenAI and its largest shareholder Microsoft, it announced on 28 June.

It said the companies had used its content “without permission or offering compensation” and accused them of “exploitative practices” in a lawsuit filed in New York.

Chief executive Monika Bauerlein said: “OpenAI and Microsoft started vacuuming up our stories to make their product more powerful, but they never asked for permission or offered compensation, unlike other organizations that license our material.

“This free rider behavior is not only unfair, it is a violation of copyright. The work of journalists, at CIR and everywhere, is valuable, and OpenAI and Microsoft know it.”

She added: “For-profit corporations like OpenAI and Microsoft can’t simply treat the work of nonprofit and independent publishers as free raw material for their products.

“If this practice isn’t stopped, the public’s access to truthful information will be limited to AI-generated summaries of a disappearing news landscape.”

Eight Alden Global Capital daily newspapers

Eight daily newspapers in the US owned by Alden Global Capital are suing OpenAI and Microsoft, it was revealed on 30 April.

The newspapers involved in the lawsuit are: the New York Daily News, the Chicago Tribune, the Orlando Sentinel, the Sun-Sentinel in Florida, the Mercury News in San Jose, the Denver Post, the Orange County Register and the St. Paul Pioneer Press.

The lawsuit says the newspapers want recognition that they have a legal right over their content and compensation for the use of it in the training of AI tools so far.

Frank Pine, executive editor of Media News Group and Tribune Publishing Newspapers, the Alden subsidiaries that own the newspapers concerned, said: “We’ve spent billions of dollars gathering information and reporting news at our publications, and we can’t allow OpenAI and Microsoft to expand the Big Tech playbook of stealing our work to build their own businesses at our expense.

“They pay their engineers and programmers, they pay for servers and processors, they pay for electricity, and they definitely get paid from their astronomical valuations, but they don’t want to pay for the content without which they would have no product at all. That’s not fair use, and it’s not fair. It needs to stop.

“The misappropriation of news content by OpenAI and Microsoft undermines the business model for news. These companies are building AI products clearly intended to supplant news publishers by repurposing purloined content and delivering it to their users.

“Even worse, when they’re not delivering the actual verbatim reporting of our hard-working journalists, they misattribute bogus information to our news publications, damaging our credibility. We employ professional journalists who adhere to the highest standards of accuracy and fairness. They are real people who go out into the world to conduct first-hand interviews and engage in actual investigations to produce our journalism.

“Their work is vetted and checked by professional editors. The Mercury News has never recommended injecting disinfectants to treat COVID, and the Denver Post did not publish research that shows smoking cures asthma. These and other ChatGPT hallucinations are documented in our legal filings.”

The Intercept, Raw Story and AlterNet

Three US progressive news and politics digital outlets filed lawsuits against OpenAI on Wednesday 28 February.

The Intercept, Raw Story and AlterNet objected to the use of their articles to train ChatGPT. The Intercept also sued Microsoft, which has partnered with OpenAI to create a Bing chatbot.

Raw Story publisher Roxanne Cooper said: “Raw Story’s copyright-protected journalism is the result of significant efforts of human journalists who report the news. Rather than license that work, OpenAI taught ChatGPT to ignore journalists’ copyrights and hide its use of copyright-protected material.”

CEO and founder John Byrne added: “It is time that news organisations fight back against Big Tech’s continued attempts to monetise other people’s work.”

The New York Times

The most high-profile case against OpenAI and Microsoft from a news publisher so far, The New York Times made a surprise announcement in the days after Christmas that it would seek damages, restitution and costs as well as the destruction of all large language models (LLMs) trained on its content.

OpenAI and NYT had been in negotiations for nine months but the news organisation felt no resolution was forthcoming and decided instead to share its concerns over the use of its intellectual property publicly. The success of the lawsuit will depend on the US court’s interpretation of “fair use” in copyright law – assuming the companies don’t find their way to a settlement first.

OpenAI previously said a “high-value partnership around real-time display with attribution in ChatGPT” was on the cards with the NYT before the news organisation surprised it by launching the lawsuit.

The NYT said the two tech companies, which have a partnership centred around ChatGPT and Bing, have “reaped substantial savings by taking and using – at no cost” its content to create their models without paying for a licence. It added that the use of its content in chatbots “threatens to divert readers, including current and potential subscribers, away from The Times, thereby reducing the subscription, advertising, licensing, and affiliate revenues that fund The Times’s ability to continue producing its current level of groundbreaking journalism”.

In its response, filed on Monday 26 February, OpenAI argued: “In the real world, people do not use ChatGPT or any other OpenAI product” to substitute for a NYT subscription. “Nor could they. In the ordinary course, one cannot use ChatGPT to serve up Times articles at will.”

OpenAI accused the NYT of paying someone to hack its products and taking “tens of thousands of attempts to generate the highly anomalous results” in which verbatim paragraphs from articles were spat out by ChatGPT. “They were able to do so only by targeting and exploiting a bug (which OpenAI has committed to addressing) by using deceptive prompts that blatantly violate OpenAI’s terms of use,” it said.

“And even then, they had to feed the tool portions of the very articles they sought to elicit verbatim passages of, virtually all of which already appear on multiple public websites. Normal people do not use OpenAI’s products in this way.”

Getty Images

Getty Images began legal proceedings against Stability AI in the UK in January 2023, claiming that the AI image company “unlawfully copied and processed” millions of its copyrighted images without a licence through its text-to-image model Stable Diffusion.

In December, the High Court in London ruled that Getty’s case could go to trial after Stability AI failed to persuade a judge that two aspects of the claim – relating to training and development as well as copyright – should be struck out.

Mrs Justice Joanna Smith said Getty’s claim has a “real prospect of success” in relation to Stable Diffusion’s “image-to-image feature” which the photo agency claimed allows users to make “essentially identical copies of copyright works”.

Who’s signed news AI deals?

Reuters

Reuters – which previously said it had struck a number of deals with unspecified AI companies, and then signed up as a publisher partner for Microsoft's new AI companion Copilot – has become the first news publisher to sign an AI deal with Meta.

The deal allows Meta’s AI chatbot to use real-time Reuters content to answer questions from users about news and current events, it announced on 25 October, although it will begin only in the US.

The chatbot, which appears alongside the search and messaging features on Facebook, Instagram, WhatsApp and Messenger, will provide summaries and link out to Reuters, which will be compensated when its work is used in this way.

Reuters already had a fact-checking partnership with the Facebook owner.

A Reuters spokesperson said: “We can confirm that Reuters has partnered with tech providers to license our trusted, fact-based news content to power their AI platforms. The terms of these deals remain confidential.”

A Meta spokesperson told Axios: “We’re always iterating and working to improve our products, and through Meta’s partnership with Reuters, Meta AI can respond to news-related questions with summaries and links to Reuters content.

“While most people use Meta AI for creative tasks, deep dives on new topics or how-to assistance, this partnership will help ensure a more useful experience for those seeking information on current events.”

The Lenfest Institute for Journalism

OpenAI and Microsoft are distributing $10m to The Lenfest Institute for Journalism to provide five US newsrooms with grants to each hire a fellow to work on AI projects for two years.

The newsrooms benefiting from the initial round of funding are: Chicago Public Media, Newsday in Long Island, The Minnesota Star Tribune, The Philadelphia Inquirer and The Seattle Times. Three further news organisations will receive funding in a second round.

The projects from the fellows should “focus largely on improving business sustainability and implementing AI technologies within their organisations”, Lenfest said.

OpenAI and Microsoft will also allow the publications to use their technology to experiment and develop tools to help with their local news output.

Tom Rubin, chief of intellectual property and content at OpenAI, said: “While nothing will replace the central role of reporters, we believe that AI technology can help in the research, investigation, distribution, and monetisation of important journalism.

“We’re deeply invested in supporting smaller, independent publishers through initiatives like The Lenfest Institute AI Collaborative and Fellowship, ensuring they have access to the same cutting-edge tools and opportunities as larger organizations.”

Hearst

Newspaper and magazine giant Hearst has agreed a “content partnership” with OpenAI in the US, it announced on 8 October.

Hearst said OpenAI products including ChatGPT will incorporate content from its US brands including Houston Chronicle, San Francisco Chronicle, Esquire, Cosmopolitan, Elle, Runner’s World and Women’s Health – more than 20 magazine titles and 40 newspapers in total. It does not include Hearst’s content in other countries like the UK.

Hearst said its content will “feature appropriate citations and direct links, providing transparency and easy access to the original Hearst sources” from ChatGPT.

Hearst Newspapers president Jeff Johnson said: “As generative AI matures, it’s critical that journalism created by professional journalists be at the heart of all AI products.

“This agreement allows the trustworthy and curated content created by Hearst Newspapers’ award-winning journalists to be part of OpenAI’s products like ChatGPT — creating more timely and relevant results.”

Hearst Magazines president Debi Chirichella added: “Our partnership with OpenAI will help us evolve the future of magazine content. This collaboration ensures that our high-quality writing and expertise, cultural and historical context and attribution and credibility are promoted as OpenAI’s products evolve.”

And OpenAI chief operating officer Brad Lightcap said the use of Hearst content “elevates our ability to provide engaging, reliable information to our users”.

FT, Reuters, Axel Springer, Hearst Mags, USA Today Network

The FT, Reuters, Axel Springer, Hearst Mags and USA Today Network were named as publisher partners for Microsoft’s new AI “companion”, Copilot, at the start of October.

Those announced were existing partners of Microsoft’s MSN news licensing service but Press Gazette understands these are new deals.

Microsoft said Copilot Daily can give a summary of the news and weather using an AI Copilot Voice:

“It’s an antidote for that familiar feeling of information overload. Clean, simple and easy to digest. Copilot Daily will only pull from authorised content sources. We are working with partners such as Reuters, Axel Springer, Hearst Magazines, USA Today Network and Financial Times, and plan to add more sources over time. We’ll also add additional personalisation and controls in Copilot Daily over time.”

Conde Nast

Vogue, Wired, Vanity Fair and GQ publisher Conde Nast has become the latest publisher to sign a “multi-year partnership” relating to the display of its content in OpenAI products, it announced on 20 August.

Conde Nast chief executive Roger Lynch has been outspoken about the risks generative AI poses to news businesses, telling US Congress that “many” media companies could go out of business by the time any litigation passes through the courts, and that “immediate action” should be taken to clarify that content creators must be compensated for the use of their work in AI training.

In a memo to staff he has now said the OpenAI deal helps to make up for revenue being lost through declining search traffic.

He wrote: “It’s crucial that we meet audiences where they are and embrace new technologies while also ensuring proper attribution and compensation for use of our intellectual property. This is exactly what we have found with OpenAI.

“Over the last decade, news and digital media have faced steep challenges as many technology companies eroded publishers’ ability to monetize content, most recently with traditional search. Our partnership with OpenAI begins to make up for some of that revenue, allowing us to continue to protect and invest in our journalism and creative endeavours.”

The deal will allow OpenAI to display content from Conde Nast brands in its products, including ChatGPT and its SearchGPT AI-driven search engine prototype.

OpenAI explained what this means in a blog post: “With the introduction of our SearchGPT prototype, we’re testing new search features that make finding information and reliable content sources faster and more intuitive. We’re combining our conversational models with information from the web to give you fast and timely answers with clear and relevant sources. SearchGPT offers direct links to news stories, enabling users to easily explore more in-depth content directly from the source.

“We plan to integrate the best of these features directly into ChatGPT in the future.

“We’re collaborating with our news partners to collect feedback and insights on the design and performance of SearchGPT, ensuring that these integrations enhance user experiences and inform future updates to ChatGPT.”

Lynch praised OpenAI for being “transparent and willing to productively work with publishers like us so that the public can receive reliable information and news through their platforms”.

He continued: “This partnership recognises that the exceptional content produced by Condé Nast and our many titles cannot be replaced, and is a step toward making sure our technology-enabled future is one that is created responsibly.

“It is just the beginning and we will continue what we started in Washington earlier this year – the fight for fair deals and partnerships across the industry until all entities developing and deploying artificial intelligence take seriously, as OpenAI has, the rights of publishers.”

Financial Times, Axel Springer, The Atlantic, Fortune

Financial Times, Axel Springer, The Atlantic and Fortune (as well as Universal Music Group) have agreed to license their content to generative AI start-up Prorata.ai.

Prorata says it has a proprietary algorithm that can work out how much of various publishers’ content is used in an answer and share revenue accordingly. When it launches its own chatbot this autumn, it says, it will share 50% of the revenue from subscriptions with content creators.
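Prorata has not disclosed how its attribution algorithm works, but the proportional split it describes is straightforward to illustrate. The sketch below is purely hypothetical: the `split_revenue` helper, the publisher names and the attribution weights are all invented for illustration, and the only figure taken from Prorata's stated plan is the 50% creator share.

```python
# Hypothetical sketch of a proportional revenue share of the kind Prorata
# describes: half of subscription revenue is pooled, then divided among
# publishers in proportion to how much of their content was used in answers.
# Publisher names and weights below are invented for illustration.

def split_revenue(subscription_revenue, attribution_weights, creator_share=0.5):
    """Return each publisher's payout from the creator pool.

    attribution_weights maps publisher name -> relative share of content
    used in answers (any non-negative scale; only ratios matter).
    """
    pool = subscription_revenue * creator_share
    total_weight = sum(attribution_weights.values())
    return {
        publisher: pool * weight / total_weight
        for publisher, weight in attribution_weights.items()
    }

# Example with made-up usage figures: FT content was referenced three times
# as often as the other two publishers combined weights suggest.
payouts = split_revenue(
    1000.0,
    {"FT": 3.0, "The Atlantic": 1.0, "Fortune": 1.0},
)
print(payouts)  # {'FT': 300.0, 'The Atlantic': 100.0, 'Fortune': 100.0}
```

Under this toy model the creator pool is $500 (half of $1,000), and a publisher credited with three-fifths of the attributed usage receives three-fifths of that pool.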

Read our full story about Prorata’s plan here.

Time, Der Spiegel, Fortune, Entrepreneur, The Texas Tribune and WordPress owner Automattic

Time, Der Spiegel, Fortune, Entrepreneur, The Texas Tribune and WordPress.com owner Automattic have become the first publishers to sign up to a revenue-sharing deal launched by AI search chatbot Perplexity.

When Perplexity introduces advertising via sponsored related questions within the next few months, signed-up publishers will be able to share the revenue generated by interactions where their content is referenced.

The programme also gives them access to the analytics platform Scalepost.ai to see which of their articles show up frequently in monetised Perplexity answers, access to Perplexity technology to create their own custom answer engines for their websites, and one year of Perplexity Enterprise Pro for all employees.

Read our full story about the revenue-sharing programme, and Perplexity’s view on its relationship with publishers, here.

Time

Time has signed a “multi-year content deal and strategic partnership” with OpenAI, it revealed on 27 June.

The deal will give the ChatGPT creator access to Time’s 101-year-old archive and its current reporting to give up-to-date answers to users (with a citation and a link back to the website).

Time will also have access to OpenAI tech to build its own products and provide feedback to the tech company on the delivery of journalism through its tools.

Time chief operating officer Mark Howard said: “Throughout our 101-year history, Time has embraced innovation to ensure that the delivery of our trusted journalism evolves alongside technology. This partnership with OpenAI advances our mission to expand access to trusted information globally as we continue to embrace innovative new ways of bringing Time’s journalism to audiences globally.”

OpenAI chief operating officer Brad Lightcap said the deal supports “reputable journalism by providing proper attribution to original sources.”

Vox Media

Vox Media has signed a “strategic content and product partnership” with OpenAI that means content – including archive journalism – from its brands including Vox, The Verge, Eater, New York Magazine, The Cut, Vulture and SB Nation will be surfaced on ChatGPT and also that it can use OpenAI’s tech to develop audience-facing and internal products.

The publisher said it will use OpenAI tech to create stronger creative optimisation and audience segment targeting on its first-party data platform Forte, which is used across all Vox Media sites and on its ad marketplace Concert.

It will also use OpenAI tools to match people with the right products on its search-based affiliate commerce tool The Strategist Gift Scout.

Vox Media co-founder, chair and chief executive Jim Bankoff said: “This agreement aligns with our goals of leveraging generative AI to innovate for our audiences and customers, protect and grow the value of our work and intellectual property, and boost productivity and discoverability to elevate the talent and creativity of our exceptional journalists and creators.”

The Atlantic

The Atlantic also announced on 29 May it has signed a “strategic content and product partnership” with OpenAI meaning its articles will be discoverable within ChatGPT and the AI giant’s other products, with these results providing attribution and links to its website.

The partnership also means The Atlantic “will help to shape how news is surfaced and presented in future real-time discovery products”.

The companies are also collaborating on product and tech, with The Atlantic’s product team given “privileged access” to OpenAI tech to give feedback and help shape the future of news in ChatGPT and other OpenAI products.

The Atlantic said it is currently developing an experimental microsite called Atlantic Labs “to figure out how AI can help in the development of new products and features to better serve its journalism and readers”. It will pilot OpenAI’s and other emerging tech in this work.

Nicholas Thompson, chief executive of The Atlantic, said: “We believe that people searching with AI models will be one of the fundamental ways that people navigate the web in the future.”

He added that the partnership will mean The Atlantic’s reporting is “more discoverable” to OpenAI’s millions of users and give the publisher “a voice in shaping how news is surfaced on their platforms”.

OpenAI chief operating officer Brad Lightcap said: “Enabling access to The Atlantic’s reporting in our products will allow users to more deeply interact with thought-provoking news. We are dedicated to supporting high-quality journalism and the publishing ecosystem.”

[Read more: What’s next for The Atlantic after reaching profitability and 1m subscribers]

WAN-IFRA

The World Association of News Publishers (WAN-IFRA) has announced a partnership with OpenAI for a programme, Newsroom AI Catalyst, designed to “help newsrooms fast-track their AI adoption and implementation to bring efficiencies and create quality content”.

The project will work with 128 newsrooms in Europe, Asia Pacific, Latin America and South Asia, providing expert guidance with funding and technical assistance from OpenAI.

Each team will receive three months of learning modules, hands-on workshops, a mini hackathon, and a showcase. They will go back to their newsrooms with a clear plan on how to roll out AI.

Vincent Peyregne, chief executive of WAN-IFRA, said: “News enterprises across the globe have come under pressure from declining advertising and print subscription revenues. The adversity confronting news leaves communities without access to a shared basis of facts and shared values and puts democracy itself at risk.

“AI technologies can positively influence news organisations’ sustainability as long as you quickly grasp the stakes and understand how to turn it to your advantage.”

He added that OpenAI’s support will “help the newsrooms through the adoption of AI technologies to provide high-quality journalism that is the cornerstone of the news business”.

OpenAI’s chief of intellectual property and content Tom Rubin said the programme is “designed to turbocharge the capabilities of 128 newsrooms” and he wants to help “cultivate a healthy, sustainable ecosystem that promotes quality journalism”.

News Corp

News Corp has signed a deal that includes the use of content from many of its major newsbrands in the UK, US and Australia in OpenAI’s large language models, it was announced on 22 May.

The partnership covers content from The Wall Street Journal, Barron’s, MarketWatch, Investor’s Business Daily, FN, and the New York Post in the US; The Times, The Sunday Times and The Sun in the UK; and The Australian, news.com.au, The Daily Telegraph, The Courier Mail, The Advertiser, and the Herald Sun in Australia.

The Wall Street Journal put a value on the deal of more than $250m over five years.

News Corp chief executive Robert Thomson described OpenAI chief executive Sam Altman and his team as “principled partners… who understand the commercial and social significance of journalists and journalism.

“This landmark accord is not an end, but the beginning of a beautiful friendship in which we are jointly committed to creating and delivering insight and integrity instantaneously.”

Dotdash Meredith

Dotdash Meredith, which publishes more than 40 titles including People, InStyle and Investopedia, signed a multi-year deal with OpenAI on 7 May that will see its content and links surfaced in ChatGPT responses.

OpenAI will incorporate real-time information from Dotdash sites into ChatGPT’s responses to queries and will use the publisher’s content to train its large language models. Dotdash meanwhile will receive assistance from OpenAI in developing both consumer-facing AI products and its AI-powered contextual advertising tool, D/Cipher.

B2B giant Informa

Business information giant Informa announced a non-exclusive Partnership and Data Access Agreement with Microsoft (the main backer of OpenAI) in a trading update on 8 May. The deal involves an initial fee of more than $10m, followed by three further recurring annual payments.

Informa said the deal covers:

- “Improved Productivity: Explore how AI can enable more effective ways of working at Informa, streamlining operations, utilising Copilot for Microsoft 365 to enable Colleagues to work more efficiently, and enhancing the capabilities of Informa’s existing AI and data platforms (IIRIS);

- Citation Engine: Collaborate to further develop automated citation referencing, using the latest technology to improve speed and accuracy;

- Specialist Expert Agent: Explore the development of specialised expert agents for customers such as authors and librarians to assist with research, understanding and new knowledge creation/sharing;

- Data Access: Provide non-exclusive access to Advanced Learning content and data to help improve relevance and performance of AI systems.”

Informa said the deal “protects intellectual property rights, including limits on verbatim text extracts and alignment on the importance of detailed citation references”.

Axel Springer (again)

Following its deal with OpenAI (see below) Axel Springer has announced an expanded partnership with Microsoft covering AI, advertising, content and cloud computing.

On AI, they will partner to develop new AI-driven chat experiences to inform users using Axel Springer’s journalism.

They added: “In addition, Axel Springer will leverage Microsoft Advertising’s Chat Ads API for generative AI monetisation.”

Their existing adtech collaboration will be expanded from Europe into the US to encompass Politico, while users of Microsoft’s aggregator Start-MSN will have access to more premium content from Axel Springer’s brands. Finally the publisher will migrate its SAP solutions to Microsoft Azure.

Axel Springer chief executive Mathias Döpfner said: “In this new era of AI, partnerships are critical to preserving and promoting independent journalism while ensuring a thriving media landscape.

“We’re optimistic about the future of journalism and the opportunities we can unlock through this expanded partnership with Microsoft.”

Microsoft chairman and chief executive Satya Nadella added: “Our expanded partnership with Axel Springer brings together their leadership in digital publishing with the full power of the Microsoft Cloud — including our ad solutions — to build innovative AI-driven experiences and create new opportunity for advertisers and users.”

Financial Times

On 29 April the Financial Times became the first major UK newsbrand to announce a deal with OpenAI.

The partnership involves up-to-date news content and journalism from the FT archive, meaning it is likely to assist with both real-time queries on ChatGPT and its continued training.

FT Group chief executive John Ridding said: “This is an important agreement in a number of respects.

“It recognises the value of our award-winning journalism and will give us early insights into how content is surfaced through AI… Apart from the benefits to the FT, there are broader implications for the industry. It’s right, of course, that AI platforms pay publishers for the use of their material.”

Le Monde and Prisa Media

OpenAI announced on 13 March it had signed deals with French newsbrand Le Monde and Spanish publisher Prisa Media, which publishes El País, Cinco Días, As and El Huffpost.

The deals will mean ChatGPT users can surface recent content from both publishers through “select summaries with attribution and enhanced links to the original articles”, while their content will be allowed to contribute to training OpenAI’s models.

Le Monde chief executive Louis Dreyfus said: “At the moment we are celebrating the 80th anniversary of Le Monde, this partnership with OpenAI allows us to expand our reach and uphold our commitment to providing accurate, verified, balanced news stories at scale.

“Collaborating with OpenAI ensures that our authoritative content can be accessed and appreciated by a broader, more diverse audience… Our partnership with OpenAI is a strategic move to ensure the dissemination of reliable information to AI users, safeguarding our journalistic integrity and revenue streams in the process.”

Carlos Nuñez, chairman and chief executive of Prisa Media, added: “Joining forces with OpenAI opens new avenues for us to engage with our audience. Leveraging ChatGPT’s capabilities allows us to present our in-depth, quality journalism in novel ways, reaching individuals who seek credible and independent content.

“This is a definite step towards the future of news, where technology and human expertise merge to enrich the reader’s experience.”

Reuters

Thomson Reuters chief executive Steve Hasker told the Financial Times that the company had struck “a number” of deals with AI companies looking to use Reuters news content to train their models but he did not give any further details about who was involved in the deals or for how much.

He did say that “there appears to be a market price evolving”, adding: “These models need to be fed. And they may as well be fed by the highest-quality, independent fact-based content. We have done a number of those deals, and we’re exploring the potential there.”

However, away from the Reuters news side of the business, Thomson Reuters is suing Ross Intelligence for allegedly unlawfully copying content from its legal research platform Westlaw to train a rival AI-powered legal research platform.

Unknown independent publishers

A handful of unnamed independent publishers are taking part in a private programme with Google that will see them paid a five-figure annual sum to trial a new AI platform, according to Adweek.

The publishers are reportedly expected to produce a certain number of stories for a year and provide analytics and feedback in exchange.

Reddit

Social media platform Reddit has signed a deal allowing its content to be used by Google in the training of its AI tools. Reuters reported that the deal is worth around $60m per year.

Although Reddit is not a news organisation, the deal is still a content licensing one. Reddit posts are also likely to contain news media content copied by users, which could therefore fall within the remit of the deal.

Semafor (sort of)

Ben Smith and Justin B Smith’s start-up Semafor has secured “substantial” Microsoft sponsorship for an AI-driven news feed, although this was not built by the tech giant but by the newsroom itself.

The deal, announced in February, will see Microsoft help Semafor refine the tool and makes the digital outlet one of the first newsrooms to heavily involve ChatGPT in its workflow.

Although not a content deal as such, the agreement indicates a level of co-operation rather than acrimony.

Axel Springer

In December Politico, Business Insider, Bild and Welt owner Axel Springer agreed a partnership with OpenAI that would see its content summarised within ChatGPT around the world, including otherwise paywalled content, with links and attribution. Axel Springer’s content is permitted to be used to train OpenAI products going forward.

Axel Springer can also use OpenAI technology to continue building its own AI products.

Axel Springer CEO Mathias Döpfner said: “We are excited to have shaped this global partnership between Axel Springer and OpenAI – the first of its kind. We want to explore the opportunities of AI empowered journalism – to bring quality, societal relevance and the business model of journalism to the next level.”

American Journalism Project

In July 2023 OpenAI committed $5m to the American Journalism Project, a philanthropic organisation working to support and rebuild local news organisations, to support the expansion of its work. It also pledged up to $5m in OpenAI API credits to help participating organisations try out emerging AI technologies.

American Journalism Project chief executive Sarabeth Berman said: “To ensure local journalism remains an essential pillar of our democracy, we need to be smart about the potential powers and pitfalls of new technology. In these early days of generative AI, we have the opportunity to ensure that local news organisations, and their communities, are involved in shaping its implications. With this partnership, we aim to promote ways for AI to enhance—rather than imperil—journalism.”

Associated Press

OpenAI and Associated Press signed a deal in July 2023 that allows the AI company to license the news agency’s content archive going back to 1985 for training purposes.

The companies said they are also looking at “potential use cases for generative AI in news products and services” but did not share specifics.

Kristin Heitmann, AP senior vice president and chief revenue officer, said: “We are pleased that OpenAI recognises that fact-based, nonpartisan news content is essential to this evolving technology, and that they respect the value of our intellectual property. AP firmly supports a framework that will ensure intellectual property is protected and content creators are fairly compensated for their work.”

One professor told AP the deal could be particularly beneficial to OpenAI because it would mean the company can keep using a wealth of trusted content even if it loses other lawsuits, such as that brought by The New York Times, and is forced to delete training data as a result.

Shutterstock

In July 2023 Shutterstock expanded its partnership with OpenAI with a six-year agreement allowing access to a wealth of training data including images, videos, music and associated metadata.

For its part, Shutterstock gets “priority access” to new OpenAI technology and can offer DALL-E’s text-to-image capabilities directly within its platform.

The post Who’s suing AI and who’s signing: Publisher deals vs lawsuits with generative AI companies appeared first on Press Gazette.

Why Microsoft Copilot Daily launch is ‘moment of significance’ for news industry
https://pressgazette.co.uk/comment-analysis/why-microsoft-copilot-daily-launch-is-moment-of-significance-for-news-industry/ (16 October 2024)

Copilot Daily will give users an AI-generated summary of news and weather.

Image shows part of a phone screen which is on the website for Microsoft's Copilot. It states: "Your everyday AI companion." The phone is resting on a laptop keyboard.

At the start of the month, in a broader announcement about its latest AI features, Microsoft quietly unveiled Copilot Daily which “helps you kick off your morning with a summary of news and weather, all read in your favourite Copilot Voice”.

It goes on to explain how “Copilot Daily will only pull from authorised content sources… such as Reuters, Axel Springer, Hearst Magazines, USA Today Network and Financial Times“. Publishers are being paid – although on what terms we do not know – for content when it is used in Copilot Daily.

This low-key announcement represents a real moment of significance for the news industry. For some time it has been clear that generative AI can be used to create highly-personalised information services; the technology is really good at selecting and synthesising content from a large dataset, based on a set of parameters. But this is the first time these capabilities have been deployed in a news context by a major AI developer, with financials attached for the content creators.

This matters for three reasons. Firstly, services like this give rise to a new set of intermediation risks. Secondly, these risks bring to the fore licensing decisions for publishers which are strategically consequential in the era of AI disruption. Finally, it adds to the pressure on Google’s faltering relationship with publishers, particularly around AI Overviews. Let’s examine each of these in turn.

To date, AI intermediation has been manifest in the form of disruption to referrals from search. This has arisen from two mechanisms:

Firstly, Google incorporating AI Overviews into its search product, thus reducing or entirely eliminating the need for users to visit a ‘destination’ site.

Secondly, via consumers adopting AI tools in place of search. In a new survey The Information has found that more than three-quarters (77%) of its readers are using generative AI tools in place of search, and over a quarter report doing so in a majority of cases. Whilst this is not a representative sample, this tech-native audience is highly likely to be a leading indicator of future consumer behaviour. The underlying reason is simple: these tools are just better for certain information retrieval tasks.

[Read more: AI revolution for news publishers is only getting started]

As a result of these mechanisms publishers should expect a structurally-declining flow of referrals from Google. But with Copilot Daily, Microsoft is creating a product which, in some circumstances and for some users, will disrupt an entirely different kind of traffic: that which arrives directly.

Copilot Daily appears to be fairly rudimentary at the moment. But imagine how powerful it could be if it understood what was in your diary that day, in your inbox, on your to-do list, who your favourite columnists are, which media outlets you subscribe to and what specific news stories you’re interested in.

Instead of waking up and checking the FT or New York Times homepage or app, I can bark a command and Copilot Daily – or an AI-powered Apple News product – will give me a highly personalised briefing for my day.

Now clearly this isn’t black and white: even if all those factors fed into the selection of content, many users will still use their news apps and publisher homepages, as consumers have always seen value in the editor’s curated view of what is important. But unlike the effects of AI Overviews, or of using ChatGPT or Perplexity instead of Google – which are more likely to hit search referrals – these engagement losses will fall on direct traffic to publisher properties, the growth of which has been a key strategic priority and the size of which is perceived to be a crucial measure of resilience.

As a consequence of this development, publishers need to consider a new set of strategic trade-offs. It seems inevitable that, in future, a greater proportion of news publisher revenue will need to come from licensing content as an input to a user-facing service. But in the context of that inevitability – underpinned by the advent of this new technology and its utility in this setting – what does a good deal look like today? And how should any outlet balance the brand and reach upsides against the substitutional downsides?

Beneath these broad questions sit some granular and fiddly ones. For example, what content can be summarised – everything or just a subset? How long can those summaries be? Should access be provided in real time or should there be a delay? Should we push to insert a termination clause? Or category exclusivity? How do we want our brands to be represented? Crucially, what is a fair price?

It’s very hard to get these right now, but publishers should be thinking about them and playing forward the likely destination for this technology. Finding the right balance between a loyal audience monetised through engagement on owned-and-operated platforms and a peripheral audience monetised by licensing content to an intermediary service provider will be, in my view, the central strategic challenge of the AI era of news publishing.

Finally, this is bad news for Google. Regulators – and publishers themselves (particularly their legal teams) – will all be asking: if Microsoft is paying to summarise content, why shouldn’t Google be too? The only available answer is that, thanks to the monopoly position it holds in online search and the consequent imbalance in bargaining power, publishers cannot demand and secure payment.

Regulatory enforcement and judicial proceedings do not move quickly. But over the medium term it is looking increasingly hard for the Mountain View giant to maintain the position that it will not pay for the use of content to inform AI Overviews, and possibly even general search.

As we see AI deployed in these kinds of settings – and licensing markets emerge to facilitate them – more profound questions about the future of search follow: Will conventional, general search become focused on commercial queries (where the value exchange is clearer) and a new layer of services emerge, built on licensed content, as the main access point for news and broader informational queries?

Predicting the future is hard (and fraught with the capacity to deliver embarrassment), but this certainly feels like the direction of travel.

The post Why Microsoft Copilot Daily launch is ‘moment of significance’ for news industry appeared first on Press Gazette.

Missing links: Upmarket UK newsbrands deny click-throughs to story sources https://pressgazette.co.uk/publishers/digital-journalism/news-sites-linking/ Thu, 10 Oct 2024 10:12:45 +0000 https://pressgazette.co.uk/?p=232970 Screenshots of four articles (at, clockwise from top-left, The Times, Financial Times, Telegraph and BBC) which did not link through to sources of information at other news sites.

Most of the nine publishers assessed routinely failed to link to the work of peers.

The post Missing links: Upmarket UK newsbrands deny click-throughs to story sources appeared first on Press Gazette.


Upmarket UK newsbrands are far less likely to link through to the work of their colleagues at other publishers than tabloid news sites, new Press Gazette research suggests.

Press Gazette assessed recent output from nine leading UK news websites to establish how often they include a hyperlink when repeating information sourced from other publishers.

In the snapshot survey we found that the Mirror and The Sun were the most likely to link to other publishers, doing so in eight out of ten stories assessed at each site.

The Times, Financial Times and Telegraph, on the other hand, each linked to another news site in only one of the ten articles analysed, and appear to have taken an editorial policy decision not to link.

The Guardian and BBC, meanwhile, appeared to link through to their sources in slightly fewer than half of the articles assessed.

Mail Online linked to publisher sources in the majority of articles assessed, and the Express did so in half of them.

The overall picture is of an industry that routinely avoids linking to sources when lifting information from other sites.

Press Gazette searched each publisher for articles published in recent weeks that featured the word “reported” (i.e. “The New York Times reported…”) and selected from the results the first ten stories that carried information copied from a named third-party news outlet.

Because the survey only looked at articles that disclosed they were citing another news outlet, it does not capture the overall frequency with which publishers credit their sources: uncredited rewrites of a competitor’s story, for example, would not be picked up in the analysis.

Across all the publishers assessed, internal links to other parts of their own websites were common. Many of the publishers would also credit information to "local media" when describing something that had been reported overseas, without identifying or linking to the source.

The Mirror told Press Gazette that it is supportive of linking and that the two articles in which no external link had been inserted were the result of human error.

A spokesperson for The Sun, similarly, said: "The Sun has always been known for breaking great exclusives and we have long campaigned for publishers to receive recognition for their original journalism.

“Alongside expecting to receive this attribution we in turn make every attempt to ensure that we attribute other publications' good stories that we have picked up."

The BBC’s operating licence requires the corporation to link to relevant third parties in its online output, and in its most recent “Delivering our Mission and Public Purposes” report it said that, in a sample of 1,370 articles published across the BBC News and BBC Sport websites, 18% of its output had linked to another media organisation. The BBC declined to comment.

Mail Online declined to comment. The Guardian also declined to comment, but pointed Press Gazette to its editorial code, which instructs its journalists that material "obtained from another organisation should be acknowledged".

The Times, FT and Telegraph had not responded to a request for comment at time of publication.

What's best practice on linking to other news sites?

Gavin Allen, a digital journalism lecturer at Cardiff University’s School of Journalism and a former associate editor at Mirror.co.uk, said there can be a “double incentive” for news sites not to link to competitors: “On the one hand, you're saying ‘we didn't break the story, someone else did’ which may be bad for reputation.

“On the other hand, you're pointing readers away from your website,” which he said may lead them to click away.

Materially, Allen said traffic from backlinks is often “vanishingly small”. Instead, he said, the way un-linked re-reports “might start to cannibalise your traffic is if it’s attracting search away”.

He said: “It’s more a courtesy and an ethics thing as well, I think… If you’re doing stuff based on other people’s work then you should be crediting that work. That would be good practice.”

Search engine optimisation orthodoxy holds that Google gives better rankings to articles that link to relevant third-party websites.

The Association of Online Publishers offers the following guidance on this topic: "Fair attribution is vital to help publishers get credit for the time, money, and effort they put into sourcing, investigating, and producing original content.

"As well as helping direct users to the original source of a story, linking is vitally important for SEO. Google uses links from ‘prominent websites’ as a signal to determine ‘authoritativeness’ – a key factor in determining ranking."

The AOP invites publishers to sign up to the Link Attribution Protocol, a group of publishers who agree to follow best practice on linking and who share a single email point of contact for getting links added to stories.

Scroll down for the full linking results from each of the nine publishers
