How Generative Engines Choose Citations? James Dooley Interviews Sergey Lucktinov
Listen on your favourite platform
| Platform | Link |
|---|---|
| YouTube | Listen on YouTube → |
What Does “How Generative Engines Choose Citations? James Dooley Interviews Sergey Lucktinov” Talk About?
This episode of the James Dooley Podcast features a focused technical conversation between host James Dooley and Sergey Lucktinov on how generative AI engines like ChatGPT and Gemini decide which sources to cite in their responses. Sergey walks through the multi-stage citation pipeline. It begins with query fan out, where a complex user question is normalised and expanded into related queries that are sent to a search engine. Next comes metadata filtering, where between forty and eighty percent of results are eliminated based on signals like the page URL, meta title, meta description, and spam scores. A light check of domain structure, content type, relevance, and layout stability follows.
The discussion then covers what happens in the final stage, where only three to ten websites remain and are fully parsed by an LLM, which selects roughly five to eight passages for the final response. James and Sergey also address practical optimisation strategies, including the risks of prompt injection and why it is likely to be penalised long term, the importance of macro semantic content structure with seed pages and node pages that mirror fan out queries, and how technical factors like time to first byte and layout stability can disqualify a page before deep evaluation even begins.
The episode also touches on the emerging trend of websites adding summarise buttons for AI tools like Perplexity and Claude, with Sergey clarifying that these are closer to light prompt engineering and have no influence on future citation processes. The conversation consistently returns to the idea that LLMs optimise for the most stable and satisfying answer at the lowest retrieval cost, making content clarity and structural alignment with how LLMs process information the most important long-term factors.
“LLMs want the most stable and satisfying answer at the lowest possible cost. If your content is good but difficult to extract, it will not be cited.”
— Sergey Lucktinov
Who Are the Guests on “How Generative Engines Choose Citations? James Dooley Interviews Sergey Lucktinov”?
Sergey Lucktinov is the featured guest and brings deep technical expertise in how generative AI systems retrieve and evaluate content for citation. He demonstrates a granular understanding of the internal mechanics behind tools like Gemini and ChatGPT, covering topics from query fan out and metadata filtering to LLM-level content parsing and semantic SEO strategy. His perspective bridges search engine optimisation and large language model behaviour, making him a knowledgeable voice for anyone trying to appear in AI-generated answers.
James Dooley is the host of the James Dooley Podcast and is known for his expertise in SEO and digital marketing. In this episode he plays the role of an informed interviewer, asking practical and pointed questions about prompt injection, summarise buttons on websites, and what actionable steps content creators can take to improve their chances of being cited by generative engines. His questions consistently push the conversation toward real-world application.
What Are the Key Takeaways From “How Generative Engines Choose Citations? James Dooley Interviews Sergey Lucktinov”?
Here are the key points discussed in this episode:
- Generative AI citation is a multi-stage process: fan out queries are sent to a search engine, and metadata filtering eliminates between forty and eighty percent of candidates before any deep LLM evaluation takes place.
- Only three to ten websites typically make it to the final stage, where an LLM fully parses each page and selects roughly five to eight passages for use in the response.
- Prompt injection may produce short-term visibility in AI answers but is likely to be penalised over time because it is not user-friendly from an LLM perspective.
- Macro semantic content structure, including well-organised seed pages and node pages that reflect fan out queries, mirrors how LLMs organise information internally and makes content easier and cheaper to process.
- Technical performance problems, such as a time to first byte above roughly three hundred milliseconds or a layout that shifts, can cause a page to be removed from consideration early in the citation pipeline.
“Cover only the topic you intend to cover and define it clearly at the beginning of the article so it is cheap and easy for an LLM to retrieve.”
— Sergey Lucktinov
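The staged funnel described in the takeaways above can be pictured with a small Python sketch. This is only an illustration under stated assumptions, not how any real engine is implemented: the `Candidate` fields, the `max_spam` threshold, the title and passage word-overlap scoring, and every function name are hypothetical stand-ins. Only the overall shape (fan out, metadata filter, light check, passage selection from at most ten sites and eight passages) comes from the episode.

```python
from dataclasses import dataclass, field


@dataclass
class Candidate:
    url: str
    meta_title: str
    spam_score: float      # hypothetical 0..1 signal; higher means less trusted
    layout_stable: bool    # hypothetical result of a light page check
    passages: list = field(default_factory=list)


def fan_out(question, expansions):
    """Normalise the question, then append hypothetical related queries."""
    base = " ".join(question.lower().split()).rstrip("?")
    return [base] + [f"{base} {extra}" for extra in expansions]


def terms_of(queries):
    """Collect the distinct words used across all fan out queries."""
    return {word for query in queries for word in query.split()}


def metadata_filter(candidates, terms, max_spam=0.5):
    """Stage 1: drop results on metadata alone, without reading the page."""
    return [
        c for c in candidates
        if c.spam_score <= max_spam
        and terms & set(c.meta_title.lower().split())
    ]


def light_check(candidates):
    """Stage 2: a cheap structural check; here only layout stability."""
    return [c for c in candidates if c.layout_stable]


def select_passages(candidates, terms, max_sites=10, max_passages=8):
    """Final stage: a toy stand-in for full LLM parsing, scored by word overlap."""
    scored = []
    for c in candidates[:max_sites]:
        for passage in c.passages:
            overlap = len(terms & set(passage.lower().split()))
            if overlap:  # ignore passages that share no words with the queries
                scored.append((overlap, c.url, passage))
    scored.sort(key=lambda item: item[0], reverse=True)
    return [(url, passage) for _, url, passage in scored[:max_passages]]


def choose_citations(question, expansions, candidates):
    """Run the three stages in order and return (url, passage) pairs."""
    terms = terms_of(fan_out(question, expansions))
    survivors = light_check(metadata_filter(candidates, terms))
    return select_passages(survivors, terms)
```

The point of the sketch is the ordering: the cheap checks run first on every candidate, and the expensive passage-level work runs only on the few pages that survive them.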
Is “How Generative Engines Choose Citations? James Dooley Interviews Sergey Lucktinov” Worth Listening To?
This episode is genuinely valuable for SEO professionals and content strategists who want to understand what is actually happening inside AI citation systems rather than relying on speculation or surface-level advice. Sergey Lucktinov explains the mechanics with unusual clarity, walking through each stage of the pipeline in a way that makes abstract AI behaviour feel concrete and actionable. The breakdown of how metadata filtering, light content checks, and final LLM parsing each play a distinct role gives listeners a framework they can apply directly to how they build and structure their websites.
What makes this episode stand out is its honesty about what does not work, including prompt injection and the summarise buttons appearing on many websites, alongside a clear explanation of why those approaches fall short. The emphasis on cost of information retrieval as the underlying logic connecting search engine optimisation and generative AI behaviour gives listeners a mental model that will remain useful as the technology continues to evolve. Anyone investing time in content creation or technical SEO in 2024 and beyond will find this conversation directly relevant to their work.
Who Should Listen to “How Generative Engines Choose Citations? James Dooley Interviews Sergey Lucktinov”?
This episode is ideal for:
- SEO professionals looking to adapt their strategies for generative AI citation visibility
- Content strategists and writers who want to understand how LLMs evaluate and select passages from web pages
- Web developers and site owners focused on technical performance factors like time to first byte and layout stability
- Digital marketers trying to understand the difference between effective long-term AI optimisation and risky short-term tactics like prompt injection
Where Can You Listen to the James Dooley Podcast?
You can listen to the James Dooley Podcast on all major podcast platforms:
- Apple Podcasts – Search for “James Dooley Podcast” in the Podcasts app
- Spotify – Available on Spotify for free
- Amazon Music / Audible – Listen through your Amazon account
- Overcast – For iOS users who prefer a dedicated podcast app
- Pocket Casts – Cross-platform podcast player
You can also subscribe using the RSS feed: https://feeds.transistor.fm/james-dooley-podcast
What Are Listeners Saying About This Episode?
“The breakdown of the multi-stage citation process was exactly what I needed. I had no idea that forty to eighty percent of results are filtered out just at the metadata stage before an LLM even reads the page. This completely changed how I think about optimising for AI answers.”
“Really appreciated Sergey's honest take on prompt injection and those summarise buttons showing up everywhere. It is refreshing to hear someone explain why those shortcuts do not actually influence future citations rather than just hyping up the latest trend.”
“The point about time to first byte being above three hundred milliseconds getting a page removed early in the process was a wake-up call. I immediately went and checked our site speed after listening. Short episode but packed with genuinely useful and specific information.”

James Dooley: Hi, today I’m joined with Sergey Lucktinov and today’s question is how generative engines choose citations.
Sergey Lucktinov: The way it works is a multi stage process. It starts with fan out queries. When someone asks an LLM a complex question, that question is normalised and expanded into related or implied queries. Sometimes there are only a few fan out queries and sometimes there are many, depending on how complex the original question is. Those fan out queries are then sent to a search engine. In the case of Gemini, that is Google. In the case of ChatGPT, that is Bing. In the first stage, the system analyses metadata from the search results. This includes the website name, the page URL, the meta title, the meta description, and internal signals such as spam scores and other trust data assigned by Google or Bing. The goal at this stage is to remove irrelevant sites, so typically between forty and eighty percent of results are filtered out. In the second stage, the remaining websites are lightly checked. The system looks at domain structure, content type, relevance, and content stability, including whether the layout shifts. If a site passes this stage, it reaches the final stage. In the final stage, usually between three and ten websites remain. These are fully parsed and evaluated by an LLM. The model checks meaning, stability, and trust, then selects roughly five to eight passages that are used in the final LLM response.
James Dooley: If someone wants to be cited by generative engines, is there anything they can do to optimise for that, or things like prompt injection to get into AI answers?
Sergey Lucktinov: You can use prompt injection, but I would not recommend it. It can work in the short term, but it is likely to be penalised later because it is not user friendly from an LLM perspective. If you want long term results, you need to follow a semantic SEO approach. That means building a high quality website from a macro semantic perspective. You need proper macro pages, seed pages, and node pages that reflect fan out queries. You also need strong micro semantics, which means clarity, concise writing, and focused explanations of the entities you are describing. Cover only the topic you intend to cover and define it clearly at the beginning of the article so it is cheap and easy for an LLM to retrieve.
James Dooley: Some websites now have buttons for ChatGPT, Perplexity, Claude, or Gemini that say summarise this page and ask the model to cite it. Does that help future citations, or is that a form of prompt injection?
Sergey Lucktinov: It is closer to light prompt engineering, but it does not really help. LLMs do not have memory in that way. You cannot tell them to remember something for future use. That process simply fetches the page through an API and summarises it for a single user. It has nothing to do with the fan out process when someone else asks a different question later.
James Dooley: So it is more about engagement. Is there anything else you would recommend to increase the chance of being cited by generative engines?
Sergey Lucktinov: It depends on the goal. If the goal is a product, listicles are often effective because you do not need to rank first and you can appear across multiple sites. In some cases you do not even need your own website. If the goal is a proprietary strategy or service, then optimising your own website is critical. The most important factor is macro semantic structure. These structures work well because they mirror how LLMs organise information internally. When your content matches that structure, it becomes easier and cheaper for the model to understand. LLMs want the most stable and satisfying answer at the lowest possible cost. If your content is good but difficult to extract, it will not be cited. Speed also matters. If time to first byte is above roughly three hundred milliseconds, the page may be removed early in the process.
James Dooley: It always seems to come back to the cost of information retrieval, whether for search engines or LLMs. Thanks a lot, Sergey Lucktinov.
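For readers who want to check the time to first byte figure mentioned in the conversation against their own pages, here is a minimal Python sketch using only the standard library. It is an approximation under stated assumptions: the function names and the `TTFB_BUDGET_MS` constant are hypothetical, the three hundred millisecond budget is the rough figure quoted in the episode rather than a documented threshold, and the timing includes DNS lookup and connection setup from wherever the script runs, so results will differ from what a crawler sees.

```python
import time
import urllib.request

# Approximate figure quoted in the episode; not a documented limit.
TTFB_BUDGET_MS = 300.0


def measure_ttfb_ms(url, timeout=10.0):
    """Return milliseconds from starting the request until the first body byte is read.

    The measurement includes DNS resolution and connection setup.
    """
    start = time.perf_counter()
    with urllib.request.urlopen(url, timeout=timeout) as response:
        response.read(1)  # block until at least one byte of the body arrives
    return (time.perf_counter() - start) * 1000.0


def within_budget(ttfb_ms, budget_ms=TTFB_BUDGET_MS):
    """Report whether a measured TTFB is at or below the chosen budget."""
    return ttfb_ms <= budget_ms
```

Calling `within_budget(measure_ttfb_ms(url))` for a page you control gives a quick pass or fail against the budget. Because a single request is noisy, it is worth repeating the measurement several times and looking at the typical value.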
Creators & Guests
Host
James Dooley is a UK entrepreneur.