Googlebot Activity for E-commerce Websites (James Dooley Interviews Andrew Halliday)
Listen on your favourite platform
| Platform | Link |
|---|---|
| YouTube | Listen on YouTube → |
What Does “Googlebot Activity for E-commerce Websites (James Dooley Interviews Andrew Halliday)” Talk About?
This episode of the James Dooley Podcast features a focused conversation between host James Dooley and technical SEO specialist Andrew Halliday on the topic of Googlebot activity for ecommerce websites. The discussion opens by explaining why crawl behaviour is foundational to ecommerce success, particularly for sites that regularly update prices or stock levels. Andrew Halliday illustrates how misaligned crawl budget can cause real business damage, such as Google displaying outdated prices in the SERPs, leading to lower conversion rates when customers click through to find a higher price than advertised.
A central theme of the episode is the growing problem of ecommerce businesses transitioning into dropshipping and suddenly uploading massive product catalogues, sometimes jumping from 5,000 to over 120,000 pages overnight. Andrew explains how this overwhelms Googlebot, causing it to waste crawl budget on low-value or unsellable product pages instead of the core category and revenue pages that drive rankings. The conversation covers practical remedies including content pruning, phased product rollouts, and using server log analysis to understand which URLs Google is actually prioritising. Andrew also details how crawl budget can be actively increased through fixing technical issues like unnecessary 301 redirects and 500 errors, building high-quality authority links, sending social signals, and even running Google Ads.
The episode closes with advice on site migrations, specifically moving from WooCommerce to Shopify, where Andrew stresses the critical importance of monitoring server logs hourly in the days immediately following a launch to catch missed 301 redirects before they cause lasting damage. James Dooley ties the discussion together by framing crawl activity as the true foundation of the SEO funnel, the step that must come before impressions, clicks, and sales can follow.
“My biggest ranking factor in my eyes is, is the page being crawled. You can have the world's best content. Shakespeare could have read your content, for example. You can have the most amazing links pointing to it. But if Google's not crawled it, it's not going to rank.”
— Andrew Halliday
Who Are the Guests on “Googlebot Activity for E-commerce Websites (James Dooley Interviews Andrew Halliday)”?
Andrew Halliday is a seasoned technical SEO specialist with deep expertise in server log analysis and crawl budget optimisation. He works with ecommerce businesses through his personal site andrew-hal.com and his technical SEO service onpage.rocks. Andrew has spent years analysing Googlebot behaviour across large ecommerce sites and has even built custom log analysis tools beyond the capabilities of standard off-the-shelf software like Screaming Frog and Jet Octopus, allowing him to identify edge cases and outliers that automated tools frequently miss.
James Dooley is the host of the James Dooley Podcast and an experienced SEO professional. He brings a practical, business-owner perspective to technical discussions, consistently steering the conversation toward actionable takeaways. Throughout the episode he demonstrates a solid working knowledge of concepts like PageRank sculpting, internal link architecture, and click depth, making him an effective interviewer who challenges Andrew to make complex technical topics accessible to a broad ecommerce audience.
What Are the Key Takeaways From “Googlebot Activity for E-commerce Websites (James Dooley Interviews Andrew Halliday)”?
Here are the key points discussed in this episode:
- Crawl budget is a foundational SEO factor for ecommerce sites because if Google is not crawling a page it will not rank, regardless of content quality or backlink profile.
- Suddenly uploading large product catalogues, such as jumping from 5,000 to 120,000 pages at once, confuses Googlebot and causes it to waste crawl budget on low-value URLs instead of core category and revenue pages.
- Content pruning and phased product rollouts are the recommended strategy for recovering crawl efficiency, with Andrew suggesting a category-by-category approach rather than removing everything at once.
- Crawl budget can be actively increased by fixing technical foundations such as unnecessary 301 redirects and 500 server errors, building authority links, generating social signals, and running Google Ads campaigns.
- Server log analysis is the most reliable way to understand which URLs Google is actually prioritising, and monitoring logs should be near-hourly in the days immediately after a major site migration to catch missed redirects before they cause lasting ranking damage.
“Every time Googlebot sees a 500 error, it worries that it's the reason it's caused your servers to go down and it doesn't want to be the cause of, in essence, a DoS attack. So it reduces your crawl budget.”
— Andrew Halliday
Is “Googlebot Activity for E-commerce Websites (James Dooley Interviews Andrew Halliday)” Worth Listening To?
This episode is worth listening to for any ecommerce site owner or SEO professional who has ever wondered why rankings dropped after a large product upload or site migration. Andrew Halliday does not deal in vague theory. He walks through a concrete scenario, a business moving from 5,000 to 120,000 pages overnight due to dropshipping supplier CSV feeds, and explains exactly why this tanks rankings and what a phased recovery plan looks like in practice. The discussion of how 301 redirects, 500 errors, orphan pages from legacy URLs like old fax machine product pages, and even Google Ads spend all directly affect crawl budget gives listeners a surprisingly complete technical framework in a short amount of time.
What makes this episode particularly valuable is the honest acknowledgement of where tools end and expertise begins. Andrew concedes that platforms like Jet Octopus and Screaming Frog Log Analyser cover roughly 70 percent of issues, but it is the outliers, the edge cases and rabbit holes that Googlebot follows which no tool flags automatically, that separate a professional audit from a DIY one. For ecommerce businesses preparing for a platform migration, the specific advice around monitoring logs hourly post-launch to catch missed Shopify URL structure redirects is alone worth the listening time.
Who Should Listen to “Googlebot Activity for E-commerce Websites (James Dooley Interviews Andrew Halliday)”?
This episode is ideal for:
- Ecommerce site owners who have experienced unexplained ranking drops after a large product catalogue upload or site restructure
- Technical SEO professionals and consultants who want to sharpen their approach to server log analysis and crawl budget optimisation
- Digital marketing managers responsible for ecommerce platforms who need to understand the relationship between crawl activity, indexation, and revenue
- Developers and project managers overseeing ecommerce platform migrations, particularly moves from WooCommerce to Shopify, who need to understand post-launch monitoring priorities
Where Can You Listen to the James Dooley Podcast?
You can listen to the James Dooley Podcast on all major podcast platforms:
- Apple Podcasts – Search for “James Dooley Podcast” in the Podcasts app
- Spotify – Available on Spotify for free
- Amazon Music / Audible – Listen through your Amazon account
- Overcast – For iOS users who prefer a dedicated podcast app
- Pocket Casts – Cross-platform podcast player
You can also subscribe using the RSS feed: https://feeds.transistor.fm/james-dooley-podcast
What Are Listeners Saying About This Episode?
“The section on dropshipping businesses dumping 100,000 products onto a site overnight was an eye-opener. I had exactly this problem and never connected it to crawl budget until now. Andrew explains it in a way that makes you want to audit your server logs immediately.”
“Really appreciated the honest take on tools like Screaming Frog and Jet Octopus. Knowing they cover 70 percent but miss the edge cases helps me understand when I actually need to bring in a specialist rather than just running another crawl. Practical and no fluff.”
“The migration advice at the end was exactly what I needed. Moving from WooCommerce to Shopify next quarter and I had not thought about monitoring logs hourly after launch to catch missed 301s. Short episode but genuinely packed with useful, specific guidance.”

James Dooley: Hi, today I'm joined with Andrew Halliday and we're going to be talking about Googlebot activity, specifically on ecommerce websites. So anyone who's got an ecommerce website online, why is Googlebot activity and why is tracking that Googlebot activity very important for ecommerce websites?
Andrew Halliday: Yes. So, it's important for a number of reasons. First of all, if you are changing prices regularly or products going in and out of stock, then you need Google to reflect that. Especially if you've got the structured data set up right and it's appearing in the SERPs. If your prices are saying one thing in Google and then they're clicking through and the price has gone up, it does lower conversion rate. So the only way you can do that is by having Google optimised crawl budgets so Google are hitting your pages every single time. But the other issue I've come across recently is, as more and more suppliers are getting more advanced, they're sending ecommerce businesses CSV daily files of new products. So historically, an ecommerce business would have to go and buy all these products and stock them in their own warehouse. Whereas now more and more are moving into the dropshipping realm, and this is more and more of the established business moving into the dropshipping realm. They're suddenly trying to add 100,000 products onto their sites which might historically only have 5,000. So they're losing their rankings because Googlebot, instead of crawling these 5,000 pages regularly and understanding the category pages when changes are made and understanding the blog and the content, they're now trying to crawl a page that you might not sell this product of. It might be very small. So they're wasting all the time trying to crawl these extra pages, instead of going through a smart way of adding maybe 10,000 one week, getting Googlebot to crawl them, increasing the crawl budget by monitoring it, then add another 10,000 or adding 20,000. They're just choking 100,000 products on plus an extra couple of thousand category pages and they might write a couple of thousand blog articles to go over this. So they've gone from having a 5,000 page site to maybe a 120,000 page site and wondering why everything's getting tanked because Googlebot's getting confused. It's not able to crawl what it likes to crawl.
James Dooley: Yeah. And with regards to when Googlebot is getting confused and it is then not crawling the main 5,000 pages and it's going off to different products and stuff like that. Would you then, if you was to take over, if someone was to come to you and go, Andrew Halliday, you are the expert at technical SEO. Can you run a server log analysis and check to see the Googlebot activity? Would you then be looking to remove some of them products and do like a content pruning exercise and then potentially, as the site started to grow in time, re-add them back in? Is that something that you would want to do if you was to take it over?
Andrew Halliday: One hundred per cent, yeah. Go back to not necessarily straight back to the 5,000 they had originally, but go through and go maybe category at a time. So let's say you're selling tech. Obviously we're on a computer now and you've added 100,000 products. I might actually go, no, I only want to focus on the chargers. Well, let's add the chargers. That's 5,000. So I would be taking 90,000 products off. You do have one advantage in ecommerce in that you have the Google Shopping feed and Google wants to crawl that. So every time you do add a product into there, you do see an initial spike in Googlebot activity. But that suddenly goes away if you don't have the authority to keep that level up there. So yeah, the first thing would be to actually just go through pruning and be quite ruthless to get it back crawling where I want it to crawl. And then a phased plan. It depends on the size of the business, what budgets they've got, to try and get back up to that ultimate level of 100,000 products, just over time rather than in one go.
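Editor's note: the phased plan Andrew describes (release a batch, let Googlebot crawl it, then release the next) could be sketched roughly as below. This is only an illustration. The function names, the 80 per cent threshold, and the sample URLs are all assumptions made for the example; the episode does not give a specific rule for when the next batch should go live.

```python
def plan_batches(product_paths, batch_size):
    """Split the full catalogue into fixed-size release batches."""
    return [
        product_paths[i:i + batch_size]
        for i in range(0, len(product_paths), batch_size)
    ]


def ready_for_next_batch(released_paths, crawled_paths, min_share=0.8):
    """Return True once at least min_share of the released URLs were crawled.

    min_share is a hypothetical threshold chosen for this sketch.
    """
    released = set(released_paths)
    if not released:
        return True
    crawled = released & set(crawled_paths)
    return len(crawled) / len(released) >= min_share


# Hypothetical catalogue of 25 products, released 10 at a time.
catalogue = [f"/products/item-{n}" for n in range(25)]
batches = plan_batches(catalogue, batch_size=10)
batch_sizes = [len(batch) for batch in batches]

# Pretend the server logs show Googlebot has fetched 8 of the first 10 URLs.
crawled_so_far = set(batches[0][:8])
can_release = ready_for_next_batch(batches[0], crawled_so_far)

print("Batch sizes:", batch_sizes)
print("Release next batch?", can_release)
```

In practice the set of crawled URLs would come from the server logs rather than being hard-coded, and the threshold would be tuned to the size of the site.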
James Dooley: Is there any way of tracking the Googlebot activity on ecommerce websites?
Andrew Halliday: Yeah. Great question. Main way is just downloading your server logs, importing into either, there's multiple tools online, whether that's Jet Octopus, Screaming Frog Log Analyser, and others. I have custom built tools because I've been doing it that long. But yeah, there's multiple ways of tracking it. Fundamentally, you've got to download your server log file, which you can get from your server, and then you can just track the trend.
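Editor's note: for readers who want to see what "download the log file and track the trend" can look like without a dedicated tool, here is a minimal sketch. It assumes the server writes the Combined Log Format, and it identifies Googlebot by user-agent string only. User agents can be spoofed, so a real audit would also verify the requesting IPs (Google documents a reverse DNS check for this). The sample log lines are invented for the example.

```python
import re
from collections import Counter
from datetime import datetime

# Combined Log Format is assumed; adjust the pattern if your server differs.
LOG_RE = re.compile(
    r'\S+ \S+ \S+ \[(?P<ts>[^\]]+)\] '
    r'"\S+ (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) \S+ "[^"]*" "(?P<ua>[^"]*)"'
)

GOOGLEBOT = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
BROWSER = "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"

# Invented sample lines standing in for a real access log.
SAMPLE_LOG = [
    f'66.249.66.1 - - [10/Mar/2025:09:15:02 +0000] "GET /chargers/ HTTP/1.1" 200 5120 "-" "{GOOGLEBOT}"',
    f'66.249.66.1 - - [10/Mar/2025:09:16:40 +0000] "GET /chargers/usb-c-65w HTTP/1.1" 200 4210 "-" "{GOOGLEBOT}"',
    f'203.0.113.7 - - [10/Mar/2025:09:17:11 +0000] "GET /chargers/ HTTP/1.1" 200 5120 "-" "{BROWSER}"',
    f'66.249.66.1 - - [11/Mar/2025:14:02:55 +0000] "GET /blog/crawl-budget HTTP/1.1" 200 8800 "-" "{GOOGLEBOT}"',
]


def googlebot_hits_per_day(lines):
    """Count requests per calendar day whose user agent mentions Googlebot."""
    counts = Counter()
    for line in lines:
        match = LOG_RE.match(line)
        if not match or "Googlebot" not in match.group("ua"):
            continue
        ts = datetime.strptime(match.group("ts"), "%d/%b/%Y:%H:%M:%S %z")
        counts[ts.date().isoformat()] += 1
    return dict(sorted(counts.items()))


daily = googlebot_hits_per_day(SAMPLE_LOG)
for day, hits in daily.items():
    print(day, hits)
```

To run this against a real file, pass an open file handle instead of `SAMPLE_LOG`, since the function only needs an iterable of lines.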
James Dooley: So obviously I'm going to give you now 30 seconds where you can pitch yourself. Why should I come to you as a professional service and hire Andrew Halliday to go and do my Googlebot activity analysis, as opposed to downloading them and putting them into Screaming Frog or Jet Octopus and getting the information from there? Obviously, I don't know how to do it is probably the number one reason, and not knowing what that data means. But 30 seconds, why you over using a tool like Jet Octopus or Screaming Frog?
Andrew Halliday: There's nothing wrong with them tools. Let's clarify that. Them tools are only going to give you 70 per cent of the main issues. What the difference here is that I'm used to looking, especially for ecommerce sites, on the outliers. If things are going wrong, if Googlebot's going down a rabbit hole, the stuff that you manually need to go in to find, the stuff you can't find in the tool because they're one offs, the edge cases. They're not something worth building into the tool because it doesn't happen that often.
James Dooley: Yeah. So obviously when you're seeing ecommerce websites every single day, you're seeing the common problems, the common crawl problems. You're seeing specific areas that might be being crawled that now, by deleting or saving those as a draft, then could increase not just increase crawl budget, you might have the same crawl budget, but be able to recrawl specific important pages. But this leads me on to my next question. If I was really hungry and I'm like, Andrew, I want my 20,000 pages, right? I want to do it as soon as possible. And you recommend that I shouldn't do that, but I'm like, nope, I'm going to do it. Is there a way of me being able to increase my crawl budget? And if so, how do I increase my crawl budget on my ecommerce website so I'm getting more Googlebot activity?
Andrew Halliday: That is usually the response most business owners give. They want to go straight for the 100,000, but I can usually get them down to a more reasonable number. But yes, there are ways of doing it. First of all is to fix any technical issues you've got on your site. So let's say your crawl budget is 30,000 at the moment and you've got loads of 301s on because you can't be bothered to go in there and change them, well every one of them counts as a wasted crawl budget. So first of all, I go through and fix all the fundamentals, the foundations. The next thing I'll be doing is checking the server is set up correctly, that it's not going down. Because every time Googlebot sees a 500 error, it worries that it's the reason it's caused your servers to go down and it doesn't want to be the cause of, in essence, a DoS attack. So it reduces your crawl budget. So I'd fix the foundations first. Then once that's in place, and they're assuming they're all okay, it's then going out and building authority links, using services out there to get decent high quality links back, appearing on something like a Google News site to trigger the Google News bot to come, which in Google's eyes is more authoritative than just the standard bot. I would then be sending social signals, whether that's through paid ads or just general if they've got a decent following, trying to send other third party signals. And something that obviously Google likes, paid adverts. Driving paid adverts increases your crawl budget because if you're spending the money with Google, they want to check that where they're sending to is right. So they spend more on their crawl budget to you.
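Editor's note: one way to see how much Googlebot activity is landing on the 301s and 500 errors Andrew mentions is to break its requests down by status code. The sketch below assumes the Combined Log Format and matches Googlebot by user-agent string alone; the log lines are invented sample data.

```python
import re
from collections import Counter

# Combined Log Format is assumed; adjust the pattern if your server differs.
LOG_RE = re.compile(
    r'\S+ \S+ \S+ \[(?P<ts>[^\]]+)\] '
    r'"\S+ (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) \S+ "[^"]*" "(?P<ua>[^"]*)"'
)

GOOGLEBOT = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
BROWSER = "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"

# Invented sample lines standing in for a real access log.
SAMPLE_LOG = [
    f'66.249.66.1 - - [10/Mar/2025:09:15:02 +0000] "GET /chargers/ HTTP/1.1" 200 5120 "-" "{GOOGLEBOT}"',
    f'66.249.66.1 - - [10/Mar/2025:09:15:09 +0000] "GET /old-chargers HTTP/1.1" 301 0 "-" "{GOOGLEBOT}"',
    f'66.249.66.1 - - [10/Mar/2025:09:15:20 +0000] "GET /old-cables HTTP/1.1" 301 0 "-" "{GOOGLEBOT}"',
    f'66.249.66.1 - - [10/Mar/2025:09:15:31 +0000] "GET /cables/ HTTP/1.1" 500 0 "-" "{GOOGLEBOT}"',
    f'203.0.113.7 - - [10/Mar/2025:09:16:00 +0000] "GET /cables/ HTTP/1.1" 500 0 "-" "{BROWSER}"',
]


def googlebot_status_breakdown(lines):
    """Map each status code to (hit count, percentage of Googlebot requests)."""
    counts = Counter()
    for line in lines:
        match = LOG_RE.match(line)
        if match and "Googlebot" in match.group("ua"):
            counts[match.group("status")] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {
        status: (hits, round(100 * hits / total, 1))
        for status, hits in counts.most_common()
    }


breakdown = googlebot_status_breakdown(SAMPLE_LOG)
for status, (hits, share) in breakdown.items():
    print(f"{status}: {hits} hits ({share}% of Googlebot requests)")
```

A high share of 3xx or 5xx responses in this breakdown is the kind of foundation issue Andrew says he would fix before trying to grow crawl activity.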
James Dooley: So the more you spend with Google Ads, the more, especially initially.
Andrew Halliday: Yeah.
James Dooley: Yeah. So with regards to, at present we're looking at specific mapping out of our site. We're looking at click depth, the amount of unique internal links that's going through to specific pages, and like a PageRank simulator to go, okay, majority of the PageRank is going through to this page. But realistically, is there a way for me to be able to send you my full list of URLs and you can tell me in the last three months how many times Googlebot has visited those specific pages? Because even though my PageRank simulator might say this is my most powerful page, you might have some Googlebot activity and it could be a random page, a random URL that could be ranking, that might not have as much internal link juice, but because the volume of, let's say, social media mentions and links through to that externally, or traffic because of the rankings of it, couldn't that be my most valuable page from Googlebot activity? And I could then strategically try to do PageRank sculpting, so to speak, internal link it through to my biggest most important pages. Is there a way of doing that where I download my URLs or send you through my full site and you can order most Googlebot activity to least Googlebot activity at a page level?
Andrew Halliday: Yes. So in the log file, without getting too technical, it does give you the URL path. And basically, one of the first steps in the audit is checking, is every page being crawled. You want it crawled at least once every 90 days, ideally once every 30 days. But yeah, the first thing would be to do a list of all your URLs. I'd get that from several places. One, crawling your site. Two, downloading your sitemap. And then looking at, because some pages you might be getting hits from Googlebot that is a legacy page and it could be causing a 404 because you had a page, let's say your ecommerce site is 20 years old. Twenty years ago, it's a tech site, you were selling fax machines. Well, nobody's really selling fax machines nowadays. So that page is dead for obvious reasons, but you forgot to put a 301 in. So Google's still hitting that page because there's loads of external links to it. So it's still using your crawl budget, but it's 404ing, but you're not picking that up in your Screaming Frog crawl or your manual crawling of the site because there's no internal linking. So that would be one of the first things. Checking how many they're hitting, where they're hitting, the last time they hit it, and where they're hitting that they shouldn't be hitting.
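Editor's note: the audit steps Andrew outlines here (rank pages by Googlebot activity, check every known URL has been crawled recently, and find legacy URLs that still attract Googlebot but return a 404) might be sketched as follows. The function name, the sample URLs, and the log lines are assumptions for illustration. The 90-day window comes from the conversation, and the known URL list stands in for the combined output of a site crawl and the sitemap.

```python
import re
from collections import Counter
from datetime import datetime, timedelta, timezone

# Combined Log Format is assumed; adjust the pattern if your server differs.
LOG_RE = re.compile(
    r'\S+ \S+ \S+ \[(?P<ts>[^\]]+)\] '
    r'"\S+ (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) \S+ "[^"]*" "(?P<ua>[^"]*)"'
)

GOOGLEBOT = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"

# Invented sample data: URLs from a crawl plus sitemap, and a few log lines.
KNOWN_PATHS = {"/chargers/", "/chargers/usb-c-65w", "/blog/crawl-budget"}
SAMPLE_LOG = [
    f'66.249.66.1 - - [10/Jan/2025:08:00:00 +0000] "GET /chargers/usb-c-65w HTTP/1.1" 200 4210 "-" "{GOOGLEBOT}"',
    f'66.249.66.1 - - [19/May/2025:08:00:00 +0000] "GET /chargers/ HTTP/1.1" 200 5120 "-" "{GOOGLEBOT}"',
    f'66.249.66.1 - - [20/May/2025:08:00:00 +0000] "GET /chargers/ HTTP/1.1" 200 5120 "-" "{GOOGLEBOT}"',
    f'66.249.66.1 - - [20/May/2025:08:05:00 +0000] "GET /fax-machines/model-x HTTP/1.1" 404 0 "-" "{GOOGLEBOT}"',
    f'66.249.66.1 - - [21/May/2025:08:05:00 +0000] "GET /fax-machines/model-x HTTP/1.1" 404 0 "-" "{GOOGLEBOT}"',
]


def audit_crawl_coverage(lines, known_paths, as_of, max_age_days=90):
    """Return (pages ranked by Googlebot hits, stale known pages, unknown 404s)."""
    hits = Counter()
    last_crawl = {}
    unknown_404s = Counter()
    for line in lines:
        match = LOG_RE.match(line)
        if not match or "Googlebot" not in match.group("ua"):
            continue
        path = match.group("path")
        ts = datetime.strptime(match.group("ts"), "%d/%b/%Y:%H:%M:%S %z")
        if path in known_paths:
            hits[path] += 1
            last_crawl[path] = max(ts, last_crawl.get(path, ts))
        elif match.group("status") == "404":
            # Googlebot is requesting a URL the crawl and sitemap do not know.
            unknown_404s[path] += 1
    cutoff = as_of - timedelta(days=max_age_days)
    stale = sorted(
        p for p in known_paths
        if p not in last_crawl or last_crawl[p] < cutoff
    )
    return hits.most_common(), stale, dict(unknown_404s)


ranked, stale, legacy_404s = audit_crawl_coverage(
    SAMPLE_LOG, KNOWN_PATHS, as_of=datetime(2025, 6, 1, tzinfo=timezone.utc)
)
print("Most crawled:", ranked)
print("Not crawled in 90 days:", stale)
print("Legacy 404s still being hit:", legacy_404s)
```

The legacy 404 list corresponds to the fax machine example: pages with no internal links that a site crawl would miss but that still show up in the logs.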
James Dooley: Yeah. And then with regards to if I've got an ecommerce site, let's say I'm on WordPress and I've got WooCommerce and I really want to move over to Shopify and I want to migrate my website over. How important is the month before and as I'm looking to do the migration and the couple of months after, how important then is tracking that Googlebot activity? Is that probably, I'm presuming, that would be one of the most important times to make certain that things aren't being crawled, URLs aren't being crawled that aren't strategically being 301ed properly and stuff like that.
Andrew Halliday: Yeah, the month before and definitely the days and the weeks after the launch is crucial. You're in the log files, almost, I would say hourly, but at least daily so you can pick up any errors. As you know with every big migration, someone's forgot to put a 301 in here. Or especially if you're moving to Shopify and you can't change the blog URL links or the category links, you might have to put 301s in. If someone's missed something. So yeah, straight after the migration would be hourly, but then daily pretty soon after and then constant checking.
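Editor's note: the hourly post-migration check Andrew recommends could be approximated by grouping Googlebot's 404 responses by hour, so that old URLs missing a 301 stand out quickly. This is a sketch under assumptions: Combined Log Format, user-agent matching only, and invented log lines that use WooCommerce-style old paths and Shopify-style new paths purely as examples.

```python
import re
from collections import Counter, defaultdict
from datetime import datetime

# Combined Log Format is assumed; adjust the pattern if your server differs.
LOG_RE = re.compile(
    r'\S+ \S+ \S+ \[(?P<ts>[^\]]+)\] '
    r'"\S+ (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) \S+ "[^"]*" "(?P<ua>[^"]*)"'
)

GOOGLEBOT = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"

# Invented sample lines from the morning after a hypothetical launch.
SAMPLE_LOG = [
    f'66.249.66.1 - - [02/Jun/2025:09:05:00 +0000] "GET /product-category/chargers/ HTTP/1.1" 404 0 "-" "{GOOGLEBOT}"',
    f'66.249.66.1 - - [02/Jun/2025:09:10:00 +0000] "GET /collections/chargers HTTP/1.1" 200 5120 "-" "{GOOGLEBOT}"',
    f'66.249.66.1 - - [02/Jun/2025:09:40:00 +0000] "GET /product-category/chargers/ HTTP/1.1" 404 0 "-" "{GOOGLEBOT}"',
    f'66.249.66.1 - - [02/Jun/2025:10:15:00 +0000] "GET /product/usb-c-65w/ HTTP/1.1" 404 0 "-" "{GOOGLEBOT}"',
]


def googlebot_404s_by_hour(lines):
    """Group Googlebot 404 responses by hour, counting hits per path."""
    buckets = defaultdict(Counter)
    for line in lines:
        match = LOG_RE.match(line)
        if not match or "Googlebot" not in match.group("ua"):
            continue
        if match.group("status") != "404":
            continue
        ts = datetime.strptime(match.group("ts"), "%d/%b/%Y:%H:%M:%S %z")
        buckets[ts.strftime("%Y-%m-%d %H:00")][match.group("path")] += 1
    return {hour: dict(paths) for hour, paths in sorted(buckets.items())}


hourly_404s = googlebot_404s_by_hour(SAMPLE_LOG)
for hour, paths in hourly_404s.items():
    print(hour, paths)
```

Each path that appears here is a candidate for a missing redirect; whether it actually needs a 301 still has to be checked against the migration's redirect map.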
James Dooley: Yeah, I mean it's crazy for me with regards to Googlebot activity for ecommerce sites where not many people talk about crawl data, crawl budget, crawl stats, Googlebot activity. But they're the foundations initially that get your keywords to be ranking, which then gets you the impressions, which then leads onto the clicks, which then leads onto the sales. Everyone seems to obsess of how many sales have we had, how many clicks have we had. Realistic, if they started right at the foundations, it's the most important part, making certain you're tracking that Googlebot activity. Because if you can keep trying to slowly improve and increase that Googlebot activity, like you said, it's then going to lead on longer term to more impressions, more clicks, and more sales. So what would you say to people that don't really track the crawl stats, the crawl budget, and the Googlebot activity on an ecommerce website specifically?
Andrew Halliday: I say my biggest ranking factor in my eyes is, is the page being crawled. You can have the world's best content. Shakespeare could have read your content, for example. You can have the most amazing links pointing to it. But if Google's not crawled it, it's not going to rank. So if you don't understand what Google's crawling and not crawling, you got no chance of any sales. Sales are the backbone of any business, granted. But if it's not being crawled and indexed, then you're not going to get the sales. So sales are important. I'm not denying that. But without the rankings, without that first crawl by Googlebot, you've got no chance of any of the following steps happening. But obviously, then flipping that, if you have got poor content and Googlebot's crawling it, that's going to have a negative impact on you. So if you just go onto ChatGPT, ask it to write you a category description, leave all the em dashes in there and all the other telltale signs it's been written by ChatGPT or any other AI agent, then that's going to have a negative impact because Google's going to be like, maybe I don't trust this site. It's not super quality.
James Dooley: Yeah. So for me, the key takeaway here with regards to Googlebot activity on ecommerce websites is to reach out to Andrew Halliday to get a professional server log analysis done. If you are savvy enough and a technical SEO yourself, make certain you're downloading those server logs, loading it into Jet Octopus or Screaming Frog and keep checking that Googlebot activity. Make certain that you're internal linking from the most active pages with the most activity. Make certain that there's no orphan pages on there, not just from an internal linking point of view, but no Googlebot activity. That then tells you from a technical standpoint when content pruning might be needed. If all the pages are being crawled often and you've got other pages you want to add, you can slowly start building up with article velocity. But for anyone who wants to reach out to you, Andrew, what's the best way of getting hold of you? If they was to say, Andrew Halliday, I want to get you involved in a technical audit to run my Googlebot activity. What's the best way of getting hold of you?
Andrew Halliday: Either on my personal site, which is andrew-hal.com, or on my technical site, which is onpage.rocks.
James Dooley: Superb. It's been a pleasure on having you on and hopefully people have liked the episode about Googlebot activity on ecommerce websites.
Andrew Halliday: Thank you.
Creators & Guests
Host
James Dooley is a UK entrepreneur.