This article explains why Google Search Console may report "not indexed" pages for a website. It covers which index categories are common and which ones may need work to resolve.
Google Search Console: Understanding and Resolving "Not Indexed" Pages
When you monitor your website using Google Search Console, you may come across the "Not Indexed" pages report. This can be concerning, especially if your website has many important pages that aren't appearing in Google's search results. However, not all "not indexed" pages are problematic, and many are normal occurrences during the crawling and indexing process.
In this article, we’ll break down the various categories of "not indexed" pages in Google Search Console, highlight the common issues and non-issues, and explain how to resolve them. We'll also touch on how Google's crawl budget and indexing delays can impact this process, especially for larger websites like eCommerce stores.
Common Categories of "Not Indexed" Pages
-
Page with Redirect (3xx)
- What it Means: Google found a page that redirects to another page. This can happen due to 301 (permanent) or 302 (temporary) redirects.
- Top Issues: If these pages are redirecting to the wrong destination, it could mean there’s a problem with your site's internal linking structure or that you’ve forgotten to update old URLs.
- Quick Solution: Check if the redirects are set correctly in your .htaccess or web server configuration. Make sure you're not unintentionally redirecting important pages.
- More In-depth Solution: Use Google Search Console's URL Inspection tool to verify the destination and adjust your redirects accordingly. Audit your site's link structure to ensure no critical pages are being incorrectly redirected.
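A redirect audit like the one described above can be sketched in Python. This is a minimal illustration, not a definitive tool: the `follow_redirects` function and the `redirect_map` data are hypothetical, and the sketch assumes you have exported your redirect rules as a simple source-to-target mapping.

```python
def follow_redirects(start_url, redirects, max_hops=10):
    """Follow a URL through a redirect map and return every hop.

    `redirects` is a hypothetical dict of source URL -> target URL,
    for example exported from your server configuration.
    """
    chain = [start_url]
    seen = {start_url}
    current = start_url
    while current in redirects:
        current = redirects[current]
        if current in seen:
            raise ValueError(f"redirect loop detected at {current}")
        chain.append(current)
        seen.add(current)
        if len(chain) - 1 > max_hops:
            raise ValueError(f"more than {max_hops} redirects from {start_url}")
    return chain


# Example data, made up for illustration.
redirect_map = {
    "/old-shoes": "/shoes-2023",
    "/shoes-2023": "/shoes",
}
print(follow_redirects("/old-shoes", redirect_map))
```

A chain with more than one hop, like the one in the example, is a candidate for pointing the old URL straight at the final destination.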
-
Not Found (404)
- What it Means: This category indicates that Google requested a page that no longer exists, and the server returned a 404 error.
- Top Issues: 404 errors can arise from deleted or moved pages that don't have proper redirects.
- Quick Solution: Redirect any deleted or moved pages to relevant, live pages to ensure visitors (and search engines) aren't landing on dead ends.
- More In-depth Solution: Use a site crawl tool to identify all broken links across your website and set up 301 redirects for all deleted pages that should still have traffic value.
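Once you know which deleted pages should redirect, the rules themselves can be generated rather than typed by hand. The sketch below assumes an Apache server where `Redirect 301` lines (from `mod_alias`) go into `.htaccess`; the `htaccess_redirects` function and the example paths are hypothetical.

```python
def htaccess_redirects(moved_pages):
    """Build Apache `Redirect 301` lines from an old-path -> new-path mapping."""
    lines = []
    for old_path, new_path in sorted(moved_pages.items()):
        if old_path == new_path:
            continue  # a page must never redirect to itself
        lines.append(f"Redirect 301 {old_path} {new_path}")
    return lines


# Example mapping, made up for illustration.
moved = {"/old-sale": "/sale", "/about-us": "/about"}
for line in htaccess_redirects(moved):
    print(line)
```

Note that Apache's `Redirect` matches by path prefix, so review the generated lines before deploying them.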
-
Excluded by ‘noindex’ tag
- What it Means: The page includes a noindex tag, telling Google not to index it. This is a valid reason for non-indexing if you want to prevent a page from showing up in search results.
- Top Issues: These pages might be mistakenly marked as noindex, preventing them from being indexed.
- Quick Solution: Double-check whether the noindex tag is intentional. If the page should be indexed, remove the tag from the HTML code.
- More In-depth Solution: For bulk changes, you can use a CMS tool or script to identify pages with the noindex directive and remove it from the code.
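A script that identifies pages carrying a noindex directive might look like the following sketch. It uses only Python's standard library; the `has_noindex` helper is a hypothetical name, and the sketch assumes you already have each page's HTML and, optionally, the value of its `X-Robots-Tag` response header.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect directives from <meta name="robots"> and <meta name="googlebot"> tags."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {key: (value or "") for key, value in attrs}
        if attr.get("name", "").lower() in ("robots", "googlebot"):
            self.directives.extend(
                part.strip().lower() for part in attr.get("content", "").split(",")
            )


def has_noindex(html, x_robots_tag=""):
    """Return True if the HTML or the X-Robots-Tag header value contains noindex."""
    parser = RobotsMetaParser()
    parser.feed(html)
    header_directives = [part.strip().lower() for part in x_robots_tag.split(",")]
    return "noindex" in parser.directives or "noindex" in header_directives


page = '<html><head><meta name="robots" content="noindex, follow"></head></html>'
print(has_noindex(page))
```

This simple version does not handle header values that are scoped to a specific crawler (for example a `googlebot:` prefix), so treat it as a starting point.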
-
Server Error (5xx)
- What it Means: Google encountered a server error when trying to access the page, such as a 500 error.
- Top Issues: Server errors are often temporary, but if they persist, they can prevent pages from being crawled or indexed.
- Quick Solution: Monitor server logs to identify and resolve any issues affecting your site’s uptime.
- More In-depth Solution: Ensure your web hosting provider is reliable. Check your server configuration, especially for dynamic pages like those generated by databases.
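Monitoring server logs for errors served to Google can be partly automated. The sketch below assumes access logs in the common "combined" format and matches Googlebot by user-agent string only (which can be spoofed); the `googlebot_server_errors` function and the sample log lines are made up for illustration.

```python
import re
from collections import Counter

# Matches the request and status fields of a combined-format access log line.
LOG_LINE = re.compile(r'"(?:GET|HEAD|POST) (?P<path>\S+) [^"]*" (?P<status>\d{3}) ')


def googlebot_server_errors(log_lines):
    """Count 5xx responses per path for requests whose user agent mentions Googlebot."""
    errors = Counter()
    for line in log_lines:
        if "Googlebot" not in line:
            continue
        match = LOG_LINE.search(line)
        if match and match.group("status").startswith("5"):
            errors[match.group("path")] += 1
    return errors


BOT = '"Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"'
sample_log = [
    f'192.0.2.1 - - [10/Oct/2024:13:55:36 +0000] "GET /product/42 HTTP/1.1" 503 512 "-" {BOT}',
    f'192.0.2.1 - - [10/Oct/2024:13:56:02 +0000] "GET /product/42 HTTP/1.1" 500 512 "-" {BOT}',
    f'192.0.2.1 - - [10/Oct/2024:13:57:10 +0000] "GET /about HTTP/1.1" 200 2048 "-" {BOT}',
    '198.51.100.7 - - [10/Oct/2024:13:58:00 +0000] "GET /cart HTTP/1.1" 502 0 "-" "Mozilla/5.0"',
]
print(googlebot_server_errors(sample_log))
```

Paths that show up repeatedly in the result are the ones to investigate first.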
-
Soft 404
- What it Means: A "soft 404" is when a page returns a “200 OK” status (as if the page is valid) but displays a message like "Page not found" or "This page no longer exists."
- Top Issues: Soft 404 errors can confuse Google, as they may appear to be valid pages but do not provide meaningful content.
- Quick Solution: Ensure that deleted or redirected pages return a proper 404 or 301 status code.
- More In-depth Solution: Review your content management system (CMS) to make sure that missing pages show the correct error messages and status codes.
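A rough way to spot soft 404 candidates on your own site is to look for error wording on pages that return 200. This is only a heuristic sketch: the phrase list and the `looks_like_soft_404` function are assumptions for illustration and do not reflect how Google itself classifies pages.

```python
# Phrases that suggest an error page; extend this list for your own templates.
NOT_FOUND_PHRASES = ("page not found", "this page no longer exists")


def looks_like_soft_404(status_code, body_text):
    """Flag pages that return 200 OK but read like an error page."""
    if status_code != 200:
        return False  # a real 404 or 301 is not a *soft* 404
    text = body_text.lower()
    return any(phrase in text for phrase in NOT_FOUND_PHRASES)


print(looks_like_soft_404(200, "Sorry, Page Not Found."))
print(looks_like_soft_404(404, "Sorry, Page Not Found."))
```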
-
Alternate Page with Proper Canonical Tag
- What it Means: The page is considered an alternate version of a canonical page (i.e., the page is marked with a rel="canonical" tag pointing to another URL).
- Top Issues: This usually isn’t an issue, but it means Google has chosen not to index the page in favor of the canonical version.
- Quick Solution: If this page should be indexed, reconsider the canonical tag placement or remove it.
- More In-depth Solution: For eCommerce sites, this can happen with product variants. Check that canonical tags are properly configured for pages with duplicate content.
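To check how canonical tags are configured, you can extract the tag from a page and compare it with the page's own URL. The following is a minimal standard-library sketch; `canonical_status` and the example URLs are hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class CanonicalParser(HTMLParser):
    """Record the href of the first <link rel="canonical"> tag."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        rels = (attr.get("rel") or "").lower().split()
        if tag == "link" and "canonical" in rels and self.canonical is None:
            self.canonical = attr.get("href")


def canonical_status(page_url, html):
    """Describe whether a page is self-canonical or an alternate of another URL."""
    parser = CanonicalParser()
    parser.feed(html)
    if parser.canonical is None:
        return "no canonical tag"
    target = urljoin(page_url, parser.canonical)  # resolve relative hrefs
    if target == page_url:
        return "self-canonical"
    return f"alternate of {target}"


variant_html = '<head><link rel="canonical" href="/shirt"></head>'
print(canonical_status("https://example.com/shirt?color=red", variant_html))
```

A variant page reported as an alternate of the main product URL is the expected outcome; a key landing page reported as an alternate of something else is worth a closer look.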
-
Crawled - Currently Not Indexed
- What it Means: Google has crawled the page but hasn’t indexed it yet. Since the crawl has already happened, the usual explanation is that Google considers the page low-priority; it may or may not be indexed later.
- Top Issues: This often occurs with new or low-value pages.
- Quick Solution: Check if the page has valuable content and if internal links are pointing to it. Ensuring good internal linking can boost its importance.
- More In-depth Solution: Increase internal links to the page, improve its content quality, and avoid having duplicate content.
-
Discovered - Currently Not Indexed
- What it Means: Google knows about the page (it has been discovered), but it hasn’t crawled or indexed it yet.
- Top Issues: This is often a matter of timing: Google simply hasn't crawled the page yet, or it's waiting for available crawl budget.
- Quick Solution: Request indexing for the page using the “Request Indexing” option in Search Console's URL Inspection tool.
- More In-depth Solution: If this issue persists, look at your crawl budget. Improve your internal linking structure and site architecture to ensure Google prioritizes crawling high-value pages.
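One way to review internal linking is to count how many other pages link to each URL and flag the ones with few or no inbound links. The sketch below assumes you already have a link graph from a site crawl; `inbound_link_counts`, `weakly_linked`, and the sample graph are hypothetical.

```python
from collections import Counter


def inbound_link_counts(link_graph):
    """Count, for every page, how many *other* pages link to it."""
    counts = Counter({page: 0 for page in link_graph})
    for source, targets in link_graph.items():
        for target in set(targets):  # count each linking page once
            if target != source:
                counts[target] += 1
    return counts


def weakly_linked(link_graph, threshold=1):
    """Return pages with fewer than `threshold` inbound internal links."""
    counts = inbound_link_counts(link_graph)
    return sorted(page for page, n in counts.items() if n < threshold)


# page -> pages it links to (made-up example)
site = {
    "/": ["/shoes", "/about"],
    "/shoes": ["/", "/shoes/red"],
    "/about": ["/"],
    "/shoes/red": [],
    "/shoes/blue": [],
}
print(weakly_linked(site))
```

Pages returned here have no internal links pointing at them, which makes them harder for a crawler to find and prioritize.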
-
Duplicate
- What it Means: Google detected that the content on the page is a duplicate of another page. Google will usually prefer one version and ignore the rest.
- Top Issues: Duplicate content often occurs when pages are similar but not identical. For eCommerce sites, this could include product pages with slight variations in URL.
- Quick Solution: Use canonical tags to indicate the preferred version of the page or consolidate duplicate pages into one.
- More In-depth Solution: For large websites with many similar pages, such as product variants, ensure you have an effective internal linking strategy and use the rel="canonical" tag correctly.
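For URL-based variants, grouping URLs by what they look like once the variant parameters are removed can show which pages are likely to be treated as duplicates and which URL is a sensible canonical. This is a sketch under assumptions: the `VARIANT_PARAMS` set, the function names, and the example URLs are all hypothetical and should be adapted to your own URL scheme.

```python
from collections import defaultdict
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumption: these query parameters only select a product variant.
VARIANT_PARAMS = {"color", "size"}


def canonical_candidate(url):
    """Strip variant-only query parameters and any fragment from a URL."""
    parts = urlsplit(url)
    kept = [(k, v) for k, v in parse_qsl(parts.query) if k not in VARIANT_PARAMS]
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


def group_variants(urls):
    """Group URLs that share the same canonical candidate."""
    groups = defaultdict(list)
    for url in urls:
        groups[canonical_candidate(url)].append(url)
    return dict(groups)


urls = [
    "https://example.com/shirt?color=red",
    "https://example.com/shirt?color=blue&size=m",
    "https://example.com/hat",
]
for canonical, members in group_variants(urls).items():
    print(canonical, "<-", members)
```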
-
Google Chose Different Canonical Than User
- What it Means: Google has chosen a different canonical page than the one specified by the website owner. This may happen if Google thinks the page specified as canonical isn’t the best version of the content.
- Top Issues: This isn’t a major issue unless the wrong page is selected as the canonical, which could lead to missed opportunities in search.
- Quick Solution: Review the page's content and internal links to ensure that the canonical URL you specified makes sense.
- More In-depth Solution: If you believe Google has made the wrong choice, you may need to adjust the content, internal linking, and structure of the pages so that your preferred canonical is clearly the stronger version.
Common Non-Issues and Why They Occur
-
Pages with Low Value: If Google doesn't index a page, it might be because it's not adding much value in terms of content or user experience. Google may choose to ignore these pages in favor of higher-priority content.
-
Ecommerce Product Variants: In many cases, eCommerce sites with numerous product variations (e.g., size, color) may see pages marked as “alternate page with proper canonical tag.” This isn’t a problem if you're using proper canonical tags to tell Google which version of the page should be indexed.
-
Crawl Budget Limitations: Large websites may experience delays in indexing because Google limits how many pages it crawls on any given site in a given period. Google’s crawl budget is influenced by factors like site speed, internal linking, and page quality. If pages aren’t being indexed right away, this may just be due to Google prioritizing other pages first.
Conclusion
If Google Search Console shows "Not Indexed" pages, it’s important to assess each category carefully. Some issues are easy to fix, like redirect errors or incorrect noindex tags. Others, such as "Crawled - Currently Not Indexed" or "Discovered - Currently Not Indexed," may simply require patience as Google works through its crawl budget. Regularly monitoring your site and making adjustments as needed can help ensure your pages are being indexed appropriately, leading to better visibility in search results.