April 27, 2021
Google's bots process JS pages differently than non-JS pages, handling them in three phases: crawling, rendering, and indexing.
The first phase, crawling, is about the discoverability of your content. It's a complicated process involving several subprocesses: seed sets, crawl queuing and scheduling, URL importance, and others.
To begin with, Google's bots queue pages for crawling and rendering. The bots use a parsing module to fetch pages, follow the links on them, and render them until the pages can be indexed. The module not only renders pages but also analyses the source code and extracts the URLs found in `<a href="…">` snippets.
The bots check the robots.txt file to see whether crawling is allowed. If a URL is marked as disallowed, the bots skip it. Therefore, it's critical to review your robots.txt file to avoid accidentally blocking important pages or resources.
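As an illustration, a robots.txt file might look like the following. The paths here are hypothetical examples, not recommendations for any particular site:

```
# Hypothetical robots.txt: /private/ is off-limits, while the JS and CSS
# needed for rendering remain crawlable
User-agent: Googlebot
Disallow: /private/
Allow: /assets/
```

A common mistake is disallowing the directories that hold scripts and stylesheets, which prevents Googlebot from rendering a page even though its HTML is crawlable.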
The process of displaying the content, templates, and other features of a site to the user is called rendering. There are two main approaches: server-side rendering (SSR) and client-side rendering (CSR).
As the name suggests, with server-side rendering the pages are populated on the server. Each time the site is accessed, the page is rendered on the server and sent to the browser.
In other words, when a user or bot accesses the site, they receive the content as ready-made HTML markup. This usually helps SEO, as Google doesn't have to render the JS separately to access the content. SSR is the traditional rendering method, though it can prove costly in terms of server load and bandwidth.
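To illustrate the difference, here is a simplified sketch of what a bot might receive under each approach (the markup is a hypothetical example):

```html
<!-- Server-side rendering: the content is already present in the response -->
<div id="app">
  <h1>Blue Running Shoes</h1>
  <p>In stock and ready to ship.</p>
</div>

<!-- Client-side rendering: the bot receives an empty shell until the JS runs -->
<div id="app"></div>
<script src="bundle.js"></script>
```

In the CSR case, the meaningful content only exists after the JavaScript bundle executes, which is why such pages depend on Google's rendering phase.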
Coming back to what happens after a page has been crawled, the bots identify the pages that need to be rendered and add them to the render queue unless the robots meta tag in the raw HTML code tells Googlebot not to index the page.
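For reference, the robots meta tag in question looks like this; if Googlebot finds it in the raw HTML, it skips rendering and indexing for that page:

```html
<meta name="robots" content="noindex">
```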
Once the WRS (Web Rendering Service) fetches the data from external APIs and databases, Google's Caffeine indexer can index the content. This phase involves analysing the URL, understanding the content on the pages and its relevance, and storing the discovered pages in the index.
All the on-page SEO rules that go into optimizing your pages to help them rank on search engines still apply. Optimize your title tags, meta descriptions, image alt attributes, and meta robots tags. Unique, descriptive titles and meta descriptions help users and search engines easily identify the content. Pay attention to search intent and the strategic placement of semantically related keywords.
Also, it's good to have an SEO-friendly URL structure. In some cases, websites use the History API's pushState to change the URL, which can confuse Google when it tries to identify the canonical one. Make sure you check your URLs for such issues.
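As a sketch of both sides of this issue, a page that changes its URL with pushState should also declare its canonical URL (the URLs below are hypothetical):

```html
<!-- Tell Google which URL is the preferred, canonical version of this page -->
<link rel="canonical" href="https://example.com/products/shoes">

<script>
  // The History API updates the address bar with a clean path
  // instead of a hash fragment
  history.pushState({}, '', '/products/shoes');
</script>
```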
If your content appears in the rendered DOM, chances are it is being parsed by Google. Checking the DOM will help you determine whether or not your pages' content is accessible to the search engine bots.
Bots skip rendering and JS execution if the initial robots meta tag contains noindex. Googlebot doesn't fire events on a page, so if content is added to the page with the help of JS, it should be done when the page loads. If the content is only added to the HTML after clicking a button, scrolling the page, and so on, it won't be indexed.
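The difference can be sketched as follows (a hypothetical example, where loadArticle is an assumed helper that fetches and injects content):

```html
<script>
  // Indexable: content is injected as soon as the page loads,
  // so Googlebot's renderer will see it
  window.addEventListener('load', () => loadArticle('#content'));

  // Not indexable: Googlebot doesn't click, so this content never
  // makes it into the rendered HTML that gets indexed
  document.querySelector('#load-more')
    .addEventListener('click', () => loadArticle('#content'));
</script>
```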
To avoid the issue of Google not being able to find JS content, some webmasters resort to cloaking: serving crawlers different content than users see. However, this method is a violation of Google's Webmaster Guidelines, and you could be penalized for it. Instead, work on identifying the key issues and making your JS content accessible to search engines.
At times, resources may be unintentionally blocked, barring Google from seeing the JS content. For instance, if your site has several subdomains that serve different purposes, each should have a separate robots.txt, because subdomains are treated as separate websites. In such a case, make sure that none of these robots.txt directives block search engines from accessing the resources needed for rendering.
Google's crawlers use HTTP status codes to identify issues when crawling a page. Therefore, you should use meaningful status codes to inform the bots when a page shouldn't be crawled or indexed. For instance, you could use a 301 HTTP status code to tell the bots that a page has moved to a new URL, allowing Google to update its index accordingly.
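As a sketch, a permanent redirect might be configured like this in an nginx server block (the paths and domain are hypothetical):

```
# Hypothetical nginx rule: tell crawlers the page has moved permanently
location = /old-page {
    return 301 https://example.com/new-page;
}
```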
What's more, image search is also a source of additional organic traffic. If your images are lazy-loaded incorrectly, search engines may not pick them up. While lazy loading is great for users, it needs to be implemented with care so that bots don't miss potentially critical content.
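One cautious approach is native lazy loading: because the `<img>` tag, with its src, is present in the initial HTML, crawlers can still discover the image (the file names here are hypothetical):

```html
<img src="/images/blue-shoes.jpg"
     alt="Blue running shoes"
     width="600" height="400"
     loading="lazy">
```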
Remember what John Mueller said about bad URLs at an SEO event?
“For us, if we see kind of the hash there, then that means the rest there is probably irrelevant. For the most part, we will drop that when we try to index the content…”
Yet many JS-based sites generate URLs containing a hash fragment. This can be disastrous for your SEO, so make sure your URLs are Google-friendly.
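For illustration (hypothetical URLs):

```
https://example.com/#/products/shoes   <- hash-based route; Google may drop everything after the #
https://example.com/products/shoes     <- clean path (e.g. via the History API); indexable
```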
Google requires proper `<a href>` links to find the URLs on your site. If links are only added to the DOM after a button click, the bots will fail to see them. Many webmasters overlook these points, and their SEO suffers as a result.
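The contrast can be sketched as follows (hypothetical markup):

```html
<!-- Crawlable: Googlebot extracts the URL from the href attribute -->
<a href="/products/shoes">Shoes</a>

<!-- Not reliably crawlable: there is no href for the parser to extract -->
<span onclick="window.location='/products/shoes'">Shoes</span>
```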
Take care to provide traditional href links so that your pages are reachable for the bots. Check your links with a website audit tool such as SEO profiler to improve your site's internal link structure.
If you have a JS-based website and cannot find your content on Google, it’s time to address the issues.
If you need advice, or want a solution for your IT needs, please contact ITStacks.
Here at ITStacks, we provide solutions that companies can use to automate internal processes, improve customer service, increase system performance, enable information security, increase sales and reduce operating, labor and infrastructure costs.
To make the most of the many advantages of our IT outsourcing specialty, ask our experts for your free audit today.
Select ITStacks as your IT outsourcing partner to reap the benefits of competitive prices, total transparency, expertise from our highly talented technical teams, modern tech infrastructure, a strong work ethic, and an Agile, growth-focused mindset that makes ITStacks one of the best development centers in the European region.