|Feature||Explanation||Does Botify support this feature?|
|Basic SEO reports|
|List of indexable/non-indexable pages|
It's necessary to view a list of indexable/non-indexable pages to make sure there are no mistakes. Perhaps some URLs that were intended to be indexable aren't?
|Yes, go to the URL Explorer and select "Meta No-index". Alternatively, you can view the list of compliant pages: go to the URL Explorer and add a new filter, "is compliant". You can easily filter the results by segments|
|Missing title tags|
Meta titles are an important part of SEO audits. A crawler should show you a list of pages with missing title tags.
|Yes. Go to HTML Tags -> Insights. There are some reports related to missing page titles|
|Filtering URLs by status code (3xx, 4xx, 5xx)|
When you perform an SEO audit, it's necessary to filter URLs by status code. How many URLs return 404 (not found)? How many are 301 redirects?
|Yes. Go to the "HTTP Codes" section|
|List of Hx tags|
“Google looks at the Hx headers to understand the structure of the text on a page better.” - John Mueller
|Yes. Go to the Content section and display the HTML tags graphs. Alternatively, you can go to the URL Explorer and add additional columns -> H1 / H2 / H3|
|View internal nofollow links|
It's nice to see the list of internal nofollow links to make sure there aren't any mistakes.
|Yes. Click on Inlinks -> Nofollow.|
|External links list (outbound external)|
A crawler should allow you to analyze both internal and external outbound links.
|Yes, click on "Outlinks" -> select either "URLs with External Follow Outlinks" or "External Nofollow Outlinks"|
|Link rel="next" (to indicate a pagination series)|
When you perform an SEO audit, you should check whether pagination series are implemented properly.
|Yes. Go to the Distribution section and select the "REL PREV/NEXT DISTRIBUTION" chart. Alternatively, you can go to the URL Explorer -> Edit filters -> add a new filter: "Has at Least an Incoming Rel Next: yes"|
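If you want to spot-check a single page's pagination annotations outside of any crawler, a few lines of Python are enough. Here's a minimal sketch, assuming the `requests` and `beautifulsoup4` libraries; the URL is just a placeholder:

```python
import requests
from bs4 import BeautifulSoup

# Hypothetical paginated category page - replace with your own URL.
url = "https://www.example.com/category?page=2"
soup = BeautifulSoup(requests.get(url, timeout=10).text, "html.parser")

# rel="next" / rel="prev" annotations live in <link> tags in the <head>.
for rel in ("next", "prev"):
    link = soup.find("link", rel=rel)
    print(rel, "->", link["href"] if link else "missing")
```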
|Hreflang tags|
Hreflang tags are the foundation of international SEO, so a crawler should recognize them to let you pinpoint hreflang-related issues.
|Yes. Go to Distribution -> Internationalization. Also possible in the URL Explorer and in the Botify plugin|
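To see what a crawler actually looks for when it recognizes hreflang annotations, here's a small illustrative sketch (again with `beautifulsoup4`; the HTML is a toy example):

```python
from bs4 import BeautifulSoup

# A minimal <head> with hreflang annotations (illustrative only).
html = """
<head>
  <link rel="alternate" hreflang="en" href="https://www.example.com/" />
  <link rel="alternate" hreflang="de" href="https://www.example.com/de/" />
  <link rel="alternate" hreflang="x-default" href="https://www.example.com/" />
</head>
"""
soup = BeautifulSoup(html, "html.parser")
for link in soup.find_all("link", rel="alternate", hreflang=True):
    print(link["hreflang"], "->", link["href"])
```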
|Canonical tags||Every SEO crawler should inform you about canonical tags to let you spot indexing issues.||Yes. Go to HTML Tags -> Insights and review the "Canonicals" section. You can even see canonicals pointing to 404 pages. To do so, go to the URL Explorer -> add a new filter: "Canonical To: HTTP status code" -> `Equal to 404`|
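A check like "Canonical To: HTTP status code" boils down to reading each page's canonical target and testing its status code. A rough sketch of the idea in Python (using `requests` and `beautifulsoup4`; the audited URL is a placeholder):

```python
import requests
from bs4 import BeautifulSoup

def canonical_status(url):
    """Return the canonical target of a page and its HTTP status code."""
    soup = BeautifulSoup(requests.get(url, timeout=10).text, "html.parser")
    link = soup.find("link", rel="canonical")
    if link is None:
        return None, None
    target = link["href"]
    # A HEAD request is usually enough to read the target's status code.
    return target, requests.head(target, timeout=10, allow_redirects=False).status_code

target, status = canonical_status("https://www.example.com/some-page")
if status == 404:
    print("Canonical points to a 404:", target)
```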
|Information about crawl depth - number of clicks from a homepage|
Additional information about crawl depth can give you an overview of the structure of your website. If an important page isn’t accessible within a few clicks from a homepage, it may indicate poor website structure.
|Yes. Click on "Select columns" -> add a new column: "depth". You can also display the active pages ratio, visits, or any Botify KPI by depth|
|List of empty / thin pages|
A large number of thin pages can negatively affect your SEO efforts. A crawler should report them.
|Yes. Go to the Content section -> Content quality evaluation. Alternatively, you can filter pages by number of words and display insights by page type segment|
|Duplicate content recognition|
A crawler should give you at least basic information on duplicates across your website.
|Botify: "Yes. Go to Content Section and Content quality evalutaion. Botify removes templating to do its content similarity evaluation. Note that you can filter by the % of similarity between pages. There's also a section dedicated to Mobile / Desktop content parity. As always, filters are available to evaluate content duplication among segments or sub segments of pages."|
|A detailed report for a given URL|
It's a must-have! If you crawl a website, you may want to see the internal links pointing to a particular URL, its headers, canonical tags, etc.
|Advanced URL filtering for reporting - using regular expressions and modifiers like "contains", "starts with", "ends with"|
I can't imagine my SEO life without a feature like this. It’s common that I need to see only URLs that end with “.html” or those which contain a product ID. A crawler must allow for such filtering.
|Yes + you can combine rules with AND/OR|
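For the curious, the two filters mentioned above are one-liners in plain Python regex - a handy sanity check when you're building filter rules. The product-ID pattern below is purely hypothetical:

```python
import re

urls = [
    "https://www.example.com/blog/post-1.html",
    "https://www.example.com/product/12345",
    "https://www.example.com/about",
]

# "ends with .html"
html_pages = [u for u in urls if re.search(r"\.html$", u)]
# "contains a product ID" (hypothetical pattern: /product/<digits>)
product_pages = [u for u in urls if re.search(r"/product/\d+", u)]

print(html_pages)     # ['https://www.example.com/blog/post-1.html']
print(product_pages)  # ['https://www.example.com/product/12345']
```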
|Adding additional columns to a report|
This is also a very important feature of crawlers. I simply can't live without it. When I view a single report, I want to add additional columns to get the most out of the data. Fortunately, most crawlers allow this.
|Categorizing crawled pages (segmentation)|
Some crawlers offer the possibility to categorize crawled pages (i.e. blog, product pages, etc.) and see reports dedicated to specific categories of pages.
|Yes. "Botify offers different grouping options. URLs segmentation makes possible the filter for parent and children companies. Advanced selctros make it possible to do cross segmentation between pages types and any dat or KPis collected during the analysis (Ex: analyse how a too high content similarity % can impact the organic traffic of your article segment)"|
|Filtering URLs by type (HTML, CSS, JS, PDF, etc.)|
Crawlers visit resources of various types (HTML, PDF, JPG), but usually you want to review only HTML files. A crawler should support this.
|Yes. Go to the URL Explorer and select the "Content Type Distribution" chart: https://elephate.gyazo.com/be7acedf30bc4df2be2f8669a8a425d5|
|Basic statistics about website structure (i.e. depth stats)||Yes + statistics can be displayed by segments.|
|Overview - the list of all the issues listed on a single dashboard|
It's a positive if a crawler lists all the detected issues on a single dashboard. Of course, it will not do the job for you, but it can make SEO audits easier and more efficient.
|No. However, for each section Botify provides valuable insights.|
|Comparing to the previous crawl|
When you work on a website for a long time, it’s important to compare the crawls that were done before and after the changes.
|Yes. You can see some insights in the `Movement` section.|
|List mode - crawl just the listed URLs (helpful for a website migration)|
Sometimes you want to perform a quick audit of a specified set of URLs without crawling the whole website.
|Changing the user agent|
Sometimes it's necessary to change the user agent. For example, even when a website blocks Ahrefs, you may still need to perform a crawl. Also, more and more websites detect Googlebot by its user agent and serve it a pre-rendered version of a page instead of the full JavaScript version.
|Yes (+ if necessary for whitelisting, Botify can use a static IP)|
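You can reproduce this check by hand to see whether a site serves a different version to Googlebot. A minimal sketch with `requests` (the UA string comes from Google's documentation; the URL is a placeholder):

```python
import requests

# Googlebot's documented user agent string.
GOOGLEBOT_UA = ("Mozilla/5.0 (compatible; Googlebot/2.1; "
                "+http://www.google.com/bot.html)")

resp = requests.get("https://www.example.com/",
                    headers={"User-Agent": GOOGLEBOT_UA}, timeout=10)
# Compare this response against a default-UA fetch to spot UA-based pre-rendering.
print(resp.status_code, len(resp.text))
```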
|Crawl speed adjustment|
You should be able to set a crawl speed, e.g. 1-3 URLs per second, if a website can't handle the additional load, while you may want to crawl much faster if a website is healthy.
|Yes + Botify's crawl speed automatically adapts to the server's load performance|
|Can I limit crawling? Crawl depth, max number of URLs|
Many websites have millions of URLs. Sometimes it's good to limit the crawl depth or specify a max number of URLs to crawl.
|Analyzing a domain protected by an htaccess Login|
This is a helpful feature if you want to crawl a staging website.
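Staging sites protected by .htaccess usually sit behind HTTP Basic authentication, which is simple to pass along with each request. A sketch with `requests` (host and credentials are placeholders):

```python
import requests
from requests.auth import HTTPBasicAuth

# .htaccess-protected staging hosts typically use HTTP Basic auth.
resp = requests.get(
    "https://staging.example.com/",
    auth=HTTPBasicAuth("user", "secret"),  # placeholder credentials
    timeout=10,
)
print(resp.status_code)  # 401 means the credentials were rejected
```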
|Can I exclude particular subdomains, include only specific directories?||Yes. You can include/exclude specific directories by setting a virtual robots.txt|
|Universal crawl -> crawl + list mode + sitemap||Yes|
|Scheduling crawls|
It's handy to be able to schedule a crawl and set up monthly/weekly crawls.
|Indicating the crawling progress|
If you deal with big websites, you should be able to see the current status of a crawl. Will you wait a few hours, or weeks, until a 1M+ URL crawl finishes?
|Monitoring changes in robots.txt|
Accidental changes in robots.txt can prevent Google from reading and indexing your content. It's beneficial if a crawler detects changes in robots.txt and informs you.
|Crawl data retention|
It’s good if a crawler can store results for a long period of time.
|Yes, as long as the customer account is active|
|Notifications - crawl finished|
A crawler should inform you when a crawl is done (desktop notification / email).
|Advanced SEO reports|
|List of pages with less than x links incoming|
If there are no internal links pointing to a page, it may signal to Google that the page is irrelevant. It's crucial to spot orphan URLs.
|Yes. Go to "Inlinks" -> "Insights" and choose one of the following reports: "URLs with 1 Follow inlink" / "URL between 2 and 5 follow inlinks." Alternatively, you can go to "URL explorer" and click on "add a new filter: "number of internal follow links".|
|Comparison of URLs found in sitemaps and in the crawl|
Sitemaps should contain all the valuable URLs. If some pages are not included in a sitemap, it can cause issues with crawling and indexing by Google. And if a URL appears in a sitemap but isn't accessible through the crawl, it may signal to Google that the page is not relevant.
|Internal PageRank value||Although no PageRank calculation can reflect Google's link graph, it's still a really important feature. Imagine you want to see the most important URLs based on links. Then you should sort URLs not only by simple metrics like the number of inlinks, but also by internal PageRank. You think Google doesn't use PageRank anymore? http://www.seobythesea.com/2018/04/pagerank-updated/||Yes. Go to the Distribution section and display charts like PageRank by Segment and PageRank by Depth. When viewing a report, click on "Select columns" and add "Internal Pagerank".|
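Internal PageRank itself is a simple iterative calculation over the internal link graph. Here's a minimal, illustrative power-iteration sketch (not Botify's implementation; the toy graph and the damping factor of 0.85 are the textbook defaults):

```python
def internal_pagerank(graph, damping=0.85, iterations=50):
    """Power iteration over an adjacency list of internal follow links."""
    pages = list(graph)
    rank = {p: 1.0 / len(pages) for p in pages}
    for _ in range(iterations):
        new = {p: (1.0 - damping) / len(pages) for p in pages}
        for page, outlinks in graph.items():
            if not outlinks:  # dangling page: spread its rank evenly
                for p in pages:
                    new[p] += damping * rank[page] / len(pages)
            else:
                for target in outlinks:
                    new[target] += damping * rank[page] / len(outlinks)
        rank = new
    return rank

graph = {  # toy internal link graph
    "/": ["/products", "/blog"],
    "/products": ["/"],
    "/blog": ["/", "/blog/post-1"],
    "/blog/post-1": [],
}
for url, score in sorted(internal_pagerank(graph).items(), key=lambda kv: -kv[1]):
    print(f"{url}: {score:.3f}")
```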
|Mobile vs. desktop content parity|
With mobile-first indexing, it's necessary to perform a content parity audit between the mobile and desktop versions of your website.
|Additional SEO reports|
|Malformed URLs (https://https://, https://example.com/tag/something/tag/tag/tag or https://www.example.com/first_part of URL)||You can do it partially. Go to the URL Explorer and try the following filters: "URL contains space", "URL contains https://https".|
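If your crawler only partially covers this, the patterns above are easy to express as regular expressions and run over an exported URL list. A hypothetical sketch:

```python
import re

# Hypothetical checks for the malformed-URL patterns mentioned above.
checks = {
    "doubled scheme": re.compile(r"https?://https?://"),
    "space in URL": re.compile(r"\s"),
    "repeated path segment": re.compile(r"(/[^/]+)\1{2,}"),  # e.g. /tag/tag/tag
}

urls = [
    "https://https://example.com/page",
    "https://www.example.com/first_part of URL",
    "https://example.com/tag/something/tag/tag/tag",
]
for url in urls:
    issues = [name for name, rx in checks.items() if rx.search(url)]
    print(url, "->", issues or "ok")
```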
|List of URLs with parameters||Yes. Go to URL Explorer -> Add filter: "Full URL contains `?` "|
|Mixed content (some pages / resources are served via HTTPS, some via HTTP)||Yes, use the protocol and errors analyses|
|Redirect chains report|
Nobody likes redirect chains. Not users, not search engines. A crawler should report any redirect chains to let you decide if it's worth fixing.
|Yes. Go to "HTTP Codes" -> "Insights" -> "URLs in Redirect chain"|
|Website speed statistics|
Performance is becoming more and more important both for users and SEO. So crawlers should present reports related to performance.
|Yes, Go to "Performance -> Insights"|
|List of URLs blocked by robots.txt|
It happens that a webmaster mistakenly prevents Google from crawling a particular set of pages. As an SEO, you should review the list of URLs blocked by robots.txt - to make sure there are no mistakes.
|You can export the following report to CSV: "Uncrawled URLs because out of scope". Then you can filter it by the "Blocked by robots.txt" column.|
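You can also re-check an exported URL list against robots.txt yourself with Python's standard library, which applies the same matching rules a well-behaved crawler would. A sketch with placeholder URLs:

```python
from urllib import robotparser

# Check a URL list against robots.txt rules, the way a crawler would.
rp = robotparser.RobotFileParser()
rp.set_url("https://www.example.com/robots.txt")
rp.read()

for url in ["https://www.example.com/", "https://www.example.com/private/page"]:
    allowed = rp.can_fetch("Googlebot", url)
    print("allowed" if allowed else "blocked by robots.txt", "->", url)
```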
|Exporting to excel / CSV|
Sometimes a crawler alone isn't enough, and you need to export the data and edit it in Excel or other tools.
|Yes, you can export the data to CSV|
|Exporting to PDF||Yes|
|Custom reports / dashboards||Yes, go to Custom Reports|
|Sharing individual reports|
Imagine that you want to share a report related to 404s with your developers. Does the crawler support it?
|Granting access to a crawl for another person|
It's pretty common that two or more people work on the same SEO audit. Thanks to report sharing, you can work simultaneously.
|Explanation on the issues|
If you are new to SEO, you will appreciate the explanation of the issues that many crawlers provide.
|Custom extraction|
A crawler should let you perform a custom extraction to enrich your crawl. For instance, while auditing an e-commerce website, you should be able to scrape information about product availability and price.
|Yes, you can extract the data using CSS selectors and Regex + you can filter any report by extracted data|
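The same idea - a CSS selector plus a regex - is easy to prototype before you configure it in a crawler. An illustrative sketch with `beautifulsoup4` (the HTML and selectors are made up):

```python
import re
from bs4 import BeautifulSoup

# Toy product page; in practice the HTML comes from your crawl.
html = """
<div class="product">
  <span class="price">$49.99</span>
  <span class="stock">In stock</span>
</div>
"""
soup = BeautifulSoup(html, "html.parser")

# CSS selector for availability, regex for the numeric price.
availability = soup.select_one(".product .stock").get_text(strip=True)
price_match = re.search(r"\$([\d.]+)", soup.select_one(".product .price").text)

print({"availability": availability,
       "price": float(price_match.group(1)) if price_match else None})
```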
|Can the crawler detect the unique part of a page - the part that is not part of the template?||It's valuable if a crawler lets you analyze only the unique part of a page (excluding navigation links, sidebars, and the footer).||Yes|
|Ability to use the crawler's API||Yes|
|Supported operating systems||All - it's a web-based application|
|Integration with Google Analytics||Yes. The Google Analytics integration builds a bridge between the Botify technical SEO data model and your web analytics insights. It gives SEOs the power to track and optimize the organic performance of any URL (or group of URLs) crawled by Botify (including business data, conversions, and user behavior on SERPs)|
|Integration with Google Search Console||Yes. Botify Keywords is the first application merging real keywords (trends, research, rankings, CTR) with an advanced technical SEO data model. Features are available within the report (for mobile / desktop): Keywords Explorer, URL Details, etc. The keyword attributes become a filter of your technical SEO analyses and show you how technical SEO can influence your rankings and keyword strategy.|
|Integration with server logs||Yes|
|Integration with other tools||Analytics software: Google Analytics Premium, Adobe Analytics, AT Internet|
|Why should users use your crawler?|
|Free account / trial||Book a demo and you will get a trial|