DeepCrawl - checklist
1
Feature | Explanation | DeepCrawl
2
Basic SEO reports
3
List of indexable/non-indexable pages
It's necessary to view a list of indexable/non-indexable pages to make sure there are no mistakes. Perhaps some URLs that were intended to be indexable are not?
Yes. Go to "Indexation" -> "Indexable pages". If you want to see non-indexable pages, go to "Indexation" -> "Non-indexable pages".
4
Missing title tags
Meta titles are an important part of SEO audits. A crawler should show you a list of pages with missing title tags.
Yes. Go to "Content" -> "Titles & descriptions" -> "Missing Titles".
5
Filtering URLs by status code (3xx, 4xx, 5xx)
When you perform an SEO audit, it's necessary to filter URLs by status code. How many URLs are not found (404)? How many URLs are redirected (301)?
Yes. There are reports dedicated to this in the "Indexation" section, e.g. you can view reports related to the non-200 status pages. + For every report you can set a new filter related to status code.
6
List of Hx tags
“Google looks at the Hx headers to understand the structure of the text on a page better.” - John Mueller
You can go to "Content" -> "Missing H1 tag pages". If you go to a detailed report, you can see a list of H1-H3 tags
7
View internal nofollow links
It's nice to see the list of internal nofollow links to make sure there aren't any mistakes.
Yes. Go to "Links" -> "Internal links" -> "Unique internal links" and set a new filter: "Nofollow = true"
8
External links list (outbound external)
A crawler should allow you to analyze both internal and external outbound links.
Yes
9
Link rel="next" (to indicate a pagination series)
When you perform an SEO audit, you should analyze if the pagination series are implemented properly.
Yes. Go to "Config" -> "all `rel` links"
10
Hreflang tags
Hreflang tags are the foundations of international SEO, so a crawler should recognize them to let you point to hreflang-related issues.
Yes. Go to "Config" -> "all hreflang links"
11
Canonical tags
Every SEO crawler should inform you about canonical tags to let you spot indexing issues.
Yes. Go to "Indexation" -> "Canonicalized pages" or "Indexation" -> "Self canonicalized pages". Also, you can see the list of pages without valid canonical tags. To do so, go to "Config" -> "Pages Without Valid Canonical tags".
12
Information about crawl depth - number of clicks from a homepage
Additional information about crawl depth can give you an overview of the structure of your website. If an important page isn't accessible within a few clicks from the homepage, it may indicate poor website structure.
Yes. Look at the "Level" column in any report
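Crawl depth is simply the shortest click path from the homepage, which can be computed with a breadth-first traversal of the internal link graph. A minimal Python sketch, with a made-up link graph for illustration:

from collections import deque

# Hypothetical internal link graph: page -> pages it links to.
links = {
    "/": ["/category/", "/about/"],
    "/category/": ["/product-1/", "/product-2/"],
    "/about/": [],
    "/product-1/": ["/product-2/"],
    "/product-2/": [],
}

def crawl_depths(links, start="/"):
    """Number of clicks from the homepage to each reachable page (BFS)."""
    depth = {start: 0}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in depth:
                depth[target] = depth[page] + 1
                queue.append(target)
    return depth

print(crawl_depths(links))
# {'/': 0, '/category/': 1, '/about/': 1, '/product-1/': 2, '/product-2/': 2}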
13
Content analysis
14
List of empty / thin pages
A large number of thin pages can negatively affect your SEO efforts. A crawler should report them.
Yes. Go to "Body content" -> "Thin Pages".
15
Duplicate content recognition
A crawler should give you at least basic information on duplicates across your website.
Yes. DeepCrawl: "We first detect pages which are very similar, based on identical titles, descriptions and similar body content, and flag them as duplicate pages.
Then any pages which are not flagged as duplicates are analysed separately for combinations of duplicate titles, descriptions or body and overall patterns. The sensitivity for detecting the duplicate body can be adjusted to meet your specific needs.
Our duplicate detection methods are advanced, and based on DeepRank - our internal ranking system - your primary vs. duplicate pages will be highlighted, making it easy for you to understand the pages you want indexed, as opposed to the pages you want to canonicalise from.
DeepRank is a custom metric, measuring the internal weight of a link based on a calculation which is similar to Google's PageRank algorithm.
We rate each of your URLs on a scale of 0-10, making sure to clearly flag your most important URLs, i.e. those in need of the most improvement."
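As a rough illustration of the general idea (identical titles and descriptions plus near-identical body text), here is a heavily simplified Python sketch. It is not DeepCrawl's actual implementation, the example pages are invented, and the 0.9 similarity threshold is an arbitrary assumption.

from difflib import SequenceMatcher

# Hypothetical crawled pages: url -> (title, description, body text).
pages = {
    "/a": ("Red shoes", "Buy red shoes", "Our red shoes are comfortable and stylish, perfect for every occasion."),
    "/b": ("Red shoes", "Buy red shoes", "Our red shoes are comfortable and stylish, perfect for any occasion."),
    "/c": ("Blue hats", "Buy blue hats", "Completely different content here."),
}

def find_duplicates(pages, body_threshold=0.9):
    """Flag page pairs with identical title/description and very similar bodies."""
    duplicates = []
    urls = list(pages)
    for i, u1 in enumerate(urls):
        for u2 in urls[i + 1:]:
            t1, d1, b1 = pages[u1]
            t2, d2, b2 = pages[u2]
            body_sim = SequenceMatcher(None, b1, b2).ratio()
            if t1 == t2 and d1 == d2 and body_sim >= body_threshold:
                duplicates.append((u1, u2, round(body_sim, 2)))
    return duplicates

print(find_duplicates(pages))  # flags ('/a', '/b') as a duplicate pair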
16
Convenience
17
A detailed report for given URL
It's a must-have! If you do a crawl of a website, you may want to see internal links pointing to a particular URL, to see headers, canonical tags, etc.
Yes
18

Advanced URL filtering for reporting - using regular expressions and modifiers like "contains", "starts with", "ends with".

I can't imagine my SEO life without a feature like this. It’s common that I need to see only URLs that end with “.html” or those which contain a product ID. A crawler must allow for such filtering.
Yes + you can combine rules by OR/AND
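The same kind of filtering is easy to reproduce on an exported URL list. A minimal Python sketch combining "contains", "ends with" and a regular expression with AND/OR logic; the URLs and rules are invented for illustration.

import re

urls = [
    "https://example.com/shop/product-123.html",
    "https://example.com/blog/how-to-seo",
    "https://example.com/shop/product-456.html?ref=mail",
    "https://example.com/contact",
]

ends_with_html = lambda u: u.endswith(".html")
contains_shop = lambda u: "/shop/" in u
has_product_id = lambda u: re.search(r"product-\d+", u) is not None

# AND: URLs that are in /shop/ and carry a product ID.
shop_products = [u for u in urls if contains_shop(u) and has_product_id(u)]

# OR: URLs that end with ".html" or contain a query string.
html_or_params = [u for u in urls if ends_with_html(u) or "?" in u]

print(shop_products)
print(html_or_params)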
19
Adding additional columns to a report


This is also a very important feature of crawlers. I simply can't live without it. When I view a single report, I want to add additional columns to get the most out of the data. Fortunately, most crawlers allow this.
No
20
Page categorizing
Some crawlers offer the possibility to categorize crawled pages (e.g. blog, product pages, etc.) and see reports dedicated to specific categories of pages.
No
21
Filtering URLs by type (HTML, CSS, JS, PDF etc)


Crawlers visit resources of various types (HTML, PDF, JPG). But usually you only want to review HTML files. A crawler should support this.
Yes, if you want to see other types, go to Content and click on "Non-HTML pages"
22
Basic statistics about website structure - e.g. depth stats
Yes. Go to Summary -> Dashboard
23
Overview - the list of all the issues listed on a single dashboard


It's a positive if a crawler lists all the detected issues on a single dashboard. Of course, it will not do the job for you, but it can make SEO audits easier and more efficient.
Yes. Go to Summary -> Issues.
24
Comparing to the previous crawl
When you work on a website for a long time, it’s important to compare the crawls that were done before and after the changes.
Yes. Go to Summary -> Issues -> Changes to see an overview of website changes. Also, for every report you can see the trend line.
25
Crawl settings
26
List mode - crawl just the listed URLs (helpful for a website migration)
Sometimes you want to perform a quick audit of a specified set of URLs without crawling the whole website.
Yes
27
Changing the user agent
Sometimes, it's necessary to change the user agent. For example, even when a website blocks Ahrefs, you may still need to perform a crawl. Also, more and more websites detect Googlebot by user agent and serve it a pre-rendered version instead of the full JavaScript version.
Yes (+ if necessary for whitelisting, DeepCrawl can use a static IP)
28
Crawl speed adjusting

You should be able to set a crawl speed, e.g. 1-3 URLs per second, if a website can't handle the host load, while you may want to crawl much faster if a website is healthy.
Yes + you can change the crawling speed during crawling
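Conceptually, limiting crawl speed just means throttling requests. A minimal Python sketch of a fixed requests-per-second throttle; the URLs and the 2-URLs-per-second default are arbitrary examples.

import time
import urllib.request

def crawl(urls, requests_per_second=2.0):
    """Fetch URLs sequentially, never faster than the configured rate."""
    delay = 1.0 / requests_per_second
    for url in urls:
        started = time.monotonic()
        try:
            with urllib.request.urlopen(url, timeout=10) as response:
                print(url, response.status)
        except Exception as exc:  # keep crawling even if one URL fails
            print(url, "error:", exc)
        elapsed = time.monotonic() - started
        if elapsed < delay:
            time.sleep(delay - elapsed)

crawl(["https://example.com/", "https://example.com/about/"])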
29
Can I limit crawling? Crawl depth, max number of URLs
Many websites have millions of URLs. Sometimes, it's good to limit the crawl depth or specify a max number of URLs to crawl.
Yes
30
Analyzing a domain protected by an .htaccess login
(helpful for analyzing staging websites)

This is a helpful feature if you want to crawl the staging website.
Yes
31
Can I exclude particular subdomains or include only specific directories?

Yes, you can exclude URLs from crawling by pasting them in the config or by specifying a regular expression. Also, you can use a custom robots.txt file.
32
Universal crawl -> crawl + list mode + sitemap
Yes
33
Maintenance
34
Crawl scheduling
It's handy to be able to schedule a crawl and set monthly/weekly crawls.
Yes
35
Indicating the crawling progress
If you deal with big websites, you should be able to see the current status of a crawl. Will you wait a few hours or a few weeks until a 1M+ URL crawl finishes?
Yes, there is a dedicated dashboard for it
36
Robots.txt monitoring
Accidental changes in robots.txt can prevent Google from reading and indexing your content. It's beneficial if a crawler detects changes in robots.txt and informs you.
No
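DeepCrawl doesn't offer this, but the idea is easy to reproduce yourself: fetch robots.txt periodically and compare a hash of its content with the previously stored one. A minimal Python sketch; the URL and the state file path are placeholders.

import hashlib
import pathlib
import urllib.request

ROBOTS_URL = "https://example.com/robots.txt"   # placeholder
STATE_FILE = pathlib.Path("robots_hash.txt")    # where the last-seen hash is kept

def robots_changed():
    """Return True if robots.txt differs from the version seen last time."""
    with urllib.request.urlopen(ROBOTS_URL, timeout=10) as response:
        current = hashlib.sha256(response.read()).hexdigest()
    previous = STATE_FILE.read_text().strip() if STATE_FILE.exists() else None
    STATE_FILE.write_text(current)
    return previous is not None and previous != current

if robots_changed():
    print("robots.txt changed - review the new rules!")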
37
Crawl data retention
It’s good if a crawler can store results for a long period of time.
Forever, unless you delete it.
38
Notifications - crawl finished
A crawler should inform you when a crawl is done (desktop notification / email).
Yes
39
Advanced SEO reports
40
List of pages with less than x links incoming
If there are no internal links pointing to a page, it may signal to Google that the page is irrelevant. It's crucial to spot orphan URLs.
Yes. Go to Indexation -> All pages -> add a new filter: "Links In count"
41
Comparison of URLs found in sitemaps and in the crawl
Sitemaps should contain all the valuable URLs. If some pages are not included in a sitemap, it can cause issues with crawling and indexing by Google. If a URL appears in a sitemap but isn't accessible through the crawl, it may be a signal to Google that the page is not relevant.
Yes
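The comparison itself is just set arithmetic between the URLs listed in the sitemap and the URLs found by the crawl. A minimal Python sketch that parses a standard sitemap.xml; the sitemap URL and the crawled set are placeholders.

import urllib.request
import xml.etree.ElementTree as ET

SITEMAP_URL = "https://example.com/sitemap.xml"  # placeholder
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

def sitemap_urls(url):
    with urllib.request.urlopen(url, timeout=10) as response:
        root = ET.fromstring(response.read())
    return {loc.text.strip() for loc in root.findall(".//sm:loc", NS)}

crawled = {"https://example.com/", "https://example.com/a/"}  # from your crawl export
in_sitemap = sitemap_urls(SITEMAP_URL)

print("In sitemap but not crawled:", in_sitemap - crawled)
print("Crawled but missing from sitemap:", crawled - in_sitemap)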
42
Internal PageRank value
Although internal PageRank calculations can't reflect Google's link graph, it's still a really important feature. Imagine you want to see the most important URLs based on links: then you should sort URLs not only by simple metrics like the number of inlinks, but also by internal PageRank. You think Google doesn't use PageRank anymore? http://www.seobythesea.com/2018/04/pagerank-updated/
Yes. It's called "DeepRank"
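DeepRank's exact formula isn't public, but the underlying idea is a PageRank-style iteration over the internal link graph. A minimal Python sketch of plain PageRank - not DeepRank itself - using a made-up link graph and the textbook 0.85 damping factor:

# Hypothetical internal link graph: page -> pages it links to.
links = {
    "/": ["/a/", "/b/"],
    "/a/": ["/b/"],
    "/b/": ["/"],
    "/c/": ["/"],  # nothing links to /c/, so it keeps a minimal score
}

def pagerank(links, damping=0.85, iterations=50):
    pages = list(links)
    rank = {p: 1.0 / len(pages) for p in pages}
    for _ in range(iterations):
        new_rank = {p: (1.0 - damping) / len(pages) for p in pages}
        for page, outlinks in links.items():
            if not outlinks:
                continue
            share = damping * rank[page] / len(outlinks)
            for target in outlinks:
                new_rank[target] = new_rank.get(target, 0.0) + share
        rank = new_rank
    return rank

for page, score in sorted(pagerank(links).items(), key=lambda kv: -kv[1]):
    print(page, round(score, 3))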
43
Mobile audit
With mobile-first indexing, it's necessary to perform a content parity audit between the mobile and desktop versions of your website.
Yes. You can see the following reports: Mobile Content Mismatch report, Mobile Word Count Mismatch report, and Mobile Links Out Mismatch report. + you can set up two crawls, one using Googlebot mobile and the second with Googlebot desktop, and DeepCrawl will show the differences.

You can read more at http://deepcrawl.actonservice.com/acton/fs/blocks/showLandingPage/a/31628/p/p-0038/t/page/fm/0
44
Additional SEO reports
45
Malformed URLs (e.g. https://https://, https://example.com/tag/something/tag/tag/tag or https://www.example.com/first_part of URL)
Yes. Go to "Indexation" -> "Malformed URLs". You can also see URLs of a length greater than X. To do so, go to Indexation -> All pages -> add a new filter: "URL length greater than x"
46
List of URLs with parameters
Yes. Go to Indexation -> All pages and add a new filter: URL contains "?"
47
Mixed content (some pages / resources are served via HTTPS, some via HTTP)
Yes. Go to Config -> Mixed content
48
Redirect chains report
Nobody likes redirect chains. Not users, not search engines. A crawler should report any redirect chains to let you decide whether they're worth fixing.
Yes. Go to the "Config" section and click on "Redirect Chains". Also, you can look into the 301 redirect section: by clicking on "indexation" -> "301 redirects".
49
Website speed statistics
Performance is becoming more and more important both for users and SEO. So crawlers should present reports related to performance.
Yes. Look into the "performance" section
50
List of URLs blocked by robots.txt
It happens that a webmaster mistakenly prevents Google from crawling a particular set of pages. As an SEO, you should review the list of URLs blocked by robots.txt - to make sure there are no mistakes.
Yes. Go to Indexation -> Disallowed pages. You can then filter the results by the number of internal incoming links.
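You can also cross-check such a list yourself with Python's built-in robots.txt parser. A minimal sketch; the domain and URLs are placeholders.

from urllib.robotparser import RobotFileParser

parser = RobotFileParser("https://example.com/robots.txt")  # placeholder domain
parser.read()

crawled_urls = [
    "https://example.com/",
    "https://example.com/private/report.html",
]

blocked = [u for u in crawled_urls if not parser.can_fetch("Googlebot", u)]
print("Blocked by robots.txt:", blocked)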
51
Schema.org detection
No, but you can use custom extraction: https://www.deepcrawl.com/blog/best-practice/getting-the-most-out-of-deepcrawl-with-custom-extraction/
52
Export, sharing
53
Exporting to excel / CSV
Sometimes a crawler can't do everything, and you need to export the data and edit it in Excel or other tools.
Yes, full export to CSV / XLSX. There are some pre-built exports which you can access via 'Bulk exports'
54
Exporting to PDF
Yes + you can enrich a report with your logo and slogan
55
Custom reports / dashboards

No
56
Sharing individual reports
Imagine that you want to share a report related to 404s with your developers. Does the crawler support it?
Yes.
Also, you can define a specific timeframe for how long the report will be available.
57
Granting access to a crawl for another person


It's pretty common that two or more people work on the same SEO audit. Thanks to report sharing, you can work simultaneously.
Yes
58
Miscellaneous
59
Explanation on the issues
If you are new to SEO, you will appreciate the explanation of the issues that many crawlers provide.
Yes. If you're viewing a report, click on the "info" icon to get more information. Also, for every report you can see its definition. http://take.ms/LD554
60
Custom extraction
A crawler should let you perform a custom extraction to enrich your crawl. For instance, while auditing an e-commerce website, you should be able to scrape information about product availability and price.
Yes, based on regex patterns
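As a rough illustration of regex-based extraction, here is how you might pull a price and an availability flag out of a product page's HTML with Python. The markup and patterns are invented and would need adjusting to the real page templates.

import re

# Hypothetical product page HTML.
html = """
<div class="product">
  <span class="price">€49.99</span>
  <span class="stock">In stock</span>
</div>
"""

price_pattern = re.compile(r'class="price">\s*([^<]+)<')
stock_pattern = re.compile(r'class="stock">\s*([^<]+)<')

price = price_pattern.search(html)
stock = stock_pattern.search(html)

print("price:", price.group(1).strip() if price else None)         # price: €49.99
print("availability:", stock.group(1).strip() if stock else None)  # availability: In stock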
61
Can the crawler detect the unique part of a page - the part that is not part of the template?
It's valuable if a crawler lets you analyse only the unique part of a page (excluding navigation links, sidebars and the footer).
No
62
Ability to use the crawler's API
Yes
63
Supported operating systems
All - it's a web-based application
64
Integration
65
Integration with Google Analytics
Yes. DeepCrawl: "We directly integrate with Adobe Analytics and Google Analytics, though you can manually add data from any provider into your crawls.

Adding Analytics data into your crawls makes it easier to find out the reason for low engagement.

You can also look at the tree diagram functionality which we call Site Explorer, in order to browse and determine if the reason your pages have low engagement is because of empty or thin content."
66
Integration with Google Search Console
Yes. DeepCrawl: "By adding Google Search Console as a source in your crawl settings, DeepCrawl provides click, impression, clickthrough rate (CTR), and average position metrics for every indexed page appearing in search results, and is not limited to the same number of URLs as the GSC interface.

We highlight 17 SERP metrics including traffic reports like Broken Pages with Traffic, Indexable/Non-Indexable Pages without Search Impressions, Desktop Pages with Mobile Search Clicks, Search Clicks by Device and more!

Here is a detailed post on our GSC integration: https://www.deepcrawl.com/blog/releases/advanced-google-search-console-integration/"
67
Integration with server logs
Yes (you can upload server logs or integrate DeepCrawl with Splunk and Logz.io)
68
Integration with other tools
Integration with Woorank, Conductor, Zapier and Majestic (this is free, so you can see their backlink data in DeepCrawl even if you don't have a Majestic account), as well as Splunk, Logz.io and Adobe Analytics.
69
JavaScript rendering
JavaScript is becoming more and more popular. If your website depends heavily on JavaScript, it's a good idea to use a crawler that supports JS.
Yes, DeepCrawl uses Headless Chrome for it
70
Why should users use your crawler?
To be completed
71
Free account / trial
Yes, you can have a 2-week trial, crawling up to 1,000 URLs. You can add multiple data layers for no extra fee, including Majestic, GA and GSC integration.
However, there are some features that are restricted for trials: JS rendering, logfile integration.
72
Referral link
You can get 10% off any annual package by using the discount voucher code: ELEPHATE