Introduction
Qwant uses web crawlers to enhance its index and provide the best possible service. This page gives information about how they work and their behaviour on your websites.
User-agent
While crawling, we announce ourselves with different user-agents depending on the crawler version. Every one of them contains the name Qwantbot, which you can use to identify our web crawlers.
Our user-agents are defined as:
Mozilla/5.0 (compatible; Qwantbot{-news}/X.Y_{worker_id}; +https://help.qwant.com/bot/)
- ⚠️ Note: Strings between {} are optional.
Here are a few examples of user-agents our crawlers might send:
Mozilla/5.0 (compatible; Qwantbot/1.0_12345; +https://help.qwant.com/bot/)
Mozilla/5.0 (compatible; Qwantbot-news/2.0; +https://help.qwant.com/bot/)
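The pattern above can be turned into a regular expression for log analysis. The following is a minimal sketch, not an official parser: it assumes that X.Y is a pair of dot-separated numbers and that the worker id, when present, contains no semicolon or whitespace. The names `QWANTBOT_UA` and `parse_qwantbot_ua` are hypothetical.

```python
import re

# Regex derived from the documented pattern. Assumptions: X.Y is digits.digits,
# and the optional "_{worker_id}" suffix contains no ';' or whitespace.
QWANTBOT_UA = re.compile(
    r"^Mozilla/5\.0 \(compatible; Qwantbot(?P<news>-news)?/(?P<version>\d+\.\d+)"
    r"(?:_(?P<worker_id>[^;\s]+))?; \+https://help\.qwant\.com/bot/\)$"
)


def parse_qwantbot_ua(user_agent):
    """Return the parts of a Qwantbot user-agent, or None if it does not match."""
    match = QWANTBOT_UA.match(user_agent)
    if match is None:
        return None
    return {
        "news": match.group("news") is not None,
        "version": match.group("version"),
        "worker_id": match.group("worker_id"),  # None when the suffix is absent
    }
```

Since the exact format depends on the crawler version, a plain substring check for "Qwantbot" is the more tolerant option. Either way, a user-agent string can be forged, so it should not be treated as proof of origin.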
Robots.txt
The crawler respects the robots exclusion standard described at https://www.robotstxt.org/orig.html
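To preview how a rule set would apply, you can test it locally with Python's standard `urllib.robotparser`. This is only a sketch: the robots.txt content and the `/private/` path are made up, the helper `allowed_for` is hypothetical, and it assumes that the token `Qwantbot` in a `User-agent` line is what matches our crawlers. Python's parser is one interpretation of the standard and may differ from the crawler in edge cases.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; the "Qwantbot" token and the
# /private/ path are assumptions made for this example.
ROBOTS_TXT = """\
User-agent: Qwantbot
Disallow: /private/
"""


def allowed_for(agent, url, robots_txt=ROBOTS_TXT):
    """Return True if robots_txt lets `agent` fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)
```

With these rules, `allowed_for("Qwantbot", "https://example.com/private/report.html")` is refused, while the same agent may fetch `https://example.com/index.html`.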
Verifying the Crawler
Reverse & Forward DNS Lookup
To check if a web crawler accessing your server is from Qwant, perform a reverse DNS lookup on its IP address and verify that it resolves to a name ending with “qwant.com”.
Optionally, you can then do a forward DNS lookup on the name from the previous step to confirm that it resolves back to the same IP address.
For example, on Linux you can use the “host” command:
> host 91.242.162.1
1.162.242.91.in-addr.arpa domain name pointer qwantbot-1-162-242-91.qwant.com.
> host qwantbot-1-162-242-91.qwant.com
qwantbot-1-162-242-91.qwant.com has address 91.242.162.1
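The same two lookups can be automated. The sketch below uses Python's standard `socket` module; the function name `is_qwant_crawler` and its injectable `reverse` and `forward` parameters are hypothetical and exist only so the logic can be exercised without network access. It makes two choices beyond the text above: it requires the name to be `qwant.com` or to end with `.qwant.com` (so that a name such as `notqwant.com` does not pass), and it always performs the forward lookup. `socket.gethostbyname_ex` only returns IPv4 addresses, so this version is limited to IPv4.

```python
import socket


def is_qwant_crawler(ip, reverse=socket.gethostbyaddr, forward=socket.gethostbyname_ex):
    """Check an IPv4 address with a reverse lookup followed by a forward lookup."""
    try:
        # Reverse lookup: IP -> host name (first item of the returned tuple).
        hostname = reverse(ip)[0].rstrip(".").lower()
    except OSError:
        return False  # no PTR record or resolver failure
    if hostname != "qwant.com" and not hostname.endswith(".qwant.com"):
        return False
    try:
        # Forward lookup: host name -> list of IPv4 addresses (third item).
        addresses = forward(hostname)[2]
    except OSError:
        return False
    return ip in addresses
```

Calling `is_qwant_crawler("91.242.162.1")` with the default resolvers repeats the two `host` commands shown above; the result depends on live DNS data.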
Using IP ranges
The DNS lookup method described above is the preferred one.
Alternatively, you can identify our bots by matching the remote IP address of the HTTP request against this JSON file: qwantbot.json.
Refresh your copy of this list daily, as it can change at any time.
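Once the ranges have been read from the file, the matching itself can be done with Python's standard `ipaddress` module. This is a sketch under assumptions: the layout of qwantbot.json is not described on this page, so the helper takes a plain list of CIDR strings that you would extract from it yourself. The function name `ip_in_ranges` is hypothetical, and `EXAMPLE_RANGES` holds reserved documentation ranges, not Qwant's real ones.

```python
import ipaddress


def ip_in_ranges(remote_ip, cidr_ranges):
    """Return True if remote_ip falls inside any of the given CIDR ranges."""
    try:
        address = ipaddress.ip_address(remote_ip)
    except ValueError:
        return False  # not a valid IPv4 or IPv6 address
    # strict=False tolerates ranges written with host bits set.
    networks = (ipaddress.ip_network(cidr, strict=False) for cidr in cidr_ranges)
    # An IPv4 address is never "in" an IPv6 network and vice versa.
    return any(address in network for network in networks)


# Placeholder ranges reserved for documentation (RFC 5737, RFC 3849).
# They are NOT Qwant's ranges; load the real ones from qwantbot.json.
EXAMPLE_RANGES = ["192.0.2.0/24", "2001:db8::/32"]
```

For repeated checks on a busy server, it may be worth parsing the ranges once per refresh rather than on every request.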
Troubleshooting
If something went wrong when we visited your website using our crawlers, we are sincerely sorry for the inconvenience.
Please report any problem caused by the crawlers by sending an email to qwantbot@qwant.com.