Crawler Verification Process
Understanding who’s accessing your website is crucial for security and analytics accuracy. The Profound Agent Analytics platform uses several verification methods to confirm that crawlers claiming to come from major AI and search platforms are genuine.
Why Verification Matters
Accurate crawler identification is essential for:
- Protecting your website from malicious actors
- Ensuring data accuracy in your analytics
- Managing resource allocation effectively
- Maintaining security compliance
- Optimizing content delivery for legitimate AI platforms
Verification Methods
Primary Verification Techniques
Our platform employs multiple verification methods to ensure accuracy:
- Reverse DNS Lookup
  - Verifies the crawler’s hostname matches the claimed organization
  - Provides an additional layer of authenticity checking
  - Used by established platforms like Google and Apple
- IP Range Validation
  - Confirms the crawler originates from the organization’s known IP ranges
  - Particularly effective for platforms like OpenAI and You.com
  - Updated regularly to maintain accuracy
- Heuristic Detection
  - Analyzes crawler behavior patterns
  - Identifies characteristic signatures
  - Helps verify crawlers without published verification methods
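As a rough illustration of reverse DNS verification, the Python sketch below performs a forward-confirmed lookup: it resolves the IP address to a hostname, checks the hostname against a list of allowed domain suffixes, and then resolves the hostname back to confirm it maps to the same IP. This is not Profound's implementation. The function name `verify_reverse_dns`, the `ALLOWED_SUFFIXES` table, and its entries are assumptions made for the example, and the suffixes should be confirmed against each operator's current documentation.

```python
import ipaddress
import socket

# Illustrative suffixes only (assumption): confirm against each operator's docs.
ALLOWED_SUFFIXES = {
    "Googlebot": (".googlebot.com", ".google.com"),
    "Applebot": (".applebot.apple.com",),
}


def _reverse(ip):
    """Reverse (PTR) lookup: IP address -> primary hostname."""
    return socket.gethostbyaddr(ip)[0]


def _forward(host):
    """Forward lookup: hostname -> set of IP address strings."""
    return {info[4][0] for info in socket.getaddrinfo(host, None)}


def verify_reverse_dns(ip, crawler, reverse=_reverse, forward=_forward):
    """Return True only if the IP passes a forward-confirmed reverse DNS check."""
    suffixes = ALLOWED_SUFFIXES.get(crawler)
    if not suffixes:
        return False  # no reverse DNS rule known for this crawler
    try:
        target = ipaddress.ip_address(ip)
        host = reverse(ip).rstrip(".").lower()
        # The leading dot in each suffix rejects look-alikes such as
        # "fakegooglebot.com".
        if not host.endswith(suffixes):
            return False
        # Forward-confirm: the hostname must resolve back to the same IP,
        # otherwise anyone controlling a PTR record could spoof the name.
        resolved = {ipaddress.ip_address(a) for a in forward(host)}
    except (OSError, ValueError):
        return False  # lookup failed or an address could not be parsed
    return target in resolved
```

The default resolvers query live DNS, so a real deployment would typically cache results. The `reverse` and `forward` parameters exist only so the check can be exercised with stub resolvers and no network access.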
Platforms with a published verification method are checked as follows:
- Google
  - Googlebot: Reverse DNS verification
  - Storebot-Google: Reverse DNS verification
  - Google-Extended: Reverse DNS verification
- Microsoft Bing
  - BingSearch: Reverse DNS verification
- Apple
  - Applebot: Reverse DNS verification
  - Applebot-Extended: Reverse DNS verification
- OpenAI
  - OAI-SearchBot: IP range verification
  - ChatGPT-User: IP range verification
  - GPTBot: IP range verification
- You.com
  - YouBot: IP range verification
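IP range verification can be sketched with Python's standard `ipaddress` module: the request's source address is tested against the CIDR blocks an operator publishes for its crawler. The function name `ip_in_published_ranges` and the `PUBLISHED_RANGES` table are hypothetical. The ranges shown are reserved documentation blocks, not real crawler ranges, and would be replaced by the lists each operator publishes.

```python
import ipaddress

# Placeholder CIDR blocks from reserved documentation ranges (assumption):
# these are NOT the real ranges of any crawler.
PUBLISHED_RANGES = {
    "GPTBot": ["192.0.2.0/24"],
    "YouBot": ["198.51.100.0/25", "2001:db8::/32"],
}


def ip_in_published_ranges(ip, crawler, ranges=PUBLISHED_RANGES):
    """Return True if `ip` falls inside any CIDR block listed for `crawler`."""
    try:
        addr = ipaddress.ip_address(ip)
    except ValueError:
        return False  # not a valid IPv4 or IPv6 address
    for cidr in ranges.get(crawler, []):
        net = ipaddress.ip_network(cidr)
        # Only compare addresses and networks of the same IP version.
        if addr.version == net.version and addr in net:
            return True
    return False
```

Because operators change their ranges over time, the table would need to be refreshed on a schedule rather than hard-coded as it is here.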
Some platforms use common cloud provider IPs or don’t publish verification methods, making complete verification challenging:
- Anthropic
  - ClaudeBot: Primarily Amazon AWS IPs
  - Verification method: Heuristic detection
- Bytedance
  - Bytespider: Mixed cloud provider IPs
  - Verification method: Behavioral analysis
- Perplexity
  - PerplexityBot: No published verification method
  - Verification method: Pattern matching and heuristics
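The source does not describe which signals the heuristic checks use, so the following Python sketch is purely illustrative of the general idea: a user-agent signature match combined with a couple of behavioral signals into a confidence score. The `Request` fields, the patterns in `UA_PATTERNS`, and the weights and rate threshold are all assumptions invented for the example.

```python
import re
from dataclasses import dataclass


@dataclass
class Request:
    """Hypothetical per-client summary; the field names are assumptions."""
    user_agent: str
    fetched_robots_txt: bool
    requests_per_minute: float


# Hypothetical signature patterns keyed by crawler name.
UA_PATTERNS = {
    "ClaudeBot": re.compile(r"\bClaudeBot\b"),
    "Bytespider": re.compile(r"\bBytespider\b"),
    "PerplexityBot": re.compile(r"\bPerplexityBot\b"),
}


def heuristic_score(req, crawler):
    """Return a 0-100 confidence score that `req` comes from `crawler`."""
    pattern = UA_PATTERNS.get(crawler)
    if pattern is None or not pattern.search(req.user_agent):
        return 0  # the claimed signature is absent, so nothing to confirm
    score = 50  # signature matched (arbitrary illustrative weight)
    if req.fetched_robots_txt:
        score += 30  # well-behaved crawlers usually read robots.txt
    if req.requests_per_minute <= 60:
        score += 20  # arbitrary threshold for a polite crawl rate
    return score
```

A score like this indicates likelihood rather than proof, since a user agent string and traffic pattern can both be imitated. That is why heuristic results are weaker than reverse DNS or IP range verification.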
Stay Updated
Our platform continuously updates verification methods as:
- New AI platforms emerge
- Existing platforms modify their crawler infrastructure
- Additional verification methods become available
- Security requirements evolve
Important Note About Data Updates
We continuously monitor and improve our verification processes as the AI crawler landscape evolves. As we enhance our detection methods and crawler identification techniques, you may notice changes in your historical and current analytics data. These updates reflect our commitment to providing the most accurate and reliable crawler identification possible.
If you observe any significant changes in your data, it’s likely due to improvements in our verification system. We recommend regularly reviewing your analytics dashboard to stay informed about the latest insights into AI crawler behavior on your site.