# Synapse Global Theranostics - Robots.txt
# https://synapseglobaltheranostics.com
# Last Updated: 2025-01-21

# ===========================================
# SEARCH ENGINE CRAWLERS
# ===========================================

# Google
User-agent: Googlebot
Allow: /
Crawl-delay: 1

User-agent: Googlebot-Image
Allow: /

User-agent: Googlebot-Video
Allow: /

User-agent: Googlebot-News
Allow: /

User-agent: Storebot-Google
Allow: /

# Bing & Microsoft
User-agent: Bingbot
Allow: /
Crawl-delay: 1

User-agent: MSNBot
Allow: /

User-agent: AdIdxBot
Allow: /

# Yahoo
User-agent: Slurp
Allow: /

# DuckDuckGo
User-agent: DuckDuckBot
Allow: /

# Baidu (China)
User-agent: Baiduspider
Allow: /

# Yandex (Russia)
User-agent: YandexBot
Allow: /

User-agent: YandexImages
Allow: /

# Naver (Korea)
User-agent: Yeti
Allow: /

# Sogou (China)
User-agent: Sogou
Allow: /

# Qwant (Europe)
User-agent: Qwantify
Allow: /

# Ecosia
User-agent: Ecosia
Allow: /

# ===========================================
# SOCIAL MEDIA CRAWLERS
# ===========================================

User-agent: Twitterbot
Allow: /

User-agent: facebookexternalhit
Allow: /

User-agent: LinkedInBot
Allow: /

User-agent: Pinterest
Allow: /

User-agent: Pinterestbot
Allow: /

User-agent: Slackbot
Allow: /

User-agent: Slackbot-LinkExpanding
Allow: /

User-agent: WhatsApp
Allow: /

User-agent: TelegramBot
Allow: /

User-agent: Discordbot
Allow: /

# ===========================================
# AI & LLM CRAWLERS (AI GEO Optimization)
# ===========================================

# OpenAI / ChatGPT
User-agent: GPTBot
Allow: /
# GPTBot crawls for training and plugins

User-agent: ChatGPT-User
Allow: /
# ChatGPT browsing feature

# Anthropic / Claude
User-agent: Claude-Web
Allow: /

User-agent: ClaudeBot
Allow: /

User-agent: anthropic-ai
Allow: /

User-agent: Anthropic
Allow: /

# Google AI / Bard / Gemini
User-agent: Google-Extended
Allow: /
# Google-Extended is for Bard/Gemini training

# Perplexity AI
User-agent: PerplexityBot
Allow: /

User-agent: Perplexity-User
Allow: /

# Cohere AI
User-agent: cohere-ai
Allow: /

User-agent: Cohere-ai
Allow: /

# You.com AI
User-agent: YouBot
Allow: /

# Neeva AI (acquired by Snowflake)
User-agent: NeevaBot
Allow: /

# Meta AI
User-agent: FacebookBot
Allow: /

User-agent: Meta-ExternalAgent
Allow: /

User-agent: Meta-ExternalFetcher
Allow: /

# Apple / Siri
User-agent: Applebot
Allow: /

User-agent: Applebot-Extended
Allow: /

# Amazon / Alexa
User-agent: Amazonbot
Allow: /

# Microsoft Copilot
User-agent: Copilot
Allow: /

# Hugging Face
User-agent: HuggingFaceBot
Allow: /

# Common Crawl (used by many AI models)
User-agent: CCBot
Allow: /

# AI2 (Allen Institute for AI)
User-agent: AI2Bot
Allow: /

# Diffbot (knowledge graph)
User-agent: Diffbot
Allow: /

# Brave Search AI
User-agent: BraveBot
Allow: /

# Mojeek (independent search)
User-agent: MojeekBot
Allow: /

# Phind (developer AI search)
User-agent: Phind
Allow: /

# Kagi Search
User-agent: Kagi
Allow: /

# ===========================================
# SEO & ANALYTICS TOOLS
# ===========================================

User-agent: AhrefsBot
Allow: /
Crawl-delay: 5

User-agent: SemrushBot
Allow: /
Crawl-delay: 5

User-agent: MJ12bot
Allow: /
Crawl-delay: 5

User-agent: DotBot
Allow: /

User-agent: rogerbot
Allow: /

User-agent: Screaming Frog SEO Spider
Allow: /

# ===========================================
# GENERAL RULES
# ===========================================

# Default rule for all other crawlers
User-agent: *
Allow: /
Crawl-delay: 2

# ===========================================
# DISALLOWED PATHS
# ===========================================

# Block development and private files
User-agent: *
Disallow: /*.json$
Disallow: /*.xml$
Disallow: /api/
Disallow: /.well-known/
Disallow: /private/
Disallow: /admin/
Disallow: /tmp/

# Allow specific files for AI discovery
User-agent: *
Allow: /llms.txt
Allow: /robots.txt
Allow: /sitemap.xml

# ===========================================
# AI-SPECIFIC CONTENT HINTS
# ===========================================

# LLMs.txt location for AI assistants
# See: https://llmstxt.org/
# File: /llms.txt

# ===========================================
# SITEMAP & HOST
# ===========================================

# Primary sitemap index (contains all sitemaps)
Sitemap: https://synapseglobaltheranostics.com/sitemap-index.xml

# Individual sitemaps (also accessible directly)
Sitemap: https://synapseglobaltheranostics.com/sitemap.xml
Sitemap: https://synapseglobaltheranostics.com/sitemap-pages.xml
Sitemap: https://synapseglobaltheranostics.com/sitemap-blog.xml
Sitemap: https://synapseglobaltheranostics.com/sitemap-images.xml
Sitemap: https://synapseglobaltheranostics.com/sitemap-ai.xml

Host: https://synapseglobaltheranostics.com