Am I getting DDoSed by crawlers?


My website is currently getting hammered by bots (for example, from 66.249.73.*) that are causing high CPU usage. Is it common for Google/Bing to crawl a website 10 times per second?

I have done reverse lookups on the IPs following https://support.google.com/webmasters/answer/80553?hl=en, and they appear to be valid crawlers.

Because I have blocked some of the IPs, Google/Bing Search Console are reporting errors, which is hurting my index.

This started this month (April). Is it a referral attack? Is it possible someone is spoofing the IPs? Is there something I can do to limit the amount of crawling?

I am currently in the process of setting up server-side rendering for crawlers, but this is a tedious task for something that just started happening randomly.
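One way to check whether the 10-requests-per-second figure is real is to count requests per IP per second in the access log. This is a minimal sketch, assuming the common Apache/Nginx combined log format; the sample log lines are made up for illustration:

```python
from collections import Counter

# Sample access-log lines in combined log format (made-up data for illustration).
LOG = """\
66.249.73.10 - - [15/Apr/2019:10:00:01 +0000] "GET /page1 HTTP/1.1" 200 512
66.249.73.10 - - [15/Apr/2019:10:00:01 +0000] "GET /page2 HTTP/1.1" 200 512
66.249.73.11 - - [15/Apr/2019:10:00:02 +0000] "GET /page3 HTTP/1.1" 200 512
"""

def requests_per_second(log_text):
    """Count hits per (IP, second) so heavy crawlers stand out."""
    counts = Counter()
    for line in log_text.splitlines():
        parts = line.split()
        if len(parts) < 4:
            continue  # skip malformed lines
        ip = parts[0]
        # Timestamp field looks like [15/Apr/2019:10:00:01 - strip the bracket
        # and keep second-level resolution.
        timestamp = parts[3].lstrip("[")
        counts[(ip, timestamp)] += 1
    return counts

counts = requests_per_second(LOG)
# Flag any IP that made more than 1 request within the same second.
busy = {k: v for k, v in counts.items() if v > 1}
print(busy)  # {('66.249.73.10', '15/Apr/2019:10:00:01'): 2}
```

Running this over a real log (e.g. reading `/var/log/nginx/access.log` instead of the inline sample) shows which IPs actually sustain the claimed crawl rate, rather than guessing from CPU load.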










  • If your CPU is heavily loaded by legit, or even rogue, bots, it's time to upgrade your hosting. Your website hosting should be able to handle users, legit bots, and even third-party bots.

    – Simon Hayter
    4 hours ago


















Tags: web-crawlers






edited 6 hours ago by John Conde






asked 8 hours ago by Matt




3 Answers






Google uses many IP ranges. Given the one you posted, any IP in the range 66.249.64.0 – 66.249.95.255 that identifies itself as Googlebot should be a legit bot.

There are many reasons for an increase in crawl rate: maybe some of your content has gone viral, or their bot wants to refresh the data from your site sooner. It is usually a good thing.

I would never block a Google IP range, unless you don't want visitors to reach your site. What you can do, if the crawling is hammering your resources, is specify a crawl-delay for other search bots in robots.txt.

Google does not support the Crawl-delay directive, but it does let you define a crawl rate in Google Search Console.

– Luis Alberto Barandiaran (answered 6 hours ago)
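The range quoted in this answer (66.249.64.0 – 66.249.95.255) corresponds to the CIDR block 66.249.64.0/19, so a quick sanity check on a logged IP can use Python's `ipaddress` module; the sample IPs below are illustrative:

```python
import ipaddress

# 66.249.64.0 - 66.249.95.255 is exactly this /19 block
# (the third octet spans 32 values, 2**5, so 24 - 5 = 19 prefix bits).
GOOGLEBOT_RANGE = ipaddress.ip_network("66.249.64.0/19")

def in_googlebot_range(ip):
    """Return True if the IP belongs to the 66.249.64.0/19 block."""
    return ipaddress.ip_address(ip) in GOOGLEBOT_RANGE

print(in_googlebot_range("66.249.73.5"))   # True
print(in_googlebot_range("66.249.96.1"))   # False
```

Note that an IP merely falling inside this block is not proof of Googlebot on its own; the reverse-plus-forward DNS verification described in the Google support article linked from the question is the authoritative check.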





















  • Thanks for the response. This info was helpful.

    – Matt
    4 hours ago



















It's not likely that Google is crawling your site 10 times per second. Google addresses have been used by hackers before, for various reasons. See https://www.abuseipdb.com/user/23706.

You can block this range using your .htaccess file. See https://stackoverflow.com/questions/18483068/how-to-block-an-ip-address-range-using-the-htaccess-file/18483210

What I don't know is whether the range you have is from Google's servers or from their client network.

– Trebor (answered 7 hours ago)
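For reference, blocking an address range in .htaccess looks roughly like this on Apache 2.4; the CIDR block is just the example range from the question, and (as the comments below warn) actually blocking Google this way carries SEO risk:

```apache
# Apache 2.4 syntax: allow everyone except one CIDR block.
<RequireAll>
    Require all granted
    Require not ip 66.249.73.0/24
</RequireAll>
```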



















  • You should NEVER block Google IP addresses, unless you don't want them to send visitors to your site.

    – Luis Alberto Barandiaran
    6 hours ago

  • @LuisAlbertoBarandiaran, I would agree. But not every address assigned to Google is assigned to their servers. That's why I said I don't know if the range is from servers or client network.

    – Trebor
    6 hours ago



















If the IP addresses the requests are originating from are identified as Google's/Bing's, then you should not block them, as this will impact your SEO. Instead of blocking them, you should adjust the crawl rate in their respective webmaster tools.

Both Google and Bing offer the ability to adjust the crawl rate with good flexibility:

Change Googlebot crawl rate

Bing Crawl Control

If Yandex is crawling your site, you can add a Crawl-delay directive to slow down its crawl rate.

If you think spam bots are affecting your servers (which can be determined by observing the server logs), consider using a Web Application Firewall that blocks suspicious IP addresses. Cloudflare, for example, can allow known bots and block suspicious ones based on a threat score it calculates. Additionally, you can block certain user agents from crawling the website.

– Shahzeb Qureshi (answered 1 hour ago)
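A robots.txt fragment along these lines would apply the Crawl-delay approach to bots that honor it; the 10-second value is an arbitrary illustration, and Googlebot ignores this directive (its rate is set in Search Console instead):

```
# Hypothetical robots.txt fragment; the 10-second delay is an arbitrary example.
User-agent: Yandex
Crawl-delay: 10

User-agent: bingbot
Crawl-delay: 10
```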























