全球主机交流论坛

[Question] Asking the scraping experts for help

1#
Posted on 2023-6-2 22:07:04
https://www.visa.com.sg/cmsapi/fx/rates?amount=100&fee=0&utcConvertedDate=06/02/2023&exchangedate=06/02/2023&fromCurr=CNY&toCurr=USD

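I want to scrape exchange rates from the Visa website, but the URL above gets blocked as soon as I request it. Part of the returned content: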
<h1 data-translate="block_headline">Sorry, you have been blocked</h1>
<h2 class="cf-subheadline"><span data-translate="unable_to_access">You are unable to access</span> visa.com.sg</h2>
<p data-translate="blocked_why_detail">This website is using a security service to protect itself from online attacks. The action you just performed triggered the security solution. There are several actions that could trigger this block including submitting a certain word or phrase, a SQL command or malformed data.</p>
<h2 data-translate="blocked_resolve_headline">What can I do to resolve this?</h2>

<p data-translate="blocked_resolve_detail">You can email the site owner to let them know you were blocked. Please include what you were doing when this page came up and the Cloudflare Ray ID found at the bottom of this page.</p>
<span class="cf-footer-item sm:block sm:mb-1">Cloudflare Ray ID: <strong class="font-semibold">7d1030877eac3f87</strong></span>
<span class="cf-footer-separator sm:hidden">•</span>
<span id="cf-footer-item-ip" class="cf-footer-item sm:block sm:mb-1">
  Your IP:
  <button type="button" id="cf-footer-ip-reveal" class="cf-footer-ip-reveal-btn">Click to reveal</button>
  <span class="hidden" id="cf-footer-ip">20.212.226.221</span>
  <span class="cf-footer-separator sm:hidden">•</span>
</span>
<span class="cf-footer-item sm:block sm:mb-1"><span>Performance &amp; security by</span> <a rel="noopener noreferrer"  id="brand_link" target="_blank">Cloudflare</a></span>

The original page is: https://www.visa.com.sg/support/consumer/travel-support/exchange-rate-calculator.html


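I've tested it, and the IP is not the cause: opening the first URL directly in any browser returns JSON, with no need to visit the original page first. Chrome incognito mode works fine too.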
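To tell the two outcomes apart in a script, one option is to check whether the body parses as JSON and otherwise look for the markers from the block page quoted above. This is only a minimal sketch: the function name `classify_response` and the labels it returns are made up for illustration, and the markers are taken solely from the excerpt in this post, so a different block page would not be recognised.

```python
import json
import re
from typing import Optional, Tuple

# Marker text and Ray ID markup copied from the block page quoted in this post.
BLOCK_MARKER = "Sorry, you have been blocked"
RAY_ID_RE = re.compile(r"Cloudflare Ray ID:\s*<strong[^>]*>([0-9a-f]+)</strong>")


def classify_response(body: str) -> Tuple[str, Optional[str]]:
    """Return ("json", None), ("blocked", ray_id_or_None) or ("unknown", None)."""
    try:
        json.loads(body)  # the browser case: the body is valid JSON
        return ("json", None)
    except ValueError:
        pass  # not JSON, fall through to the block-page check
    if BLOCK_MARKER in body:
        match = RAY_ID_RE.search(body)
        return ("blocked", match.group(1) if match else None)
    return ("unknown", None)
```

Logging the Ray ID alongside the request settings makes it easier to compare which attempts were blocked.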

I'm using selenium + chromedriver in headless mode.
2#
Posted on 2023-6-2 22:39:57
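curl doesn't work either,
which suggests the request needs headers and similar information.
You can't let the site detect that you're using selenium.

Try disguising the request.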
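As a starting point for the "needs headers" idea, here is a sketch that rebuilds the rates URL from the first post and attaches browser-like headers using only the Python standard library. The header values (User-Agent string, Accept, Accept-Language, Referer) are assumptions rather than known requirements of the site, the MM/DD/YYYY date format is inferred from the posted URL and its date, and the helper names are hypothetical. Headers alone may not be enough if the block depends on other signals.

```python
from datetime import date
from urllib.parse import urlencode
from urllib.request import Request

BASE_URL = "https://www.visa.com.sg/cmsapi/fx/rates"
REFERER = "https://www.visa.com.sg/support/consumer/travel-support/exchange-rate-calculator.html"

# Assumed header values, chosen to resemble a desktop Chrome browser.
BROWSER_HEADERS = {
    "User-Agent": (
        "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
        "(KHTML, like Gecko) Chrome/114.0.0.0 Safari/537.36"
    ),
    "Accept": "application/json, text/plain, */*",
    "Accept-Language": "en-US,en;q=0.9",
    "Referer": REFERER,
}


def build_rates_url(day, from_curr, to_curr, amount=100, fee=0):
    """Rebuild the query string seen in the post (dates as MM/DD/YYYY)."""
    stamp = day.strftime("%m/%d/%Y")
    params = {
        "amount": amount,
        "fee": fee,
        "utcConvertedDate": stamp,
        "exchangedate": stamp,
        "fromCurr": from_curr,
        "toCurr": to_curr,
    }
    # safe="/" keeps the slashes in the dates unescaped, as in the original URL.
    return BASE_URL + "?" + urlencode(params, safe="/")


def build_request(url):
    """Wrap the URL in a Request carrying the assumed browser headers."""
    return Request(url, headers=BROWSER_HEADERS)
```

The request can then be sent with `urllib.request.urlopen(build_request(url), timeout=10)`; whether the site accepts it is not something this thread confirms.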
3#
Posted on 2023-6-2 22:55:38
Try stealth.min.js. I've used it to get past quite a few detections; anything more advanced than that is beyond me.
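For reference, stealth.min.js is usually loaded so that it runs before the page's own scripts. With Selenium and Chrome, one way to do that is the DevTools command `Page.addScriptToEvaluateOnNewDocument`, sent through `driver.execute_cdp_cmd`. The sketch below assumes a copy of stealth.min.js has already been saved locally; the function name `inject_stealth` and the default path are hypothetical, and nothing in this thread confirms that it gets past this particular site.

```python
from pathlib import Path


def inject_stealth(driver, script_path="stealth.min.js"):
    """Register the script to run at the start of every new document.

    `driver` is expected to be a Selenium Chrome WebDriver, which provides
    execute_cdp_cmd(); `script_path` is an assumed local copy of stealth.min.js.
    """
    source = Path(script_path).read_text(encoding="utf-8")
    return driver.execute_cdp_cmd(
        "Page.addScriptToEvaluateOnNewDocument", {"source": source}
    )


# Typical use (requires selenium and a local Chrome; not run here):
#   from selenium import webdriver
#   options = webdriver.ChromeOptions()
#   options.add_argument("--headless")
#   driver = webdriver.Chrome(options=options)
#   inject_stealth(driver)      # must be called before driver.get(...)
#   driver.get("https://www.visa.com.sg/support/consumer/travel-support/exchange-rate-calculator.html")
```

The script has to be registered before the first `driver.get(...)`, because it only applies to documents created after the command is sent.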