angularjs – How do I redirect crawler requests to prerendered pages when using Amazon S3?

Question

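I have a static SPA site built with Angular and hosted on Amazon S3. I'm trying to make prerendered pages available to crawlers, but I can't redirect crawler requests, because Amazon S3 offers no URL rewriting option and only limited redirect rules.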

What I have

I've added the following meta tag to the <head> of my index.html page:

<meta name="fragment" content="!">

Also, my SPA uses pretty URLs (without the hash # sign) through HTML5 pushState.

With this setup, when a crawler finds my http://mywebsite.com/about link, it makes a GET request to http://mywebsite.com/about?_escaped_fragment_=. This is a pattern defined by Google and followed by other crawlers.
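As a minimal sketch of that mapping, the function below builds the URL a crawler would request for a pretty URL on a page that carries the fragment meta tag. The function name escapedFragmentUrl is hypothetical, and the handling of an existing query string (appending with &) is an assumption based on the crawling scheme, not something stated in the question.

```javascript
// Hypothetical helper: given a pretty (pushState) URL, return the URL a
// crawler following the _escaped_fragment_ pattern would request instead.
function escapedFragmentUrl(prettyUrl) {
  const url = new URL(prettyUrl);
  // Use '?' when there is no query string yet, '&' otherwise (assumption).
  const separator = url.search ? '&' : '?';
  // The fragment value is empty because pretty URLs carry no hash part.
  return url.origin + url.pathname + url.search + separator + '_escaped_fragment_=';
}

console.log(escapedFragmentUrl('http://mywebsite.com/about'));
```

For the link in the question, this yields the same request URL the crawler is described as sending.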

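What I need is to answer this request with a prerendered version of the page, about.html. I've already done the prerendering with Phantom.js, but I can't serve the correct file to the crawler, because Amazon S3 has no rewrite rules.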

On an Nginx server, the solution would be to add a rewrite rule such as:

location / {
  if ($args ~ "_escaped_fragment_=") { 
    rewrite ^/(.*)$ /snapshots/$1.html break;
  } 
}
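To make the intent of that rewrite explicit, here is a rough JavaScript equivalent of the path mapping. It is only an illustration of the logic, since S3 cannot run it; the function name resolveSnapshotPath and the http://localhost base used to parse relative URLs are assumptions.

```javascript
// Hypothetical mirror of the Nginx rule: if the query string contains
// _escaped_fragment_=, map /<page> to /snapshots/<page>.html;
// otherwise leave the path untouched.
function resolveSnapshotPath(requestUrl) {
  // The base is only used so that relative input such as '/about' parses.
  const url = new URL(requestUrl, 'http://localhost');
  if (!url.search.includes('_escaped_fragment_=')) {
    return url.pathname; // normal visitor: serve the SPA as usual
  }
  const page = url.pathname.replace(/^\//, ''); // drop the leading slash
  return '/snapshots/' + page + '.html';
}

console.log(resolveSnapshotPath('/about?_escaped_fragment_='));
console.log(resolveSnapshotPath('/about'));
```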

But on Amazon S3, I'm limited to redirect rules based on KeyPrefixes and HttpErrorCodes. ?_escaped_fragment_= is not a KeyPrefix, since it appears at the end of the URL, and it produces no HTTP error, since Angular ignores it.

What I've tried

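I started out trying dynamic templates with ngRoute, but then I realized that no Angular-based solution can work here, because my targets are crawlers that cannot execute JavaScript.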

With Amazon S3, I have to stick with redirect rules.

I've managed to get it working with an ugly workaround. If I create a new rule for each page, it works:

<RoutingRules>

  <!-- each page needs its own rule -->
  <RoutingRule>
    <Condition>
      <KeyPrefixEquals>about?_escaped_fragment_=</KeyPrefixEquals>
    </Condition>
    <Redirect>
      <HostName>mywebsite.com</HostName>
      <ReplaceKeyPrefixWith>snapshots/about.html</ReplaceKeyPrefixWith>
    </Redirect>
  </RoutingRule>

</RoutingRules>

As you can see, with this solution every page needs its own rule. Since Amazon allows only 50 redirect rules, this is not a viable solution.
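The scaling problem with the per-page workaround can be sketched as a small generator that emits one rule per page and fails once the page count passes the limit. The function name buildRoutingRules and the constant MAX_ROUTING_RULES are hypothetical; the limit of 50 and the rule layout are taken from the question, and page names are assumed to need no XML escaping.

```javascript
// Limit on redirect rules, as stated in the question.
const MAX_ROUTING_RULES = 50;

// Hypothetical generator for the per-page workaround: one RoutingRule per page.
// Page names are assumed to be plain strings that need no XML escaping.
function buildRoutingRules(pages, hostName) {
  if (pages.length > MAX_ROUTING_RULES) {
    throw new Error(
      pages.length + ' pages need ' + pages.length +
      ' rules, but only ' + MAX_ROUTING_RULES + ' are allowed'
    );
  }
  const rules = pages.map(function (page) {
    return [
      '  <RoutingRule>',
      '    <Condition>',
      '      <KeyPrefixEquals>' + page + '?_escaped_fragment_=</KeyPrefixEquals>',
      '    </Condition>',
      '    <Redirect>',
      '      <HostName>' + hostName + '</HostName>',
      '      <ReplaceKeyPrefixWith>snapshots/' + page + '.html</ReplaceKeyPrefixWith>',
      '    </Redirect>',
      '  </RoutingRule>'
    ].join('\n');
  });
  return ['<RoutingRules>'].concat(rules, ['</RoutingRules>']).join('\n');
}

console.log(buildRoutingRules(['about', 'contact'], 'mywebsite.com'));
```

A site with 51 or more pages makes the function throw, which is the point at which the workaround stops being usable.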

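Another solution would be to forget about pretty URLs and use hashbangs. With that, my link would be http://mywebsite.com/#!about, and crawlers would request it via http://mywebsite.com/?_escaped_fragment_=about. Since the key would then start with ?_escaped_fragment_=, it could be captured with a KeyPrefix, and a single redirect rule would be enough. However, I don't want to use ugly URLs.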
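For comparison, the hashbang mapping described above might look like the sketch below. The function name hashbangToEscapedFragment is hypothetical, and the code assumes the fragment contains only plain characters, so it does not apply the character escaping that the crawling scheme defines for special characters.

```javascript
// Hypothetical helper: turn a hashbang URL into the URL a crawler would
// request. Returns null when the URL has no '#!' fragment.
// Assumption: the fragment holds only plain characters, so no escaping is done.
function hashbangToEscapedFragment(hashbangUrl) {
  const url = new URL(hashbangUrl);
  if (!url.hash.startsWith('#!')) {
    return null;
  }
  const fragment = url.hash.slice(2); // text after '#!'
  const separator = url.search ? '&' : '?';
  return url.origin + url.pathname + url.search + separator +
    '_escaped_fragment_=' + fragment;
}

console.log(hashbangToEscapedFragment('http://mywebsite.com/#!about'));
```

Here the page name ends up as the parameter value rather than in the path, which is why the requested key always begins with the same prefix.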

So, how can I have a static SPA on Amazon S3 and still be SEO-friendly?


var fs = require('fs');
var webPage = require('webpage');
var page = webPage.create();

// since this tool will run before your production deploy,
// your target URL will be your dev/staging environment (localhost, in this example)
var path = 'pages/my-page';
var url = 'http://localhost/' + path;

page.open(url, function (status) {
  if (status != 'success')
    throw 'Error trying to prerender ' + url;

  var content = page.content;
  fs.write(path, content, 'w');

  console.log("The file was saved.");
  phantom.exit();
});
