AngularJS: Index and Supercharge Your SEO

AngularJS is a popular framework for building websites and apps because it makes them faster and richer for users.

But there’s one problem every developer faces when shipping an AngularJS product: Search Engine Optimization, or SEO.

Quality SEO means getting found amidst all the noise online. When a site or app is optimized for search it is more likely to be found by prospective users. If it is not optimized, a developer might as well be screaming into the wind: no exposure and no users are almost guaranteed.

Right now the most common techniques for mitigating this problem are either to build separate versions of the app or website for search engines, or to pre-render pages on the server. The problem with these solutions, however, is that you have to maintain two separate systems: one for your users and another for Google and the other search engines.

Google has attempted to help developers by improving its ability to index JavaScript and CSS, and its efforts have advanced since 2014, but that doesn’t mean your app will be indexed properly. Indeed, Google still recommends creating snapshots to make an AJAX application crawlable.

But how exactly are these snapshots created? And how can a developer be sure that their AngularJS app or website is correctly and completely indexed by Google?

In this post we present a free, self-hosted solution for generating snapshots and making sure that your AngularJS website or application is crawlable by, indexed by, and optimized for Google.

AngularJS and SEO: The Problem

Search engine crawlers were originally designed to index the HTML content of web pages.

Today, however, JavaScript frameworks like AngularJS and BackboneJS play a leading role in web development and the creation of content and applications online.

Unfortunately, the crawlers and other indexing mechanisms behind search engines remain decidedly unfriendly to JavaScript-powered sites.


AngularJS and SEO: The Solution

Overcoming the indexing problem is not difficult when developers embrace what are called ‘snapshots’.

‘Snapshots’ is the term for content generated for search engine crawlers on the website’s backend. The idea behind snapshots is that the developer does the work that the crawler cannot, or doesn’t want to, do on its own. Optimizing and caching snapshots not only helps you get indexed, it also significantly improves the speed of indexation.

An important note: JavaScript indexation currently only applies to Google’s crawler. Other search crawlers (such as those from Microsoft’s Bing search engine) do not support crawling JavaScript applications yet. Likewise, even though web content is increasingly shared to social networks like Facebook and Twitter, most social network crawlers don’t handle JavaScript either.

So how do you generate snapshots, and how do you work with them to make sure you are indexed?

Read on for the step-by-step guide.

Step 1: Generate Snapshots

The first step is to generate the snapshots themselves.

To do this we need access to a snapshot server based on a headless browser such as PhantomJS or ZombieJS. In this example we will use the open source middleware Prerender that already packages PhantomJS and is ready to handle our special crawler requests and serve HTML snapshots.

To reduce the time it takes to generate snapshots, a cache can be employed. Snapshots are cached on a Redis server the first time they are requested, and then re-cached periodically (note: the cache lifetime can be configured to suit your needs, as shown in sub-step 5 below) to make sure the content stays up to date. As a result, a static snapshot is always instantly available to serve to the crawler.
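Once the Prerender server described in the next step is running, you can request a snapshot from it directly to see exactly what a crawler will receive (this assumes the server listens on port 4001, as configured below):

curl http://localhost:4001/http://www.example.com/directory/page

The first request is slow because PhantomJS has to render the page; repeat it and the cached copy returns almost instantly.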


Step 2: Server Installation

In this example we will use an Apache server run on Ubuntu 14.04.2 LTS.

There are five sub-steps to work through here.

1 – Install NPM and NodeJS

sudo apt-get update

sudo apt-get install nodejs npm

sudo ln -s /usr/bin/nodejs /usr/bin/node

2 – Install Forever

npm install forever -g

3 – Install and Start Prerender.io

git clone https://github.com/prerender/prerender.git

cd prerender

npm install

Make sure the server starts on port 4001 and that PhantomJS runs on port 4002.

You can edit this file if you want to change the port:

/lib/index.js

Return to the Prerender folder and start the server using forever, which keeps the server running continuously in the background:

forever start server.js
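Two optional checks here: if your Prerender version reads the PORT environment variable (an assumption, so verify against your version), you can set the port at launch instead of editing /lib/index.js, and forever’s list command confirms the process is up:

# only if your Prerender version reads process.env.PORT
PORT=4001 forever start server.js

forever list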

4 – Install Redis server

Add the Dotdeb repositories to your APT sources. To do this, create a new list file in /etc/apt/sources.list.d/ and fill it with the following content:

# /etc/apt/sources.list.d/dotdeb.org.list

deb http://packages.dotdeb.org squeeze all

deb-src http://packages.dotdeb.org squeeze all

Then you need to authenticate these repositories using their public key:

wget -q -O - http://www.dotdeb.org/dotdeb.gpg | sudo apt-key add -

Next, install Redis using apt-get:

sudo apt-get update

sudo apt-get install redis-server

You can then start and stop the Redis service:

sudo service redis_6379 start

sudo service redis_6379 stop
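The commands above assume the init script is installed as redis_6379 (the name used by the standard Redis install script). If Redis is not already registered to start on boot, you can add it to the default runlevels:

sudo update-rc.d redis_6379 defaults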

You should then check the Redis status:

$ redis-cli ping

You will get “PONG” if everything is ok.

5 – Make Prerender use the Redis server to cache snapshots

Prerender has an open source module, Prerender-Redis-Cache, that makes it easy to perform this task.

In your local Prerender project directory (the one containing prerender/server.js), run:

$ npm install prerender-redis-cache --save

Then add these two lines in prerender/server.js:

process.env.PAGE_TTL = 3600 * 24 * 5; // set to 0 if you want the cache to never expire

server.use(require('prerender-redis-cache'));
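For reference, after these edits the relevant part of server.js looks roughly like this (a sketch: plugin and option names vary between Prerender versions, and the port option is an assumption):

var prerender = require('./lib');

// port is an assumption; some versions read process.env.PORT or set it in /lib/index.js
var server = prerender({
    port: 4001
});

// cache snapshots in Redis for five days (0 = keep indefinitely)
process.env.PAGE_TTL = 3600 * 24 * 5;
server.use(require('prerender-redis-cache'));

server.start();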

Restart Prerender by running:

forever stopall

forever start server.js

And if you want to clear the entire Redis cache you can use:

redis-cli -p 6379 flushall

Step 3: Server Configuration

Now we will redirect crawlers to the local Prerender server using a simple .htaccess file.

This .htaccess file contains all the redirect configuration. Note that the .htaccess file needs to be in the same directory as your main AngularJS index.html file.

 

<IfModule mod_rewrite.c>

RewriteEngine On
Options +FollowSymLinks

# Redirect non-www to www
RewriteCond %{HTTP_HOST} !^www\.
RewriteRule ^(.*)$ http://www.%{HTTP_HOST}/$1 [R=301,L]

# Don't rewrite files or directories
RewriteCond %{REQUEST_FILENAME} -f [OR]
RewriteCond %{REQUEST_FILENAME} -d
RewriteRule ^ - [L]

# List of crawlers and bots; you can add more
RewriteCond %{HTTP_USER_AGENT} Googlebot|bingbot|Googlebot-Mobile|Baiduspider|Yahoo|YahooSeeker|DoCoMo|Twitterbot|TweetmemeBot|Twikle|Netseer|Daumoa|SeznamBot|Ezooms|MSNBot|Exabot|MJ12bot|sogou\sspider|YandexBot|bitlybot|ia_archiver|proximic|spbot|ChangeDetection|NaverBot|MetaJobBot|magpie-crawler|Genieo\sWeb\sfilter|Qualidator.com\sBot|Woko|Vagabondo|360Spider|ExB\sLanguage\sCrawler|AddThis.com|aiHitBot|Spinn3r|BingPreview|GrapeshotCrawler|CareerBot|ZumBot|ShopWiki|bixocrawler|uMBot|sistrix|linkdexbot|AhrefsBot|archive.org_bot|SeoCheckBot|TurnitinBot|VoilaBot|SearchmetricsBot|Butterfly|Yahoo!|Plukkie|yacybot|trendictionbot|UASlinkChecker|Blekkobot|Wotbox|YioopBot|meanpathbot|TinEye|LuminateBot|FyberSpider|Infohelfer|linkdex.com|Curious\sGeorge|FetchGuess|ichiro|MojeekBot|SBSearch|WebThumbnail|socialbm_bot|SemrushBot|Vedma|alexa\ssite\saudit|SEOkicks-Robot|Browsershots|BLEXBot|woriobot|AMZNKAssocBot|Speedy|oBot|HostTracker|OpenWebSpider|WBSearchBot|FacebookExternalHit|quora\slink\spreview|showyoubot|outbrain|pinterest|slackbot|vkShare|W3C_Validator [NC,OR]
RewriteCond %{QUERY_STRING} _escaped_fragment_

# Only proxy the request to Prerender if it's a request for HTML
RewriteRule ^(?!.*?(\.js|\.css|\.xml|\.less|\.png|\.jpg|\.jpeg|\.gif|\.pdf|\.doc|\.txt|\.ico|\.rss|\.zip|\.mp3|\.rar|\.exe|\.wmv|\.avi|\.ppt|\.mpg|\.mpeg|\.tif|\.wav|\.mov|\.psd|\.ai|\.xls|\.mp4|\.m4a|\.swf|\.dat|\.dmg|\.iso|\.flv|\.m4v|\.torrent))(.*)$ http://www.mydomain.com:4001/http://www.mydomain.com/$2 [P,L]

# Rewrite everything else to index.html to allow html5 state links
RewriteRule ^ index.html [L]

</IfModule>
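One server-side prerequisite worth checking: the [P] flag in the proxy rule requires Apache’s rewrite and proxy modules. On Ubuntu you can enable them like this:

sudo a2enmod rewrite proxy proxy_http

sudo service apache2 restart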

You have now finished all server side installation tasks, so it’s now time to configure the AngularJS App.

Step 4: App Configuration

First open your AngularJS index.html file and:

  1. make sure you have <base href="/"> before </head>
  2. add <meta name="fragment" content="!"> between <head></head> (when the crawler sees this tag on www.example.com, it temporarily maps that URL to www.example.com?_escaped_fragment_= and requests that version from your server; see the sketch below)
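Put together, the head of your index.html ends up looking something like this minimal sketch:

<head>
  <base href="/">
  <meta name="fragment" content="!">
  <!-- the rest of your head content -->
</head>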

Second, activate HTML5 mode.

In your config.js file add:

$locationProvider.html5Mode(true);

This will tell your application to use HTML5 URL format.

With HTML5 mode, URLs look like http://www.example.com/directory/page. Without it, AngularJS defaults to hashbang URLs like this: http://www.example.com/#!/directory/page

var app = angular.module('app')

.config(['$httpProvider', '$locationProvider', function ($httpProvider, $locationProvider) {

    // use the HTML5 History API
    $locationProvider.html5Mode(true);

}]);
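One optional line in the same config block: for older browsers without the History API, AngularJS falls back to hashbang URLs, and you can set the prefix explicitly so the fallback matches the #! format shown above:

$locationProvider.hashPrefix('!');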

Third, you need to manage the meta tags.

To improve the SEO of your app or website you need a unique title and description for each page. An AngularJS module called AngularJS-View-Head already exists to solve this problem. It lets us change the HTML title and head elements on a per-view basis.

How do you work this in practice?

Start by installing the module using Bower.
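For example (the exact Bower package name is an assumption here; check the module’s README):

bower install ng-view-head --save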

Next, declare the module as a dependency of your application:

var app = angular.module('myApp', ['ng-view-head']);

This makes the module’s directives available in your HTML templates.

Finally add the meta tags inside your template.

<view-title>{{ artist.title }}</view-title>
<meta view-head name="description" content="{{ 'SEO.ARTIST.DESCRIPTION' | translate:{ artist: artist.name, artist_ar: artist.name_ar } }}"/>

<!-- Open Graph data -->
<meta view-head property="og:title" content="{{artist.name}}" />
<meta view-head property="og:type" content="article" />
<meta view-head property="og:url" content="http://www.example.com/artist/{{artist.slug}}" />
<meta view-head property="og:image" content="{{artist.photo}}" />
<meta view-head property="og:description" content="{{ artist.description }}" />

Step 5: Test Prerender Server

If you’ve followed all of the steps, things should be working well. But better safe than sorry: it’s time to test.

Compare the source of one of your pages with and without _escaped_fragment_ in the URL.

You can check specific routes in your browser and compare:

http://www.example.com/directory/page
http://www.example.com/directory/page?_escaped_fragment_=
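You can also test from the command line, and spoof a crawler user agent to confirm that the .htaccess rules hand the request to Prerender (the domain is illustrative):

curl http://www.example.com/directory/page > app.html

curl "http://www.example.com/directory/page?_escaped_fragment_=" > snapshot.html

# the snapshot should contain fully rendered HTML rather than an empty Angular shell
diff app.html snapshot.html

# or fetch the page as Googlebot to exercise the user agent condition
curl -A "Googlebot" http://www.example.com/directory/page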

Step 6: Add a Sitemap

The final step in your AngularJS SEO strategy is to develop a sitemap.

To help search engines crawl your app or website and make sure pages are indexed quickly, you must create a sitemap for all of your routes. This sounds difficult to do; however, with a proper build process it can be automated using a tool like Grunt.
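A minimal sitemap.xml looks like this (the route URL is illustrative):

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>http://www.example.com/directory/page</loc>
    <changefreq>daily</changefreq>
  </url>
</urlset>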

(Heads up: we’ll be publishing another post soon explaining just how to automate a sitemap using Grunt!)

Conclusion

Making sure search engines, and through them users, can find your app or website is essential. The strategy presented in this post is quick to implement, scalable, and easy to maintain, and it should help you make the connections you need and win the users you want.
