Solutions for crawling JavaScript links

11/5/2012

Knowledge base, Indexing

Links that depend on JavaScript are generally not crawled by SiteSeeker or global search engines. If the website uses JavaScript links exclusively, it may become invisible to the outside world.

It can also be difficult or even impossible for people with disabilities to use the website if they rely on special web browsers that cannot handle JavaScript. Other users who do not run JavaScript, for example those who disable it for security reasons, are negatively affected as well.

If you are not sure whether your website relies on JavaScript links, you can find out by disabling JavaScript in your web browser settings. Then browse the website and follow the links as you normally would to see how well it works without JavaScript. If you cannot follow the links, SiteSeeker may not be able to find all web pages either. Web pages and documents that no crawlable link points to cannot be crawled by SiteSeeker and therefore cannot be found in search.
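
As a complement to the manual check, the page source can be scanned for links whose href uses the javascript: scheme. The following is a minimal sketch, not a SiteSeeker feature: the function name findJavaScriptLinks is hypothetical, and the regular expression is only a rough heuristic that will miss links wired up purely through event handlers or generated by scripts.

```javascript
// Rough heuristic (an assumption, not part of SiteSeeker): collect the href
// values of <a> tags that use the javascript: scheme. A regular expression
// is not a full HTML parser, so treat the result as a hint only.
function findJavaScriptLinks(html) {
  const anchorHref = /<a\b[^>]*?\bhref\s*=\s*(["'])(.*?)\1/gi;
  const found = [];
  let match;
  while ((match = anchorHref.exec(html)) !== null) {
    const href = match[2].trim();
    if (/^javascript:/i.test(href)) {
      found.push(href);
    }
  }
  return found;
}

// Example usage with a small, made-up page fragment.
const sampleHtml =
  "<a href=\"javascript:window.open('http://example.com/a.htm')\">Open page</a>" +
  "<a href=\"http://example.com/b.htm\">Plain link</a>";
console.log(findJavaScriptLinks(sampleHtml));
```

Any link reported by such a scan is a candidate for one of the solutions described below.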

Solution 1

A good solution to all of the problems listed above is to present the links in ordinary HTML in addition to the JavaScript version. In the following example, the JavaScript that produces a visually attractive menu is supplemented with corresponding "normal" links in plain text:

<script type="text/javascript" src="menus.js"></script>
<noscript>
<a href="menu-selection-1.html">News</a>
<a href="menu-selection-2.html">Products</a>
...
<a href="menu-selection-n.html">Archive</a>
</noscript>

By adding the links in this way, the website remains the same as before for users who have JavaScript enabled in the web browser. The difference is that the links are now visible and work for people without JavaScript too, and that SiteSeeker and the global search engines can index every document.

Solution 2

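Alternatively, if the JavaScript is placed in the href attribute of the link, it can be moved to the onclick attribute, and the normal URL can then be set as the href attribute. The "return false" at the end of the onclick handler prevents the browser from also following the href when JavaScript is enabled.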

<a href="javascript:window.open('http://example.com/a.htm')">Open page</a>

is replaced by:

<a href="http://example.com/a.htm" onclick="window.open('http://example.com/a.htm');return false;">Open page</a>
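
If many links follow the same pattern, the replacement could be applied to existing markup with a small script. The following is only a sketch under stated assumptions: the function name rewriteWindowOpenLinks is hypothetical, and it handles just the exact form shown above (a double-quoted href containing javascript:window.open with a single-quoted URL). Other variants would need to be adjusted by hand.

```javascript
// Sketch (hypothetical helper): move javascript:window.open('URL') from the
// href attribute to onclick and put the plain URL in href instead.
// Only the exact pattern href="javascript:window.open('URL')" is handled.
function rewriteWindowOpenLinks(html) {
  const pattern = /href="javascript:window\.open\('([^'"]+)'\);?"/gi;
  return html.replace(
    pattern,
    (whole, url) =>
      `href="${url}" onclick="window.open('${url}');return false;"`
  );
}

// Example usage with the link from the text above.
const before =
  "<a href=\"javascript:window.open('http://example.com/a.htm')\">Open page</a>";
console.log(rewriteWindowOpenLinks(before));
```

Links that do not match the pattern are left unchanged, so the output should be reviewed before it replaces the original pages.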

Solution 3

Yet another, less optimal way to partly solve the problem is to create an XML sitemap or a special web page with links to all web pages and documents that are otherwise only reachable through JavaScript links. SiteSeeker and other search engines can then find those web pages and documents, but this does not make the website any more usable for visitors without JavaScript.
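
As an illustration of the sitemap option, the following sketch builds a minimal XML sitemap in the sitemaps.org format from a list of URLs. The function names buildSitemap and escapeXml and the example URLs are assumptions made for this example. How the finished file is registered with SiteSeeker or another search engine is not covered here.

```javascript
// Escape the five characters that are special in XML text and attributes.
function escapeXml(text) {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&apos;");
}

// Sketch (hypothetical helper): build a minimal sitemap with one <url>
// entry per address, using only the required <loc> element.
function buildSitemap(urls) {
  const entries = urls.map(
    (url) => `  <url><loc>${escapeXml(url)}</loc></url>`
  );
  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
    ...entries,
    "</urlset>",
  ].join("\n");
}

// Example usage with made-up addresses.
console.log(
  buildSitemap(["http://example.com/a.htm", "http://example.com/b.htm?x=1&y=2"])
);
```

The list of URLs would typically contain exactly those pages and documents that are only linked through JavaScript.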