Web Server Log Analytics

Web analytics comes in many forms. Web server log analytics is one form that can complement traditional web analytics (such as Google Analytics or other web analytics tools).

We don’t use web server log file data as the main data source for our web analytics reports, because simpler tools are always available (such as Google Analytics reports). However, web server log files contain data that is usually missing from Google Analytics (and other web analytics tools).

The data in web server log files can be used to find potential technical SEO issues that are happening within your website (and/or web pages).

Nowadays, major search engines provide much of the SEO data you need about your website; Google does so through Google Search Console. Google Search Console is another useful source of data on your site’s technical SEO issues.

What is in a Web Server Log File?

A web server log file is a file (or, more usually, a set of files) stored on your web server.

As soon as the web server hosting your website goes live, it automatically starts logging data about every visitor and bot that visits your website.

The advantage of using web server log file data for web analytics is that it usually requires no setup in advance. Web analytics tools (including Google Analytics) do require setup, usually by inserting a JavaScript tracking code into your web pages.

  • When a user visits a web page on your website, your web server logs a single record (one line).
  • If that web page contains an image, another record is logged for the image request.

Every file hosted on your site that is loaded as a result of a user’s visit is recorded in the log file as its own single line.

Below is a typical log file record. A user (with IP address 192.168.22.10) successfully requested your website’s homepage ( / ), as shown by HTTP status 200. The referrer is a Google search on www.google.com, and the user’s browser was Firefox.

192.168.22.10 - - [21/Nov/2003:11:17:55 -0400] "GET / HTTP/1.1" 200 10801 "http://www.google.com/search?q=china+seo&ie=utf-8&oe=utf-8&aq=t&rls=org.mozilla:en-US:official&client=firefox-a" "Mozilla/5.0 (Windows; U; Windows NT 5.2; en-US; rv:1.8.1.7) Gecko/20070914 Firefox/2.0.0.7"
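To make the fields of such a record easier to work with, the line can be split into named parts. The following is a minimal Python sketch, assuming the server writes the common "combined" log format seen in the record above. The names LOG_PATTERN and parse_log_line are illustrative, and the referrer in the sample is shortened for readability.

```python
import re
from datetime import datetime

# Assumption: the log uses the "combined" format shown in the record above.
# A server configured with a custom format would need a different pattern.
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<protocol>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-) '
    r'"(?P<referrer>[^"]*)" "(?P<user_agent>[^"]*)"'
)

def parse_log_line(line):
    """Return a dict of fields for one log record, or None if it does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    record = match.groupdict()
    record["status"] = int(record["status"])
    record["size"] = 0 if record["size"] == "-" else int(record["size"])
    # Month names such as "Nov" are parsed using the current (English/C) locale.
    record["time"] = datetime.strptime(record["time"], "%d/%b/%Y:%H:%M:%S %z")
    return record

# Sample record from the article, with the referrer query string shortened.
sample = (
    '192.168.22.10 - - [21/Nov/2003:11:17:55 -0400] "GET / HTTP/1.1" 200 10801 '
    '"http://www.google.com/search?q=china+seo" '
    '"Mozilla/5.0 (Windows; U; Windows NT 5.2; en-US; rv:1.8.1.7) '
    'Gecko/20070914 Firefox/2.0.0.7"'
)

record = parse_log_line(sample)
print(record["ip"], record["path"], record["status"], record["referrer"])
```

Lines that do not match the pattern return None, so a malformed record can be skipped rather than stopping the whole analysis.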

Identify Search Engine Spider (Bot) in Web Server Log Files

Advantage 1 of web server log analytics

One major advantage of web server log analytics is that log files record visits from search engine spiders/bots.

Web analytics tools (such as Google Analytics) aren’t able to capture bot visits because:

  • Google Analytics tracks visits with JavaScript-based tracking code inserted into your website’s web pages. Each visit that opens a web page is recorded as a pageview. Refer to Install Google Analytics on WordPress.
  • A search engine bot does not necessarily request a web page. It can request any other resource (such as an image or a CSS file) that carries no tracking code, so no pageview is recorded.

Below is a typical log file record when a search engine spider (i.e. Googlebot) visits your website’s page (/a.html).

66.250.65.101 - - [21/Nov/2003:04:54:20 -0400] "GET /a.html HTTP/1.1" 200 11179 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"

Below is the part that reveals the visit was from Googlebot:

compatible; Googlebot/2.1; +http://www.google.com/bot.html

This means Googlebot has visited your page (/a.html). Only after Google knows about (or has visited) a web page will it analyze the page and decide whether it meets the quality requirements to be ranked in Google’s search engine results pages (SERPs).
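The user-agent check described above can be sketched in a few lines of Python. This is an illustration only: the names REQUEST_AND_AGENT and googlebot_hits are made up for this example, and it assumes the user agent is the last quoted field of each record, as in the samples in this article. Note that a user agent string can be faked by any client, so a match here is a hint rather than proof. Google documents a reverse DNS lookup for verifying that a visit really came from Googlebot.

```python
import re
from collections import Counter

# Captures the requested path and the final quoted field (the user agent).
# Assumption: records follow the layout of the sample log lines in this article.
REQUEST_AND_AGENT = re.compile(
    r'"\S+ (?P<path>\S+) [^"]*" .* "(?P<user_agent>[^"]*)"$'
)

def googlebot_hits(lines):
    """Count requests per path whose user agent mentions Googlebot."""
    hits = Counter()
    for line in lines:
        match = REQUEST_AND_AGENT.search(line.strip())
        if match and "googlebot" in match.group("user_agent").lower():
            hits[match.group("path")] += 1
    return hits

# The two sample records from this article (referrer shortened in the first).
sample_lines = [
    '192.168.22.10 - - [21/Nov/2003:11:17:55 -0400] "GET / HTTP/1.1" 200 10801 '
    '"http://www.google.com/search?q=china+seo" '
    '"Mozilla/5.0 (Windows; U; Windows NT 5.2; en-US; rv:1.8.1.7) '
    'Gecko/20070914 Firefox/2.0.0.7"',
    '66.250.65.101 - - [21/Nov/2003:04:54:20 -0400] "GET /a.html HTTP/1.1" 200 11179 '
    '"-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
]

for path, count in googlebot_hits(sample_lines).items():
    print(path, count)
```

With the two sample records, only the request for /a.html is counted, because the first record comes from a Firefox browser.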

Technical SEO Issues in Web Server Log File Data

Advantage 2 of web server log analytics

Web server log files can reveal your website’s technical SEO issues.

In the log files, every record shows an HTTP status code, whether it records a user’s visit or a search engine spider’s visit. Below are the most common HTTP status codes appearing in web server log files.

  • 200 – OK
  • 301 – Moved Permanently
  • 302 – Found (temporary redirect)
  • 404 – Not Found
  • 500 – Internal Server Error
  • 503 – Service Unavailable

In log files, records with HTTP status code 200 show that the request succeeded, so they indicate no technical SEO issue.

HTTP status codes 301 and 302 normally indicate no problem. In some cases, however, they raise technical SEO concerns, namely when the 301 or 302 redirect forms part of a redirect chain. Webmasters should review such chained redirects before they grow into bigger problems.

HTTP status code 404 means a resource on your website (an HTML page, CSS, JavaScript, image, or any other file) was missing or could not be found when a search engine bot/spider or a human visitor requested it.

Records with HTTP status codes 500 or 503 indicate web server errors that may require further attention.
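A quick way to surface these issues is to tally the status codes in a log file and list the paths behind every non-200 response. The Python sketch below shows one possible approach. The names STATUS_PATTERN and summarize_status_codes are illustrative, and the sample records are made up for this example (they are not taken from a real log).

```python
import re
from collections import Counter, defaultdict

# Captures the requested path and the three-digit status code that follows
# the quoted request. Assumption: records use the layout shown in this article.
STATUS_PATTERN = re.compile(r'"\S+ (?P<path>\S+) [^"]*" (?P<status>\d{3}) ')

def summarize_status_codes(lines):
    """Return (count per status code, set of paths per non-200 status code)."""
    counts = Counter()
    paths_by_status = defaultdict(set)
    for line in lines:
        match = STATUS_PATTERN.search(line)
        if not match:
            continue  # skip records that do not follow the expected layout
        status = int(match.group("status"))
        counts[status] += 1
        if status != 200:
            paths_by_status[status].add(match.group("path"))
    return counts, paths_by_status

# Hypothetical records, invented for illustration only.
sample_lines = [
    '203.0.113.5 - - [21/Nov/2003:11:17:55 -0400] "GET / HTTP/1.1" 200 10801 "-" "ExampleBrowser/1.0"',
    '203.0.113.5 - - [21/Nov/2003:11:18:02 -0400] "GET /old-page.html HTTP/1.1" 301 0 "-" "ExampleBrowser/1.0"',
    '203.0.113.6 - - [21/Nov/2003:11:19:40 -0400] "GET /missing.png HTTP/1.1" 404 512 "-" "ExampleBrowser/1.0"',
    '203.0.113.6 - - [21/Nov/2003:11:20:10 -0400] "GET /missing.png HTTP/1.1" 404 512 "-" "ExampleBrowser/1.0"',
]

counts, paths_by_status = summarize_status_codes(sample_lines)
for status in sorted(counts):
    print(status, counts[status], sorted(paths_by_status.get(status, [])))
```

Paths that repeatedly return 404, 500 or 503 are the natural starting point for a review. Redirect chains cannot be read off a single record, so paths returning 301 or 302 would still need to be followed up separately.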

Refer to HTTP status codes.
