Web Server Log Analytics
Web Server-side Analytics For Website SEO
Web analytics comes in many forms. Web server log analytics is one form that can complement traditional web analytics (e.g. Google Analytics or paid web analytics tools).
We don’t use web server log file data as the main data source for our web analytics reports, because simpler tools are always available (such as the reports built into Google Analytics). However, web server log files contain data that Google Analytics (and other web analytics tools) usually lack.
The data in web server log files can be used to find potential technical SEO issues on your website and its pages.
Nowadays major search engines such as Google provide much of the SEO data you need about your website, in Google’s case through Google Search Console. Google Search Console is another useful source of data on your site’s technical SEO issues.
- What is in a Web Server Log File?
- Identify Search Engine Spider (Bot) in Web Server Log Files
- Technical SEO Issues in Web Server Log File Data
What is in a Web Server Log File?
A web server log file is a file (usually several files) stored on your web server.
As soon as the web server hosting your website goes live, it automatically starts logging every request made by visitors and bots to your website.
- When a user visits a web page on your website, your web server logs a single line.
- If the web page he/she visits contains an image, another line is logged for that image.
Every file hosted on your site that a user’s visit causes to load is recorded in the log file as its own line.
Below is a typical log file record. A user (with IP address 192.168.22.10) requested your website’s homepage ( / ) successfully (HTTP status 200). The traffic source is a search on www.google.com, and the user was browsing with Firefox when visiting the page.
192.168.22.10 - - [21/Nov/2003:11:17:55 -0400] "GET / HTTP/1.1" 200 10801 "http://www.google.com/search?q=china+seo&ie=utf-8&oe=utf-8&aq=t&rls=org.mozilla:en-US:official&client=firefox-a" "Mozilla/5.0 (Windows; U; Windows NT 5.2; en-US; rv:126.96.36.199) Gecko/20070914 Firefox/188.8.131.52"
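To make the fields in a record like this easier to work with, a line can be split into named parts. The sketch below is one possible approach, not the only one: it assumes the log uses the common "combined" layout seen in the record above, and the names LOG_PATTERN and parse_line are made up for this illustration. The sample line is a shortened version of the record above.

```python
import re
from datetime import datetime

# Assumed layout ("combined" format):
# host ident user [time] "request" status bytes "referer" "user-agent"
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<protocol>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-) '
    r'"(?P<referer>[^"]*)" "(?P<user_agent>[^"]*)"'
)


def parse_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    record = match.groupdict()
    record["status"] = int(record["status"])
    # A "-" in the size column means no response body was sent.
    record["size"] = 0 if record["size"] == "-" else int(record["size"])
    record["time"] = datetime.strptime(record["time"], "%d/%b/%Y:%H:%M:%S %z")
    return record


# Shortened version of the article's example record (referrer and user agent trimmed).
sample = (
    '192.168.22.10 - - [21/Nov/2003:11:17:55 -0400] "GET / HTTP/1.1" 200 10801 '
    '"http://www.google.com/search?q=china+seo" '
    '"Mozilla/5.0 (Windows; U; Windows NT 5.2; en-US) Gecko/20070914 Firefox"'
)

record = parse_line(sample)
print(record["ip"], record["path"], record["status"], record["referer"])
```

Lines that do not follow the assumed layout return None, so a custom log format would need its own pattern.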
Identify Search Engine Spider (Bot) in Web Server Log Files
Advantage 1 of web server log analytics
One major advantage of web server log analytics is that search engine spider/bot visits are recorded in the log files.
Web analytics tools (such as Google Analytics) aren’t able to capture bot visits because:
- A search engine bot does not necessarily load a web page the way a browser does, which is the action that triggers the recording of a pageview. It can also choose to request any other resource that is not a web page.
Below is a typical log file record when a search engine spider (in this case Googlebot) visits a page on your website (/a.html).
184.108.40.206 - - [21/Nov/2003:04:54:20 -0400] "GET /a.html HTTP/1.1" 200 11179 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
Below is the part that reveals the visit was from Googlebot:
compatible; Googlebot/2.1; +http://www.google.com/bot.html
This means Googlebot has visited your page (/a.html). Only after a web page is known to (or visited by) Google will Google analyze it and decide whether it meets the quality requirements to be ranked in Google’s search engine results pages (SERPs).
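The same check can be applied to every line of a log file to see which pages each bot has requested. The sketch below is only a first-pass filter: it looks for a name token in the user-agent field, and a user-agent string can be forged, so a match is not proof of a genuine crawler. The token list and the names BOT_TOKENS, classify and crawl_counts are assumptions made for this example, and the second sample line is a shortened user record.

```python
import re
from collections import Counter

# Hypothetical token list; extend it with the crawlers you care about.
BOT_TOKENS = ("Googlebot", "bingbot", "Baiduspider", "YandexBot")

# Picks the requested path and the user agent out of a combined-format line.
LINE_RE = re.compile(
    r'"\S+ (?P<path>\S+) [^"]*" \d{3} \S+ "[^"]*" "(?P<user_agent>[^"]*)"'
)


def classify(line):
    """Return (bot_token, path) if the user agent names a known bot, else None."""
    match = LINE_RE.search(line)
    if match is None:
        return None
    user_agent = match.group("user_agent").lower()
    for token in BOT_TOKENS:
        if token.lower() in user_agent:
            return token, match.group("path")
    return None


def crawl_counts(lines):
    """Count how many times each bot requested each path."""
    counts = Counter()
    for line in lines:
        hit = classify(line)
        if hit is not None:
            counts[hit] += 1
    return counts


lines = [
    '184.108.40.206 - - [21/Nov/2003:04:54:20 -0400] "GET /a.html HTTP/1.1" 200 11179 '
    '"-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '192.168.22.10 - - [21/Nov/2003:11:17:55 -0400] "GET / HTTP/1.1" 200 10801 '
    '"http://www.google.com/search?q=china+seo" "Mozilla/5.0 (Windows; U) Firefox"',
]

for (bot, path), hits in crawl_counts(lines).items():
    print(bot, path, hits)
```

Pages that never appear in this count for a given bot are pages that bot has not requested during the period the log covers.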
Technical SEO Issues in Web Server Log File Data
Advantage 2 of web server log analytics
Web server log files can reveal your website’s technical SEO issues.
In the log files, every record shows an HTTP status code, whether it is a record of a user’s visit or of a search engine spider’s visit. Below are the most common HTTP status codes appearing in web server log files.
- 200 – OK
- 301 – Moved Permanently
- 302 – Found (moved temporarily)
- 404 – Not found
- 500 – Internal server error
- 503 – Service Unavailable
In log files, records with HTTP status code 200 mean the request succeeded, so they indicate no technical SEO issue.
HTTP status codes 301 and 302 normally mean no problem at all. In some cases, however, they raise technical SEO concerns: when a 301 or 302 redirect forms part of a chained redirect, where one redirect leads to another. Webmasters should review chained redirects before they turn into bigger problems.
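A log record in the format shown earlier tells you that a URL returned a 301 or 302, but not where the redirect points. The sketch below therefore assumes you already have a map of redirect sources to targets (for example, collected from your server configuration). The map contents and the name redirect_chain are hypothetical and only illustrate how chained redirects could be flagged.

```python
def redirect_chain(start, redirects, max_hops=10):
    """Follow `start` through the redirects map and return every URL visited."""
    chain = [start]
    seen = {start}
    while chain[-1] in redirects and len(chain) <= max_hops:
        target = redirects[chain[-1]]
        chain.append(target)
        if target in seen:  # redirect loop: stop instead of cycling forever
            break
        seen.add(target)
    return chain


# Hypothetical redirect map: source path -> target path.
redirects = {
    "/old": "/interim",
    "/interim": "/new",
    "/a": "/b",
}

# A chain is any source that needs more than one hop to reach its final URL.
chained = {}
for source in redirects:
    chain = redirect_chain(source, redirects)
    if len(chain) > 2:
        chained[source] = chain

for source, chain in chained.items():
    print(" -> ".join(chain))
```

A URL reported here could usually be pointed straight at its final destination, removing the intermediate hop.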
Records with HTTP status codes 500 or 503 indicate web server errors that may require further attention.
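To get an overview of the status codes in a log file, the records can be tallied by code, with the affected URLs listed for the error codes. This is a minimal sketch that assumes the same log layout as the records shown earlier. The sample lines, their IP addresses and paths, and the names REQUEST_RE and summarize are invented for illustration.

```python
import re
from collections import Counter

# Picks the requested path and the status code out of a combined-format line.
REQUEST_RE = re.compile(r'"\S+ (?P<path>\S+) [^"]*" (?P<status>\d{3}) ')


def summarize(lines):
    """Return (count per status code, paths grouped by error status code)."""
    counts = Counter()
    problem_paths = {}
    for line in lines:
        match = REQUEST_RE.search(line)
        if match is None:
            continue  # skip lines that do not follow the assumed layout
        status = int(match.group("status"))
        counts[status] += 1
        if status >= 400:  # 4xx and 5xx responses are worth a closer look
            problem_paths.setdefault(status, set()).add(match.group("path"))
    return counts, problem_paths


# Invented sample records for illustration only.
lines = [
    '203.0.113.5 - - [21/Nov/2003:04:54:20 -0400] "GET /a.html HTTP/1.1" 200 11179 "-" "ExampleAgent"',
    '203.0.113.5 - - [21/Nov/2003:04:54:25 -0400] "GET /old HTTP/1.1" 301 0 "-" "ExampleAgent"',
    '203.0.113.5 - - [21/Nov/2003:04:54:30 -0400] "GET /missing.html HTTP/1.1" 404 512 "-" "ExampleAgent"',
    '203.0.113.5 - - [21/Nov/2003:04:54:35 -0400] "GET /report HTTP/1.1" 500 0 "-" "ExampleAgent"',
]

counts, problem_paths = summarize(lines)
for status in sorted(counts):
    print(status, counts[status])
for status in sorted(problem_paths):
    print(status, sorted(problem_paths[status]))
```

The threshold of 400 is a choice made for this sketch; lowering it to 300 would also list redirected URLs for review.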
Refer to HTTP status codes.