Logfile analysis - for SEOs by SEOs

Track and understand the behaviour of search engine bots.


The Explorer was originally developed for technical SEO purposes. Above all, we wanted to make the crawling behaviour of Googlebot and other search engine spiders visible – especially for large websites with lots of traffic. Our first insights were even more fruitful than we could have imagined.

Logrunner processes and categorizes each log entry to make log file analysis as efficient as possible. One such categorization is the accurate assignment of each request to its web crawler.
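As a minimal sketch of how such categorization could work (this is an illustration, not Logrunner's actual implementation), the following parses a line in combined log format and assigns it a crawler category based on illustrative user-agent substrings:

```python
import re

# Regex for the combined log format (Common Log Format plus referrer and user agent).
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) \S+" '
    r'(?P<status>\d{3}) \S+ "[^"]*" "(?P<agent>[^"]*)"'
)

# Illustrative user-agent substrings per category (not an exhaustive list).
CATEGORIES = {
    "search_engine": ("Googlebot", "bingbot", "YandexBot", "Baiduspider"),
    "seo_tool": ("MJ12bot", "AhrefsBot", "Screaming Frog"),
    "social": ("facebookexternalhit", "Twitterbot", "WhatsApp"),
}

def categorize(line: str) -> dict:
    """Parse one log line and attach a crawler category."""
    m = LOG_PATTERN.match(line)
    if not m:
        return {}
    entry = m.groupdict()
    entry["category"] = "human_or_unknown"
    for category, needles in CATEGORIES.items():
        if any(n.lower() in entry["agent"].lower() for n in needles):
            entry["category"] = category
            break
    return entry
```

A Googlebot request line, for example, would come back with `category` set to `"search_engine"`; anything that matches no pattern stays `"human_or_unknown"`.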

Identifying web crawlers

Search engines

Google, Bing, Yahoo, Baidu, Yandex … Search engines operate with various user-agents and IPs. There are ad bots, video bots, image bots, smartphone bots, news bots - or a Google AMP bot.

SEO Tool bots

SEO tools like Majestic crawl the Internet’s hyperlink structure. RSS reader bots are ever present. Media tools send their bots to news sites. Web scraper tools scan and read information.

Social Media & Messenger bots

See what content is being shared before the shared link even gets clicked. This works for apps such as Facebook Messenger, WhatsApp, Twitter, Skype, Instagram and Viber.

Hidden bots

These are web crawlers which pretend to be human users but can be identified by their behaviour.

Spoofed bots

These are bots which try to pretend to be other bots (such as Googlebot).


Analyse bot behaviour

Which types of content (JS files, images, CSS files, PDFs, URLs) are being crawled, and which are not? How often are they crawled (crawl frequency)? Peak periods are displayed in terms of days of the week or hours of the day. Status codes are displayed, and broken links are found and fixed. The behaviours of different bots can be illustrated and compared over time.
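Crawl frequency per bot and hour of day can be derived from the timestamps in the log entries themselves. A minimal sketch, assuming parsed entries with a `time` field in Common Log Format and an `agent` field (field names are illustrative):

```python
from collections import Counter
from datetime import datetime

def crawl_frequency(entries):
    """Count hits per (bot, hour-of-day) from parsed log entries.
    Each entry is a dict with 'agent' and a 'time' string in CLF
    format, e.g. '10/Oct/2023:13:55:36 +0000'."""
    counts = Counter()
    for e in entries:
        hour = datetime.strptime(e["time"], "%d/%b/%Y:%H:%M:%S %z").hour
        bot = "Googlebot" if "Googlebot" in e["agent"] else "other"
        counts[(bot, hour)] += 1
    return counts
```

Summing the counter over hours gives peak periods per day; the same grouping idea extends to days of the week or content types.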


Segmentation & Comparison

Analyse subsets of log events in isolation.


Segments can be configured in Explorer in much the same way as they can with Google Analytics. You can create subsets from the raw data of your log events, and then store them for later use. A possible segment might be ‘search engine only’ crawls, 404 status codes, or perhaps a separate segment for Googlebot. There are a large number of options.

These segments can then be used like a filter right across your entire dashboard. If you navigate in Explorer with one particular active segment, the values of all the tables and graphs will then adapt to match your segment content.
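Conceptually, a stored segment is just a reusable predicate over log events that every table and graph applies as a filter. A minimal sketch of that idea (illustrative, not Logrunner's actual segment engine):

```python
def make_segment(**conditions):
    """Build a reusable segment: a predicate over log-entry dicts.
    Each keyword is a field name mapped to a required value or a
    callable test, mirroring how a saved segment filters every view."""
    def matches(entry):
        for field, cond in conditions.items():
            value = entry.get(field)
            if callable(cond):
                if not cond(value):
                    return False
            elif value != cond:
                return False
        return True
    return matches

# Example segment: Googlebot requests that hit a 404.
googlebot_404 = make_segment(
    status="404",
    agent=lambda a: "Googlebot" in (a or ""),
)
```

Applying the segment to a dataset is then a plain filter, e.g. `[e for e in entries if googlebot_404(e)]`, and the same predicate can be reused over two different time ranges to compare them.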

In addition to creating segments, you can also compare one with another over time (just as you can also do with non-segmented data). This feature also allows you to compare different segments over a selected time period. And, of course, one selected time period can then also be compared with another.


Bot detection

Use of algorithms to find out what the bots are up to.


Bots from well-known search engines are generally quite easy to recognise. It gets more complicated, however, with crawlers that don't want to identify themselves as such: in the log files they are hardly distinguishable from conventional website visitors. Our machine learning algorithms help bring transparency here. By carefully noting certain patterns and behaviours, we can spot bots trying to hide, and then show them to you in the Data Explorer.


Find Crawling Errors & unnecessary redirects

Detect broken links and status codes.

Crawl errors

You can easily find broken links by interpreting status codes. 404 pages and temporary or permanent redirects (3xx) are shown, and client errors (4xx) and server errors (5xx), including combinations that trigger alerts, are just as easy to monitor.

For SEOs, log files are the first source to consult to see where Google and other crawlers might struggle. This information cannot be found anywhere else.
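Surfacing broken links from a log is essentially grouping error responses by URL. A minimal sketch, assuming parsed entries with `path` and `status` fields (field names are illustrative):

```python
from collections import Counter

def error_report(entries, classes=("4", "5")):
    """Count error responses per URL path. Entries are dicts with a
    'status' string and a 'path'; by default 4xx and 5xx responses
    count as errors. Returns the most-hit broken URLs first."""
    errors = Counter(
        e["path"] for e in entries if e["status"][0] in classes
    )
    return errors.most_common()
```

Passing `classes=("3",)` instead would rank the URLs generating the most redirects, which is the same mechanism used to hunt down unnecessary redirect hops.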


Super fast analysis

No website is too big for Logrunner.

Frontend performance is key: no matter how large your website is, you'll get your data in a matter of seconds. And the best part of it: it's real, accurate data, not a statistical approximation.


Real Time Tracking

Controlling everything in real time.

Real time tracking gives you immediate feedback about web server events, so all data about such log events is displayed within seconds. With real time tracking set up in Explorer, you gain some significant advantages: You can arrange to receive alerts for real-time events, such as traffic fluctuations (DDoS, server failures, spambot crawls), search engine crawls under certain status codes, and more. You can check which content is being viewed, and view your current volume of web server traffic. And because human users and active bot traffic are displayed separately, you can see which crawler is currently on your site and what pages it is interacting with. You can also determine the immediate effects of a tweet, a Facebook post, a 'Fetch as Google' or a Screaming Frog crawl.


Alerts & Reporting (Alpha testing)

Simple monitoring, meaningful reports.

You can create alerts in Explorer with just a few clicks, and choose for yourself just exactly what you want to monitor. For example, you can monitor server errors, 404 web pages, redirects, Googlebot crawl fluctuations, a sudden increase in log counts, large content volumes, and long server load times. Explorer will send you real time notifications by email, or via a Slack report or an RSS feed – whichever is your preferred option.
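Under the hood, an alert of this kind boils down to a threshold check over a recent window of log events. A minimal sketch for one of the examples above, a spike in server errors (the threshold and field names are illustrative assumptions):

```python
def should_alert(recent_entries, error_rate_threshold=0.05):
    """Fire an alert when the share of 5xx responses in the most
    recent window of log entries exceeds the threshold (default 5%)."""
    if not recent_entries:
        return False
    errors = sum(1 for e in recent_entries if e["status"].startswith("5"))
    return errors / len(recent_entries) > error_rate_threshold
```

The same pattern covers the other triggers mentioned: swap the 5xx test for a 404 test, a Googlebot hit count, or a response-time threshold, and evaluate it each time the window advances.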

Reports showing the current view can be created in .csv format with just one click.


Adapt your website and evaluate the changes

The analysis of crawling behaviour often leaves a lot of room for interpretation, and certainly raises many questions and assumptions. And what’s the best way to remove such conjecture? Develop your own hypotheses, then measure and evaluate the outcomes so that you can draw the right conclusions.

For instance, you can find out whether certain new internal links will affect the crawling behaviour of search engine bots in the way you hoped; see if other pages are crawled more once you delete unimportant pages; check whether a server upgrade leads to more crawling; and verify what effect a noindex or a nofollow tag will actually have.

To consider other dimensions: What effect will a website relaunch have? Will your design changes have any effect? (Spoiler alert: Yes, they will!) What happens if there is a domain change? What about a new URL structure? And what about the inclusion of many new backlinks? Will there be any noticeable differences before Google rolls out a ranking update? And what about afterwards?

With Logrunner you can get a clear and definitive answer to all such questions.
