September 20, 2022

URL optimization


We’ve all seen those strange long (and sometimes suspicious) addresses with upper and lower case letters and all sorts of symbols, the content behind which remains completely unclear… until we open them. Needless to say, an optimized URL would look a lot more acceptable from a user’s perspective.

Moreover, a “clean” web address is relatively easy to remember – we can type it directly into the browser instead of searching around the site. 😉


Actually, SEF (Search Engine Friendly) URLs are useful for search engines too, as their name suggests. Apart from the well-known fact that URLs on our site need to be unique (in order to be crawled and indexed), it’s also worth mentioning that all other things being equal, search engines would rank pages with user-friendly URLs higher.

Let’s recap what a URL is before we move on:

A URL (Uniform Resource Locator), or so-called web address, denotes the location of a unique web page (or any other resource) on the web. In effect, it gives users a readable alternative to the numeric IP addresses through which browsers communicate with servers.

Let’s look at the basic elements of a web address:


  1. Protocol. Typically, the protocol is http:// or https:// (with the s standing for secure, given an SSL certificate), but it can be different – such as mailto:, for example.
  2. Top-Level Domain (TLD). A TLD can be .com, .net, .gov, .org, etc. A Country-Code Top-Level Domain (ccTLD) is a Top-Level Domain for a particular country. Each country has its own reserved Top-Level Domain, which is usually 2 letters long. E.g. .bg is the reserved ccTLD for Bulgaria.
  3. Domain name. This is the name of the website where the resource is located. The domain name and the Top-Level Domain together form the Root Domain – which is usually highest in the site hierarchy and is most often the home page. Logically, any URL that is part of our website should contain the Root Domain in its web address.
  4. Subdomains. These are added before the Root Domain and are separated from it by a dot. Most often we can find www. as well as blog. There are also country subdomains that replace ccTLDs, e.g. https://bg.site.com. Here it is worth mentioning that a country can be indicated in a third way in the URL – by a subdirectory, e.g. https://site.com/bg.
  5. Path and parameters. Depending on the page the user is on and the path they have taken to get there (the user’s path), the URL may contain the names of categories at different levels of the site, as well as parameters (session ID, GET parameters, GCLID parameters) that are “stored” in the URL.
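As a rough illustration of the elements listed above, here is a minimal Python sketch that splits a web address into its parts using the standard library. The function name `split_url` and the sample address are hypothetical, and the naive split on dots is an assumption that only holds for single-label TLDs (it would mislabel suffixes such as .co.uk).

```python
from urllib.parse import urlsplit, parse_qs


def split_url(url: str) -> dict:
    """Break a URL into the elements described in the list above."""
    parts = urlsplit(url)
    # Naive assumption: the last label is the TLD, the one before it is the domain name.
    labels = (parts.hostname or "").split(".")
    return {
        "protocol": parts.scheme,
        "subdomain": ".".join(labels[:-2]),
        "domain_name": labels[-2] if len(labels) >= 2 else "",
        "tld": labels[-1],
        "path": parts.path,
        "parameters": parse_qs(parts.query),
    }


print(split_url("https://bg.site.com/category/page?page=2"))
```

For this sample address, the subdomain is bg, the domain name is site, the TLD is com, and the only parameter is page.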

What is a clean URL?

A SEF (Search Engine Friendly) URL, Clean URL, User-friendly URL or even Pretty URL is a URL that is logically structured and understandable to users. SEF URLs are usually short, clear, easy to remember and ideally contain keywords as well.

For Google, the presence of keywords in URLs does not play much of a role, as John Mueller himself stated on the topic in 2017.

The truth is that nowadays, when we mostly use Google in a mobile browser, we quite rarely see URLs. Furthermore, the idea that keywords in URLs can boost the CTR (Click-Through Rate) from the results page is now slightly outdated, given that Google hardly shows URLs in the SERP (Search Engine Result Page). Why? Because it instead offers the user easier navigation right on the results page through breadcrumb navigation.


A fair question at this point would be:

Well, why bother with implementing SEF URLs then?

Google prefers SEF URLs mainly for one simple reason: users prefer them.

So far, we all agree that optimised addresses certainly look more presentable than a long address with a session ID or incomprehensible parameters in the query string.

However, here are 3 more objective reasons to optimize your website URLs:

1. Google Ranking Factors

You may already know that Google relies on over 200 ranking factors for its algorithm – some are confirmed, others are considered speculation. In this article we’ll point out only the ones that are related to URLs. According to the 2022 Google Ranking Factors list from Backlinko, these are:

  • URL Length. According to several studies on the topic, short URLs have a slight advantage in Google SERP positioning.
  • URL Path. Pages that are closer to the homepage are preferred by Google over those that are buried deep in the site hierarchy, far from the homepage.

Good practice:

For online stores, it is advisable to have no nesting when composing URLs for product pages.

Why? Because if a product is present in more than one category, nesting is a prerequisite for duplicate content.

Example:

https://site.bg/category-1/product-1
https://site.bg/category-2/product-1

These are two different URLs for Google, but they have identical content – the same product.

  • Keyword in URL. As mentioned before, Google denies that the keyword in the URL is much of a ranking factor, as we hardly see URLs nowadays. But it is still a signal of relevancy and remains a ranking factor. Minor one, but still a ranking factor, right?
  • URL String. The categories in a URL string can be a thematic signal to Google about the content on the page.

    https://drehi.bg/damski-drehi/letni-rokli

2. Better UX (User Experience)

Users are more likely to click on a URL that they understand. SEF URLs give both humans and search engines information about the page content that is quick and easy to grasp. Clean URLs are understandable to even the most casual user, are relatively easy to remember, and are sometimes typed directly into the address bar.

3. Links (bare URLs)

Optimized URLs can serve as anchors when we copy and paste them into forums, blog articles, social media posts and comments. If the URL isn’t optimised, we run the risk of it not looking very trustworthy to users, as they have no way of knowing what the content of the page is until they open it.

Example with a LinkedIn post:

On the first image, we can see a user-friendly URL and immediately understand what the content of the page is before we’ve even clicked – SEO articles on the Netpeak blog.

When the address is not optimized, as in the case of the second image, we are likely to achieve the opposite of the desired effect – our friends and colleagues will not click on this link and will not find out how cool and useful our articles are. 🙂

And last but not least: even though Google doesn’t show URLs on the results page, other popular search engines, like Bing for example, still show them.


How to optimize URLs

Now we come to the practical part: the most important talking points and recommendations when optimizing URLs.

It is recommended that URLs be as simple and logically structured as possible, so that they are “easy to digest” for the user. But how?

1. Words > Parameters?

Whenever possible, it is advisable to use simple and descriptive words in URLs instead of generating ID numbers or other parameters. From an SEO perspective, the most sensible advice would be to avoid parameters and use static URLs. But is it worth it…

What are static URLs?

A static URL, as its name suggests, is a URL that does not change and does not contain parameters. These URLs are relatively shorter and easier to remember because they usually contain words instead of parameters, unlike dynamic URLs. However, rewriting dynamic URLs into static ones is quite time consuming and complicated to implement when the information on the site is growing rapidly. Therefore, professionals who are responsible for larger online stores, blogs or forums (sites with a large amount of information that is frequently updated) sometimes prefer to use dynamic URLs.

What about dynamic URLs?

We can often recognize a dynamic URL by characters such as ?, & and =. The biggest disadvantage of dynamic URLs is that several different URLs can have similar or even identical content. Another downside is that when shared on social networks, chats and emails, they don’t attract as many clicks as a static, user-friendly URL would because they’re… too long and full of parameters.

Static or dynamic URL – which is better and why?

As mentioned earlier, rewriting dynamic URLs into static ones is often a complex and lengthy process. Apart from the actual “conversion” of URLs from one type to the other, it is also difficult to maintain them in case the database grows nonstop and we have to hard-code every new page on the site. 🙁 

It’s important not to try to hide parameters in URLs in attempts to make them look static. In order to have static URLs, we need to have static content. If we have dynamic URLs and a large amount of variable content, it is safer from an SEO point of view to leave them dynamic and manage the parameters correctly.

Myth: Google cannot crawl and index dynamic URLs.

Bots understand the parameters in dynamic URLs, so indexing and ranking such URLs is not a problem. As for rewriting dynamic URLs, Google recommends that we don’t rewrite dynamic URLs to look static by removing or hiding information from the parameters. It’s better to use static URLs for static site content, but when dynamic URLs are required, let’s leave it up to the bots to use the parameter information and parse it.

Let’s look at URL parameters:

They can be of different types and serve different purposes, such as tracking traffic from ads and ad campaigns (UTM codes) or pagination (e.g. ?page=2). However, we need to know how to manage them so they don’t become a ‘pain in the a*s’ 😉

Most often URL parameters can cause:

  • Duplicate content. Here’s an example of duplicate content with dynamic URLs:
https://site.bg/index.php?param=1&param=2
https://site.bg/index.php?param=2&param=1

In these two different URLs, the parameters are the same, but they are swapped, so the content of these pages is the same (duplicated). This problem can be avoided by using static URLs (if possible).

  • Loss of crawling budget. URLs with many parameters that point to the same or near-identical content make crawling harder for bots and waste crawling budget.
  • Uncertain ranking signals. When many URLs lead to identical content, search engines cannot tell which of those URLs to rank for a particular search.
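To spot duplicates caused by swapped parameters, as in the example above, one option is to sort the parameters before comparing URLs. The following Python sketch is only an illustration for site audits, not something search engines are known to do; the function name `normalize_query_order` and the parameter names color and size are hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode


def normalize_query_order(url: str) -> str:
    """Return the URL with its query parameters sorted alphabetically."""
    parts = urlsplit(url)
    # parse_qsl keeps every key/value pair; sorting makes the order predictable.
    pairs = sorted(parse_qsl(parts.query, keep_blank_values=True))
    return urlunsplit(parts._replace(query=urlencode(pairs)))


a = normalize_query_order("https://site.bg/index.php?color=red&size=m")
b = normalize_query_order("https://site.bg/index.php?size=m&color=red")
print(a == b)  # both variants collapse to the same address
```

Two URLs that differ only in parameter order produce the same normalized string, so they can be grouped and pointed at a single preferred version.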

Session IDs

These are unique IDs that are generated by the site’s server when a user visits our site for the first time. These numbers track a user’s interaction with the site, and can also track their click-path.

Example URL with session ID:

https://www.sait.bg/blog/saveti-url-optimizatsia?sid=2354984855ghtfs_564981557asdf

The problem that can arise with URLs with an ID parameter (and any parameters in general) is duplicate content. If a session ID number is added as a parameter to the URL of a particular page, Googlebot will receive a new ID number every time a new user accesses the page. But all these versions of the respective page have the same content… Which is a red flag for Google. 😉

What can we do in such a case? If our site’s Content Management System (CMS) uses ID parameters in the URL, it is advisable to set rel=canonical from each page that uses an ID parameter to the main version of the page – without a parameter, for example:

<link rel="canonical" href="https://sait.bg/blog/saveti-url-optimizatsia">

instead of

<link rel="canonical" href="https://sait.bg/blog/saveti-url-optimizatsia?sid=2354984855ghtfs_564981557asdf">
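How the canonical address is generated depends entirely on the CMS. As a minimal sketch, assuming the session parameter is called sid as in the example above, the parameter could be stripped in Python like this (the names `SESSION_PARAMS`, `canonical_url` and `canonical_tag` are hypothetical):

```python
from html import escape
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Assumption: the only session parameter on this site is "sid".
SESSION_PARAMS = {"sid"}


def canonical_url(url: str) -> str:
    """Drop session parameters and keep everything else untouched."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in SESSION_PARAMS
    ]
    return urlunsplit(parts._replace(query=urlencode(kept)))


def canonical_tag(url: str) -> str:
    """Build the <link rel="canonical"> element for a page URL."""
    return f'<link rel="canonical" href="{escape(canonical_url(url), quote=True)}">'


print(canonical_tag(
    "https://sait.bg/blog/saveti-url-optimizatsia?sid=2354984855ghtfs_564981557asdf"
))
```

Other parameters, such as pagination, are left in place here; whether they belong in the canonical address is a separate decision.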

2. URLs – in Latin!

URLs should be written in Latin characters. Cyrillic words are converted to Latin using transliteration rules.
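As an illustration, here is a minimal Python sketch of transliterating Bulgarian text. The mapping follows the official Bulgarian streamlined system as far as we know it, including its exception for a word-final -ия; the names `BG_TO_LATIN` and `transliterate_bg` are hypothetical, and the table should be checked against the official rules before being used in production.

```python
import re

# Lowercase Bulgarian letters mapped to Latin (streamlined system).
BG_TO_LATIN = {
    "а": "a", "б": "b", "в": "v", "г": "g", "д": "d", "е": "e",
    "ж": "zh", "з": "z", "и": "i", "й": "y", "к": "k", "л": "l",
    "м": "m", "н": "n", "о": "o", "п": "p", "р": "r", "с": "s",
    "т": "t", "у": "u", "ф": "f", "х": "h", "ц": "ts", "ч": "ch",
    "ш": "sh", "щ": "sht", "ъ": "a", "ь": "y", "ю": "yu", "я": "ya",
}


def transliterate_bg(text: str) -> str:
    """Lowercase the text and replace Bulgarian letters with Latin ones."""
    text = text.lower()
    # Exception in the rules: "ия" at the end of a word becomes "ia".
    text = re.sub(r"ия\b", "ia", text)
    # Characters without a mapping (Latin letters, digits, spaces) pass through.
    return "".join(BG_TO_LATIN.get(char, char) for char in text)


print(transliterate_bg("Съвети за URL оптимизация"))
```

The output still contains spaces, so it needs further cleaning before it can be used as part of a URL.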

According to Google’s recommendations for addresses, we should use UTF-8 (Unicode Transformation Format) – this is a character encoding standard whose first 128 characters are identical to those in the ASCII standard (American Standard Code for Information Interchange), so any valid ASCII text is also UTF-8 valid.

ASCII is the first encoding standard that was used for exchanging information on the Internet, and is based on the Latin alphabet.

URLs can only be shared on the web if they are “translated” into ASCII.

If the URL contains characters that are not part of the ASCII standard, they must be converted using URL encoding (percent-encoding). Each byte of the character’s UTF-8 representation is replaced by a percent sign (%) followed by two hexadecimal digits (the numerals 0 through 9 and the Latin letters A through F), so a single Cyrillic letter becomes two such groups. In addition, addresses cannot contain spaces – they are replaced by %20, or by a plus sign (+) in query strings.

For example, here is what the following combination of Cyrillic characters would look like:

здравей (“hello”)

in percent-encoded ASCII form:

%D0%B7%D0%B4%D1%80%D0%B0%D0%B2%D0%B5%D0%B9
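The conversion can be reproduced with Python’s standard library. This short sketch encodes the Bulgarian word здравей (“hello”) and decodes it back; the variable names are only illustrative.

```python
from urllib.parse import quote, quote_plus, unquote

word = "здравей"

# Each Cyrillic letter is two bytes in UTF-8, so it becomes two %XX groups.
encoded = quote(word)
print(encoded)

# Decoding restores the original Cyrillic text.
decoded = unquote(encoded)
print(decoded)

# Spaces become %20 in paths, or + when encoding query-string values.
print(quote("summer dress"), quote_plus("summer dress"))
```

Seven letters turn into 42 characters, which is why shared Cyrillic URLs look so long.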

In reality, URLs containing Cyrillic characters are successfully ranked – we can take the pages written in Bulgarian on Wikipedia as an example. The problem is that when we share on the web, e.g. when we paste a URL with Cyrillic characters into a post or a private message, it becomes a long string of characters, absolutely incomprehensible to the human brain.

3. Lowercase

https://sait.bg/blog/Saveti-URL-Optimizatsia

or

https://sait.bg/blog/saveti-url-optimizatsia

The protocol and domain parts of a URL are case-insensitive, and many servers also treat uppercase and lowercase letters in the rest of the URL the same way. However, some servers might return a 404 response, or count the two URLs as separate pages and create duplicate content. Better to save ourselves the hassle and avoid capital letters in URLs.

4. Hyphens in U-R-L

Every character in the URL that is not a letter or a number (punctuation, spaces) should be replaced by a single hyphen (-).

  • If two or more hyphens occur in a row, they must be replaced by a single hyphen;
  • If hyphens happen to be present at the beginning or end of the URL, they must be removed.

If we have several words in a row in the URL, it is advisable to separate them with hyphens instead of merging them, in order to make it easier for users to understand. 

For example:

https://drehi.bg/p-summer-dress-with-blue-dots
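The lowercase and hyphen rules from the last two sections can be sketched in a few lines of Python. This is a minimal example that assumes the text is already in Latin characters; the function name `slugify` is hypothetical.

```python
import re


def slugify(text: str) -> str:
    """Turn a Latin-script title into a lowercase, hyphen-separated slug."""
    slug = text.lower()
    # Any run of characters other than letters and digits becomes ONE hyphen,
    # which also collapses repeated hyphens.
    slug = re.sub(r"[^a-z0-9]+", "-", slug)
    # Remove hyphens left at the beginning or end.
    return slug.strip("-")


print(slugify("  Summer Dress -- with Blue Dots! "))
```

The sample title becomes summer-dress-with-blue-dots.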

5. Slash at the end of URL/

In 2017, John Mueller commented on the importance of the slash (/) at the end of URLs. A slash directly after the root domain does not create a different URL. Elsewhere, however, the slash plays a role (e.g., after a path or file name at the end of the URL). Traditionally, a trailing slash indicates a directory or category rather than a file.

Example:

https://www.sait.bg = https://www.sait.bg/

 but

https://www.sait.bg/something ≠ https://www.sait.bg/something/

Since Google doesn’t give us a clear recommendation whether to use a slash at the end of a URL or not, the advice we can give is to choose one of the two options and stick with it.
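Whichever option we choose, it helps to apply it automatically. Below is a minimal Python sketch of one way to enforce a single trailing-slash policy when generating internal links. The function name `enforce_trailing_slash` and its `trailing` flag are hypothetical, and the sketch assumes page-like paths rather than file names such as page.html.

```python
from urllib.parse import urlsplit, urlunsplit


def enforce_trailing_slash(url: str, trailing: bool = False) -> str:
    """Rewrite the path so every URL follows the same trailing-slash policy."""
    parts = urlsplit(url)
    path = parts.path
    if path in ("", "/"):
        # After the root domain both forms are equivalent; use "/" consistently.
        path = "/"
    elif trailing:
        if not path.endswith("/"):
            path += "/"
    else:
        path = path.rstrip("/")
    return urlunsplit(parts._replace(path=path))


print(enforce_trailing_slash("https://www.sait.bg/something/"))
print(enforce_trailing_slash("https://www.sait.bg/something", trailing=True))
```

The version that is not chosen should still redirect to the chosen one on the server, which is outside the scope of this sketch.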

6. Dates in URLs

Previously, the use of dates in URLs was automatic in some CMSs (like WordPress, for example). Nowadays, dates in URLs are relatively less common and, from an SEO point of view, unnecessary, as they only lengthen the URL without providing any benefit.

7. Geotargeting in URLs

If our website operates in multiple regions it is advisable to use a URL structure that facilitates geotargeting. We have three options:

  • ccTLD;
https://sait.bg

The advantage of ccTLD is clean geotargeting and easy separation of our websites for different countries.

  • Country subdomain;
https://bg.sait.com

Regional subdomains and subdirectories are relatively easy to set up, unlike ccTLDs, which are the most expensive option and sometimes come with a lot of requirements.

  • Country subdirectory.
https://sait.com/bg

Subdirectories are usually the easiest to maintain, but the language versions of our site look so similar that it is often difficult to distinguish one from the other.

8. HTTPS > HTTP

HTTPS (or HTTP over TLS) encrypts the connection between the user and the server and verifies the identity of the site. Back in 2014, Google highlighted the importance of security and User Experience on our site and made HTTPS another ranking factor, because it protects the data exchanged between user and server.

It’s certainly not a shortcut to the TOP 3 of the results page (especially if our site doesn’t rank on the first page at first). Google recommends https:// over http://, and users are more likely to trust sites with HTTPS protocol.

Conclusion

Users are the driving force on the Web, and for this reason Google’s top recommendation when it comes to URL optimization is to optimize for people. It’s a good idea to keep our URLs as short and user-friendly as possible, with a clear structure and no unnecessary parameters, provided that all this is achievable and would bring us more benefits than drawbacks.

It would be too simple if there were not at least one “but”. It is important that before implementing SEF URLs, we objectively evaluate the capabilities of our website’s Content Management System (CMS). It is recommended to optimize URLs if the CMS allows it. Otherwise, instead of producing positive results, implementing user-friendly URLs could cause server load and slow down the site.

Article originally posted by Netpeak Bulgaria as “Оптимизация на URL адреси” (“URL optimization”).
