A Horror Story from the Interwebs: Duplicate Content

I have some scary news, something from the World Wide Webernets…the thing is…it’s not what you think. It’s actually a ghost, so you may not be able to see it at all. You ready? Prepare yourself…

Resolving your www domain

….you may be duplicating your efforts on two totally different websites and not even know it. To you, it would seem like the same site, but in reality you have a ghost in your midst. The answer lies in how Google sees your site. Google may think you have 2 websites even though you should only have one, which means duplicate content and duplication of effort. So what should you do?

Before we begin, however: you may have heard about 301 redirects for pointing one old page at a new one. That is not quite what we are talking about here (although the fix further down does use a 301 behind the scenes). We want to know how to resolve your “www” domain to “non-www” or vice versa…we call this your www resolve, or URL canonicalization.


Which site should your domain resolve to?

Have you ever noticed when you type in a website, a www will either appear, disappear or stay the same? This is because there is a preferred site set by the dev team (at least we hope this is happening). Let’s start with an example…let’s say you type in “stephdokin.com” exactly as shown. Keeping your eyes on the URL, you will see it resolves to the “www.stephdokin.com” version of the site.

So in this case, “stephdokin.com” resolves to “www.stephdokin.com”

or in other words from “non-www” to the “www” version.

This means we only have one website and all our URLs are resolved to the proper domain. The CMO4Hire website is set up the same way. You can also make your resolve go the other way, from the “www” version to the “non-www”…it just depends on your preference. In that case, you would type in www.yourwebsite.com and you will witness the 3 W’s disappear right before your eyes. This means your resolve is set up correctly from “www” to the “non-www” version.

So now that you know how the resolve works, the question remains…do you have your site set up correctly? And more importantly, how do you fix it? It’s your lucky day! Learn how to get rid of that duplicate ghost right here, right now.


Learn how to fix it

First, clear your history and cookies (or use incognito mode), then type your site URL in the browser. Does it resolve from one to the other? (Try both ways.)

Next, just to be sure, run your site through an SEO tool that will confirm your findings from step one. You can request a report here or use other free online tools such as SEO site checker. If your URL checks out in the report and resolves via the live tests, then you have reason to jump for joy. No scary duplicate content issues for you! BUT be sure to check your site in a few different tools to get a second opinion (learn why you should do multiple tests in my last post).
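If you would rather see the raw redirect than trust your eyes on the address bar, the same check can be sketched in a few lines of Python. This is only a rough sketch and not part of any SEO tool: the function names `fetch_redirect` and `resolves_to` are made up for this post, `example.com` is a placeholder for your own domain, and it assumes your server answers a HEAD request the same way it answers a normal page load.

```python
import http.client
from urllib.parse import urlsplit


def fetch_redirect(url, timeout=10):
    """Request `url` once WITHOUT following redirects.

    Returns (status_code, location_header_or_None).
    """
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=timeout)
    try:
        conn.request("HEAD", parts.path or "/")
        resp = conn.getresponse()
        return resp.status, resp.getheader("Location")
    finally:
        conn.close()


def resolves_to(status, location, preferred_host):
    """True if the response is a permanent redirect to `preferred_host`."""
    if status not in (301, 308) or not location:
        return False
    # urlsplit(...).hostname is lowercased and is None for relative URLs
    return urlsplit(location).hostname == preferred_host.lower()


# Example (makes a live request, so run it yourself with your own domain):
# status, location = fetch_redirect("http://example.com/")
# print(status, location, resolves_to(status, location, "www.example.com"))
```

A 200 with no `Location` header on both the “www” and “non-www” versions would suggest neither one is resolving to the other, which is the ghost this post is about.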


BLOG PAUSE — Decide: “which group am I in?”

  1. Group One: “I am 100% certain my domain is resolving correctly” — Great news! But you will also want to skip to the end of this post to learn about another issue…SSL duplicates.
  2. Group 2: “I don’t think, or I know for sure my resolve is not working” — For this second group who has the ghost following them still, read on from here to fix the issue (or send this article to your web guy…or maybe just contact CMO4Hire).

****WARNING**** We will be getting into some advanced stuff here so if you don’t feel comfortable, don’t proceed. Wouldn’t want your website to crash!

BLOG UNPAUSE — “I have decided to continue reading, for learning and/or to potentially implement changes at my own risk”


STEP 1: Decide your website’s fate

Choose whether you are adding the www prefix or removing it. Some have a preference but let’s be honest, it does not matter! I really like double U’s and also W’s. That’s why I choose to keep the www in my domain URLs 😉

STEP 2: Get into cPanel and click all the buttons

Kidding! Don’t click anything except for the following steps (may vary from host to host):

  1. Log in to your host and get access to cPanel.
  2. Show hidden files in the root folder and find your .htaccess file.
  3. Open your .htaccess file and stare at the screen.

STEP 3: Make the www resolve happen

Referencing your decision from step 1, paste the correct code into the file. You have 2 options as we have discussed…

If you want your URLs to show up with the “www” prefix included use this:

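# Add www to any URLs that do not have them:
RewriteEngine on
# Only act when the requested host does not already start with "www."
RewriteCond %{HTTP_HOST} !^www\.
# Permanently (301) redirect to the www host, keeping the requested path
RewriteRule ^(.*)$ http://www.%{HTTP_HOST}/$1 [R=301,L]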

 

If you want your URLs to appear with NO www and http:// only, then use this:

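# Remove www from any URLs that have them:
RewriteEngine on
# Only act when the requested host starts with "www."
RewriteCond %{HTTP_HOST} ^www\.
# Permanently (301) redirect to the bare domain; replace example.com with your own domain
RewriteRule ^(.*)$ http://example.com/$1 [R=301,L]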

 

**Note: ONLY USE ONE OF THE ABOVE CODE SNIPPETS (Depending on your preferred domain redirect).
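If pasting rewrite rules into a live site makes you nervous, you can dry-run the logic first. The Python sketch below only imitates the two snippets above using the same patterns; it is not Apache, and the function names `add_www` and `remove_www` are invented for illustration. It assumes the path arrives without its leading slash, which is how rewrite rules in an .htaccess file see it.

```python
import re

WWW_PREFIX = re.compile(r"^www\.")   # same pattern as the RewriteCond lines
RULE_PATTERN = re.compile(r"^(.*)$")  # same pattern as the RewriteRule lines


def add_www(host, path):
    """Imitate the 'add www' snippet. Returns the redirect target or None."""
    if WWW_PREFIX.search(host):
        return None  # condition !^www\. fails, so no redirect happens
    captured = RULE_PATTERN.match(path).group(1)  # this is $1
    return "http://www.{}/{}".format(host, captured)


def remove_www(host, path, bare_domain="example.com"):
    """Imitate the 'remove www' snippet. Returns the redirect target or None."""
    if not WWW_PREFIX.search(host):
        return None  # condition ^www\. fails, so no redirect happens
    captured = RULE_PATTERN.match(path).group(1)  # this is $1
    return "http://{}/{}".format(bare_domain, captured)


# Try a few hosts and paths before touching the real file
print(add_www("example.com", "blog/my-post"))
print(add_www("www.example.com", "blog/my-post"))
print(remove_www("www.example.com", "blog/my-post"))
```

A result of `None` means the rule would leave that request alone, which is what you want for a host that is already in your preferred form.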

STEP 4: Save and verify

The last few things you need to do:

  • Save the .htaccess file with your pasted code.
  • Test your site to verify the changes.
  • Jump and dance around for at least 5 minutes (helps make sure your resolve ghost is gone for good).

Voila!! Now (theoretically) you have your site properly redirecting to the domain you want. This will avoid duplicate content where Google thinks you have 2 URLs for every single page and post. No more duplicate efforts on 2 separate sites!

Additionally, you will want to set your preferred domain in Google Webmaster Tools/Search Console. This verifies your chosen “preferred domain” with Google. You can reach out to me at any time for steps on how to complete this action: riley[at]cmo4hire.com

We are done talking about duplicate content for this blog’s main topic, now for some BONUS DISCUSSION!!!


Duplicate content via SSL

For this post, we talked about the URL with regard to the “http://www” prefix…there is more to know! What about your SSL layer? We will be doing another post regarding SSL specifics, but for now what you need to know is that a secure site with a verified Secure Sockets Layer (SSL) certificate has “https://” instead of just “http://”.

Notice that “s” in there?

That “s” shows up when a site has an SSL certificate installed…I wanted to bring this up here as you can also run into duplicate content issues if you implement, or have in the past implemented, an SSL certificate. This means you can actually have up to 4 different sites in Google’s eyes.

Using the Stephdokin.com example again, pay close attention to the following URLs:

“https://www.stephdokin.com”

“https://stephdokin.com”

“http://www.stephdokin.com”

“http://stephdokin.com”

As you can see there are 4 different sites, all with the same content! Potentially a pretty spooky duplicate nightmare. Usually, however, the case boils down to duplicate content issues for either one of the following (not both):

  • “non-www” vs. “www” as discussed above, (aka http://mywebsite.com  …VS…  http://www.mywebsite.com)
  • “non-secure” vs. “secure” the slight difference shown here…  http:// …VS… https://

Note: “non-www” vs. “www” duplicate content could happen on a secure site (https) or a non-secure site (http)
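To make the four-way split concrete, here is a small Python sketch that lists the four variants for a domain and maps each one back onto a single preferred URL. The function names `url_variants` and `canonicalize` are made up for this example, the defaults (https plus www) are just one possible preference, and the sketch ignores ports and fragments to stay short.

```python
from urllib.parse import urlsplit, urlunsplit


def url_variants(domain):
    """Return the four scheme/host combinations for a domain."""
    bare = domain.lower()
    if bare.startswith("www."):
        bare = bare[4:]
    return [
        "{}://{}{}".format(scheme, prefix, bare)
        for scheme in ("https", "http")
        for prefix in ("www.", "")
    ]


def canonicalize(url, use_https=True, use_www=True):
    """Map any variant onto the one preferred URL (ports/fragments dropped)."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    if host.startswith("www."):
        host = host[4:]
    if use_www:
        host = "www." + host
    scheme = "https" if use_https else "http"
    return urlunsplit((scheme, host, parts.path or "/", parts.query, ""))


for variant in url_variants("stephdokin.com"):
    print(variant, "->", canonicalize(variant))
```

All four lines end at the same address, which is what a correct resolve should do for every page on the site.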

We need to make sure people are getting to our secure site if we have SSL installed. If you don’t have SSL, then don’t worry about it for now. We do recommend you install SSL eventually, however. Going without it can be a security issue AND there have been some changes in Chrome that flag sites without SSL as “Not secure”. In the past, SSL was only for sites that transfer sensitive data such as personal info or credit card numbers. This all seems to be changing, with the direction being everyone on SSL. If your site does transfer sensitive info, then you should definitely have SSL!


Duplicate Content Eliminated (hopefully)

So as you can see, in some scenarios duplicate content issues can be pretty damn scary. Luckily, we have some experience dealing with this type of issue and can sort out even the most complicated of situations. This post outlines how to fix one of the simpler scenarios. Hope you learned from this post and resolve to make some changes to your site (bad pun). Lastly, I have a little test: let’s see how much you learned. Visit my personal website and tell me which way the resolve points. At this stage you are a topic expert 😉

-Riley Kearl-
