Digital Assets & Content Leadership Exchange | Jan 22-24 | New York

Digital Assets and Content Management

As content velocity increases and the volume of digital assets grows exponentially, maximizing those assets’ value hinges on managing them effectively. This two-part conference will provide innovative insights on how you can effectively monetize and maintain your digital asset infrastructure.

Attend IEN’s Digital Assets and Content Leadership Exchange for practical strategies to enable you to keep up with the increasing complexity and quickening pace of asset management and delivery. Don’t be left behind!

Register for Digital Assets & Content Leadership Exchange

Part I: Strategy & Monetization

  • How to extract greater value from your digital assets
  • The realities of integrating your asset management systems across the enterprise
  • Managing rights data to get the MOST out of content licensing
  • Rich Metadata—Your key to new monetization opportunities
  • Preserving and leveraging your brand history through digital asset management
  • Reduce, Reuse, Recycle: Secrets to achieving a more sustainable creative workflow
  • ‘Off-the-shelf’ vs. ‘build-it-yourself’—what is the best software solution for YOU?

Part II: Implementation & Stewardship

  • Key takeaways from experienced DAM and content stewards on what they’ve learned and what they would have done differently had they known then what they know now
  • Digital content and asset managers as strategic partners—leveraging your department’s value
  • Digital Asset Management—Where did the boundaries around file types come from and are they still useful?
  • The evolution of content and digital asset management strategies and how to meet the new challenges
  • Where and how should digital content be stored to ensure both security and ease of access?
  • Evaluating the digital content and assets marketplace and key considerations when selecting a vendor
  • The future disruption of asset management—what role will AI play in optimizing the way digital content is stored, tagged, recalled, repurposed, monetized, etc.?
  • Which skill sets are essential and how to find the right talent for the cross-functional roles within a mature and fully integrated digital asset management team
  • Increasing enterprise-wide tool use through effective training programs—hear from those who have done this successfully
  • Demonstrating departmental value through key metrics
  • Wringing greater returns out of assets through advances in metadata
  • Addressing the challenges of the media and file types of the near future

Register for Digital Assets & Content Leadership Exchange


How Links in Headers, Footers, Content, and Navigation Can Impact SEO – Whiteboard Friday

Posted by randfish

Which link is more valuable: the one in your nav, or the one in the content of your page? Now, how about if one of those in-content links is an image, and one is text? Not all links are created equal, and getting familiar with the details will help you build a stronger linking structure.


How Links in Headers, Footers, Content, and Navigation Can Impact SEO

Click on the whiteboard image above to open a high-resolution version in a new tab!

Video Transcription

Howdy, Moz fans, and welcome to another edition of Whiteboard Friday. This week we’re going to chat about links in headers and footers, in navigation versus content, and how that can affect both internal and external links and the link equity and link value that they pass to your website or to another website if you’re linking out to them.

So I’m going to use Candy Japan here. They recently crossed $1 million in sales. Very proud of Candy Japan. They sell these nice boxes of random assortments of Japanese candy that come to your house. Their website is actually remarkably simplistic. They have some footer links. They have some links in the content, but not a whole lot else. But I’m going to imagine them with a few more links in here just for our purposes.

It turns out that there are a number of interesting items when it comes to internal linking. So, for example, some on-page links matter more and carry more weight than other kinds. If you are smart and use these across your entire site, you can get some incremental or potentially some significant benefits depending on how you do it.

Do some on-page links matter more than others?

So, first off, good to know that…

I. Content links tend to matter more

…just broadly speaking, than navigation links. That shouldn’t be too surprising, right? If I have a link down here in the content of the page pointing to my Choco Puffs or my Gummies page, that might actually carry more weight in Google’s eyes than if I point to it in my navigation.

Now, this is not universally true, but observably, it seems to be the case. So when something is in the navigation, it’s almost always universally in that navigation. When something is in here, it’s often only specifically in here. So a little tough to tell cause and effect, but we can definitely see this when we get to external links. I’ll talk about that in a sec.

II. Links in footers often get devalued

So if there’s a link that you’ve got in your footer, but you don’t have it in your primary navigation, whether that’s on the side or the top, or in the content of the page, a link down here may not carry as much weight internally. In fact, sometimes it seems to carry almost no weight whatsoever other than just the indexing.

III. More used links may carry more weight

This is a theory for now. But we’ve seen some papers on this, and there has been some hypothesizing in the SEO community that essentially Google is watching as people browse the web, and they can get that data and sort of see that, hey, this is a well-trafficked page. It gets a lot of visits from this other page. This navigation actually seems to get used versus this other navigation, which doesn’t seem to be used.

There are a lot of ways that Google might interpret that data or might collect it. It could be from the size of it or the CSS qualities. It could be from how it appears on the page visually. But regardless, that also seems to be the case.

IV. Most visible links may get more weight

This does seem to be something that’s testable. So if you have very small fonts, very tiny links, they are not nearly as accessible or obvious to visitors. It seems to be the case that they also don’t carry as much weight in Google’s rankings.

V. On pages with multiple links to the same URL

For example, let’s say I’ve got this products link up here at the top, but I also link to my products down here under Other Candies, etc. It turns out that Google will see both links. They both point to the same page in this case, both pointing to the same page over here, but this page will only inherit the value of the anchor text from the first link on the page, not both of them.

So Other Candies, etc., that anchor text will essentially be treated as though it doesn’t exist. Google ignores multiple links to the same URL. This is actually true internal and external. For this reason, if you’re going ahead and trying to stuff in links in your internal content to other pages, thinking that you can get better anchor text value, well look, if they’re already in your navigation, you’re not getting any additional value. Same case if they’re up higher in the content. The second link to them is not carrying the anchor text value.
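To make that concrete, here is a minimal HTML sketch (the URLs and anchor text are illustrative, not taken from any real site):

    <!-- The navigation link appears first in the HTML source -->
    <a href="/products">Products</a>

    <!-- A second link to the same URL, further down in the content -->
    <a href="/products">Other Candies, etc.</a>

In the scenario described above, only the first anchor text ("Products") would be credited; the "Other Candies, etc." anchor would effectively be ignored.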

Can link location/type affect external link impact?

Other items to note on the external side of things and where they’re placed on pages.

I. In-content links are going to be more valuable than footers or nav links

In general, nav links are going to do better than footers. But in content, this primary content area right in here, that is where you’re going to get the most link value if you have the option of where you’re going to get an external link from on a page.

II. What if you have links that open in a new tab or in a new window versus links that open in the same tab, same window?

It doesn’t seem to matter at all. Google does not appear to carry any different weight from the experiments that we’ve seen and the ones we’ve conducted.

III. Text links do seem to perform better, get more weight than image links with alt attributes

They also seem to perform better than JavaScript links and other types of links, but critically important to know this, because many times what you will see is that a website will do something like this. They’ll have an image. This image will be a link that will point off to a page, and then below it they’ll have some sort of caption with keyword-rich anchors down here, and that will also point off. But Google will treat this first link as though it is the one, and it will be the alt attribute of this image that passes the anchor text, unless this is all one href tag, in which case you do get the benefit of the caption as the anchor. So best practice there.
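As a rough sketch of the two patterns described above (the file names, URLs, and caption text are hypothetical):

    <!-- Pattern 1: image link and caption link are separate anchors.
         Only the first link counts, so the image's alt attribute
         ("Choco Puffs") passes as the anchor text. -->
    <a href="/choco-puffs"><img src="/img/choco-puffs.jpg" alt="Choco Puffs"></a>
    <p><a href="/choco-puffs">Delicious Japanese Choco Puffs</a></p>

    <!-- Pattern 2: image and caption wrapped in a single anchor.
         The keyword-rich caption now counts as the anchor text. -->
    <a href="/choco-puffs">
      <img src="/img/choco-puffs.jpg" alt="Choco Puffs">
      <span>Delicious Japanese Choco Puffs</span>
    </a>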

IV. Multiple links from same page — only the first anchor counts

Well, just like with internal links, only the first anchor is going to count. So if I have two links from Candy Japan pointing to me, it’s only the top one that Google sees first in the HTML. So it’s not where it’s organized in the site as it renders visually, but where it comes up in the HTML of the page as Google is rendering that.

V. The same link and anchor on many or most or all pages on a website tends to get you into trouble.

Not always, not universally. Sometimes it can be okay. Is Amazon allowed to link to Whole Foods from their footer? Yes, they are. They’re part of the same company and group and that kind of thing. But if, for example, Amazon were to go crazy spamming and decided to make it “cheap avocados delivered to your home” and put that in the footer of all their pages and point that to the WholeFoods.com/avocadodelivery page, that would probably get penalized, or it may just be devalued. It might not rank at all, or it might not pass any link equity. So notable that in the cases where you have the option of, “Should I get a link on every page of a website? Well, gosh, that sounds like a good deal. I’d pass all this page rank and all this link equity.” No, bad deal.

Instead, far better would be to get a link from a page that’s already linked to by all of these pages, like, hey, if we can get a link from the About page or from the Products page or from the homepage, a link on the homepage, those are all great places to get links. I don’t want a link on every page in the footer or on every page in a sidebar. That tends to get me in trouble, especially if it is anchor text-rich and clearly keyword targeted and trying to manipulate SEO.

All right, everyone. I look forward to your questions. We’ll see you again next week for another edition of Whiteboard Friday. Take care.

Video transcription by Speechpad.com

Sign up for The Moz Top 10, a semimonthly mailer updating you on the top ten hottest pieces of SEO news, tips, and rad links uncovered by the Moz team. Think of it as your exclusive digest of stuff you don’t have time to hunt down but want to read!

Unlocking Hidden Gems Within Schema.org

Posted by alexis-sanders

Schema.org is cryptic. Or at least that’s what I had always thought. To me, it was a confusing source of information: missing the examples I needed, not explaining which item properties search engines require, and overall making the process of implementing structured data a daunting task. However, once I got past Schema.org’s intimidating shell, I found an incredibly useful and empowering tool. Once you know how to leverage it, Schema.org is an indispensable tool within your SEO toolbox.

A structured data toolbox

The first part of any journey is finding the map. In terms of structured data, there are a few different guiding resources:

  • The most prominent and useful are Google’s Structured Data Features Guides. These guides are organized by the different structured data markups Google is explicitly using. Useful examples are provided with required item properties.

    Tip: If any of the item types listed in the feature guides are relevant to your site, ensure that you’re annotating these elements.

  • I also want to share Merkle’s new, free, supercalifragilisticexpialidocious Structured Data Markup Generator. It contains Google’s top markups with an incredibly user-friendly experience and all of the top item properties. This tool is a great support for starting your markups, and it’s great for individuals looking to reverse-engineer markups. It offers JSON-LD and some illustrative microdata markups. You can also send the generated markups directly to Google’s structured data testing tool.

  • If you’re looking to go beyond Google’s recommendations and structure more data, check out Schema.org’s Full Hierarchy. This is a full list of all Schema.org’s core and extended vocabulary (i.e., a list of all item types). This page is very useful to determine additional opportunities for markup that may align with your structured data strategy.

    Tip: Click “Core plus all extensions” to see extended Schema.org’s libraries and what’s in the pipeline.

  • Last but not least is Google’s Structured Data Testing Tool. It is vital to check every markup with GSDTT for two reasons:
    • To avoid silly syntactic mistakes (don’t let commas be your worst enemy — there are way better enemies out there ☺).
    • To ensure all required item properties are included

As an example, I’m going to walk through the Aquarium item type Schema.org markup. For illustrative purposes, I’m going to stick with JSON-LD moving forward; however, if there are any microdata questions, please reach out in the comments.

Basic structure of all Schema.org pages

When you first enter a Schema.org item type’s page, notice that every page has the same layout, starting with the item type name, the canonical reference URL (currently the HTTP version*), where the markup lives within the Schema.org hierarchy, and that item type’s usage on the web.

*Leveraging the HTTPS version of a Schema.org markup is acceptable

What is an item type?

An item type is a piece of Schema.org’s vocabulary of data used to annotate and structure elements on a web page. You can think about it as what you’re marking up.

At the highest level of most Schema.org item types is Thing (alternatively, we’d be looking at DataType). This intuitively makes sense because almost everything is, at its highest level of abstraction, a Thing. The item type Thing has multiple children, all of which inherit Thing’s properties in a cascading, hierarchical fashion (e.g., a Product is a Thing; both can have names, descriptions, and images).

Explore Schema.org’s item types here with the various visualizations:

https://technicalseo.com/seo-tools/schema-markup-generator/visual/

Item types are going to be the first attribute in your markup and will look a little like this (remember this for a little later):
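For instance, a bare-bones JSON-LD sketch declaring the item type would look roughly like this (only the item type is filled in at this stage):

    <script type="application/ld+json">
    {
      "@context": "http://schema.org",
      "@type": "Aquarium"
    }
    </script>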

Tip: Every Schema.org item type can be found by typing its name after Schema.org, i.e. http://schema.org/Aquarium (note that case is important).

Below, this is where things start to get fun — the properties, expected type, and description of each property.

What are item properties?

Item properties are attributes that describe item types (i.e., each one is a property of the item). All item properties are inherited from the parent item type. The value of a property can be a word, URL, or number.

What is the “Expected Type”?

For every item type, there is a column that defines the expected type of each item property. This signals whether or not nesting will be involved. If the expected type is a simple data type (e.g., Text, Number, or URL), you won’t have to do anything; otherwise, get ready for some good, old-fashioned nesting.

One of the things you may have noticed: under “Property” it says “Properties from CivicStructure.” We know that an Aquarium is a child of CivicStructure, as it is listed above. If we scan the page, we see the following “Properties from…”:

This looks strikingly like the hierarchy listed above and it is (just vertical… and backward). Only one thing is missing – where are the “Properties from Aquarium”?

The answer is actually quite simple — Aquarium has no item properties of its own. Therefore, CivicStructure (being the next most specific item type with properties) is listed first.

Structuring this information with more specific properties at the top makes a ton of sense intuitively. When marking up information, we are typically interested in the most specific item properties, ones that are closest conceptually to the thing we’re marking up. These properties are generally the most relevant.

Creating a markup

  1. Open the Schema.org item type page.
  2. Review all item properties and select all relevant attributes.
    • After looking at the documentation, openingHours, address, aggregateRating, telephone, alternateName, description, image, name, and sameAs (social media linking item property) stood out as the most cogent and useful for aquarium goers. In an effort to map out all of the information, I added the “Expected Type” (which will be important in the next step) and the value of the information we’re going to markup.
  3. Add the starting elements of all markup.
    • All markup, whether JSON-LD or microdata, starts with the same set of code/markup. One can memorize this code or leverage examples and copy/paste.
    • JSON-LD: Add the script tag with the JSON-LD type, along with the @context and @type attributes with the item type included (a complete worked example covering steps 3–5 follows this list).
  4. Start light. Add the easier item properties (i.e., the ones that don’t require nesting).
    • First off, how do you tell whether or not the property nests?
      • This is where the “Expected Type” column comes into play.
      • If the “Expected Type” is “Text”, “URL”, or “Number” — you don’t need to nest.
    • I’ve highlighted the item properties that do not require nesting above in green. We’ll start by adding these to our markup.
    • JSON-LD: Contains the item property in quotation marks, along with the value (text and URLs are always in quotation marks). If there are multiple values, they’re listed as arrays within square [brackets].

  5. Finish strong. Add the nested item properties.
    • Nested item properties are item types within item types. Through nesting, we can access the properties of the nested item type.
    • JSON-LD: Nested item properties start off like normal item properties; however, things get weird after the colon. A curly brace opens up a new world. We start by declaring a new item type and thus, inside these curly braces all item properties now belong to the new item type. Note how commas are not included after the last property.
  6. Test in Google’s Structured Data Testing Tool.
    • Looks like we’re all good to go, with no errors and no warnings.
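Putting steps 3 through 5 together, here is a sketch of the complete markup using illustrative values for a hypothetical aquarium (the names, URLs, and numbers are invented; note how the nested PostalAddress and AggregateRating item types open their own curly braces):

    <script type="application/ld+json">
    {
      "@context": "http://schema.org",
      "@type": "Aquarium",
      "name": "Example City Aquarium",
      "alternateName": "ECA",
      "description": "A hypothetical aquarium used to illustrate the markup.",
      "image": "https://www.example.com/images/aquarium.jpg",
      "telephone": "+1-555-555-0100",
      "openingHours": ["Mo-Fr 09:00-17:00", "Sa-Su 10:00-18:00"],
      "sameAs": [
        "https://www.facebook.com/example-aquarium",
        "https://twitter.com/example_aquarium"
      ],
      "address": {
        "@type": "PostalAddress",
        "streetAddress": "123 Example Street",
        "addressLocality": "Springfield",
        "addressRegion": "CA",
        "postalCode": "90210",
        "addressCountry": "US"
      },
      "aggregateRating": {
        "@type": "AggregateRating",
        "ratingValue": "4.6",
        "reviewCount": "120"
      }
    }
    </script>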

Side notes:

  • *address: Google’s documentation lists address, nested within PostalAddress, as a requirement. This is a good indicator of why it’s important to review Google’s documentation.
  • openingHours: Multiple times are listed out in an array (as indicated by the square brackets). As the documentation’s “Description” section mentions, use a hyphen for ranges and 24-hour (military) time.
    • Note: Google’s documentation uses the openingHoursSpecification item property, which nests OpeningHoursSpecification. This is a good example where Google documentation shows a more specific experience to consider.
  • telephone: Sometimes you need to add a country code (+1) for phone numbers.
  • image: URLs must be absolute (i.e., protocol and domain name included).

TL;DR:

  • Schema.org’s documentation can be leveraged to supplement Google’s structured data documentation
  • The “Expected Type” on Schema.org tells you when you need to nest an item type
  • Check out Merkle’s Structured Data Markup Generator if you want to try simply inserting values and getting a preliminary markup

Thanks!

A huge thanks to Max Prin (@maxxeight), Adam Audette (@audette), and the @MerkleCRM team for reviewing this article. Plus, shout outs to Max (again), Steve Valenza (#TwitterlessSteve), and Eric Hammond (@elhammond) for their work, ideas, and thought leadership that went into the Schema Generator Tool!

Sign up for The Moz Top 10, a semimonthly mailer updating you on the top ten hottest pieces of SEO news, tips, and rad links uncovered by the Moz team. Think of it as your exclusive digest of stuff you don’t have time to hunt down but want to read!

How to Use Keywords Effectively for SEO and More

Search engines find keywords in different elements of a page and use them to determine whether or not the page should be ranked in certain results. Proper use of keywords will get your page indexed for specific searches but does not guarantee placement or rank within that search. There are also some common keyword mistakes to avoid.

Each page should target a tight collection of keywords. In my opinion, you should not have a page that targets more than 3 to 5 keywords, and those should be related to one another. So ‘mailing list’ and ‘direct marketing list’ are related to one another subject-wise and could be used in coordination with one another on the page.

Your page should focus on great content that drives conversions, not concentrate on stuffing keywords throughout that content. The natural use of the keywords is most important – so that search engines see the keywords but visitors don’t necessarily see them. Content drives conversions (sales) – so write well!

Where to Research Keywords

The only tools that I use anymore for keyword research are SEMRush and BuzzSumo. BuzzSumo provides insight on content popularity and SEMRush provides insight on content rankings… the two aren’t always the same. Aside from a plethora of audit and ranking tools, SEMRush just does an incredible job at identifying the most important keywords for your business. Here are a few ways that I use the tool:

  • Domain Keywords – I run reports on the client to identify keywords they may already be ranking on and determine whether there are strategies like content changes and promotion I can deploy that will improve their ranking.
  • Related Keywords – When I find keywords that I wish to target, I run related keyword reports to identify other combinations of relevant keywords that I may be able to garner better ranking on.
  • Gap Analysis – SEMRush has a really great feature where you can compare multiple domains and identify where you’re competing with other domains. We often identify keywords our clients’ competitors are ranking on that we haven’t pursued.

How to Use Keywords Effectively on your site for SEO

  1. Domain – if your domain name has keywords, it’s great. If not, that’s alright as well. And don’t worry about registering the domain for 10 years to prove to Google that it’s not a spam site – domain registration length is an SEO myth. However, a young domain will have less authority than one that may have been utilized in the past for similar terms. Before you look for a fresh domain, check out some auctions on other relevant domains… you could get a head start if you’re just starting out!
  2. Home Page Title Tag – be sure your homepage title tag has a few of the terms that you’re after and places them before your company name.
  3. Title Tag – each independent page should have the keywords that the content of that page focuses on.
  4. Meta Tags – the meta keywords tag is ignored by search engines, and keywords in your meta description don’t influence ranking either. However, when someone searches for a specific keyword, it’s bolded in the search engine results page, so a searcher may be more likely to click on your result if the keyword appears in your description.
  5. Heading Tags – in HTML, there are headings and subheadings. These are specifically <h1>, <h2>, <h3> tags in that order of importance. Search engines pay attention to these tags and it’s important that you pay attention to them as well as you create pages and utilize keywords. For blog posts, utilize keywords in your blog post titles. Avoid using <h1>, <h2>, or <h3> tags in your sidebar.
  6. Bold and Italics – bold or italicize your keywords on the page so that they stand out.
  7. Image Alt and Description – when you utilize an image (recommended) within your site pages or posts, be sure to utilize keywords effectively in the image alt or description tags.
    <img src='domain.com/image.jpg' alt='The KEYWORD' />
    Your content management system should allow for this.
  8. Internal Links – if you make mention of other posts or pages within your site, be sure to utilize keywords effectively in the anchor text of the link to that content and in the title attribute of the anchor tag.
    <a href='domain.com/myotherpage.html' title='The KEYWORD'>More keywords</a>
    Avoid using generic terms like ‘read more’ or ‘click here’.
  9. First Words of Content – the first words on your page or post should include keywords relevant to the content within that page.
  10. Top of Page – Search engines view a page and analyze the content from top to bottom; the top of the page is the most important content and the bottom of the page is the least important. If you have a columnar layout, check with the company that designed your theme and ensure that columns appear lower in your HTML than the body of your content (many themes put the sidebar first!) – see the sketch after this list.
  11. Repeated Usage – also known as keyword density; it’s important to utilize keywords naturally within your content. The search engines are getting much more sophisticated at finding relevant terms, so you do not have to repeat the same exact phrase. Always work on ensuring your content is natural and compelling. While over-optimized content may get you found, it won’t get you sold!
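Here is the sketch referenced in points 5 and 10 above: a hypothetical page skeleton showing the heading hierarchy and the main content placed before the sidebar in the HTML source (the class names and copy are invented):

    <body>
      <!-- Main content comes first in the HTML source -->
      <main>
        <h1>Direct Marketing Lists for Small Businesses</h1>
        <p>Opening paragraph that works the keywords in naturally...</p>
        <h2>How to Build a Mailing List</h2>
        <p>Supporting content...</p>
      </main>

      <!-- The sidebar appears later in the source, even if CSS displays it
           beside the content; no h1-h3 tags in here -->
      <aside class="sidebar">
        <p>Related posts, categories, etc.</p>
      </aside>
    </body>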

Here’s one other note… keywords don’t have to match. Co-occurrence words and synonyms are just as important and can actually get your content found in a greater array of search combinations if you use them. In the example of this post, I use terms like keyword usage, but I also use terms like SEO, keyword density, content, title tags… all terms that are relevant to the topic but may make this post get found for more combinations.

It’s also important to note that users of search engines are typing much longer combinations of keywords – including questions and other phrases to narrow the results. So a keyword isn’t limited to a one- or two-word combination; it may be an entire sentence! And we’ve found that the longer the combination, the better the match, the more relevant the traffic – and the more likely the visitor will convert.

If you can get external links with keywords back to your site, even better! This post was simply about on-site keyword usage, though.

Keywords hold extreme importance for a business whose websites are a crucial extension of their operations. When used properly, keyword-optimised websites can result in better search results and increase traffic to an online store. Ultimately, it also helps a business attract prospects with a higher chance of converting into paying customers. Healthy Business Builder

Here’s an infographic from Healthy Business Builder, Why Are Keywords So Important for Your Online Sales:

Keyword Usage Infographic

Disclosure: I am using my affiliate link for SEMRush in the article.


The Way We Read Work Email is Changing

Email Behavior Changes

In a world where more email is sent than ever (up 53% from 2014), understanding what kinds of messages are sent, and when they are sent, is both useful and important. Like many of you, my inbox is out of control. When I read about inbox zero, I can’t help but be a little pessimistic, given the sheer volume of email and the manner in which it has to be responded to.

In fact, if it weren’t for SaneBox and MailButler (using my referral links there), I’m not sure how I’d handle my email. SaneBox does a fantastic job of learning which of my emails require immediate attention, while MailButler offers me the opportunity to delay responses and snooze emails, and enhances Apple Mail with a number of other features.

What both platforms have in common is that they move my messages between folders. I’m not limited to just an Inbox, Junk folder, and Trash anymore… these systems are routing messages in and out of several other folders. While these are great tools for me, they must wreak havoc on the email metrics of the senders who are trying to reach me. Email behavior is changing, and these tools are just one example of how.

To research email behavior changes, ReachMail recently surveyed 1,000 people to learn how they manage their inboxes. Some key findings:

  • Morning Email – 71% of Americans check email for the first time between 5 a.m. and 9 a.m. New York and New Jersey average the latest first check—just before 9 a.m.—and people in Utah check earliest, just after 6:30 a.m., on average.
  • Evening Email – 30% of Americans check before 6 p.m. and 70% after 6 p.m. 46% of Virginians check their email for the last time between 9 p.m. and midnight, while 13% more finish up after midnight. Not to be outdone, 71% of Tennesseans are fellow night owls, checking their email after 9 p.m., and just 12% check last before 6 p.m., well below the national average.
  • Sending Emails – Nearly half of all Americans (46%) send fewer than 10 emails per day. 30% of people send 10 to 25 emails per day, 16% send 25 to 50, and 8% send more than 50 emails per day. The West has the lowest average of sent emails, at 18 per day. The Northeast tops all regions and averages 22 sent emails per day, while Massachusetts has the national high of 28 emails sent per day, on average.
  • Response Time –  58% of Americans say they respond to emails within one hour. 26% respond within one to six hours, 11% respond within six to 24 hours and the remaining 5% respond after 24 hours, on average. Virginians report the quickest email replies with an average response time of just over two hours. New Yorkers, surprisingly, are on the slow end—12% say they average a day or more to respond and 33% take at least six hours.
  • Unread Emails – Over half of Americans have fewer than 10 unread emails in their work inbox. 26% report having fewer than 50 unread emails, 13% have more than 100, and 6% have between 50 and 100. South Carolina reports the most unread emails, with an average of 29, while a whopping 30% of Tennesseans report having more than 100 unread emails. The Midwest has the fewest, with an average of 17.

ReachMail produced this infographic: American Inbox 2: The Reckoning to illustrate the changes.

Work Email Trends Infographic


Does Googlebot Support HTTP/2? Challenging Google’s Indexing Claims – An Experiment

Posted by goralewicz

I was recently challenged with a question from a client, Robert, who runs a small PR firm and needed to optimize a client’s website. His question inspired me to run a small experiment in HTTP protocols. So what was Robert’s question? He asked…

Can Googlebot crawl using HTTP/2 protocols?

You may be asking yourself, why should I care about Robert and his HTTP protocols?

As a refresher, HTTP protocols are the basic set of standards that allow the World Wide Web to exchange information. They are the reason a web browser can display data stored on another server. The first version was initiated back in 1989, which means that, just like everything else, HTTP protocols are getting outdated. HTTP/2 is the latest version of the HTTP protocol, created to replace its aging predecessors.

So, back to our question: why do you, as an SEO, care to know more about HTTP protocols? The short answer is that none of your SEO efforts matter or can even be done without a basic understanding of HTTP protocol. Robert knew that if his site wasn’t indexing correctly, his client would miss out on valuable web traffic from searches.

The hype around HTTP/2

HTTP/1.1 is a 17-year-old protocol (HTTP 1.0 is 21 years old). Both HTTP 1.0 and 1.1 have limitations, mostly related to performance. When HTTP/1.1 was getting too slow and out of date, Google introduced SPDY in 2009, which was the basis for HTTP/2. Side note: Starting from Chrome 53, Google decided to stop supporting SPDY in favor of HTTP/2.

HTTP/2 was a long-awaited protocol. Its main goal is to improve a website’s performance. It’s currently used by 17% of websites (as of September 2017). Adoption rate is growing rapidly, as only 10% of websites were using HTTP/2 in January 2017. You can see the adoption rate charts here. HTTP/2 is getting more and more popular, and is widely supported by modern browsers (like Chrome or Firefox) and web servers (including Apache, Nginx, and IIS).

Its key advantages are:

  • Multiplexing: The ability to send multiple requests through a single TCP connection.
  • Server push: When a client requires some resource (let’s say, an HTML document), a server can push CSS and JS files to a client cache. It reduces network latency and round-trips.
  • One connection per origin: With HTTP/2, only one connection is needed to load the website.
  • Stream prioritization: Requests (streams) are assigned a priority from 1 to 256 to deliver higher-priority resources faster.
  • Binary framing layer: HTTP/2 is easier to parse (for both the server and user).
  • Header compression: This feature reduces overhead from plain text in HTTP/1.1 and improves performance.

For more information, I highly recommend reading “Introduction to HTTP/2” by Surma and Ilya Grigorik.

All these benefits suggest pushing for HTTP/2 support as soon as possible. However, my experience with technical SEO has taught me to double-check and experiment with solutions that might affect our SEO efforts.

So the question is: Does Googlebot support HTTP/2?

Google’s promises

HTTP/2 represents a promised land, the technical SEO oasis everyone was searching for. By now, many websites have already added HTTP/2 support, and developers don’t want to optimize for HTTP/1.1 anymore. Before I could answer Robert’s question, I needed to know whether or not Googlebot supported HTTP/2-only crawling.

I was not alone in my query. This is a topic which comes up often on Twitter, Google Hangouts, and other such forums. And like Robert, I had clients pressing me for answers. The experiment needed to happen. Below I’ll lay out exactly how we arrived at our answer, but here’s the spoiler: it doesn’t. Google doesn’t crawl using the HTTP/2 protocol. If your website uses HTTP/2, you need to make sure you continue to optimize the HTTP/1.1 version for crawling purposes.

The question

It all started with a Google Hangout in November 2015.

When asked about HTTP/2 support, John Mueller mentioned that HTTP/2-only crawling should be ready by early 2016, and he also mentioned that HTTP/2 would make it easier for Googlebot to crawl pages by bundling requests (images, JS, and CSS could be downloaded with a single bundled request).

“At the moment, Google doesn’t support HTTP/2-only crawling (…) We are working on that, I suspect it will be ready by the end of this year (2015) or early next year (2016) (…) One of the big advantages of HTTP/2 is that you can bundle requests, so if you are looking at a page and it has a bunch of embedded images, CSS, JavaScript files, theoretically you can make one request for all of those files and get everything together. So that would make it a little bit easier to crawl pages while we are rendering them for example.”

Soon after, Twitter user Kai Spriestersbach also asked about HTTP/2 support:

His clients had started dropping HTTP/1.1 connection optimization, just like most developers deploying HTTP/2, which was at the time supported by all major browsers.

After a few quiet months, Google Webmasters reignited the conversation, tweeting that Google won’t hold you back if you’re setting up for HTTP/2. At this time, however, we still had no definitive word on HTTP/2-only crawling. Just because it won’t hold you back doesn’t mean it can handle it — which is why I decided to test the hypothesis.

The experiment

For months, as I was following this online debate, I continued to receive questions from clients who no longer wanted to spend money on HTTP/1.1 optimization. Thus, I decided to create a very simple (and bold) experiment.

I decided to disable HTTP/1.1 on my own website (https://goralewicz.com) and make it HTTP/2 only. I disabled HTTP/1.1 from March 7th until March 13th.

If you’re going to get bad news, at the very least it should come quickly. I didn’t have to wait long to see if my experiment “took.” Very shortly after disabling HTTP/1.1, I couldn’t fetch and render my website in Google Search Console; I was getting an error every time.

My website is fairly small, but I could clearly see that the crawling stats decreased after disabling HTTP/1.1. Google was no longer visiting my site.

While I could have kept going, I stopped the experiment after my website was partially de-indexed due to “Access Denied” errors.

The results

I didn’t need any more information; the proof was right there. Googlebot wasn’t supporting HTTP/2-only crawling. Should you choose to duplicate this at home with your own site, you’ll be happy to know that my site recovered very quickly.

I finally had Robert’s answer, but felt others may benefit from it as well. A few weeks after finishing my experiment, I decided to ask John about HTTP/2 crawling on Twitter and see what he had to say.

(I love that he responds.)

Knowing the results of my experiment, I have to agree with John: disabling HTTP/1 was a bad idea. However, I was seeing other developers discontinuing optimization for HTTP/1, which is why I wanted to test HTTP/2 on its own.

For those looking to run their own experiment, there are two ways of negotiating an HTTP/2 connection:

1. Over HTTP (insecure) – Make an HTTP/1.1 request that includes an Upgrade header (a sketch of such a request follows after this list). This seems to be the method to which John Mueller was referring. However, it doesn’t apply to my website (because it’s served via HTTPS). What’s more, this is an old-fashioned way of negotiating, not supported by modern browsers. Below is a screenshot from Caniuse.com:

2. Over HTTPS (secure) – Connection is negotiated via the ALPN protocol (HTTP/1.1 is not involved in this process). This method is preferred and widely supported by modern browsers and servers.
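For readers who want to see the difference concretely, the plaintext upgrade from method 1 is a normal HTTP/1.1 request carrying Upgrade headers, roughly like this (the host and the settings value are placeholders):

    GET / HTTP/1.1
    Host: example.com
    Connection: Upgrade, HTTP2-Settings
    Upgrade: h2c
    HTTP2-Settings: <base64url-encoded SETTINGS payload>

A server willing to switch replies with an HTTP/1.1 101 Switching Protocols response and continues over cleartext HTTP/2 (h2c). The HTTPS method in point 2 skips this extra round trip because the protocol is agreed during the TLS handshake via ALPN.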

A recent announcement: The saga continues

Googlebot doesn’t make HTTP/2 requests

Fortunately, Ilya Grigorik, a web performance engineer at Google, let everyone peek behind the curtains at how Googlebot is crawling websites and the technology behind it:

If that wasn’t enough, Googlebot doesn’t support the WebSocket protocol. That means your server can’t send resources to Googlebot before they are requested. Supporting it wouldn’t reduce network latency and round-trips; it would simply slow everything down. Modern browsers offer many ways of loading content, including WebRTC, WebSockets, loading local content from drive, etc. However, Googlebot supports only HTTP/FTP, with or without Transport Layer Security (TLS).

Googlebot supports SPDY

During my research and after John Mueller’s feedback, I decided to consult an HTTP/2 expert. I contacted Peter Nikolow of Mobilio, and asked him to see if there was anything we could do to find the final answer regarding Googlebot’s HTTP/2 support. Not only did he provide us with help, Peter even created an experiment for us to use. Its results are pretty straightforward: Googlebot supports the SPDY protocol and Next Protocol Negotiation (NPN), and thus it can’t support HTTP/2.

Below is Peter’s response:


I performed an experiment that shows Googlebot uses SPDY protocol. Because it supports SPDY + NPN, it cannot support HTTP/2. There are many cons to continued support of SPDY:

    1. This protocol is vulnerable
    2. Google Chrome no longer supports SPDY in favor of HTTP/2
    3. Servers have been neglecting to support SPDY. Let’s examine the NGINX example: from version 1.9.5, they no longer support SPDY.
    4. Apache doesn’t support SPDY out of the box. You need to install mod_spdy, which is provided by Google.

To examine Googlebot and the protocols it uses, I took advantage of s_server, a tool that can debug TLS connections. I used Google Search Console Fetch and Render to send Googlebot to my website.

Here’s a screenshot from this tool showing that Googlebot is using Next Protocol Negotiation (and therefore SPDY):

I’ll briefly explain how you can perform your own test. The first thing you should know is that you can’t use scripting languages (like PHP or Python) for debugging TLS handshakes. The reason for that is simple: these languages see HTTP-level data only. Instead, you should use special tools for debugging TLS handshakes, such as s_server.

Type in the console:

sudo openssl s_server -key key.pem -cert cert.pem -accept 443 -WWW -tlsextdebug -state -msg
sudo openssl s_server -key key.pem -cert cert.pem -accept 443 -www -tlsextdebug -state -msg

Please note the slight (but significant) difference between the “-WWW” and “-www” options in these commands. You can find more about their purpose in the s_server documentation.

Next, invite Googlebot to visit your site by entering the URL in Google Search Console Fetch and Render or in the Google mobile tester.

As I wrote above, there is no logical reason why Googlebot still supports SPDY. This protocol is vulnerable; no modern browser supports it. Additionally, servers (including NGINX) have dropped support for it. It’s just a matter of time until Googlebot is able to crawl using HTTP/2. Just implement HTTP/1.1 + HTTP/2 support on your own server (your users will notice the faster loading) and wait until Google is able to send requests using HTTP/2.


Summary

In November 2015, John Mueller said he expected Googlebot to crawl websites by sending HTTP/2 requests starting in early 2016. We don’t know why, as of October 2017, that hasn’t happened yet.

What we do know is that Googlebot doesn’t support HTTP/2. It still crawls by sending HTTP/1.1 requests. Both this experiment and the “Rendering on Google Search” page confirm it. (If you’d like to know more about the technology behind Googlebot, then you should check out what they recently shared.)

For now, it seems we have to accept the status quo. We recommended that Robert (and you readers as well) enable HTTP/2 on your websites for better performance, but continue optimizing for HTTP/1.1. Your visitors will notice and thank you.

Sign up for The Moz Top 10, a semimonthly mailer updating you on the top ten hottest pieces of SEO news, tips, and rad links uncovered by the Moz team. Think of it as your exclusive digest of stuff you don’t have time to hunt down but want to read!

Google Shares Details About the Technology Behind Googlebot

Posted by goralewicz

Crawling and indexing has been a hot topic over the last few years. As soon as Google launched Google Panda, people rushed to their server logs and crawling stats and began fixing their index bloat. All those problems didn’t exist in the “SEO = backlinks” era from a few years ago. With this exponential growth of technical SEO, we need to get more and more technical. That being said, we still don’t know how exactly Google crawls our websites. Many SEOs still can’t tell the difference between crawling and indexing.

The biggest problem, though, is that when we want to troubleshoot indexing problems, the only tools in our arsenal are Google Search Console and the Fetch and Render tool. Once your website includes more than HTML and CSS, there’s a lot of guesswork about how your content will be indexed by Google. This approach is risky, expensive, and can fail multiple times. Even when you discover the pieces of your website that weren’t indexed properly, it’s extremely difficult to get to the bottom of the problem and find the fragments of code responsible for the indexing problems.

Fortunately, this is about to change. Recently, Ilya Grigorik from Google shared one of the most valuable insights into how crawlers work:

Interestingly, this tweet didn’t get nearly as much attention as I would expect.

So what does Ilya’s revelation in this tweet mean for SEOs?

Knowing that Chrome 41 is the technology behind the Web Rendering Service (WRS) is a game-changer. Before this announcement, our only solution was to use Fetch and Render in Google Search Console to see our page as rendered by the WRS. This means we can now troubleshoot technical problems that would otherwise have required experimenting and creating staging environments. Now, all you need to do is download and install Chrome 41 to see how your website loads in the browser. That’s it.

You can check the features and capabilities that Chrome 41 supports by visiting Caniuse.com or Chromestatus.com (Googlebot should support similar features). These two websites make a developer’s life much easier.

Even though we don’t know exactly which version Ilya had in mind, we can find Chrome’s version used by the WRS by looking at the server logs. It’s Chrome 41.0.2272.118.

It will be updated sometime in the future

Chrome 41 was created two years ago (in 2015), so it’s far removed from the current version of the browser. However, as Ilya Grigorik said, an update is coming:

I was lucky enough to get Ilya Grigorik to read this article before it was published, and he provided a ton of valuable feedback on this topic. He mentioned that they are hoping to have the WRS updated by 2018. Fingers crossed!

Google uses Chrome 41 for rendering. What does that mean?

We now have some interesting information about how Google renders websites. But what does that mean, practically, for site developers and their clients? Does this mean we can now ignore server-side rendering and deploy client-rendered, JavaScript-rich websites?

Not so fast. Here is what Ilya Grigorik had to say in response to this question:

We now know WRS’ capabilities for rendering JavaScript and how to debug them. However, remember that not all crawlers support JavaScript crawling. Also, as of today, JavaScript crawling is only supported by Google and Ask (Ask is most likely powered by Google). Even if you don’t care about social media or search engines other than Google, one more thing to remember is that even with Chrome 41, not all JavaScript frameworks can be indexed by Google (read more about JavaScript frameworks crawling and indexing). This lets us troubleshoot and better diagnose problems.

Don’t get your hopes up

All that said, there are a few reasons to keep your excitement at bay.

Remember that version 41 of Chrome is over two years old. It may not work very well with modern JavaScript frameworks. To test it yourself, open http://jsseo.expert/polymer/ using Chrome 41, and then open it in any up-to-date browser you are using.

The page in Chrome 41 looks like this:

The content parsed by Polymer is invisible (meaning it wasn’t processed correctly). This is also a perfect example for troubleshooting potential indexing issues. The problem you’re seeing above can be solved if diagnosed properly. Let me quote Ilya:

“If you look at the raised Javascript error under the hood, the test page is throwing an error due to unsupported (in M41) ES6 syntax. You can test this yourself in M41, or use the debug snippet we provided in the blog post to log the error into the DOM to see it.”

I believe this is another powerful tool for web developers willing to make their JavaScript websites indexable. We will definitely expand our experiment and work with Ilya’s feedback.

The Fetch and Render tool is the Chrome v. 41 preview

There’s another interesting thing about Chrome 41. Google Search Console’s Fetch and Render tool is simply a Chrome 41 preview. The right-hand-side view (“This is how a visitor to your website would have seen the page”) is generated by the Google Search Console bot, which is… Chrome 41.0.2272.118 (see screenshot below).


There’s evidence that both Googlebot and Google Search Console Bot render pages using Chrome 41. Still, we don’t exactly know what the differences between them are. One noticeable difference is that the Google Search Console bot doesn’t respect the robots.txt file. There may be more, but for the time being, we’re not able to point them out.

Chrome 41 vs Fetch as Google: A word of caution

Chrome 41 is a great tool for debugging Googlebot. However, sometimes (not often) there’s a situation in which Chrome 41 renders a page properly, but the screenshots from Google Fetch and Render suggest that Google can’t handle the page. It could be caused by CSS animations and transitions, Googlebot timeouts, or the usage of features that Googlebot doesn’t support. Let me show you an example.

Chrome 41 preview:

Image blurred for privacy

The above page has quite a lot of content and images, but it looks completely different in Google Search Console.

Google Search Console preview for the same URL:

As you can see, Google Search Console’s preview of this URL is completely different than what you saw on the previous screenshot (Chrome 41). All the content is gone and all we can see is the search bar.

From what we noticed, Google Search Console renders CSS a little bit differently than Chrome 41. This doesn’t happen often, but as with most tools, we need to double-check whenever possible.

This leads us to a question…

What features are supported by Googlebot and WRS?

According to the Rendering on Google Search guide:

  • Googlebot doesn’t support IndexedDB, WebSQL, and WebGL.
  • HTTP cookies and local storage, as well as session storage, are cleared between page loads.
  • All features requiring user permissions (like Notifications API, clipboard, push, device-info) are disabled.
  • Google can’t index 3D and VR content.
  • Googlebot only supports HTTP/1.1 crawling.

The last point is really interesting. Despite statements from Google over the last 2 years, Google still only crawls using HTTP/1.1.

No HTTP/2 support (still)

We’ve mostly been covering how Googlebot uses Chrome, but there’s another recent discovery to keep in mind.

There is still no support for HTTP/2 for Googlebot.

Since it’s now clear that Googlebot doesn’t support HTTP/2, this means that if your website supports HTTP/2, you can’t drop HTTP/1.1 optimization. Googlebot can crawl only using HTTP/1.1.

There were several announcements recently regarding Google’s HTTP/2 support. To read more about it, check out my HTTP/2 experiment here on the Moz Blog.

Via https://developers.google.com/search/docs/guides/r…

Googlebot’s future

Rumor has it that Chrome 59’s headless mode was created for Googlebot, or at least that it was discussed during the design process. It’s hard to say if any of this chatter is true, but if it is, it means that to some extent, Googlebot will “see” the website in the same way as regular Internet users.

This would definitely make everything simpler for developers who wouldn’t have to worry about Googlebot’s ability to crawl even the most complex websites.

Chrome 41 vs. Googlebot’s crawling efficiency

Chrome 41 is a powerful tool for debugging JavaScript crawling and indexing. However, it’s crucial not to jump on the hype train here and start launching websites that “pass the Chrome 41 test.”

Even if Googlebot can “see” our website, there are many other factors that will affect your site’s crawling efficiency. As an example, we already have proof showing that Googlebot can crawl and index JavaScript and many JavaScript frameworks. It doesn’t mean that JavaScript is great for SEO. I gathered significant evidence showing that JavaScript pages aren’t crawled even half as effectively as HTML-based pages.

In summary

Ilya Grigorik’s tweet sheds more light on how Google crawls pages and, thanks to that, we don’t have to build experiments for every feature we’re testing — we can use Chrome 41 for debugging instead. This simple step will definitely save a lot of websites from indexing problems, like when Hulu.com’s JavaScript SEO backfired.

It’s safe to assume that Chrome 41 will now be a part of every SEO’s toolset.

Sign up for The Moz Top 10, a semimonthly mailer updating you on the top ten hottest pieces of SEO news, tips, and rad links uncovered by the Moz team. Think of it as your exclusive digest of stuff you don’t have time to hunt down but want to read!