Link rot has become one of those old-school phrases Web marketers don’t discuss much any more. Nonetheless, it still occurs and wreaks havoc on search indexes that use linking relationships in their algorithms. But while marketers don’t say much about link rot any more, librarians and academics have a few things to say.
A recent blog post at Harvard University discusses a subscription service called Perma.cc. I’m not sure there is a legitimate need for the service, which creates “permanent” archive copies of Web content. Isn’t that what Archive.Org’s mission is?
The Harvard blog post cites several recent studies which found that link rot has eaten away at many articles. As much as 75% of the original links included in old articles no longer exist, either leading nowhere or replaced by newer content which could be substantially different from whatever was originally pointed to.
The World Wide Web is forgetting the past in unexpected ways. After all, universities and government Websites are expected to preserve their content longer than, say, a small business Website.
As a Blogger, I Link to Many Websites
The Internet was created for the free dissemination and preservation of information. The World Wide Web was created to link a lot of that information together in an easy, convenient interface. Unfortunately, the cost of maintaining all these archives has become prohibitively expensive. As businesses fail, universities change their priorities, and government budgets bob up and down, untold mountains of information have been lost forever.
I often follow links on Websites to find original sources of information for my own articles. And those sources are rarely easy to find any more, at least in articles that are more than a few years old.
As a Website owner and manager I spend a lot of time reviewing content I’ve written over the years. When I find that the links no longer work, I try to replace them with comparable destinations.
I begin by looking for the original content – maybe it was only moved without adding a redirect. Maybe the redirects are gone because the original domains have been taken offline.
Too often I find myself replacing old links with links to copies of original content on Archive.Org. But even that vast repository of Web history fails me more often than I care to count.
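For readers who want to make that Archive.Org substitution less tedious, here is a minimal sketch of how a replacement address could be built. It assumes the Wayback Machine's usual address layout (a timestamp followed by the original URL) and its public availability lookup. The function names are made up for illustration, and the sketch makes no network calls, so it cannot tell you whether a capture actually exists.

```python
from urllib.parse import urlencode

# Assumed address layouts for the Wayback Machine and its availability lookup.
WAYBACK_PREFIX = "https://web.archive.org/web"
AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"


def wayback_url(original_url: str, timestamp: str) -> str:
    """Build an archive address for original_url near the given timestamp.

    timestamp holds 4 to 14 digits in YYYYMMDDhhmmss order, for example
    "2007" or "20070315". The archive is expected to redirect to the
    capture closest to that moment, if it holds one.
    """
    if not (timestamp.isdigit() and 4 <= len(timestamp) <= 14):
        raise ValueError("timestamp must be 4 to 14 digits (YYYYMMDDhhmmss)")
    return f"{WAYBACK_PREFIX}/{timestamp}/{original_url}"


def availability_query(original_url: str, timestamp: str) -> str:
    """Build the lookup address that asks which capture is closest."""
    query = urlencode({"url": original_url, "timestamp": timestamp})
    return f"{AVAILABILITY_ENDPOINT}?{query}"


if __name__ == "__main__":
    dead_link = "http://example.com/old-article"  # placeholder URL
    print(wayback_url(dead_link, "2007"))
    print(availability_query(dead_link, "20070101"))
```

Picking a timestamp close to the date the post was written gives the best chance that the archived copy matches what was originally linked.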
The Web That Was is constantly dying, and we have no hope of resurrecting it. Website owners take no responsibility for preserving content that was once valuable.
Government Websites are Among the Worst Offenders
The U.S. Government is required by law to preserve a lot of information. And yet even in recent years archivists have struggled to identify all the information that should be preserved. The Trump administration has systematically deleted or destroyed a lot of data that is supposed to be preserved. Worse yet, the administration has also taken down thousands – perhaps more – of Web pages and archive files from government Websites that were created by scientists.
The deliberate and willful mismanagement of public data by the 2017-2021 administration is criminal and morally irresponsible, but don’t expect anyone to be held accountable for it. After all, the U.S. government loses or misplaces information as easily as a great lumbering bear crushes camp equipment when scavenging for human food.
University Websites Rank Second for Worst Managers of Web Content
I used to believe it would be safe to link to content on university blogs. I only reluctantly link to such articles now. Universities remain great sources of information, but they have no standards for publishing and preserving information on the Web.
Oh, some of their administrators will beg to differ – but ask them where old articles published by former staff went and they’ll deafen you with silence. Ask them where press releases announcing once important research are archived and they’ll mumble about budget cuts and the expense of maintaining information no one is interested in any more.
It’s hard enough to find good scientific research (Google certainly doesn’t make that easy). It’s even more challenging to find old scientific research. If you scan the archives on Science Daily, you’ll find that many of their older articles no longer link to the original sources – or their links are dead. I know this to be true because I sift through thousands of old news stories every year, looking for original sources of information about scientific research.
When academic staff – teachers, professors – leave their old schools for other positions, their old personal accounts are taken offline. Many fantastic research papers simply wink out of existence because they were only published on those personal accounts.
Yes, there are open source pre-review archives now, but they don’t always fill in the gaps.
News Websites Are Rapidly Decaying, Too
A lot of what passes for “news” isn’t. In fact, I still don’t understand why so many American news agencies have wasted millions of dollars reporting on Tweets from various random people, ignorant politicians, and diabolical hacker teams. Those Tweets aren’t news, aren’t newsworthy, and aren’t likely to survive because of the ways news sites embed them.
Merely embedding a Tweet means that if one doesn’t like being quoted by the news media, one need only delete the Tweet. And, yes, I’ve come across many news stories that make no sense because the embedded media they were written about no longer exists. Tweets, videos, and other things once available have vanished.
So the news Web is now pock-marked with angry, irrational, or desperate deletions that render old noise stories even more useless.
But as the news media has declined, so have online news sites. Their vast troves of journalistic research – including investigative stories into important crimes, biographical articles about people whose achievements were noted only by a few industry or local publications, and historical notices about events and places that have been long forgotten – are vanishing from the Web.
There are a few online archives that preserve older news content. Maybe they are enough, but think twice before you link to Newspapers.Com. They preserve a lot of old information that you can’t read without a (paid) subscription.
Yes, someone has to pay for all that hardware – so it’s only fair that consumers of old news information should pay something. But that puts the information beyond the reach of most casual readers. So why bother linking to all that preserved information?
Web Marketers’ Bad Practices Have Ruined the Blogosphere
Thanks to an endless flood of nonsensical advice about purging old blog posts, many bloggers and businesses now routinely delete useful old content “because no one is reading it.”
When I ask people why no one reads the old content, or when I browse their sites, I often discover that a few people are reading the content – or that Website subduction has pushed the content too deep to be crawled.
Many blog posts don’t need to be preserved. They may be simple announcements like “I’ll be offline this weekend” or discussions of changes to Website design that, after a few months or years, are no longer relevant.
And, frankly, we all write posts that make us cringe years later.
Still, if someone else linked to your old content, you should think about what that means for both your site and the Web as a whole. You made a meaningful contribution that could still be relevant in a context you didn’t anticipate.
SEO Theory used to link to many marketing blogs that no longer exist, or which have purged their older posts. Believe me, I’ve deleted many links from this site that once supported the industry – or replaced them with links to Archive.Org.
You left me no choice because you didn’t feel your old content was worth preserving. You lose more than content and links when you delete old blog posts. You lose the connections that once made you notable.
Web Forums Also Create Chaos in the Linksphere
Despite the growth of social media giants like Facebook, Pinterest, and Twitter – where people now create most of their trivial content – Web forums still thrive in many topics. But over the past decade I’ve noticed a trend among forum owners toward privatizing content.
Maybe this was a natural response to link spam. Maybe the forum owners couldn’t monetize their sites any other way, and as someone who operates Web forums I know how expensive it can be to support huge databases of old discussions.
Worse yet, when you change forum software (as I have done several times), there is usually no way to redirect all the old threads to their new counterparts. You must hope the search engines crawl and index all the content again. They usually miss a lot of the oldest content.
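In the less common case where a migration does leave you with a table of old thread IDs and their new counterparts, a small lookup can at least rescue those threads. The sketch below is only an illustration under that assumption. The old address style (`showthread.php?t=ID`), the new style (`/threads/ID/`), and the `id_map` table are all hypothetical and would need to match your own forum software.

```python
from typing import Dict, Optional
from urllib.parse import parse_qs, urlsplit


def new_thread_path(old_url: str, id_map: Dict[int, int]) -> Optional[str]:
    """Return the new path for an old thread URL, or None if it is unknown.

    id_map is a hypothetical table of old thread ID -> new thread ID
    exported during the migration.
    """
    parts = urlsplit(old_url)
    if not parts.path.endswith("showthread.php"):
        return None  # not an old-style thread address
    values = parse_qs(parts.query).get("t", [])
    if not values or not values[0].isdigit():
        return None  # no usable thread ID in the query string
    new_id = id_map.get(int(values[0]))
    if new_id is None:
        return None  # thread was not carried over
    return f"/threads/{new_id}/"


if __name__ == "__main__":
    sample_map = {123: 9001}  # made-up IDs
    print(new_thread_path("https://forum.example/showthread.php?t=123", sample_map))
```

A web server could answer each matched request with a permanent redirect to the returned path. Threads that return None still depend on the search engines finding them again.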
Some Web forum discussions are well worth linking to. They may include the only copies of news articles or blog posts that have been taken offline. They may include good, informative, insightful discussion about topics by subject-matter experts.
And yet we cannot link to them.
And So, Should We Link to the Past?
The philosophical discussion about preserving the past doesn’t address the subject of cost and administration. In a perfect world we’d always be able to find some old article that we vaguely recall.
But we live in the world of Google’s ever-changing index. The world’s largest search engine struggles to find content it showed us only a few weeks ago – or it becomes mired in endless repetitious listings of irrelevant garbage we never clicked on in the first place.
It’s not all Google’s fault. Their index changes constantly because the Web changes constantly. All those linking relationships Web marketers invest so much time and money in flicker out like fireflies vanishing into a summer night.
I hate deleting links from old blog posts, but sometimes that is the only thing I can do.
Sometimes I delete links whose original destinations still exist – that is, the URLs are still online. But the content has changed. Or, worse, the sites have changed owners and the original content is redirected somewhere else.
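Telling these cases apart by hand is slow. As a rough sketch, and not a description of how I actually review links, the checks could look like this. It assumes you have already fetched the link and recorded the final address and status code, and that you saved the page text when you first linked to it. The category names and function names are invented for illustration.

```python
import hashlib
from urllib.parse import urlsplit


def _host(url: str) -> str:
    """Lower-cased host name with any leading 'www.' removed."""
    host = (urlsplit(url).hostname or "").lower()
    return host[4:] if host.startswith("www.") else host


def classify_link(original_url: str, final_url: str, status: int) -> str:
    """Label a link from where a request for it ended up."""
    if status >= 400:
        return "dead"
    if _host(original_url) != _host(final_url):
        return "moved-to-other-site"  # possibly a new owner
    old_path = urlsplit(original_url).path.rstrip("/")
    new_path = urlsplit(final_url).path.rstrip("/")
    if old_path != new_path:
        return "moved-on-same-site"
    return "same-address"  # says nothing about whether the content changed


def fingerprint(text: str) -> str:
    """Hash of the page text, ignoring case and whitespace differences."""
    normalized = " ".join(text.split()).lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


if __name__ == "__main__":
    print(classify_link("http://example.com/a", "http://other.example/landing", 200))
    print(fingerprint("Old text") == fingerprint("New text"))
```

A changed fingerprint only tells you that something on the page is different. You still have to read the page to decide whether it deserves the link.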
Bad Web marketing practices systematically destroy the Web. These cheap link reclamation games, poorly thought-out site redesigns, and purges of old informative content “that no one reads any more” devalue everything.
As a Website owner, you should ask yourself why you don’t review your old outbound links. You should know what you’re linking to, whom you’re supporting.
No matter how much content you manage, you should make an effort to deleverage your old links if they’re no longer serving their original purposes. If the past is gone you can’t link to it anyway.
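A review has to start with a list of what you link to. The following is a minimal sketch, using only Python's standard library, that pulls the outbound links out of one post's HTML. The class and function names are illustrative, and it assumes you can get at each post's HTML and know its own address.

```python
from html.parser import HTMLParser
from typing import List
from urllib.parse import urljoin, urlsplit


class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag, resolved against the page URL."""

    def __init__(self, base_url: str) -> None:
        super().__init__()
        self.base_url = base_url
        self.links: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if href:
            self.links.append(urljoin(self.base_url, href))


def outbound_links(html: str, page_url: str) -> List[str]:
    """Return unique http(s) links that point away from the page's own host."""
    collector = LinkCollector(page_url)
    collector.feed(html)
    own_host = urlsplit(page_url).hostname
    result: List[str] = []
    for link in collector.links:
        parts = urlsplit(link)
        if parts.scheme not in ("http", "https"):
            continue  # skip mailto:, javascript:, and similar
        if parts.hostname == own_host or link in result:
            continue  # skip internal links and repeats
        result.append(link)
    return result


if __name__ == "__main__":
    sample = '<p><a href="/about">About</a> <a href="https://archive.org/web/">Archive</a></p>'
    print(outbound_links(sample, "https://blog.example/post"))
```

Running this across an archive of posts produces the list of destinations you are still vouching for, which is the list worth checking by hand.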
I’ve tried to write this article several times. I want it to make too many points.
I want to admonish Web marketers who indiscriminately destroy old content because they don’t see any value in it.
I want to admonish Website owners who never review their outbound links and consider what has changed on the other end of the equation.
I want to propose good content management strategies that benefit everyone. But I can’t. There isn’t a good, universal solution for all these issues. Not only does someone need to pay for all the storage and hosting, someone needs to manage it all.
And that won’t happen. It can’t happen. It’s too impractical to happen.
At best, we should encourage each other to link to persistent resources. People do still read old content. The oldest article still published on the root domain of SEO Theory is “SEO Milestones: How Search Optimization Theory Evolved”. I wrote that article in 2007 and yet when I archived most of this blog’s old content in 2019, I decided to keep that article here. And that’s because people still occasionally read it and other articles I wrote over a decade ago.
I’m not ready to give up on the Web That Was. I’m sorry to see that so many of you are. I hope you reconsider because, frankly, after 20 years of SEO specialists telling everyone else how to manage the Web, I find that way too many of those people listened to your worst advice. They didn’t need to kill all that good content “for SEO”.