As you know, a Flash movie as the index page of a site has always been a major problem with search engine optimization. There's simply no content for the search engines to index.
So when I learned that Google can index the contents of Macromedia Flash movies, I was astonished. It seemed this remarkable discovery had gone virtually unnoticed in the SEO community.
But as you probably know, Google has always been the first to index different types of content: PDF files, .doc files, etc. Google has also made amazing inroads in being able to index dynamic content.
And now they're the first major search engine to index Flash - another significant step forward in the SEO industry.
So why has Flash presented such problems in the past?
Background of Macromedia Flash Movies and SEO
With a Flash movie as the main page of a site, the Web site owner is giving up the crucial text necessary to prove to the search engines that the main page is about a particular topic. Without that text, the search engines have nothing to index. Therefore, the main page rarely does well in the rankings, unless off-page factors such as link popularity or link reputation are sufficient to carry the page on their own.
In the past, legitimate workarounds have been few and far between. This made things extremely difficult for businesses that wanted to create a rich user experience with a Flash home page, such as Web design firms, photography studios, graphic design firms, and so forth.
So, these businesses often sacrificed rankings for the user experience, since they could rarely have both while still following all of the guidelines set forth by the search engines.
Introducing . . . Michael Marshall
When I learned from Michael Marshall, creator of ThemeMaster and chat/forum moderator for our online search engine marketing courses, that Google is indexing Flash, and when I heard about the fascinating discoveries he'd made, I immediately wanted to interview him for an article.
So let's take a look at what Michael has discovered about Google and Flash.
Michael, how do we know that Google is now indexing the contents of Flash files? Is there a way that we can search the index just for Flash?
Yes. You can enter your search term in Google, and along with that search term, use the filetype operator and restrict your search to the file extension “.swf”. This will search for your search term only in Macromedia Flash files. You should see [FLASH] just before each listing in the results page that is a Flash document.
For example, put the following in the search box at Google:
"Best Free Banner Exchange Market" filetype:swf
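For readers who would rather script this kind of search than type it, here is a minimal Python sketch that builds the same query as a Google search URL. The function name flash_search_url is hypothetical, and the sketch only constructs the URL; it does not fetch or parse any results.

```python
from urllib.parse import urlencode


def flash_search_url(phrase: str, exact: bool = True) -> str:
    """Build a Google search URL restricted to .swf files.

    `flash_search_url` is an illustrative helper, not part of any Google API.
    """
    # Wrap the phrase in double quotes for an exact-match search.
    term = f'"{phrase}"' if exact else phrase
    # The filetype operator limits results to Macromedia Flash files.
    query = f"{term} filetype:swf"
    return "https://www.google.com/search?" + urlencode({"q": query})


if __name__ == "__main__":
    print(flash_search_url("Best Free Banner Exchange Market"))
```

Paste the printed URL into a browser to run the search.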
How can we extract the text found in a Flash file to see what Google sees?
Macromedia has a Flash Search Engine SDK (http://www.macromedia.com/software/flash/download/search_engine/) that will give us just what we need. The SDK (Software Development Kit) includes an application named `swf2html'. Swf2html extracts text and links from a Macromedia Flash .SWF file, and returns the data to stdout or as an HTML document. Swf2html is provided as a compiled application and as a static library for linked library implementation. For complete functionality, see the file Readme.htm included in the SDK.
Do you have an example of a Flash file that we can see, as well as an example of the text that the Macromedia tool extracted from the Flash file?
Yes. I have an example of each. If you look at the extracted output in Web page form, you will see that it is not very pretty. Nevertheless, you've got lots of SEO-worthy content there, and that's what we are most concerned with. You should visit the Flash presentation itself, mouse over the text, and click the links in the presentation so you can be familiar with the Flash presentation. You can compare where certain text appears in the Flash presentation and where it is found in the extracted text.
Example of Flash file:
Example of extracted text:
(Note: This Flash example is based on one of Michael's own products. However, I chose to use it for two reasons: 1) because of the many different types of Flash involved; and 2) because it is a text-heavy Flash example, as opposed to many other examples of Flash that I could have chosen to use.
Added Note: Be sure to highlight the entire page of extracted text by pressing Ctrl+A.)
In the output file, you'll notice that some text seems to be repeated on multiple lines and one portion of it even appears invisible since the font color comes out white. This is just a side effect of the conversion/extraction tool and is not really invisible text or spamming.
In other words, you're doing nothing wrong when this happens.
But how do we know that's how Google sees it?
A simple test will show us how much of the text in a Flash presentation can be seen (or extracted) by Google. Perform an exact search (and use the filetype operator) on some text that appears at the top of the HTML output from Macromedia's tool, and then perform a similar search for text that appears at the bottom. Try similar searches on text that appears in the middle as well if you really want to be sure. This is a good spot check to see what Google is grabbing from the Flash file. Since we can't know exactly what Google uses to read the Flash file, the Macromedia tool is only an approximation, and this spot check is always the best measure.
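The spot check can be prepared with a short script. The Python sketch below assumes you have already saved the extracted text as a list of lines; the helper name spot_check_queries and the six-word phrase length are illustrative choices, not anything Google or Macromedia prescribes. It picks a phrase from the top, middle, and bottom of the text and turns each into an exact-match query with the filetype operator.

```python
def spot_check_queries(lines, words=6):
    """Return exact-phrase filetype:swf queries for the top, middle,
    and bottom lines of the extracted text.

    `lines` is the extracted text split into lines; `words` caps the
    phrase length (an arbitrary choice for this sketch).
    """
    # Ignore blank lines so every pick contains searchable text.
    text_lines = [ln.strip() for ln in lines if ln.strip()]
    if not text_lines:
        return []

    picks = [0, len(text_lines) // 2, len(text_lines) - 1]
    queries = []
    # dict.fromkeys drops repeated positions (short files) but keeps order.
    for i in dict.fromkeys(picks):
        phrase = " ".join(text_lines[i].split()[:words])
        queries.append(f'"{phrase}" filetype:swf')
    return queries


if __name__ == "__main__":
    sample = ["Top line text", "", "Middle words here", "Bottom line"]
    for q in spot_check_queries(sample):
        print(q)
```

Run each printed query in Google. If the Flash file shows up for all three, Google is most likely reading the file from top to bottom.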
How much of your Flash movie does Google see? In other words, how deep into the Flash file does the spider go?
In my experience testing the Macromedia tool, I have found that Google sees all the text that the tool can extract including all links . . . everything from top to bottom.
You mentioned that when certain types of motion in a Flash movie are associated with text, the resulting extracted output will contain duplicated occurrences of that text.
The techies among us will know what that means, but for the non-techies (like me), does this mean that we need to be careful about using certain types of animation, because it could result in duplicate content, therefore creating the possibility of spam or problems with our SEO efforts?
Yes. The type of animation you apply to text in your Flash presentation has an impact on how that text gets extracted. You wouldn't want your keyword density or theme focus to get thrown off by mistake due to applying the wrong type of animation to certain text.
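To see whether animation has inflated your numbers, you can compare the extracted text with and without its repeated lines. The Python sketch below is one rough way to do that. It assumes the extracted text is available as a list of lines, and it uses a simple word-count ratio as "keyword density", which is only one of several definitions in use. The function names are hypothetical.

```python
import re
from collections import Counter


def repeated_lines(lines):
    """Return a mapping of each non-blank line that occurs more than once
    to the number of times it occurs."""
    counts = Counter(ln.strip() for ln in lines if ln.strip())
    return {text: n for text, n in counts.items() if n > 1}


def keyword_density(lines, keyword):
    """Share of all words in `lines` that equal `keyword` (case-insensitive).

    This is a simple occurrences / total-words ratio, assumed here for
    illustration only.
    """
    words = re.findall(r"[a-z0-9']+", " ".join(lines).lower())
    if not words:
        return 0.0
    return words.count(keyword.lower()) / len(words)


if __name__ == "__main__":
    extracted = ["Banner exchange", "Banner exchange", "Free traffic"]
    deduped = list(dict.fromkeys(extracted))  # remove repeats, keep order

    print(repeated_lines(extracted))
    print(keyword_density(extracted, "banner"))
    print(keyword_density(deduped, "banner"))
```

If the density for an important keyword changes noticeably once the repeats are removed, the animation applied to that text is probably the cause.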
When viewing the source code of the HTML output extracted from your Flash file (see the source code found at the bottom of this page: http://www.internet-marketing-analysts.com/Google-Flash_tutorial/), there's no title tag. What text does Google pull as the title tag in the search results?
In my experience, I have found that the first line of text in the extracted output gets used by Google as the title tag in the search results. You may want to use swf2html and spot check and modify your Flash presentation until you get the desired result. In addition, the description in the search results is created dynamically (according to the user's query) from snippets of text inside the Flash presentation as extracted by Google.
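If you save the extracted output as an HTML file, a few lines of Python can show you which text comes first, and therefore which text is likely to be used as the title according to Michael's observation. This is only a sketch: the exact markup swf2html produces is not described here, so the code treats the output as generic HTML and returns the first non-empty run of text. The name first_text_line is hypothetical.

```python
from html.parser import HTMLParser


class _TextCollector(HTMLParser):
    """Collect non-empty text runs in document order."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        # Collapse internal whitespace and skip whitespace-only runs.
        text = " ".join(data.split())
        if text:
            self.chunks.append(text)


def first_text_line(html):
    """Return the first non-empty text run in `html`, or None if there is none.

    Note: script and style contents, if present, are treated as text too;
    this sketch assumes the extracted output has neither.
    """
    parser = _TextCollector()
    parser.feed(html)
    parser.close()
    return parser.chunks[0] if parser.chunks else None


if __name__ == "__main__":
    sample = "<html><body><p> Theme  Master </p><p>More text</p></body></html>"
    print(first_text_line(sample))
```

If the first line is not what you want to see in the search results, rearrange the Flash presentation and extract the text again.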
Do you have any other tips for optimizing Flash files?
Yes. I would recommend that people read my more technical tutorial for more details on optimizing Flash files. (See below)
One thing I would add is the problem that might be encountered by Flash presentations that use dynamic content pulled from a database, XML file, etc., based on user input. Such content is not part of the Flash file itself and, therefore, will not be indexable by Google.
What about Flash banners? Will Google also index the contents of Flash banners?
Yes. Any Flash presentation, whether full-page or banner size, can be indexed by Google. I have found many instances of both.
For the More Technical SEOs . . .
Michael created a page with a more technical explanation of many of these concepts at the following URL.
The page also lists the source code of the HTML output extracted from his Flash presentation.
A Word of Caution . . .
Whatever you do, don't try to hide text in any manner through a Flash presentation. Since it takes so much more effort to hide text in a Flash file, doing so would be construed as a more deliberate attempt to deceive a search engine, so it would be a much more serious offense.
Remember, hiding anything, whether text, links, etc., is considered spam by Google.
Like I tell all of my students, when you go to sleep at night, it's a wonderful feeling to be able to wake up in the morning knowing your pages are right where you left them because you know you've done nothing wrong.
Like a very good friend of mine, Ginette Degner, once said, it's much better to be in the rankings for the long haul. Spamming isn't worth it.
Once again, Google comes out ahead with being able to index the contents of a Flash file. This amazing bit of news should make SEOs everywhere extremely happy, since they'll be able to use and optimize Flash files as the main page on a site.
Just remember: as with any other SEO strategy, be above board and follow Google's Webmaster Guidelines (http://www.google.com/webmasters/guidelines.html).
Robin Nobles is the Co-Director of Training of Search Engine Workshops with John Alexander. They teach 2-day beginner, 3-day advanced, and 5-day "hands on" search engine marketing workshops in locations across the globe. She also teaches online search engine marketing courses through http://www.onlinewebtraining.com, and she’s a member of Wordtracker’s official question support team. With partner John Alexander, she's co-authored a series of e-books called, "The Totally Non-Technical Guides to Having a Successful Web Site." And, they opened a networking community for search engine marketers called The Workshop Resource Center for Search Engine Marketers.
Michael Marshall is CEO of Internet Marketing Analysts, LLC. He is an artificial intelligence (AI) software developer, Web programmer, and certified search engine marketing strategist, and he holds degrees in Philosophy, Linguistics and Theology. He is the author of the e-book, “Checkmating the Search Engines” and a contributor to “Building Your Business with Google for Dummies” by Brad Hill (Wiley Publishing) - http://www.building-business-google.com. He is the creator of: Theme Master - http://www.theme-master.com, TopTenAnalyzer - http://www.toptenanalyzer.com and the free Theme Link Reputation Tool - http://www.theme-link-reputation-tool.com (featured in Brad Hill's book).
Copyright 2005-2008 Robin Nobles. All rights reserved.
This work is licensed
under a Creative Commons