Why Google's Indexing Of SWFs Is Worthless
Okay, it looks like the "Wow google now indexes swfs!" news has somehow risen from the grave again. I wonder why nobody has yet objected to this.
It is true: Google does indeed index swf files, just as it indexes PDFs, PPTs and some other "binary" formats. The problem is that this will not help you, at least not if you sell your clients professional Flash work.
Why?
Google can only find texts inside of SWFs when these texts are inside the swf itself and the swf is not compressed. As soon as you switch on compression (zlib, available since Flash MX) all your texts will get garbled. And why should anyone switch off compression after Flash 5?
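If you want to check which case one of your own files falls into, look at its first three bytes: an uncompressed swf starts with the signature FWS, a zlib-compressed one with CWS. Here is a minimal Python sketch of that check. The function names are made up for this example, and it says nothing about how Google's crawler actually reads the file.

```python
import struct
import zlib


def swf_is_compressed(data: bytes) -> bool:
    """Return True if the SWF signature marks a zlib-compressed file."""
    signature = data[:3]
    if signature == b"CWS":
        return True
    if signature == b"FWS":
        return False
    raise ValueError("not an FWS/CWS file: %r" % signature)


def inflate_swf(data: bytes) -> bytes:
    """Return the uncompressed (FWS) form of an SWF file."""
    if not swf_is_compressed(data):
        return data
    # Header layout: 3-byte signature, 1-byte version, then a 4-byte
    # little-endian length of the *uncompressed* file.
    # Everything after byte 8 is zlib data.
    version = data[3:4]
    (declared_length,) = struct.unpack("<I", data[4:8])
    body = zlib.decompress(data[8:])
    if len(body) + 8 != declared_length:
        raise ValueError("length field does not match inflated size")
    return b"FWS" + version + data[4:8] + body
```

Static text is only readable as plain bytes in the FWS form. In a CWS file it has to be inflated first.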
But: which serious Flash developer still places static texts inside of swfs? Is it to bind the client to your services, saying "You want to change that news? Okay, I will have to recompile the swf for you. That's XXX$ for you."? No question - content should be loaded via XML, LoadVars or Remoting.
If you want your page to be found, put the content either into the HTML page inside <NOSCRIPT> tags or, if the content is dynamic, build a parallel layer of xml or html pages that will only be seen by crawlers and redirect to your main file or, better, include it.
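As a rough illustration of the NOSCRIPT option, here is a Python sketch that renders an HTML page embedding an swf and repeating the same text and links inside NOSCRIPT. The function name, file names and markup are invented for the example. It is not the code of any site mentioned here.

```python
from html import escape


def render_flash_page(swf_url, title, paragraphs, links):
    """Build an HTML page with an swf plus the same content in <noscript>.

    `links` is a list of (href, label) pairs pointing at the content pages.
    """
    # Escape everything that ends up in the markup.
    text_html = "\n".join(f"<p>{escape(p)}</p>" for p in paragraphs)
    link_html = "\n".join(
        f'<a href="{escape(href)}">{escape(label)}</a>' for href, label in links
    )
    swf = escape(swf_url)
    return (
        "<html>\n"
        f"<head><title>{escape(title)}</title></head>\n"
        "<body>\n"
        f'<object type="application/x-shockwave-flash" data="{swf}" '
        'width="550" height="400">\n'
        f'<param name="movie" value="{swf}">\n'
        "</object>\n"
        "<noscript>\n"
        f"{text_html}\n"
        f"{link_html}\n"
        "</noscript>\n"
        "</body>\n"
        "</html>"
    )
```

Each linked content page would embed the same swf, so visitors and crawlers are served the same markup.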
Posted at September 03, 2004 06:57 PM
"If you want your page to be found, put the content either into the HTML page inside <NOSCRIPT> tags or, if the content is dynamic, build a parallel layer of xml or html pages that will only be seen by crawlers and redirect to your main file or, better, include it."
Have you tried these methods? Do they work? I would imagine Google would not give a favourable ranking to these methods as they could easily be used for spoofing content.
"the "Wow google now indexes swfs!" news has somehow risen from the grave again. I wonder why nobody has yet objected to this."
I have, I have! [waves hand wildly]
In many of these discussions I see different people using the same label "index" in different ways, each assuming that all others understand the label as the speaker does.
Describing the exact functionality they see, rather than relying on an underdefined label, could help to trim many such discussions. But this is Earth, and, well, y'know.... ;-)
@felix: yes, I have tried, and so have others. If you check out the pages of http://www.fluid.com/ and do some searches on Google for them, you will see that they have found a very good method that works.
I personally have a working example with my hobby site http://www.jhlynch.org - where I run a Flash-based forum that gets completely indexed by Google. And no penalty involved.
> http://www.jhlynch.org
I love the highly original concept and design of that site - congrats!
A great Flash-Forum - can you give some pointers on how you did it?
Thanks Andreas. The forum is based on the free boardmx http://www.dynamicflash.co.uk/ which I reskinned. So the credit for the forum engine goes to Steve Webster.
In order to speed up loading times I replaced the forum's LoadVars routines with my own loadvars2swf PHP script, which uses the compression of swf files to shrink the files.
It works with -> NOFRAMES <- too!!
Mario, would you consider letting us see some code? I would really like to know exactly how you get Google (and other crawlers) to index your Flash pages.
@rasmus: I will prepare a little tutorial to explain the basic techniques I used - though you have to be a bit patient as I'm currently too busy with other stuff.
Sounds great! Really looking forward to it.
"build a parallel layer of xml or html pages that will only be seen by crawlers and redirect to your main file"
This is a very bad idea. You're essentially spoofing the crawler, which is a big no-no as far as Google's policies are concerned and grounds for getting black-listed. Having the content on the page is a much better option because the page is WYSIWYG for both the crawler and user. And if the user doesn't have Flash, they can still see the content of your site. That's why NOEMBED is a logically better tag to use.
Yes, I should have phrased that less ambiguously. With "parallel layer" I meant that you have a structure of pages that is usually only found by bots, because the first entry point on the index page is inside NOSCRIPT or - wow, good idea - NOEMBED tags. Then, as you say, you put the same swf that is on the index page also into the "content pages". This is not cheating the search bots, as you are not sniffing out the user agent and serving different content depending on who is visiting.
I made a parallel layer on my site that is only activated when a crawler visits it. Links in this crawler content are translated when clicked on by a user who has Flash installed. Works perfectly!