Google Authorship Beyond Webpages

In last month’s post about authorship, I shared that Google has been experimenting with inferring authorship for PDF documents in addition to webpages. That made me curious whether Google could also infer authorship for any other indexable filetypes.

Microsoft Office Files

Google appears to infer authorship for PowerPoint files much as it does for PDFs and webpages, looking for the term “by” followed by the author’s name.

ppt-author-text-only
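
To make the “by” plus name pattern concrete, here is a minimal sketch of a byline matcher in Python. It is purely illustrative: the pattern, the two-to-four-word name assumption, and the function name find_byline_author are my own hypothetical choices, not a description of how Google actually detects bylines.

```python
import re

# Hypothetical heuristic (an assumption, not Google's method):
# the word "by" followed by two to four capitalized words on the same line.
BYLINE = re.compile(
    r"\b[Bb]y\s+((?:[A-Z][\w'.-]*[ \t]+){1,3}[A-Z][\w'.-]*)"
)


def find_byline_author(text):
    """Return the first name-like phrase that follows "by", or None."""
    match = BYLINE.search(text)
    return match.group(1) if match else None


if __name__ == "__main__":
    print(find_byline_author("Search Marketing Whitepaper by Janet Driscoll Miller"))
    print(find_byline_author("Please stand by for updates"))
```

A matcher this simple only handles the explicit byline case. It would miss the context-based inference described for Word documents later in this post.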

To generate the authorship snippet on an Excel file, I had to add “by Janet Driscoll Miller” to the name of a tab in the workbook, since Google uses the tab name as the title of the page. Having a byline appear only in a cell of the worksheet wasn’t enough to generate the snippet.

excel-author-tab

The most interesting case, though, was with Word documents. Using an old whitepaper I had, I did some testing with the byline again. In one version, the author snippet still showed even though I had removed the words “by Janet Driscoll Miller” and there was no other byline in the document.

word-author-meta-only

After combing through the document, I found that there was a paragraph at the end of the paper that could be the culprit.

about-the-author

Although there was no traditional byline, it appeared that this paragraph at the end of the document did help Google identify me as the author. To test this, I tried a version with the “About the Author” paragraph removed.

word-author-meta-only-no-subhead

No author snippet. This suggests to me that, while traditional bylines are the most common way for Google to infer authorship, the search engine is increasingly able to do so, at least to some degree, based on context.

Other Types Of Text-Based Files

Since Google can read text in other types of text-based files, would it be able to infer authorship within these documents? I tested rich text format (.rtf) and plain text (.txt) files. Interestingly, author snippets only showed for rich text format documents and, as with the Word document with the “About the Author” section, authorship was inferred from more than just a byline.

rtf-author

Plain text files, by contrast, did not generate any form of author snippet.

txt-author

Image Files

While Google can’t read text in a JPG file or other types of image files, it can index certain types of vector graphic files, such as SVG and PostScript files. Could Google infer authorship from text within these files? As you can see, Google showed authorship when a byline was included in the text of the SVG file.

svg-author

However, I couldn’t get an authorship snippet to display when I saved the same file as a PostScript file.

Google Docs

Considering they are tied to your Google ID, it would seem sensible for Google Docs to show authorship if those documents are open for Web sharing. While I had trouble generating my own snippet to show, I was able to find one example of authorship showing for a presentation.

google-docs-presentation-author

Google Books

In my estimation, a natural fit for authorship would be actual books listed in Google Books. However, it doesn’t appear that the authorship snippet has been applied to Google Books listings yet.

book authorship

These listings came from a search for content on the Google Books site, but Web search didn’t yield an author snippet either.

Over the next month, I’ll continue to work on author testing to see what other goodies I can find!