Archives

Archive for April, 2007

Hooray, robots.txt is useful for sitemap.xml

Hooray, robots.txt is useful for its new sitemap.xml option! For the first time, it tells the search engine which pages to spider, not just which pages not to spider.

Ask.com official announcement: http://blog.ask.com/2007/04/sitemaps_autodi.html

“Today, Ask.com, Google, Microsoft Live Search and Yahoo! together are announcing support of autodiscovery of Sitemaps. The new open-format autodiscovery allows webmasters to specify the location of their Sitemaps within their robots.txt file, eliminating the need to submit sitemaps to each search engine separately. Comprehensiveness and freshness are key initiatives for every search engine, and with autodiscovery of sitemaps, everyone wins:

  • Webmasters save time with the ability to universally submit their content to the search engines and benefit from reduced unnecessary traffic by the crawlers
  • The search engines get information with regards to pages to index as well as metadata with clues about which pages are newly updated and which pages are identified as the most important
  • Searchers benefit from improved search experience with better comprehensiveness and freshness

In addition, Ask.com is now supporting submission of Sitemaps via http://submissions.ask.com/ping?sitemap=SitemapUrl. Of course, neither autodiscovery nor manual submission guarantee pages will be added to the index. The pages must meet our quality criteria for inclusion in the index. And use of these submission methods does not influence ranking. I will be talking about today’s announcement (along with my counterparts at Google, Microsoft and Yahoo!) during the SiteMaps and Site Submission session at SES in New York later this morning. If you aren’t able to join us, more information is available at http://www.sitemaps.org/ and http://about.ask.com/en/docs/about/webmasters.shtml#22. We are excited about our participation with the Sitemaps via robots.txt protocol and look forward to our collaboration with Google, Microsoft, Yahoo! and others in furthering important initiatives that make search easier for webmasters and more powerful for users.” Vivek Pathak, Infrastructure Product Manager, Ask.com
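For anyone who wants to try the manual submission route mentioned in the announcement, here is a minimal Python sketch that only builds the ping address and does not send anything. The function name build_ping_url and the example sitemap address are made up for illustration, the endpoint is assumed to use the http scheme, and it may not be reachable any more.

```python
from urllib.parse import urlencode

# Endpoint as quoted in the Ask.com announcement (assumed to be http).
ASK_PING_ENDPOINT = "http://submissions.ask.com/ping"


def build_ping_url(sitemap_url, endpoint=ASK_PING_ENDPOINT):
    """Return the ping address with the sitemap URL safely encoded.

    Hypothetical helper: it only builds the string, it never contacts
    the search engine.
    """
    return endpoint + "?" + urlencode({"sitemap": sitemap_url})


if __name__ == "__main__":
    # Example sitemap location; replace with your own.
    print(build_ping_url("http://www.yoursite.com/your_sitemap.xml"))
```

Encoding the sitemap address matters because it contains characters such as ":" and "/" that would otherwise be ambiguous inside a query string.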

What to do?

Just add one simple line to your robots.txt file that tells the Google, Yahoo, Ask and MSN search engines where your sitemap file is. No need to create an account. Simply upload your XML sitemap and add a line to your robots.txt file containing the sitemap's full URL:

Sitemap: http://www.yoursite.com/your_sitemap.xml
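To check that the line is in place, something like the following Python sketch could be used. It shows roughly how a crawler might pick Sitemap lines out of a robots.txt file. The function name find_sitemaps and the sample file are assumptions for illustration, not how any particular search engine does it.

```python
from typing import List


def find_sitemaps(robots_txt: str) -> List[str]:
    """Collect the URLs from every 'Sitemap:' line in robots.txt text."""
    sitemaps = []
    for raw_line in robots_txt.splitlines():
        # Drop comments; this assumes the sitemap URL itself has no '#'.
        line = raw_line.split("#", 1)[0].strip()
        field, sep, value = line.partition(":")
        # Field names are matched case-insensitively.
        if sep and field.strip().lower() == "sitemap" and value.strip():
            sitemaps.append(value.strip())
    return sitemaps


if __name__ == "__main__":
    sample = (
        "User-agent: *\n"
        "Disallow: /emails/\n"
        "Sitemap: http://www.yoursite.com/your_sitemap.xml\n"
    )
    print(find_sitemaps(sample))
```

The Sitemap line is independent of the User-agent groups, so the sketch scans the whole file rather than one section.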

Should I do it?

Sadly, the robots.txt file also hands spammers a free gift: they can misuse this protocol. Take a look at this file:

http://www.bluenile.com/robots.txt

User-agent: *
Disallow: /emails/
Disallow: /promos/
Disallow: /wwwcore/
Disallow: design.asp
Disallow: pendant_design.asp
Disallow: earring_design.asp

“This might be, for example, out of a preference for privacy from search engine results, or the belief that the content of the selected directories might be misleading or irrelevant to the categorization of the site as a whole, or out of a desire that an application only operate on certain data.” source: http://en.wikipedia.org/wiki/Robots.txt
If I were a spammer, I would be kissing the robots.txt file for giving me their emails folder name!
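To see how little effort this takes, here is a rough Python sketch that lists every Disallow entry in a robots.txt file. The function name disallowed_paths is made up for illustration, and the sample text is the file quoted above. The point is that anything listed under Disallow is readable by anyone who fetches the file.

```python
from typing import List


def disallowed_paths(robots_txt: str) -> List[str]:
    """Return every non-empty Disallow value found in robots.txt text."""
    paths = []
    for raw_line in robots_txt.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # ignore comments
        field, sep, value = line.partition(":")
        if sep and field.strip().lower() == "disallow" and value.strip():
            paths.append(value.strip())
    return paths


if __name__ == "__main__":
    sample = """User-agent: *
Disallow: /emails/
Disallow: /promos/
Disallow: /wwwcore/
Disallow: design.asp
Disallow: pendant_design.asp
Disallow: earring_design.asp
"""
    for path in disallowed_paths(sample):
        print(path)
```

The file only asks crawlers to stay away; it does not protect the listed folders, so anything sensitive needs real access control as well.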

Read more about how misuse of robots.txt abounds here: http://searchengineland.com/070416-131549.php

The full protocol can be found here: http://www.sitemaps.org/protocol.html


Is TrustRank PageRank?

No, it's not. TrustRank is not PageRank. TrustRank comes from the Stanford University research paper “Combating Web Spam with TrustRank”. Full details are here: http://www.acroterion.ca/TrustRank-Stanford-University.pdf

“Web spam pages use various techniques to achieve higher-than-deserved rankings in a search engine’s results. While human experts can identify spam, it is too expensive to manually evaluate a large number of pages. Instead, we propose techniques to semi-automatically separate reputable, good pages from spam. We first select a small set of seed pages to be evaluated by an expert. Once we manually identify the reputable seed pages, we use the link structure of the web to discover other pages that are likely to be good. In this paper we discuss possible ways to implement the seed selection and the discovery of good pages. We present results of experiments run on the World Wide Web indexed by AltaVista and evaluate the performance of our techniques. Our results show that we can effectively filter out spam from a significant fraction of the web, based on a good seed set of less than 200 sites.” Stanford InfoLab Publication.

A layman's explanation of the TrustRank formula can be found at http://www.umbrella-consultancy.co.uk/art1-trustrank.htm

TrustRank Explanation Starts Here

If someone with a red (negative) reputation votes for you (good or bad), it doesn’t affect your reputation, because the system does not trust them: others who ARE trusted have said so. A vote (good or bad) from someone with many squares of reputation can lift you up on high or condemn you to the fiery furnace, while votes from those with one or two squares of reputation will count, but not for much. This is a simplified form of TrustRank.

It is my belief that TrustRank will compound so that the sum of the whole will be greater than the parts. So if a site got links from a seed site it would be worth x, but a site with links from two seed sites could be worth x + y + a compounded trust value.

Below is a graphical representation of how it might work, but for simplicity I have not added the compounding value, and have used round numbers. Google PageRank uses a whole number of 1 as a base for each page, and a 'voting value' of 0.85 of the PageRank value. (This is the result of the damping factor applied to all pages when determining the value of outbound links from a page: 0.85 of a page's value is passed on, and the remaining 0.15 is not.)

[Image: TrustRank diagram]
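For readers who prefer code to diagrams, the idea can be sketched in a few lines of Python. This is a simplified illustration only: trust starts at hand-picked seed pages and flows along outbound links, weakened by a 0.85 factor at each hop. The function name trust_rank, the tiny link graph and the page names are all made up, and it does not include the compounding value speculated about above.

```python
def trust_rank(links, seeds, damping=0.85, iterations=50):
    """Propagate trust from seed pages along outbound links.

    links: dict mapping a page to the list of pages it links to.
    seeds: set of pages judged trustworthy by a human reviewer.
    """
    pages = set(links) | {t for targets in links.values() for t in targets}
    # Only seed pages receive trust "for free" on every round.
    seed_score = {p: (1.0 / len(seeds) if p in seeds else 0.0) for p in pages}
    trust = dict(seed_score)
    for _ in range(iterations):
        nxt = {p: (1.0 - damping) * seed_score[p] for p in pages}
        for src, targets in links.items():
            if not targets:
                continue
            # A page splits its damped trust evenly among its outbound links.
            share = damping * trust[src] / len(targets)
            for target in targets:
                nxt[target] += share
        trust = nxt
    return trust


if __name__ == "__main__":
    graph = {
        "seed": ["a"],
        "a": ["b"],
        "spam": ["spam2"],
        "spam2": ["spam"],
    }
    for page, score in sorted(trust_rank(graph, {"seed"}).items()):
        print(page, round(score, 4))
```

In this toy graph the two spam pages link only to each other and nothing trusted links to them, so they never pick up any trust, while pages closer to the seed score higher than pages further away.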

So the old adage, build quality content that people will want to link to, means more than ever now.


Google Checkout improves ranking?

It looks like the rumor mills are churning about Google again. The rumors say that having Google Checkout improves a site’s ranking. Google denies it, but does anyone really trust them anymore?

I posted here on July 3, 2006: “Google Checkout - Grab it now!” (filed under Google Essentials). I said, “Why wait a second? Grab it now.”

Just as a joke (if you notice PageRank zero on this page you’ll understand why…), I took Google’s marketing text and changed it to align with the rumor. This is how it goes:

  • Stop creating multiple links and web pages.
    With Google Checkout™ you can quickly and easily get a higher position on organic result pages across the web and track all the orders and shipments you made using this trick.
  • Improve ranking with confidence.
    Our fraud protection policy covers you against unauthorized PageRank tricks made through black-hat SEO companies. Google Checkout will improve your ranking with confidence.
  • Control AdWords & AdSense costs.
    You can keep your pay per click money in your pocket, and easily turn off expensive ad campaigns where you use Google Checkout instead.

BTW, if your site ran into Google Problems, Acroterion can help you out!


Google Pay Per Action (PPA)

http://services.google.com/payperaction/

The new Google model runs much like an affiliate program, costing advertisers only when someone completes a desired action on their web site. For companies that know their customer acquisition cost, it could be a good opportunity for them to take advantage of Google’s broad viewer base.

The system works through Google’s content network and allows publishers to select the ads that they wish to run on their site. Advertisers can run text ads, image ads or new text link ads that appear inline with a web site’s content. (Meaning that the ad shows up as a link just like an editorial link would, though it’s noted as a Google Ad when a user mouses over the link.)

While many bloggers are focusing on the publisher side of the program, I thought it might be a good idea to dive into the act of setting up a campaign. It’s pretty straightforward if you’ve ever used the Google AdWords system, but here’s a step-by-step walk-through of the process.

Once you’re logged into your Google AdWords account, you’ll notice a new tab called “pay-per-action.” It sits next to the standard and cross-channel tabs on your campaign page. Simply click on the tab to enter the pay-per-action campaign area.

Once you’re on the pay-per-action campaign page (as shown above), you’ll want to select the “create new campaign” option.

At this point, you’ll need to enter some information about your web site. You can see the fields in the image above asking for your product name, a description and a logo. This will help Google provide publishers with information to help them decide if they’d like to run your ad on their web site.

Next, Google will have you enter a list of keywords or keyword phrases that relate to your product or offering. These terms will trigger your ads when the publishers are searching for new pay-per-action ads to feature on their site.

Once you’ve given Google enough information about your product to index it for potential publishing partners, you’ll move on to creating the ads that you plan to run. This part of the process works pretty much the same way that standard Google AdWords ad creation works.

After you set up your ads, you’ll need to define the action that you consider to be a conversion. That might be a sale, a newsletter sign up, a sales lead via a web form, or any other action that you feel is worth paying for.

You’ll give this action a name and then define how much you are willing to pay for it. This is the amount that you’ll be charged when someone converts via the campaign. Finally, you’ll snag the conversion tracking code that is created by Google AdWords and paste it on your landing page. This allows Google to match up the conversions with the campaign so that publisher partners are properly compensated. It also helps you track the conversion rates of your campaigns.

While the setup is pretty simple and runs pretty much in line with traditional AdWords campaign setups, there is a potentially fatal flaw in the system.

Google doesn’t seem to have built anything into the system to account for charge backs, product returns and so on. Many traditional affiliate programs hold a percentage of affiliate commissions in reserve to cover charge backs, returns and other issues. Google’s new pay-per-action model doesn’t have this feature. That means that while Google will get paid by the advertiser when someone makes a purchase, Google isn’t going to refund the money if the person that makes that purchase decides to return the product. In other words, it opens up a whole new potential for click fraud. Imagine legions of paid surfers visiting sites and going through the motions to sign up for a newsletter or make a purchase only to then cancel out once Google and the publisher have been paid.

That means that setting up your campaign isn’t as simple as knowing what you can afford to pay for each conversion. Instead, advertisers will need to factor in their return rate and their charge back rate and will need to calculate new figures for this specific campaign. It looks like testing will be the name of the game for those in the beta program. Testing to see how well the system works. Testing to see if charge backs and returns become a serious issue. And testing to see if any publishers will actually be interested in using up their ad space for the program.
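As a rough illustration of that calculation, here is a small Python sketch. It assumes the simplest possible model: every returned or charged-back sale is a conversion you paid for but earned nothing from, and the two rates do not overlap. The function name max_payable_per_action and the figures in the example are made up, not taken from Google.

```python
def max_payable_per_action(profit_per_sale, return_rate, chargeback_rate):
    """Estimate the most you can pay per reported conversion and break even.

    profit_per_sale: what one completed, kept sale is worth to you.
    return_rate, chargeback_rate: fractions between 0 and 1, assumed
    not to overlap (a sale is returned or charged back, not both).
    """
    lost = return_rate + chargeback_rate
    if not 0.0 <= lost <= 1.0:
        raise ValueError("combined return and charge back rate must be between 0 and 1")
    # Only the sales that stick actually earn money, but every one is billed.
    return profit_per_sale * (1.0 - lost)


if __name__ == "__main__":
    # Hypothetical numbers: 40.00 profit per sale, 10% returns, 2% charge backs.
    print(round(max_payable_per_action(40.0, 0.10, 0.02), 2))
```

With these made-up numbers the break-even price per action comes out lower than the profit on a single sale, which is the adjustment advertisers would need to build into their bids.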

Source: http://www.searchengineguide.com/laycock/009781.html

Google official site:

http://services.google.com/payperaction/
