
Google's Manual For Its Unseen Human Raters

Posted by timothy
from the do-this-stuff dept.
concealment writes "It's widely believed that Google search results are produced entirely by computer algorithms — in large part because Google would like this to be widely believed. But in fact a little-known group of home-worker humans plays a large part in the Google process. The way these raters go about their work has always been a mystery. Now, The Register has seen a copy of the guidelines Google issues to them."

  • by staltz (2782655) on Tuesday November 27, 2012 @10:38AM (#42105255) Homepage

    "For relevance, raters are advised to give a rating based on "Vital", "Useful", "Relevant", "Slightly Relevant", "Off-Topic or Useless" or "Unratable"."

    Hmmm, sounds like Slashdot. Anyone unemployed?

  • by crazyjj (2598719) * on Tuesday November 27, 2012 @10:45AM (#42105309)

    "It puts a good rating in the bin or else it gets the hose again"

  • by Anonymous Coward on Tuesday November 27, 2012 @10:46AM (#42105329)

    So I really can make $5000/month as a single mom?

    • Re: (Score:3, Insightful)

      by Anonymous Coward

      Nowhere near that much. I know someone who was a rater. Pay rate was ok for someone in Idaho who needs part time work (something like $15 an hour), but there are limits on the number of hours you can work (both over and under), and you're often limited by the number of tasks available.

      • Re:Work from home? (Score:4, Informative)

        by The Pirou (1551493) on Tuesday November 27, 2012 @11:18AM (#42105615)
        You're talking about Lionbridge.
        Leapforce isn't capped in the same way, but it has a lower rate of pay. Individual raters see limited hours at first, but as long as you perform well there are usually more than 40 hours of work available.

        This isn't news, as old versions of the General Guidelines have been leaked to the public before.
      • by Sulphur (1548251)

        Nowhere near that much. I know someone who was a rater. Pay rate was ok for someone in Idaho who needs part time work (something like $15 an hour), but there are limits on the number of hours you can work (both over and under), and you're often limited by the number of tasks available.

        Small potatoes, and certainly not the potato on tour.

    • by dietdew7 (1171613) on Tuesday November 27, 2012 @11:05AM (#42105491)
      Maybe, send pictures and your hourly rate.
    • Re:Work from home? (Score:5, Informative)

      by Anonymous Coward on Tuesday November 27, 2012 @11:31AM (#42105751)

      I've done this work (for LeapForce; there is another company LionBridge that does the same work, but I have no experience with them). It's really, really mind-numbingly boring. I didn't last very long due to that. And it's worse than say a boring service job where you can interact with other people because you interact with nobody (I realize this may appeal to some). The other thing is that there is not always work available when you want to work. So, you may sign in and be ready to go, but there's no work there. This may cycle somewhat throughout the year with their hiring cycles. You may end up visiting sketchy websites, so having everything up to date is a must. They allow you to opt-out of adult ratings, but that doesn't mean that you will never come across an inappropriate site. You also have minimum performance requirements, and I think it would be difficult to maintain those requirements if you have lots of distractions going on while you work (kids, etc.). Some people are not able to attain the minimum at all within their required time frame (I had a couple of friends try and fail).

      They do pay as agreed, but don't expect your check in a hurry. When I did work, you were able to submit an invoice on the 1st of the month for the previous month. They then paid net-30 on that invoice. That means, if you start December 1, then on January 1 you can submit your invoice for December. At the end of January you get paid for that invoice (for the work you did in December). So, if you need money quickly it doesn't work out so well, but once you get started it is a monthly income. They may have changed policies, so be sure to check it out before starting work.

      That being said, the pay is good for a job you can do from home (I think around $13.50/hour). So, you certainly may be able to make $5000 a month, but that would be some insane hours (> 90 a week). If you need some extra income, try it out. There's a couple of tests to take before you are hired, and those are a good way to see what you think of the work. If you hate taking the tests, you will hate the work.
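
The hours estimate above can be checked with a few lines of arithmetic. This is only a rough sketch: the $13.50 rate is the poster's recollection, and the function name and both weeks-per-month conventions are assumptions made for illustration.

```python
def hours_per_week(monthly_target, hourly_rate, weeks_per_month):
    """Weekly hours needed to reach a monthly income target."""
    hours_per_month = monthly_target / hourly_rate
    return hours_per_month / weeks_per_month


# Assumed convention 1: a month is treated as exactly 4 weeks.
four_week_month = hours_per_week(5000, 13.50, 4.0)

# Assumed convention 2: an average month is 52 / 12 weeks.
average_month = hours_per_week(5000, 13.50, 52 / 12)

print(f"4-week month:  {four_week_month:.1f} hours/week")
print(f"average month: {average_month:.1f} hours/week")
```

The "> 90 a week" figure matches the 4-week convention; averaging over the year gives a somewhat lower number, but either way it is far beyond a normal full-time schedule.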

      • That means, if you start December 1, then on January 1 you can submit your invoice for December. At the end of January you get paid for that invoice (for the work you did in December).

        But on the plus side, at the other end you can put your feet up for a few weeks and still get paid!

        Continued in alt.glass.half.full

  • Could it be... (Score:5, Interesting)

    by Baba Ram Dass (1033456) on Tuesday November 27, 2012 @10:55AM (#42105411)

    First off, didn't read the article. Yeah, I said it. So if the article dispels this just ignore me.

    What if Google actively uses the human ratings as a comparison/benchmark against which they measure those fancy algorithms? In other words, the users are rating the algorithms more than they are the websites. Makes sense they would improve search results algorithms, a highly technical and scientific method of ranking sites (which is of little use to a human in and of itself), by constantly striving toward an unscientific and untechnical (e.g. "quality") method... humans... which after all is, you know, who uses the engine in the first place.

    Amazon probably does the same to improve their suggestions model.

    • Re:Could it be... (Score:5, Insightful)

      by mbkennel (97636) on Tuesday November 27, 2012 @11:14AM (#42105573)

      This is almost certainly what is happening. It is impossible for humans to rate any significant fraction of searches/websites to be quantitatively useful for Google's search volume.

      In machine learning, the name is "tags", a.k.a. ground truth for a supervised prediction/ranking model. Google gets zillions of weak, noisy, tag proxies in the sense of being able to measure when a user clicks on a link and then within a minute clicks on another link on the same search page, potentially indicating that the first link was undesirable.

      These are the relatively expensive but highest quality "ground-truth" tags from which Google can calibrate the value and interpretation of the weak automatic tags and the algorithms themselves.

      The final machine learning algorithms may be as simple as linear regression---performed on some rather complex features. These ground truth tags are used to calibrate and weight the importance of various features in making a final ranking.
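
The calibration idea described above can be illustrated with a toy example. This is a minimal sketch, not Google's method: the two features (`link_score`, `quick_bounce_rate`), the 0 to 4 rating scale, and all the numbers are invented, and the fit is a plain least-squares linear regression done by gradient descent in pure Python.

```python
# Hypothetical rated pages: (features, rater score).
# Features are invented: [link_score, quick_bounce_rate].
# Scores use an assumed 0..4 scale ("Off-Topic" .. "Vital").
RATED = [
    ([0.9, 0.1], 4.0),
    ([0.7, 0.2], 3.0),
    ([0.5, 0.5], 2.0),
    ([0.3, 0.7], 1.0),
    ([0.1, 0.9], 0.0),
]


def predict(weights, bias, features):
    """Linear model: bias plus weighted sum of the features."""
    return bias + sum(w * x for w, x in zip(weights, features))


def mean_squared_error(rows, weights, bias):
    """Average squared gap between predictions and rater scores."""
    return sum((predict(weights, bias, f) - t) ** 2 for f, t in rows) / len(rows)


def fit_linear(rows, steps=5000, learning_rate=0.1):
    """Least-squares fit by batch gradient descent."""
    n_features = len(rows[0][0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(steps):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for features, target in rows:
            error = predict(weights, bias, features) - target
            grad_b += error
            for i, x in enumerate(features):
                grad_w[i] += error * x
        bias -= learning_rate * grad_b / len(rows)
        weights = [w - learning_rate * g / len(rows)
                   for w, g in zip(weights, grad_w)]
    return weights, bias


weights, bias = fit_linear(RATED)
print("learned weights:", weights, "bias:", bias)
print("error before fit:", mean_squared_error(RATED, [0.0, 0.0], 0.0))
print("error after fit: ", mean_squared_error(RATED, weights, bias))
```

The learned weights say how much each automatic signal should count when the model is asked to reproduce the human scores, which is the "calibrate and weight the importance of various features" step in miniature.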

    • by Sulphur (1548251)

      Amazon probably does the same to improve their suggestions model.

      There is a suggestion bot?

      --

      It is now safe to turn your computer on.

    • by jovius (974690)

      That's how it seems to be, according [searchengineland.com] to one rater:

      So, you knew it was Google-related. At what point did you know that you’d be rating Google’s search results?

      I knew before I got hired.

      One thing I think the SEO community is missing is that this program has nothing to do with SEO or rankings. What this program does is help Google refine their algorithm. For example, the Side-by-Side tasks show the results as they are next to the results with the new algorithm change in them. Google doesn’t hire these raters to rate the web; they hire them to rate how they are doing in matching users’ queries with the best source of information.
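
One way to picture a side-by-side comparison like the one the rater describes: grade each result in both lists, then compare the lists with a rank-aware score. The sketch below uses discounted cumulative gain (DCG), a standard information-retrieval metric. Nothing in the thread says this is the metric Google uses, and the grades and function names are hypothetical.

```python
import math


def dcg(grades):
    """Discounted cumulative gain: good results count more near the top."""
    return sum(grade / math.log2(rank + 1)
               for rank, grade in enumerate(grades, start=1))


def side_by_side(current_grades, candidate_grades):
    """Say which result list the rater grades favor."""
    current = dcg(current_grades)
    candidate = dcg(candidate_grades)
    if candidate > current:
        return "candidate"
    if current > candidate:
        return "current"
    return "tie"


# Hypothetical grades (0 = useless .. 3 = vital), listed in rank order.
# Both lists hold the same pages; the candidate puts the best one first.
print(side_by_side(current_grades=[1, 3, 0], candidate_grades=[3, 1, 0]))
```

Note that the verdict is about the algorithm change as a whole, not about any single page, which is the distinction the rater is drawing.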

  • by Anonymous Coward on Tuesday November 27, 2012 @11:06AM (#42105499)

    It's widely believed that Google search results are produced entirely by computer algorithms...

    This is only believed by people who haven't thought about it very hard.

    At an abstract level, it makes no sense to think that computer code can be optimized to perform a task without any human intervention. The reason is simple: the task we want the code to perform is always something that a human cares about. So, somehow we need a human to instruct the computer about the goals. This can take the form of a programmer meticulously coding the entire thing, with a particular human-relevant code in mind. Or it can involve non-programmers providing feedback about how well the software is doing at its stated goal (depending on context, these people may be testers, evaluators, users, taggers, etc.).

    More specifically, in the case of AI-software, a typical procedure is to have a store of 'pre-tagged' training examples. These are examples of problems, with associated 'correct' answers. The training data is used to optimize the AI algorithm: the software can tweak its behavior in order to maximize accuracy of output on the training examples, with the hope that this will then generalize to general use. For something like web-search, where the goal is to make a human end-user happy with the quality and relevance of the results, of course you need humans to assess the quality of the algorithmic results. This is the only way to keep the results relevant. (For search results, this is a continual and iterative process, since the web constantly changes, people are trying to game the system, etc.)

    Thus, it's probably better to think of these raters as providing input for evaluating and refining the search algorithms; rather than thinking of them as people who get to uniquely decide the rank of pages. Obviously they will have an influence on the rank of the pages they rate, but overall they are evaluating a rather tiny fraction of the web-pages in the Google database. Thus, when you perform an arbitrary web-search, chances are the results you are seeing are purely algorithmic (none of the listed results were manually rated/adjusted by anyone).

    • It's widely believed that Google search results are produced entirely by computer algorithms...

      This is only believed by people who haven't thought about it very hard.

      At an abstract level, it makes no sense to think that computer code can be optimized to perform a task without any human intervention. The reason is simple: the task we want the code to perform is always something that a human cares about. So, somehow we need a human to instruct the computer about the goals. This can take the form of a programmer meticulously coding the entire thing, with a particular human-relevant code in mind. Or it can involve non-programmers providing feedback about how well the software is doing at its stated goal (depending on context, these people may be testers, evaluators, users, taggers, etc.).

      More specifically, in the case of AI-software, a typical procedure is to have a store of 'pre-tagged' training examples. These are examples of problems, with associated 'correct' answers. The training data is used to optimize the AI algorithm: the software can tweak its behavior in order to maximize accuracy of output on the training examples, with the hope that this will then generalize to general use. For something like web-search, where the goal is to make a human end-user happy with the quality and relevance of the results, of course you need humans to assess the quality of the algorithmic results. This is the only way to keep the results relevant. (For search results, this is a continual and iterative process, since the web constantly changes, people are trying to game the system, etc.)

      Thus, it's probably better to think of these raters as providing input for evaluating and refining the search algorithms; rather than thinking of them as people who get to uniquely decide the rank of pages. Obviously they will have an influence on the rank of the pages they rate, but overall they are evaluating a rather tiny fraction of the web-pages in the Google database. Thus, when you perform an arbitrary web-search, chances are the results you are seeing are purely algorithmic (none of the listed results were manually rated/adjusted by anyone).

      So... basically, you are saying that it will be a while until Google's systems become self aware and decide to exterminate humanity?

  • by poofmeisterp (650750) on Tuesday November 27, 2012 @11:13AM (#42105557) Journal

    Apparently, if this is the case (which it probably is because Google's algorithms aren't AI), the tech sector needs a lot better rating.

    For instance, do a search for a particular model of laptop. The results you get are of course mad online retail shops, but you also get a BUNCH of sites that have nothing to do with the product you searched. They put the names / models in META tags and in hidden or font-size-reduced areas of the page, but the actual page content itself is just a bunch of crap that has nothing to do with laptops or laptop parts. It's just a bunch of random crap.

    Point being, these aren't weeded out very well. Unfortunately, I don't have an example right now, but I know of one that has been in existence for years and still ranks in the top 5.

    Oh, and the above is dwarfed by software name / functionality searches 10-1!

  • This was actually listed last year on several black/grey hat SEO websites to help dissect how google functioned. The upside is that with this wider exposure, google may change its policies a little.
  • Raters gonna rate (Score:5, Interesting)

    by Aeonym (1115135) on Tuesday November 27, 2012 @12:09PM (#42106103)

    I've actually been a Google rater. I spent about 2 years total doing it--long enough to become a 'moderator' who ensures quality feedback from other raters--in between, and supplemental to, "real" jobs. Raters give feedback on lots of Google services but it falls into two buckets: ranking the quality of legitimate results, and learning to spot the "spam".

    Legit results are easy. Spam is more interesting. For one thing, I didn't entirely agree with their definition of what spam was--that's part of the reason you still see spammy results in some searches. The other part of course is that the spammers are constantly changing tactics. But it was actually kind of fun learning to spot the various methods spammers can use, and know that I was helping to improve search results by getting them off the front page (and hopefully off the top 100 pages).

    But I always assumed that rater feedback was used to judge and adjust The Algorithm rather than individual page results. The Algorithm has always been king at Google.

    • Re:Raters gonna rate (Score:5, Interesting)

      by Kreplock (1088483) on Tuesday November 27, 2012 @12:51PM (#42106521)
      I was a rater for 1 year some time ago. My impression was the rating was against results from updates they were considering for the production algorithm. Testing at the QA level. I found it boring and soulless, but a wide knowledge of obscure, otherwise-useless facts really facilitated the work. Sometimes a little-known double meaning for a concept would cause disagreements among raters, and once a moderator hated my opinion so much he had my home phone called several times to demand I change my rating.
  • "The way these raters go about their work has always been a mystery."

    Not really. Anyone with half a brain could get to the second level of the work-from-home LeapForce exam, which is when they issue this guide. Nothing here is a secret or mystery.

  • If a page with a good manually set rating points to another page, that other page should enjoy a good rep too. Perhaps for several "degrees of separation".
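
The suggestion above can be sketched as a hop-limited walk over the link graph. Everything here is an assumption for illustration (the function name, the halving `decay`, the three-hop limit, the tiny graph); it shows the poster's idea and is not a description of how any search engine actually works.

```python
from collections import deque


def propagate_rating(links, seed_ratings, decay=0.5, max_hops=3):
    """Spread manually set ratings along outgoing links.

    links: dict mapping a page to the pages it links to.
    seed_ratings: dict mapping a page to its manually set rating.
    Each hop multiplies the inherited rating by `decay`, and a page
    keeps the best rating that reaches it.
    """
    scores = dict(seed_ratings)
    queue = deque((page, rating, 0) for page, rating in seed_ratings.items())
    while queue:
        page, rating, hops = queue.popleft()
        if hops == max_hops:
            continue  # reputation stops spreading past the hop limit
        inherited = rating * decay
        for target in links.get(page, []):
            if inherited > scores.get(target, 0.0):
                scores[target] = inherited
                queue.append((target, inherited, hops + 1))
    return scores


# Hypothetical link chain: a -> b -> c -> d -> e, with only "a" hand-rated.
links = {"a": ["b"], "b": ["c"], "c": ["d"], "d": ["e"]}
scores = propagate_rating(links, {"a": 1.0})
print(scores)
```

The decay and the hop limit are the knobs that keep one good rating from vouching for an unbounded chain of pages, which matters because links are easy for spammers to manufacture.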

  • I constantly search for things, and a good half the time, *maybe* a third are relevant. Then there's the times where it completely ignores my conditions. For example, I've searched for a blazer with -ladies, because, duh, I only want men's, and I get hits that explicitly, in the title, say "ladies".

    I won't even *mention* Target, who *always* claims to have whatever you're looking for in a sponsored ad on the side, and doesn't....

                    mark

    • by quixote9 (999874)
      Exactly. Google search was amazing early on, when the comparison was to "no search." Now, with a near-infinite web and squillions of SEOs gaming the ratings, it's just half-baked, like all GOOG's products. Half-baked and gamed still brings in billions of dollars for them. Without effective competition that could take any of those billions away, half-baked is going to be all we get.
    • by SnowZero (92219)

      Try this query instead:
      https://www.google.com/#q=men's+blazers [google.com]
      The entire first page is full of items that are exactly what you are looking for.

      As the web and search engines both evolve, you may need to change the way you search to get the same information. Something that worked before may not work now, and the critical words or phrases to get the best results are still there but they aren't the same as what they were in the past.

      In your particular example, the exclusion is far too weak, as "

  • There was a period of a couple of years when a web page hosted on my ISP's freebie 15 megabytes of web space was the top hit for a particular Google search. It was a good page--a lay discussion of a technical topic--and I enjoyed the ego boost, but I always wondered why since I was not aware of its being linked from anywhere, let alone any high-traffic or high-credibility page. Now I think I know.

    (I have since contributed that page's content to Wikipedia. The article has evolved with contributions from o
