Wikipedia: Page one of Google UK for 99% of searches
For ages, bloggers have been asking why Google loves Wikipedia so much, why it is so dominant in Google’s results, and how dominant it actually is. Almost everything we search for in Google seems to have Wikipedia on page one at least. So we thought it was about time we did some research to get some clarity.
I’ve seen previous research done in this area, but in one instance the searches conducted were for actual Wikipedia page titles, for which Wikipedia would of course do well. So we searched nouns (def.: a word used to identify any of a class of people, places, or things) from a couple of random noun generators. A full list of the words we searched is below.
The Methodology of Wikipedia Research:
- We used a random noun generator to obtain the keyword
- We then searched that keyword on Google UK (the internet)
- To get around any personalisation issues we used Google Chrome Incognito browser
- The search settings were ten results per page
- There was no other filtering of search results within Google’s settings.
- We did not include shopping or video results within our count (as these were additional to the 10 on a page)
- We searched from Brighton, south England
- We made 1,000 unique searches; any duplicates were removed and re-searched.
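The steps above could be sketched in code roughly as follows. This is a minimal illustration only, not the script used for the study (the searches were made by hand in a browser). The function names, the case-insensitive duplicate check and the example URLs are all assumptions made for the sketch, and fetching the results from Google is deliberately left out.

```python
from urllib.parse import urlparse


def dedupe_keywords(keywords):
    """Drop repeated keywords, keeping first-seen order.

    Assumption: duplicates are matched case-insensitively.
    """
    seen = set()
    unique = []
    for word in keywords:
        key = word.strip().lower()
        if key and key not in seen:
            seen.add(key)
            unique.append(word.strip())
    return unique


def wikipedia_position(result_urls):
    """Return the 1-based position of the first Wikipedia result, or None.

    `result_urls` is the ordered list of organic result URLs for one search,
    with shopping and video blocks already left out.
    """
    for position, url in enumerate(result_urls, start=1):
        host = (urlparse(url).hostname or "").lower()
        if host == "wikipedia.org" or host.endswith(".wikipedia.org"):
            return position
    return None


# Hypothetical result list for one keyword (not real search output).
example_results = [
    "https://www.puma.com/",
    "https://en.wikipedia.org/wiki/Puma",
    "https://example.com/puma",
]
print(wikipedia_position(example_results))
```

Matching on the hostname rather than on the whole URL avoids counting pages that merely mention Wikipedia in their address.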
The Results:
- Wikipedia is on Page One of Google for 99% of searches (of nouns)
- Wikipedia is position one of Google for 56% of searches
- 96% of searches had Wikipedia in position 1-5 on Google
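Headline figures like these come from a simple tally of the recorded positions. A minimal sketch of such a tally is shown here; the function name and the sample list are made up for illustration, and the sample is not the study's data.

```python
from collections import Counter


def summarise(positions, page_size=10):
    """Summarise Wikipedia positions; None means Wikipedia was not found."""
    total = len(positions)
    if total == 0:
        raise ValueError("no searches to summarise")
    found = [p for p in positions if p is not None]
    return {
        "position_one": sum(p == 1 for p in found) / total,
        "top_five": sum(p <= 5 for p in found) / total,
        "page_one": sum(p <= page_size for p in found) / total,
        "by_position": Counter(found),
    }


# Illustrative sample only, NOT the positions recorded in the study.
sample = [1, 1, 1, 2, 2, 3, 5, 7, 13, 22]
stats = summarise(sample)
print(f"{stats['position_one']:.0%} in position one")
print(f"{stats['top_five']:.0%} in positions 1-5")
print(f"{stats['page_one']:.0%} on page one")
```

Feeding in the position column from the full list at the end of this post would give the unrounded share for each position.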
How the positions were shared (total 1,000 searches)
Position One dominance
As you can see from the charts, Wikipedia landed in the top 5 positions for 96% of the searches.
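For a rough sense of how much sampling noise sits behind percentages taken from 1,000 searches, a Wilson score interval can be worked out as in the sketch below. The counts (560, 960 and 990 out of 1,000) are assumptions back-calculated from the rounded percentages above, and the interval only speaks to words drawn from the same kind of random noun generators, not to searches in general.

```python
import math


def wilson_interval(successes, trials, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 is about 95%)."""
    if trials <= 0:
        raise ValueError("trials must be positive")
    p = successes / trials
    denom = 1 + z**2 / trials
    centre = (p + z**2 / (2 * trials)) / denom
    half = z * math.sqrt(p * (1 - p) / trials + z**2 / (4 * trials**2)) / denom
    return centre - half, centre + half


# Counts assumed from the rounded headline percentages.
for label, hits in [("position one", 560), ("top five", 960), ("page one", 990)]:
    low, high = wilson_interval(hits, 1000)
    print(f"{label}: {low:.1%} to {high:.1%}")
```

The Wilson interval is used here instead of the simpler normal approximation because it behaves better for proportions close to 100%.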
Is Wikipedia’s dominance deserved?
We all love Wikipedia, but should it really be so prominent and all-conquering in Google? We know that Wikipedia is a vast site with millions of pages and thousands of editors offering unique, vital content on a multitude of subjects. But should Wikipedia be the de facto resource for pretty much all subjects? Surely some pages are riding on the back of other quality pages, or perhaps on lazy references to the site from businesses and bloggers across the internet. Google obviously loves Wikipedia and still ranks it despite there being next to zero content on some of the pages.
Percentage Share of Wikipedia positions in Google UK across 1,000 keyword searches
General Observations
Unsurprisingly, Wikipedia did extremely well (generally the top two positions) for old-school encyclopaedic searches. By this I mean searches with any geographical, scientific or natural reference, for example Himalayan, bird and paediatrician. This was expected; more surprising is that it also did extremely well for food substances and clothing. I’d have imagined that butter, milk and mayonnaise, as well as trousers, underclothes and wallet, would be fought over tooth and nail by large corporations within their respective industries.
Some flawed results from Google
When searching for the word “Air” in Google, there are so many results that could have come up in the SERPs: Adobe Air software, Nike Air trainers, the French band, the Apple Air laptop, any airline, or a science page on the make-up of our atmosphere.
But no. In second place is a Wikipedia page, not for any of the above, but a disambiguation page for the term “Air”. How can this page, which is ultimately full of links to other Wikipedia pages, short of real content, and presumably not linked to by external authoritative sites, be the second best possible result for this huge search? This is where the flaws in Google’s offering of Wikipedia content show.
Google Loves Wikipedia – But Why?
So is Google being lazy? Does it feel that at least one Wikipedia result needs to be there? Is that an unwritten law within the algorithm? After reading this article about SEO being bad for the Internet, though, I got thinking: if one place in every search is taken up by Wikipedia, that would mean one less place in the Top Ten for possible PPC-paying corporations. Just a thought, not a fact.
The ones that got away
There are so few that were not on Page One that I can list them here.
- news
- trainers
- national
- sweets
- wardrobe
- phone
- flight
All these words are obviously highly competitive, or form part of the names of major corporations and services (for example National).
Feedback
We will be redoing this research soon, using different random noun generators too (maybe even human-generated ones). If anyone has any questions about the methodology, or recommendations, please leave a comment.
The Full List of 1,000 searches and their positions in Google
Word Position Word Position Word Position Word Position
Aries 1 Illegal 1 Tiger 1 Result 2
Aardvark 1 Insect 1 Timbale 1 review 2
abigail 1 Instrument 1 Timpani 1 Rhinoceros 2
Acrylic 1 Interest 1 Tire 1 Ricardo 2
Actor 1 Invention 1 Titanium 1 Ring 2
Adapter 1 Island 1 Toe 1 Rowboat 2
Addition 1 Israel 1 Toenail 1 Sagittarius 2
Agave 1 Jacket 1 Tortoise 1 Salesman 2
Agreement 1 Jam 1 Town 1 scarf 2
Alley 1 James 1 Tray 1 Second 2
Alloy 1 Japanese 1 Treatment 1 Secure 2
Almanac 1 Jaw 1 Trial 1 Seed 2
almond 1 Jewel 1 Trombone 1 Session 2
aluminium 1 Join 1 trousers 1 Shade 2
Amy 1 jug 1 Trowel 1 Sidewalk 2
Andy 1 June 1 Trumpet 1 Signature 2
Anger 1 Jute 1 Tuba 1 Sister 2
Angora 1 Kamikaze 1 Turkey 1 Snowflake 2
Ankle 1 Kenya 1 Twine 1 Snowstorm 2
Antarctica 1 Kevin 1 Ukrainian 1 Soccer 2
Anteater 1 Kilometer 1 Underclothes 1 soda 2
Apology 1 Kite 1 Unshielded 1 Sort 2
Argument 1 Knife 1 Uzbekistan 1 spray 2
Ashtray 1 Knot 1 Vacation 1 Square 2
Asia 1 Kohlrabi 1 Valley 1 Step-grandmother 2
Asparagus 1 Ladybug 1 Value 1 Step-sister 2
Asphalt 1 larger 1 Van 1 Stinger 2
Attention 1 Laundry 1 Venezuelan 1 Stocking 2
Author 1 Lead 1 Verse 1 store 2
Authority 1 Leather 1 Vicki 1 Suede 2
Baboon 1 Legal 1 Viola 1 Suit 2
Back 1 Limit 1 Vise 1 Swimming 2
bacon 1 Liquid 1 Volcano 1 Swiss 2
Badge 1 Literature 1 Waiter 1 Swordfish 2
Bagel 1 Locust 1 Wallet 1 T-shirt 2
Balance 1 Love 1 wand 1 Tabletop 2
Ball 1 Luis 1 Watchmaker 1 Thailand 2
Bankbook 1 Lumber 1 Weasel 1 Thursday 2
Barge 1 Lunch 1 Weed 1 Triangle 2
Baritone 1 Lute 1 Wheel 1 Truck 2
Battery 1 Lyocell 1 Whiskey 1 video 2
Bee 1 lyra 1 White 1 Vision 2
beer 1 Maid 1 Winter 1 wales 2
Beetle 1 Male 1 Withdrawal 1 Wash 2
Begonia 1 Manager 1 Woman 1 Weight 2
Bengal 1 Maraca 1 Wood 1 Whistle 2
Berry 1 Marble 1 wooden 1 Window 2
Bibliography 1 Marimba 1 Worm 1 Window 2
Bird 1 Mass 1 Wound 1 Wing 2
Birth 1 Mattock 1 Wrench 1 wipe 2
biscuit 1 Mayonnaise 1 Wrinkle 1 zebra 2
blimp 1 Meal 1 Xylophone 1 Acoustic 3
blouse 1 Meeting 1 Yacht 1 Alibi 3
Blowgun 1 Melody 1 Yak 1 Apparel 3
Bomber 1 Mercury 1 Year 1 Apple 3
Bonsai 1 Message 1 yoga 1 Bag 3
border 1 Metal 1 yogurt 1 Bakery 3
Bottle 1 Methane 1 Yoke 1 Bank 3
bottle 1 Mexican 1 Zoology 1 Bath 3
Bread 1 Mexico 1 Account 2 bench 3
Breath 1 Michael 1 Air 2 Burn 3
bridge 1 michelle 1 Airplane 2 Century 3
Broccoli 1 Middle 1 Alcohol 2 Chain 3
Broker 1 milk 1 Animal 2 Channel 3
Bronze 1 Millimeter 1 Ant 2 Chill 3
Buffet 1 Millisecond 1 Appliance 2 Cord 3
Bugle 1 Mini-skirt 1 Approval 2 Dan 3
Bulb 1 Minister 1 Aquarius 2 debt 3
Burma 1 Missile 1 Arch 2 Direction 3
Butcher 1 Monday 1 Area 2 Discovery 3
Butter 1 moon 1 Armadillo 2 Dish 3
C-clamp 1 Morning 1 Army 2 Downtown 3
Cabbage 1 Morocco 1 Arrow 2 draw 3
Calf 1 Mosque 1 Athlete 2 Dugout 3
Can 1 mountain 1 Atm 2 Equipment 3
Canadian 1 muffin 1 Attic 2 event 3
can 1 nelson 1 australia 2 Find 3
candle 1 Neon 1 Australian 2 Fireman 3
canoe 1 Niece 1 Avenue 2 Fireplace 3
Cardboard 1 Nitrogen 1 Banker 2 Freeze 3
carrot 1 North america 1 Bar 2 Game 3
Cart 1 North korea 1 Beat 2 Garage 3
Cat 1 Nose 1 Beauty 2 Gate 3
Cattle 1 November 1 Bench 2 Government 3
Cauliflower 1 Numeric 1 Birthday 2 Grenade 3
Ceiling 1 Oboe 1 Blizzard 2 Hammer 3
Celery 1 Observation 1 blog 2 Hockey 3
Celsius 1 Odometer 1 Blow 2 Innocent 3
Ceramic 1 Offer 1 Bobcat 2 Interactive 3
Cereal 1 Operation 1 Bongo 2 Jennifer 3
ceylon 1 Organic 1 Boy 2 Joseph 3
chalk 1 Organisation 1 Britney 2 Knickers 3
Chauffeur 1 Ounce 1 Cabinet 2 Look 3
Cheetah 1 Output 1 Cactus 2 magnet 3
Cherry 1 Oven 1 Call 2 Maple 3
Chicken 1 Oxygen 1 Capricorn 2 microwave 3
Chef 1 Package 1 career 2 Motion 3
Child 1 Pancake 1 Carnation 2 Nerve 3
chilli 1 Pansy 1 Carol 2 Norwegian 3
Chin 1 Parallelogram 1 Caspar 2 orange 3
China 1 Parent 1 Cast 2 Pair 3
Christmas 1 Parentheses 1 Cathedral 2 Parrot 3
Cicada 1 paris 1 Caution 2 Peak 3
client 1 Part 1 Celeste 2 perfume 3
Cockroach 1 Particle 1 Chance 2 Polo 3
coffee 1 Passbook 1 Character 2 Powder 3
Coil 1 pastor 1 Chard 2 queen 3
coil 1 Patch 1 Charles 2 Quill 3
Cold 1 Pediatrician 1 Chive 2 Raft 3
College 1 Peen 1 Chocolate 2 Railway 3
Colombia 1 Period 1 Circle 2 Ryan 3
Colon 1 Peru 1 Cod 2 Science 3
colour 1 photographer 1 Colony 2 Screen 3
Comb 1 pipe 1 Color 2 Shell 3
Committee 1 Pisces 1 Cook 2 Shoe 3
Competition 1 Plant 1 Cost 2 Shorts 3
Composition 1 Plantation 1 Couch 2 Show 3
Computer 1 Plastic 1 cream 2 Sparrow 3
Congo 1 Plot 1 cream 2 Spoon 3
Consonant 1 Poison 1 Creek 2 Spot 3
copper 1 Poland 1 Crocus 2 Spring 3
copier 1 Policeman 1 cromwell 2 Stage 3
Copyright 1 Polish 1 Cuban 2 Stem 3
Cough 1 Polyester 1 Current 2 Taxi 3
Crab 1 Porch 1 curry 2 Team 3
Crack 1 Porcupine 1 Curve 2 Test 3
Cristiano 1 Porter 1 daimond 2 Thistle 3
cupboard 1 Potato 1 David 2 Tile 3
Curler 1 Pound 1 Department 2 tripod 3
Cushion 1 Power 1 Deposit 2 Velvet 3
Custard 1 Prepared 1 Destruction 2 walker 3
Cymbal 1 Print 1 Detail 2 western 3
Dancer 1 Prison 1 Development 2 whisky 3
Dead 1 Produce 1 Diamond 2 Wilderness 3
Death 1 project 1 Diaphragm 2 Wish 3
Deborah 1 Propane 1 Difference 2 Witness 3
December 1 Pumpkin 1 Digger 2 Wonder 3
Decimal 1 Pvc 1 Digital 2 Word 3
Deodorant 1 Pyjama 1 Dinner 2 Advice 4
Description 1 Quart 1 Doubt 2 base 4
Desert 1 Quarter 1 Dream 2 Beam 4
Dessert 1 Quartz 1 Drop 2 Bedroom 4
device 1 Quicksand 1 Dungeon 2 Ben 4
Dimple 1 Rabbi 1 East 2 Bite 4
Distributor 1 Rabbi 1 Education 2 boots 4
dog 1 Rabbit 1 Elbow 2 bus 4
Dolphin 1 Radish 1 Elephant 2 Cap 4
Donald 1 Rain 1 Elizabeth 2 Car 4
Double 1 Rainbow 1 End 2 channel 4
dragon 1 Rat 1 Engine 2 Close 4
Drake 1 ratchet 1 Environment 2 core 4
Drawbridge 1 Ravioli 1 Ex-husband 2 Cover 4
drawer 1 Receipt 1 Farmer 2 Creator 4
Dredger 1 Rectangle 1 Feast 2 Cycle 4
Dressing 1 Regret 1 Feeling 2 Dad 4
Drink 1 Report 1 festival 2 Ease 4
Driver 1 Rest 1 Flame 2 Fact 4
Drug 1 Retailer 1 Flock 2 Forest 4
Ear 1 Reward 1 Flower 2 Ghost 4
economy 1 Rhythm 1 Fly 2 Hallway 4
Eel 1 Rice 1 Fold 2 hammer 4
eight 1 road 1 food 2 Health 4
Ellipse 1 Robyn 1 Footnote 2 Icon 4
Employer 1 Rocket 1 Freckle 2 kestrel 4
Engineer 1 Rod 1 Freya 2 laptop 4
envelope 1 Romania 1 Friction 2 leon 4
Error 1 Rose 1 Friday 2 Mailbox 4
Ethiopia 1 Rule 1 Frog 2 Name 4
Exclamation 1 Russia 1 Frost 2 Open 4
Existence 1 Sail 1 Gander 2 Phil 4
extension 1 Sailor 1 Gazelle 2 Plasterboard 4
Eyelash 1 Salary 1 Gearshift 2 Police 4
Fahrenheit 1 Sampan 1 Giraffe 2 Postbox 4
Fairies 1 Sand 1 glass 2 Quit 4
Fang 1 sandwich 1 globe 2 Reading 4
Father-in-law 1 sauce 1 Glue 2 record 4
Feather 1 sausage 1 Goggles 2 Request 4
February 1 Saxophone 1 Gong 2 School 4
Felony 1 Scale 1 Good-bye 2 Seashore 4
Female 1 Scarecrow 1 Gore-tex 2 Street 4
Fertilizer 1 Scent 1 Grade 2 string 4
Fiber 1 Scorpio 1 Graphic 2 Theory 4
Fiberglass 1 scotch 1 greek 2 Transport 4
Fiction 1 Screwdriver 1 Grey 2 Walk 4
Fir 1 Seagull 1 Gum 2 Warm 4
fish 1 Seal 1 Handle 2 Watch 4
Fisherman 1 Servant 1 Hardhat 2 wine 4
Flesh 1 Shadow 1 Hayley 2 wire 4
flip-flop 1 Shark 1 Head 2 Act 5
Flute 1 sheep 1 Heart 2 Answer 5
Force 1 Shoemaker 1 Helicopter 2 Baker 5
Forgery 1 Shoulder 1 Hip 2 Card 5
Fork 1 Shrine 1 History 2 case 5
Fortnight 1 Siamese 1 Hobbies 2 Caterpillar 5
Fowl 1 Side 1 Hood 2 Decrease 5
Foxglove 1 Skin 1 Hovercraft 2 Driving 5
Freighter 1 Slipper 1 Hydrant 2 Exchange 5
fruit 1 slipper 1 Icicle 2 Exhaust 5
garlic 1 Slope 1 Index 2 Font 5
garnish 1 Smell 1 Inventory 2 frame 5
Gasoline 1 Snowboarding 1 Jason 2 Gym 5
gastronomy 1 socket 1 Joke 2 Jet 5
Gauge 1 Soil 1 Judo 2 Jonny 5
Geology 1 Soprano 1 Jumbo 2 Map 5
Geranium 1 South africa 1 jumper 2 monitor 5
German 1 Soybean 1 Key 2 Passenger 5
Girdle 1 spaghetti 1 Kick 2 tube 5
Gladiolus 1 Spain 1 klaxon 2 Cinema 6
glaze 1 Spear 1 ladel 2 Dryer 6
gloves 1 Sphere 1 Lan 2 football 6
Goal 1 Sphynx 1 Link 2 Fragrance 6
Goa 1 Spruce 1 Lion 2 hills 6
goat 1 Squirrel 1 Lipstick 2 jeans 6
Goose 1 state 1 Lisa 2 Loan 6
grape 1 Statement 1 Loss 2 Matt 6
Grass 1 Step-daughter 1 Margin 2 Music 6
Grasshopper 1 Step-mother 1 Mark 2 Paint 6
Gray 1 Stew 1 Market 2 Poppy 6
greek 1 Stock 1 Mechanic 2 Radiator 6
Group 1 Stomach 1 meteor 2 radiator 6
Guilty 1 Stopsign 1 Milkshake 2 station 6
Guitar 1 stream 1 Mind 2 tap 6
Gun 1 Sudan 1 Minibus 2 Tights 6
Gymnast 1 Summer 1 Moat 2 View 6
Hair 1 Sundial 1 Mother 2 zoo 6
Haircut 1 Sunflower 1 Nail 2 Babies 7
Half-sister 1 Support 1 Node 2 Bed 7
Hall 1 Surfboard 1 Octave 2 Burst 7
ham 1 Surname 1 Octopus 2 Calendar 7
Hamburger 1 Sushi 1 Olive 2 Delivery 7
Hard hat 1 Sweater 1 Outrigger 2 Frame 7
harp 1 Swing 1 Owner 2 Freezer 7
Harry 1 Sycamore 1 Ox 2 Insulation 7
Hawk 1 Syrup 1 page 2 Rail 7
Helen 1 Syrup 1 Pail 2 tablecloth 7
Helium 1 System 1 Panda 2 Tent 7
Helmet 1 Tachometer 1 Paperback 2 Baby 9
Hemp 1 Tail 1 Parade 2 coach 9
Heron 1 Tailor 1 Pear 2 hamper 9
Herring 1 Tandoori 1 pencil 2 Replace 9
Himalayan 1 Tank 1 Peony 2 sideboard 9
Hole 1 Taurus 1 Person 2 timer 9
Holly 1 Tea 1 Person 2 Mail 11
Horse 1 Television 1 Ping 2 News 13
Hose 1 temperature 1 Popcorn 2 trainers 13
Humidity 1 Tempo 1 Pot 2 National 14
Hyacinth 1 thigh 1 Printer 2 Sweets 14
Hygienic 1 Thing 1 Puma 2 wardrobe 14
ice-age 1 Thumb 1 Question 2 Phone 18
Ikebana 1 Thunderstorm 1 Quotation 2 Flight 22
This entry was posted on Wednesday, February 8th, 2012 at 1:20 pm and is filed under Google, Social Media.


Great data and research. Can’t believe the number is so high.
99% ? Unbelievable, though it does make sense in terms of Google’s business model.
I can’t believe you published this as being a conclusive outcome for Wikipedia to be so dominant, where the sample size is just so small. Really, you draw this conclusion based on 1,000 queries, while there are literally millions of searches on a daily basis?
Of COURSE Wikipedia will rank highly if you’re just searching random nouns. It’s an encyclopedia. This seems really flawed in terms of how people actually search, since a good number of searches are apparently unique (the latest % I’ve seen is 20%). 1000 searches is a very small sample for this sort of thing. And it’s not as if Google has any opinion about Wikipedia. The Web really likes Wikipedia and signals as such to Google, which ranks it highly.
Wow! No wonder Wikipedia is at the top spot in Google rankings. Informative post.
Great research with sound volumes to draw insightful conclusions – Wikipedia is a fantastic resource and a real trophy of the internet. I am not, however, so sure whether it should have this level of page one dominance. If Bing purchased it, I wonder if we would see the positions drop?
Hi Alec did you read the full article?
I wanted to choose RANDOM words.
What came up was words from many different spheres: clothing, food stuff, financial products as well as many randomly found brand names: Boots, Air, Puma, Bench.
These terms are all searched millions of times a day and are hugely competitive.
Yes, I’d love to search 1 million terms. Fancy coming in and giving us a hand?
And what do you mean “It’s not as if Google has any opinion on Wikipedia”? Of course it does. Why else would we see it on page one so often? Wikipedia is my single favourite website on the internet. But there are many pages where it is appearing much higher than it should. As I said, this is either down to laziness or something else. This study was just to shed some light on this area.
DennisG – I’m just publishing the results of our research.
I didn’t at any point say that 1,000 searches should prove a conclusive outcome. But it surely sheds some light? I am more than aware how many searches are done on a daily basis, across countless industries and topics. But take a look at the data. Read the words, and see the large number of different areas that are covered here.
Yes it would be great to do more searches and we’ll do more in due course, in separate languages.
You can help if you want, positive feedback is always welcome.
Here’s a suggestion: why not re-do the test with bing and yahoo and see how wikipedia ranks on other major search engines?
If results are similar then maybe it’s not just a love affair with Google. It could be that Wikipedia is any search engine’s dream with its vast oceans of information, keyword-abundant single page articles, constant updates with the latest data, masses of links, age, and general authority.
Could it simply be that the big Wiki ticks all the SEO boxes and deserves these rankings?
After all, Wikipedia articles are created, edited and searched by normal, everyday folk (and lots of them). The same people who make Wikipedia are the ones searching it. Wiki articles are edited and updated hundreds of times over to reflect all the information the Wikipedian himself is hunting for when entering the search terms into Google.
And Google’s intent is precisely that: to return information you need best suited to your search. Who better than Wikipedia to know exactly what you need? Because Wikipedia are the people entering those searches into Google in the first place.
Then again, the big library of the web might not be an SEO dream at all; it could just be an infatuation on Google’s part.
It would be interesting to see some results from other search engines. The world doesn’t belong to Google! (yet)
Thanks for great article. I think Wikipedia is Queen of information about nouns
Pete_E
Yes, good suggestion. It could help some of my claims. We used Google because here in the UK it’s got a 90% share.
“Could it simply be that the big Wiki ticks all the SEO boxes and deserves these rankings?”
Yes, on the whole I think it does. Wiki is an ideal structure and has huge amounts of links. But in some cases (such as “Air” highlighted above) Google really is throwing up pages that don’t tick these boxes.
Furthermore, Wiki doesn’t tick many of the social signal boxes that Google demands. So in a way Google is going against some of its own stipulations.
So doing a Yahoo search, which doesn’t use as many of these signals, would be interesting.
Thanks for the comment
Sam
It is not true that Wikipedia will be around forever, but it will be here long enough to make a real impact on commerce and industry. Therefore, we need to be finding a way of getting Wikipedia to work for the people in the way it should. Like any other institution in society, eventually it needs to answer to the people it affects, so let us make that answer a positive and constructive one. At the end of the day, Wikipedia is up there because it uses the Internet the way the world has generally believed it should be used–as Stephen Colbert says “the market has spoken”. It will be interesting to see when capitalism catches up with the site, and I hope, that it adapts and evolves as elegantly as it has to most problems it has faced thus far.
@sam
> Hi Alec did you read the full article?
> I wanted to choose RANDOM words.
Yes, I even used the word random, too.
> What came up was words from many different spheres: clothing, food stuff, financial products as well as many randomly
> found brand names: Boots, Air, Puma, Bench.
> These terms are all searched millions of times a day and are hugely competitive.
Wikipedia is a massive concentration of knowledge about nearly every subject. Singular generic words describing things are right up its alley. If I search ‘puma’, the Wikipedia article on the cat ranks 2nd to Puma Shoes’ puma.com. If I search ‘puma shoes’, Wikipedia doesn’t show up for five pages. In fact, one of the displayed related searches is ‘puma shoes wiki’, which suggests people find it necessary to hint their query toward Wikipedia if that’s what they’re looking for. (The related searches section would be a good source of searches to test. Another source could be a Twitter search for links to Google queries, and trying those. There must be thousands.)
> Yes i’d love to search 1 million terms. Fancy coming in and giving us a hand?
Sure. A screen-scraping script to do this would be trivial. You could also factor in things like the ranking of dictionary websites, to help indicate how generic the word is. Another dimension is the size of the target Wikipedia page and whether or not it’s a stub. If stub pages also rank highly then maybe something fishy is going on. I think you could even get the internal links to that Wikipedia page in an automated way without too much difficulty, using their “what links here” special pages. Actually, I think I’ll give this a shot…
> And what do you mean “It’s not as if Google has any opinion on Wikipedia” – of course it does. Why else would we see
> it on page one so often. Wikipedia is my single favourite website on the internet. But there are many pages where it
> is appearing much higher than it should. As i said this is either down to laziness or something else. This study was
> just to shed some light on this area.
There’s no evidence that Google favors Wikipedia, and plenty of evidence to the contrary. Wikipedia shows up often because it covers so many topics, is often full of content, and is frequently linked to. I also don’t see how PPC links are losing a slot to Wikipedia, since ads are separate from the organic results. The organic search results are sacrosanct at Google; they’ve even deranked their own stuff when it didn’t play by the rules. I would be willing to bet they have the same attitude toward artificially promoting results. Besides, if Google were lazy, then they wouldn’t do anything to favor Wikipedia, since it would actually require more work for them to bias the results.
Great article and good analysis Sam. Poses many a question! I appreciate a bigger sampling would be good but it’s certainly quite indicative and I like Pete_E’s suggestion about comparing it to the likes of Yahoo and Bing – now THAT would really say something. Perhaps Wikipedia are great at what they do and they deserve the ranking, perhaps Google has some affinity with their site. I’d be intrigued to see more on this.
Had an hour this morning to put together a quick scraping script. It needs some tuning still, but using the words you did it got pretty similar results when pointed at Google. I let it loose on Bing, and Wikipedia pages tended to rank even HIGHER: http://cl.ly/2p2T2M0c2l1Y0z0v2v1j
So, yeah, it’s doubtful there’s any sort of conspiracy. Any oddities in the results, like a disambiguation page for ‘Air’ showing up highly, are quirks of the page rank algorithms being applied to the emergent system that is the Web. Keep in mind that the ranking order on the page is relative. It may be that instead of Wikipedia ranking highly for certain things, other pages are ranking poorly, especially given that the most generic form of the words is being searched for.
Firstly, a really interesting well put together article. But a site that contains a huge volume of general content ranked well for one word general terms? Not a surprise to me, sorry. I would like to see the experiment repeated with say three word keywords and then see where it comes and whether the content justified it.
Hi Sufu and Philip
We’re going to do more searches with multiple keywords. But yes, I’m aware that Wikipedia will appear less and less the more keywords you choose and the more newsworthy the search becomes, e.g. Arsenal versus Liverpool Feb 2012.
However the biggest searches (via Google hot trends or Hitwise/comscore top searches per sector) are usually one or two word searches.
Try doing searches here and see that 90% of results have Wiki on page one:
http://www.google.com/trends/hottrends/atom/hourly
Many thanks for the comments
Sam
I’m pretty sure Google tunes their search results based on what people actually click on, so is it really any wonder that Wikipedia ranks so highly in this test? Who here hasn’t done random searches for things, seeking more GENERAL information rather than specific information? When doing such a search and clicking the Wikipedia link from the results, every one of us is reinforcing Google’s confidence ranking of Wikipedia as a source of decent and relevant information, and the feedback we provide perpetuates its position in the top search results.
Besides, the single-word random nouns do lend themselves extraordinarily well to things that Wikipedia covers. Their corresponding article URLs will tend to contain the search term, which is one of the oldest SEO tricks in the book: it gives a really strong hint to any search engine that the page contains information that is relevant to the search term.
The “air” example, getting a Wikipedia disambiguation page, also seems extremely reasonable to me. After all, the word appeared alone in the search terms, and the incognito browser had no recorded history for Google to make a reasonable judgement on which meaning for “air” was desired. (On the other hand, in a non-incognito browser belonging to someone who frequently shops for shoes, I’d imagine the Air shoes would be a higher-ranked result.) In fact, Google may have cross-referenced what other search terms often appear alongside “air”, and then came up with a search result that contained as many of these pairs of search terms as possible. Personally, I’m impressed by this result, and your dismissal of it as a good result is the main thing that prompted me to write this comment.
Still, I think it is interesting to note that Wikipedia is ranking highly, but I really do believe it is deserved. Long live Wikipedia!
For those finding that 990 searches returning wikipedia sites on the first page is not significant, I propose that they find 990 common words NOT giving a wikipedia result on the first page. It should be hard work….
More seriously, I think that Google may give priority to Wikipedia when you use a single word in your search; in that case your search is quite general and Wikipedia could be a good source of information. If you use two words, Wikipedia frequently disappears from the results. Take “Aardvark” alone: in my computer environment, Wikipedia comes in second and third place, and appears once more on the first page. Searching for “Aardvark reproduction” in the same environment gives no Wikipedia result on the first page or the second one; you have to go to the third page to find a Wikipedia link. So I think that maybe one of the reasons for the popularity of Wikipedia in this type of research is the priority probably given by Google to “not too specialised sites” in the results of very general searches based on only one word. The chance of giving a bad result with a link to Wikipedia is less than with a link to a very specialised scientific site or a specialised software product possibly called Aardvark. You know, Google doesn’t know if you are 12 years old or 24 when you type “Aardvark” from a random computer.
Alec – great link, thanks for that. Sorry I haven’t responded. That shows that Bing gives Wiki as much love.
What hasn’t been emphasized yet in the discussion: the huge mass of links pointing *towards* Wikipedia’s articles from the outside – lots, *lots*, and I mean HUMONGOUS HORDES of pages are linking to the W for explanations – and there are very different kinds of sites, from a one-visitor-per-year webpage to an online newspaper. IMNSHO, if they all were linking to a different source, you’d be seeing *that* one popping up quite high in the results.
Yes Piskvor – perhaps I should have pointed this out more clearly in the article. Wiki has millions of links:
Site wide: 8,450,000,000 (Majestic SEO)
With Pages indexed: 41,000,000 (Google)
I did suggest that Wiki gets referenced (linked to) a lot. In many cases it deserves it; in others it is a lazy link from a webmaster or blogger who can’t be bothered to do more research. Some pages are indeed fantastically written and deserve all the links they get; other pages are riding on the back of the general richness of the rest of the site.
In our company, we have a feature known as Grim Graph of The Week. This week, this article was chosen as the subject. The reasons are as follows:
Its gaudy appearance may have been chosen on purpose to encourage people not wanting to hear this “truth” to read on, but it’s not just the colours that make it grim. Here are some other reasons:
- There is no title on this chart and therefore it takes a while to figure out what it is showing us – and in fact we still have to resort to reading the accompanying article, which is never the sign of a clear chart
- The article explains that a recent study looked at 1,000 search terms (generated using a random noun generator) in Google and found that Wikipedia ranked on page one for a huge 99% of them
- This graph is therefore supposed to show the percentage of terms for which Wikipedia ranks in each position. Or to make it clearer, in 56% of the searches Wikipedia ranked in position number one, in 24% of searches it was position two… and so on
- It’s quite difficult to see the conclusion that Wikipedia ranked on page one for 99% of the terms clearly – in fact it looks like 100% instead because they’ve added a segment called 10+, which could be easily mistaken for 10
- It is also because they have rounded up the data to the nearest whole percentage point and that means there is no slice of pie at positions 8 or 10 – and since we look to position 10 as the cut off for page one, this is confusing
- The rounding also causes there to be a visible pale green slice of pie for position 9 even though it is labelled with 0%! This data would be much better visualised in a more subtly coloured bar chart – and definitely should have been produced using non-rounded percentage values
I hope this will encourage you to produce clearer graphs in the future.
Adam – I have received a lot of comments from people who seem to concentrate on the minutiae and totally miss the point – and are on the verge of trolling. Your comment transcends theirs.
It’s been weeks since you posted this, but I haven’t stopped laughing about the fact your company has a “Grim Graph of The Week” wall – party on.
I have to say this is a very interesting article Sam, and I’m very pleased I stumbled across it while browsing the net. I remember reading an article (I wish I had a link!) that discussed the wikipedia article on Turkey (the country), and how it appeared at the top of thousands of different google searches. At the time the summary was that it was due to the content on the page and the links going into the page.
However, I would guess that the content on the nouns above is often short and concise. Not only this, I would say the links into say ‘swordfish’ would be fewer than many of the pages below it in the search results (such as the swordfish film). This would therefore prove the prior argument had a number of flaws.
It does raise the question of the real reason Wikipedia pages are instantly indexed at the top of the results for a search term.