
exclude a PDF from search engine crawl

The_Glad_Hatter (One Sly Fox Underneath a Groovy Hat) Registered User regular
edited October 2010 in Help / Advice Forum
I've been googling about this and couldn't really find anything on how to exclude a file.

You know how Google searches PDFs and even has that online preview version of them?

Well, I'd rather not have my PDF crawled.

It's a resume containing my address etc. that'll be hosted on my portfolio site. Is there a way to exclude a PDF from any type of crawling/indexing?

This is the first time I'm making a website without another techie to help me out, so excuse me if this is a super-dumb question...


Posts

  • zeeny Registered User regular
    edited October 2010
    Put it in a separate directory on the host and block compliant crawlers with robots.txt.
    http://www.google.com/support/webmasters/bin/answer.py?hl=en&answer=156449

    Edit: Quoted in case it's not evident:
    The simplest robots.txt file uses two rules:

    * User-agent: the robot the following rule applies to
    * Disallow: the URL you want to block

    These two lines are considered a single entry in the file. You can include as many entries as you want. You can include multiple Disallow lines and multiple user-agents in one entry.

    Each section in the robots.txt file is separate and does not build upon previous sections. For example:

    User-agent: *
    Disallow: /folder1/

    User-Agent: Googlebot
    Disallow: /folder2/

    In this example only the URLs matching /folder2/ would be disallowed for Googlebot.
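
    Applied to this thread's case, a minimal robots.txt sketch might look like this (the path /files/resume.pdf is only an assumption; adjust it to wherever the PDF actually lives):

    # block all compliant crawlers from the resume (hypothetical path)
    User-agent: *
    Disallow: /files/resume.pdf

    One caveat: robots.txt is itself publicly readable, so anyone can open it and see the paths you are trying to keep out of search results.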

  • ecco the dolphin Registered User regular
    edited October 2010
    Are you able to upload a file called "robots.txt" to the root directory of your website, i.e. so that yourdomain.com/robots.txt is accessible to search engines?

    This file can be used to tell well-behaved search engines to ignore specific files.

    See the example on Wikipedia.

    Edit: Outplayed, sir! Well played zeeny, well played. =P

    Penny Arcade Developers at PADev.net.
  • soxbox Registered User regular
    edited October 2010
    If you've already put it up and Google has already indexed it, see http://www.google.com/support/webmasters/bin/answer.py?hl=en&answer=164734

  • The_Glad_Hatter (One Sly Fox Underneath a Groovy Hat) Registered User regular
    edited October 2010
    Okay, I thought robots.txt only worked for other stuff; apparently it doesn't. I'll just have to move the PDF into a folder.
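
    For anyone who finds this later, the folder variant is just as short (assuming the PDF moves into a folder called /private/; the folder name is arbitrary):

    # keep compliant crawlers out of the whole folder (hypothetical folder name)
    User-agent: *
    Disallow: /private/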

    Thanks guys!

    put a fork in it, this thread's done!
    and by fork, i mean lock.
    /lamejoke
