How do I find and edit my robots.txt file?

Go to solution Solved by paul2009,

Site URL: https://www.stevemaskery.com/

Hi, I'm new to Squarespace and don't have a lot of previous experience with HTML.

I have two domains, workshopessentials.com and stevemaskery.com, both of which are forwarded to:

https://amphibian-celery-b7en.squarespace.com

I'm getting a nag from Google saying that I need to add a couple of lines to my robots.txt file so that it can index my images properly. How do I find and access this file please?

Many thanks
Steve

 

  • Solution
On 3/16/2022 at 10:47 AM, SteveMaskery said:

How do I find and access [my robots.txt] file please?

All Squarespace sites use the same robots.txt file, and as a Squarespace user you cannot access or edit it. Squarespace have already specified the pages that should not be crawled by search engines because they're for internal use only or display duplicate content. For example, /config/ is your Admin login page, and /api/ is used by the Analytics tracking cookie.

Here are some pages that Squarespace ask search engines not to crawl. These pages organize content that exists elsewhere on your site.

/search

/*?author=*

/*&author=*

/*?tag=*

/*&tag=*

/*?month=*

/*&month=*

/*?view=*

/*&view=*

/*?format=*

/*&format=*

/*?reversePaginate=*

/*&reversePaginate=*

For a complete list of the excluded pages, view the robots.txt file on any Squarespace website.
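
To see which URLs the patterns listed above would catch, here is a minimal Python sketch. It assumes Google-style wildcard semantics (`*` matches any run of characters, and a rule matches from the start of the path). The function names and sample paths are made up for illustration, and this is not Squarespace's or Google's actual matcher.

```python
import re

# A few of the Disallow patterns quoted in the list above.
DISALLOW_PATTERNS = [
    "/search",
    "/*?author=*",
    "/*&author=*",
    "/*?tag=*",
    "/*&tag=*",
    "/*?month=*",
    "/*&month=*",
]


def pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a compiled regex.

    Assumption: '*' matches any run of characters, a trailing '$'
    anchors the end, and otherwise the rule is a prefix match.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_disallowed(path_and_query: str, patterns=DISALLOW_PATTERNS) -> bool:
    """Return True if any pattern matches from the start of the path."""
    return any(pattern_to_regex(p).match(path_and_query) for p in patterns)


if __name__ == "__main__":
    # Hypothetical paths: a tag filter view and an individual blog post.
    for path in ["/blog?tag=news", "/blog/my-first-post"]:
        print(path, is_disallowed(path))
```

Under these assumptions, a filtered view such as `/blog?tag=news` is caught by a rule, while an individual post URL such as `/blog/my-first-post` is not.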

Edited by paul2009
Added additional information

About me: I've been a SQSP User for 18 yrs. I was invited to join the Circle when it launched in 2016. I have been a Circle Leader since 2017. I don't work for Squarespace. I value honesty, transparency, diversity and good design ♥.
Work: I founded and run SF.DIGITAL, building Squarespace Extensions to supercharge your commerce website. 
Content: Views and opinions are my own. Links in my posts may refer to SF.DIGITAL products or may be affiliate links.
Forum advice is free. You can thank me by clicking one of the feedback emojis below. Coffee is optional.


Ah, OK, thank you.

Google is telling me this:

Quote

 

Dear Google Merchant Center user,

Merchant Center Account:  xxxxxxxxx

We have detected a recent configuration change of the site hosting your images that result in the disapproval of some of your items in your Merchant Center account.

Since images are an important part of the rich product information shown in Shopping ads, we require that all items include a valid image that can be indexed by Google. We crawl the images you submit to Merchant Center every few weeks to ensure that users always see the most recent version. Our ability to crawl your images can be restricted with a robots.txt file, which might lead to the disapproval of affected items. Learn more about robots.txt by visiting https://support.google.com/webmasters/answer/6062608.

What's the issue?
We have detected that a robots.txt file that controls the indexing of some of your provided images has been updated recently. As a result of these updates we aren't able to index the images of some of your items. This will result in the disapproval of the affected items.

Details and impact:
Estimated percentage of offers affected: 100
File: http://www.workshopessentials.com/robots.txt

In order for us to access these images, please modify the robots.txt file mentioned above to allow the user-agent Googlebot-Image to index these images. You can do this by adding the following lines to the robots.txt file:

User-agent:       Googlebot-Image
Disallow:

If modifying this robots.txt file is not feasible you might want to consider hosting your images on a different hosting service that allows images to be indexed by Google.

 

Is this something I should be worried about or can I safely ignore it?

Many thanks
Steve
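
For anyone wanting to check a claim like this, a rough way to test whether a given set of robots.txt rules blocks Googlebot-Image is Python's standard `urllib.robotparser`. This is only a sketch: the image URL and the "blocking" rule set are placeholders I made up, and as far as I know the standard-library parser does plain prefix matching and does not implement Google's wildcard extensions, so it can disagree with Google on rules that contain `*`.

```python
from urllib.robotparser import RobotFileParser


def can_crawl(robots_txt: str, user_agent: str, url: str) -> bool:
    """Check a URL against robots.txt rules supplied as a string."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# The two lines suggested in the Google email: an empty Disallow
# value means nothing is disallowed for that user agent.
GOOGLE_SUGGESTED = """\
User-agent: Googlebot-Image
Disallow:
"""

# Hypothetical rule set that would block the image crawler entirely.
BLOCKING_EXAMPLE = """\
User-agent: Googlebot-Image
Disallow: /
"""

# Placeholder URL, not a real product image.
IMAGE_URL = "https://www.example.com/images/product.jpg"

if __name__ == "__main__":
    print(can_crawl(GOOGLE_SUGGESTED, "Googlebot-Image", IMAGE_URL))
    print(can_crawl(BLOCKING_EXAMPLE, "Googlebot-Image", IMAGE_URL))
```

To test a live site instead of a string, `RobotFileParser` also offers `set_url()` followed by `read()`, which downloads the robots.txt file before you call `can_fetch()`.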

 

 

Edited by SteveMaskery
Remove sensitive info
  • 1 month later...

@burmdoge What issue are you trying to address? 


Hi Paul,

I'm trying to get the index page of our Squarespace website indexed, but URL Inspection in Google Search Console is showing:

URL is not on Google: Indexing errors

Failed: Redirect error  

 https://mesaio.com.au

It looks like it's related to the robots.txt file. Could you please help resolve this issue?

Thank you,

Dion

  • 3 weeks later...
  • 1 month later...
  • 1 month later...
  • 1 month later...
  • 2 weeks later...
On 9/5/2022 at 5:29 PM, MJB1923 said:

Squarespace robots.txt is blocking Google from indexing my blog posts.

How do I edit the robots.txt to stop this? Totally ridiculous. Checking before I have to move my site off of Squarespace as it will not allow blog posts to be indexed. Regular pages are indexed.

I'm experiencing exactly the same thing! Can someone from Squarespace please help? Blog posts are certainly NOT pages that Google should be prevented from crawling. Tell us how to fix it, or a lot of people will move their business away from Squarespace.

On 11/15/2022 at 11:14 AM, Chiara1234 said:

Squarespace robots.txt is blocking Google from indexing my blog posts. Can someone from Squarespace please help? Because blogposts sure are NOT pages that google should not crawl.

@Chiara1234 I believe you have misunderstood the message from Google.

Squarespace is not blocking Google from indexing your blog posts. Google is indexing your blog but Squarespace is correctly preventing Google from indexing the category links on your blog page, for example /blog/category/Reviews.

This link leads to duplicate content (the same blog posts that have already been indexed individually) and so this link should not be indexed, as I mentioned in my answer above.

  • 8 months later...
On 3/6/2023 at 7:36 AM, Desi-Rae said:

Can Squarespace PLEASE stop controlling this? It's unnecessary, has far reaching negative implications for users, and why? Who decided these things? Why should it not be indexed? Why can't the user decide this? What exactly will this break for Squarespace while it is positively hurting customers?

As a web developer with 14 years' experience, I 100% disagree. A robots.txt file is extremely important to SEO, and given that Squarespace has intimate knowledge of its own systems that everyday users could not possibly have, it absolutely should set the default robots.txt.

You need to have more trust in a company that literally creates and runs millions of websites than in some article you read online... context is key here. You get the benefit of their expertise, and essentially they are saving you from killing your SEO in hundreds of different ways through a misconfigured robots.txt, or through missing rules because you don't know all the paths Squarespace makes available.

You have asked a lot of questions, and if you don't already know the answers, you should not be making changes to a robots.txt, as you will ruin your website, despite what you think.

Ideally, a section in the admin, with sufficient warnings, that lets you add rules that are appended to the default robots.txt could be good. Then again, it would be a very advanced feature, and users would need to understand some really important whys and why-nots before adding pages there.

  • 4 months later...
On 8/8/2023 at 5:43 AM, BenCircular said:

As a web developer with 14 years' experience, I 100% disagree. A robots.txt file is extremely important to SEO, and given that Squarespace has intimate knowledge of its own systems that everyday users could not possibly have, it absolutely should set the default robots.txt.

You need to have more trust in a company that literally creates and runs millions of websites than in some article you read online... context is key here. You get the benefit of their expertise, and essentially they are saving you from killing your SEO in hundreds of different ways through a misconfigured robots.txt, or through missing rules because you don't know all the paths Squarespace makes available.

You have asked a lot of questions, and if you don't already know the answers, you should not be making changes to a robots.txt, as you will ruin your website, despite what you think.

Ideally, a section in the admin, with sufficient warnings, that lets you add rules that are appended to the default robots.txt could be good. Then again, it would be a very advanced feature, and users would need to understand some really important whys and why-nots before adding pages there.

I would have to disagree with you on this. I have worked in e-commerce and SEO for 15 years, and being able to control your own robots.txt is pretty essential functionality.

I'm not so interested in what Squarespace chooses not to index by default, and I trust them on that too. But I have additional pages on my site that I want to block from being indexed or crawled. As far as I can see, I'm not able to define this, and without access to robots.txt I don't have the power to.

There are many people out here using Squarespace sites who aren't developers but have a lot of web experience and do need to be able to manage the SEO of their sites properly. If Squarespace wants to claim to be a good platform for SEO, then they need to provide the tools for businesses to make that happen.

I've worked with many major e-commerce and web platforms and consider access to this to be a very standard feature these days.

1 hour ago, asillince said:

I have additional pages on my site that I want to be able to block from being indexed/crawled and as far as I can see I'm not able to define this

Squarespace has a built-in feature that allows you to hide pages (and collections) from search engines. You'll find this setting in the settings panel of each page, within the SEO tab. To find this, hover over a page title in the PAGES panel and click the gear icon next to it.
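
If you want to confirm that a hidden page really is marked as non-indexable, one approach is to look for a robots meta tag in the page's HTML. The sketch below assumes the setting works by adding a `noindex` robots meta tag, which is the usual mechanism for this kind of option but is not confirmed in this thread. The class and function names are my own.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect the directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute values can be None when an attribute has no value.
        attr_map = {name.lower(): (value or "") for name, value in attrs}
        if attr_map.get("name", "").lower() == "robots":
            for directive in attr_map.get("content", "").split(","):
                self.directives.add(directive.strip().lower())


def is_noindex(html: str) -> bool:
    """Return True if the HTML contains a robots meta tag with 'noindex'."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives
```

You would pass it the HTML source of the page in question, for example the text saved from your browser's "view source" window.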
