
XML Tags used in large Wordpress Import

Solved by creedon


I am migrating a custom blog with 1,700 blog posts and 20,000 images to Squarespace. The only option is to programmatically generate a WordPress XML file (with custom code, since the source is not WordPress) and import that.

Does anyone know which XML nodes are imported by Squarespace? Reverse engineering the XML format has worked ... kinda, but it would be really useful to know how the Squarespace import works at a code level.

Alternatively if anyone has a sample XML file that worked well, then I can use that as a reference.

  • Solution

I'd be surprised if anyone has come up with a definitive XML example file that works for importing. It all depends on what you are trying to accomplish.

Also note that SS's importer and exporter are fragile and temperamental. You can find many posts about these issues. Some of the issues are fundamental: for example, you can't export a WP XML file and then import that same file into an SS site without massaging it.

I suggest you search the forum using words like wordpress xml import. If you search on my member name you'll find some of the threads that I've participated in on this topic.

Here is a working example I use that covers some of the basics.

<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:excerpt="http://wordpress.org/export/1.2/excerpt/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:wfw="http://wellformedweb.org/CommentAPI/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:wp="http://wordpress.org/export/1.2/">
  <channel>
    <title>Your Site Title</title>
    <link>https://wedge-synthesizer-xy2y.squarespace.com</link>
    <pubDate>Tue, 17 May 2022 01:20:55 +0000</pubDate>
    <description />
    <language>en-US</language>
    <wp:wxr_version>1.2</wp:wxr_version>
    <wp:author>
      <wp:author_id>-123456789</wp:author_id>
      <wp:author_login>john@doe.com</wp:author_login>
      <wp:author_email>john@doe.com</wp:author_email>
      <wp:author_display_name><![CDATA[]]></wp:author_display_name>
      <wp:author_first_name><![CDATA[]]></wp:author_first_name>
      <wp:author_last_name><![CDATA[John Doe]]></wp:author_last_name>
    </wp:author>
    <wp:category>
      <wp:cat_name><![CDATA[null - null]]></wp:cat_name>
      <wp:category_nicename>null-null</wp:category_nicename>
      <wp:category_parent />
    </wp:category>
    <item>
      <guid isPermaLink="false">/blog-post-title-one</guid>
      <title>Blog Post Title One</title>
      <link>/blog-post-title-one</link>
      <content:encoded><![CDATA[<div
        class="
          image-block-outer-wrapper
          layout-caption-hidden
          design-layout-inline
          combination-animation-site-default
          individual-animation-site-default
          individual-text-animation-site-default
        "
        data-test="image-block-inline-outer-wrapper"
    >

      

      
        <figure
            class="
              sqs-block-image-figure
              intrinsic
            "
            style="max-width:2200px;"
        >
          
        
        

        
          
            
          <div
              
              
              class="image-block-wrapper"
              data-animation-role="image"
              
  

          >
            <div class="sqs-image-shape-container-element
              
          
        
              has-aspect-ratio
            " style="
                position: relative;
                
                  padding-bottom:63.6363639831543%;
                
                overflow: hidden;-webkit-mask-image: -webkit-radial-gradient(white, black);
              "
              >
                
                
                
                
                
                
                
                <img data-stretch="false" src="https://images.squarespace-cdn.com/content/v1/665144c260e29a097d9c1165/1717532676867-HT763LQD9LMXFIG1VN15/20140301_Trade-151_0124-copy.jpeg" data-image="https://images.squarespace-cdn.com/content/v1/665144c260e29a097d9c1165/1717532676867-HT763LQD9LMXFIG1VN15/20140301_Trade-151_0124-copy.jpeg" data-image-dimensions="2200x1400" data-image-focal-point="0.5,0.5" alt="" data-load="false" elementtiming="system-image-block" src="https://images.squarespace-cdn.com/content/v1/665144c260e29a097d9c1165/1717532676867-HT763LQD9LMXFIG1VN15/20140301_Trade-151_0124-copy.jpeg" width="2200" height="1400" alt="" sizes="(max-width: 640px) 100vw, (max-width: 767px) 100vw, 100vw" style="display:block;object-fit: cover; width: 100%; height: 100%; object-position: 50% 50%" onload="this.classList.add(&quot;loaded&quot;)" srcset="https://images.squarespace-cdn.com/content/v1/665144c260e29a097d9c1165/1717532676867-HT763LQD9LMXFIG1VN15/20140301_Trade-151_0124-copy.jpeg?format=100w 100w, https://images.squarespace-cdn.com/content/v1/665144c260e29a097d9c1165/1717532676867-HT763LQD9LMXFIG1VN15/20140301_Trade-151_0124-copy.jpeg?format=300w 300w, https://images.squarespace-cdn.com/content/v1/665144c260e29a097d9c1165/1717532676867-HT763LQD9LMXFIG1VN15/20140301_Trade-151_0124-copy.jpeg?format=500w 500w, https://images.squarespace-cdn.com/content/v1/665144c260e29a097d9c1165/1717532676867-HT763LQD9LMXFIG1VN15/20140301_Trade-151_0124-copy.jpeg?format=750w 750w, https://images.squarespace-cdn.com/content/v1/665144c260e29a097d9c1165/1717532676867-HT763LQD9LMXFIG1VN15/20140301_Trade-151_0124-copy.jpeg?format=1000w 1000w, https://images.squarespace-cdn.com/content/v1/665144c260e29a097d9c1165/1717532676867-HT763LQD9LMXFIG1VN15/20140301_Trade-151_0124-copy.jpeg?format=1500w 1500w, https://images.squarespace-cdn.com/content/v1/665144c260e29a097d9c1165/1717532676867-HT763LQD9LMXFIG1VN15/20140301_Trade-151_0124-copy.jpeg?format=2500w 2500w" loading="lazy" 
decoding="async" data-loader="sqs">

            </div>
          </div>
        
          
        

        
      
        </figure>
      

    </div>
  


  




<div class="sqs-html-content">
  <p class="" style="white-space:pre-wrap;">Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Lobortis elementum nibh tellus molestie nunc non blandit. Aliquet risus feugiat in ante metus dictum. Ornare arcu dui vivamus arcu felis bibendum ut tristique et. Ipsum a arcu cursus vitae congue mauris. Et leo duis ut diam. Eget nulla facilisi etiam dignissim diam quis enim lobortis scelerisque. Amet purus gravida quis blandit turpis cursus in. Et netus et malesuada fames ac turpis. Pharetra diam sit amet nisl suscipit adipiscing bibendum est. Dui nunc mattis enim ut tellus. Id volutpat lacus laoreet non curabitur. Interdum velit euismod in pellentesque massa placerat duis. Fusce id velit ut tortor pretium. Adipiscing elit ut aliquam purus sit amet.</p>
</div>]]></content:encoded>
      <excerpt:encoded><![CDATA[<p>It all begins with an idea.</p>]]></excerpt:encoded>
      <wp:post_name>/blog-post-title-one</wp:post_name>
      <wp:post_type>post</wp:post_type>
      <wp:post_id>1</wp:post_id>
      <wp:status>publish</wp:status>
      <pubDate>Mon, 11 Mar 2019 17:15:07 +0000</pubDate>
      <wp:post_date>2019-03-11 17:15:07</wp:post_date>
      <wp:post_date_gmt></wp:post_date_gmt>
      <category domain="post_tag" nicename="one-tag"><![CDATA[one tag]]></category>
      <category domain="post_tag" nicename="two-tag"><![CDATA[two tag]]></category>
      <category domain="category" nicename="three-category"><![CDATA[three category]]></category>
      <category domain="category" nicename="four-category"><![CDATA[four category]]></category>
      <dc:creator>john@doe.com</dc:creator>
      <wp:comment_status>closed</wp:comment_status>
      <wp:postmeta>
        <wp:meta_key>_thumbnail_id</wp:meta_key>
        <wp:meta_value><![CDATA[2]]></wp:meta_value>
      </wp:postmeta>
      <wp:comment>
        <wp:comment_id>1</wp:comment_id>
        <wp:comment_approved>0</wp:comment_approved>
        <wp:comment_author><![CDATA[Jim]]></wp:comment_author>
        <wp:comment_author_url />
        <wp:comment_author_IP></wp:comment_author_IP>
        <wp:comment_date></wp:comment_date>
        <wp:comment_date_gmt>2022-05-29 20:59:23</wp:comment_date_gmt>
        <wp:comment_content><![CDATA[<p>Very nice!</p>]]></wp:comment_content>
        <wp:comment_type />
        <wp:comment_parent>0</wp:comment_parent>
      </wp:comment>
    </item>
    <item>
      <wp:attachment_url>https://images.squarespace-cdn.com/content/60374efe93a6cb725a5c6856/1663638164742-WAC5759OGW6WTCOGV3KJ/20140301_Trade-151_0124-copy.jpeg?content-type=image%2Fjpeg</wp:attachment_url>
      <link></link>
      <title></title>
      <wp:post_name></wp:post_name>
      <wp:post_type>attachment</wp:post_type>
      <wp:post_id>2</wp:post_id>
      <wp:status>inherit</wp:status>
      <content:encoded><![CDATA[]]></content:encoded>
      <excerpt:encoded><![CDATA[]]></excerpt:encoded>
      <pubDate></pubDate>
      <wp:post_date></wp:post_date>
      <wp:post_date_gmt></wp:post_date_gmt>
      <dc:creator></dc:creator>
    </item>
  </channel>
</rss>
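
For anyone generating a file like this from a non-WordPress source, here is a minimal Python sketch of the same structure: one post item, one attachment item, and a `_thumbnail_id` value that matches the attachment's `wp:post_id`. The function name `build_wxr`, the dictionary keys, and the `example.com` image URL are illustrative placeholders, not anything Squarespace documents. Note that ElementTree writes escaped text rather than CDATA sections; that is equivalent XML, but whether the importer treats it identically is an untested assumption.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone
from email.utils import format_datetime

# Namespace URIs copied from the <rss> element of the example above.
NS = {
    "excerpt": "http://wordpress.org/export/1.2/excerpt/",
    "content": "http://purl.org/rss/1.0/modules/content/",
    "dc": "http://purl.org/dc/elements/1.1/",
    "wp": "http://wordpress.org/export/1.2/",
}
for _prefix, _uri in NS.items():
    ET.register_namespace(_prefix, _uri)


def add(parent, tag, text=None):
    """Append a child element; 'wp:post_id' style tags are namespace-qualified."""
    if ":" in tag:
        prefix, local = tag.split(":", 1)
        tag = f"{{{NS[prefix]}}}{local}"
    child = ET.SubElement(parent, tag)
    child.text = text
    return child


def build_wxr(site_title, posts):
    """Build a WXR string with one post item and one attachment item per post."""
    # version="2.0" is the standard RSS attribute; the example above omits it.
    rss = ET.Element("rss", {"version": "2.0"})
    channel = add(rss, "channel")
    add(channel, "title", site_title)
    add(channel, "wp:wxr_version", "1.2")
    next_id = 1
    for post in posts:
        post_id, image_id = next_id, next_id + 1
        next_id += 2

        item = add(channel, "item")
        add(item, "title", post["title"])
        add(item, "link", post["slug"])
        add(item, "content:encoded", post["html"])
        add(item, "excerpt:encoded", post.get("excerpt", ""))
        add(item, "wp:post_name", post["slug"])
        add(item, "wp:post_type", "post")
        add(item, "wp:post_id", str(post_id))
        add(item, "wp:status", "publish")
        # RSS-style date and WordPress-style date, as in the example above.
        add(item, "pubDate", format_datetime(post["date"]))
        add(item, "wp:post_date", post["date"].strftime("%Y-%m-%d %H:%M:%S"))
        add(item, "dc:creator", post["author"])
        # The featured image is referenced by ID, as in the example above.
        meta = add(item, "wp:postmeta")
        add(meta, "wp:meta_key", "_thumbnail_id")
        add(meta, "wp:meta_value", str(image_id))

        attachment = add(channel, "item")
        add(attachment, "wp:attachment_url", post["image_url"])
        add(attachment, "wp:post_type", "attachment")
        add(attachment, "wp:post_id", str(image_id))
        add(attachment, "wp:status", "inherit")
    body = ET.tostring(rss, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


sample_posts = [{
    "title": "Blog Post Title One",
    "slug": "/blog-post-title-one",
    "html": "<p>Lorem ipsum dolor sit amet.</p>",
    "date": datetime(2019, 3, 11, 17, 15, 7, tzinfo=timezone.utc),
    "author": "john@doe.com",
    "image_url": "https://example.com/images/photo.jpeg",  # placeholder URL
}]
wxr = build_wxr("Your Site Title", sample_posts)
print(wxr)
```

Numbering posts and attachments from a single counter keeps every `wp:post_id` unique, which seems the safest choice given that the featured image is looked up by ID.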

If you come up with an example that does something different please consider sharing it.

Let us know how it goes.

Find my contributions useful? Please like, upvote, mark my answer as the best ( solution ), and see my profile. Thanks for your support! I am a Squarespace ( and other technological things ) consultant open for new projects.


Thank you @creedon!

I used your sample file as-is and successfully imported all 1,700 posts and 20k photos.

No matter what I tried, I could not get the "alt" text of images to import, so there is a lot of manual work needed. Comments also did not import, but that was less important.

Quote

No matter what I tried, I could not get the "alt" text of images to import, so there is a lot of manual work needed.

That's par for the course. There is always lots of clean up.

Quote

Comments also did not import, but that was less important.

Did you turn on Comments in the site globally and check the Comment toggle on each post?

The example XML I provided does indeed import the comments into the backend.


I haven't done enough comments to know if you can get the toggle turned on by default per post automatically.

I've never fiddled with alt text, but I'd be surprised if SS went that deep. The XML import really is a disappointment, and it's a shame that SS apparently has it in maintenance mode.

  • 3 weeks later...

Something I have not been able to figure out is how to get an image that is part of the imported post's content to get pulled into Squarespace as an Image block rather than it getting turned into a Code block. 

For example, in the xml example above (thank you, by the way!) the "featured image" imports successfully and is saved to Assets but the image in `<content:encoded>` (when the src is changed to the same URL as the featured image) is shown in a Code block and the image file itself is not imported.

I have seen it work with this particular URL: `https://wpthemetestdata.files.wordpress.com/2008/06/dsc20050727_091048_222.jpg` but I have no idea why that image imports and others don't. 😭


It's crazy, I can do this (bare img tags):

<content:encoded><![CDATA[
  <img src="https://wpthemetestdata.files.wordpress.com/2008/06/dsc20050727_091048_222.jpg">
  <img src="https://negaidemo.wordpress.com/wp-content/uploads/2023/06/c3d43-black-and-white-hair-photography-vintage-star-hollywood-1097552-pxhere.com_.jpg">
]]></content:encoded>

... and they will both magically turn into an Image block in my imported blog post, with the image saved to the Asset Library.

But if I sub in a different src the img tags are turned into Code blocks instead, referencing external images. 

I wonder if it has to do with the URL. I tried spoofing "wordpress.com" into the URI of my images but that didn't work.

It would be great to know if I'm barking up the wrong tree. Did either of you find success importing non wordpress.com images in the body of your posts?

8 minutes ago, kirkroberts said:

I wonder if it has to do with the URL

I haven't gone deeply into this particular aspect of an import. From your description it appears that SS is parsing image tag URLs and if they match certain domains, in this case wordpress, then you get an image block. Otherwise you are left with a code block.

If this is the case, I don't get why SS would care where the image is coming from. I assume they are reading the image in, verifying it, and adding it to the AL and to the content.
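
One way to test the domain theory is to list which hosts the `<img>` tags in a post body point at, then compare that list against which images ended up as Image blocks after an import. The sketch below uses only the Python standard library; `ImgSrcCollector` and `images_by_host` are illustrative names, and the third URL in the sample is a made-up placeholder. It only groups URLs by host and says nothing about how the importer actually decides.

```python
from collections import defaultdict
from html.parser import HTMLParser
from urllib.parse import urlparse


class ImgSrcCollector(HTMLParser):
    """Collect the src attribute of every <img> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                self.sources.append(src)


def images_by_host(html):
    """Group image URLs found in an HTML fragment by hostname."""
    collector = ImgSrcCollector()
    collector.feed(html)
    collector.close()
    grouped = defaultdict(list)
    for src in collector.sources:
        grouped[urlparse(src).hostname or "(relative)"].append(src)
    return dict(grouped)


# First two URLs are the ones reported in this thread as importing correctly;
# the third is a hypothetical non-wordpress.com image.
sample_html = """
<img src="https://wpthemetestdata.files.wordpress.com/2008/06/dsc20050727_091048_222.jpg">
<img src="https://negaidemo.wordpress.com/wp-content/uploads/2023/06/c3d43-black-and-white-hair-photography-vintage-star-hollywood-1097552-pxhere.com_.jpg">
<img src="https://example.com/images/photo.jpeg">
"""
for host, urls in images_by_host(sample_html).items():
    print(host, len(urls))
```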


Thanks for your reply. This helps me think about it. Parsing the URL/domain is the only thing I can come up with, other than maybe it has something to do with the response/headers when they try to load the image... but I am skeptical that it goes that far.

If it is indeed a domain parse, then I have to find another solution. Perhaps the way forward is to tailor my custom XML so that it imports into WordPress first, and then export from WordPress to Squarespace. Crazy.

  • 2 months later...

Hi, I found this topic while doing some hunting about wordpress xml imports into Squarespace. 

I've got ~300 posts and associated featured images that I have built in WP and exported using the native export tool. 

When I go to upload it, I get a "Success" message, except that when I go to the unpublished page named after my WP site, there are no blog posts. I decided to isolate one blog post and the image attachment it references as the featured image, and got the same result. Can anybody see anything glaring that I should be looking to modify? 

<?xml version="1.0" encoding="UTF-8" ?>
<!-- This is a WordPress eXtended RSS file generated by WordPress as an export of your site. -->
<!-- It contains information about your site's posts, pages, comments, categories, and other content. -->
<!-- You may use this file to transfer that content from one site to another. -->
<!-- This file is not intended to serve as a complete backup of your site. -->

<!-- To import this information into a WordPress site follow these steps: -->
<!-- 1. Log in to that site as an administrator. -->
<!-- 2. Go to Tools: Import in the WordPress admin panel. -->
<!-- 3. Install the "WordPress" importer from the list. -->
<!-- 4. Activate & Run Importer. -->
<!-- 5. Upload this file using the form provided on that page. -->
<!-- 6. You will first be asked to map the authors in this export file to users -->
<!--    on the site. For each author, you may choose to map to an -->
<!--    existing user on the site or to create a new user. -->
<!-- 7. WordPress will then import each of the posts, pages, comments, categories, etc. -->
<!--    contained in this file into your site. -->

    <!-- generator="WordPress/6.6.1" created="2024-09-05 01:16" -->
<rss version="2.0"
    xmlns:excerpt="http://wordpress.org/export/1.2/excerpt/"
    xmlns:content="http://purl.org/rss/1.0/modules/content/"
    xmlns:wfw="http://wellformedweb.org/CommentAPI/"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    xmlns:wp="http://wordpress.org/export/1.2/"
>

<channel>
    <title>Test 2</title>
    <link>https://burlybottomdesigns.squarespace.com</link>
    <description></description>
    <pubDate>Thu, 05 Sep 2024 01:16:17 +0000</pubDate>
    <language>en-US</language>
    <wp:wxr_version>1.2</wp:wxr_version>

    <wp:author>
        <wp:author_id>2</wp:author_id>
        <wp:author_login><![CDATA[atwookie]]></wp:author_login>
        <wp:author_email><![CDATA[Chris@email.com]]></wp:author_email>
        <wp:author_display_name><![CDATA[atwookie]]></wp:author_display_name>
        <wp:author_first_name><![CDATA[]]></wp:author_first_name>
        <wp:author_last_name><![CDATA[]]></wp:author_last_name>
    </wp:author>

    <generator>https://wordpress.org/?v=6.6.1</generator>
        <item>
        <title><![CDATA[War on Drag]]></title>
        <link>/war-on-drag/</link>
        <pubDate>Wed, 04 Sep 2024 17:08:04 +0000</pubDate>
        <dc:creator><![CDATA[atwookie]]></dc:creator>
        <guid isPermaLink="false">/war-on-drag/</guid>
        <description></description>
        

        <content:encoded><![CDATA[<div   class="sqs-block-button-container sqs-block-button-container--center"   data-animation-role="button"   data-alignment="center"   data-button-size="medium"   data-button-type="primary" >   <a     href="/shop?tag=Design%20War%20on%20Drag"     class="sqs-block-button-element--medium sqs-button-element--primary sqs-block-button-element"        >     Products with this Design   </a> </div> <hr />  <div class="sqs-html-content">   <h4 style="white-space:pre-wrap;">Rating</h4><p class="" style="white-space:pre-wrap;">NSFW</p> </div>  <hr />  <div class="sqs-html-content">   <h4 style="white-space:pre-wrap;">Themes</h4><p class="" style="white-space:pre-wrap;">Pride, Drag, Political, Design War on Drag, NSFW</p> </div>]]></content:encoded>
        <excerpt:encoded><![CDATA[]]></excerpt:encoded>
        
        <wp:post_id>883</wp:post_id>
        <wp:post_date><![CDATA[2024-09-04 17:08:04]]></wp:post_date>
        <wp:post_date_gmt><![CDATA[2024-09-04 17:08:04]]></wp:post_date_gmt>
        <wp:comment_status><![CDATA[closed]]></wp:comment_status>
        <wp:ping_status><![CDATA[closed]]></wp:ping_status>
        <wp:post_name><![CDATA[war-on-drag]]></wp:post_name>
        <wp:status><![CDATA[publish]]></wp:status>
        <wp:post_parent>0</wp:post_parent>
        <wp:menu_order>0</wp:menu_order>
        <wp:post_type><![CDATA[post]]></wp:post_type>
        <wp:post_password><![CDATA[]]></wp:post_password>
        <wp:is_sticky>0</wp:is_sticky>            
        <category domain="post_tag" nicename="drag"><![CDATA[Drag]]></category>
        <category domain="category" nicename="nsfw"><![CDATA[NSFW]]></category>
        <category domain="post_tag" nicename="political"><![CDATA[Political]]></category>
        <category domain="post_tag" nicename="pride"><![CDATA[Pride]]></category>
                        <wp:postmeta>
        <wp:meta_key><![CDATA[_wp_page_template]]></wp:meta_key>
        <wp:meta_value><![CDATA[default]]></wp:meta_value>
        </wp:postmeta>
                            <wp:postmeta>
        <wp:meta_key><![CDATA[_thumbnail_id]]></wp:meta_key>
        <wp:meta_value><![CDATA[487]]></wp:meta_value>
        </wp:postmeta>
                            </item>

        <item>
        <title><![CDATA[war-on-drag]]></title>
        <link></link>
        <pubDate>Mon, 02 Sep 2024 13:28:55 +0000</pubDate>
        <dc:creator><![CDATA[atwookie]]></dc:creator>
        <content:encoded><![CDATA[]]></content:encoded>
        <excerpt:encoded><![CDATA[]]></excerpt:encoded>
        <wp:post_id>131</wp:post_id>
        <wp:post_date><![CDATA[2024-09-02 13:28:55]]></wp:post_date>
        <wp:post_date_gmt><![CDATA[2024-09-02 13:28:55]]></wp:post_date_gmt>
        <wp:post_name><![CDATA[war-on-drag]]></wp:post_name>
        <wp:status><![CDATA[inherit]]></wp:status>
        <wp:post_type><![CDATA[attachment]]></wp:post_type>
        <wp:attachment_url><![CDATA[https://rockfish.com/website_5b51253d/wp-content/uploads/2024/09/war-on-drag.png]]></wp:attachment_url>
        </item>

                </channel>
</rss>


    

27 minutes ago, Atwookie said:

When I go to upload it, I get a "Success" message, except that when I go to the unpublished page named after my WP site, there are no blog posts.

After the successful import did you reload the site? I have found that you don't see the new blog page added to the site unless you do that.

I was able to import the file into my test site. Although I wouldn't say it was a successful import. I saw warnings and things may not be the way you intended.


I have no solutions.


Like @creedon said, the requirement to reload the entire Squarespace interface in order to see the just imported blog/pages bewildered me for quite some time.

@Atwookie I'm not sure if these will be helpful, but here are some differences I see between your xml and one I was able to import successfully.

After 

<wp:wxr_version>1.2</wp:wxr_version>

I have (use your domain)

<wp:base_site_url>https://somewebsite.com/</wp:base_site_url>
<wp:base_blog_url>https://somewebsite.com/blog</wp:base_blog_url>

In each <item> my <link> and <guid> tags use full/absolute URLs. You have relative URLs. But then again so does @creedon's original xml, so that is probably not the issue.

Maybe some of that will be helpful. I know I struggled with this for quite a while.
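
If you want to patch an existing export rather than edit it by hand, the two tags can be inserted with a short script. This is a sketch using Python's ElementTree; `add_base_urls` is an illustrative name and the sample string is a stripped-down stand-in for a real export. Be aware that ElementTree drops XML comments and rewrites CDATA sections as escaped text when it round-trips a file; I am assuming the importer accepts that, which is untested.

```python
import xml.etree.ElementTree as ET

# Namespaces from the export header, registered so prefixes survive a round trip.
NAMESPACES = {
    "excerpt": "http://wordpress.org/export/1.2/excerpt/",
    "content": "http://purl.org/rss/1.0/modules/content/",
    "wfw": "http://wellformedweb.org/CommentAPI/",
    "dc": "http://purl.org/dc/elements/1.1/",
    "wp": "http://wordpress.org/export/1.2/",
}
for _prefix, _uri in NAMESPACES.items():
    ET.register_namespace(_prefix, _uri)

WP = "{" + NAMESPACES["wp"] + "}"


def add_base_urls(wxr_text, site_url, blog_url):
    """Insert wp:base_site_url and wp:base_blog_url right after wp:wxr_version."""
    root = ET.fromstring(wxr_text)
    channel = root.find("channel")
    anchor = channel.find(f"{WP}wxr_version")
    if anchor is None:
        raise ValueError("no <wp:wxr_version> element found in <channel>")
    for name, value in [("base_site_url", site_url), ("base_blog_url", blog_url)]:
        existing = channel.find(f"{WP}{name}")
        if existing is None:
            existing = ET.Element(f"{WP}{name}")
            existing.text = value
            existing.tail = anchor.tail  # keep the surrounding indentation
            channel.insert(list(channel).index(anchor) + 1, existing)
        anchor = existing  # the next tag goes after this one
    return ET.tostring(root, encoding="unicode")


sample = (
    '<rss version="2.0" xmlns:wp="http://wordpress.org/export/1.2/">'
    "<channel><title>Test 2</title>"
    "<wp:wxr_version>1.2</wp:wxr_version>"
    "<item><title>War on Drag</title></item>"
    "</channel></rss>"
)
print(add_base_urls(sample, "https://somewebsite.com/", "https://somewebsite.com/blog"))
```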

Edited by kirkroberts

@creedon and @kirkroberts thank you both for your help. I had to do a hard refresh and enable the page before the post would show up. That gives me a little hope LOL. I think I found a way to resolve the issue with the image not coming in, but need to test it. 
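
One check that may be worth running while testing: in the isolated file posted above, the post's `_thumbnail_id` is 487, while the only attachment item has `wp:post_id` 131. In creedon's example the two values match (both are 2). Whether that mismatch is what keeps the image from importing is an assumption, not something confirmed in this thread. The sketch below lists posts whose thumbnail ID has no matching attachment item; `missing_thumbnails` is an illustrative name and the sample is a trimmed copy of the isolated file.

```python
import xml.etree.ElementTree as ET

WP = "{http://wordpress.org/export/1.2/}"


def missing_thumbnails(wxr_text):
    """Return (post_id, thumbnail_id) pairs whose thumbnail has no attachment item."""
    channel = ET.fromstring(wxr_text).find("channel")
    attachment_ids = set()
    wanted = []
    for item in channel.findall("item"):
        post_id = item.findtext(f"{WP}post_id", default="").strip()
        post_type = item.findtext(f"{WP}post_type", default="").strip()
        if post_type == "attachment":
            attachment_ids.add(post_id)
        for meta in item.findall(f"{WP}postmeta"):
            key = meta.findtext(f"{WP}meta_key", default="").strip()
            if key == "_thumbnail_id":
                value = meta.findtext(f"{WP}meta_value", default="").strip()
                wanted.append((post_id, value))
    return [(post, thumb) for post, thumb in wanted if thumb not in attachment_ids]


# Trimmed copy of the isolated export above, keeping only the ID-related tags.
sample = """<rss version="2.0" xmlns:wp="http://wordpress.org/export/1.2/">
<channel>
  <item>
    <wp:post_id>883</wp:post_id>
    <wp:post_type><![CDATA[post]]></wp:post_type>
    <wp:postmeta>
      <wp:meta_key><![CDATA[_thumbnail_id]]></wp:meta_key>
      <wp:meta_value><![CDATA[487]]></wp:meta_value>
    </wp:postmeta>
  </item>
  <item>
    <wp:post_id>131</wp:post_id>
    <wp:post_type><![CDATA[attachment]]></wp:post_type>
  </item>
</channel>
</rss>"""
print(missing_thumbnails(sample))
```

An empty list means every featured-image reference points at an attachment item that exists in the same file.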

 

5 hours ago, Atwookie said:

I think I found a way to resolve the issue with the image not coming in, but need to test it. 

I may have touched on this issue before, though I can't say for sure. I have noted issues with thumbnails coming in several times. From a previous post of mine in this thread:

Quote

I suggest you search the forum using words like wordpress xml import. If you search on my member name you'll find some of the threads that I've participated in on this topic.

I'm not offering any particular solution.
