XML Tags used in large WordPress Import

Solved by creedon

I am migrating a custom blog with 1,700 blog posts and 20,000 images to Squarespace. The only option is to programmatically create a WordPress XML file (custom code, since the source is not WordPress) and import that.

Does anyone know which XML nodes are imported by Squarespace? Reverse engineering the XML format has worked ... kinda, but it would be really useful to know how the Squarespace import works at a code level.

Alternatively if anyone has a sample XML file that worked well, then I can use that as a reference.

  • Solution

I'd be surprised if anyone has come up with a definitive XML example file that works for importing. It all depends on what you are trying to accomplish.

Also note that SS's importer and exporter are fragile and temperamental; you can find many posts about these issues. Some of the problems are fundamental: for example, you can't export a WP XML file and then import that same file into an SS site without massaging it.

I suggest you search the forum using words like wordpress xml import. If you search on my member name you'll find some of the threads that I've participated in on this topic.

Here is a working example I use that covers some of the basics.

<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:excerpt="http://wordpress.org/export/1.2/excerpt/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:wfw="http://wellformedweb.org/CommentAPI/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:wp="http://wordpress.org/export/1.2/">
  <channel>
    <title>Your Site Title</title>
    <link>https://wedge-synthesizer-xy2y.squarespace.com</link>
    <pubDate>Tue, 17 May 2022 01:20:55 +0000</pubDate>
    <description />
    <language>en-US</language>
    <wp:wxr_version>1.2</wp:wxr_version>
    <wp:author>
      <wp:author_id>-123456789</wp:author_id>
      <wp:author_login>john@doe.com</wp:author_login>
      <wp:author_email>john@doe.com</wp:author_email>
      <wp:author_display_name><![CDATA[]]></wp:author_display_name>
      <wp:author_first_name><![CDATA[]]></wp:author_first_name>
      <wp:author_last_name><![CDATA[John Doe]]></wp:author_last_name>
    </wp:author>
    <wp:category>
      <wp:cat_name><![CDATA[null - null]]></wp:cat_name>
      <wp:category_nicename>null-null</wp:category_nicename>
      <wp:category_parent />
    </wp:category>
    <item>
      <guid isPermaLink="false">/blog-post-title-one</guid>
      <title>Blog Post Title One</title>
      <link>/blog-post-title-one</link>
      <content:encoded><![CDATA[<div
        class="
          image-block-outer-wrapper
          layout-caption-hidden
          design-layout-inline
          combination-animation-site-default
          individual-animation-site-default
          individual-text-animation-site-default
        "
        data-test="image-block-inline-outer-wrapper"
    >

      

      
        <figure
            class="
              sqs-block-image-figure
              intrinsic
            "
            style="max-width:2200px;"
        >
          
        
        

        
          
            
          <div
              
              
              class="image-block-wrapper"
              data-animation-role="image"
              
  

          >
            <div class="sqs-image-shape-container-element
              
          
        
              has-aspect-ratio
            " style="
                position: relative;
                
                  padding-bottom:63.6363639831543%;
                
                overflow: hidden;-webkit-mask-image: -webkit-radial-gradient(white, black);
              "
              >
                
                
                
                
                
                
                
                <img data-stretch="false" src="https://images.squarespace-cdn.com/content/v1/665144c260e29a097d9c1165/1717532676867-HT763LQD9LMXFIG1VN15/20140301_Trade-151_0124-copy.jpeg" data-image="https://images.squarespace-cdn.com/content/v1/665144c260e29a097d9c1165/1717532676867-HT763LQD9LMXFIG1VN15/20140301_Trade-151_0124-copy.jpeg" data-image-dimensions="2200x1400" data-image-focal-point="0.5,0.5" alt="" data-load="false" elementtiming="system-image-block" src="https://images.squarespace-cdn.com/content/v1/665144c260e29a097d9c1165/1717532676867-HT763LQD9LMXFIG1VN15/20140301_Trade-151_0124-copy.jpeg" width="2200" height="1400" alt="" sizes="(max-width: 640px) 100vw, (max-width: 767px) 100vw, 100vw" style="display:block;object-fit: cover; width: 100%; height: 100%; object-position: 50% 50%" onload="this.classList.add(&quot;loaded&quot;)" srcset="https://images.squarespace-cdn.com/content/v1/665144c260e29a097d9c1165/1717532676867-HT763LQD9LMXFIG1VN15/20140301_Trade-151_0124-copy.jpeg?format=100w 100w, https://images.squarespace-cdn.com/content/v1/665144c260e29a097d9c1165/1717532676867-HT763LQD9LMXFIG1VN15/20140301_Trade-151_0124-copy.jpeg?format=300w 300w, https://images.squarespace-cdn.com/content/v1/665144c260e29a097d9c1165/1717532676867-HT763LQD9LMXFIG1VN15/20140301_Trade-151_0124-copy.jpeg?format=500w 500w, https://images.squarespace-cdn.com/content/v1/665144c260e29a097d9c1165/1717532676867-HT763LQD9LMXFIG1VN15/20140301_Trade-151_0124-copy.jpeg?format=750w 750w, https://images.squarespace-cdn.com/content/v1/665144c260e29a097d9c1165/1717532676867-HT763LQD9LMXFIG1VN15/20140301_Trade-151_0124-copy.jpeg?format=1000w 1000w, https://images.squarespace-cdn.com/content/v1/665144c260e29a097d9c1165/1717532676867-HT763LQD9LMXFIG1VN15/20140301_Trade-151_0124-copy.jpeg?format=1500w 1500w, https://images.squarespace-cdn.com/content/v1/665144c260e29a097d9c1165/1717532676867-HT763LQD9LMXFIG1VN15/20140301_Trade-151_0124-copy.jpeg?format=2500w 2500w" loading="lazy" 
decoding="async" data-loader="sqs">

            </div>
          </div>
        
          
        

        
      
        </figure>
      

    </div>
  


  




<div class="sqs-html-content">
  <p class="" style="white-space:pre-wrap;">Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Lobortis elementum nibh tellus molestie nunc non blandit. Aliquet risus feugiat in ante metus dictum. Ornare arcu dui vivamus arcu felis bibendum ut tristique et. Ipsum a arcu cursus vitae congue mauris. Et leo duis ut diam. Eget nulla facilisi etiam dignissim diam quis enim lobortis scelerisque. Amet purus gravida quis blandit turpis cursus in. Et netus et malesuada fames ac turpis. Pharetra diam sit amet nisl suscipit adipiscing bibendum est. Dui nunc mattis enim ut tellus. Id volutpat lacus laoreet non curabitur. Interdum velit euismod in pellentesque massa placerat duis. Fusce id velit ut tortor pretium. Adipiscing elit ut aliquam purus sit amet.</p>
</div>]]></content:encoded>
      <excerpt:encoded><![CDATA[<p>It all begins with an idea.</p>]]></excerpt:encoded>
      <wp:post_name>/blog-post-title-one</wp:post_name>
      <wp:post_type>post</wp:post_type>
      <wp:post_id>1</wp:post_id>
      <wp:status>publish</wp:status>
      <pubDate>Mon, 11 Mar 2019 17:15:07 +0000</pubDate>
      <wp:post_date>2019-03-11 17:15:07</wp:post_date>
      <wp:post_date_gmt></wp:post_date_gmt>
      <category domain="post_tag" nicename="one-tag"><![CDATA[one tag]]></category>
      <category domain="post_tag" nicename="two-tag"><![CDATA[two tag]]></category>
      <category domain="category" nicename="three-category"><![CDATA[three category]]></category>
      <category domain="category" nicename="four-category"><![CDATA[four category]]></category>
      <dc:creator>john@doe.com</dc:creator>
      <wp:comment_status>closed</wp:comment_status>
      <wp:postmeta>
        <wp:meta_key>_thumbnail_id</wp:meta_key>
        <wp:meta_value><![CDATA[2]]></wp:meta_value>
      </wp:postmeta>
      <wp:comment>
        <wp:comment_id>1</wp:comment_id>
        <wp:comment_approved>0</wp:comment_approved>
        <wp:comment_author><![CDATA[Jim]]></wp:comment_author>
        <wp:comment_author_url />
        <wp:comment_author_IP></wp:comment_author_IP>
        <wp:comment_date></wp:comment_date>
        <wp:comment_date_gmt>2022-05-29 20:59:23</wp:comment_date_gmt>
        <wp:comment_content><![CDATA[<p>Very nice!</p>]]></wp:comment_content>
        <wp:comment_type />
        <wp:comment_parent>0</wp:comment_parent>
      </wp:comment>
    </item>
    <item>
      <wp:attachment_url>https://images.squarespace-cdn.com/content/60374efe93a6cb725a5c6856/1663638164742-WAC5759OGW6WTCOGV3KJ/20140301_Trade-151_0124-copy.jpeg?content-type=image%2Fjpeg</wp:attachment_url>
      <link></link>
      <title></title>
      <wp:post_name></wp:post_name>
      <wp:post_type>attachment</wp:post_type>
      <wp:post_id>2</wp:post_id>
      <wp:status>inherit</wp:status>
      <content:encoded><![CDATA[]]></content:encoded>
      <excerpt:encoded><![CDATA[]]></excerpt:encoded>
      <pubDate></pubDate>
      <wp:post_date></wp:post_date>
      <wp:post_date_gmt></wp:post_date_gmt>
      <dc:creator></dc:creator>
    </item>
  </channel>
</rss>
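
For anyone generating the file from a non-WordPress source, the structure above can be sketched in Python roughly as follows. This is a minimal sketch, not a reference implementation: the `build_wxr` function and the dictionary keys (`slug`, `title`, `html`, `date`, `author`, `image_url`) are hypothetical names chosen for illustration, and only a subset of the elements from the sample is emitted. Note that `xml.etree.ElementTree` escapes text instead of writing CDATA sections; an XML parser reads both the same way, but whether the SS importer treats them identically is an assumption I have not tested.

```python
import xml.etree.ElementTree as ET

# Namespace URIs copied from the <rss> element of the sample above.
NS = {
    "excerpt": "http://wordpress.org/export/1.2/excerpt/",
    "content": "http://purl.org/rss/1.0/modules/content/",
    "wfw": "http://wellformedweb.org/CommentAPI/",
    "dc": "http://purl.org/dc/elements/1.1/",
    "wp": "http://wordpress.org/export/1.2/",
}
for prefix, uri in NS.items():
    ET.register_namespace(prefix, uri)


def q(name):
    """Turn 'wp:post_id' into ElementTree's '{uri}post_id' form."""
    prefix, _, local = name.rpartition(":")
    return f"{{{NS[prefix]}}}{local}" if prefix else local


def add(parent, name, text=""):
    """Append a child element with the given text and return it."""
    child = ET.SubElement(parent, q(name))
    child.text = text
    return child


def build_wxr(site_title, site_link, posts):
    """Build a WXR string from post dicts (hypothetical keys, see lead-in)."""
    rss = ET.Element("rss")
    channel = ET.SubElement(rss, "channel")
    add(channel, "title", site_title)
    add(channel, "link", site_link)
    add(channel, "wp:wxr_version", "1.2")
    next_id = 1
    for post in posts:
        post_id, image_id = next_id, next_id + 1
        next_id += 2
        # The post itself.
        item = ET.SubElement(channel, "item")
        add(item, "title", post["title"])
        add(item, "link", "/" + post["slug"])
        add(item, "content:encoded", post["html"])
        add(item, "wp:post_name", "/" + post["slug"])
        add(item, "wp:post_type", "post")
        add(item, "wp:post_id", str(post_id))
        add(item, "wp:status", "publish")
        add(item, "wp:post_date", post["date"])
        add(item, "dc:creator", post["author"])
        # _thumbnail_id points at the post_id of the attachment item below.
        meta = ET.SubElement(item, q("wp:postmeta"))
        add(meta, "wp:meta_key", "_thumbnail_id")
        add(meta, "wp:meta_value", str(image_id))
        # The attachment item carrying the image URL.
        attachment = ET.SubElement(channel, "item")
        add(attachment, "wp:attachment_url", post["image_url"])
        add(attachment, "wp:post_type", "attachment")
        add(attachment, "wp:post_id", str(image_id))
        add(attachment, "wp:status", "inherit")
    body = ET.tostring(rss, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body
```

The sketch gives each post an odd `wp:post_id` and its image the next even one, so the `_thumbnail_id` value always matches an attachment item, the same pairing the sample uses with ids 1 and 2. Tags, categories, excerpts, and comments are left out and would need to be added along the lines of the sample.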

If you come up with an example that does something different please consider sharing it.

Let us know how it goes.

Find my contributions useful? Please like, upvote, mark my answer as the best ( solution ), and see my profile. Thanks for your support! I am a Squarespace ( and other technological things ) consultant open for new projects.

Thank you @creedon!

I used your sample file as-is and successfully imported all 1,700 posts and 20k photos.

No matter what I tried, I could not get the "alt" text of images to import, so there is a lot of manual work needed. Comments also did not import, but that was less important.
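
Since the alt text has to be re-entered by hand, one way to reduce the manual work is to pull a checklist of image URLs and their alt text out of the source HTML before migrating. The following is only a sketch under assumptions: `alt_checklist` and the `(slug, html)` input shape are hypothetical, and it relies solely on Python's standard `html.parser` module.

```python
from html.parser import HTMLParser


class ImgAltCollector(HTMLParser):
    """Collect (src, alt) pairs from <img> tags in a chunk of post HTML."""

    def __init__(self):
        super().__init__()
        self.images = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            attr_map = dict(attrs)
            # Attributes without a value come through as None; treat as empty.
            self.images.append(
                (attr_map.get("src") or "", attr_map.get("alt") or "")
            )


def alt_checklist(posts):
    """posts: iterable of (slug, html) pairs from the source blog (hypothetical shape)."""
    rows = []
    for slug, html in posts:
        collector = ImgAltCollector()
        collector.feed(html)
        collector.close()
        rows.extend((slug, src, alt) for src, alt in collector.images)
    return rows
```

Each row is `(slug, src, alt)`, so the list can be written to a CSV and worked through post by post after the import.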

Quote

No matter what I tried, I could not get the "alt" text of images to import, so there is a lot of manual work needed.

That's par for the course. There is always lots of clean up.

Quote

Comments also did not import, but that was less important.

Did you turn on Comments in the site globally and check the Comment toggle on each post?

The example XML I provided does indeed import the comments into the backend.

[Screenshot attached, 2024-06-08]

I haven't done enough comments to know if you can get the toggle turned on by default per post automatically.
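
Before digging into the importer, it may be worth confirming that the generated file actually contains the comments and what each post's `wp:comment_status` says. Here is a small Python sketch for that; `comment_report` is a hypothetical helper name, and the thread does not establish whether the importer honors `wp:comment_status` when setting the per-post toggle.

```python
import xml.etree.ElementTree as ET

# The wp: namespace URI from the sample file, in ElementTree's {uri} form.
WP = "{http://wordpress.org/export/1.2/}"


def comment_report(wxr_text):
    """Return (title, comment_status, comment_count) for every post item."""
    root = ET.fromstring(wxr_text)
    rows = []
    for item in root.iter("item"):
        if item.findtext(WP + "post_type") != "post":
            continue  # skip attachments and other item types
        rows.append((
            item.findtext("title", default=""),
            item.findtext(WP + "comment_status", default=""),
            len(item.findall(WP + "comment")),
        ))
    return rows
```

Run against the example XML from earlier in the thread, this reports one post, "Blog Post Title One", with status `closed` and one comment. A post that shows zero comments here never had them in the file, which would rule out the importer as the cause.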

I've never fiddled with alt text, but I'd be surprised if SS went that deep. The XML import really is a disappointment, and it's a shame that SS apparently has it in maintenance mode.
