I wrote this post to myself on my blog in February while attempting to transition from my old site on Blogspot to my new site at Dreamhost. In the end, I overthought it and should have shut down the Blogspot site after a month and not worried about redirection, but for those who wish to bash their heads against a wall, feel free to learn from this mistake.
301 Redirect from Blogspot Blog
So, while attempting to follow all the ins and outs of the 301 redirect setup, I made many blunders, but I think I now have an 85% solution.
Using the WordPress sitemap, I got a list of most of the articles and the archives on the WordPress side of things. I mistakenly thought that the article naming would be the same from WordPress to Blogspot, which was a huge error, but I am getting ahead of myself. Let’s see if I can remember each blunder as I made it.
- I had a list of links that were all of the form https://www.canajunfinances.com/????/??/??/<article title>. Each one needed the site stripped off and then the day stripped out of the date before the title, so that it became /YEAR/MN/<article title>.
- I installed VIM (vi for windows) to do a lot of this.
- First I imported the .xml sitemap file into Excel and stripped out the unusable columns, leaving a single column of links in the format from the first bullet. I then duplicated that column so I had two columns, and saved the file in tab-separated format.
- The problem with saving the file from Excel is that it broke a bunch of the longer lines into TWO lines, but I only figured that one out later.
- I opened the text file in VIM, and replaced the https://www.canajunfinances.com/ with what was needed to make
Redirect 301 old-page-trail new-page-full-URL.
- So now I had a file with two columns with all of these 301 redirects which I then saved as .htaccess.
- On my web site I saved the existing .htaccess file in a BACKUP directory and then noted it had stuff in it already, so I appended all my redirects onto the end and tried it on my site. SERVER FAILURE messages came up. I put back the old .htaccess and the site came back.
- I then shot myself in the foot by accidentally deleting a bunch of the files in my blog’s directory, and spent half an hour piecing it back together. Don’t do that!
- Tried a few iterations of the .htaccess file, the first one with just the Month archives, which worked fine.
- To test it you have to go to the BLOGSPOT publish section and point the BLOGSPOT blog to the new site. It is also easy to undo.
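For anyone who would rather script the Excel and VIM steps above than do them by hand, here is a minimal sketch in Python. It is not what I actually did. It assumes every post link has exactly the /YEAR/MN/DAY/<article title> shape described in the first bullet, and the function names and the sample link in the usage note are made up for illustration. Anything that is not shaped like a post (month archives, categories) is skipped and would need its own rule.

```python
from urllib.parse import urlsplit


def redirect_line(new_url):
    """Build one 'Redirect 301 old-path new-url' line, or None if not a post URL."""
    # Keep only the path, e.g. /2010/02/14/some-title/, and split it into pieces.
    parts = [p for p in urlsplit(new_url).path.split("/") if p]
    if len(parts) != 4:
        # Assumption: posts always look like YEAR/MN/DAY/title; skip everything else.
        return None
    year, month, _day, title = parts
    # The old address has no day in it, so the day is dropped here.
    old_path = f"/{year}/{month}/{title}"
    return f"Redirect 301 {old_path} {new_url}"


def build_redirects(urls):
    """Turn a list of new-site URLs into Redirect 301 lines, skipping non-posts."""
    lines = []
    for url in urls:
        line = redirect_line(url.strip())
        if line is not None:
            lines.append(line)
    return lines
```

With a hypothetical link such as https://www.canajunfinances.com/2010/02/14/some-title/, redirect_line gives back a line whose old path is /2010/02/some-title. If the result is written out from Windows, opening the output file with open(name, "w", newline="\n") keeps Python from adding carriage returns to the line endings.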
I got bold and tried a bigger file and it failed.
- After trying that a few times, I then decided to test my .htaccess file on my home WordPress config on my local pc (no ftp involved and no danger of screwing up my real site).
- Tried full .htaccess file again, Server failure.
- Removed most of the list, and the server was fine.
- Added a month at a time and waited for a failure, which I found after a few additions.
- This is where I figured out that the command lines were getting broken into two lines with a ^M in them. Had to fix all of those (about 100 of them) by hand.
- Once they were all fixed I had a complete file that did not cause my server to crash. Notice I didn’t say it worked right.
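Hunting for the broken lines a month at a time was slow. A small check like the following sketch could have listed them up front. It is Python, the function name is my own invention, and it assumes the file being checked holds only the redirect list (run it before appending to an existing .htaccess, since any other directives would be flagged too). It only reports the suspect lines; repairing them is still up to you.

```python
def find_bad_lines(text):
    """Return (line_number, line) pairs that are not 'Redirect 301 <old> <new>'."""
    bad = []
    # splitlines() treats \n, \r\n and a lone \r (the ^M) as line breaks,
    # so a line split by a stray carriage return shows up as two short lines.
    for number, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # blank lines are ignored
        fields = line.split()
        ok = len(fields) == 4 and fields[0] == "Redirect" and fields[1] == "301"
        if not ok:
            bad.append((number, line))
    return bad
```

Read the file in binary and decode it yourself (for example open(name, "rb").read().decode("utf-8")) so that the carriage returns reach the function untouched.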
Now that I had a .htaccess file that didn’t cause my server to crash I uploaded it to my WordPress site via FTP.
- Moved blogspot to point at my new site, month archives worked fine, so then I tried out some of my individual postings, and started getting Error 404 (article not found).
- Learned that Blogspot removes many common words from its posting addresses online, such as “A”, “THE” and a few others, so by hand I removed those from a plethora (over 300) of postings.
- Oh, I also had to remove the DAY from the date in the posting addresses earlier on as well (on EVERY posting, over 750).
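The word removal could also be roughed out in code. This Python sketch assumes titles in the address are hyphen-separated and drops only “a” and “the”, the two words named above. I never worked out the full list, so the word set and the function name are placeholders, and this is not a reproduction of Blogspot’s actual rules.

```python
# Only these two words are named in this post; the real list is longer and unknown.
DROPPED_WORDS = {"a", "the"}


def blogspot_style_title(title, dropped=DROPPED_WORDS):
    """Remove the dropped words from a hyphen-separated article title."""
    words = title.split("-")
    kept = [w for w in words if w.lower() not in dropped]
    return "-".join(kept)
```

Every address produced this way still needs checking against the old site, since a wrong guess just produces another Error 404.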
I now have a working .htaccess that I am sure does not work for every single posting, but should point most of the posts over from my old site. I need to create entries for categories, and other things as well.
Epilogue
It all worked eventually but I must have spent over 30 hours elapsed time on this, and at the end of it, I still ended up with a ZERO Page Rank for a while because Google thought I was attempting to double post the same content.
I would suggest doing what I have done now: blow away the old Blogspot site after a grace period, recreate it with a single post that points to your new site, and then use various tools to find all the links to your OLD site and ask the owners of the linking sites to update their links to your new site. Might take a little less time, too!