March 2012 Archives

Optimizing MovableType Google Indexing

I was surprised to find that a default install of MovableType tends to be indexed very poorly by Google and other search engines. My monthly index pages would sometimes get indexed instead of the individual entry pages, which made it nearly impossible to find what you were looking for, because my monthly indexes are huge due to the large number of long posts in my food blog. Worst of all, some search pages got indexed, so pulling one of those up would re-run the search, which was slowing my server to a crawl. Fortunately, it was pretty easy to fix.

The first thing you need to do is create a sitemap.xml file. This is really easy to do automatically in MovableType by creating a custom index template. Go into Design - Templates - Index Template. Create a new template called "Sitemap." Set the following options:

Output File: sitemap.xml
Template Type: Custom Index Template
Link to File: (leave blank)
Publishing: Statically (the default)

Then you just need to create the template. Here's the one I use for the tech blog:


Here's what I used on my food blog:


This is a little more complicated because any entries that are tagged "whatiate" are given a lower priority than other pages. In other words, pages that are not tagged "whatiate" should almost always appear higher than the tagged ones among the search results from my site.
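As a starting point, here is a minimal sketch of what such a sitemap template could look like. It is not the exact template used on either blog. It assumes the standard MovableType template tags (mt:Entries, mt:EntryPermalink, mt:EntryModifiedDate, mt:EntryIfTagged), and the priority values 0.3 and 0.5 are illustrative assumptions, not the values used on the food blog.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch only: the priority values below are illustrative assumptions -->
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc><$mt:BlogURL encode_xml="1"$></loc>
    <priority>1.0</priority>
  </url>
<mt:Entries lastn="9999">
  <url>
    <loc><$mt:EntryPermalink encode_xml="1"$></loc>
    <lastmod><$mt:EntryModifiedDate utc="1" format="%Y-%m-%dT%H:%M:%SZ"$></lastmod>
    <mt:EntryIfTagged tag="whatiate"><priority>0.3</priority><mt:Else><priority>0.5</priority></mt:EntryIfTagged>
  </url>
</mt:Entries>
</urlset>
```

For a blog without a low-priority tag, the mt:EntryIfTagged line can simply be dropped, which leaves every entry at the default priority.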

When you publish the site, the sitemap.xml file should be created in the top level of the blog. A good start!

Next, edit your robots.txt to block search engines from accessing the /cgi-bin directory. You could block just /cgi-bin/mt/mt-search.cgi, but it was fine for me to block everything. This is useful because sometimes people will link to a search page instead of an actual page, and every crawler visit to such a link re-runs the search, which really bogs down the server for a large blog.

Also note that robots.txt now references the sitemap file. This will help other search engines find it.
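A robots.txt along these lines would cover both points, blocking /cgi-bin and pointing at the sitemap. This is a sketch, and the sitemap URL is an assumption based on the blog's host name, so substitute the full URL of your own sitemap.xml.

```text
# Keep all crawlers out of the CGI directory (including mt-search.cgi)
User-agent: *
Disallow: /cgi-bin/

# Assumed location; use the full URL of your own sitemap
Sitemap: http://blog.rickk.com/sitemap.xml
```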


Finally, in the case of Google, go to the webmaster tools and submit the URL to your sitemap.xml file. Within a few days Google will hopefully index your site properly.

The second part of this is to create a Google Custom Search engine. This is pretty self-explanatory: follow the steps and it will spit out a little block of HTML you can insert into a page, and that's it. You can see this in action in my food blog custom search page. The Google search is so much better than the MovableType search. Not only is it faster, but its ranking works much better than MovableType's.

Setting up a local testing MovableType server

Often it's useful to set up a test server. I do this before I upgrade to a new version of MovableType, for example, to make sure the new version works properly. I set up a Linux Virtual Machine in my home office that's similar to my production servers and run the test there. There's one problem.

It does not work.

Well, the server works. Unfortunately when I linked between blog pages I embedded the full absolute URL (including "http://blog.rickk.com") so I keep getting linked back to the production server. But it's not just self-inflicted; MovableType embeds the absolute URL all over the place, too.

There are some tricks that mostly work including host file and DNS hacks, but I prefer this solution: setting up a proxy server.

On most Linux distributions it's a piece of cake to install a proxy server. I installed the Squid proxy server and configured it to rewrite URLs and not cache anything, just pass each request on to the real server. The result is that every request for blog.rickk.com is mapped to the test server, blogtest2.rickk.com.

Why a proxy server? It's supported by all of the browsers, but more importantly it's a per-browser configuration setting. This means that I can keep my regular browser (Chrome) pointed to production and set the proxy server on Firefox to point to test. This ends up being very handy.

Setting up Squid

The way to install Squid varies between Linux distributions, but for Ubuntu it was as simple as:

sudo apt-get install squid

Edit the /etc/squid/squid.conf file, adding:

http_access allow localnet
url_rewrite_program /etc/squid/redirector
cache deny all

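The http_access line allows any machine on my local network to use the proxy server. It needs to appear before the final "http_access deny all" line, because Squid applies the first rule that matches.
The url_rewrite_program line specifies the program that rewrites the blog URLs.
The cache line specifies that nothing will be cached, which makes testing a little easier.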

Even though we're not caching data, we still need to initialize the cache:

squid -z

And create the redirector script in /etc/squid/redirector. Be sure to make it executable (chmod 755 redirector) and edit it for the URLs for your blog and test servers.
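A redirector can be quite small. The following is a minimal sketch in bash, not the original script. It assumes the classic Squid rewriter protocol with concurrency turned off: Squid writes one request per line to the helper's standard input with the URL as the first field, and reads back either a replacement URL or a blank line meaning "no change". Newer Squid releases document a different reply format, so check the documentation for your version. The function names are made up for this example, and the host names are the ones from this post.

```shell
#!/bin/bash
# Sketch of a Squid URL rewriter (classic protocol, no concurrency).
# Host names come from the post; edit them for your own servers.
PROD_HOST="blog.rickk.com"
TEST_HOST="blogtest2.rickk.com"

# Print the rewritten URL if it points at the production host,
# otherwise print a blank line, which tells Squid to leave the URL alone.
rewrite_url() {
    local url="$1"
    local prefix="http://${PROD_HOST}"
    case "$url" in
        "$prefix"|"$prefix"/*|"$prefix":*)
            printf '%s\n' "http://${TEST_HOST}${url#"$prefix"}"
            ;;
        *)
            printf '\n'
            ;;
    esac
}

# Read request lines from Squid; only the first field (the URL) is used.
redirector_loop() {
    local url rest
    while read -r url rest; do
        rewrite_url "$url"
    done
}

# Start the loop only when installed under the name "redirector", so the
# functions above can also be loaded and tried out from another script.
case "$0" in
    */redirector|redirector) redirector_loop ;;
esac
```

You can try it by hand before wiring it into Squid: run /etc/squid/redirector, type a line such as "http://blog.rickk.com/index.html 192.168.1.5/- - GET", and it should answer with the blogtest2 URL.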


Start squid (this may vary depending on your Linux distribution):

start squid

Browser Settings

In Firefox, the setting is in Advanced - Network - Connection - Settings.

Set the HTTP Proxy to the server running the proxy server and set the port to 3128.

It's a little hard to tell whether this is 100% working because the URLs in the address bar of your browser will still show the production URL. If you're using something like Firebug, it will show that the IP address the data is coming from is your proxy server.

But you really need to examine the server logs on your test server to make sure that the request is being sent to your test server instead of the production server.
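One way to make that log check less tedious is to send a request with a unique marker in the query string and then look for the marker in the test server's access log. This is only a sketch: the proxy address, the log path, the "proxycheck" parameter name and the function names are all assumptions, so adjust them for your setup.

```shell
#!/bin/bash
# Placeholder values (assumptions): adjust the proxy address and log path.
PROXY="http://proxyhost:3128"
ACCESS_LOG="/var/log/apache2/access.log"

# Append a unique marker to a URL as a query string so the request
# stands out in the access log.
marked_url() {
    printf '%s?proxycheck=%s\n' "$1" "$2"
}

# Succeed (exit status 0) if the log file $1 contains the marker $2.
log_has_marker() {
    grep -qF -- "proxycheck=$2" "$1"
}

# Typical use, with the curl step run from a machine on the local network
# and the log check run on the test server:
#   marker="$(date +%s)-$$"
#   curl -s -o /dev/null -x "$PROXY" "$(marked_url http://blog.rickk.com/ "$marker")"
#   log_has_marker "$ACCESS_LOG" "$marker" && echo "request reached the test server"
```

If the marker shows up in the test server's log and not in production's, the proxy is doing its job.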

Code Box

I wanted a way to be able to display code and such in my blog entries. I'm not a fan of copying and pasting because the formatting often gets broken and it's hard to maintain.

I created a little Javascript and CSS solution that makes it easy to embed code into a blog entry or a regular HTML page. In your HTML page or blog entry you need to include jquery.js and codebox.js and include the stylesheet codebox.css. Then you just need to insert a <div> for each file you want to insert into the post!

Insert something like this in your HTML <head> section.
<script type="text/javascript" src="jquery.js"></script>
<script type="text/javascript" src="codebox.js"></script>
<link rel="stylesheet" type="text/css" href="codebox.css" />

And stick this somewhere in the body. It's invisible but contains the localized string for the download link. I put this in the MovableType "Header" template.
<div class="codeboxDownloadStr">Download Code</div>

The rest is just inserting a <div> whose id is the URL to the file you want to insert into a neat little box!
<div class="codebox" id="/tech/codebox.js"></div>

The Javascript:

The CSS:

And that's it!

License:
Copyright 2012 rickk.com

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

About this Archive

This page is an archive of entries from March 2012 listed from newest to oldest.
