Cleaning up Text for Layout Automatically
September 18, 2009 / Updated: September 18, 2009 / Lena Shore
I recently mentioned a few things you should be doing when preparing type for layout in “Bad Design: Things You Shouldn’t Do” that include eliminating double-spaces and using the correct hyphens.
If you receive content that you’ll be laying out, you may notice some work needs to be done to clean it up. Fortunately, there are some great tricks for doing this automatically.
Whenever I receive copy from a client, the first thing I do is eliminate double spaces, replace hyphens with em dashes, remove multiple consecutive tabs, replace foot and inch marks with proper single and double quotes, and remove the dreaded double paragraph returns.
If you don’t do typesetting and layout, this may seem meaningless — but if you do, it’s priceless to have things done in a streamlined manner when working with style sheets and preparing files for press.
How to Do it with “Find and Replace”
Find and Replace is your friend. The basic idea is to find the item you don’t want and replace it with what you do want.
Step one should be “make a backup copy of your document.” Unless you are used to doing these replacements, it is a little too easy to get ahead of yourself and really mess up a document. Then you will spend hours and hours running spellcheck and trying to find the mistakes you inflicted on your poor document. I’m not saying how I know this. I just do. Trust me.
In your layout or text editing program, open the “find and replace” window. If you are looking to replace double spaces, you would enter two spaces into the find box and enter a single space in the replace box. Now tell it to find and replace all. Pretty easy.
Now, you’ll want to repeat this until your find/replace can’t find any more double spaces. I know. You think you just did it, but here is why. If you are searching for “space space” and the document has “space space space” and you replace “space space” with “space” — you are going to turn “space space space” into “space space”. Clear? Doing is probably easier than reading in this case. You’ll understand as soon as you give it a shot.
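For anyone who would rather see that than read it, here is a minimal sketch in Python. Python is only an illustration language here, and the function name is made up. It shows a single "replace all" pass leaving a double space behind, and a loop that keeps going until none are left.

```python
def collapse_double_spaces(text):
    """Replace two spaces with one, repeating until no double spaces remain."""
    while "  " in text:
        text = text.replace("  ", " ")
    return text


sample = "See Jane run.   See Dick run."  # three spaces in the middle

one_pass = sample.replace("  ", " ")      # a single "replace all"
print(repr(one_pass))                     # a double space is still in there
print(repr(collapse_double_spaces(sample)))
```

The loop is the scripted version of clicking "replace all" again and again until the program reports nothing left to replace.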
Now you can repeat the process for quotes.
But how do I type in a “tab” or other special character, oh Mistress Lena of the Find and Replace? It’s pretty easy and you can usually do it in one of two ways.
The first way is to highlight the area of text that holds the tab (or other special character), copy it, and then try to paste it into your find and replace area. Tabs will usually be a “^t” and paragraphs will be “^p”.
The other way is to see if you have a special character option built into your find and replace.
Forced line breaks
A forced line break is created by those people who still think like a typewriter. They hit a hard carriage return when they get close to the margin. Then they hit two returns between paragraphs. They don’t realize the computer will wrap the line for them. It’s a mini nightmare for a typesetter to strip these puppies out.
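A simple find ^p^p and replace with ^p is not going to work, because it leaves the hard returns at the ends of the lines. Replacing every ^p with a space is even worse. It will turn this: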
See Jane run.
See Dick run.
See Spot run.
See Spot pee on Dick run.
into this:
See Jane run. See Dick run. See Spot run. See Spot pee on Dick run.
Now imagine that multiplied by 50 pages of text. It’s not fun. Here is a process to get around that:
- Search for double paragraphs (^p^p) and replace them with some odd symbol not likely to be used in your text. I always use a double tilde (~~).
- Now replace all single paragraphs (^p) with a space.
- Now go back and replace all of your double tildes with a single ^p.
- Voilà. Now, don’t forget to go back and search for double spaces. You know they left a bunch of those too.
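The steps above can be sketched in a few lines of Python. This is only an illustration under some assumptions: the text is a plain string, "\n" stands in for the paragraph mark (^p), and the function name is made up. The double tilde is the same placeholder used in the steps.

```python
PLACEHOLDER = "~~"  # temporary marker for real paragraph breaks


def strip_forced_breaks(text):
    """Remove hard returns inside paragraphs while keeping paragraph breaks."""
    if PLACEHOLDER in text:
        # The trick only works if the marker is not already in the copy.
        raise ValueError("placeholder already appears in the text")
    text = text.replace("\n\n", PLACEHOLDER)  # step 1: protect double returns
    text = text.replace("\n", " ")            # step 2: single returns become spaces
    text = text.replace(PLACEHOLDER, "\n")    # step 3: restore one return per paragraph
    while "  " in text:                       # step 4: mop up any double spaces
        text = text.replace("  ", " ")
    return text


typed = "See Jane run.\nSee Dick run.\n\nSee Spot run.\nSee Spot pee."
print(strip_forced_breaks(typed))
```

The check at the top is there because a placeholder that already occurs in the copy would be turned into a paragraph break in step 3.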
Hyphens and dashes
Hyphens and dashes are not so automatic. I usually do a search for a hyphen and set it to replace with an em dash, which addresses the most common typing mistake. But step through the replacements one at a time instead of replacing all, to make sure you aren’t putting an em dash where an en dash (or a plain hyphen) belongs. What’s all this crazy dash talk, you say? Go read my “Bad Design: Things You Shouldn’t Do” for a review on dash types and functions.
If you use a lot of en dashes, you might consider learning the GREP style of Find and Replace. This allows you to search for dashes that are surrounded by particular characters and to use wildcards. You could search for “Any number, preceded by another number, preceded by a colon, next to a hyphen” and replace. In other words, you could find the dashes that were used in dates or times and replace only those.
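A GREP-style search along those lines can be sketched with Python's re module. Treat it as an illustration only: the pattern is an assumption about what "a hyphen between numbers" should mean, it is simpler than the colon-aware search described above, and the GREP syntax in your layout program may differ in the details.

```python
import re

EN_DASH = "\u2013"

# A hyphen with a digit immediately before it and a digit immediately after it.
# The lookbehind and lookahead check the neighbors without replacing them.
NUMBER_RANGE = re.compile(r"(?<=\d)-(?=\d)")


def hyphens_to_en_dashes(text):
    """Swap hyphens for en dashes only where they sit between two digits."""
    return NUMBER_RANGE.sub(EN_DASH, text)


print(hyphens_to_en_dashes("Open 9:00-5:30, pages 10-12, well-known"))
```

Hyphens inside words such as "well-known" are left alone. Phone numbers and dates written like 2009-09-18 would also match this pattern, so stepping through the results is still a good idea.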
How to Do it Automatically in InDesign
If you are working in InDesign this is a snap. You have some built-in scripts you can utilize. In CS4 you can access the window by going to Window > Automation > Scripts. If you haven’t discovered the Scripts window before, you are in for a treat. Lots of wonderful tidbits there, from cleaning up text to creating crop marks and exporting all stories from a document.
How to Do it Automatically in Word
Word has something called macros built into it. Think of them as mini programs that help you get repetitive tasks done in Word.
If you are a programmer type, you can write your own macro to find and replace a list of items with a click of a button. If you are not a programmer, you have a couple of choices.
Older versions of Word have a Macro Recorder built in. My understanding is that newer incarnations do not. You can check under your Tools menu to see if you have a recorder. If not, you can still get one. There are a lot of macro recorders out there. You can do a little research to find one you like and then record your find and replace actions in Word (or any other program) and save. Next time you can just click a button and it will do it for you. Here is a great article that explains the process of recording a macro.
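For the programmer types, the idea of "a list of items replaced with one click" looks roughly like this. The sketch is in Python, not a Word macro, and it works on a plain string. The list contents and the function name are illustrative assumptions, so swap in your own pairs.

```python
# Each entry is (find, replace). Add your own pairs here.
REPLACEMENTS = [
    ("\t\t", "\t"),  # multiple consecutive tabs become one tab
    ("  ", " "),     # double spaces become one space
]


def clean(text, replacements=REPLACEMENTS):
    """Run every find/replace pair over the text, in list order."""
    for find, replace in replacements:
        text = text.replace(find, replace)
        # Repeat only when the replacement is shorter than the search text.
        # Every pass then shrinks the text, so the loop is sure to end.
        while len(replace) < len(find) and find in text:
            text = text.replace(find, replace)
    return text


print(repr(clean("Name\t\t\tPrice    each")))
```

Order matters in a list like this, because each pair runs on the result of the ones before it.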
Buy a Search and Replace
You can also find search and replace macros or programs. What I noticed while researching is that many macros will work (or not work) depending on your version of Word. So, you’ll need to do some research to find one that is right for you. And while I haven’t tried any of the Windows solutions below and can’t recommend them, they are worth checking out:
- ReplaceMagic – Shareware – Not Word Dependent – Supposedly works on all documents
- Funduc – They have a couple of macro solutions
- A bunch more can be found here on Version Tracker
And a final note about macros: get them from a source you feel is reliable. They can also harbor malicious routines. Use your virus protection.
Mac users have an easier time with automated search and replace. You can also look for macros for your version of Microsoft Word. But you’ll have an easier time using one of these options.
- TextSoap. “Clean your text with a click”. Very cool little application that will let you clean up text no matter what program you are in.
- AppleScript. Like macros, you can write your own. There are plenty of places that discuss writing them and have free code lying around for your use.
- Here is a list of find/replace programs you can sort through.
My favorite Macintosh Solution: Automator
Automator is one of the best applications that has ever been included on your Mac. It is a way to write AppleScript without having to know how to write AppleScript. If you have a task you do on your computer over and over, you need to get friendly with this tool. It really needs its own article. But the Reader’s Digest version is that you can make it handle your repetitive tasks like cropping photos, find and replace, copying files, uploading to a specific folder via FTP… the list goes on and on.
Here is a workflow I did with Automator. There are about 30 search and replace functions. The last thing it does is save my file to the right document type. It takes about 3 seconds. Brilliant.
It looks a little complicated, but it’s all drag and drop. You can also download extra functionality for this little app. And in combination with iCal, you can schedule it to complete tasks for you.