OOo Off the Wall: Combining Documents with OOo

Before you ask for a Reveal Codes feature like the one in WordPerfect, try learning how to use the tools offered in Writer.

A couple of weeks ago, the OpenOffice.org User's List featured another round of explaining to a former WordPerfect user why OpenOffice.org Writer didn't have a Reveal Codes feature that showed the raw encoding of the document for troubleshooting. This time, the thread was started by a poster who insisted that he needed the feature when he had to merge several documents into one. The discussion made me realize that, although I tend to talk about features in this column, sometimes work flow is more important. Often, the problem isn't the tools, it's how you use them. After lurking for most of the thread, I ended it with a suggestion about how to use the tools in OOo to combine documents much more efficiently than you could hope to do with Reveal Codes. What follows is an expanded version of my suggestion that reinforces, yet again, the advantages of using styles in many situations.

WordPerfect veterans raise the idea of a Reveal Codes feature for Writer every couple of months. In response, someone has written a macro that gives the appearance of Reveal Codes without the functionality. However, the feature isn't likely to appear in any upcoming version of Writer. For one thing, while WordPerfect is a code-based word processor, in which every piece of formatting is embedded in a manner not too different from HTML tags, Writer is a frame-based word processor. That means the characteristics for a selection of text are defined separately from the text itself. As a result, no direct equivalent of Reveal Codes is possible.

Another reason why Writer won't have a Reveal Codes feature is that Writer is style-oriented, while Reveal Codes works best when users rely on manual overrides. If you use styles religiously, you don't have the problem of tracking down stray bits of formatting, because Writer doesn't allow you to apply more than one character or paragraph style to a selection. Instead, all you need to do is open the style dialog to see how the selection is defined. For cases in which you need to see more than what is on the screen, View -> Non-Printing Characters generally is enough. Otherwise, if you want more, you're probably better off using TeX than a graphical word processor. For most purposes, Writer already has all the tools it needs for troubleshooting formatting.

So, how should you go about formatting a document composed of several different original documents? Ideally, you would start by enforcing a company or project policy of using the same templates and encourage people to use styles all the time. However, that's not only building castles in the air, it's expecting to see your name and titles in the next release of Debrett's. In practice, at least three-quarters of any group are likely to use Writer as though it were a typewriter, ignoring styles and manually adding formatting as the whim occurs to them.

You can find out if this is the case by opening each of the documents, pressing F11 and then setting the view to Applied Styles for characters. By browsing through the format of each document and selecting portions, you soon will be able to see whether manual overrides are being used. You can tell this by whether a change of formatting corresponds to a change in the highlighted style in the Styles and Formatting floating window. However, if you assume the worst, you'll probably be right more often than not.
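If you have a stack of files to check and would rather not open each one, a rough sketch such as the following might give you a first impression. It assumes the components are saved in OpenDocument format (.odt), in which the file is a zip archive and content.xml records manual overrides as automatic styles with generated names such as T1 or P1. The function name count_style_use is made up for this illustration, and the result is only a hint: Writer also generates automatic paragraph styles for things such as manual page breaks.

```python
import zipfile
import xml.etree.ElementTree as ET
from collections import Counter

# XML namespaces used by OpenDocument content.xml
NS = {
    "office": "urn:oasis:names:tc:opendocument:xmlns:office:1.0",
    "style": "urn:oasis:names:tc:opendocument:xmlns:style:1.0",
    "text": "urn:oasis:names:tc:opendocument:xmlns:text:1.0",
}


def _q(prefix, local):
    """Build the {namespace}name form that ElementTree uses for attributes."""
    return "{%s}%s" % (NS[prefix], local)


def count_style_use(odt_file):
    """Hypothetical helper: return (manual, named) Counters of the styles
    applied to paragraphs, headings and spans in an .odt file."""
    with zipfile.ZipFile(odt_file) as archive:
        root = ET.fromstring(archive.read("content.xml"))

    # Collect the names of the automatic styles, which Writer generates
    # when formatting is applied by hand rather than through a named style.
    automatic = set()
    section = root.find("office:automatic-styles", NS)
    if section is not None:
        for style in section.findall("style:style", NS):
            automatic.add(style.get(_q("style", "name")))

    manual, named = Counter(), Counter()
    for tag in ("p", "h", "span"):
        for element in root.iter(_q("text", tag)):
            name = element.get(_q("text", "style-name"))
            if name is None:
                continue
            # Sort each applied style into the manual or the named tally
            (manual if name in automatic else named)[name] += 1
    return manual, named
```

Calling `manual, named = count_style_use("chapter1.odt")` on a hypothetical file and comparing the two tallies should show at a glance whether the writer leaned on named styles or on manual overrides.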

Before going further, you also should create backups of every file you are working with. This is an elementary precaution, but it can't be repeated too often. The one time you think this effort isn't worth the time is the one time that something goes wrong.

Then, you can follow these steps:

1. Create a new document that has all of the necessary styles.

Starting with a new document gives you the advantage of knowing what you're dealing with. Before copying and pasting, go through all the component documents and see for which character and paragraph styles you need to recreate the formatting. Don't worry about how the original writers applied their formatting--the goal is not to play detective but to reproduce the appearance. So long as it looks the same, no one will care how you got the effect. You also should give your new styles names that aren't shared by any of the styles in the component documents, just to keep your life simple. A piece of text that uses a style whose name already is in a document to which it is pasted automatically is reformatted--a potentially handy step, but one that sometimes can create as many problems as it solves.
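To check for name clashes before you paste, something along these lines might help. It is only a sketch, assuming all the files are in OpenDocument format (.odt), where named styles are stored in styles.xml inside the zip archive. The helper names named_styles and name_clashes are invented for this example.

```python
import zipfile
import xml.etree.ElementTree as ET

OFFICE = "urn:oasis:names:tc:opendocument:xmlns:office:1.0"
STYLE = "urn:oasis:names:tc:opendocument:xmlns:style:1.0"


def named_styles(odt_file):
    """Hypothetical helper: return the paragraph and character style names
    defined in the styles.xml of an .odt file."""
    with zipfile.ZipFile(odt_file) as archive:
        root = ET.fromstring(archive.read("styles.xml"))
    names = set()
    section = root.find("{%s}styles" % OFFICE)
    if section is None:
        return names
    for style in section.findall("{%s}style" % STYLE):
        # Character styles belong to the "text" family in the file format
        if style.get("{%s}family" % STYLE) in ("paragraph", "text"):
            # Prefer the display name when present; fall back to the internal name
            names.add(style.get("{%s}display-name" % STYLE)
                      or style.get("{%s}name" % STYLE))
    return names


def name_clashes(combined, components):
    """Map each component label to the style names it shares with the
    combined document. `components` is a dict of label -> file."""
    mine = named_styles(combined)
    return {label: sorted(mine & named_styles(doc))
            for label, doc in components.items()}
```

For example, `name_clashes("combined.odt", {"intro": "intro.odt"})`, with both file names hypothetical, returns the shared names for each component. Expect Writer's built-in styles to turn up in the list; the ones to watch are the custom names you chose yourself.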

You might use a copy of one of the original documents for the combined one, but it's probably better not to do so unless you have a good idea of how it is formatted.

Another alternative is to create a new master document and add all of the component files to it. This option is especially attractive if the component documents also are going to be used independently. However, using a master document with different formatting from its sub-documents requires a strong understanding of Writer. Thus, it may not be practical unless you can teach the mechanics to everyone that is likely to use the documents.

Whether you're using a regular Writer file or a master document, copy and paste the component documents only after all the styles are defined. Then, keep them open in case you need to refer to them.

2. Use Find & Replace for the first round of formatting.

Figure 1. Attributes and Format are two of the search tools that can simplify the task of reformatting several documents into one.

Edit -> Find & Replace contains two tools that can help you format your new document. If any of the component documents contain manual overrides, use the Attributes or Format buttons to search for a specific piece of formatting. For example, if some of the documents use italics for book titles, search for italics. When you find a match, strip out the manual overrides by putting the mouse cursor in the paragraph. Then, use the Styles and Formatting floating window to apply the character style for the situation.

Figure 2. When the search tool finds a match for a style, the same style is highlighted in the Styles and Formatting floating window.

If any of the documents use styles, select More Options -> Search for Styles from the Find & Replace window. Consulting the Applied Styles view in the Styles and Formatting floating window, replace all applied styles with the character and paragraph styles you've created to replace them.

3. Check the results and houseclean.

At this point, all that usually remains is to compare the new document to each of its components. In some places, you may need to create new styles, because you've overlooked some necessary piece of formatting. In others, you may need to select View -> Toolbars -> Drawing to create a diagram or to take a screenshot of a complicated piece of formatting from a component document and then insert it as a picture into the new document. However, if you were careful about creating the styles for the new document, you generally should have little to do at this point.

As a final step, however, you might want to clean up the document by deleting any of the styles from component documents. You also might want to create new versions of the component documents from the new combined document, making them as clean as possible as well. By following these steps, you make it easier to deal with all of the documents in the future.

______________________

-- Bruce Byfield (nanday)

Comments

Reveal Codes in OOo

Robin's picture

Interesting article. I don't fully agree with your comments because styles are great for use over and over again. When you have to work on a document that has been produced in Word, edited in WP then sent back to Word and finally in OOo, the formatting can be a real mess.

As one comment shows, Word may corrupt a document by leaving lost codes in the document. When these appear in OOo, they can create some really strange situations, such as a table cell being only 2 characters wide if you type anything in it. In the RFE for a Reveal Codes for OOo, I suggested an info box that would show the styles and formatting codes that are in OOo, much as WP does with Reveal Codes. This is much better than trying to trace through all the style types when editing a document.

In today's world, as you point out, it is not possible to get everybody to use styles, and in our organization, it is even hard to get everybody to use the same word processor. Some are still using WP, others use Word, most Linux users use OOo and then there are those that use TeX.

My opinion is that the addition of a display box that acts like Reveal Codes would make life that much better than trying to find the strange lost codes with a hit and miss approach of using the F11 (Styles) display and having to look all over the screen for the different formatting.

Not needed

Anonymous's picture

I find that I never really miss the 'reveal codes' function: OOo Writer almost always seems to do the right thing. OTOH, I'm comfortable using styles; someone who insists on formatting 'by hand' may run into more problems.
I find when I do run into a case where Writer appears to get stuck, I use the 'Format > Default Formatting' function and start with a known setting.
And when I get morbidly curious about what's happening under the hood, I unpack the XML and read it—it's not that difficult.

when things don't work

Anonymous's picture

What is nice about WordPerfect's reveal codes is the ability to troubleshoot when things aren't working the way you think they should. Sometimes it helps. Being able to unpack the XML would work, but it would be convenient to be able to click in the original document to navigate the XML.

I do not use reveal codes

L. A. G.'s picture

I do not use reveal codes very often; I don't need to.
However, as the first commentator points out, when I do need to obtain some special formatting, reveal codes lets me find and fix the problem instantly. That lets me spend my time on the content of my document rather than its format.

I am not "bashing" OO; I use both WP8.1 and StarOffice 8. Now that SO8/Linux imports WP files, I can use it for formatting text in HTML or other output not supported by WP. (WP8 does not export to PDF, but it does produce a PS file when you print to disk, and that file is easily and rapidly converted to PDF.)

Is MSWord a code-based word

tktim's picture

Is MSWord a code-based word processor?

Is MS Word code-based?

Anonymous's picture

Not really. Older versions of Word 'felt' similar to WP to use, but the .doc format was always style-based, and later versions are fairly rigorously hierarchical, as you would expect with a nominally XML-based 'structured' document format.

OpenOffice.org has much more in common with modern versions of MS Word than with WordPerfect.

Word is object oriented.

Anonymous's picture

Word is object oriented. There is a list of paragraph objects, a list of table objects, a list of graphic objects, etc. There is no concept of the stream of text that is the document. IMHO, a big reason that many Word users want reveal codes is that deleting the text doesn't always delete the object. I do a fair amount of Word document conversion and it's very common to see an empty Table or Bold or Ital show up in the middle of a document with no content. In my mind, this also explains why a large Word document that has been edited by multiple people many times will crash; get one corrupt object and you've lost your document.

Importing and Reveal Codes

Anonymous's picture

I use Reveal Codes a LOT when importing different documents to WordPerfect (Word, various versions; Works, etc.). The formatting always leaves a bunch of stray code --

Does OO not have this problem? Do imports come through perfectly so you don't have to tinker endlessly? If so, I'll give it a try. If not, as an editor accepting many formats, I'll stick with WordPerfect Reveal Codes.

Stray Code

Anonymous's picture

Where's the answer to the question above????? Users of WP are not interested in code ("looking under the hood") because we are curious! We need to see why something is going wrong and quickly fix it so that we can get back to the real task. Try being a secretary to a lawyer who revises a document ten times before he's satisfied with the final product. Deleting does not always delete the accompanying code. Reveal Codes is an essential tool for that. If OOo would at least try to understand the reason for wanting reveal codes, a solution could be found. I appreciate the fact that it is a different format. But software developers are supposed to be problem solvers. Here's a problem ... solve it and the WP world will switch! There are many features of OOo that I like - better than WP and WORD (which I hate!) - give me some version of reveal codes and I will jump ship to OOo. That way I can spend less time converting documents from people who insist on using WORD!
