It may have taken you months or years, but finally, you have written your book.
Your Word document is ready and is now your final manuscript for your new ebook.
However, during the long process of writing your book, your document has accumulated a lot of background code, which is the hidden formatting markup Word stores alongside your text.
Every time you typed, made corrections, or copied and pasted, more code was added.
You might also have had reviews and tracked changes sent by your beta readers or an editor.
The end result might look fine to you on your screen.
But in reality, your document is full of background code that can cause problems when you publish your book.
You need to spend a little bit of time now to prepare your Word manuscript for electronic publishing.
What you can’t see
What you see is definitely not what you get with a document in Word.
Here is what it really looks like, and why it can cause you problems when you publish an ebook.
In the upper pane is the normal view in Word.
In the lower pane, you can see the code that is used to tell your screen what to display.
In the image above, black is text, and red and purple are hidden code. As you can see, there is more code than text.
To check your Word document in the same way, save a copy of it as a web page (.htm).
Then open this new file in your browser and select Page Source from the view or developer menu. It works best in Safari, Firefox, or Chrome.
There will always be code because it is necessary to format your text.
However, a lot of unnecessary code can accumulate over the long period of time it takes to write a book.
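If you would rather measure the clutter than judge it by eye, a short script can estimate how much of the saved .htm file is code. This is only a rough sketch in Python. The names `TextCounter` and `markup_share` are made up for illustration, and the approach is simply to count the visible characters against the total size of the file.

```python
from html.parser import HTMLParser


class TextCounter(HTMLParser):
    """Counts visible text characters, skipping style, script, and title content."""

    SKIPPED = {"style", "script", "title"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.visible_chars = 0
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Count text only when we are not inside a skipped element,
        # ignoring the whitespace around each chunk.
        if self._skip_depth == 0:
            self.visible_chars += len(data.strip())


def markup_share(html: str) -> float:
    """Return the fraction of characters in `html` that are not visible text."""
    if not html:
        return 0.0
    counter = TextCounter()
    counter.feed(html)
    counter.close()
    return 1 - counter.visible_chars / len(html)
```

You could call it with something like `markup_share(open("manuscript.htm", encoding="cp1252", errors="replace").read())`. Both the file name and the cp1252 encoding are assumptions, so adjust them to match your own file. A result of 0.6 would mean that roughly 60 percent of the file is code rather than your words.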
Going nuclear with your Word file
When a Word document is converted to an ebook by Kindle Direct Publishing, Smashwords, Google Play, or any other online ebook publisher, it is the code that is read, not the perfect-looking word processor text that you see on your screen.
Within this code lies any number of errors that were created during your writing process.
These can include irregular fonts, differing paragraph styles, inconsistent line spacing, and stray line breaks.
You could also have hidden information, including an author name that may not be yours if your copy of Word was not registered in your name.
Yes, a lot of information is stored in Word code. A .docx file stores it as XML, and a document saved as a web page stores it as HTML.
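To see which author name your file is carrying, you can peek inside it. A .docx file is a zip archive, and the author details normally live in a part called `docProps/core.xml`. The sketch below reads three of those fields using only the Python standard library. The function name `read_core_properties` is invented for this example, and a file produced by something other than Word may not contain every field.

```python
import zipfile
import xml.etree.ElementTree as ET

# XML namespaces used by the core properties part of a .docx file.
NS = {
    "cp": "http://schemas.openxmlformats.org/package/2006/metadata/core-properties",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Friendly name -> element to look up in docProps/core.xml.
FIELDS = {
    "creator": "dc:creator",
    "last_modified_by": "cp:lastModifiedBy",
    "title": "dc:title",
}


def read_core_properties(docx_file):
    """Return selected metadata from a .docx path or file-like object.

    Missing fields come back as empty strings.
    """
    with zipfile.ZipFile(docx_file) as archive:
        root = ET.fromstring(archive.read("docProps/core.xml"))
    return {
        name: root.findtext(tag, default="", namespaces=NS)
        for name, tag in FIELDS.items()
    }
```

Running `read_core_properties("manuscript.docx")` (a placeholder file name) returns a small dictionary. If the `creator` value is not your name, that is the hidden author information worth fixing before you publish.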
Once you think your Word manuscript is ready for publishing, the very first thing to do is clean your text, and start all over again with formatting your document.
You do this by converting your whole manuscript, which means every single word, into plain text and copying it back into a new, clean Word document.
How to clean your file
First, select all of the text in your Word manuscript and copy it.
If you are using Windows, open Notepad. If you are on a Mac, open TextEdit and choose Make Plain Text from the Format menu, because TextEdit opens new documents in rich text by default.
Open a new file in either program and paste in all of the text you copied from Word. Then save it as a plain text file.
You will then see your manuscript in either Notepad or TextEdit in plain text, which means it is totally unformatted.
Now it’s time to move your clean text back into a fresh new Word document, so go in reverse.
Select and copy all the text in your plain text editor window. Then open a new Word file and paste it in.
Then save it under a new file name in .docx format.
DO NOT overwrite your existing manuscript. Keep that as your original master copy in case you need to revert to it at any stage along the publishing line.
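If your manuscript is too long to copy and paste comfortably, the first half of this round trip can be scripted. The sketch below is an alternative to the Notepad or TextEdit step, not the method described above. It opens the .docx file as a zip archive and keeps only the text of each paragraph in the main document. The function name `docx_to_plain_text` is hypothetical, and the sketch ignores footnotes, headers, tabs, and manual line breaks, while text inside text boxes may appear twice.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespace prefix for WordprocessingML elements such as <w:p> and <w:t>.
W = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def docx_to_plain_text(docx_file):
    """Return the main document text, one line per paragraph.

    Only <w:t> text runs are kept, so text removed with tracked changes
    (stored in <w:delText>) is left out.
    """
    with zipfile.ZipFile(docx_file) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))
    paragraphs = []
    for paragraph in root.iter(W + "p"):
        runs = [node.text or "" for node in paragraph.iter(W + "t")]
        paragraphs.append("".join(runs))
    return "\n".join(paragraphs)
```

You could save the result with `open("manuscript.txt", "w", encoding="utf-8").write(docx_to_plain_text("manuscript.docx"))`, where both file names are placeholders. Because it only reads your original file, your master copy stays untouched. Compare the output with your manuscript before you rely on it.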
Now, with a new, clean, and pristine Word document copy of your manuscript, you are ready to format your book for electronic publishing.
To format your plain text Word document so that it is ready to upload to an online ebook publisher, refer to our article on How To Format An Ebook, which explains how to use styles to make your book absolutely perfect for ebook readers.
As an extra check, you should convert your new Word file into epub and mobi files to check what your new book will look like as an ebook.
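One way to run this check is with Calibre, a free ebook manager that includes a command-line tool called `ebook-convert`. The sketch below assumes Calibre is installed and that `ebook-convert` is on your system path. The helper names `build_convert_command` and `convert` are made up for this example.

```python
import shutil
import subprocess
from pathlib import Path


def build_convert_command(docx_path, fmt):
    """Build the ebook-convert command that writes the ebook next to the source file."""
    if fmt not in ("epub", "mobi"):
        raise ValueError("fmt must be 'epub' or 'mobi'")
    source = Path(docx_path)
    target = source.with_suffix("." + fmt)
    return ["ebook-convert", str(source), str(target)]


def convert(docx_path, fmt):
    """Run the conversion, assuming Calibre's ebook-convert is installed."""
    if shutil.which("ebook-convert") is None:
        raise RuntimeError("ebook-convert was not found; is Calibre installed?")
    subprocess.run(build_convert_command(docx_path, fmt), check=True)
```

Calling `convert("my-book.docx", "epub")` and then `convert("my-book.docx", "mobi")` would leave both files beside your Word document, ready to open in an ebook reader or previewer. The file name here is only a placeholder.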