plain text



Plain text (also written plain-text or plaintext) is any text, text file, or document that contains only text. Unlike a rich-text document, a plain text file cannot contain bold text, fonts, larger font sizes, or any other special text formatting.

Most people associate plain text files with the .txt file extension on Microsoft Windows computers; however, a plain text file can be any non-formatted file. To view a plain text file, a text editor such as Microsoft Notepad is used. Other editors, including Microsoft WordPad and Word, can also be used to view plain text files because the files have no special formatting.

Note: If you view a non-plain-text file in a plain text editor such as Notepad, it will display garbage and other characters you may not recognize.
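As a rough illustration of the note above, the following Python sketch decodes the first bytes of a binary file as if they were text. The PNG signature is used here only as an assumed, convenient example of binary data; the name as_editor_would_show is hypothetical.

```python
# The 8-byte signature that begins every PNG image (binary data, not text).
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def as_editor_would_show(data: bytes) -> str:
    """Decode bytes as UTF-8, substituting U+FFFD for undecodable bytes."""
    return data.decode("utf-8", errors="replace")


shown = as_editor_would_show(PNG_SIGNATURE)
# The first byte (0x89) is not valid UTF-8 on its own, so it is shown as
# the replacement character, followed by "PNG" and some control characters.
print(repr(shown))
```

A real editor may pick a different fallback encoding, so the exact garbage characters vary from program to program.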

The term plain text is also sometimes used only to exclude "binary" files: those in which at least some parts of the file cannot be correctly interpreted via the character encoding in effect. For example, a file or string consisting of "hello" (in whatever encoding), followed by 4 bytes that express a binary integer that is not just a character, is a binary file, not plain text by even the loosest common usages. Put another way, translating a plain text file to a character encoding that uses entirely different numbers to represent characters does not change the meaning (so long as you know what encoding is in use), but for binary files such a conversion does change the meaning of at least some parts of the file.
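The "hello" example can be sketched in Python as follows. The helper names (is_plain_text, transcode) and the integer value are illustrative assumptions, and the decode check is only a heuristic: an integer whose bytes happen to form valid characters would pass it.

```python
import struct


def is_plain_text(data: bytes, encoding: str = "utf-8") -> bool:
    """Heuristic: the data is text if every byte decodes in the encoding."""
    try:
        data.decode(encoding)
    except UnicodeDecodeError:
        return False
    return True


def transcode(data: bytes, source: str, target: str) -> bytes:
    """Reinterpret bytes from one character encoding into another."""
    return data.decode(source).encode(target)


# Plain text: the bytes change under transcoding, the meaning does not.
text_bytes = "hello".encode("utf-8")
text_utf16 = transcode(text_bytes, "utf-8", "utf-16-le")
same_meaning = text_utf16.decode("utf-16-le") == "hello"

# Binary: "hello" followed by a 4-byte little-endian unsigned integer.
value = 4_000_000_000
record = text_bytes + struct.pack("<I", value)
record_is_text = is_plain_text(record)

# Forcing the record through a text conversion corrupts the integer.
# Latin-1 is used as the source because it accepts every byte value.
converted = transcode(record, "latin-1", "utf-16-le")
(recovered,) = struct.unpack("<I", converted[-4:])

print(same_meaning, record_is_text, recovered == value)
```

The text portion survives the conversion, while the four integer bytes are treated as characters and no longer unpack to the original value.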

Files that contain markup or other metadata are generally considered plain text, so long as the markup is also in directly human-readable form (as in HTML, XML, and so on). As Coombs, Renear, and DeRose argue,[1] punctuation is itself markup, and no one considers punctuation to disqualify a file from being plain text.


Plain Text And Rich Text:

The use of plain text rather than binary files enables files to survive much better "in the wild", in part by making them largely immune to computer architecture incompatibilities. For example, all the problems of endianness can be avoided. (With encodings such as UCS-2 rather than UTF-8, endianness matters, but it matters uniformly for every character, rather than for potentially unknown subsets of the file.)
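The endianness point can be illustrated with a short Python sketch. Python has no codec named UCS-2, so UTF-16 is used here as a stand-in (an assumption; the two agree for characters in the Basic Multilingual Plane). The helper swap_pairs is hypothetical.

```python
def swap_pairs(data: bytes) -> bytes:
    """Swap the two bytes of every 16-bit unit (input length must be even)."""
    swapped = bytearray(data)
    swapped[0::2], swapped[1::2] = data[1::2], data[0::2]
    return bytes(swapped)


text = "hi"

# UTF-8 has a single byte order, so there is nothing to get wrong.
utf8 = text.encode("utf-8")

# A 16-bit encoding has two possible byte orders.
little = text.encode("utf-16-le")
big = text.encode("utf-16-be")

# The difference is uniform: swapping every byte pair converts one form
# into the other, with no need to know which parts of the file are numbers.
print(utf8, little, big, swap_pairs(little) == big)
```

In a binary file, by contrast, only some fields are multi-byte integers, so a reader must know the file layout before it can correct the byte order.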

According to The Unicode Standard, "Plain text is a pure sequence of character codes; plain Unicode-encoded text is therefore a sequence of Unicode character codes."

In contrast, styled text, also known as rich text, is any text representation containing plain text supplemented by information such as a language identifier, font size, color, or hypertext links.




Thus, representations such as SGML, RTF, HTML, XML, wiki markup, and TeX, as well as nearly all programming language source code files, are considered plain text. The particular content is irrelevant to whether a file is plain text. For example, an SVG file can express drawings or even bitmapped graphics, but is still plain text.
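To make the SVG example concrete, here is a minimal Python sketch. The drawing itself (a single red circle) is an invented example; the point is only that the whole file is an ordinary character string that still describes a graphic.

```python
import xml.etree.ElementTree as ET

# A complete SVG drawing, written as ordinary characters.
svg_source = (
    '<svg xmlns="http://www.w3.org/2000/svg" width="100" height="100">'
    '<circle cx="50" cy="50" r="40" fill="red"/>'
    "</svg>"
)

# The file round-trips through a character encoding unchanged ...
svg_bytes = svg_source.encode("utf-8")
round_trips = svg_bytes.decode("utf-8") == svg_source

# ... and the same characters can be parsed to recover the drawing.
root = ET.fromstring(svg_source)
shapes = [child.tag.split("}")[-1] for child in root]  # strip XML namespace

print(round_trips, shapes)
```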

According to The Unicode Standard, plain text has two main properties in regard to rich text:


"Plain text is the underlying content stream to which formatting can be applied."
"Plain text is public, standardized, and universally readable."
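The idea of plain text as the underlying content stream can be sketched by stripping the markup from a styled fragment. This is only one possible illustration, using HTML as the styled form and Python's standard html.parser module; the names TextExtractor and underlying_text are hypothetical.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect character data and ignore all tags."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def underlying_text(html: str) -> str:
    """Return the plain text to which the HTML formatting was applied."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return "".join(parser.chunks)


styled = "<p>Plain text is <b>public</b> and <i>readable</i>.</p>"
print(underlying_text(styled))
```

The formatting can be removed or replaced without touching the content stream itself, which is the property the quotation describes.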

