Universal text converter



Download Universal text converter, DOS version (386 code, v2.0.12h)
Download Unix version and source code (v2.0.12h)
If you have problems compiling/running the Unix version, send me an e-mail.
(So far it has been tried under Linux 2.0.x, Sun 5.5 and Solaris 4.1.)

  • You can use the following character sets (a switch in parentheses is a shorthand for the switch that follows it):
    (-l) -l1 ISO 8859/Windows Latin 1
    -l2 ISO 8859 Latin 2
    -lw Windows Latin 2
    -l5 ISO 8859/Windows Latin 5
    -4 Codepage 437
    -5 Codepage 850
    -8 Codepage 852
    -a CWI ASCII text
    -xa Xerox Ventura Publisher ASCII Text
    (-x) -x8 Xerox Ventura Publisher 8 bit ASCII Text [.VPN]
    -ca Corel Ventura Publisher 4.x ASCII Text
    (-c) -c8 Corel Ventura Publisher 4.x 8 bit ASCII Text [.CVP]
    -c5 Corel Ventura Publisher 5.0 ASCII Text
    -cn Corel Ventura Publisher 5.0 ANSI Text
    -v Ventura International
    -m MC Text
    -7 7 bit text (standard ASCII) [.ASC]
    -h1 HyperText Markup Language (HTML) 1.0 [.HTM]
    -h2 HyperText Markup Language (HTML) 2.0 [.HTM]
    (-h) -h3 HyperText Markup Language (HTML) 3.2 [.HTM]
    -w Winword 2.0 [.DOC]
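
The plain character-set part of such a conversion can be sketched in Python. This is only an illustration, not the program's own code: the `SWITCH_TO_CODEC` table and the `convert` function are hypothetical names, and the pairing of each switch with a Python codec is an assumption (the program uses its own tables, and formats such as CWI, Ventura or Winword have no standard Python codec, so they are left out).

```python
# Hypothetical mapping from CONV switches to Python codec names.
# Assumption for illustration only; the real program has its own tables.
SWITCH_TO_CODEC = {
    "l1": "iso8859_1",  # ISO 8859 Latin 1
    "l2": "iso8859_2",  # ISO 8859 Latin 2
    "lw": "cp1250",     # Windows Latin 2
    "l5": "iso8859_9",  # ISO 8859 Latin 5
    "4": "cp437",       # Codepage 437
    "5": "cp850",       # Codepage 850
    "8": "cp852",       # Codepage 852
    "7": "ascii",       # 7 bit text
}


def convert(data: bytes, src: str, dst: str) -> bytes:
    """Re-encode bytes from the character set of switch src to that of dst.

    Characters missing from the target character set become '?'.
    """
    text = data.decode(SWITCH_TO_CODEC[src])
    return text.encode(SWITCH_TO_CODEC[dst], errors="replace")
```

For instance, `convert(b"\x82", "4", "l1")` turns the codepage 437 byte for é into the Latin 1 byte `b"\xe9"`.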

  • Modifiers appended to the end of the format specifier:
    d double linebreaks (conventional text layout: a double linebreak separates paragraphs)
    s single linebreaks (each paragraph is one long line; a hard linebreak only starts a new paragraph), just like Winword "Text only".
    p like 'd', but Ventura or HTML <tags> and @STYLES are not written out
    t like 's', but Ventura or HTML <tags> and @STYLES are not written out

    Hint: If your whole file is converted into one paragraph, use the 's'/'t' modifier for the input file (or 'd'/'p' for the output file, accordingly).
    If your file is broken into lines separated by two or more linebreaks, use 'd'/'p' for the input file.
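
The difference between the 'd' and 's' layouts can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the program's implementation: the function names are hypothetical, and the re-wrapping at a fixed `width` in `single_to_double` is an assumption (the text above does not say how long the output lines are).

```python
import re
import textwrap


def double_to_single(text: str) -> str:
    """'d' layout to 's' layout: join each paragraph into one long line."""
    paragraphs = re.split(r"\n{2,}", text.strip("\n"))
    # Inside a paragraph a single linebreak is only line wrapping,
    # so it is replaced by a space.
    return "\n".join(" ".join(p.split()) for p in paragraphs) + "\n"


def single_to_double(text: str, width: int = 70) -> str:
    """'s' layout to 'd' layout: one blank line between paragraphs.

    The wrap width is an assumption made for this sketch.
    """
    paragraphs = [p for p in text.split("\n") if p.strip()]
    return "\n\n".join(textwrap.fill(p, width) for p in paragraphs) + "\n"
```

Reading a 'd' file as if it were 's' makes every wrapped line a paragraph of its own, and reading an 's' file as if it were 'd' merges everything into one paragraph, which is the situation the hint describes.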

  • For HTML output:
    e empty (write only converted file)
    n normal (use some standard tags [<html>, <body> etc.])
    i for Impulzus (some default settings)
    w "white paper" (white background sheet)

    None of the specifiers is case sensitive.

  • If you use "/" or "\" instead of "-", the program also fixes the spacing around punctuation: it removes spaces before a punctuation mark and inserts a missing space after it.
    It also replaces double minus signs with dashes and quote signs with typographic (lower/upper) quote signs.
    Furthermore, the program sets Internet addresses (anything containing '@' or '://' together with '.', for example 'hate@microsoft.com' or 'http://www.my.address') in italics.
  • If you use "\" or "_" instead of "/" or "-", respectively, you override the Unix or DOS newline convention: the program converts \n to \r\n (Unix to DOS) or \r\n to \n (DOS to Unix).
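
The typographic and newline conversions described above could look roughly like this in Python. It is a simplified sketch, not the program's actual rules: all function names are hypothetical, the choice of an en dash and of the lower/upper quote pair „ ” is an assumption, and the spacing fix deliberately skips '.' and ':' when inserting spaces so that addresses stay intact.

```python
import re


def fix_spacing(text: str) -> str:
    """Remove spaces before punctuation, add a missing space after it."""
    text = re.sub(r" +([,.;:!?])", r"\1", text)
    # '.' and ':' are left out here so 'www.my.address' and '://' survive.
    return re.sub(r"([,;!?])(?=[A-Za-z])", r"\1 ", text)


def fix_dashes_and_quotes(text: str) -> str:
    """Replace '--' with a dash and '"' with lower/upper quote signs."""
    text = text.replace("--", "\u2013")  # en dash: an assumption
    out, opening = [], True
    for ch in text:
        if ch == '"':
            # Alternate between lower opening and upper closing quotes.
            out.append("\u201e" if opening else "\u201d")
            opening = not opening
        else:
            out.append(ch)
    return "".join(out)


def find_addresses(text: str) -> list:
    """Words containing '@' or '://' and also '.', as described above."""
    return [w for w in text.split()
            if ("@" in w or "://" in w) and "." in w]


def unix_to_dos(text: str) -> str:
    # Normalize first so existing \r\n pairs are not doubled.
    return text.replace("\r\n", "\n").replace("\n", "\r\n")


def dos_to_unix(text: str) -> str:
    return text.replace("\r\n", "\n")
```

How the found addresses are made italic depends on the output format (a Ventura tag or an HTML tag, for example), so the sketch only locates them.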

  • More examples:
    To convert a single-linebreak cp. 437 text to an HTML 3.2 document using the "white paper" style: CONV -4S -H3W ARTICLE.TXT GENIUS
    To convert (ISO) Latin 1 text files to Corel Ventura Publisher 4.x ASCII text with typographic conversions: CONV /L /CA *.DOC
    To convert a double-linebreak Win. Latin 2 text on a Unix machine to single-linebreak CWI ASCII text for DOS with typographic conversions: CONV \LWD \AS MYFILE.TXT GADGET.TTT
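
To make the structure of a format specifier explicit, here is a small Python sketch that splits one into its parts. It is an illustration built only from the lists above, not the program's parser: `Spec` and `parse_spec` are hypothetical names, and accepting the linebreak and HTML modifiers in either order is an assumption.

```python
from dataclasses import dataclass
from typing import Optional

FORMAT_CODES = {"l1", "l2", "lw", "l5", "4", "5", "8", "a", "xa", "x8",
                "ca", "c8", "c5", "cn", "v", "m", "7", "h1", "h2", "h3", "w"}
SHORTHANDS = {"l": "l1", "x": "x8", "c": "c8", "h": "h3"}
# Lead character: (typographic conversions, newline conversion)
PREFIXES = {"-": (False, False), "/": (True, False),
            "\\": (True, True), "_": (False, True)}


@dataclass
class Spec:
    fmt: str                    # format code with shorthands expanded
    linebreaks: Optional[str]   # 'd', 's', 'p', 't' or None
    html_style: Optional[str]   # 'e', 'n', 'i', 'w' or None
    typographic: bool
    convert_newlines: bool


def parse_spec(spec: str) -> Spec:
    if not spec or spec[0] not in PREFIXES:
        raise ValueError("bad lead character in %r" % spec)
    typographic, convert_newlines = PREFIXES[spec[0]]
    body = spec[1:].lower()  # specifiers are not case sensitive
    # Longest match first, so 'lw' wins over the shorthand 'l'.
    for length in (2, 1):
        code = body[:length]
        if len(code) == length and (code in FORMAT_CODES or code in SHORTHANDS):
            break
    else:
        raise ValueError("unknown format in %r" % spec)
    fmt = SHORTHANDS.get(code, code)
    linebreaks = html_style = None
    for ch in body[length:]:
        if ch in "dspt" and linebreaks is None:
            linebreaks = ch
        elif ch in "eniw" and html_style is None and fmt.startswith("h"):
            html_style = ch
        else:
            raise ValueError("unexpected modifier %r in %r" % (ch, spec))
    return Spec(fmt, linebreaks, html_style, typographic, convert_newlines)
```

With this sketch, `parse_spec("-H3W")` gives format `h3` with HTML style `w`, and `parse_spec("\\LWD")` gives format `lw` with modifier `d` and both the typographic and the newline conversion switched on.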




    The program knows the following HTML character entities:
    &#01; .. &#255;, &.acute;, &.circ;, &.cedil;, &.grave;, &.ring;, &.tilde;, &.uml;
    &aelig; = æ, &AElig; = Æ, &oelig; = œ, &OElig; = Œ, &oslash; = ø, &Oslash; = Ø, &eth; = ð, &ETH; = Ð,
    &amp; = &, &copy; = ©, &nbsp; = (non-breaking space), &sp; = (space), &lt; = <, &gt; = >, &quot; = ", &szlig; = ß,
    &iexcl; = ¡, &cent; = ¢, &pound; = £, &curren; = ¤, &yen; = ¥, &brvbar; = ¦, &sect; = §, &uml; = ¨,
    &ordf; = ª, &laquo; = «, &not; = ¬, &shy; = (soft hyphen), &reg; = ®, &macr; = ¯, &deg; = °, &plusmn; = ±,
    &sup1; = ¹, &sup2; = ², &sup3; = ³, &acute; = ´, &micro; = µ, &para; = ¶, &middot; = ·, &cedil; = ¸,
    &ordm; = º, &raquo; = », &frac14; = ¼, &frac12; = ½, &frac34; = ¾, &iquest; = ¿, &times; = ×.
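
Decoding such entities can be sketched in Python as below. This is only an illustration under assumptions, not the program's code: `decode_entities` is a hypothetical name, the `NAMED` table holds just a few of the entities listed above, and the '.' in patterns like `&.acute;` is read here as "any base letter" that is combined with the accent.

```python
import re
import unicodedata

# A few of the named entities from the list above (not the full table).
NAMED = {"amp": "&", "lt": "<", "gt": ">", "quot": '"', "nbsp": "\u00a0",
         "copy": "\u00a9", "szlig": "\u00df", "aelig": "\u00e6",
         "AElig": "\u00c6", "oslash": "\u00f8", "Oslash": "\u00d8",
         "acute": "\u00b4", "uml": "\u00a8", "cedil": "\u00b8"}

# Accent names mapped to Unicode combining marks.
COMBINING = {"acute": "\u0301", "circ": "\u0302", "cedil": "\u0327",
             "grave": "\u0300", "ring": "\u030a", "tilde": "\u0303",
             "uml": "\u0308"}

ENTITY = re.compile(r"&(#[0-9]+|[A-Za-z][A-Za-z0-9]*);")


def _replace(match):
    name = match.group(1)
    if name.startswith("#"):
        number = int(name[1:])
        # Only the documented range &#01; .. &#255; is decoded.
        return chr(number) if 1 <= number <= 255 else match.group(0)
    if name in NAMED:
        return NAMED[name]
    letter, accent = name[0], name[1:]
    if accent in COMBINING:
        composed = unicodedata.normalize("NFC", letter + COMBINING[accent])
        if len(composed) == 1:  # a precomposed character exists
            return composed
    return match.group(0)  # unknown entity: leave it untouched


def decode_entities(text: str) -> str:
    return ENTITY.sub(_replace, text)
```

Entities the sketch does not recognize, such as a letter and accent pair without a precomposed character, are passed through unchanged.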


    Thank you for using this program. Comments, suggestions and bug reports are welcome at dome <at> impulzus <dot> com