Importing TeX files with accented characters

alex_hk90's picture

Hi all,

I'm currently writing my final-year (undergraduate) dissertation in TeX (using Kile on Linux). My faculty uses Scientific WorkPlace (SWP) 5.5, so I am trying to make my TeX files compatible with SWP. (That way, if I were to send my TeX file to someone at the faculty, they could successfully compile it with SWP.)

I've already had to make a few changes:
- changing the inputenc option from utf8x to utf8 (not sure why SWP doesn't have utf8x but I don't need it so it wasn't a problem this time)
- copying the title page code into the main document. I had previously used \input{./titlepage.tex}, but again SWP didn't like that for some reason. (I've read you need to make it a master document file or something, but that seemed like more hassle than it was worth.)
- changing the bibitem labels that had & signs in them (e.g. "A&B2010") to + signs (e.g. "A+B2010"): it compiled properly, but SWP didn't show those labels properly, only displaying the text before the & sign.

So while that wasn't ideal, I was willing to make those changes. However, I can't understand why SWP doesn't correctly import accented characters in TeX files, such as ú and í. It displays (and compiles) those two characters as Ãº and Ã- respectively, or in the resulting TeX file saved by SWP: "\~{A}%
%TCIMACRO{\U{ba}}%
%BeginExpansion
${{}^o}$%
%EndExpansion" and "\~{A}-".

Strangely, SWP can clearly handle these accented characters, because it displays them properly when they are copied in from (say) Character Map, and the resulting PDF displays these characters correctly. The method it uses is rather nasty though, converting them to "\'{u}" and "\'{\i}". OK, I could just change all the occurrences of these kinds of characters to that, but it just seems overly cumbersome when the TeX compilers can clearly understand the characters as written.

Is there a proper solution to this or is it just a bug / feature missing in SWP?

Thanks in advance,

Alex

If you had used the complete path in your \input statement, it would have worked. When SW compiles a document, it saves a copy of the document in the temporary directory, which invalidates relative paths in TeX fields. An alternative would be to place the include file in some directory that is searched when the document is compiled. This includes any directory at or below TCITeX\TeX.

The contents of the .bib file that used ampersands should have been changed to correct TeX/LaTeX syntax for the ampersand. That is, & becomes \&.

SW will correctly import extended characters.  This document works without difficulty for me using Version 5.5:

\documentclass{article}
\begin{document}
àáâãäå
æç
èéêë
ìíîï
òóôõö
\end{document}

Extended characters can also be saved in the .tex file when using the inputenc package with an appropriate code page selection and saving using the Portable LaTeX file type.

The characters you are adding are coming from the Unicode range and are being saved as Unicode characters. In your example you are showing the Unicode character ba, which is the Masculine Ordinal Indicator; SW substitutes a typeset superscript small letter o for it. There must be some encoding or code page issue that causes SW to see the characters as Unicode characters. Post a sample document as an attachment and it can be evaluated.
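For what it's worth, a ba in that position is what you would expect if a UTF-8 file were being read one byte per character. The Python sketch below illustrates this. It uses Latin-1 as a stand-in for whatever single-byte code page the importer actually applies, which is an assumption, and `misread_utf8` is just a name made up for the example.

```python
def misread_utf8(text):
    """Encode text as UTF-8, then decode the bytes as if they were Latin-1.

    Latin-1 is an assumption standing in for the importer's code page.
    """
    return text.encode("utf-8").decode("latin-1")


for ch in ("ú", "í"):
    raw = ch.encode("utf-8")
    seen = misread_utf8(ch)
    # Show the UTF-8 bytes and the code points a single-byte reader would see.
    print(ch, raw.hex(" "), [hex(ord(c)) for c in seen])
```

Both letters start with the byte c3, which Latin-1 reads as Ã. The second byte of ú is ba (the masculine ordinal indicator) and the second byte of í is ad (a soft hyphen), which would account for the two forms reported in the question.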

The form \'{u} is correct LaTeX and will be used by SW unless the inputenc package with an appropriate encoding option is used. For example, after inputting the above document and saving, SW creates:

\documentclass{article}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%TCIDATA{OutputFilter=Latex.dll}
%TCIDATA{Version=5.50.0.2960}
%TCIDATA{<META NAME="SaveForMode" CONTENT="1">}
%TCIDATA{BibliographyScheme=Manual}
%TCIDATA{LastRevised=Monday, December 13, 2010 12:49:37}
%TCIDATA{<META NAME="GraphicsSave" CONTENT="32">}
\input{tcilatex}
\begin{document}

\`{a}\'{a}\^{a}\~{a}\"{a}\aa
\ae \c{c}
\`{e}\'{e}\^{e}\"{e}
\`{\i}\'{\i}\^{\i}\"{\i}
\`{o}\'{o}\^{o}\~{o}\"{o}
\end{document}

but if you add the inputenc package with the latin1 option the result is:

\documentclass{article}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\usepackage[latin1]{inputenc}
%TCIDATA{OutputFilter=Latex.dll}
%TCIDATA{Version=5.50.0.2960}
%TCIDATA{<META NAME="SaveForMode" CONTENT="1">}
%TCIDATA{BibliographyScheme=Manual}
%TCIDATA{LastRevised=Monday, December 13, 2010 12:51:07}
%TCIDATA{<META NAME="GraphicsSave" CONTENT="32">}
\input{tcilatex}
\begin{document}

àáâãäå
æç
èéêë
ìíîï
òóôõö
\end{document}
alex_hk90's picture

Firstly, thank you very much for your quick reply. :)

How can I use an absolute path if I want to send my TeX file(s) to someone? It's ridiculous that SWP doesn't evaluate relative paths and/or copy the required files first to the temp directory (how hard would it be to read all the \input commands and check this?). Similarly for putting it in the include directory, that is really nasty and doesn't work for sharing files.

Fair enough with escaping the & character, though I'm not sure why the compilers don't even give a warning for that if escaping it is the correct syntax even in labels (I know you have to do it in the main body as it's used for spacing/alignment, but I thought it would be OK in a label).

I've attached an example file with the characters that don't work for me in SWP, but work perfectly when compiling using Kile.

As for the encoding option, I'm using "utf8" so surely that includes the accented u and i that I'm having issues with.

Thanks again,

Alex

SW allows certain macros but isn't designed to work with them; the \input macro is one example. SW doesn't process this macro, so it won't update it to manage directory locations. The SW master/subdocument structure does take care of this situation. The other solution is to place the include files in some directory searched by LaTeX, or to customize your LaTeX installation to search in specific directories where you have located the include files.

Unfortunately, the SW input filters don't handle utf8 encoding. Only the encodings documented in the inputenc package documentation are handled. The utf8 encoding was added separately from the main package. So, you would need to handle the encoding externally in your editor. You could save the documents in an encoding that SW recognizes, or you can copy/paste from your editor directly into SW (copying only the text and not the LaTeX commands). While not an ideal situation, it is the state of the system.
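If the editor can't easily save in another encoding, the conversion can also be done outside it. Below is a minimal Python sketch of the "save in an encoding SW recognizes" route. It assumes latin1 is the target (taken from the inputenc example earlier in this thread) and that the preamble contains exactly \usepackage[utf8]{inputenc}. The function names are invented for the example, and whether SW then imports the result cleanly has not been checked here.

```python
def reencode_for_sw(utf8_bytes):
    """Convert the bytes of a UTF-8 .tex file to Latin-1.

    Also switches the inputenc option so the file still describes its own
    encoding. Raises UnicodeEncodeError if a character has no Latin-1 form.
    """
    text = utf8_bytes.decode("utf-8")
    text = text.replace(r"\usepackage[utf8]{inputenc}",
                        r"\usepackage[latin1]{inputenc}")
    return text.encode("latin-1")


def convert_file(src_path, dst_path):
    """Read src_path as UTF-8 and write the Latin-1 version to dst_path."""
    with open(src_path, "rb") as src:
        data = src.read()
    with open(dst_path, "wb") as dst:
        dst.write(reencode_for_sw(data))
```

Writing to a separate output file keeps the UTF-8 original untouched for use in Kile. Characters outside Latin-1 make the conversion fail loudly rather than being silently dropped.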

alex_hk90's picture

Thanks again. :)

I'll probably write a script to replace those characters with the TeX equivalents that SWP understands, so I can successfully share the file with people using SWP. It is, as I suspected, a missing feature (not being able to understand utf8 characters), as is the issue with relative paths for the \input macro. For commercial software, I would've expected full compatibility with standard TeX/LaTeX things like this, especially as the freeware editor I use (Kile) has no problems of this kind. I guess you get the extra document templates and the TCITeX package (which I can't see doing much apart from cluttering up the document, adding loads of % characters and making it hard to use the file without SWP). Oh, and a pseudo-WYSIWYG environment, which I feel is a bit illogical for TeX but is perhaps more familiar to some people.

Anyway, that's enough of a rant. Thanks again for your assistance. :)
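For anyone landing here with the same problem, a replacement script along the lines mentioned above might look roughly like this in Python. It is only a sketch: it covers the accents that appear in this thread, `to_tex_escapes` and the two tables are names made up for the example, and it makes no attempt to skip math mode, verbatim text or comments.

```python
import unicodedata

# Combining mark -> TeX accent character (only the accents seen in this thread).
ACCENTS = {
    "\u0300": "`",   # grave
    "\u0301": "'",   # acute
    "\u0302": "^",   # circumflex
    "\u0303": "~",   # tilde
    "\u0308": '"',   # diaeresis
}

# Characters replaced whole instead of being built from base letter + accent.
SPECIALS = {"å": r"\aa{}", "æ": r"\ae{}", "ç": r"\c{c}"}


def to_tex_escapes(text):
    """Replace accented letters with LaTeX accent commands such as \\'{u}."""
    out = []
    # Normalise to precomposed form first so each accented letter is one character.
    for ch in unicodedata.normalize("NFC", text):
        if ch in SPECIALS:
            out.append(SPECIALS[ch])
            continue
        parts = unicodedata.normalize("NFD", ch)
        if len(parts) == 2 and parts[1] in ACCENTS:
            base, mark = parts
            if base == "i":
                base = r"\i"  # dotless i, so the accent replaces the dot
            out.append("\\" + ACCENTS[mark] + "{" + base + "}")
        else:
            out.append(ch)  # plain or unsupported character: leave as-is
    return "".join(out)


print(to_tex_escapes("ú and í"))
```

The output uses the same \'{u} and \'{\i} forms that SW itself writes, so the converted file should at least look familiar to it. Anything the tables don't cover is passed through unchanged, so it is worth searching the result for leftover non-ASCII characters.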