Is it possible to have a TeX command which will take the whole next word (or the next letters up to but not including the next punctuation symbol) as an argument and not only the next letter or {} group?

I’d like to have a \caps command on certain acronyms but don’t want to type curly brackets over and over.

+1  A: 

Regarding whitespace after commands: see package xspace, http://www.tex.ac.uk/cgi-bin/texfaq2html?label=xspace

Now, why this is very difficult: as you noted yourself, it seems things like that can only be done by changing catcodes. Catcodes are assigned to characters when TeX reads them, and TeX reads one line at a time, so as far as I can tell you cannot do anything with the other spaces on the same line. There might be a way around this, but I do not see it.
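To illustrate the point that a catcode is fixed at the moment a character is read, here is a minimal sketch. The macro name \frozen is made up for this example. Changing the catcode of ~ inside a macro has no effect on a ~ that was already read as part of the macro's argument, but it does affect a ~ that is read afterwards.

```latex
\documentclass{article}
\begin{document}

% \frozen is a hypothetical name used only for this demonstration.
% The catcode change happens after the argument #1 has been read.
\newcommand{\frozen}[1]{\begingroup\catcode`\~=12 #1\endgroup}

% The ~ below was already tokenized as an active character (a tie)
% when the argument was grabbed, so the catcode change comes too late.
\frozen{a~b}

% Here the ~ is read after the catcode change, so it is treated as an
% ordinary character and the font's glyph for it is typeset instead.
\begingroup\catcode`\~=12 a~b\endgroup

\end{document}
```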


Dangerous code below!

This code will do what you want only at the end of the line, so if what you want is more "fluent" typing without brackets, but you are willing to hit 'return' after each acronym (and not run any auto-indent later), you can use this:

\def\caps{\begingroup\catcode`^^20 =11\mcaps}
\def\mcaps#1{\def\next##1 {\sc #1##1\catcode`^^20 =10\endgroup\ }\next}
AVB
That’s also nice, I think at some point I knew this package existed.
Debilski
I think it *might* be possible to use an inner macro that checks each character and, depending on its value (or catcode), either changes the letter or returns from the loop. I experimented a little, but I keep getting \par errors and such.
Debilski
A: 

One solution might be setting another character as active and using this one for escaping. This does not remove the need for a closing character but avoids typing the \caps macro, thus making it overall easier to type.

Under very special circumstances, the following works (\MakeTextLowercase comes from the textcase package).

\catcode`\*=\active
\def*#1*{\textsc{\MakeTextLowercase{#1}}}

Now follows an *Acronym*.

Unfortunately, this makes \section*{} unusable without additional macro definitions, because the * is no longer an ordinary character.
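One possible way to limit the damage, offered as a sketch rather than a tested recipe: define the active * once inside a group, and switch it on only in the stretches of text where acronyms occur. The switch names \acronymstars and \noacronymstars are made up for this example.

```latex
\documentclass{article}
\usepackage{textcase}% provides \MakeTextLowercase

% Define the active * inside a group, so that * stays an ordinary
% character for everything that is read afterwards.
\begingroup
  \catcode`\*=\active
  \gdef*#1*{\textsc{\MakeTextLowercase{#1}}}
\endgroup

% Hypothetical switches: turn the acronym behaviour of * on and off.
\newcommand{\acronymstars}{\catcode`\*=\active}
\newcommand{\noacronymstars}{\catcode`\*=12 }

\begin{document}
\section*{Introduction}% * is still an ordinary character here

\acronymstars
Now follows an *Acronym*.
\noacronymstars

\section*{Next section}% and it is ordinary again here
\end{document}
```

Starred commands still cannot be used between the two switches, so this only narrows the problem rather than solving it.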

In XeTeX, it seems to be possible to exploit Unicode characters for this, so one could define

\catcode`\•=\active
\def•#1•{\textsc{\MakeTextLowercase{#1}}}

Now follows an •Acronym•.

This should reduce the effects on other commands, but of course the character ‘•’ needs to be mapped to the keyboard somewhere to be of use.

Debilski
+1  A: 

First, create your command, for example:

 \def\capsimpl#1{\textsc{#1}}% Your main macro (\textsc instead of the obsolete \sc)

The solution to catch a space or punctuation:

\catcode`\@=11  
\def\addtopunct#1{\expandafter\let\csname punct@\meaning#1\endcsname\let} 
\addtopunct{ }
\addtopunct{.}    \addtopunct{,}    \addtopunct{?} 
\addtopunct{!}    \addtopunct{;}    \addtopunct{:} 

\newtoks\capsarg
\def\caps{\capsarg{}\futurelet\punctlet\capsx}
\def\capsx{\expandafter\ifx\csname punct@\meaning\punctlet\endcsname\let
       \expandafter\capsend  
       \else \expandafter\continuecaps\fi}

\def\capsend{\expandafter\capsimpl\expandafter{\the\capsarg}}
\def\continuecaps#1{\capsarg=\expandafter{\the\capsarg#1}\futurelet\punctlet\capsx}

\catcode`\@=12
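For reference, here is how the pieces might be assembled into one compilable LaTeX file, with a usage line. This is a sketch under a few assumptions: it uses \textsc for the main macro, \makeatletter and \makeatother in place of the explicit \catcode lines, and it registers ) as one extra terminator that is not in the list above.

```latex
\documentclass{article}

\makeatletter % same effect as \catcode`\@=11
\def\capsimpl#1{\textsc{#1}}% main macro applied to the collected word

% Register a token as a terminator: \punct@<meaning of token> is set to \let.
\def\addtopunct#1{\expandafter\let\csname punct@\meaning#1\endcsname\let}
\addtopunct{ }
\addtopunct{.}\addtopunct{,}\addtopunct{?}
\addtopunct{!}\addtopunct{;}\addtopunct{:}
\addtopunct{)}% extra terminator, an addition for this example

\newtoks\capsarg
% Peek at the next token without consuming it, then decide in \capsx.
\def\caps{\capsarg{}\futurelet\punctlet\capsx}
\def\capsx{\expandafter\ifx\csname punct@\meaning\punctlet\endcsname\let
    \expandafter\capsend      % terminator found: typeset what was collected
  \else
    \expandafter\continuecaps % otherwise append one more token
  \fi}
\def\capsend{\expandafter\capsimpl\expandafter{\the\capsarg}}
\def\continuecaps#1{\capsarg=\expandafter{\the\capsarg#1}\futurelet\punctlet\capsx}
\makeatother

\begin{document}
Agencies such as \caps NASA, \caps ESA and \caps JAXA.
(Also \caps DLR) works once the parenthesis is registered.
\end{document}
```

Anything that is not registered as a terminator gets collected into the word. As far as I can tell, a brace or a \par directly after the word would therefore lead to an error, so the word should always be followed by a space or one of the registered characters.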
Alexey Malistov
Thanks. This looks good.
Debilski
+1  A: 

@Debilski - I wrote something similar to your active * code for the acronyms in my thesis. I activated < and then used \def<#1> to print the acronym, as well as the expansion if it's the first time it's encountered. I also went a bit off the deep end by allowing expansions to be defined in-line and using the .aux files to send the expansions "back in time" if they're used before they're declared, or to report errors if an acronym is never declared.

Overall, it seemed like it would be a good idea at the time - I rarely needed < to be catcode 12 in my actual text (since all my macros were in a separate .sty file), and I made it behave in math mode, so I couldn't foresee any difficulties. But boy was it brittle... I don't know how many times I accidentally broke my build by changing something seemingly unrelated. So all that to say, be very careful activating characters that are even remotely commonly-used.
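To give an idea of the general shape, here is a heavily simplified sketch. It is not the code from my thesis: it leaves out the expansions and the .aux handling, and the name \acronym is made up for this example. The < is only activated at \begin{document}, and it falls back to the usual relation symbol in math mode.

```latex
\documentclass{article}

% Define the active < inside a group, so that < stays an ordinary
% character while the rest of the preamble is read.
\begingroup
  \catcode`\<=\active
  % \protected (an e-TeX prefix) keeps the active character from being
  % expanded when it is written to files such as the .aux or .toc.
  \protected\gdef<{\ifmmode\mathchar"313C \else\expandafter\acronym\fi}
\endgroup

% Hypothetical helper: grab everything up to the next > as the acronym.
\def\acronym#1>{\textsc{#1}}

% Switch the active < on only once the document body starts.
\AtBeginDocument{\catcode`\<=\active}

\begin{document}
The <nasa> mission; in math, $a<b$ still typesets a relation.
\end{document}
```

Even in this small form the brittleness shows. Any code that is read after \begin{document} and uses < in a numeric comparison (for example in an \ifnum test) will break.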

On the other hand, with XeTeX and higher Unicode characters, it's probably a lot safer, and there are generally easy ways to type these extra characters, such as setting up a multi (or compose) key (I usually map either numlock or one of the windows keys to this), so that e.g. multi-!-! produces ¡. Or if you're running in emacs, you can use C-\ to switch into TeX input mode briefly to insert Unicode by typing the TeX command for it (though this is a pain for actually typing TeX documents, since it intercepts your actual \'s, and please please don't try defining your own escape character!)

Steve
Yeah, I think you’re right, this might turn out pretty badly at some unexpected point. I always had the idea that, with all these markdown-to-tex converters, it should be possible to have such a converter actually working as a ‘simple’ TeX package. I still think it’s possible to some extent, but it puts a lot of other constraints on your actual writing in TeX. Unicode seems more okay, though; but if you steal too many characters from it, you effectively give up the advantages of XeTeX, in that you can’t really write Unicode anymore.
Debilski
\def<#1> would be nice for parsing HTML…
Debilski
It would almost be. Problem is with things like `<input type='submit' value='2>1'>`, which is perfectly valid HTML.
Steve