I would like to hook into, intercept, and generate keyboard (make/break) events under Linux before they get delivered to any application. More precisely, I want to detect patterns in the key event stream and be able to discard/insert events into the stream depending on the detected patterns.
I've seen some related questions on SO, but:
- either they only deal with how to get at the key events (key loggers etc.), and not how to manipulate the propagation of them (they only listen, but don't intercept/generate).
- or they use passive/active grabs in X (read more on that below).
A Small DSL
I explain the problem below, but to make it a bit more compact and understandable, first a small DSL definition.
- A_: for make (press) key A
- A^: for break (release) key A
- A^->[C_,C^,U_,U^]: on A^ send a make/break combo for C and then U further down the processing chain (and finally to the application). If there is no -> then there's nothing sent (but internal state might be modified to detect subsequent events).
- $X: execute an arbitrary action. This can be sending some configurable key event sequence (maybe something like C-x C-s for emacs), or execute a function. If I can only send key events, that would be enough, as I can then further process these in a window manager depending on which application is active.
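To make the notation concrete, here is a minimal Python sketch that parses it into plain data. The names KeyEvent, parse_event and parse_sequence are made up for illustration and are not part of any existing library; the sketch only covers the A_ / A^ tokens and bracketed lists, not $X.

```python
from typing import List, NamedTuple


class KeyEvent(NamedTuple):
    key: str    # key name as written in the notation, e.g. "A"
    make: bool  # True for make (press, "_"), False for break (release, "^")


def parse_event(token: str) -> KeyEvent:
    """Parse a single token such as 'A_' or 'A^' (hypothetical helper)."""
    token = token.strip()
    if len(token) < 2 or token[-1] not in "_^":
        raise ValueError(f"not a key event token: {token!r}")
    return KeyEvent(key=token[:-1], make=token.endswith("_"))


def parse_sequence(text: str) -> List[KeyEvent]:
    """Parse a comma-separated list such as '[C_,C^,U_,U^]'."""
    inner = text.strip().strip("[]")
    return [parse_event(t) for t in inner.split(",") if t.strip()]
```

With this, parse_sequence("[C_,C^,U_,U^]") gives the four events for a C make/break followed by a U make/break.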
Problem Description
Ok, so with this notation, here are the patterns I want to detect and what events I want to pass on down the processing chain.
1. A_, A^->[A_,A^]: for the explanation see above; note that the send happens on A^.
2. A_, B_, A^->[A_,A^], B^->[B_,B^]: basically the same as 1., but overlapping events don't change the processing flow.
3. A_, B_, B^->[$X], A^: if there was a complete make/break of a key (B) while another key was held (A), X is executed (see above), and the break of A is discarded.
(In principle it's a simple state machine implemented over key events, which can generate (multiple) key events as output.)
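The state machine for the three patterns can be sketched in a few lines of Python. This is only a model of the logic, not a solution to the interception problem: events go in and come out as strings in the notation above, and the class name KeyStateMachine and its feed method are made up for illustration. Cases the patterns don't cover (for example three nested keys) are handled by an assumption that is marked in the comments.

```python
from typing import Dict, List


class KeyStateMachine:
    """Hypothetical model of the three patterns; not tied to X or the kernel."""

    def __init__(self) -> None:
        # Held keys in press order. The flag records whether some other key
        # completed a make/break while this key was held.
        self.held: Dict[str, bool] = {}

    def feed(self, event: str) -> List[str]:
        """Consume one event ('A_' or 'A^') and return the events to send on."""
        key, kind = event[:-1], event[-1]
        if kind == "_":
            # Makes are held back; a repeated make of a held key is ignored.
            self.held.setdefault(key, False)
            return []
        if kind != "^":
            raise ValueError(f"not a key event: {event!r}")
        if key not in self.held:
            return []  # break without a matching make: dropped
        keys = list(self.held)
        earlier = keys[: keys.index(key)]  # keys pressed before this one, still held
        used = self.held.pop(key)
        if used:
            return []  # pattern 3: the break of the held key is discarded
        if earlier:
            # Pattern 3: complete make/break while another key is held.
            # Assumption: every earlier held key counts as "used".
            for k in earlier:
                self.held[k] = True
            return ["$X"]
        # Patterns 1 and 2: send the make/break pair on the break.
        return [key + "_", key + "^"]


def run(events: List[str]) -> List[str]:
    """Feed a whole sequence and collect everything that would be sent on."""
    machine = KeyStateMachine()
    out: List[str] = []
    for ev in events:
        out.extend(machine.feed(ev))
    return out
```

For example, run(["A_", "B_", "B^", "A^"]) yields only "$X", while run(["A_", "B_", "A^", "B^"]) yields the A pair followed by the B pair.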
Additional Notes
- The solution has to work at typing speed.
- Consumers of the modified key event stream run under X on Linux (consoles, browsers, editors, etc.).
- Only keyboard events influence the processing (no mouse etc.)
- Matching can happen on keysyms (a bit easier), or keycodes (a bit harder). With the latter, I will just have to read in the mapping to translate from code to keysym.
- If possible, I'd prefer a solution that works with both USB keyboards as well as inside a virtual machine (could be a problem if working at the driver layer, other layers should be ok).
- I'm pretty open about the implementation language.
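For the keycode variant mentioned above, reading in the mapping could look roughly like the following Python sketch. It assumes the mapping is available as text in the format printed by `xmodmap -pke` (lines like `keycode  38 = a A a A`); the function name parse_keymap is made up, and only the first keysym of each keycode is kept, which ignores shift levels and groups.

```python
from typing import Dict


def parse_keymap(text: str) -> Dict[int, str]:
    """Map keycode -> first keysym from `xmodmap -pke` style text (hypothetical helper)."""
    mapping: Dict[int, str] = {}
    for line in text.splitlines():
        left, sep, right = line.partition("=")
        parts = left.split()
        # Expect exactly "keycode <number>" on the left of "=".
        if not sep or len(parts) != 2 or parts[0] != "keycode":
            continue
        syms = right.split()
        if syms:  # keycodes without any keysym are skipped
            mapping[int(parts[1])] = syms[0]
    return mapping


# Illustrative sample only; the real table depends on the keyboard layout.
SAMPLE = """\
keycode   8 =
keycode   9 = Escape NoSymbol Escape
keycode  38 = a A a A
"""
```

The text itself could be obtained by running `xmodmap -pke` and capturing its output, for example with subprocess.run.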
Possible Solutions and Questions
So the basic question is how to implement this.
I have implemented a solution in a window manager using passive grabs (XGrabKey) and XSendEvent. Unfortunately, passive grabs don't work in this case because they don't correctly capture B^ in the second pattern above. The reason is that the converted grab ends on A^ and is not continued to B^. A new grab is converted to capture B if it is still held, but only after ~1 sec. Otherwise a plain B^ is sent to the application. This can be verified with xev.
I could convert my implementation to use an active grab (XGrabKeyboard), but I'm not sure about the effect on other applications if the window manager has an active grab on the keyboard all the time. The X documentation refers to active grabs as being intrusive and designed for short-term use. If someone has experience with this and there are no major drawbacks to long-term active grabs, then I'd consider this a solution.
I'm willing to look at other layers of key event processing besides window managers (which operate as X clients). Keyboard drivers or mappings are a possibility as long as I can solve the above problem with them. This also implies that the solution doesn't have to be a separate application. I'm perfectly fine to have a driver or kernel module do this for me. Be aware though that I have never done any kernel or driver programming, so I would appreciate some good resources.
Thanks for any pointers!